Re: [R] How to join the two tables based on one overlapped column

ss Thu, 09 Oct 2008 05:56:11 -0700

Hi Dan,

Please ignore my previous question. I figured it out by specifying
all.x=FALSE.


Have a great day!

Allen

On Wed, Oct 8, 2008 at 8:28 PM, ss <[EMAIL PROTECTED]> wrote:

> Hi Dan,
>
> Thanks a lot for this! It works but I have some minor issues.
> Let's take the working example for instance, which x is of
> 100 rows and y of 120 rows.
>
> After merge(), z is of 120 rows and is sorted by the ascending order
> of "SEARCH_KEY1".
>
> The following is what I got out of the working example:
>
> > z[1:4,1:3]
>   SEARCH_KEY1   PROBE_ID1  PROBE_ID2
> 1           1  0.02816032  0.3202740
> 2           2 -1.96321867 -0.7636817
> 3           3  0.01571277 -0.9202248
> 4           4  0.39734092  0.4306223
>
>
>
>
> By any chance, I could get a "merged" z but it does not contain
> any "SEARCH_KEY" inherited from data frame "y"? Because ideally, I just
> want to substitute x$PROBE_ID1 with y$PROBE_ID2 based on the
> common "SEARCH_KEY" (meaning when "x$SEARCH_KEY1=y$SEARCH_KEY2")
> but I don't want to increase the rows of x. Any thoughts on this?
>
> I appreciate your help and have a good evening!
>
> Best,
>      Allen
>
>
>
>
>
> On Wed, Oct 8, 2008 at 6:00 PM, Daniel Malter <[EMAIL PROTECTED]> wrote:
>
>>  Hi Allen, look at the following working example:
>>
>> x=data.frame(rnorm(100),1:100)
>> names(x)=c("PROBE_ID1","SEARCH_KEY1")
>> y=data.frame(rnorm(120),1:120)
>> names(y)=c("PROBE_ID2","SEARCH_KEY2")
>>
>> z=merge(x,y,by.x="SEARCH_KEY1",by.y="SEARCH_KEY2",all.x=T,all.y=T)
>> z # check the resulting data frame it should have 100 rows and contain
>> SEARCH_KEY1, PROBE_ID1,  and PROBE_ID2
>>
>> That is, you have to put SEARCH_KEY1 and SEARCH_KEY2 in quotes, which I
>> forgot in my previous email.
>>
>> Cheers,
>> Daniel
>>
>>
>> -------------------------
>> cuncta stricte discussurus
>> -------------------------
>>
>>
>>  ------------------------------
>> *Von:* ss [mailto:[EMAIL PROTECTED]
>> *Gesendet:* Wednesday, October 08, 2008 5:00 PM
>> *An:* Daniel Malter
>> *Cc:* R help
>> *Betreff:* Re: [R] How to join the two tables based on one overlapped
>> column
>>
>>
>> Dear Daniel,
>>
>> Thank you very much for the help!
>>
>> I tried the code and got the following:
>>
>> >
>> John<-read.table(file="John_probe.txt",header=TRUE,row.names=NULL,fill=TRUE)
>> >
>> Susan<-read.table(file="Susan_probe.txt",header=TRUE,row.names=NULL,fill=TRUE)
>> > dim(John)
>> [1] 48701     2
>> > dim(Susan)
>> [1] 46713     2
>> > dataToMerge=data.frame(John$PROBE_ID2, John$SEARCH_KEY2)
>> >
>> mergedData=merge(Susan,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY2,all.x=T,all.y=F)
>> Error in fix.by(by.x, x) : object "SEARCH_KEY1" not found
>> >
>>
>>
>> I modified the last one by specifying the file name and got :
>>
>> >
>> mergedData=merge(Susan,dataToMerge,by.x=Susan$SEARCH_KEY1,by.y=John$SEARCH_KEY2,all.x=T,all.y=F)
>> Error in fix.by(by.x, x) : 'by' must specify valid column(s)
>> >
>>
>> Have you got any clue about this?
>>
>> Thanks much,
>>        Allen
>>
>>
>> On Wed, Oct 8, 2008 at 11:46 AM, Daniel Malter <[EMAIL PROTECTED]> wrote:
>>
>>> dataToMerge=data.frame(yourtablename2$PROBE_ID2,
>>> yourtablename2$SEARCH_KEY2)
>>> ##Puts the two columns of interest in dataset 2 in a separate data frame.
>>>
>>>
>>> mergedData=merge(yourtablename1,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY
>>> 2,all.x=T,all.y=F)
>>> ##merges the first table with the data frame just created looking for
>>> matches between SEARCH_KEY1 and SEARCH_KEY2, all entries in dataset 1 are
>>> retained (whether matched or not), entries in the dataToMerge dataframe
>>> that
>>> do not match any entries in dataset 1 are dropped.
>>>
>>> You then have an additional column in "mergedData" that contains the
>>> PROBE_ID2 and you can just assign them to PROBE_ID1 (i.e. replace
>>> PROBE_ID1
>>> by the values in this column).
>>>
>>> Cheers,
>>> Daniel
>>>
>>> -------------------------
>>> cuncta stricte discussurus
>>> -------------------------
>>>
>>> -----Ursprüngliche Nachricht-----
>>> Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>>> Im
>>> Auftrag von ss
>>> Gesendet: Wednesday, October 08, 2008 11:33 AM
>>> An: R help
>>> Betreff: [R] How to join the two tables based on one overlapped column
>>>
>>> Dear list,
>>>
>>> I need some clues on this.  I have two excel files and I basically want
>>> to
>>> map one to the other one. Can you give me some hints how to do it?
>>>
>>> The first excel file, named as "Susan_probe.xls", there are two columns,
>>> "PROBE_ID1" and "SEARCH_KEY1"
>>>
>>>  PROBE_ID1 SEARCH_KEY1  ILMN_30212 ILMN_30212  ILMN_1285 ILMN_1285
>>> ILMN_137964 ILMN_137964  ILMN_138109 ILMN_138109 ...
>>>
>>> The second excel file, named as "John_probe.xls", there are two columns
>>> as
>>> well, "PROBE_ID2" and "SEARCH_KEY2".
>>>
>>>  PROBE_ID2 SEARCH_KEY2  ILMN_1809034 ILMN_16367  ILMN_1660305 ILMN_16583
>>> ILMN_1792173 ILMN_19158 ...
>>>
>>> There are 46713 rows in the first excel file and 49702 rows in the second
>>> file.
>>>
>>> Probes in the first columns of two excel files are different but they can
>>> be
>>> matched based on the second column "SEARCH_KEY". So what I want to do is
>>> to
>>> substitute the "PROBE_ID1" in the "Susan_probe.xls" file with the
>>> "PROBE_ID2" in the "John_probe.xls" based on their common "SEARCH_KEY".
>>>
>>> Thank you so much for your help. I really appreciate.
>>>
>>> All the best,
>>>      Allen
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to join the two tables based on one overlapped column

Reply via email to