Hi Dan, Please ignore my previous question. I figured it out by specifying all.x=FALSE.
Have a great day! Allen On Wed, Oct 8, 2008 at 8:28 PM, ss <[EMAIL PROTECTED]> wrote: > Hi Dan, > > Thanks a lot for this! It works but I have some minor issues. > Let's take the working example for instance, which x is of > 100 rows and y of 120 rows. > > After merge(), z is of 120 rows and is sorted by the ascending order > of "SEARCH_KEY1". > > The following is what I got out of the working example: > > > z[1:4,1:3] > SEARCH_KEY1 PROBE_ID1 PROBE_ID2 > 1 1 0.02816032 0.3202740 > 2 2 -1.96321867 -0.7636817 > 3 3 0.01571277 -0.9202248 > 4 4 0.39734092 0.4306223 > > > > > By any chance, I could get a "merged" z but it does not contain > any "SEARCH_KEY" inherited from data frame "y"? Because ideally, I just > want to substitute x$PROBE_ID1 with y$PROBE_ID2 based on the > common "SEARCH_KEY" (meaning when "x$SEARCH_KEY1=y$SEARCH_KEY2") > but I don't want to increase the rows of x. Any thoughts on this? > > I appreciate your help and have a good evening! > > Best, > Allen > > > > > > On Wed, Oct 8, 2008 at 6:00 PM, Daniel Malter <[EMAIL PROTECTED]> wrote: > >> Hi Allen, look at the following working example: >> >> x=data.frame(rnorm(100),1:100) >> names(x)=c("PROBE_ID1","SEARCH_KEY1") >> y=data.frame(rnorm(120),1:120) >> names(y)=c("PROBE_ID2","SEARCH_KEY2") >> >> z=merge(x,y,by.x="SEARCH_KEY1",by.y="SEARCH_KEY2",all.x=T,all.y=T) >> z # check the resulting data frame it should have 100 rows and contain >> SEARCH_KEY1, PROBE_ID1, and PROBE_ID2 >> >> That is, you have to put SEARCH_KEY1 and SEARCH_KEY2 in quotes, which I >> forgot in my previous email. >> >> Cheers, >> Daniel >> >> >> ------------------------- >> cuncta stricte discussurus >> ------------------------- >> >> >> ------------------------------ >> *Von:* ss [mailto:[EMAIL PROTECTED] >> *Gesendet:* Wednesday, October 08, 2008 5:00 PM >> *An:* Daniel Malter >> *Cc:* R help >> *Betreff:* Re: [R] How to join the two tables based on one overlapped >> column >> >> >> Dear Daniel, >> >> Thank you very much for the help! >> >> I tried the code and got the following: >> >> > >> John<-read.table(file="John_probe.txt",header=TRUE,row.names=NULL,fill=TRUE) >> > >> Susan<-read.table(file="Susan_probe.txt",header=TRUE,row.names=NULL,fill=TRUE) >> > dim(John) >> [1] 48701 2 >> > dim(Susan) >> [1] 46713 2 >> > dataToMerge=data.frame(John$PROBE_ID2, John$SEARCH_KEY2) >> > >> mergedData=merge(Susan,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY2,all.x=T,all.y=F) >> Error in fix.by(by.x, x) : object "SEARCH_KEY1" not found >> > >> >> >> I modified the last one by specifying the file name and got : >> >> > >> mergedData=merge(Susan,dataToMerge,by.x=Susan$SEARCH_KEY1,by.y=John$SEARCH_KEY2,all.x=T,all.y=F) >> Error in fix.by(by.x, x) : 'by' must specify valid column(s) >> > >> >> Have you got any clue about this? >> >> Thanks much, >> Allen >> >> >> On Wed, Oct 8, 2008 at 11:46 AM, Daniel Malter <[EMAIL PROTECTED]> wrote: >> >>> dataToMerge=data.frame(yourtablename2$PROBE_ID2, >>> yourtablename2$SEARCH_KEY2) >>> ##Puts the two columns of interest in dataset 2 in a separate data frame. >>> >>> >>> mergedData=merge(yourtablename1,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY >>> 2,all.x=T,all.y=F) >>> ##merges the first table with the data frame just created looking for >>> matches between SEARCH_KEY1 and SEARCH_KEY2, all entries in dataset 1 are >>> retained (whether matched or not), entries in the dataToMerge dataframe >>> that >>> do not match any entries in dataset 1 are dropped. >>> >>> You then have an additional column in "mergedData" that contains the >>> PROBE_ID2 and you can just assign them to PROBE_ID1 (i.e. replace >>> PROBE_ID1 >>> by the values in this column). >>> >>> Cheers, >>> Daniel >>> >>> ------------------------- >>> cuncta stricte discussurus >>> ------------------------- >>> >>> -----Ursprüngliche Nachricht----- >>> Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] >>> Im >>> Auftrag von ss >>> Gesendet: Wednesday, October 08, 2008 11:33 AM >>> An: R help >>> Betreff: [R] How to join the two tables based on one overlapped column >>> >>> Dear list, >>> >>> I need some clues on this. I have two excel files and I basically want >>> to >>> map one to the other one. Can you give me some hints how to do it? >>> >>> The first excel file, named as "Susan_probe.xls", there are two columns, >>> "PROBE_ID1" and "SEARCH_KEY1" >>> >>> PROBE_ID1 SEARCH_KEY1 ILMN_30212 ILMN_30212 ILMN_1285 ILMN_1285 >>> ILMN_137964 ILMN_137964 ILMN_138109 ILMN_138109 ... >>> >>> The second excel file, named as "John_probe.xls", there are two columns >>> as >>> well, "PROBE_ID2" and "SEARCH_KEY2". >>> >>> PROBE_ID2 SEARCH_KEY2 ILMN_1809034 ILMN_16367 ILMN_1660305 ILMN_16583 >>> ILMN_1792173 ILMN_19158 ... >>> >>> There are 46713 rows in the first excel file and 49702 rows in the second >>> file. >>> >>> Probes in the first columns of two excel files are different but they can >>> be >>> matched based on the second column "SEARCH_KEY". So what I want to do is >>> to >>> substitute the "PROBE_ID1" in the "Susan_probe.xls" file with the >>> "PROBE_ID2" in the "John_probe.xls" based on their common "SEARCH_KEY". >>> >>> Thank you so much for your help. I really appreciate. >>> >>> All the best, >>> Allen >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >>> >> > [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.