Re: [R] Merging Issue
Don't send HTML email to the list; it messes up the data.

What do you mean when you say that you get lots of duplicates? If you have duplicated entries in df2, this will lead to duplicate rows because of the way merge works. From the help file:

    If there is more than one match, all possible matches contribute
    one row each. For the precise meaning of ‘match’, see ‘match’.

So you need to define the problem that you want to solve in doing the merge. Here is what happens with your data if I duplicate some entries in df2; is this what you are seeing?

> #Data A
> Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
> dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
> deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
> df <- data.frame(Subject, dates, deps)
> ##
> #Data B
> loc <- c("CA", "NY", "CA", "NY", "WA", "WA", 'yy')
> grp <- c("DE", "OC", "DE", "OT", "DE", "OC", "xx")
> deps <- c("A", "B", "C", "D", "F", "G", "A")
> df2 <- data.frame(loc, grp, deps)
> dat <- merge(df, df2, by = "deps")
>
> dat
   deps Subject      dates loc grp
1     A       2 2011-01-01  CA  DE
2     A       2 2011-01-01  yy  xx
3     A       3 2011-01-06  CA  DE
4     A       3 2011-01-06  yy  xx
5     A       5 2011-01-11  CA  DE
6     A       5 2011-01-11  yy  xx
7     A       5 2011-01-09  CA  DE
8     A       5 2011-01-09  yy  xx
9     B       2 2011-01-02  NY  OC
10    C       3 2011-01-04  CA  DE
11    C       2 2011-01-03  CA  DE
12    D       5 2011-01-12  NY  OT
13    D       3 2011-01-05  NY  OT
14    F       5 2011-01-10  WA  DE
15    F       4 2011-01-07  WA  DE
16    G       4 2011-01-08  WA  OC

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Fri, Jun 17, 2016 at 8:33 PM, Farnoosh Sheikhi via R-help <r-help@r-project.org> wrote:

> Hi all,
> I have two data sets similar to the ones below and want to merge them on
> the variable "deps". Since this is sample data with a small sample size, I
> have no problem using the merge command. However, the actual data set has
> ~60,000 observations with a lot of repeated measures. For example, for a
> given ID I have 100 different dates and groups. The problem is that the
> "merge" command gives me a lot of duplicates that I can't even track. I was
> wondering if there is any other way to merge such data. Any help is
> appreciated. Thanks.
>
> ## Data A
> Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
> dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
> deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
> df <- data.frame(Subject, dates, deps)
>
> ## Data B
> loc <- c("CA", "NY", "CA", "NY", "WA", "WA")
> grp <- c("DE", "OC", "DE", "OT", "DE", "OC")
> deps <- c("A", "B", "C", "D", "F", "G")
> df2 <- data.frame(loc, grp, deps)
> dat <- merge(df, df2, by = "deps")

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
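One way to act on the advice in this reply is to check the lookup table for repeated keys before merging. The following is only a sketch using the thread's example data with the extra "A" row added to df2. The names dup_keys and df2_unique are made up for illustration, and keeping the first row per key is an assumption; which row is the right one to keep depends on the actual problem being solved.

```r
# Example data from the thread; df2 carries a second "A" row.
Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates <- seq(as.Date("2011-01-01"), as.Date("2011-01-12"), by = 1)
deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)

df2 <- data.frame(
  loc  = c("CA", "NY", "CA", "NY", "WA", "WA", "yy"),
  grp  = c("DE", "OC", "DE", "OT", "DE", "OC", "xx"),
  deps = c("A", "B", "C", "D", "F", "G", "A")
)

# Keys that occur more than once in the lookup table.
dup_keys <- unique(df2$deps[duplicated(df2$deps)])
print(dup_keys)

# Assumption: the first row for each key is the one to keep.
df2_unique <- df2[!duplicated(df2$deps), ]

# With unique keys in df2_unique, each row of df matches at most one row.
dat <- merge(df, df2_unique, by = "deps")
print(nrow(dat) == nrow(df))
```

If dup_keys comes back empty and the merged result still has more rows than df, the extra rows are coming from somewhere other than repeated keys in the second table, for example from merging on too few columns.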
[R] Merging Issue
Hi all,
I have two data sets similar to the ones below and want to merge them on the variable "deps". Since this is sample data with a small sample size, I have no problem using the merge command. However, the actual data set has ~60,000 observations with a lot of repeated measures. For example, for a given ID I have 100 different dates and groups. The problem is that the "merge" command gives me a lot of duplicates that I can't even track. I was wondering if there is any other way to merge such data. Any help is appreciated. Thanks.

## Data A
Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)

## Data B
loc <- c("CA", "NY", "CA", "NY", "WA", "WA")
grp <- c("DE", "OC", "DE", "OT", "DE", "OC")
deps <- c("A", "B", "C", "D", "F", "G")
df2 <- data.frame(loc, grp, deps)
dat <- merge(df, df2, by = "deps")
Re: [R] merging issue.........
Try the merge function:

?merge

in1 = "id trait1
1 10.2
2 11.1
3 9.7
6 10.2
7 8.9
10 9.7
11 10.2"

in2 = "id trait2
1 9.8
2 10.8
4 7.8
5 9.8
6 10.1
12 10.2
13 10.1"

data1 = read.table(textConnection(in1), header=TRUE)
data2 = read.table(textConnection(in2), header=TRUE)
mymerge = merge(data1, data2, all.x=TRUE)
print(mymerge)

karena wrote:
> hi, I have a question about merging two files.
> For example, I have two files. The first file looks like this:
>
> id  trait1
> 1   10.2
> 2   11.1
> 3   9.7
> 6   10.2
> 7   8.9
> 10  9.7
> 11  10.2
>
> The second file looks like this:
>
> id  trait2
> 1   9.8
> 2   10.8
> 4   7.8
> 5   9.8
> 6   10.1
> 12  10.2
> 13  10.1
>
> Now I want to merge the two files by the variable id, and I only want to
> keep the ids which show up in the first file. If an id does not show up in
> the second file, it doesn't matter; I can keep the missing values.
> So my question is: how can I merge the two files and keep only the rows
> whose id shows up in the first file?
> I know how to do it in SAS, just use the following code:
> merge data1(in=in1) data2(in=in2);
> by id;
> if in1;
>
> but I really have no idea how to do it in R.
>
> thank you in advance,
>
> karean

--
View this message in context: http://n4.nabble.com/merging-issue-tp1013356p1013375.html
Sent from the R help mailing list archive at Nabble.com.
[R] merging issue.........
hi, I have a question about merging two files.
For example, I have two files. The first file looks like this:

id  trait1
1   10.2
2   11.1
3   9.7
6   10.2
7   8.9
10  9.7
11  10.2

The second file looks like this:

id  trait2
1   9.8
2   10.8
4   7.8
5   9.8
6   10.1
12  10.2
13  10.1

Now I want to merge the two files by the variable id, and I only want to keep the ids which show up in the first file. If an id does not show up in the second file, it doesn't matter; I can keep the missing values.
So my question is: how can I merge the two files and keep only the rows whose id shows up in the first file?
I know how to do it in SAS, just use the following code:
merge data1(in=in1) data2(in=in2);
by id;
if in1;

but I really have no idea how to do it in R.

thank you in advance,

karean

--
View this message in context: http://n4.nabble.com/merging-issue-tp1013356p1013356.html
Re: [R] merging issue.........
Hi Karean,

If your first object is called obj1 and the second is called obj2, then:

merge(obj1, obj2, all.x=TRUE)
  id trait1 trait2
1  1   10.2    9.8
2  2   11.1   10.8
3  3    9.7     NA
4  6   10.2   10.1
5  7    8.9     NA
6 10    9.7     NA
7 11   10.2     NA

Hope this helps,
Adrian

On Wednesday 13 January 2010, karena wrote:
> hi, I have a question about merging two files.
> For example, I have two files. The first file looks like this:
>
> id  trait1
> 1   10.2
> 2   11.1
> 3   9.7
> 6   10.2
> 7   8.9
> 10  9.7
> 11  10.2
>
> The second file looks like this:
>
> id  trait2
> 1   9.8
> 2   10.8
> 4   7.8
> 5   9.8
> 6   10.1
> 12  10.2
> 13  10.1
>
> Now I want to merge the two files by the variable id, and I only want to
> keep the ids which show up in the first file. If an id does not show up in
> the second file, it doesn't matter; I can keep the missing values.
> So my question is: how can I merge the two files and keep only the rows
> whose id shows up in the first file?
> I know how to do it in SAS, just use the following code:
> merge data1(in=in1) data2(in=in2);
> by id;
> if in1;
>
> but I really have no idea how to do it in R.
>
> thank you in advance,
>
> karean

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.: +40 21 3126618 \
      +40 21 3120210 / int.101
Fax:  +40 21 3158391
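For readers coming from SAS, the all.x, all.y and all arguments of merge cover the usual combinations of the in= flags. The sketch below rebuilds the two example files from the question as data frames and runs the three common variants. The SAS lines in the comments are the conventional equivalents, and the in2 column is a hypothetical helper added here for illustration, not something merge produces.

```r
# The two example files from the question, entered as data frames.
data1 <- data.frame(id     = c(1, 2, 3, 6, 7, 10, 11),
                    trait1 = c(10.2, 11.1, 9.7, 10.2, 8.9, 9.7, 10.2))
data2 <- data.frame(id     = c(1, 2, 4, 5, 6, 12, 13),
                    trait2 = c(9.8, 10.8, 7.8, 9.8, 10.1, 10.2, 10.1))

# Keep every id from data1; trait2 is NA where data2 has no match.
left <- merge(data1, data2, by = "id", all.x = TRUE)   # SAS: if in1;

# Keep only ids present in both tables (the default behaviour of merge).
inner <- merge(data1, data2, by = "id")                # SAS: if in1 and in2;

# Keep ids from either table.
full <- merge(data1, data2, by = "id", all = TRUE)     # SAS: no subsetting "if"

# Hypothetical flag column playing the role of the SAS in=in2 variable.
left$in2 <- left$id %in% data2$id
print(left)
```

Naming the key with by = "id" is optional here because id is the only column the two data frames share, but spelling it out guards against accidental matches on other shared column names in larger data sets.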
Re: [R] merging issue.........
Did you consider looking at the help page for merge?

h

At 22:01 13.01.2010, karena wrote:
> hi, I have a question about merging two files.
> For example, I have two files. The first file looks like this:
>
> id  trait1
> 1   10.2
> 2   11.1
> 3   9.7
> 6   10.2
> 7   8.9
> 10  9.7
> 11  10.2
>
> The second file looks like this:
>
> id  trait2
> 1   9.8
> 2   10.8
> 4   7.8
> 5   9.8
> 6   10.1
> 12  10.2
> 13  10.1
>
> Now I want to merge the two files by the variable id, and I only want to
> keep the ids which show up in the first file. If an id does not show up in
> the second file, it doesn't matter; I can keep the missing values.
> So my question is: how can I merge the two files and keep only the rows
> whose id shows up in the first file?
> I know how to do it in SAS, just use the following code:
> merge data1(in=in1) data2(in=in2);
> by id;
> if in1;
>
> but I really have no idea how to do it in R.
>
> thank you in advance,
>
> karean
Re: [R] merging issue.........
thank you very much!

--
View this message in context: http://n4.nabble.com/merging-issue-tp1013356p1013433.html