Re: [R] Merging Issue

2016-06-18 Thread jim holtman
Don't use HTML when sending email; it messes up the data.

What do you mean that you get lots of duplicates?  If you have duplicated
entries in df2 this will lead to dups because of the way merge works (here
is the help file):

 If there is more than one match, all possible matches contribute
 one row each.  For the precise meaning of ‘match’, see ‘match’.

So you need to define the problem that you want to solve in doing the
merge.  Here is what happens with your data if I duplicate some entries in
df2. Is this what you are seeing?

>  #Data A
>  Subject<- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
>  dates<-seq(as.Date('2011-01-01'),as.Date('2011-01-12'),by = 1)
>  deps<-c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
>  df <- data.frame(Subject, dates, deps)
>  ##
>  #Data B
>  loc<-c("CA","NY", "CA", "NY", "WA", "WA", 'yy')
>  grp<-c("DE", "OC", "DE", "OT", "DE", "OC", "xx")
>  deps<-c("A","B","C", "D", "F","G", "A")
>  df2<-data.frame(loc, grp, deps )
>  dat<-merge(df, df2, by="deps")
>
> dat
   deps Subject      dates loc grp
1     A       2 2011-01-01  CA  DE
2     A       2 2011-01-01  yy  xx
3     A       3 2011-01-06  CA  DE
4     A       3 2011-01-06  yy  xx
5     A       5 2011-01-11  CA  DE
6     A       5 2011-01-11  yy  xx
7     A       5 2011-01-09  CA  DE
8     A       5 2011-01-09  yy  xx
9     B       2 2011-01-02  NY  OC
10    C       3 2011-01-04  CA  DE
11    C       2 2011-01-03  CA  DE
12    D       5 2011-01-12  NY  OT
13    D       3 2011-01-05  NY  OT
14    F       5 2011-01-10  WA  DE
15    F       4 2011-01-07  WA  DE
16    G       4 2011-01-08  WA  OC
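
To track down where the extra rows come from, one option is to look for
repeated keys in the lookup table before merging. The following is a minimal
sketch using the df and df2 from the example above (with the extra "A" row in
df2). The names key_counts, dup_keys, expected_rows and df2_unique are made up
for illustration, and keeping only the first row per key is an assumption about
what is wanted, not a general fix.

```r
# Data A and Data B as in the example above (df2 has a second "A" row)
Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates <- seq(as.Date("2011-01-01"), as.Date("2011-01-12"), by = 1)
deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)
loc <- c("CA", "NY", "CA", "NY", "WA", "WA", "yy")
grp <- c("DE", "OC", "DE", "OT", "DE", "OC", "xx")
deps <- c("A", "B", "C", "D", "F", "G", "A")
df2 <- data.frame(loc, grp, deps)

# Which keys occur more than once in the lookup table?
key_counts <- table(df2$deps)
dup_keys <- names(key_counts)[key_counts > 1]
print(dup_keys)

# Each row of df contributes one output row per matching row in df2,
# so the size of the merge can be predicted before running it.
expected_rows <- sum(key_counts[as.character(df$deps)], na.rm = TRUE)
dat <- merge(df, df2, by = "deps")
print(c(expected = expected_rows, actual = nrow(dat)))

# ASSUMPTION: only the first row per key in df2 is wanted.
df2_unique <- df2[!duplicated(df2$deps), ]
dat_unique <- merge(df, df2_unique, by = "deps")
print(nrow(dat_unique) == nrow(df))
```

If both rows for a repeated key are legitimate, dropping one is the wrong
answer, and the merge probably needs a second key column (for example a date or
group) in the by argument instead.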



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Fri, Jun 17, 2016 at 8:33 PM, Farnoosh Sheikhi via R-help <
r-help@r-project.org> wrote:

> Hi all,
> I have two data sets like the ones below and want to merge them on the
> variable "deps". Since this is sample data with a small sample size, I don't
> have any problem using the merge command. However, the actual data set has
> ~60,000 observations with a lot of repeated measures. For example, for a
> given ID I have 100 different dates and groups. The problem is that using
> the "merge" command gives me a lot of duplicates that I can't even track. I
> was wondering if there is any other way to merge such data. Any help is
> appreciated. Thanks.
> ## Data A
> Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
> dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
> deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
> df <- data.frame(Subject, dates, deps)
> ## Data B
> loc <- c("CA", "NY", "CA", "NY", "WA", "WA")
> grp <- c("DE", "OC", "DE", "OT", "DE", "OC")
> deps <- c("A", "B", "C", "D", "F", "G")
> df2 <- data.frame(loc, grp, deps)
> dat <- merge(df, df2, by = "deps")
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


[R] Merging Issue

2016-06-17 Thread Farnoosh Sheikhi via R-help
Hi all, 
I have two data sets like the ones below and want to merge them on the
variable "deps". Since this is sample data with a small sample size, I don't
have any problem using the merge command. However, the actual data set has
~60,000 observations with a lot of repeated measures. For example, for a given
ID I have 100 different dates and groups. The problem is that using the "merge"
command gives me a lot of duplicates that I can't even track. I was wondering
if there is any other way to merge such data. Any help is appreciated. Thanks.
## Data A
Subject <- c("2", "2", "2", "3", "3", "3", "4", "4", "5", "5", "5", "5")
dates <- seq(as.Date('2011-01-01'), as.Date('2011-01-12'), by = 1)
deps <- c("A", "B", "C", "C", "D", "A", "F", "G", "A", "F", "A", "D")
df <- data.frame(Subject, dates, deps)
## Data B
loc <- c("CA", "NY", "CA", "NY", "WA", "WA")
grp <- c("DE", "OC", "DE", "OT", "DE", "OC")
deps <- c("A", "B", "C", "D", "F", "G")
df2 <- data.frame(loc, grp, deps)
dat <- merge(df, df2, by = "deps")
 



Re: [R] merging issue.........

2010-01-13 Thread Pete B

Try the merge function
?merge

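# The two files, held as text so the example is self-contained
in1 = "id trait1
1 10.2
2 11.1
3 9.7
6 10.2
7 8.9
10 9.7
11 10.2"

in2 = "id trait2
1 9.8
2 10.8
4 7.8
5 9.8
6 10.1
12 10.2
13 10.1"

data1 = read.table(textConnection(in1), header=TRUE)
data2 = read.table(textConnection(in2), header=TRUE)

# all.x=TRUE keeps every row of data1; unmatched ids get NA for trait2
mymerge = merge(data1,data2,all.x=TRUE)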
print(mymerge)
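
In current versions of R, read.table() also accepts the data directly through
its text argument, which avoids textConnection(). Here is a short sketch of the
same merge with the key named explicitly via by = "id". The checks at the end
are only there to illustrate that every id from the first file survives and
nothing extra comes in from the second.

```r
in1 <- "id trait1
1 10.2
2 11.1
3 9.7
6 10.2
7 8.9
10 9.7
11 10.2"

in2 <- "id trait2
1 9.8
2 10.8
4 7.8
5 9.8
6 10.1
12 10.2
13 10.1"

# Read the inline text; header = TRUE takes the column names from the first line
data1 <- read.table(text = in1, header = TRUE)
data2 <- read.table(text = in2, header = TRUE)

# Naming the key avoids surprises if the files ever share other column names
mymerge <- merge(data1, data2, by = "id", all.x = TRUE)

# Every id of data1 is present, and no id that exists only in data2 was added
print(all(data1$id %in% mymerge$id))
print(setdiff(mymerge$id, data1$id))
```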




-- 
View this message in context: 
http://n4.nabble.com/merging-issue-tp1013356p1013375.html
Sent from the R help mailing list archive at Nabble.com.



[R] merging issue.........

2010-01-13 Thread karena

Hi, I have a question about merging two files.
For example, I have two files. The first file looks like this:

id   trait1
1    10.2
2    11.1
3    9.7
6    10.2
7    8.9
10   9.7
11   10.2

The second file looks like this:

id   trait2
1    9.8
2    10.8
4    7.8
5    9.8
6    10.1
12   10.2
13   10.1

Now I want to merge the two files by the variable id, and I only want to keep
the ids which show up in the first file. If an id does not show up in the
second file, it doesn't matter; I can keep the missing values. So my question
is: how can I merge the two files and keep only the rows whose id shows up in
the first file?
I know how to do it in SAS, just use the following code:
merge data1(in=in1) data2(in=in2);
by id;
if in1;

But I really have no idea how to do it in R.

thank you in advance,

karean 
-- 
View this message in context: 
http://n4.nabble.com/merging-issue-tp1013356p1013356.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] merging issue.........

2010-01-13 Thread Adrian Dusa
Hi Karean,

If your first object is called obj1 and the second called obj2, then:

 merge(obj1, obj2, all.x=TRUE)
  id trait1 trait2
1  1   10.2    9.8
2  2   11.1   10.8
3  3    9.7     NA
4  6   10.2   10.1
5  7    8.9     NA
6 10    9.7     NA
7 11   10.2     NA
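
For comparison, here is a small sketch of how the all.x, all.y and all
arguments of merge() change which ids are kept, using the two files from the
question. The object names inner, left, right and full and the in2 flag column
are made up for illustration. The mapping to the SAS in= variables is a rough
correspondence that assumes id is unique in each file.

```r
obj1 <- data.frame(id = c(1, 2, 3, 6, 7, 10, 11),
                   trait1 = c(10.2, 11.1, 9.7, 10.2, 8.9, 9.7, 10.2))
obj2 <- data.frame(id = c(1, 2, 4, 5, 6, 12, 13),
                   trait2 = c(9.8, 10.8, 7.8, 9.8, 10.1, 10.2, 10.1))

inner <- merge(obj1, obj2)                # only ids present in both files
left  <- merge(obj1, obj2, all.x = TRUE)  # every id of obj1 (roughly SAS "if in1;")
right <- merge(obj1, obj2, all.y = TRUE)  # every id of obj2 (roughly SAS "if in2;")
full  <- merge(obj1, obj2, all = TRUE)    # ids from either file

print(sapply(list(inner = inner, left = left, right = right, full = full), nrow))

# merge() has no in= indicator, but a flag can be derived afterwards
left$in2 <- left$id %in% obj2$id
print(left)
```

Rows of the left merge where in2 is FALSE are exactly the ones that carry NA in
trait2 here.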

Hope this helps,
Adrian



-- 
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \
 +40 21 3120210 / int.101
Fax: +40 21 3158391



Re: [R] merging issue.........

2010-01-13 Thread Heinz Tuechler

Did you consider looking at the help page for merge?
h



Re: [R] merging issue.........

2010-01-13 Thread karena

Thank you very much!
-- 
View this message in context: 
http://n4.nabble.com/merging-issue-tp1013356p1013433.html
Sent from the R help mailing list archive at Nabble.com.
