?merge
On Jun 14, 2013, at 0:51, Yasin Gocgun entropy...@gmail.com wrote:
Hi,
I have been struggling with the issue of merging data frames that have
common columns and have different dimensions. Although I searched the
internet a lot, I could not find any
Thanks for your responses.
I have already found that the merge function does what I am looking for.
On Fri, Jun 14, 2013 at 12:51 AM, Yasin Gocgun entropy...@gmail.com wrote:
Hi,
I have been struggling with the issue of merging data frames that have
common columns and have different
Hi,
I have been struggling with the issue of merging data frames that have
common columns and have different dimensions. Although I searched the
internet a lot, I could not find any function that would
efficiently perform the required operation. So I would appreciate it if anyone
type ?merge in R
-
Yasir Kaheil
Hello!
I am running a loop. The result of each run of the loop is a data
frame. I am merging all the data frames.
For example:
The dataframe from run 1:
x <- data.frame(a=1,b=2,c=3)
The dataframe from run 2:
y <- data.frame(a=10,b=20,d=30)
What I want to get is:
merge(x,y,all.x=T,all.y=T)
Then I
Hi Dimitri,
I have some doubts whether storing the results of a loop in a data
frame and merging it with every run is the most efficient way of doing
things, but I do not know your situation. This does what you want, I
believe, but I suspect it could be quite slow. I worked around the
Thanks a lot, Joshua.
You might be right.
I am thinking of creating a list (as a placeholder) and then merging
the elements of the list.
Dimitri
On Tue, Nov 9, 2010 at 12:11 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
Hi Dimitri,
I have some doubts whether storing the results of a loop in a
Dimitri -
Usually the easiest way to solve problems like this
is to put all the dataframes in a list, and then use
the Reduce() function to merge them all together at the
end. You don't give many details about how the data frames
are constructed, so it's hard to be specific about the
best way
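A minimal sketch of the list-plus-Reduce() approach described above. The `results` list and its contents are made up for illustration, since the original loop is not shown; in practice each element would be the data frame produced by one run of the loop.

```r
# Hypothetical per-run results (assumed data, not from the original post).
results <- list(
  data.frame(a = 1,   b = 2,   c = 3),
  data.frame(a = 10,  b = 20,  d = 30),
  data.frame(a = 100, b = 200, e = 500)
)

# Merge the list elements pairwise, left to right.
# all = TRUE keeps unmatched rows and fills the missing columns with NA.
merged <- Reduce(function(x, y) merge(x, y, all = TRUE), results)
print(merged)
```

Collecting the data frames in a list first and merging once at the end avoids re-merging a growing data frame on every iteration.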
Thanks a lot, Phil.
I decided to do it via the list - as you suggested, but had to do some
gymnastics, which Reduce will greatly help me to avoid now!
Dimitri
On Tue, Nov 9, 2010 at 12:36 PM, Phil Spector spec...@stat.berkeley.edu wrote:
Dimitri -
Usually the easiest way to solve problems
Hello,
This is a semi-complicated question about comparing two datasets,
probably using merge, but I am open to other ideas. I have a large
frame of information about companies. It's over 30,000 rows and looks
something like...
df1 <-
identifier1 identifier2 name other_name
Hi,
is it possible to merge two data frames while preserving the row names of
the bigger data frame?
I have two data frames which I would like to combine. While doing so I
always lose the row names. When I try to append them, I get the error
message that I have non-unique names. This although
Put the rownames as another column in your dataframe so that it
remains with the data. After merging, you can then use it as the
rownames
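A minimal sketch of this suggestion, assuming two made-up data frames (`big`, `small`) and a temporary helper column `rn`; none of these names come from the original post. It only works if the saved row names are still unique after the merge.

```r
# Hypothetical data: 'big' carries meaningful row names, 'small' does not.
big <- data.frame(id = c(3, 1, 2), value = c("c", "a", "b"),
                  row.names = c("gene3", "gene1", "gene2"))
small <- data.frame(id = c(1, 2), score = c(0.1, 0.2))

# Store the row names in an ordinary column so they travel with the data.
big$rn <- rownames(big)

# Left join: keep every row of 'big', fill unmatched scores with NA.
merged <- merge(big, small, by = "id", all.x = TRUE)

# Restore the row names and drop the helper column.
rownames(merged) <- merged$rn
merged$rn <- NULL
print(merged)
```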
On Mon, Jun 14, 2010 at 9:25 AM, Assa Yeroslaviz fry...@gmail.com wrote:
Hi,
is it possible to merge two data frames while preserving the row names of
the
If you want to keep only the rows that are unique in the first column
then do the following:
workComb1 <- subset(workComb, !duplicated(ProbeID))
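A small illustration of the call above, assuming a made-up `workComb` with one repeated ProbeID. duplicated() flags the second and later occurrences, so the first row for each ProbeID is the one that is kept.

```r
# Hypothetical data with ProbeID "p1" appearing twice.
workComb <- data.frame(ProbeID = c("p1", "p2", "p1", "p3"),
                       value   = c(1, 2, 3, 4))

# Keep only the first row seen for each ProbeID.
workComb1 <- subset(workComb, !duplicated(ProbeID))
print(workComb1)
```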
On Mon, Jun 14, 2010 at 11:20 AM, Assa Yeroslaviz fry...@gmail.com wrote:
well, the problem is basically elsewhere. I have a data frame with
Hi Guys
I have two data frames which I would like to merge on two conditions.
I am doing the following (abstract form)
new.data.frame <- merge(df1,df2, by=c("Col1","Col2"))
It is giving me a null result.
Basically I need to apply two conditions.
I also tried sqldf but it is running forever. Will
On Apr 6, 2010, at 3:54 PM, Abhishek Pratap wrote:
Hi Guys
I have two data frames which I would like to merge on two conditions.
I am doing the following (abstract form)
new.data.frame <- merge(df1,df2, by=c("Col1","Col2"))
What does
str(df1) ; str(df2)
... show?
It is giving me a null
Hi David
Here it is. You can ignore the bio jargon if it sounds confusing. The
columns (SNP, chr) on which I am applying merge have the same data type
in both data frames.
merge(data_lane6_snps, data_lane6_snps_rsid , by = c("SNP","chr"))
str(data_lane6_snps)
'data.frame': 7724462 obs. of 10 variables:
I should also add that if I merge on only one column it works fine, but
the result is not what I want.
merge(data_lane6_snps, data_lane6_snps_rsid , by = c("SNP")) : works as
expected.
Could the chr column being a factor be causing problems here?
-A
On Tue, Apr 6, 2010 at 4:03 PM, Abhishek Pratap
On Apr 6, 2010, at 4:03 PM, Abhishek Pratap wrote:
Hi David
Here it is. You can ignore the bio jargon if it sounds confusing.
Sometimes it is essential to have domain details.
The corresponding data type of column (SNP, chr) on which I am
applying merge is same.
merge(data_lane6_snps,
Hi David
I understand that, looking at the SNP data values, it may seem that they are
different and hence the merge gives no result. However, the columns still
have ~700K SNPs in common. What I am looking for is a merge where both the SNP and
chr match. If I match only the SNP column I get partially
Just so you know
length(intersect(data_lane6_snps$SNP, data_lane6_snps_rsid$SNP))
796120
I just need to include the chr condition now where I am stuck.
-Abhi
On Tue, Apr 6, 2010 at 4:51 PM, Abhishek Pratap abhishek@gmail.com wrote:
Hi David
I can understand looking the SNP data values
OK, not the SNP's. So look at the chr's. I will bet that you get 0
when you try :
length(intersect(data_lane6_snps$chr, data_lane6_snps_rsid$chr))
... since one is using a format of chrNN and the other is using just
NN. You need to get the chromosome naming convention straightened out.
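A minimal sketch of the check and fix described above, using small made-up data frames (the names `snps` and `annot` and all values are assumptions, not the poster's data). It strips a leading "chr" so that both key columns use the same convention before merging on two columns.

```r
# Hypothetical data: one table uses "chrNN", the other just "NN".
snps <- data.frame(SNP = c("rs1", "rs2", "rs3"),
                   chr = c("chr1", "chr2", "chrX"),
                   depth = c(10, 20, 30),
                   stringsAsFactors = FALSE)
annot <- data.frame(SNP = c("rs1", "rs2", "rs4"),
                    chr = c("1", "2", "X"),
                    label = c("a", "b", "c"),
                    stringsAsFactors = FALSE)

# No chromosome labels in common, so a merge on c("SNP", "chr") is empty.
n_common_before <- length(intersect(snps$chr, annot$chr))

# Normalise the naming convention by dropping the "chr" prefix.
snps$chr <- sub("^chr", "", snps$chr)

# Merge on both key columns; column names must be quoted strings.
both <- merge(snps, annot, by = c("SNP", "chr"))
print(both)
```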
You found the error: the two data frames use different naming conventions for chr. I should be
able to fix that pretty easily.
In case the problem persists, I will contact the list.
Thanks!
-Abhi
On Tue, Apr 6, 2010 at 5:01 PM, David Winsemius dwinsem...@comcast.net wrote:
OK, not the SNP's. So look at the chr's. I
Yes, indexing will typically make a large difference.
On Tue, Apr 6, 2010 at 3:54 PM, Abhishek Pratap abhishek@gmail.com wrote:
Hi Guys
I have two data frames which I would like to merge on two conditions.
I am doing the following (abstract form)
new.data.frame <- merge(df1,df2,
David,
Now the code is:
for (j in seq_along(rwy)) { # subset the data and merge them
ar4rw = ar4rw <- subset(arrgnd, arrgnd$Runway==rwy[j])
if(j == 1) {
arrw = ar4rw
}
else {
arrw = merge(arrw, ar4rw)
}
}
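For comparison, a hedged sketch on made-up data (the Flight column and all values are assumptions). With default arguments, merge() keeps only rows that match on all common columns, so merging two disjoint subsets of the same data frame yields zero rows. Stacking the subsets with rbind(), or subsetting once with %in%, keeps them all.

```r
# Hypothetical flight data and runway selection.
arrgnd <- data.frame(Flight = c("F1", "F2", "F3", "F4"),
                     Runway = c("31R", "31L", "22", "31R"),
                     stringsAsFactors = FALSE)
rwy <- c("31R", "31L")

# One subset per selected runway.
pieces <- lapply(rwy, function(r) arrgnd[arrgnd$Runway == r, ])

# An inner join of disjoint subsets has no matching rows.
joined <- merge(pieces[[1]], pieces[[2]])

# Stack the subsets row-wise instead.
arrw <- do.call(rbind, pieces)

# Equivalent single-step selection, without a loop.
same <- arrgnd[arrgnd$Runway %in% rwy, ]
print(arrw)
```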
I attach the data. I needed 500 rows to get both runways in rwy.
The suggestions did not
On 2/1/2010 5:51 PM, David Winsemius wrote:
I figured this out finally. I really believe that the R help write-ups
are sorely lacking. As soon as I looked at
http://www.statmethods.net/management/merging.html, it was obvious:
Adding Columns
To merge two dataframes (datasets) horizontally,
James Rome wrote:
On 2/1/2010 5:51 PM, David Winsemius wrote:
I figured this out finally. I really believe that the R help write-ups
are sorely lacking.
The help docs are probably not the best way to learn R, but they are
great for users of the functions. I have found that after going
I agree. I have a foot of books on R now, for example the R Book by
Michael Crawley. But so far, Googling the archives of this list has been
the most help. Nonetheless, if I cannot understand the documentation of
a function, then the documentation needs to be updated. For example,
there needs to be
Yeah, sometimes the vocabulary we bring to a task does not match up
(or merge properly) with the vocabulary that the developers use. In
this case the merge operation is one that has a precise meaning in
database lingo, which apparently you do not have background in. My
experience in
Dear kind R helpers,
I have a vector of runway names in rwy (31R, 31L,... the number is
user selectable)
arrgnd is a data frame with data for all flights and all runways, with a
Runway column.
I am trying to subset arrgnd into a data frame for each selected runway,
and then combine them back
On Feb 1, 2010, at 5:16 PM, James Rome wrote:
Dear kind R helpers,
I have a vector of runway names in rwy (31R, 31L,... the
number is user selectable)
arrgnd is a data frame with data for all flights and all runways,
with a Runway column.
I am trying to subset arrgnd into a data frame
Hi,
I have faced a problem with the merge() function when trying to merge
two data frames that have a common index but the second one does not
have cases for all indexes in the first one. With usual variables R
fills in the missing cases with NA if all=T is requested. But if the
variable is a
This has something to do with your data.frame structure
see
str(df1)
'data.frame': 3 obs. of 2 variables:
$ a : int 1 2 3
$ X1: 'AsIs' int [1:3, 1:2] 1 2 3 4 5 6
str(df2)
'data.frame': 2 obs. of 2 variables:
$ a : int 1 2
$ X2: 'AsIs' int [1:2, 1:2] 11 12 13 14
This seems to work
Yes, that was the original question: when a variable in a data frame is
a matrix instead of an ordinary variable merge() handles the missing
cases so that only the first column of the matrix gets NA and the rest
are recycled. If the matrix is broken to several variables everything
works fine.
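A minimal sketch of the workaround mentioned above, breaking each matrix column into ordinary columns before merging. The data frames mirror the str() output shown earlier; the helper `flatten` and the generated column names are assumptions made for this illustration.

```r
# Data frames with a matrix stored as a single column (as in the str() output).
df1 <- data.frame(a = 1:3, X1 = I(matrix(1:6, 3, 2)))
df2 <- data.frame(a = 1:2, X2 = I(matrix(11:14, 2, 2)))

# Hypothetical helper: replace matrix column 'col' by one plain column
# per matrix column, named col.1, col.2, ...
flatten <- function(d, col) {
  m <- as.data.frame(unclass(d[[col]]))
  names(m) <- paste(col, seq_len(ncol(m)), sep = ".")
  cbind(d[setdiff(names(d), col)], m)
}

flat1 <- flatten(df1, "X1")
flat2 <- flatten(df2, "X2")

# With ordinary columns, every unmatched cell is filled with NA.
merged <- merge(flat1, flat2, by = "a", all = TRUE)
print(merged)
```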
Why
You are exceeding your maximum memory here, so R will not be able to do that.
Dump both tables into a database such as MySQL and then run the query either from
RMySQL or from MySQL directly. Then output the result and import it back into R.
That will take care of the merge, but I am not sure what will happen when
Hello
I have two data frames, SNP4 and SNP1:
head(SNP4)
Animal Marker Y
3213 194073197 P1001 0.021088
1295 194073197 P1002 0.021088
915 194073197 P1004 0.021088
2833 194073197 P1005 0.021088
1487 194073197 P1006 0.021088
1885 194073197 P1007 0.021088
head(SNP1)
Try this (where SNP1x is same as SNP1 from your post
but without the last line). If the merge below does not work
on real data set due to size then try the sqldf alternative
as it
SNP1x <-
structure(list(Animal = c(194073197L, 194073197L, 194073197L,
194073197L, 194073197L), Marker =
On Apr 22, 2009, at 5:22 AM, Johannes G. Madsen wrote:
Hello
I have two data frames, SNP4 and SNP1:
head(SNP4)
Animal Marker Y
3213 194073197 P1001 0.021088
1295 194073197 P1002 0.021088
915 194073197 P1004 0.021088
2833 194073197 P1005 0.021088
1487 194073197
Hi,
How about this:
SNP5 <- merge(SNP4, SNP1[,2:3], all.x=TRUE)
SNP5
Marker Animal Y x
1 P1001 194073197 0.021088 2
2 P1002 194073197 0.021088 1
3 P1004 194073197 0.021088 2
4 P1005 194073197 0.021088 0
5 P1006 194073197 0.021088 2
6 P1007 194073197 0.021088 0
This
Thanks a lot, Gabor - it's perfect!
Dimitri
On Fri, Dec 19, 2008 at 6:24 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
Try this:
L <- list(data.frame(A=2, B=3, C=4),
  data.frame(A=2, B=1, C=3, D=2, E=4, F=5),
  data.frame(A=1, B=2, C=4, D=3, E=2, F=4, G=5, H=4, I=2))
library(plyr)
Hello, everyone!
I have list L that contains 99 data frames. All data frames have only
one row, but a different number of columns. Some data frames have 3
columns, some - 6 columns, and some - 9 columns. The names of the
first 3 columns are identical in all 99 data frames (e.g., A, B, and
C). The
Try this:
L <- list(data.frame(A=2, B=3, C=4),
  data.frame(A=2, B=1, C=3, D=2, E=4, F=5),
  data.frame(A=1, B=2, C=4, D=3, E=2, F=4, G=5, H=4, I=2))
library(plyr)
do.call(rbind.fill, L)
  A B C  D  E  F  G  H  I
1 2 3 4 NA NA NA NA NA NA
2 2 1 3  2  4  5 NA NA NA
3 1 2 4  3  2  4  5  4  2
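If plyr is not available, a rough base-R equivalent can be sketched as follows. This is an illustration under the assumption that padding the missing columns with NA is all that is needed; it is not how rbind.fill is implemented, and the names `all_cols`, `padded` and `out` are made up here.

```r
L <- list(data.frame(A = 2, B = 3, C = 4),
          data.frame(A = 2, B = 1, C = 3, D = 2, E = 4, F = 5),
          data.frame(A = 1, B = 2, C = 4, D = 3, E = 2, F = 4, G = 5, H = 4, I = 2))

# Union of all column names, in order of first appearance.
all_cols <- unique(unlist(lapply(L, names)))

# Add each missing column as NA, then put the columns in a common order.
padded <- lapply(L, function(d) {
  for (nm in setdiff(all_cols, names(d))) d[[nm]] <- NA_real_
  d[all_cols]
})

# All pieces now share the same columns, so a plain rbind works.
out <- do.call(rbind, padded)
print(out)
```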
Dear group,
I have 3 different data frames. I want to merge all 3
data frames for which there is intersection.
Say DF1 and DF2 have 100 common elements in column 1.
DF3 does not have many elements in common with either DF1 or
DF2.
For names in column 1 not present in DF3 I want to
introduce
DF1 <- data.frame(Name=as.factor(c("A","B","C")), Age= c(21,45,30))
DF2 <- data.frame(Name=as.factor(c("A","B","X")), Age= c(50,20,10))
DF3 <- data.frame(Name=as.factor(c("B","Y","K")), Age= c(40,21,30))
merge(merge(DF1,DF2, by.x= "Name", by.y="Name",
all=TRUE),DF3,by.x="Name",by.y="Name", all=TRUE);
Name Age.x Age.y
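Since every data frame has a column called Age, the nested merge relies on automatic suffixes to tell them apart. One possible variation, sketched here with plain character names instead of factors and with made-up column labels such as Age.DF1, is to rename the clashing column first and then merge the whole list with Reduce():

```r
# Same example data, stored as character names (an assumption for simplicity).
frames <- list(
  DF1 = data.frame(Name = c("A", "B", "C"), Age = c(21, 45, 30), stringsAsFactors = FALSE),
  DF2 = data.frame(Name = c("A", "B", "X"), Age = c(50, 20, 10), stringsAsFactors = FALSE),
  DF3 = data.frame(Name = c("B", "Y", "K"), Age = c(40, 21, 30), stringsAsFactors = FALSE)
)

# Tag each Age column with the name of the data frame it came from.
renamed <- Map(function(d, nm) {
  names(d)[names(d) == "Age"] <- paste0("Age.", nm)
  d
}, frames, names(frames))

# Full outer merge on Name; names missing from a data frame get NA.
merged <- Reduce(function(x, y) merge(x, y, by = "Name", all = TRUE), renamed)
print(merged)
```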