
[R] create list of names where two df contain == values

2011-11-16 Thread Rob Griffin

Hello again... sorry to be posting yet again, but I hadn't anticipated this 
problem.


I am now trying to put the names found in one column of data frame 1 (let's 
call it df.1[,1]) into a list, taking only the rows where the values in 
df.1[,2] match values in a column of another data frame (df.2[3]).
I tried to write this function so that it builds the list of names (called 
Iffy) where the 2 criteria (df.1[141] and df.2[21]) matched, but I think it's 
too complex for a beginner R-enthusiast.


ify <- function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
Iffy <- apply(  df.1,  1,  FUN=ify,  x=df.1,  y=df.2,  a=2,  b=3,  c=1  )

But this didn't work... Error in FUN(newX[, i], ...) : unused argument(s) 
(newX[, i])



Here is a dataset that replicates the problem. You'll notice that the 
criteria values for h differ between the two data frames, so the code 
should produce a list of the 9 letters where the two criteria columns 
matched (a,b,c,d,e,f,g,i,j):




df.1 <- data.frame(rep(letters[1:10]))
colnames(df.1)[1] <- "Letters"
set.seed(1)
df.1$numb1 <- rnorm(10,1,1)
df.1$extra.col <- c(1,2,3,4,5,6,7,8,9,10)
df.1$id <- c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.1

df.2 <- data.frame(rep(letters[1:10]))
colnames(df.2)[1] <- "Letters"
set.seed(1)
df.2$extra.col <- c(1,2,3,4,5,6,7,8,9,10)
df.2$numb1 <- rnorm(10,1,1)
df.2$id <- c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.2[8,3] <- 12

df.1
df.2




Your patience is much appreciated,
Rob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] create list of names where two df contain == values

2011-11-16 Thread R. Michael Weylandt michael.weyla...@gmail.com

I'm not at a computer now, so I can't take a close look at it, but I think the 
match() function can be helpful here. 

I'll try to get back to you with a fuller answer later. 

Michael


Re: [R] create list of names where two df contain == values

2011-11-16 Thread David Winsemius



On Nov 16, 2011, at 8:03 AM, Rob Griffin wrote:

> ify <- function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}


When you are building a helper function for use with apply, you  
should realize that the function will be getting a vector, not a list.  
The construction [[,a]] looks pretty strange as well. Generally  
column selection is done with one of [[a]] or [ , a]. I am not  
absolutely sure that you cannot have [[,]] but I was under the  
impression you could not. AND you shouldn't be returning NULLs if what  
you really want are NAs.



> Iffy <- apply(  df.1,  1,  FUN=ify,  x=df.1,  y=df.2,  a=2,  b=3,  c=1  )


So a single vector will be assigned to the x argument in the ify  
function and the rest of the arguments will be populated from the  
other arguments. You do NOT need to supply an x argument in that  
list and if you do so you will throw an error.


Furthermore you cannot expect the apply function to keep track of  
which row it's on for indexing a different data.frame. The mapply  
function might be used for this purpose but I am going to suggest a  
much cleaner solution below.
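
To make the point about apply concrete, here is a minimal sketch on a made-up three-row data frame (the names df, pick and other are hypothetical and not from the original post). It shows that each row reaches the helper as a plain vector, that extra named arguments are forwarded after the row, and that mapply can walk two equally long columns in parallel:

```r
# Toy data; column names are invented for this sketch.
df <- data.frame(id = c("a", "b", "c"), value = c(1.5, 2.5, 3.5),
                 stringsAsFactors = FALSE)

# apply() first converts the data frame to a matrix, so with mixed column
# types every row is handed to FUN as a character vector, not a list.
row_classes <- apply(df, 1, function(row) class(row))

# The row fills the first argument of FUN; further named arguments
# (here col) are passed along unchanged.
pick <- function(row, col) row[[col]]
ids <- apply(df, 1, pick, col = "id")

# mapply() steps through two vectors in parallel, which is one way to
# compare rows of two equally long columns element by element.
other <- c(1.5, 9, 3.5)
same <- mapply(function(u, v) u == v, df$value, other)
```

Naming the first argument of FUN in the apply call as well (as x=df.1 did above) leaves the row itself with no argument to fill, which is consistent with the "unused argument(s)" error reported.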






If you know that df.1 and df.2 have the same number of rows, then use  
the ifelse function, which is designed to work on vectors. The  
if (...) else construct is NOT:


 ifelse( df.1[,2] == df.2[,3], {as.character(df.1[,1])} ,  {NA} )
 [1] "a" "b" "c" "d" "e" "f" "g" NA  "i" "j"

The reason as.character was needed lies in the fact that you  
constructed df.1[,1] as a factor variable. As I understand it,  
ifelse tries to make it numeric to match the datatype of the  
comparison. I've never understood this, frankly. Maybe someone can  
educate me.
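
A small sketch of the factor behaviour described above, using made-up vectors (letters_factor and keep are hypothetical names). The documentation for ifelse says the result takes its length and attributes from the test, not from the yes value, so a factor passed as yes loses its class and levels and only its integer codes survive:

```r
letters_factor <- factor(c("a", "b", "c"))
keep <- c(TRUE, FALSE, TRUE)

# Factor passed directly: the result holds the integer codes, not the labels.
codes <- ifelse(keep, letters_factor, NA)

# Converting to character first keeps the labels.
labels <- ifelse(keep, as.character(letters_factor), NA)
```

Note that data.frame() only created factors from character columns by default in R versions before 4.0.0; in later versions the column is character already and as.character() is harmless but unnecessary.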


If you wanted a function that allowed you to specify the columns and  
data frames, then consider this:


ret3.m1.eq.n2 <- function(df1, df2, col1, col2, col3) {
    # NA where df1[, col1] and df2[, col2] differ, else the label in df1[, col3]
    ifelse( df1[,col1] == df2[,col2],
            as.character(df1[,col3]), NA )
}








David Winsemius, MD
West Hartford, CT


Re: [R] create list of names where two df contain == values

2011-11-16 Thread Dennis Murphy

Hi:

I think you're overthinking this problem. As is usually the case in R,
a vectorized solution is clearer and provides more easily understood
code.

It's not obvious to me exactly what you want, so we'll try a couple of
variations on the same idea. Equality of floating point numbers is a
difficult computational problem (see R FAQ 7.31), but if it makes
sense to define a threshold difference between floating numbers that
practically equates to zero, then you're in business. In your example,
the difference in numb1 for letter h in the two data frames is far
from zero, so define 'equal' to be a difference < 10^{-6}. Then:

# Return the entire matching data frame
df.1[abs(df.1$numb1 - df.2$numb1) < 0.01, ]
   Letters     numb1 extra.col    id
1        a 0.3735462         1 CG234
2        b 1.1836433         2 CG232
3        c 0.1643714         3 CG441
4        d 2.5952808         4 CG128
5        e 1.3295078         5 CG125
6        f 0.1795316         6 CG182
7        g 1.4874291         7 CG982
9        i 1.5757814         9 CG282
10       j 0.6946116        10 CG154

# Return the matching letters only as a vector:
df.1[abs(df.1$numb1 - df.2$numb1) < 0.01, 'Letters' ]

If you want the latter object to remain a data frame, use drop = FALSE
as an extra argument after 'Letters'. If you want to create a list
object such that each letter comprises a different list component,
then the following will do - the as.character() part coerces the
factor Letters into a character object:

as.list(as.character(df.1[abs(df.1$numb1 - df.2$numb1) < 0.01,
                          'Letters' ]))
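
For readers unfamiliar with R FAQ 7.31, here is a minimal sketch of why a tolerance is used instead of ==. The variable names and the tolerance value are chosen only for illustration:

```r
sum_val <- 0.1 + 0.2

# Exact comparison fails because 0.1 + 0.2 is not stored as exactly 0.3.
exact <- sum_val == 0.3

# Comparing the absolute difference against a small threshold succeeds.
tol <- 1e-6
close_enough <- abs(sum_val - 0.3) < tol

# all.equal() applies a default tolerance; wrap it in isTRUE() because it
# returns a description of the difference rather than FALSE.
also_close <- isTRUE(all.equal(sum_val, 0.3))
```

What counts as a sensible threshold depends on the scale of the data, so the value above is only a placeholder.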

HTH,
Dennis



Re: [R] create list of names where two df contain == values

2011-11-16 Thread Rob Griffin

Ok, thanks for looking into this so far. I seem to have confused you all a 
little, though, so I think I need to make this a bit clearer.


In the real situation:
df.1 is 271*13891 and contains (amongst others) columns with Flybase.CG, 
rMF, and Affyid values.
df.2 is 14*12572 and is made from a subset of df.1 which removed rows with 
duplicated Flybase.CG values; df.2 also includes the rMF column.

Because df.2 is made from the non-duplicated values, it is shorter.

I now need to put the Affyid column from df.1 into df.2.

My idea is:
to match on a value that is unique to each row (within its column) but 
shared by both datasets (rMF contains such numbers), 
then get R to copy the corresponding Affyid value (an alphanumeric id) from 
df.1 and place it in df.2$Affy (or at least into a list which I could then 
put into a column) for all shared rMF values, ignoring all others.


For example, df.1 and df.2 both contain the rMF value 0.3393211, which 
corresponds to the same data point and in df.1 has this Affyid: 1638273_at, 
but the value occurs on different rows in the two data frames.


If you imagine the two rMF columns lined up next to each other, they start 
the same and run in the same order, but df.2's has had random points 
removed (as was the aim of making df.2), so from the first removed point 
onward the rest of the list doesn't line up.
What R needs to do is go down the df.2 rMF list one by one and, for each 
df.2 rMF, check the entire df.1 rMF list for a match, then take the 
corresponding Affyid.


Is that a bit clearer? I know this is pretty complex.

David, your idea with ifelse worked for the first few lines, but as soon as 
it got to a point where one of the Flybase.CG values had been removed during 
the process of making df.2, the two data frames got out of line and it 
just gave NA from there on.
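
The misalignment described above can be reproduced on a tiny made-up example (the data frames full and reduced and all values in them are invented for this sketch). Once a row is missing from the shorter table, a position-by-position comparison fails from that row onward, even though every remaining value is still present in the longer table:

```r
full <- data.frame(rMF = c(0.11, 0.22, 0.33, 0.44, 0.55),
                   Affyid = c("id1_at", "id2_at", "id3_at", "id4_at", "id5_at"),
                   stringsAsFactors = FALSE)

# The shorter table: the same values with the second row removed.
reduced <- full[-2, "rMF", drop = FALSE]

# Comparing by position: only the rows before the removed one line up.
positional <- full$rMF[seq_len(nrow(reduced))] == reduced$rMF

# Comparing by value: every value in the shorter table is found in the
# longer one, wherever it sits.
by_value <- reduced$rMF %in% full$rMF
```

Here positional is TRUE only for the first row, while by_value is TRUE for all four rows.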



Rob





Re: [R] create list of names where two df contain == values

2011-11-16 Thread Rob Griffin

As another potential route, could I put something into the original code 
that makes df.2 (maindata2), so that it picks one of the AffyIds at random 
for the duplicated FlybaseCG values? The code is shown below:


maindata2 <- aggregate(maindata[, c(161,172,168,255,254,258,264,265,263,271)], 
                       by = maindata[, 167, drop = FALSE], mean)
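
One possible way to do this, sketched on a made-up stand-in for maindata (the column names Flybase.CG, Affyid and rMF follow the description earlier in the thread; the values and the helper names means and picked are invented). It assumes that averaging the numeric columns and picking one id per group can be done as two separate aggregate calls that are then merged:

```r
set.seed(42)  # only so that the random pick is repeatable

# Small stand-in for maindata; the real table has many more columns.
maindata <- data.frame(
  Flybase.CG = c("CG1", "CG1", "CG2", "CG3", "CG3"),
  Affyid     = c("a_at", "b_at", "c_at", "d_at", "e_at"),
  rMF        = c(0.1, 0.3, 0.5, 0.2, 0.4),
  stringsAsFactors = FALSE
)

# Average the numeric column per Flybase.CG value, as in the call above.
means <- aggregate(maindata["rMF"], by = maindata["Flybase.CG"], FUN = mean)

# Pick one Affyid at random within each Flybase.CG group.
picked <- aggregate(maindata["Affyid"], by = maindata["Flybase.CG"],
                    FUN = function(ids) ids[sample.int(length(ids), 1)])

# Combine the two summaries on the grouping column.
maindata2 <- merge(means, picked, by = "Flybase.CG")
```

Whether a randomly chosen id is meaningful next to values averaged over several probes is a separate question that the code does not settle.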


Rob


Re: [R] create list of names where two df contain == values

2011-11-16 Thread R. Michael Weylandt

Perhaps, if R FAQ 7.31 isn't a problem, this would work.

(df.1$AffyIds)[match(df.2$rMF, df.1$rMF)]
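
A minimal sketch of how that line behaves, on made-up data. Only the rMF value 0.3393211 and the id 1638273_at come from the thread; the other values and ids are invented, and the column name AffyIds follows the line above:

```r
df.1 <- data.frame(
  rMF     = c(0.3393211, 0.5120034, 0.7781209, 0.9015547),
  AffyIds = c("1638273_at", "made_up_2_at", "made_up_3_at", "made_up_4_at"),
  stringsAsFactors = FALSE
)

# df.2 keeps a subset of the rows, so positions no longer line up with df.1.
df.2 <- data.frame(rMF = df.1$rMF[c(1, 3, 4)])

# For each df.2 value, match() gives the row of its first exact match in
# df.1, or NA if there is none.
idx <- match(df.2$rMF, df.1$rMF)
df.2$Affy <- df.1$AffyIds[idx]

# A value that is absent from df.1 yields NA rather than an error.
missing_idx <- match(0.5, df.1$rMF)

# merge() is an alternative that joins on the shared column directly.
merged <- merge(df.2["rMF"], df.1, by = "rMF")
```

Because match() compares values exactly, this relies on the rMF values in df.2 being copies of those in df.1 rather than recomputed numbers, which is the FAQ 7.31 caveat.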

Michael

