[R] create list of names where two df contain == values
Hello again... sorry to be posting yet again, but I hadn't anticipated this problem.

I am now trying to put the names found in one column of data frame 1 (let's call it df.1[,1]) into a list, taking only the rows where the values in df.1[,2] match the values in a column of another data frame (df.2[,3]).

I tried to write this function so that it puts the names into a list (called Iffy) where the 2 criteria (df.1[141] and df.2[21]) match, but I think it's too complex for a beginner R-enthusiast:

ify <- function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}
Iffy <- apply( df.1, 1, FUN=ify, x=df.1, y=df.2, a=2, b=3, c=1 )

But this didn't work...

Error in FUN(newX[, i], ...) : unused argument(s) (newX[, i])

Here is a dataset that replicates the problem. You'll notice the "h" criteria values are different between the two data frames, and therefore it should produce a list of the 9 letters where the two criteria columns match (a,b,c,d,e,f,g,i,j):

df.1 <- data.frame(rep(letters[1:10]))
colnames(df.1)[1] <- ("Letters")
set.seed(1)
df.1$numb1 <- rnorm(10,1,1)
df.1$extra.col <- c(1,2,3,4,5,6,7,8,9,10)
df.1$id <- c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.1

df.2 <- data.frame(rep(letters[1:10]))
colnames(df.2)[1] <- ("Letters")
set.seed(1)
df.2$extra.col <- c(1,2,3,4,5,6,7,8,9,10)
df.2$numb1 <- rnorm(10,1,1)
df.2$id <- c("CG234","CG232","CG441","CG128","CG125","CG182","CG982","CG541","CG282","CG154")
df.2[8,3] <- 12

df.1
df.2

Your patience is much appreciated,
Rob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] create list of names where two df contain == values
I'm not at a computer now, so I can't take a close look at it, but I think the match() function can be helpful here. I'll try to get back to you with a fuller answer later.

Michael
Re: [R] create list of names where two df contain == values
On Nov 16, 2011, at 8:03 AM, Rob Griffin wrote:

> I tried to write this function so that it put the list of names (called Iffy) where the 2 criteria (df.1[141] and df.2[21]) matched but I think its too complex for a beginner R-enthusiast
>
> ify <- function(x,y,a,b,c) if(x[[,a]]==y[[,b]]) {list(x[[,c]])} else {NULL}

When you are building a helper function for use with apply, you should realize that the function will be getting a vector, not a list. The construction [[,a]] looks pretty strange as well. Generally column selection is done with one of [[a]] or [ , a]. I am not absolutely sure that you cannot have [[,]], but I was under the impression you could not. And you shouldn't be returning NULLs if what you really want are NAs.

> Iffy <- apply( df.1, 1, FUN=ify, x=df.1, y=df.2, a=2, b=3, c=1 )

So a single vector will be assigned to the x argument in the ify function, and the rest of the arguments will be populated from the other arguments. You do NOT need to supply an x argument in that list, and if you do so you will throw an error. Furthermore, you cannot expect the apply function to keep track of which row it's on for indexing a different data.frame. The mapply function might be used for this purpose, but I am going to suggest a much cleaner solution below.

> But this didn't work...
>
> Error in FUN(newX[, i], ...) : unused argument(s) (newX[, i])
>
> Here is a dataset that replicates the problem, you'll notice the h criteria values are different between the two dataframes and therefore it would produce a list of the 9 letters where the two criteria columns matched (a,b,c,d,e,f,g,i,j):

If you know that df.1 and df.2 have the same number of rows, then use the ifelse function, which is designed to work on vectors. The if () ... else construct is NOT.

ifelse( df.1[,2] == df.2[,3], {as.character(df.1[,1])} , {NA} )
 [1] "a" "b" "c" "d" "e" "f" "g" NA  "i" "j"

The reason as.character was needed lies in the fact that you constructed df.1[,1] as a factor variable. As I understand it, ifelse tries to make it numeric to match the datatype of the comparison. I've never understood this, frankly. Maybe someone can educate me.

If you wanted a function that allowed you to specify the columns and data frames, then consider this:

ret3.m1.eq.n2 <- function(df1, df2, col1, col2, col3){
    ifelse( df1[,col1] == df2[,col2], {as.character(df1[,col3])} , {NA} )
}

David Winsemius, MD
West Hartford, CT
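The factor behaviour discussed above can be reproduced with a small sketch. It rebuilds a cut-down version of the thread's toy data; stringsAsFactors = TRUE is set explicitly as an assumption, because recent R versions no longer create factors by default. The names `same`, `codes` and `labs` are made up for illustration.

```r
# Cut-down version of the thread's toy data. stringsAsFactors = TRUE is an
# assumption that reproduces the 2011 default (first column is a factor).
df.1 <- data.frame(Letters = letters[1:10], stringsAsFactors = TRUE)
set.seed(1)
df.1$numb1 <- rnorm(10, 1, 1)

df.2 <- data.frame(Letters = letters[1:10], stringsAsFactors = TRUE)
df.2$extra.col <- 1:10
set.seed(1)
df.2$numb1 <- rnorm(10, 1, 1)
df.2[8, 3] <- 12                      # make row "h" differ, as in the thread

same <- df.1[, 2] == df.2[, 3]        # element-wise comparison of the criteria

# Passing the factor directly: ifelse() fills a plain vector shaped like the
# test, so the factor attributes are dropped and only integer codes remain.
codes <- ifelse(same, df.1[, 1], NA)

# Converting to character first keeps the labels.
labs <- ifelse(same, as.character(df.1[, 1]), NA)

print(codes)
print(labs)
```

As far as I can tell, the integer result is not ifelse trying to match the type of the comparison; it comes from the factor's attributes being lost when its values are copied into the result vector.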
Re: [R] create list of names where two df contain == values
Hi:

I think you're overthinking this problem. As is usually the case in R, a vectorized solution is clearer and provides more easily understood code. It's not obvious to me exactly what you want, so we'll try a couple of variations on the same idea.

Equality of floating point numbers is a difficult computational problem (see R FAQ 7.31), but if it makes sense to define a threshold difference between floating numbers that practically equates to zero, then you're in business. In your example, the difference in numb1 for letter h in the two data frames is far from zero, so define 'equal' to be a difference < 10^{-6} (the code below uses the looser threshold 0.01, which selects the same rows here). Then:

# Return the entire matching data frame
df.1[abs(df.1$numb1 - df.2$numb1) < 0.01, ]
   Letters     numb1 extra.col    id
1        a 0.3735462         1 CG234
2        b 1.1836433         2 CG232
3        c 0.1643714         3 CG441
4        d 2.5952808         4 CG128
5        e 1.3295078         5 CG125
6        f 0.1795316         6 CG182
7        g 1.4874291         7 CG982
9        i 1.5757814         9 CG282
10       j 0.6946116        10 CG154

# Return the matching letters only as a vector:
df.1[abs(df.1$numb1 - df.2$numb1) < 0.01, 'Letters' ]

If you want the latter object to remain a data frame, use drop = FALSE as an extra argument after 'Letters'. If you want to create a list object such that each letter comprises a different list component, then the following will do (the as.character() part coerces the factor Letters into a character object):

as.list(as.character(df.1[abs(df.1$numb1 - df.2$numb1) < 0.01, 'Letters' ]))

HTH,
Dennis
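To illustrate the R FAQ 7.31 point with something concrete, here is a minimal sketch of a tolerance-based comparison. The helper name `near_equal` and the default tolerance of 1e-6 are assumptions chosen for illustration, not anything defined in base R or in this thread.

```r
# Hypothetical helper: element-wise "equal within a tolerance".
near_equal <- function(x, y, tol = 1e-6) {
  abs(x - y) < tol
}

x <- c(0.1 + 0.2, 1.5, 2)
y <- c(0.3,       1.5, 12)
nm <- c("a", "b", "c")

exact  <- x == y              # strict comparison trips over 0.1 + 0.2 vs 0.3
approx <- near_equal(x, y)    # tolerance comparison treats them as equal

print(exact)
print(approx)
print(nm[approx])             # names of the rows whose values agree
```

The right tolerance depends on the scale of the data; a value that suits numbers around 1 may be far too loose or too strict for very small or very large measurements.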
Re: [R] create list of names where two df contain == values
Ok, thanks for looking into this so far. I seem to have confused you all a little, though, so I think I need to make this a bit clearer.

In the real situation:

df.1 is 271*13891, and contains (amongst others) columns with Flybase.CG, rMF, and Affyid values.

df.2 is 14*12572 and is made from a subset of df.1 which removed rows with duplicated Flybase.CG values. df.2 also includes the rMF column. Because df.2 is made from the non-duplicated values, it is shorter.

I now need to put the Affyid column from df.1 into df.2. My idea is to match on a value that is unique to each row (within its column) but shared by both datasets (rMF contains such numbers), then get R to copy the corresponding Affyid value (an alphanumeric id) from df.1 and place it in df.2$Affy (or at least into a list which I could then put into a column), for all shared rMF values, ignoring all others.

For example, df.1 and df.2 both contain the rMF value 0.3393211, which corresponds to the same data point and which in df.1 has this Affyid: 1638273_at. The two occur on different rows in the two data frames.

If you imagine the two rMF columns lined up next to each other, they start the same and run in the same order, but df.2's has had random points removed (as was the aim of making df.2), so as soon as you get to that point the rest of the list doesn't line up. What R needs to do is go down the df.2 rMF list one by one, and for each df.2 rMF check the entire df.1 rMF list for a match, then take the corresponding Affyid.

Is that a bit clearer? I know this is pretty complex.

David, your idea with ifelse worked for the first few lines, but as soon as it got to a point where one of the Flybase.CG values had been removed during the process of making df.2, the two data frames got out of line and it just gave NA from there on.

Rob
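Taken literally, the row-by-row procedure described in this message could be sketched as follows. This is only an illustration on made-up data: apart from the rMF value 0.3393211 and the Affyid 1638273_at mentioned in the thread, every value is invented, and the tolerance `tol` is an assumption to sidestep floating-point equality problems.

```r
# Toy stand-ins for the real data frames; all values except the first
# rMF/Affyid pair are hypothetical.
df.1 <- data.frame(
  rMF    = c(0.3393211, 0.5120034, 0.7741002, 0.9005513),
  Affyid = c("1638273_at", "toy_B_at", "toy_C_at", "toy_D_at"),
  stringsAsFactors = FALSE
)
df.2 <- data.frame(rMF = c(0.3393211, 0.7741002, 0.1111111))

tol <- 1e-9                       # assumed tolerance for "same rMF value"
df.2$Affy <- NA_character_        # start with no match anywhere

# Go down df.2 one row at a time and scan all of df.1 for a matching rMF.
for (i in seq_len(nrow(df.2))) {
  hit <- which(abs(df.1$rMF - df.2$rMF[i]) < tol)
  if (length(hit) > 0) {
    df.2$Affy[i] <- df.1$Affyid[hit[1]]   # take the first match if several
  }
}

print(df.2)
```

A loop like this is easy to follow but scans df.1 once per row of df.2, so on data frames with thousands of rows a vectorized lookup is likely to be noticeably faster.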
Re: [R] create list of names where two df contain == values
As another potential route, could I put something into the original code that makes df.2 (maindata2), so that it picks one of the AffyIds at random for the duplicated FlybaseCG values? The code is shown below:

maindata2 <- aggregate(maindata[,c(161,172,168,255,254,258,264,265,263,271)], by = maindata[,167, drop = F], mean)

Rob
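One possible way to carry an id through the aggregation step is sketched below, under heavy assumptions: the toy `maindata`, its column names, and the helper `pick_one` are all invented for illustration, and the real data would use the column positions from the aggregate() call in this message. The idea is to aggregate the numeric columns with mean as before, aggregate the id column with a function that picks one id per group, and merge the two results.

```r
# Hypothetical miniature of maindata: two genes have duplicated probes.
maindata <- data.frame(
  Flybase.CG = c("CG1", "CG1", "CG2", "CG3", "CG3"),
  Affyid     = c("p1_at", "p2_at", "p3_at", "p4_at", "p5_at"),
  rMF        = c(0.2, 0.4, 0.5, 0.1, 0.3),
  stringsAsFactors = FALSE
)

# Numeric columns: group means, as in the original aggregate() call.
means <- aggregate(maindata["rMF"], by = maindata["Flybase.CG"], FUN = mean)

# Id column: pick one id at random within each group. Indexing through
# sample.int() avoids surprises when a group has a single element.
pick_one <- function(ids) ids[sample.int(length(ids), 1)]
set.seed(42)                      # only to make the random choice repeatable
ids <- aggregate(maindata["Affyid"], by = maindata["Flybase.CG"], FUN = pick_one)

# Combine the two summaries on the grouping column.
maindata2 <- merge(means, ids, by = "Flybase.CG")
print(maindata2)
```

If any single id per gene will do, keeping the first row of each group with `maindata[!duplicated(maindata$Flybase.CG), ]` is a simpler alternative, although it keeps that row's values rather than the group means.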
Re: [R] create list of names where two df contain == values
Perhaps, if R FAQ 7.31 isn't a problem, this would work:

(df.1$AffyIds)[match(df.2$rMF, df.1$rMF)]

Michael
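A small sketch of the match() lookup on made-up data, including one case where R FAQ 7.31 does bite. All values except the rMF 0.3393211 and the id 1638273_at are invented, and rounding both columns to 6 decimal places is an assumption about how much precision the real rMF values carry.

```r
# Toy data; the value 0.1 + 0.2 is deliberately not exactly equal to 0.3.
df.1 <- data.frame(
  rMF     = c(0.3393211, 0.3, 0.75, 0.9),
  AffyIds = c("1638273_at", "toy_B_at", "toy_C_at", "toy_D_at"),
  stringsAsFactors = FALSE
)
df.2 <- data.frame(rMF = c(0.75, 0.3393211, 0.1 + 0.2, 0.123))

# Exact lookup: position in df.1 of each df.2 value, NA where none is found.
exact <- match(df.2$rMF, df.1$rMF)
df.2$Affy <- df.1$AffyIds[exact]

# Rounded lookup: compare both columns at an assumed common precision.
rounded <- match(round(df.2$rMF, 6), round(df.1$rMF, 6))
df.2$Affy.rounded <- df.1$AffyIds[rounded]

print(df.2)
```

If df.2 was built by subsetting df.1 without recomputing rMF, the exact lookup should already work, since the stored values are copied bit for bit. Rounding is mainly worth considering when the values were recomputed, and it can still miss pairs that fall on opposite sides of a rounding boundary.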