[R] lapply not reading arguments from the correct environment
Hello,

I am facing a problem with lapply which I think may be a bug. This is the most basic function in which I can reproduce it:

    myfun <- function() {
        foo = data.frame(1:10,10:1)
        foos = list(foo)
        fooCollumn=2
        cFoo = lapply(foos,subset,select=fooCollumn)
        return(cFoo)
    }

I am building a list of data frames, in each of which I want to keep only column 2 (obviously I would not do it this way in real life but that's just to demonstrate the bug).

If I execute the commands inline it works, but if I clean my environment, then define the function and then execute:

    myfun()

I get this error:

    Error in eval(expr, envir, enclos) : object fooCollumn not found

while fooCollumn is defined, in the function, right before lapply. In addition, if I define it outside the function and then execute the function:

    fooCollumn=1
    myfun()

it works but uses the value defined in the general environment and not the one defined in the function.

This is with R 2.5.0 on both OS X and Linux (Fedora Core 6).

What did I do wrong? Is this indeed a bug? An intended behavior?

Thanks in advance.

JiHO
---
http://jo.irisson.free.fr/

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] lapply not reading arguments from the correct environment
subset() was not defined inside myfun(); try this version instead:

    myfun <- function () {
        foo <- data.frame(1:10, 10:1)
        foos <- list(foo)
        fooCollumn <- 2
        my.subset <- function(...) subset(...)
        cFoo <- lapply(foos, my.subset, select = fooCollumn)
        cFoo
    }

    myfun()

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: jiho [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Friday, May 18, 2007 4:41 PM
Subject: [R] lapply not reading arguments from the correct environment

> Hello,
>
> I am facing a problem with lapply which I think may be a bug.
> <snip>
Re: [R] lapply not reading arguments from the correct environment
On Fri, 18 May 2007, jiho wrote:

> Hello,
>
> I am facing a problem with lapply which I think may be a bug. This is
> the most basic function in which I can reproduce it:
>
>     myfun <- function() {
>         foo = data.frame(1:10,10:1)
>         foos = list(foo)
>         fooCollumn=2
>         cFoo = lapply(foos,subset,select=fooCollumn)
>         return(cFoo)
>     }
>
> <snip>
>
> I get this error:
>
>     Error in eval(expr, envir, enclos) : object fooCollumn not found
>
> while fooCollumn is defined, in the function, right before lapply.
>
> <snip>
>
> This is with R 2.5.0 on both OS X and Linux (Fedora Core 6).
> What did I do wrong? Is this indeed a bug? An intended behavior?

No, it isn't a bug (though it may be confusing).

The problem is that subset() evaluates its select argument in an unusual way. Usually the argument would be evaluated inside myfun() and the value passed to lapply(), and everything would work as you expect. subset() bypasses the normal evaluation and explicitly evaluates the select argument in the calling frame, i.e., inside lapply(), where fooCollumn is not visible.

You could do

    lapply(foos, function(foo) subset(foo, select=fooCollumn))

capturing fooCollumn by lexical scope. In R this is often a better option than passing extra arguments to lapply (or other functions that take function arguments).

    -thomas
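To make the evaluation rule described above concrete, here is a minimal sketch. It is not the real source of subset(): pick() and apply_one() are hypothetical helpers written only for this illustration, with pick() imitating the "capture the expression, then evaluate it in the calling frame" behaviour and apply_one() standing in for lapply().

```r
# Hypothetical helper (not base R): treats `select` the way the thread says
# subset() does. It takes the unevaluated expression and evaluates it in the
# frame of whoever called pick().
pick <- function(x, select) {
  expr <- substitute(select)           # the expression, not its value
  cols <- eval(expr, parent.frame())   # looked up in the caller's frame
  x[cols]
}

# Hypothetical stand-in for lapply(): forwards `...` from its own frame.
apply_one <- function(x, FUN, ...) FUN(x, ...)

outer_fun <- function() {
  foo <- data.frame(a = 1:10, b = 10:1)
  fooCollumn <- 2

  # Called directly: the caller is outer_fun(), so fooCollumn is found.
  direct <- pick(foo, select = fooCollumn)

  # Called from an anonymous function defined here: the caller's enclosing
  # environment is outer_fun(), so fooCollumn is found by lexical scope.
  closure <- apply_one(foo, function(d) pick(d, select = fooCollumn))

  # Passed through `...`: the caller is apply_one(), whose frame and
  # enclosing environments do not contain fooCollumn (unless a global
  # variable of that name happens to exist).
  via_dots <- tryCatch(apply_one(foo, pick, select = fooCollumn),
                       error = function(e) conditionMessage(e))

  list(direct = direct, closure = closure, via_dots = via_dots)
}

res <- outer_fun()
```

In a clean session, res$direct and res$closure hold the second column, while res$via_dots holds the "object not found" message, which mirrors the three situations discussed in this thread.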
Re: [R] lapply not reading arguments from the correct environment
On 2007-May-18, at 17:09, Thomas Lumley wrote:

> The problem is that subset() evaluates its select argument in an
> unusual way.
> <snip>
> You could do
>
>     lapply(foos, function(foo) subset(foo, select=fooCollumn))
>
> capturing fooCollumn by lexical scope. In R this is often a better
> option than passing extra arguments to lapply (or other functions that
> take function arguments).

Thank you very much, this works well indeed.

I agree it is a bit confusing, to say the least. The point is that supplying other arguments in the ... of lapply worked for all other functions I tried before (mean, sd, summary and even spline), so it is really a problem with subset.

Anyway, R is great even with such little flaws here and there, and as long as the community is there to support it, it will rule.

Cheers,

JiHO
---
http://jo.irisson.free.fr/
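For contrast, a small sketch of the case that does work: passing an extra argument through the ... of lapply to a function that evaluates its arguments in the ordinary way. The function name trim_means and the variable trimAmount are made up for this illustration.

```r
trim_means <- function() {
  xs <- list(c(1, 2, 3, 100), c(4, 5, 6, 200))
  trimAmount <- 0.25                    # local to trim_means()
  # mean() simply uses the value of `trim`, so the argument is evaluated
  # where it was written, inside trim_means(), and the local variable is found.
  lapply(xs, mean, trim = trimAmount)
}

trimmed <- trim_means()
```

Because mean() never looks at the expression behind trim, only at its value, it does not matter that the call passes through lapply() on the way.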
Re: [R] lapply not reading arguments from the correct environment
You need to study carefully what the semantics of 'subset' are. The function body of myfun is not in the evaluation environment. (The issue is 'subset', not 'lapply': select is an *expression* and not a value.)

Hint: using subset() programmatically is almost always a mistake. R's subsetting function is '[': subset is a convenience wrapper.

On Fri, 18 May 2007, jiho wrote:

> Hello,
>
> I am facing a problem with lapply which I think may be a bug.
> <snip>
> What did I do wrong? Is this indeed a bug? An intended behavior?

It is a bug, in your function.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
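A short sketch of the "expression, not a value" point, on a made-up data frame (df and wanted are names chosen only for this example). Inside select, column names stand for column positions, which is convenient when typing at the prompt; in code, holding the selection in an ordinary variable and indexing with [ avoids the question of where an expression gets evaluated.

```r
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)

# Interactive convenience: `select` is an expression in which the column
# names x and y stand for positions 1 and 2, so a range of names works.
by_expr <- subset(df, select = x:y)

# Programmatic use: keep the selection in a plain value and index with `[`.
wanted <- c("x", "y")
by_value <- df[wanted]

same_columns <- identical(names(by_expr), names(by_value))
```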
Re: [R] lapply not reading arguments from the correct environment
In particular, we can use [ directly instead of subset. This is the same as your function except for the line marked ### :

    myfun2 <- function() {
        foo = data.frame(1:10,10:1)
        foos = list(foo)
        fooCollumn=2
        cFoo = lapply(foos, "[", fooCollumn) ###
        return(cFoo)
    }
    myfun2() # test

On 5/18/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:

> You need to study carefully what the semantics of 'subset' are. The
> function body of myfun is not in the evaluation environment. (The
> issue is 'subset', not 'lapply': select is an *expression* and not a
> value.)
>
> Hint: using subset() programmatically is almost always a mistake.
> R's subsetting function is '[': subset is a convenience wrapper.
> <snip>
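One detail worth knowing about the "[" form, sketched here on the same toy data frame: with a single index, as in the line marked ###, each result is still a data frame, whereas the two-index form foo[, 2] returns a plain vector for a single column. The variable names below are only for illustration.

```r
foo <- data.frame(1:10, 10:1)
foos <- list(foo)

# Single index, as in myfun2(): equivalent to foo[2], keeps a data frame.
one_index <- lapply(foos, "[", 2)

# Two indices: equivalent to foo[, 2], which drops to a vector.
two_index <- lapply(foos, function(d) d[, 2])

class(one_index[[1]])
class(two_index[[1]])
```

Which of the two is wanted depends on what the later code expects; adding drop = FALSE to the two-index form keeps the data frame.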
Re: [R] lapply not reading arguments from the correct environment
On 2007-May-18, at 18:21, Gabor Grothendieck wrote:

> In particular, we can use [ directly instead of subset. This is the
> same as your function except for the line marked ### :
> <snip>

Thank you very much. Indeed it is much better this way.

I got used to subset for data frames because [ does not work with negative named arguments while select does. E.g.:

    x[,-c("name1","name2")]

does not work while

    subset(x,select=-c(name1,name2))

works (it eliminates the columns named name1 and name2 from x).

But I guess in most cases another syntax can achieve the same thing with [, like:

    x[,-which(names(x)%in%c("name1","name2"))]

it's just a little less clear.

Thanks again.

JiHO
---
http://jo.irisson.free.fr/
Re: [R] lapply not reading arguments from the correct environment
On 5/18/07, jiho [EMAIL PROTECTED] wrote:

> <snip>
> But I guess in most cases another syntax can achieve the same thing
> with [, like:
>
>     x[,-which(names(x)%in%c("name1","name2"))]
>
> it's just a little less clear.

which is not needed. Using builtin CO2:

    CO2[ ! names(CO2) %in% c("Type", "conc") ]

or

    CO2[ setdiff(names(CO2), c("Type", "conc")) ]
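The two forms above can be checked side by side on the builtin CO2 data. drop_columns() is a hypothetical helper written only for this sketch, not something from the thread or from base R. One extra reason to prefer the logical or setdiff form over -which(...): when none of the names match, -which(...) becomes an empty index and selects no columns at all.

```r
# Hypothetical helper: drop columns by name using `[`, safe to call from
# inside other functions because `cols` is an ordinary value.
drop_columns <- function(df, cols) {
  df[setdiff(names(df), cols)]
}

unwanted <- c("Type", "conc")

by_logical <- CO2[!names(CO2) %in% unwanted]
by_setdiff <- drop_columns(CO2, unwanted)
by_subset  <- subset(CO2, select = -c(Type, conc))   # interactive equivalent

# The pitfall with which(): no match gives an empty index, so every
# column is dropped instead of none.
no_match <- CO2[-which(names(CO2) %in% "no_such_column")]

names(by_logical)
ncol(no_match)
```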