Marc,

That makes the difference between do.call and lapply crystal clear. Your explanation would make a nice FAQ entry.
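For the archives, the distinction can be sketched in a few lines. This is only a minimal illustration that reuses the mydata example from the quoted message below. The names combined and separate are made up for the sketch, and stringsAsFactors = TRUE is an assumption added so that id is a factor, as in Marc's str() output (newer versions of R no longer convert strings to factors by default).

```r
# Same example data as in the quoted message below.
# stringsAsFactors = TRUE is an assumption, set explicitly so that 'id'
# is a factor regardless of the R version in use.
mydata <- data.frame(
  id      = c("001", "001", "001", "002", "003", "003"),
  math    = c(80, 75, 70, 65, 65, 70),
  reading = c(65, 70, 88, NA, 90, NA),
  stringsAsFactors = TRUE
)

# Last row of each group: a list of three one-row data frames.
mylast <- lapply(split(mydata, mydata$id), tail, n = 1)

# do.call() makes ONE call, rbind(mylast[[1]], mylast[[2]], mylast[[3]]),
# so the result is a single data frame ('combined' is a hypothetical name).
combined <- do.call("rbind", mylast)

# lapply() makes THREE calls, rbind(mylast[[i]]), one per component,
# so the result is still a list ('separate' is a hypothetical name).
separate <- lapply(mylast, rbind)

class(combined)   # class of the single stacked result
nrow(combined)    # number of rows after stacking
class(separate)   # class of the per-component result
length(separate)  # number of components
```

Comparing class() and nrow() or length() on the two results is a quick way to see whether the function was called once on all components or once per component.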
Thanks!
Bob

=========================================================
Bob Muenchen (pronounced Min'-chen), Manager
Statistical Consulting Center
U of TN Office of Information Technology
200 Stokely Management Center, Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: [EMAIL PROTECTED]
Web: http://oit.utk.edu/scc,
News: http://listserv.utk.edu/archives/statnews.html
=========================================================

> -----Original Message-----
> From: Marc Schwartz [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 09, 2007 1:06 PM
> To: Muenchen, Robert A (Bob)
> Cc: [EMAIL PROTECTED]
> Subject: Re: do.call vs. lapply for lists
>
> On Mon, 2007-04-09 at 12:45 -0400, Muenchen, Robert A (Bob) wrote:
> > Hi All,
> >
> > I'm trying to understand the difference between do.call and lapply for
> > applying a function to a list. Below is one of the variations of
> > programs (by Marc Schwartz) discussed here recently to select the first
> > and last n observations per group.
> >
> > I've looked in several books, the R FAQ and searched the archives, but I
> > can't find enough to figure out why lapply doesn't do what do.call does
> > in this case. The help files & newsletter descriptions of do.call sound
> > like it would do the same thing, but I'm sure that's due to my lack of
> > understanding about their specific terminology. I would appreciate it if
> > you could take a moment to enlighten me.
> >
> > Thanks,
> > Bob
> >
> > mydata <- data.frame(
> >   id = c('001','001','001','002','003','003'),
> >   math = c(80,75,70,65,65,70),
> >   reading = c(65,70,88,NA,90,NA)
> > )
> > mydata
> >
> > mylast <- lapply( split(mydata,mydata$id), tail, n=1)
> > mylast
> > class(mylast) #It's a list, so lapply will do *something* with it.
> >
> > #This gets the desired result:
> > do.call("rbind", mylast)
> >
> > #This doesn't do the same thing, which confuses me:
> > lapply(mylast,rbind)
> >
> > #...and data.frame won't fix it as I've seen it do in other circumstances:
> > data.frame( lapply(mylast,rbind) )
>
> Bob,
>
> A key difference is that do.call() operates (in the above example) as if
> the actual call was:
>
> > rbind(mylast[[1]], mylast[[2]], mylast[[3]])
>    id math reading
> 3 001   70      88
> 4 002   65      NA
> 6 003   70      NA
>
> In other words, do.call() takes the quoted function and passes the list
> object as if it was a list of individual arguments. So rbind() is only
> called once.
>
> In this case, rbind() internally handles all of the factor level issues,
> etc. to enable a single common data frame to be created from the three
> independent data frames contained in 'mylast':
>
> > str(mylast)
> List of 3
>  $ 001:'data.frame': 1 obs. of 3 variables:
>   ..$ id     : Factor w/ 3 levels "001","002","003": 1
>   ..$ math   : num 70
>   ..$ reading: num 88
>  $ 002:'data.frame': 1 obs. of 3 variables:
>   ..$ id     : Factor w/ 3 levels "001","002","003": 2
>   ..$ math   : num 65
>   ..$ reading: num NA
>  $ 003:'data.frame': 1 obs. of 3 variables:
>   ..$ id     : Factor w/ 3 levels "001","002","003": 3
>   ..$ math   : num 70
>   ..$ reading: num NA
>
> On the other hand, lapply() (as above) calls rbind() _separately_ for
> each component of mylast. It therefore acts as if the following series
> of three separate calls were made:
>
> > rbind(mylast[[1]])
>    id math reading
> 3 001   70      88
>
> > rbind(mylast[[2]])
>    id math reading
> 4 002   65      NA
>
> > rbind(mylast[[3]])
>    id math reading
> 6 003   70      NA
>
> Of course, the result of lapply() is that the above are combined into a
> single R list object and returned:
>
> > lapply(mylast, rbind)
> $`001`
>    id math reading
> 3 001   70      88
>
> $`002`
>    id math reading
> 4 002   65      NA
>
> $`003`
>    id math reading
> 6 003   70      NA
>
> It is a subtle, but of course critical, difference in how the internal
> function is called and how the arguments are passed.
>
> Does that help?
>
> Regards,
>
> Marc Schwartz
>
______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
