Yes, this was a little bug that will be fixed in the next release.
Hadley
On Thu, Sep 16, 2010 at 1:11 PM, Dylan Beaudette
debeaude...@ucdavis.edu wrote:
Hi,
I have been trying to use the new .parallel argument with the most recent
version of plyr [1] to speed up some tasks. I can run the example in the NEWS
file [1], and it seems to be working correctly. However, R will only use a
single core when I try to apply this same approach with ddply().
1. http://cran.r-project.org/web/packages/plyr/NEWS
Watching my CPUs I see that in both cases only a single core is used, and they
take about the same amount of time. Is there a limitation with how ddply()
dispatches parallel jobs, or is this task not suitable for parallel
computing?
Cheers,
Dylan
Here is an example:
library(plyr)
library(doMC)
registerDoMC(cores=2)
# example data
d <- data.frame(y=rnorm(1000), id=rep(letters[1:4], each=500))
# function that wastes some time
f <- function(x) {
  m <- vector(length=1)
  for(i in 1:1) {
    m[i] <- mean(sample(x$y, 100))
  }
  mean(m)
}
system.time(ddply(d, .(id), .fun=f, .parallel=FALSE))
# user system elapsed
# 2.740 0.016 2.766
system.time(ddply(d, .(id), .fun=f, .parallel=TRUE))
# user system elapsed
# 2.720 0.000 2.726
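One way to tell whether the problem lies in plyr or in the parallel backend is to run the same kind of task through foreach directly and compare timings. The following is only a minimal sketch, assuming the foreach and doMC packages are installed on a Unix-like system; slow_mean, its n argument, and the 2000-row data frame are illustrative choices, not part of the example above.

```r
library(foreach)
library(doMC)

registerDoMC(cores = 2)

# Inspect the registered backend before timing anything
getDoParRegistered()   # is a parallel backend registered?
getDoParName()         # name of the registered backend
getDoParWorkers()      # number of workers the backend reports

# Example data: four groups of 500 rows each (illustrative sizes)
d <- data.frame(y = rnorm(2000), id = rep(letters[1:4], each = 500))
pieces <- split(d, d$id)

# Hypothetical helper that wastes some time on one group;
# n controls how many resampled means are averaged
slow_mean <- function(x, n = 1000) {
  mean(replicate(n, mean(sample(x$y, 100))))
}

# Same work, one task per group: sequential (%do%) vs. parallel (%dopar%)
t_seq <- system.time(
  res_seq <- foreach(p = pieces, .combine = c) %do% slow_mean(p)
)
t_par <- system.time(
  res_par <- foreach(p = pieces, .combine = c) %dopar% slow_mean(p)
)

print(t_seq)
print(t_par)
```

If the elapsed time for the %dopar% run is clearly lower than for the %do% run, the backend itself is using more than one core, which would point to the ddply() call rather than the doMC setup.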
--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/