Dear Luke,

Thank you, this makes perfect sense.
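To make the distinction concrete, here is a minimal sketch that puts the two calls from this thread side by side. It assumes a local two-worker cluster can be started, and it uses a directory under tempdir() as the extra library location. That directory and the variable names after_value and after_name are illustrative choices, not part of the original report.

```r
library(parallel)

# Assumption: a throwaway directory stands in for the extra library location.
libdir <- file.path(tempdir(), "tmplib")
dir.create(libdir, showWarnings = FALSE, recursive = TRUE)
# .libPaths() stores normalized paths with forward slashes,
# so normalize the same way before comparing.
libdir <- normalizePath(libdir, winslash = "/")

cl <- makeCluster(2)

# Function value: a serialized copy of the main process's .libPaths
# (including its enclosing environment) is sent to each worker and called there.
invisible(clusterCall(cl, .libPaths, c(libdir, .libPaths())))
after_value <- clusterEvalQ(cl, .libPaths())

# Function name: each worker looks up its own .libPaths and calls that one.
invisible(clusterCall(cl, ".libPaths", c(libdir, .libPaths())))
after_name <- clusterEvalQ(cl, .libPaths())

stopCluster(cl)

# One logical per worker: does that worker now have libdir on its library path?
print(vapply(after_value, function(p) libdir %in% p, logical(1)))
print(vapply(after_name, function(p) libdir %in% p, logical(1)))
```

On the R versions mentioned in this thread, the first call is the one reported not to change the workers' library paths, while the second is the form suggested as the fix. I have not checked other versions.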
I find it quite hard to express this issue in a way that is both compact
and understandable. In any case, below you find a proposal for an update
of the documentation.

Thank you again for all your work,
Mark

Index: src/library/parallel/man/clusterApply.Rd
===================================================================
--- src/library/parallel/man/clusterApply.Rd (revision 79673)
+++ src/library/parallel/man/clusterApply.Rd (working copy)
@@ -136,6 +136,15 @@
   more efficient than \code{parApply} but do less post-processing of the
   result.
 
+  Functions with a \code{fun} or \code{FUN} parameter send a serialized
+  copy of the argument from the main process to each worker node.
+  When the argument passed to \code{fun} or \code{FUN} is a function
+  this is equivalent to calling the same function on the worker node,
+  except when the function has an enclosing environment it modifies.
+  A notable example is \code{\link{.libPaths}}. To ensure that the
+  function local to each worker is called so it modifies its local
+  enclosing environment, pass the name of the function as a string.
+
   A chunk size of \code{0} with static scheduling uses the default (one
   chunk per node). With dynamic scheduling, chunk size of \code{0} has the
   same effect as \code{1} (one invocation of \code{FUN}/\code{fun} per

On Tue, Dec 22, 2020 at 2:37 PM <luke-tier...@uiowa.edu> wrote:

> On Tue, 22 Dec 2020, Mark van der Loo wrote:
>
> > Dear all,
> >
> > It is not possible to set library paths on worker nodes with
> > parallel::clusterCall (or snow::clusterCall) and I wonder if this is
> > intended behavior.
> >
> > Example.
> >
> > library(parallel)
> > libdir <- "./tmplib"
> > if (!dir.exists(libdir)) dir.create("./tmplib")
> >
> > cl <- makeCluster(2)
> > clusterCall(cl, .libPaths, c(libdir, .libPaths()) )
> >
> > The output is as expected with the extra libdir returned for each worker
> > node. However, running
> >
> > clusterEvalQ(cl, .libPaths())
> >
> > Shows that the library paths have not been set.
>
> Use this:
>
> clusterCall(cl, ".libPaths", c(libdir, .libPaths()) )
>
> This will find the function .libPaths on the workers.
>
> Your clusterCall sends across a serialized copy of your process'
> .libPaths and calls that. Usually that is equivalent to calling the
> function found by the name you used on the workers, but not when the
> function has an enclosing environment that the function modifies by
> assignment.
>
> Alternate implementations of .libPaths that are more
> serialization-friendly are possible in principle but probably not
> practical given limitations of the base package.
>
> The distinction between providing a function value or a character
> string as the function argument to clusterCall and others could
> probably use a paragraph in the help file; happy to consider a patch
> if anyone wants to take a crack at it.
>
> Best,
>
> luke
>
> >
> > If this is indeed a bug, I'm happy to file it at bugzilla. Tested on R
> > 4.0.3 and r-devel.
> >
> > Best,
> > Mark
> > ps: a workaround is documented here:
> >
> > https://www.markvanderloo.eu/yaRb/2020/12/17/how-to-set-library-path-on-a-parallel-r-cluster/
> >
> >
> > > sessionInfo()
> > R Under development (unstable) (2020-12-21 r79668)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Ubuntu 20.04.1 LTS
> >
> > Matrix products: default
> > BLAS: /home/mark/projects/Rdev/R-devel/lib/libRblas.so
> > LAPACK: /home/mark/projects/Rdev/R-devel/lib/libRlapack.so
> >
> > locale:
> > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> > [3] LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8
> > [5] LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=en_US.UTF-8
> > [7] LC_PAPER=nl_NL.UTF-8 LC_NAME=C
> > [9] LC_ADDRESS=C LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] parallel stats graphics grDevices utils datasets methods
> > [8] base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_4.1.0
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone: 319-335-3386
> Department of Statistics and        Fax:   319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu