Hi Thibault,
mclapply has been designed to signal an error in two ways. User code
errors are returned as special objects (of class "try-error") in the
respective element of the result list. All other errors (including a
killed process) are returned as NULL in the respective elements of the
result list. To detect these errors reliably, one needs to implement FUN
so that it never returns NULL normally (it also cannot return a raw
vector). This is how mclapply was designed and implemented (and also
mccollect, etc.). It may be surprising to see multiple NULL elements when
a single process is killed, but this is expected with pre-scheduling
when that process has been tasked to compute multiple elements.
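
As a minimal sketch of the advice above, one could wrap every successful
value in a list, so that a NULL element can only mean the job delivered no
result. The helper name checked_mclapply and its fields are hypothetical
(they are not part of the parallel package), and the example assumes a
Unix-alike where forking is available:

```r
library(parallel)

# Hypothetical helper, not part of the parallel package: wrap each successful
# value in a list so that a completed job can never yield NULL.
checked_mclapply <- function(X, FUN, mc.cores = 2L, mc.preschedule = TRUE) {
  res <- mclapply(X, function(x) list(value = FUN(x)),
                  mc.cores = mc.cores, mc.preschedule = mc.preschedule)
  # NULL element: the job delivered no result (e.g. its process was killed).
  missing <- vapply(res, is.null, logical(1))
  # "try-error" element: user code signalled an R error in the child.
  errored <- vapply(res, inherits, logical(1), what = "try-error")
  list(results = res, missing = missing, errored = errored)
}

# FUN raises an error for 2 and legitimately returns NULL for 3.
# mc.preschedule = FALSE gives each element its own process, so the error
# in element 2 does not affect other elements.
out <- checked_mclapply(1:4, function(i) {
  if (i == 2) stop("user error")
  if (i == 3) return(NULL)
  i^2
}, mc.preschedule = FALSE)

which(out$errored)      # elements whose user code failed
which(out$missing)      # elements that delivered no result at all
out$results[[3]]$value  # a legitimate NULL, kept apart from a failure
```

With the default mc.preschedule = TRUE, a failure in one process is reported
for every element that process was tasked with, so several elements may be
flagged at once.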
To make this API more user-friendly, I've added a warning that is now
emitted when a job does not deliver a result (that is, when a vector
element is NULL because of such an error). I've also made it more explicit
in the documentation that NULL signals an error.
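
For scripts that should act on such warnings rather than just print them,
a sketch along these lines may help. The helper name mclapply_with_warnings
is hypothetical, and nothing here assumes the exact wording of the new
warning; it only collects whatever warnings mclapply emits:

```r
library(parallel)

# Hypothetical helper: run mclapply and collect the messages of any warnings
# it emits, so that calling code can inspect them programmatically.
mclapply_with_warnings <- function(X, FUN, mc.cores = 2L) {
  msgs <- character()
  res <- withCallingHandlers(
    mclapply(X, FUN, mc.cores = mc.cores),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))  # record the message
      invokeRestart("muffleWarning")         # and keep going
    }
  )
  list(results = res, warnings = msgs)
}

out2 <- mclapply_with_warnings(1:4, function(i) i^2)
length(out2$warnings)  # number of warnings collected during the run
```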
Best,
Tomas
On 07/26/2018 08:37 PM, Thibault Vatter wrote:
Hi,
I wondered about the behavior described in the following Stack Overflow
question:
https://stackoverflow.com/questions/20674538/mclapply-returns-null-randomly
More specifically, I would like to know if you ever considered the
suggestion made in the comments of the first answer, namely to somehow warn
the user if one of the processes has been killed by the out-of-memory
killer?
I am always surprised to see the random NULLs without message/warning/error
of any kind, and I think that it could be a useful feature to know whether
the function executed by mclapply returned a NULL or if the process was
killed for some reason.
In the following gist, I have an example of this (in this case non-random)
behavior:
https://gist.github.com/tvatter/2fcf3a9a99c256f9b9360f596b300715
For the record, I generate the list of NULLs in the 4th mclapply in the
gist above on a late 2013 MacBook Pro running macOS High Sierra, with 16GB
of memory, and my sessionInfo() is:
R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin16.7.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.5.0 tools_3.5.0 yaml_2.1.19
------------------------------------------------------------
Thibault Vatter
Department of Statistics
Columbia University
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel