On Mon, Jan 7, 2019 at 3:26 PM Henrik Bengtsson
wrote:
>
> 1. To achieve fully numerically reproducible RNGs in a way that is
> *invariant to the number of workers* (amount of chunking), I think the
> only solution is to pregenerate RNG seeds (using
> parallel::nextRNGStream()) for each
Good point Kasper, I hadn’t even considered the possibility of that. This has
opened an unpleasant box of worms...
I haven’t noticed any issues with MT, but I’ve just switched my DropletUtils
C++ code from boost::random::mt19937 to dqrng’s pcg32 to avoid potential
problems with overlaps
To add to Henrik's comments, it is also worthwhile to recognize that
mclapply() does not deliver statistically sound random numbers even within
the apply "loop" unless you use RNGkind("L'Ecuyer-CMRG"), which is not
the default. This is because mclapply will initialize random streams with
On Mon, Jan 7, 2019 at 4:09 AM Martin Morgan wrote:
>
> I hope for 1. to have a 'local socket' (i.e., not using ports) implementation
> shortly.
>
> I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have
>
> > library(BiocParallel)
> > set.seed(1); p = bpparam(); rnorm(1)
The main problem I’ve described refers to changes in the random seed due to the
MulticoreParam() constructor, prior to dispatch to workers. For the
related-but-separate problem of obtaining consistent random results within each
worker, we’ve been discussing the possible solutions on another
I don't know if this is helpful for BiocParallel, but there's an extension
for the foreach package that ensures reproducible RNG behavior for all
parallel backends: https://cran.r-project.org/web/packages/doRNG/index.html
Perhaps some of the principles from that package can be re-used?
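
As far as I understand it, the principle doRNG relies on is one
pre-generated L'Ecuyer-CMRG stream per iteration, built with
parallel::nextRNGStream(). Below is a minimal base-R sketch of that idea,
not doRNG's or BiocParallel's actual code. The helper names draw_one() and
run_chunked() are made up for illustration, and the chunks are processed
serially with lapply() to keep the example portable.

```r
library(parallel)

# Note: this changes the global RNG kind for the session.
RNGkind("L'Ecuyer-CMRG")
set.seed(1)

# Pre-generate one independent stream seed per element.
n <- 6
seeds <- vector("list", n)
seeds[[1]] <- .Random.seed
for (i in seq_len(n - 1)) seeds[[i + 1]] <- nextRNGStream(seeds[[i]])

# Hypothetical per-element task: install the element's own stream, then draw.
draw_one <- function(seed) {
  assign(".Random.seed", seed, envir = globalenv())
  rnorm(1)
}

# Hypothetical driver: split the elements into n_chunks contiguous chunks,
# as a parallel backend would, and process each chunk in turn.
run_chunked <- function(seeds, n_chunks) {
  chunk_id <- sort(rep(seq_len(n_chunks), length.out = length(seeds)))
  chunks <- split(seeds, chunk_id)
  res <- lapply(chunks, function(chunk) vapply(chunk, draw_one, numeric(1)))
  unlist(res, use.names = FALSE)
}

# The draws depend only on the element's seed, not on the chunking.
one_chunk    <- run_chunked(seeds, 1)
three_chunks <- run_chunked(seeds, 3)
stopifnot(identical(one_chunk, three_chunks))
```

Because each element carries its own seed, the outer lapply() over chunks
could in principle be swapped for mclapply() or bplapply() without changing
the numbers, which is the invariance to the number of workers discussed
earlier in this thread.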
On Mon,
> I hope for 1. to have a 'local socket' (i.e., not using ports) implementation
> shortly.
Yes, that would be helpful.
> I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have
>
>> library(BiocParallel)
>> set.seed(1); p = bpparam(); rnorm(1)
> [1] -0.6264538
>>
I hope for 1. to have a 'local socket' (i.e., not using ports) implementation
shortly.
I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have
> library(BiocParallel)
> set.seed(1); p = bpparam(); rnorm(1)
[1] -0.6264538
> set.seed(1); p = bpparam(); rnorm(1)
[1]
As we know, the default BiocParallel backends are currently set to
MulticoreParam (Linux/Mac) or SnowParam (Windows). I can understand this to
some extent because a new user running, say, bplapply() without additional
arguments or set-up would expect some kind of parallelization. However, from