Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-08 Thread Ryan Thompson
On Mon, Jan 7, 2019 at 3:26 PM Henrik Bengtsson wrote: > > 1. To achieve fully numerically reproducible RNGs in a way that is > *invariant to the number of workers* (amount of chunking), I think the > only solution is to pregenerate RNG seeds (using > parallel::nextRNGStream()) for each
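The pregeneration idea quoted above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions, not BiocParallel's implementation: the names `n_tasks`, `seeds`, `run_task` and `shuffled_order` are hypothetical, one stream is assigned per task (not per worker), and the session's RNG kind is left set to L'Ecuyer-CMRG afterwards.

```r
library(parallel)

# nextRNGStream() only works with L'Ecuyer-CMRG seeds.
RNGkind("L'Ecuyer-CMRG")
set.seed(1)

# Pregenerate one independent stream per task, before any work is dispatched.
n_tasks <- 6
seeds <- vector("list", n_tasks)
seeds[[1]] <- .Random.seed
for (i in seq_len(n_tasks)[-1]) {
  seeds[[i]] <- nextRNGStream(seeds[[i - 1]])
}

# Each task installs its own seed before drawing, so its result does not
# depend on which worker runs it or on what ran before it.
run_task <- function(i) {
  assign(".Random.seed", seeds[[i]], envir = globalenv())
  rnorm(1)
}

# Running the tasks in a different order (a stand-in for a different
# chunking across workers) gives the same per-task values.
serial <- vapply(seq_len(n_tasks), run_task, numeric(1))
shuffled_order <- c(4, 1, 6, 2, 5, 3)
reordered <- vapply(shuffled_order, run_task, numeric(1))
identical(serial[shuffled_order], reordered)
```

The cost of this design is that every task needs a seed up front, so the number of tasks has to be known before dispatch.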

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-08 Thread Aaron Lun
Good point Kasper, I hadn’t even considered the possibility of that. This has opened an unpleasant box of worms... I haven’t noticed any issues with MT, but I’ve just switched my DropletUtils C++ code from boost::random::mt19937 to dqrng’s pcg32 to avoid potential problems with overlaps

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Kasper Daniel Hansen
To add to Henrik's comments, it is also worthwhile to recognize that mclapply() does not deliver statistically sound random numbers even within the apply "loop" unless you use RNGkind("L'Ecuyer-CMRG") which is not set as default. This is because mclapply will initialize random streams with
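A minimal sketch of the generator choice mentioned above, assuming a Unix-alike where mclapply() can fork (on Windows, mc.cores must be 1). The seed value and the toy task `draw` are illustrative only.

```r
library(parallel)

# Select the L'Ecuyer-CMRG generator; it is not R's default.
RNGkind("L'Ecuyer-CMRG")

# Toy task: one normal draw per element.
draw <- function(i) rnorm(1)

set.seed(42)
first <- mclapply(1:4, draw, mc.cores = 2)

set.seed(42)
second <- mclapply(1:4, draw, mc.cores = 2)

# Same seed and same mc.cores should give the same draws.
identical(first, second)
```

Note that this kind of reproducibility is tied to the number of cores used; it is not the worker-invariant scheme discussed elsewhere in the thread.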

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Henrik Bengtsson
On Mon, Jan 7, 2019 at 4:09 AM Martin Morgan wrote: > > I hope for 1. to have a 'local socket' (i.e., not using ports) implementation > shortly. > > I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have > > > library(BiocParallel) > > set.seed(1); p = bpparam(); rnorm(1)

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Aaron Lun
The main problem I’ve described refers to changes in the random seed due to the MulticoreParam() constructor, prior to dispatch to workers. For the related-but-separate problem of obtaining consistent random results within each worker, we’ve been discussing the possible solutions on another
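One way to check whether a call perturbs the session's random seed is to compare `.Random.seed` before and after it. The helper `seed_unchanged()` below is hypothetical and uses only base R; with BiocParallel installed, one could presumably pass a constructor call such as `MulticoreParam()` as the expression, but that is an assumption and not shown here.

```r
# Hypothetical helper: TRUE if evaluating `expr` leaves .Random.seed untouched.
seed_unchanged <- function(expr) {
  before <- get(".Random.seed", envir = globalenv())
  force(expr)  # the argument is lazy, so it is evaluated only here
  identical(before, get(".Random.seed", envir = globalenv()))
}

set.seed(1)               # make sure .Random.seed exists
seed_unchanged(sqrt(2))   # TRUE: no random numbers are drawn
seed_unchanged(runif(1))  # FALSE: drawing a number advances the RNG state
```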

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Ryan Thompson
I don't know if this is helpful for BiocParallel, but there's an extension for the foreach package that ensures reproducible RNG behavior for all parallel backends: https://cran.r-project.org/web/packages/doRNG/index.html Perhaps some of the principles from that package can be re-used? On Mon,

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Aaron Lun
> I hope for 1. to have a 'local socket' (i.e., not using ports) implementation > shortly. Yes, that would be helpful. > I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have > >> library(BiocParallel) >> set.seed(1); p = bpparam(); rnorm(1) > [1] -0.6264538 >>

Re: [Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-07 Thread Martin Morgan
I hope for 1. to have a 'local socket' (i.e., not using ports) implementation shortly. I committed a patch in 1.17.6 for the wrong-seeming behavior of 2. We now have
> library(BiocParallel)
> set.seed(1); p = bpparam(); rnorm(1)
[1] -0.6264538
> set.seed(1); p = bpparam(); rnorm(1)
[1]

[Bioc-devel] Using SerialParam() as the registered back-end for all platforms

2019-01-06 Thread Aaron Lun
As we know, the default BiocParallel backends are currently set to MulticoreParam (Linux/Mac) or SnowParam (Windows). I can understand this to some extent because a new user running, say, bplapply() without additional arguments or set-up would expect some kind of parallelization. However, from
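For readers unfamiliar with the mechanism under discussion, here is a small sketch of registering a serial back-end by hand. It assumes BiocParallel is installed and uses only `register()`, `SerialParam()`, `bpparam()` and `bplapply()`; the toy input is illustrative.

```r
library(BiocParallel)

# Make SerialParam() the registered default for this session.
register(SerialParam())

# bpparam() returns the currently registered default back-end.
p <- bpparam()
class(p)

# bplapply() without a BPPARAM argument now runs serially.
res <- bplapply(1:3, sqrt)
```

Parallel execution can still be requested per call by passing an explicit `BPPARAM` argument to `bplapply()`.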