Thank you Martin, that's exactly the problem. So for now I will just leave it like it is without setting a seed inside a function, and hope that the behaviour of DelayedArray might be updated. Anyway, I don't think it is a big problem.
Best,
Steffi

> On May 22, 2019 at 5:02 PM Martin Morgan <[email protected]> wrote:
>
> I think the problem is that, even if the user were to set.seed(), it will
> have different consequences depending on whether DelayedArray is already
> loaded, or not yet loaded. DelayedArray gets loaded in some way that is not
> transparent to the user, as a dependency-of-a-dependency-of-an annotation
> package.
>
> I guess an acceptable solution would be for DelayedArray to remember and
> restore the random number seed before creating a BiocParallel param, with an
> edge case being that .Random.seed is NULL in a new R session.
>
> Martin
>
> On 5/22/19, 9:57 AM, "Kasper Daniel Hansen" <[email protected]> wrote:
>
> Why don't you let this be under the user's control and not do this at
> all. People should know that reproducibility of random numbers requires
> setting the seed, but that is best done by the user and not a package author.
>
> On Wed, May 22, 2019 at 9:30 AM Steffi Grote <[email protected]> wrote:
>
> Hi all,
> I tried to circumvent the problem by adding an optional seed as parameter
> like this:
>
>     my_fun = function(..., seed = NULL){
>
>         code that might change the RNG
>
>         if (!is.null(seed)){
>             set.seed(seed)
>         }
>
>         code that runs permutations
>     }
>
> which solves the reproducibility issue, but gives me a Warning in
> BiocCheck:
>
>     * WARNING: Remove set.seed usage in R code
>       Found in R/ directory functions:
>         my_fun()
>
> What is the best way to deal with this?
>
> Thanks in advance,
> Steffi
>
> > On April 12, 2019 at 1:10 AM Martin Morgan <[email protected]> wrote:
> >
> > That easy strategy wouldn't work, for instance two successive calls to
> > MulticoreParam() would get the same port assigned, rather than the contract
> > of a 'random' port in a specific range; the port can be assigned by the
> > manager.port= argument if the user wants to avoid random assignment.
> > I could maintain a separate random number stream in BiocParallel for what
> > amounts to a pretty trivial and probably dubious strategy [choosing random
> > ports in hopes that one is not in use], but that starts to sound like a
> > more substantial feature.
> >
> > Martin
> >
> > On 4/11/19, 7:06 PM, "Pages, Herve" <[email protected]> wrote:
> >
> > Hi Steffi,
> >
> > Any code that gets called between your calls to set.seed() and runif()
> > could potentially use the random number generator. So the sequence
> > set.seed(123); runif(1) is only guaranteed to be deterministic if no
> > other code is called in between, or if the code called in between does
> > not use the random number generator (but if that code is not under your
> > control it could do anything).
> >
> > @Martin: I'll look at your suggestion for DelayedArray. An easy
> > workaround would be to avoid changing the RNG state in BiocParallel by
> > having .snowPort() make a copy of .Random.seed (if it exists) before
> > calling runif() and restoring it on exit.
> >
> > H.
> >
> > On 4/11/19 15:25, Martin Morgan wrote:
> > > This is actually from a dependency DelayedArray which, on load, calls
> > > DelayedArray::setAutoBPPARAM, which calls BiocParallel::MulticoreParam(),
> > > which uses the random number generator to select a random port for
> > > connection.
> > >
> > > A different approach would be for DelayedArray to respect the user's
> > > configuration and use bpparam(), or perhaps look at the class of
> > > bpparam() and tell the user they should, e.g.,
> > > BiocParallel::register(SerialParam()) if that's appropriate, or use
> > > registered("MulticoreParam") or registered("SerialParam") if available
> > > (they are by default) rather than creating an ad-hoc instance.
> > >
> > > Martin
> > >
> > > On 4/11/19, 10:17 AM, "Bioc-devel on behalf of Steffi Grote"
> > > <[email protected] on behalf of [email protected]> wrote:
> > >
> > > Hi all,
> > > I found out that example code for my package GOfuncR yields a different
> > > result the first time it's executed, despite setting a seed. All the
> > > following executions are identical.
> > > It turned out that loading the database package 'Homo.sapiens' changed
> > > the random numbers:
> > >
> > >     set.seed(123)
> > >     runif(1)
> > >     # [1] 0.2875775
> > >
> > >     set.seed(123)
> > >     suppressWarnings(suppressMessages(require(Homo.sapiens)))
> > >     runif(1)
> > >     # [1] 0.7883051
> > >
> > >     set.seed(123)
> > >     runif(1)
> > >     # [1] 0.2875775
> > >
> > > Is that known or expected behaviour?
> > > Should I not load a package inside a function that later uses random
> > > numbers?
> > >
> > > Thanks in advance,
> > > Steffi
> > >
> > > _______________________________________________
> > > [email protected] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: [email protected]
> > Phone: (206) 667-5791
> > Fax: (206) 667-1319
>
> --
> Best,
> Kasper
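
The workaround discussed in the thread (copy .Random.seed if it exists before drawing random numbers, and restore it on exit) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in BiocParallel or DelayedArray: with_preserved_seed() is a hypothetical helper name, and runif(1) merely stands in for whatever code draws a random port. The new-session edge case is handled by checking whether .Random.seed exists at all, since in a fresh R session the variable is absent from the global environment.

```r
# Hypothetical helper (not part of BiocParallel or DelayedArray): evaluate
# `expr`, then put the global RNG state back the way it was before the call.
with_preserved_seed <- function(expr) {
    had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
    if (had_seed)
        old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
    on.exit({
        if (had_seed) {
            # Restore the state the caller had before `expr` ran.
            assign(".Random.seed", old_seed, envir = globalenv())
        } else if (exists(".Random.seed", envir = globalenv(),
                          inherits = FALSE)) {
            # Fresh session: there was no seed before, so drop the one
            # that `expr` created.
            rm(".Random.seed", envir = globalenv())
        }
    })
    expr  # lazily evaluated here, after the on.exit handler is registered
}

# Usage: the draw inside the helper no longer disturbs the user's stream.
set.seed(123)
invisible(with_preserved_seed(runif(1)))  # stand-in for the random-port draw
x <- runif(1)

set.seed(123)
y <- runif(1)

identical(x, y)
```

Restoring the state this way keeps set.seed() entirely under the user's control, so the package itself never calls set.seed(). As noted in the thread, it would not on its own give successive calls different ports if it were applied inside the port-selection code, because each call would start from the same restored state.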
