On Sat, Jun 16, 2018 at 8:29 PM, Robert Kern <robert.k...@gmail.com> wrote: > On 6/16/18 05:54, josef.p...@gmail.com wrote: >> >> On Sat, Jun 16, 2018 at 3:59 AM, Robert Kern <robert.k...@gmail.com> >> wrote: >>> >>> I have made a significant revision. In this version, downstream projects >>> like scikit-learn should experience significantly less forced churn. >>> >>> >>> https://github.com/rkern/numpy/blob/nep/rng-clarification/doc/neps/nep-0019-rng-policy.rst >>> >>> https://mail.python.org/pipermail/numpy-discussion/2018-June/078252.html >>> >>> tl;dr RandomState lives! But its distributions are forever frozen. So >>> maybe >>> "undead" is more apt. Anyways, RandomState will continue to provide the >>> same >>> stream-compatibility that it always has. But it will be internally >>> refactored to use the same core uniform PRNG objects that the new >>> RandomGenerator distributions class will use underneath (defaulting to >>> the >>> current Mersenne Twister, of course). The distribution methods on >>> RandomGenerator will be allowed to evolve with numpy versions and get >>> better/faster implementations. >>> >>> Your code can mix the usage of RandomState and RandomGenerator as needed, >>> but they can be made to share the same underlying RNG algorithm's state. >> >> >> >> Sounds good to me, and I think handles all our concerns. >> >> I also think that the issues behind the np.random.* section about the >> global state and seed can be revisited for possible deprecation of >> convenience features. >> >> One clarifying question, mainly to see IIUC >> >> in this quote >> """ >> Calling numpy.random.seed() thereafter SHOULD just pass the given seed >> to the current basic RNG object and not attempt to reset the basic RNG >> to the Mersenne Twister. The global RandomState instance MUST be >> accessible by the name numpy.random.mtrand._rand >> """ >> >> "the current basic RNG object" refers to the global object. AFAIU, it >> is possible to change it numpy.random.mtrand._rand. Is it? > > > numpy.random.mtrand._rand would not be a basic RNG object; it would be (as > it is now) a RandomState instance. "the current basic RNG object" would be > the basic RNG that that global RandomState instance is currently using. > > It is not possible (now or in the glorious NEP future) to assign a new > instance to numpy.random.mtrand._rand. All of the numpy.random.* functions > are actually just simple aliases to the methods on that object when the > module is first built. Re-assigning _rand wouldn't reassign those aliases. > numpy.random.standard_normal(), for instance, would still be the > .standard_normal() method on the RandomState instance that _rand initially > pointed to. > > Currently and under the NEP, the only way to modify > numpy.random.mtrand._rand is to call its methods (i.e. the numpy.random.* > convenience functions) to modify its internal state. That's not changing. > > The only thing that will change will be that there will be a new > numpy.random.* function to call that will let you give the global > RandomState a new basic RNG object that it will swap in internally. Let's > call it np.random.swap_global_basic_rng(). If you don't use that function, > you won't have a problem. I intend to make this new function *very* explicit > about what it is doing, and document the crap out of it so it won't be > misused like np.random.seed() is.
I didn't catch that part. Now it's clear. > >> I never tried that so I didn't know we can change the global >> RandomState, and thought we will have to replace np.random.seed usage > >> with a specific RandomState(seed) instance > > > I did a quick review of np.random.seed() usage in statsmodels, and I think > you are mostly fine. It looks like you mostly use it in unit tests and at > the top of examples. The only possible problem that I can see that you might > have with the swap_global_basic_rng() is if some other package that you rely > on calls it in its library code. Then subsequent statsmodels unit tests > might fail because when they call np.random.seed(), it won't be reseeding a > Mersenne Twister but another basic RNG. > > However, I intend to make that a weird and unnatural thing to do. It's > already unlikely to happen as it's a niche requirement that one mostly would > need at the start of a whole program, not buried down inside library code. > But we will also document that function to discourage such usage, and > probably have unconditional noisy warnings that users would have to > explicitly silence. > > If one of your dependencies did that, you'd be well within your rights to > tell them that they are misusing numpy and causing breakage in statsmodels. > >> In loose analogy: >> >> Matplotlib has a "global" current figure and axis, gca, gcf. >> In statsmodels we avoid any access to and usage of it and only work >> with individual figure/axis instances that can be provided by the >> user. (except for maybe some documentation examples and maybe some >> "legacy" code.) >> ( >> https://github.com/statsmodels/statsmodels/blob/master/statsmodels/graphics/utils.py#L48 >> ) >> >> AFAICS, essentially, statsmodels will need a similar policy for >> RandomState/RandomGenerator and give up the usage of the global random >> instance. > > > I mean, you certainly *should* (outside of unit tests) for very similar > reasons why you avoid the global state in matplotlib, but this NEP won't > force you to. You should do so anyways under the status quo, too. For any of > your functions that call np.random.* functions internally, it's hard to use > them in threaded applications, for instance, because it is relying on that > global state. > > scikit-learn's check_random_state() is a good pattern to follow. Thanks for the clarification. I just realized that I had replied to scikit-learn mailing list. I had thought this was numpy-discussion. sorry about that. Josef > > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it > had > an underlying truth." > -- Umberto Eco > > _______________________________________________ > scikit-learn mailing list > scikit-learn@python.org > https://mail.python.org/mailman/listinfo/scikit-learn _______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn