On Thu, Dec 17, 2020 at 9:56 AM Evgeni Burovski <evgeny.burovs...@gmail.com> wrote:
> On Thu, Dec 17, 2020 at 1:01 PM Matti Picus <matti.pi...@gmail.com> wrote:
> >
> > On 12/17/20 11:47 AM, Evgeni Burovski wrote:
> > > Just as a side note, this is not very prominent in the docs, and I'm
> > > ready to volunteer to send a doc PR --- I'm only not sure which part
> > > of the docs, and would appreciate a pointer.
> >
> > Maybe here
> >
> > https://numpy.org/devdocs/reference/random/bit_generators/index.html#seeding-and-entropy
> >
> > which is here in the sources
> >
> > https://github.com/numpy/numpy/blob/master/doc/source/reference/random/bit_generators/index.rst#seeding-and-entropy
> >
> > And/or in the SeedSequence docstring documentation
> >
> > https://numpy.org/devdocs/reference/random/bit_generators/generated/numpy.random.SeedSequence.html#numpy.random.SeedSequence
> >
> > which is here in the sources
> >
> > https://github.com/numpy/numpy/blob/master/numpy/random/bit_generator.pyx#L255
>
> Here's the PR, https://github.com/numpy/numpy/pull/18014
>
> Two minor comments, both OT for the PR:
>
> 1. The recommendation to seed the generators from the OS --- I've been
> bitten by exactly this once. That was a rather exotic combination of a
> vendor RNG and a batch queueing system, and some of my runs did end up
> with identical random streams. Given that the recommendation is what
> it is, it probably means that experience is a singular point and it no
> longer happens with modern generators.

I suspect the vendor RNG was rolling its own entropy using time. We use
`secrets.randbits()`, which ultimately uses the best cryptographic
entropy source available. And if there is no cryptographic entropy
source available, I think we fail hard instead of falling back to less
reliable things like time. I'm not entirely sure that's a feature, but
it is safe!

> 2. Robert's comment that `SeedSequence(..., spawn_key=(num,))` is not
> equivalent to `SeedSequence(...).spawn(num)[num]` and that the former
> is not recommended. I'm not questioning the recommendation, but then
> __repr__ seems to suggest the equivalence:

I was saying that they were equivalent. That's precisely why it's not
recommended: it's too easy to do both and get identical streams
inadvertently.

--
Robert Kern
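
A minimal sketch of the equivalence discussed in point 2, in case it
helps to see it concretely. The entropy value and child index are
arbitrary choices made for illustration. Note also that `spawn(n)`
returns a list of n children, so the sketch spawns `num + 1` of them in
order to index position `num`.

```python
import numpy as np
from numpy.random import SeedSequence, default_rng

entropy = 12345  # arbitrary fixed entropy, chosen only for this illustration
num = 3          # arbitrary child index, also illustrative

# Recommended route: let the parent hand out children via spawn().
# spawn(n) returns n children, so index `num` needs at least num + 1 of them.
parent = SeedSequence(entropy)
spawned_child = parent.spawn(num + 1)[num]

# Manual route: construct the "same" child by passing spawn_key directly.
manual_child = SeedSequence(entropy, spawn_key=(num,))

# Both children carry the same spawn_key ...
same_key = spawned_child.spawn_key == manual_child.spawn_key

# ... so they yield identical seed material and identical random streams.
same_state = np.array_equal(
    spawned_child.generate_state(4), manual_child.generate_state(4)
)
same_stream = np.array_equal(
    default_rng(spawned_child).random(5), default_rng(manual_child).random(5)
)

print(same_key, same_state, same_stream)
```

All three flags should come out True, which is the hazard: a program
that mixes the two routes can end up with two generators producing the
same stream without any warning. Sticking to `spawn()` alone lets the
parent keep track of which child keys have already been handed out.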