Yuval S <sadan.yu...@gmail.com> added the comment:

Thank you for the attention and the quick fix. However, the current 
documentation for "Notes on Reproducibility" should still address this issue of 
hash randomization. It is not only `sample` that is affected, but any code that 
combines hashing of strings (or bytes or datetime objects) with `random`, e.g.

>>> import random
>>> random.seed(6)
>>> a = list(set(str(i) for i in range(500)))
>>> print(a[int(random.random() * 500)])

or this:

>>> import random
>>> import datetime
>>> random.seed(6)
>>> print(random.choice(range(hash(datetime.datetime(2000,1,1)) % 100)))

will still produce non-reproducible results even after the fix. Here is my 
suggestion for documentation:

> Hash randomization, which is enabled by default since version 3.3, is not 
> affected by `random.seed()`. For this reason, code that relies on string 
> hashes, such as code that relies on the ordering of `set` or `dict`, might be 
> non-reproducible, unless string hash randomization is disabled or seeded 
> (see: https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED).
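
As a rough sketch of what the suggested note implies (assuming CPython 3.7 or
later; `SNIPPET` and `run_with_hash_seed` are names made up for this
illustration), the first example above can be rerun in fresh interpreters with
`PYTHONHASHSEED` pinned. With the same hash seed and the same interpreter
build, the two runs should print the same element:

```python
import os
import subprocess
import sys

# The first example from this message, as a one-liner for `python -c`.
SNIPPET = (
    "import random; random.seed(6); "
    "a = list(set(str(i) for i in range(500))); "
    "print(a[int(random.random() * 500)])"
)


def run_with_hash_seed(hash_seed):
    """Run SNIPPET in a child interpreter with PYTHONHASHSEED=hash_seed.

    Hypothetical helper for illustration only; not part of the stdlib.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# PYTHONHASHSEED=0 disables hash randomization, so set ordering is stable
# across runs of the same interpreter build.
first = run_with_hash_seed(0)
second = run_with_hash_seed(0)
print(first == second)
```

Leaving `PYTHONHASHSEED` unset (or setting it to `random`) in the same helper
would typically give different elements from run to run, even though
`random.seed(6)` is called each time.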

My vote would be to tie hash randomization to `random.seed()`; this would 
make all use cases more predictable, as well as allow `random.sample()` to 
support `set`.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40325>
_______________________________________