New submission from Thomas Dybdahl Ahle <lob...@gmail.com>:
Given a generator `f()`, we can use `random.sample(list(f()), 10)` to get a uniform sample of the generated values. This is fine, and fast, as long as `list(f())` easily fits in memory. If it doesn't, however, one has to implement the reservoir sampling algorithm as a pure Python function, which is much slower and not so easy.

Having a fast reservoir sampling implementation in `random.sample` for iterators would be both useful and make the API more predictable. Currently, passing an iterator to `random.sample` raises `TypeError: Population must be a sequence or set.` This is inconsistent with most of the standard library, which accepts lists and iterators transparently.

I apologize if this enhancement has already been discussed; I wasn't able to find it. If wanted, I can write up a pull request. I believe questions like this one make it clear that such functionality is wanted and non-obvious:
https://stackoverflow.com/questions/12581437/python-random-sample-with-a-generator-iterable-iterator

----------
components: Library (Lib)
messages: 348445
nosy: thomasahle
priority: normal
severity: normal
status: open
title: random.sample should support iterators
type: enhancement
versions: Python 2.7, Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37682>
_______________________________________
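
For reference, the pure Python workaround mentioned in the report could look roughly like the sketch below. It uses the classic single-pass reservoir technique (often called Algorithm R). The function name `reservoir_sample` and its `rng` parameter are hypothetical names chosen for illustration; this is not code from the `random` module or from any proposed patch.

```python
import itertools
import random


def reservoir_sample(iterable, k, rng=random):
    """Return k items chosen uniformly at random from an iterable.

    Illustrative sketch only: a single pass over the input, holding at
    most k items in memory at any time.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    it = iter(iterable)
    # Fill the reservoir with the first k items.
    reservoir = list(itertools.islice(it, k))
    if len(reservoir) < k:
        raise ValueError("sample larger than population")
    # n is the 1-based position of the current item; it replaces a random
    # slot with probability k / n, which keeps the sample uniform.
    for n, item in enumerate(it, start=k + 1):
        j = rng.randrange(n)
        if j < k:
            reservoir[j] = item
    return reservoir


# Usage: sample 10 values from a generator without materializing it.
sample = reservoir_sample((x * x for x in range(1_000_000)), 10)
```

Unlike `random.sample`, this sketch does not return the selected items in random order (early items tend to stay in their original slots), so the result would need a `random.shuffle` if order matters.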