[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread Raymond Hettinger
Raymond Hettinger added the comment: New changeset acda5ea916f4233ab90ca7b4d28af735aa962af3 by Raymond Hettinger (Miss Islington (bot)) in branch '3.6': bpo-24567: Random subnormal.diff (GH-7954) (GH-7956) https://github.com/python/cpython/commit/acda5ea916f4233ab90ca7b4d28af735aa962af3

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread Raymond Hettinger
Raymond Hettinger added the comment: New changeset 0eaf7b975bd61169a8d78945d2d12f23299f61a8 by Raymond Hettinger (Miss Islington (bot)) in branch '3.7': bpo-24567: Random subnormal.diff (GH-7954) (GH-7955) https://github.com/python/cpython/commit/0eaf7b975bd61169a8d78945d2d12f23299f61a8

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread miss-islington
Change by miss-islington : -- pull_requests: +7563 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread Raymond Hettinger
Raymond Hettinger added the comment: New changeset ddf7171911e117aa7ad4b0f9ded4f0c3a4ca0fec by Raymond Hettinger in branch 'master': bpo-24567: Random subnormal.diff (#7954) https://github.com/python/cpython/commit/ddf7171911e117aa7ad4b0f9ded4f0c3a4ca0fec --

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread miss-islington
Change by miss-islington : -- pull_requests: +7562 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread Raymond Hettinger
Change by Raymond Hettinger : -- pull_requests: +7561 ___ Python tracker ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue24567] random.choice IndexError due to double-rounding

2018-06-27 Thread Raymond Hettinger
Raymond Hettinger added the comment: I concur with Tim. Marking the original bug as "won't fix" for the reasons that he listed. In a separate PR, I'll modify the last line random.choices() to be "return [population[bisect(cum_weights, random() * total, 0, hi)] for i in range(k)]". That

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Victor] > This method [shuffle()] has a weird API. What is > the point of passing a random function, > ... I proposed to deprecate this argument and remove it later. I don't care here. This is a bug report. Making backward-incompatible API changes does

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread STINNER Victor
STINNER Victor added the comment: Let me see random_double_round.diff. In master: def shuffle(self, x, random=None): """Shuffle list x in place, and return None. Optional argument random is a 0-argument function returning a random float in [0.0, 1.0); if it is

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Victor, look at Raymond's patch. In Python 3, `randrange()` and friends already use the all-integer `getrandbits()`. He's changing three other lines, where some variant of `int(random() * someinteger)` is being used in an inner loop for speed. Presumably the

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread STINNER Victor
STINNER Victor added the comment: The error goes avoid if we stop using floating point number to generate random integers. Can't we drop/deprecate support of RNG only providing getrandom() but not getrandbits()? I'm talking about the master branch. Things are simpler when you only

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: [Mark] > If we do this, can we also persuade Guido to Pronounce that > Python implementations assume IEEE 754 format and semantics > for floating-point? On its own, I don't think a change to force 53-bit precision _on_ 754 boxes would justify that. That's just

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Tim Peters
Tim Peters added the comment: Mark, ya, I agree it's most prudent to let sleeping dogs lie. In the one "real" complaint we got (issue 24546) the cause was never determined - but double rounding was ruled out in that specific case, and no _plausible_ cause was identified (short of transient

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Mark Dickinson
Mark Dickinson added the comment: Some relevant notes from Vincent Lefèvre here: http://www.vinc17.net/research/extended.en.html -- ___ Python tracker ___

[issue24567] random.choice IndexError due to double-rounding

2018-06-26 Thread Mark Dickinson
Mark Dickinson added the comment: [Tim] > Mark, do you believe that 32-bit Linux uses a different libm? I don't know. But I'd expect to see accuracy losses as a result of forcing 53-bit precision, and I wouldn't be totally surprised to see other more catastrophic failure modes (infinite

[issue24567] random.choice IndexError due to double-rounding

2018-06-25 Thread Tim Peters
Tim Peters added the comment: Mark, do you believe that 32-bit Linux uses a different libm? One that fails if, e.g., SSE2 were used instead? I don't know, but I'd sure be surprised it if did. Very surprised - compilers have been notoriously unpredictable in exactly when and where

[issue24567] random.choice IndexError due to double-rounding

2018-06-25 Thread Mark Dickinson
Mark Dickinson added the comment: [Tim] > a couple lines of inline assembler could be added at Python startup to > set the Pentium's FPU "precision control" bits to "round to 53 bits" mode On second thoughts, I don't think this will work; or at least, not easily. The libm on such a platform

[issue24567] random.choice IndexError due to double-rounding

2018-06-25 Thread Mark Dickinson
Mark Dickinson added the comment: > Python _configuration_ should be changed to disable double rounding on such > platforms That sounds good to me, with the nitpick that setting an x87 FPU to 53-bit precision still doesn't avoid potential double rounding on underflow, and I'm not aware of

[issue24567] random.choice IndexError due to double-rounding

2018-06-24 Thread Tim Peters
Tim Peters added the comment: There are a couple bug reports here that have been open for years, and it's about time we closed them. My stance: if any platform still exists on which "double rounding" is still a potential problem, Python _configuration_ should be changed to disable double

[issue24567] random.choice IndexError due to double-rounding

2018-06-24 Thread Raymond Hettinger
Raymond Hettinger added the comment: Attaching a patch for 3.6 and 3.7. Haven't had a chance to run timings yet. Since it involves adding code to the inner loop, presumably it will have a noticeable affect on speed. Am not sure how to test this (I don't seem to be able to reproduce the

[issue24567] random.choice IndexError due to double-rounding

2017-10-22 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Do you mind to create a pull request Raymond? -- components: +Library (Lib) versions: -Python 3.5 ___ Python tracker

[issue24567] random.choice IndexError due to double-rounding

2016-09-22 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: > I'll write that as "'i == n and n > 0" which reads better and runs faster in > the common case (look at the disassembly of each). issue27236 -- ___ Python tracker

[issue24567] random.choice IndexError due to double-rounding

2016-09-22 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ping. -- versions: +Python 3.7 -Python 3.4 ___ Python tracker ___ ___

[issue24567] random.choice IndexError due to double-rounding

2016-09-10 Thread Mark Dickinson
Mark Dickinson added the comment: A similar bug affects the new `choices` method: in the `choices` source, if `random() * total` happens to round up to `total`, the bisect call returns an out-of-range index. There are two ways that that could happen: (1) double rounding, as in this issue

[issue24567] random.choice IndexError due to double-rounding

2016-06-08 Thread Raymond Hettinger
Raymond Hettinger added the comment: Will get to it shortly. -- ___ Python tracker ___ ___ Python-bugs-list

[issue24567] random.choice IndexError due to double-rounding

2016-06-07 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ping. -- ___ Python tracker ___ ___ Python-bugs-list mailing list

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Mark Dickinson
Mark Dickinson added the comment: Serhiy: there's already a small bias inherent in using `int(random() * n)`, regardless of double rounding, since in general `random()` gives 2**53 equally likely (modulo deficiencies in the source generator) outcomes and `n` need not be a divisor of `2**53`.

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Tim Peters
Tim Peters added the comment: Hmm. Looks like the answer to my question came before, via Custom collection can has non-standard behavior with negative indices. Really? We're worried about a custom collection that assigns some crazy-ass meaning to a negative index applied to an empty

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Tim Peters
Tim Peters added the comment: I have a question about this new snippet in choice(): +if i == n and n 0: +i = n - 1 What's the purpose of the and n 0 clause? Without it, if i == n == 0 then i will be set to -1, which is just as good as 0 for the purpose of raising

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Tim Peters
Tim Peters added the comment: [Serhiy Storchaka] ... I want to say that double rounding causes not only bias from ideal distribution, but a difference between platforms That's so, but not too surprising. On those platforms users will see differences between primitive floating addition and

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- keywords: +patch Added file: http://bugs.python.org/file39903/handle_double_rounding.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: Removed file: http://bugs.python.org/file39903/handle_double_rounding.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: Added file: http://bugs.python.org/file39905/handle_double_rounding2.diff ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Unfortunately a link to Rietveld is not available for review and inlined comments. In sample() the selected set can be initialized to {n} to avoid additional checks in the loop for large population. In the branch for small population, non-empty pool can be

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I originally used the {n} approach but it was less clear and it lead to a re-selection rather than undoing the rounding. That would change the output. In normal case j is never equal to n. In very rare cases on platforms with double rounding the

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Raymond Hettinger
Raymond Hettinger added the comment: In sample() the selected set can be initialized to {n} I originally used the {n} approach but it was less clear and it lead to a re-selection rather than undoing the rounding. That would change the output. I like Tim's suggestion about import-time

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I'm aware of issue 23974. But I want to say that double rounding causes not only bias from ideal distribution, but a difference between platforms. The result of sample() (with the same seed) can vary between platforms. --

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Stefan Krah
Stefan Krah added the comment: The double rounding problem even occurs between different versions of gcc. I think ultimately we should put our foot down and ship with sse2 enabled (not possible for 2.7 though, unless an LTS-PEP emerges). If distributions want to break that, it's their business.

[issue24567] random.choice IndexError due to double-rounding

2015-07-12 Thread Antoine Pitrou
Antoine Pitrou added the comment: I think ultimately we should put our foot down and ship with sse2 enabled That's good for x86 but that wouldn't solve the issue if it exists on other archs. I trust Raymond to come up with an acceptable solution; though I think it would be good to add a

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Tim Peters
Tim Peters added the comment: [Raymond] I can't say that I feel good about making everyone pay a price for a problem that almost no one ever has. As far as I know, nobody has ever had the problem. But if we know a bug exists, I think it's at best highly dubious to wait for a poor user to

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Raymond Hettinger
Raymond Hettinger added the comment: See the attached timings. The performance hit isn't bad and the code's beauty isn't marred terribly. Yes, we'll fix it, but no I don't have to feel good about it ;-) -- ___ Python tracker

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: For shuffle I would write if j i. I think 3.x should be fixed as well as 2.7. And I like Tim's suggestion about import-time patching. -- ___ Python tracker rep...@bugs.python.org

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Raymond Hettinger
Raymond Hettinger added the comment: [Antoine] If that would give a different sequence of random numbers, I'm not sure that's acceptable in a bugfix release. Raymond can shed a light. You're right. It is not acceptable to give a different sequence of random numbers within a bugfix

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: -- assignee: - rhettinger ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___ ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Raymond Hettinger
Raymond Hettinger added the comment: FWIW, here are some variants (with differing degrees of brevity, clarity, and performance): def choice(self, seq): Choose a random element from a non-empty sequence. n = len(seq) i = int(self.random() * n) if i == n:

[issue24567] random.choice IndexError due to double-rounding

2015-07-11 Thread Raymond Hettinger
Changes by Raymond Hettinger raymond.hettin...@gmail.com: Added file: http://bugs.python.org/file39897/choice_timings.txt ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread STINNER Victor
STINNER Victor added the comment: As I wrote, glib switch from floats to integers to generate random numbers. To provide reproductible random numbers, an environment variable was added to select the old PRNG. Anyway, if we modify random.py, the generated numbers should be different, no? To me,

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread Antoine Pitrou
Antoine Pitrou added the comment: As I wrote, glib switch from floats to integers to generate random numbers. And as I wrote, this would be accepted in a feature release. Not necessarily a bugfix release. -- ___ Python tracker

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread Antoine Pitrou
Antoine Pitrou added the comment: I propose to rewrite Random.randint() in random.py. If that would give a different sequence of random numbers, I'm not sure that's acceptable in a bugfix release. Raymond can shed a light. -- stage: - needs patch versions: -Python 3.2, Python 3.3,

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread Tim Peters
Tim Peters added the comment: Anyway, if we modify random.py, the generated numbers should be different, no? Not in a bugfix release. The `min()` trick changes no results whatsoever on a box that doesn't do double-rounding. On a box that does do double-rounding, the only difference in

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread STINNER Victor
STINNER Victor added the comment: Ok, it looks like most people are in favor of min(). Can anyone propose a patch? -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: There are other affected methods: randrange(), randint(), shuffle(), sample(). -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread Stefan Krah
Stefan Krah added the comment: Qt had a similar initiative regarding -msse2 -mfpmath: http://lists.qt-project.org/pipermail/development/2013-December/014530.html They say that also Visual Studio 2012 has switched to sse2 by default. The only problem are the Linux distributions that are stuck

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread Antoine Pitrou
Antoine Pitrou added the comment: I suppose the simplest fix would be to replace relevant instances of int(random() * N) with min(int(random() * N), N-1) That sounds simple and generic. It skews the distribution a tiny little bit, but it doesn't sound significant (perhaps Mark would

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread Tim Peters
Tim Peters added the comment: It skews the distribution a tiny little bit, ... But it doesn't - that's the point ;-) If double-rounding doesn't occur at all (which appears to be the case on most platforms), absolutely nothing changes (because min(int(random() * N), N-1) == int(random() * N)

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread STINNER Victor
STINNER Victor added the comment: Again, why not using only integers? Pseudo-code to only use integers: def randint(a, b): num = b - a if not num: return a nbits = (num + 1).bit_length() while True: x = random.getrandbits(nbits) if x = num:

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread Tim Peters
Tim Peters added the comment: Victor, if people want to use getrandbits(), we should backport the Python3 code, not reinvent it from scratch. Note too Mark's comment: There are several places in the source where something of the form `int(i * random.random())` is used. The `min()` trick is

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread STINNER Victor
STINNER Victor added the comment: Victor, if people want to use getrandbits(), we should backport the Python3 code, not reinvent it from scratch. Sorry, I don't understand your comment. Backport what? getrandbits() is available on Python 2 and Python 3, for Random and SystemRandom. I

[issue24567] random.choice IndexError due to double-rounding

2015-07-09 Thread Tim Peters
Tim Peters added the comment: Victor, don't ask me, look at the code: the random.choice() implementations in Python 2 and Python 3 have approximately nothing in common, and the bug here should already be impossible in Python 3 (but I can't check that, because I don't have a platform that

[issue24567] random.choice IndexError due to double-rounding

2015-07-06 Thread Stefan Krah
Stefan Krah added the comment: Python-the-language makes no promises about adhering to IEEE 754 arithmetic rule. Still, I think we could switch to -mfpmath=sse, which would at least guarantee consistent behavior across different gcc versions. --

[issue24567] random.choice IndexError due to double-rounding

2015-07-06 Thread Mark Dickinson
Mark Dickinson added the comment: Still, I think we could switch to -mfpmath=sse, which would at least guarantee consistent behavior across different gcc versions. ... which would fix fsum on 32-bit Linux, too. (I thought there was an open issue for this, but now I can't find it.)

[issue24567] random.choice IndexError due to double-rounding

2015-07-06 Thread Tim Peters
Tim Peters added the comment: Mark, closest I could find to a substantive SSE-vs-fsum report is here, but it was closed (because the fsum tests were changed to ignore the problem ;-) ): http://bugs.python.org/issue5593 -- ___ Python tracker

[issue24567] random.choice IndexError due to double-rounding

2015-07-05 Thread Mark Dickinson
Mark Dickinson added the comment: IEEE 745 IEEE 754. Stupid fingers. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue24567 ___ ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-05 Thread Mark Dickinson
Mark Dickinson added the comment: Note that this affects other Random methods, too. There are several places in the source where something of the form `int(i * random.random())` is used to select an integer in the half-open range [0, i). And a nitpick: it's a stretch to describe

[issue24567] random.choice IndexError due to double-rounding

2015-07-05 Thread STINNER Victor
STINNER Victor added the comment: There is a simple fix. If the result is invalid, drop it and generate a new number. It should keep the distribution uniform. Another fix is to only use integers. Why do we use float in choice by the way? -- ___

[issue24567] random.choice IndexError due to double-rounding

2015-07-05 Thread Stefan Krah
Stefan Krah added the comment: I've finally gotten hold of a gcc (5.1.0) that produces the unwanted code. For a *different* test case, even the -O0 and -O2 results differ, since with -O2 gcc uses mpfr (I think!) for constant folding, which produces the correct result. Using -mfpmath=sse seems

[issue24567] random.choice IndexError due to double-rounding

2015-07-05 Thread Tim Peters
Tim Peters added the comment: I suppose the simplest fix would be to replace relevant instances of int(random() * N) with min(int(random() * N), N-1) That would be robust against many kinds of arithmetic quirks, and ensure that platforms with and without such quirks would, if forced to the

[issue24567] random.choice IndexError due to double-rounding

2015-07-04 Thread Steven D'Aprano
New submission from Steven D'Aprano: While investigating issue 24546, I discovered that at least some versions of Python on 32-bit Linux have a double-rounding bug in multiplication which may lead to an unexpected IndexError in random.choice. See http://bugs.python.org/issue24546 for