[issue1602] windows console doesn't print or input Unicode
Changes by Christopher Gurnee <ch...@gurneeconsulting.net>: -- nosy: +gurnec ___ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue1602> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue23974] random.randrange() biased output
Christopher Gurnee added the comment:

Option 3 of course wasn't my first choice (given how small the patch is and how minimal its potential negative impact), but it's certainly better than allowing an issue to linger in limbo.

Thank you, all.

--
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23974
[issue23974] random.randrange() biased output
Christopher Gurnee added the comment:

There's been no activity on this issue in a few months. The three options as I see them are:

1. Fix it for both randrange and SystemRandom.randrange, breaking randrange's implied stability between minor versions.
2. Fix it only for SystemRandom.randrange.
3. Close it as "won't fix" (for performance reasons, I'd assume?).

Since I'm in favor of option 2, I've attached a simple patch which implements it. Here are some quick-and-dirty performance numbers showing the decrease in performance (3 tests of the original code followed by 3 of the patched code):

$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**8' 's.randrange(r)'
1 loops, best of 10: 22.5 usec per loop
$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**31' 's.randrange(r)'
1 loops, best of 10: 22.6 usec per loop
$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**53 * 2//3' 's.randrange(r)'
1 loops, best of 10: 22.4 usec per loop

$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**8' 's.randrange(r)'
1 loops, best of 10: 23.7 usec per loop
$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**31' 's.randrange(r)'
1 loops, best of 10: 46.2 usec per loop
$ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**53 * 2//3' 's.randrange(r)'
1 loops, best of 10: 34.8 usec per loop

The patch also includes a unit test (with a false negative rate of about 8.5 * 10^-8: http://www.wolframalpha.com/input/?i=probability+of+417+or+fewer+successes+in+1000+trials+with+p%3D0.5).

Any opinions on which of the three options should be taken?

--
keywords: +patch
Added file: http://bugs.python.org/file39845/issue23974.patch
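The tail probability behind the unit test's quoted rate can be checked without WolframAlpha. The following is only a sketch, written for Python 3.8+ (it relies on math.comb); the helper name binomial_cdf is made up for illustration and is not part of the attached patch.

```python
import math


def binomial_cdf(successes, trials):
    """Exact P(X <= successes) for X ~ Binomial(trials, 0.5).

    Hypothetical helper: it counts the favourable outcomes with exact
    integer arithmetic and divides by the 2**trials equally likely ones.
    """
    favourable = sum(math.comb(trials, k) for k in range(successes + 1))
    return favourable / 2 ** trials


# Probability of 417 or fewer successes in 1000 fair trials,
# the same query as the WolframAlpha link in the comment.
tail = binomial_cdf(417, 1000)
print("P(X <= 417) = %.3g" % tail)
```

The sum stays in integers until the final division, so the tiny result does not pick up floating-point error along the way.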
[issue23974] random.randrange() biased output
Christopher Gurnee added the comment:

> If you have to care about security, you shouldn't use the random module at all. random.SystemRandom() merely uses a CPRNG as entropy source. But It also manipulates numbers in ways that may or may not be safe.

I must respectfully disagree with this. The current docs say:

> Use os.urandom() or SystemRandom if you require a cryptographically secure pseudo-random number generator.

That's a pretty strong statement, and IMO it would lead most to believe that SystemRandom along with *all* of its member functions is safe to use for cryptographic purposes[1] (assuming of course that os.urandom() is also a safe CSPRNG).

As a compromise, perhaps SystemRandom could provide its own randrange() with the #9025 fix, while keeping random.randrange() unmodified to preserve the implied same-sequence rule.

[1] I don't mean to imply that this bias bug necessarily is a cryptographic safety issue--it seems unlikely to me that it is one, however not being a cryptographer myself, I'd rather not draw any conclusions either way, and instead I'd prefer to err on the side of safety.
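To make the proposed compromise concrete, here is a rough sketch of a SystemRandom variant that draws bounded integers by rejection instead of by float multiplication. It is written for Python 3 and is not the #9025 fix or the attached patch; the class name RejectionSystemRandom and the method randbelow_unbiased are hypothetical names chosen for this example.

```python
import random


class RejectionSystemRandom(random.SystemRandom):
    """Hypothetical subclass, for illustration only."""

    def randbelow_unbiased(self, n):
        """Return an int in [0, n) with every value equally likely.

        Draws just enough random bits to cover n and retries whenever
        the draw lands at or above n, so no value is favoured.
        """
        if n <= 0:
            raise ValueError("n must be positive")
        bits = n.bit_length()
        candidate = self.getrandbits(bits)
        while candidate >= n:  # out of range: throw it away and redraw
            candidate = self.getrandbits(bits)
        return candidate


rng = RejectionSystemRandom()
print(rng.randbelow_unbiased(2**53 * 2 // 3))
```

The cost is a variable number of draws per call (fewer than two on average, since at least half of all candidates fall below n), which is the kind of slowdown the performance discussion in this thread is about.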
[issue23974] random.randrange() biased output
New submission from Christopher Gurnee:

Due to an optimization in random.randrange() present only in Python 2, as the stop-start range approaches 2^53 the output becomes noticeably biased. This bug also affects random.SystemRandom.

For example, counting the number of even ints in a set of 10^6 random ints each in the range [0, 2^52 * 2/3) produces normal results; about half are even (the sum below counts the odd values):

>>> sum(randrange(2**52 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.499932

Change the range to [0, 2^53 * 2/3), and you get a degenerate case where evens are two times more likely to occur than odds:

>>> sum(randrange(2**53 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
0.39

The issue occurs in three places inside randrange(); here's one:

    if istart >= _maxwidth:
        return self._randbelow(istart)
    return _int(self.random() * istart)

_maxwidth is the largest magnitude up to which a double can still represent every integer without loss of precision (2^53, since a double has 53 mantissa bits). With istart >= _maxwidth, _randbelow() behaves correctly. With istart < _maxwidth, the rounding error in random() * istart begins to cause problems as istart approaches 2^53.

Changing _maxwidth to be significantly less should (practically speaking, anyway) fix this, although I'm open to suggestions on just how conservatively it should be set.

--
components: Library (Lib)
messages: 241261
nosy: gurnec
priority: normal
severity: normal
status: open
title: random.randrange() biased output
versions: Python 2.7
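For anyone without a Python 2 interpreter at hand, the effect can be reproduced on Python 3 by applying the float-multiply shortcut by hand. This is a sketch rather than the 2.7 source: scaled_randbelow and odd_fraction are names invented here, the seed and sample count are arbitrary, and Python 3's own randrange() serves only as the comparison case.

```python
import random


def scaled_randbelow(rng, n):
    """The shortcut from the excerpt above: int(random() * n)."""
    return int(rng.random() * n)


def odd_fraction(draw, samples):
    """Fraction of odd values among `samples` calls to draw()."""
    return sum(draw() % 2 for _ in range(samples)) / samples


rng = random.Random(12345)   # fixed seed so runs are repeatable
samples = 200000
small = 2**52 * 2 // 3       # range where the shortcut still looks fine
large = 2**53 * 2 // 3       # range from the report where bias shows

small_frac = odd_fraction(lambda: scaled_randbelow(rng, small), samples)
large_frac = odd_fraction(lambda: scaled_randbelow(rng, large), samples)
fair_frac = odd_fraction(lambda: rng.randrange(large), samples)

print("shortcut, 2^52 * 2/3:", small_frac)
print("shortcut, 2^53 * 2/3:", large_frac)
print("randrange, 2^53 * 2/3:", fair_frac)
```

The first and third figures should sit close to one half, while the second should come out well below it, consistent with evens being about twice as likely as odds.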
[issue23974] random.randrange() biased output
Christopher Gurnee added the comment:

I shouldn't have called this a rounding error issue; that's not really what it is. A smaller example might help.

If I'm given a random int, x, in the range [0, 12), and asked to produce from it a random int, y, in the range [0, 8), I've got (at least?) two choices:

1. y = x if x < 8, else fail
2. y = f(x), where f maps values from [0, 12) -> [0, 8)

The problem with method 2 is that you end up with a mapping like this:

0,1  -> 0
2    -> 1
3,4  -> 2
5    -> 3
6,7  -> 4
8    -> 5
9,10 -> 6
11   -> 7

_randbelow() uses method 1 above. _int(self.random() * istart) is more like method 2.

I chose 2^53 * 2/3 just because the bias was easy to demonstrate. There will always be some bias when 2^53 % (stop-start) != 0, but it might not manifest itself as easily as checking for evenness.

Personally, I think 2^52 is still way too high as a cutoff point for using the (presumably faster, I didn't timeit) method 2, but I don't claim to be an expert here.
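The 12-to-8 example is small enough to enumerate in full. The sketch below assumes method 2's f is plain integer scaling, x * 8 // 12, which reproduces the mapping listed above; the constants SOURCE and TARGET are just labels for this example.

```python
from collections import Counter

SOURCE, TARGET = 12, 8

# Method 2: scale every x in [0, 12) down into [0, 8).
# Twelve inputs cannot be spread evenly over eight outputs.
scaled = Counter(x * TARGET // SOURCE for x in range(SOURCE))

# Method 1: keep x only when it is already below 8; the four values
# 8..11 are rejected (a real generator would simply draw again).
accepted = Counter(x for x in range(SOURCE) if x < TARGET)

print("y  method2  method1")
for y in range(TARGET):
    print(y, scaled[y], accepted[y])
```

Every even y collects two inputs under method 2 and every odd y only one, while method 1 gives each y exactly one input at the price of discarding a third of the draws.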