[issue9025] Non-uniformity in randrange for large arguments.

Mark Dickinson Fri, 18 Jun 2010 03:31:35 -0700

New submission from Mark Dickinson <dicki...@gmail.com>:

Not a serious bug, but worth noting:


The result of randrange(n) is not even close to uniform for large n.  Witness 
the obvious skew in the following (this takes a minute or two to run, so you 
might want to reduce the sample count passed to range()):

Python 3.2a0 (py3k:81980, Jun 14 2010, 11:23:36)
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from random import randrange
>>> from collections import Counter
>>> Counter(randrange(6755399441055744) % 3 for _ in range(100000000))
Counter({1: 37508130, 0: 33323818, 2: 29168052})

(The actual probabilities here are, as you might guess from the above numbers:  
{0: 1/3, 1: 3/8, 2: 7/24}.)

The cause:  for n < 2**53, randrange(n) is effectively computed as int(random() 
* n).  For small n, there's a tiny bias involved, but this is still an 
effective method.  However, as n increases towards 2**53, the bias increases 
significantly.  (For n >= 2**53, the random module uses a different strategy 
that *does* produce uniformly distributed results.)
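
The skew can be reproduced with far fewer samples by applying the int(random() * n) strategy directly. The following is a minimal sketch, not the module's code: old_style_randrange is a hypothetical helper name, and the seed and sample size are arbitrary choices. It assumes random() returns a multiple of 2**-53, as the default Mersenne Twister generator does.

```python
import random
from collections import Counter

def old_style_randrange(n, rng):
    # Hypothetical stand-in for the strategy described above:
    # scale a float in [0.0, 1.0) by n and truncate to an int.
    return int(rng.random() * n)

rng = random.Random(12345)       # fixed seed so the run is repeatable
n = 6755399441055744             # 3 * 2**51, the value used in the report
samples = 200000                 # far fewer than the report's 10**8

counts = Counter(old_style_randrange(n, rng) % 3 for _ in range(samples))
freqs = {r: counts[r] / samples for r in range(3)}

# Compare against the probabilities stated in the report.
print(freqs)
print({0: 1 / 3, 1: 3 / 8, 2: 7 / 24})
```

The bias comes from rounding: for products at or above 2**51 the exact value of random() * n is not always representable as a double, so it is rounded before int() truncates it.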

A solution would be to lower the cutoff point where randrange() switches from 
using int(random() * n) to using the _randbelow method.
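
For comparison, here is a rough sketch of a rejection-sampling approach that avoids floating point entirely. It is only an illustration of the general technique, not the random module's actual _randbelow code; randbelow_sketch is a hypothetical name, and the seed and sample size are arbitrary.

```python
import random
from collections import Counter

def randbelow_sketch(n, rng):
    """Return a uniform integer in [0, n) by rejection sampling (illustrative only)."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = n.bit_length()           # 2**k is the smallest power of two above n
    r = rng.getrandbits(k)       # uniform integer in [0, 2**k)
    while r >= n:                # discard out-of-range draws and retry
        r = rng.getrandbits(k)
    return r

rng = random.Random(12345)       # fixed seed so the run is repeatable
n = 6755399441055744             # same n as in the report
samples = 200000

counts = Counter(randbelow_sketch(n, rng) % 3 for _ in range(samples))
freqs_uniform = {r: counts[r] / samples for r in range(3)}
print(freqs_uniform)
```

Because every accepted draw is equally likely and n is a multiple of 3, each residue should appear with frequency close to 1/3. The cost is an occasional retry, with fewer than two draws needed on average.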

----------
components: Library (Lib)
messages: 108095
nosy: mark.dickinson, rhettinger
priority: low
severity: normal
status: open
title: Non-uniformity in randrange for large arguments.
type: behavior
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9025>
_______________________________________