Chris Angelico wrote:

> On Fri, Sep 23, 2011 at 4:14 AM, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
>> What makes you think it's in C? I don't have Python 3.3a, but in 3.2 the
>> random module is mostly Python. There is an import of _random, which
>> presumably is in C, but it doesn't have a randint method:
>
> True. It seems to be defined in cpython/lib/random.py as a reference
> to randrange, which does a pile of error checking and calls
> _randbelow... which does a whole lot of work as well as calling
> random(). Guess I should have checked the code before asking!
>
> There's probably good reasons for using randint(),
If you want unbiased, random (or at least pseudo-random) integers chosen
from a uniform distribution, with proper error checking, you should use
randint or randrange.

> but if you just
> want a pile of more-or-less random integers, int(random.random()*top)
> is the best option.

"I only need random-ish ints, but I need 'em yesterday!!!"

If you want biased, not-quite-uniformly distributed integers with no error
checking, then the above will save you a few nanoseconds per call. So if
your application needs a million such ints, you might save a full five
seconds or so. Sounds like premature optimization to me, but what do I
know? If you have an application where the quality of randomness is
unimportant and generating random ints is a genuine performance
bottleneck, then go right ahead.

>> I'm not seeing any significant difference in speed between 2.6 and 3.2:
>>
>> [steve@sylar ~]$ python2.6 -m timeit -s "from random import
>> randint" "randint(0, 1000000)"
>> 100000 loops, best of 3: 4.29 usec per loop
>>
>> [steve@sylar ~]$ python3.2 -m timeit -s "from random import
>> randint" "randint(0, 1000000)"
>> 100000 loops, best of 3: 4.98 usec per loop
>
> That might be getting lost in the noise. Try the list comp that I had
> above and see if you can see a difference - or anything else that
> calls randint that many times.

If you want to measure the speed of randint, you should measure the speed
of randint. That's what the timeit call does: it calls randint 100000
times.

You can't measure the speed of:

    [random.randint(0,sz*10-1) for i in range(sz)]

and draw conclusions about randint alone. Roughly speaking, you are
measuring the time taken to:

* look up range
* look up sz, twice
* call range, possibly generating a massive list
* iterate over the list/range object
* look up random
* look up random.randint
* multiply sz by 10 and subtract 1
* build a list of the results, repeatedly resizing it as you go

(in no particular order)

There is MUCH more noise in your suggestion than in my version.
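The error-checking difference is easy to demonstrate. A minimal sketch
(the timeit numbers will of course vary by machine and Python version;
the point is the contrast in behaviour, not the exact figures):

```python
import random
import timeit

# randrange does proper error checking: an empty range is rejected loudly.
try:
    random.randrange(0)
except ValueError as e:
    print("randrange caught it:", e)

# int(random.random()*n) does no checking at all: n = 0 silently
# yields 0 every time, which is probably not what you wanted.
print(int(random.random() * 0))

# Time both approaches for 100000 calls; absolute numbers vary.
t_checked = timeit.timeit("randrange(1000000)",
                          setup="from random import randrange",
                          number=100000)
t_raw = timeit.timeit("int(random()*1000000)",
                      setup="from random import random",
                      number=100000)
print("randrange:       %.3f s" % t_checked)
print("int(random()*n): %.3f s" % t_raw)
```

The second version is faster precisely because it skips the validation
and the _randbelow machinery; whether that trade is worth it depends on
how much you care about the results being correct.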
But for what it's worth, I get these results:

[steve@sylar ~]$ python3.2 -m timeit -s "import random" -s "sz = 1000000"
"[random.randint(0,sz*10-1) for i in range(sz)]"
10 loops, best of 3: 6.09 sec per loop

[steve@sylar ~]$ python2.6 -m timeit -s "import random" -s "sz = 1000000"
"[random.randint(0,sz*10-1) for i in range(sz)]"
10 loops, best of 3: 4.67 sec per loop

So Python 3.2 is about 30% slower for the whole mess. (I would, however,
place little confidence in those results: at 4-6 seconds per loop, I
expect that OS-level noise will be significant.)

For what (little) it's worth, it seems that Python 3.2 should be a bit
faster than 2.6, at least if you consider the Pystone benchmark to mean
anything.

http://www.levigross.com/post/2340736877/pystone-benchmark-on-2-6-2-7-3-2

> Performance-testing with a heapsort (and by the way, it's
> _ridiculously_ slower implementing it in Python instead of just
> calling a.sort(), but we all knew that already!) shows a similar
> difference in performance.

You do know that Python's heap implementation is written in C? Don't be
fooled by the heapq module being in Python. About 2/3rds of the way down,
there's a sneaky little trick:

    # If available, use C implementation
    try:
        from _heapq import *
    except ImportError:
        pass

-- 
Steven

-- 
http://mail.python.org/mailman/listinfo/python-list
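(A quick way to check whether that C speedup actually kicked in on your
interpreter; this sketch assumes CPython, where the _heapq accelerator
is normally available - on other implementations, or if _heapq failed to
import, you would see the pure-Python versions instead:)

```python
import heapq

# In CPython the names re-exported from _heapq are C builtins.
# Pure-Python functions carry a __code__ attribute; C builtins do not,
# so this distinguishes the two implementations.
print(type(heapq.heappush))
print("C version in use:", not hasattr(heapq.heappush, "__code__"))
```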