On Sun, Apr 12, 2015 at 3:52 AM, Paul Rubin <no.email@nospam.invalid> wrote:
>> PS Note that you're being "wasteful" by multiplying c*c over and over
>
> Yeah this is a reasonable point, though most of the c's should fit in a
> machine word, at least in my 64-bit system. I think Python still
> separates ints and longs in the implementation.
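(For anyone following along: Paul's recollection is true of Python 2, where an int that outgrows the machine word is promoted to a separate long type; Python 3 dropped the distinction entirely. A quick sketch you can run on Python 3 to confirm -- crossing the word boundary never changes the type:)

```python
import sys

# sys.maxsize is the largest native-word-sized index value.
# On Python 2, sys.maxint + 1 would have become a separate 'long';
# on Python 3 there is only int.
n = sys.maxsize
print(type(n).__name__)      # int
print(type(n + 1).__name__)  # still int
```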
I don't think it does. Performance doesn't seem to change in Py3 as the
numbers get bigger:

rosuav@sikorsky:~$ cat perftest.py
def naive_sum(top,base=0):
    i=0
    while i<top:
        i+=1
        base+=i
    return base

# Correctness test
print("Sum of numbers from 1 to 10 is: %d"%naive_sum(10))
print("Sum of numbers from 1 to 20 is: %d"%(naive_sum(20,1000)-1000))

import timeit
for base in (0, 10, 60, 100, 200):
    print("Base: 2**%d"%base)
    # Simpler than doing the whole iteration-judging thing ourselves
    timeit.main([
        "-s","from __main__ import naive_sum",
        "naive_sum(1000,%d)"%(2**base)
    ])

rosuav@sikorsky:~$ python2 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 89.6 usec per loop
Base: 2**10
10000 loops, best of 3: 89.9 usec per loop
Base: 2**60
10000 loops, best of 3: 92.3 usec per loop
Base: 2**100
10000 loops, best of 3: 145 usec per loop
Base: 2**200
10000 loops, best of 3: 153 usec per loop

Python 2.7: Clear difference in timing once all the numbers we're using
are above 2**64.

rosuav@sikorsky:~$ python3 perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 145 usec per loop
Base: 2**10
10000 loops, best of 3: 144 usec per loop
Base: 2**60
10000 loops, best of 3: 155 usec per loop
Base: 2**100
10000 loops, best of 3: 151 usec per loop
Base: 2**200
10000 loops, best of 3: 156 usec per loop

Python 3.5: Much more consistent timing.
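(Under the hood, Py3's single int type just grows its storage as the value gets bigger -- the type never changes. A quick sketch; the exact byte counts are CPython implementation details, so treat the numbers as illustrative:)

```python
import sys

# One type for every magnitude; only the allocated storage grows.
for exp in (0, 10, 60, 100, 200):
    n = 2 ** exp
    print("2**%d: type=%s, %d bytes" % (exp, type(n).__name__, sys.getsizeof(n)))
```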
Similarly, sticking an explicit "L" suffix onto the number gives Py2
consistent timings:

rosuav@sikorsky:~$ python perftest.py
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
10000 loops, best of 3: 130 usec per loop
Base: 2**10
10000 loops, best of 3: 129 usec per loop
Base: 2**60
10000 loops, best of 3: 138 usec per loop
Base: 2**100
10000 loops, best of 3: 139 usec per loop
Base: 2**200
10000 loops, best of 3: 140 usec per loop

So it's the data type, not the size of the numbers, that makes the
difference.

rosuav@sikorsky:~$ pike perftest.pike
Sum of numbers from 1 to 10 is: 55
Sum of numbers from 1 to 20 is: 210
Base: 2**0
100000 loops, best of 3: 18.3 usec per loop
Base: 2**10
100000 loops, best of 3: 18.5 usec per loop
Base: 2**60
100000 loops, best of 3: 18.5 usec per loop
Base: 2**100
10000 loops, best of 3: 398 usec per loop
Base: 2**200
10000 loops, best of 3: 406 usec per loop

Like Python 3, Pike has a single "int" type which stores
arbitrary-precision integers. Like Python 2, it has an optimization for
small ones. (One you can't defeat by using "2L".) Interestingly, once
you defeat Pike's optimizer, it's actually quite a lot slower than Py3
at repeated bignum arithmetic.

Conclusion: Python 3 has one single integer type with consistent
performance across the board. "Machine word" is a meaningless concept
to Py3.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list