On Mon, Sep 6, 2010 at 4:08 PM, BartC <ba...@freeuk.com> wrote:
> "Stefan Behnel" <stefan...@behnel.de> wrote in message
> news:mailman.470.1283712666.29448.python-l...@python.org...
>>
>> BartC, 05.09.2010 19:09:
>
>>> All those compilers that offer loop unrolling are therefore wasting
>>> their time...
>>
>> Sometimes they do, yes.
>
> Modifying the OP's code a little:
>
> a = 0
> for i in xrange(100000000):    # 100 million
>     a = a + 10                 # add 10 or 100
> print a
>
> Manually unrolling such a loop four times (ie. 4 copies of the body, and
> counting only to 25 million) increased the speed by between 16% and 47%
> (ie. runtime reducing by between 14% and 32%).
>
> This depended on whether I added +10 or +100 (ie. whether long integers
> are needed), whether it was inside or outside a function, and whether I
> was running Python 2 or 3 (BTW why doesn't Python 3 just accept 'xrange'
> as a synonym for 'range'?)
>
> These are just some simple tests on my particular machine and
> implementations, but they bring up some points:
>
> (1) Loop unrolling does seem to have a benefit, when the loop body is
> small.
>
> (2) Integer arithmetic seems to go straight from 32-bits to long
> integers; why not use 64-bits before needing long integers?
>
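The unrolled version presumably looks something like this (a sketch based on
that description, not the exact code that was timed):

a = 0
for i in xrange(25000000):    # 25 million iterations, body repeated 4 times
    a = a + 10
    a = a + 10
    a = a + 10
    a = a + 10
print a

Each pass now pays the loop overhead (fetching the next value from xrange and
jumping back to the top of the loop) once per four additions instead of once
per addition, which would account for most of the reported speedup.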
Regarding (2): on 64-bit systems, integer arithmetic will go from 64-bit
native integers to long. Will using any emulated 64-bit type on a 32-bit
system actually be better than the Python long implementation?

From my 64-bit Linux system:

In [1]: n = 2 ** 40

In [2]: type(n)
Out[2]: <type 'int'>

In [3]: n = 2 ** 80

In [4]: type(n)
Out[4]: <type 'long'>

> (3) Since the loop variable is never used, why not have a special loop
> statement that repeats code so many times? This can be very fast, since
> the loop counter need not be a Python object, and probably there would
> be no need for unrolling at all:
>
> repeat 100000000:    # for example
>     a = a + 10
>

--
regards,
kushal
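P.S. Something close to the proposed 'repeat' can be had today with
itertools.repeat, which is what the timeit module uses for its own timing
loop: the loop still runs in the interpreter, but no new index object has
to be produced on each iteration. A rough sketch (untimed, so no claim
about how much it actually helps here):

import itertools

a = 0
for _ in itertools.repeat(None, 100000000):    # 100 million, no loop counter object
    a = a + 10
print a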