Mark Dickinson <[EMAIL PROTECTED]> added the comment:

Just to be clear:  this patch halves the number of shifts and masks, 
asymptotically;  it doesn't affect the number of adds and multiplies 
(except for saving a few additions to zero by setting the initial carry 
intelligently).  Is that correct?  It's quite surprising that the bit 
operations have such a large effect on the running time.

Perhaps this is an argument for considering changing PyLong_SHIFT to 16 
(or better, 32, since surely almost every compiler provides uint64_t or 
equivalent these days)?  Though that would be quite an involved task.

While I believe the idea is sound, the patch isn't quite correct---it's 
got some assertions that won't always hold.  For example, with the 
patch, in a debug build of Python, I get:

>>> a = b = 2**30-1
[36413 refs]
>>> a*b
Assertion failed: (carry <= PyLong_MASK), function x_mul, file 
Objects/longobject.c, line 2228.
Abort trap

I'm fairly sure that the assert(carry <= PyLong_MASK) doesn't matter, 
and that the assertion at the end of the main outer loop (assert(carry 
>> PyLong_SHIFT) == 0)) should still hold.  But some more comments and 
analysis in the code would be good.  For example, if carry >= 
PyLong_MASK the the comment that 2*MASK + 2*MASK*MASK is contained in a 
twodigit isn't so useful.  How big can carry get?

----------
nosy: +marketdickinson

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue3944>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to