New submission from Pernici Mario:
A trivial optimization can be made in ``pow(a, b, c)``
if ``b`` is even and ``c - a a``
```
In [1]: c = (1 100) + 1
In [2]: a = c - 1234567
In [3]: b = 2
In [4]: %timeit pow(a, b, c)
1 loops, best of 3: 3.03 s per loop
In [5]: %timeit pow(c - a if c
Pernici Mario pern...@users.sourceforge.net added the comment:
In this patch there is an implementation of the algorithm to convert
numbers in strings by recursively splitting the number in half, adapted
from Fredrik's div.py
--
Added file: http://bugs.python.org/file13496
Pernici Mario pern...@users.sourceforge.net added the comment:
In this second patch to the above patch it is added the recursive
division algorithm by Burnikel and Ziegler (BZ)
from longobject2.diff, unmodified,
to see the effect of the subquadratic algorithm; there is still a lot of
work
Pernici Mario pern...@users.sourceforge.net added the comment:
The attached patch uses mul1 in long_mul in the version patched with
30bit_longdigit13+optimizations.patch
Comparison between these two patches on hp pavilion Q8200 2.33GHz
pybench patch new patch
Pernici Mario [EMAIL PROTECTED] added the comment:
In this patch I added to the patch by Mark in issue 3944 the code
in the previous patch, modified to release memory in case of exceptions.
Benchmark for division on Athlon 64 3800+ with respect to Python3.0:
(1) Python with patch 30bit.patch
Pernici Mario [EMAIL PROTECTED] added the comment:
Mark, following your suggestions about using bigger integer types,
I added code to convert Python numbers to arrays of twodigits,
when a 64 bit integer type is supported, and for numbers with size
larger than 20; otherwise the code
Pernici Mario [EMAIL PROTECTED] added the comment:
Yes, I think that the speed-up is due to reducing the number of
shifts and masks.
Changing PyLong_SHIFT to 16 would be complicated; for instance in
v_iadd() carry could not be a digit of 16 bits anymore; writing code
specific for 64 bit
Pernici Mario [EMAIL PROTECTED] added the comment:
I have translated in C the algorithm for long division by
Burnikel and Ziegler (BZ), using the Python version fast_div.py
and the suggestions by Mark.
Here is a benchmark for divmod(p. q), p = 7**np, q = 5**nq
digits = q_digits = p_digits/2