[issue36957] Speed up math.isqrt

2019-05-19 Thread Mark Dickinson


Mark Dickinson  added the comment:

Calling this done for now.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-19 Thread Mark Dickinson


Mark Dickinson  added the comment:


New changeset 5c08ce9bf712acbb3f05a3a57baf51fcb534cdf0 by Mark Dickinson in 
branch 'master':
bpo-36957: Speed up math.isqrt (#13405)
https://github.com/python/cpython/commit/5c08ce9bf712acbb3f05a3a57baf51fcb534cdf0


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-19 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:


New changeset a5119e7d75c9729fc36c059d05f3d7132e7f6bb4 by Serhiy Storchaka in 
branch 'master':
bpo-36957: Add _PyLong_Rshift() and _PyLong_Lshift(). (GH-13416)
https://github.com/python/cpython/commit/a5119e7d75c9729fc36c059d05f3d7132e7f6bb4


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-19 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

I have also some ideas about algorithmic optimizations (they need to be 
tested). In classic formula $a_{i+1} = a_i + (n - a_i^2)/(2*a_i)$ we can 
calculate $n - a_i^2$ as $(n - a_{i-1}^2) - (a_i^2 - a_{i-1})^2 = (n - 
a_{i-1}^2) - (a_i^2 - a_{i-1})*(a_i^2 + a_{i-1})$. $n - a_i^2$ usually is much 
smaller than $n$, so this can speed up subtraction and division. Things become 
more complicated when use shifts as in your formula, but I think that we can 
get benefit even in this case. This can also speed up the final check $a_i^2 <= 
n$.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-19 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

It is possible to get yet 10-20% by avoiding to create temporary Python 
integers for right arguments of shift operations. PR 13416 adds two private 
functions _PyLong_Rshift() and _PyLong_Lshift() which take the second argument 
as C size_t instead of Python integer. _PyLong_Lshift() can also be used in 
factorial() and in float to int comparison.

$ ./python -m timeit -s "from math import isqrt; x = range(2**63-1000, 
2**63+1000)" "[isqrt(n) for n in x]"
Unpatched:  200 loops, best of 5: 1.84 msec per loop
Patched:200 loops, best of 5: 1.51 msec per loop

$ ./python -m timeit -s "from math import isqrt; x = range(2**95-1000, 
2**95+1000)" "[isqrt(n) for n in x]"
Unpatched:  100 loops, best of 5: 2.09 msec per loop
Patched:200 loops, best of 5: 1.75 msec per loop

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-19 Thread Serhiy Storchaka


Change by Serhiy Storchaka :


--
pull_requests: +13327

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-18 Thread Mark Dickinson


Mark Dickinson  added the comment:

> Did you try the floating point implementation?

The aim here was to use exactly the same algorithm, but speed it up by working 
with C integers where possible; that's a fairly simple change.

Using floating-point would require more complex changes. Again, the biggest 
issue with using a floating-point sqrt as an initial value is that we can't 
assume either IEEE 754 floating-point format, *or* a correctly rounded libm 
sqrt, even though both those things are highly likely on a typical modern 
machine. So any use of floating-point would also have to have an accuracy check 
and a fallback integer-only implementation for the case where the 
floating-point fails. It's possible to make those changes, but I think we'd end 
up crossing the threshold to "too complicated" for the implementation of a 
simple function.

It's a bit of a shame, because if we _are_ allowed to assume IEEE 754, and a 
correctly-rounded sqrt implementation (using round-ties-to-even), then it turns 
out that one can prove that for any value `n` smaller than 2**106 and `a := 
int(math.sqrt(float(n)))` (assuming that the `int` and `float` conversions are 
*also* correctly rounded), we have (a - 1)**2 < n < (a + 1)**2, which is 
exactly the loop invariant that the current algorithm needs to maintain.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-18 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Did you try the floating point implementation?

--
nosy: +serhiy.storchaka

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-18 Thread Mark Dickinson


Mark Dickinson  added the comment:

> introduce in GH-36887

Sorry, that should have been: introduced in GH-13244. #36887 was the 
corresponding b.p.o. issue.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-18 Thread Mark Dickinson


Change by Mark Dickinson :


--
keywords: +patch
pull_requests: +13316
stage: needs patch -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue36957] Speed up math.isqrt

2019-05-18 Thread Mark Dickinson


New submission from Mark Dickinson :

The `math.isqrt` algorithm introduce in GH-36887 currently works entirely with 
Python long integers. That's unnecessarily inefficient for small inputs.

For n < 2**64, `math.isqrt(n)` can be computed, via exactly the same algorithm, 
using entirely C integer arithmetic.

For larger n, the first 5 iterations of the algorithm can similarly be 
performed entirely in C integer arithmetic, and we can then switch to long 
integer arithmetic for subsequent iterations.

On my machine, these simple changes make a substantial difference (4x faster) 
for small inputs, and a significant but less substantial difference (70% 
speedup) for inputs not much larger than 2**64. The speedup for huge integers 
is likely to be much smaller, percentage-wise.

Some timings:

Unpatched
-
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt" 
"[isqrt(n) for n in range(2, 1000)]"
1000 loops, best of 5: 327 usec per loop
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt; 
x = range(2**63-1000, 2**63+1000)" "[isqrt(n) for n in x]"
200 loops, best of 5: 1.44 msec per loop
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt; 
x = range(2**95-1000, 2**95+1000)" "[isqrt(n) for n in x]"
200 loops, best of 5: 1.64 msec per loop

Patched (PR imminent)
---
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt" 
"[isqrt(n) for n in range(2, 1000)]"
5000 loops, best of 5: 78.1 usec per loop
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt; 
x = range(2**63-1000, 2**63+1000)" "[isqrt(n) for n in x]"
1000 loops, best of 5: 355 usec per loop
lovelace:cpython mdickinson$ ./python.exe -m timeit -s "from math import isqrt; 
x = range(2**95-1000, 2**95+1000)" "[isqrt(n) for n in x]"
500 loops, best of 5: 954 usec per loop

--
assignee: mark.dickinson
components: Extension Modules
messages: 342799
nosy: mark.dickinson
priority: normal
severity: normal
stage: needs patch
status: open
title: Speed up math.isqrt
type: performance
versions: Python 3.8

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com