[issue24076] sum() several times slower on Python 3 64-bit

2021-09-22 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Always happy to help :)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-22 Thread Stefan Behnel


Stefan Behnel  added the comment:

Sorry for that, Pablo. I knew exactly where the problem was, the second I read 
your notification. Thank you for resolving it so quickly.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:


New changeset 1c7e98dc258a0e7ccd2325a1aefc4aa2de51e1c5 by Pablo Galindo Salgado 
in branch 'main':
bpo-24076: Fix reference in sum() introduced by GH-28469 (GH-28493)
https://github.com/python/cpython/commit/1c7e98dc258a0e7ccd2325a1aefc4aa2de51e1c5


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Sorry, I meant PR 28493

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Pablo Galindo Salgado


Pablo Galindo Salgado  added the comment:

Opened #28493 to fix the refleak

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Pablo Galindo Salgado


Change by Pablo Galindo Salgado :


--
pull_requests: +26888
pull_request: https://github.com/python/cpython/pull/28493

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Pablo Galindo Salgado

Pablo Galindo Salgado  added the comment:

Unfortunately commit debd80403721b00423680328d6adf160a28fbff4 introduced a 
reference leak:


❯ ./python -m test test_grammar -R :
0:00:00 load avg: 2.96 Run tests sequentially
0:00:00 load avg: 2.96 [1/1] test_grammar
beginning 9 repetitions
123456789
.
test_grammar leaked [12, 12, 12, 12] references, sum=48
test_grammar failed (reference leak)

== Tests result: FAILURE ==

1 test failed:
test_grammar

Total duration: 1.1 sec
Tests result: FAILURE

debd80403721b00423680328d6adf160a28fbff4 is the first bad commit
commit debd80403721b00423680328d6adf160a28fbff4
Author: scoder 
Date:   Tue Sep 21 11:01:18 2021 +0200

bpo-24076: Inline single digit unpacking in the integer fastpath of sum() 
(GH-28469)

 .../Core and Builtins/2021-09-20-10-02-12.bpo-24076.ZFgFSj.rst |  1 +
 Python/bltinmodule.c   | 10 +-
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 Misc/NEWS.d/next/Core and 
Builtins/2021-09-20-10-02-12.bpo-24076.ZFgFSj.rst

--
nosy: +pablogsal

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Thank you again Stefan. Now no doubts are left.

BTW, pyperf gives more stable results. I use it if have any doubts (either the 
results of timeit are not stable or the difference is less than say 10%).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Change by Stefan Behnel :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Stefan Behnel  added the comment:

Old, with PGO:
$ ./python -m timeit -s 'd = list(range(2**61, 2**61 + 1))' 'sum(d)'
1000 loops, best of 5: 340 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 114 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
5000 loops, best of 5: 73.4 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
5000 loops, best of 5: 73.3 usec per loop
$ ./python -m timeit -s 'd = [0] * 1' 'sum(d)'
5000 loops, best of 5: 78.7 usec per loop


New, with PGO:
$ ./python -m timeit -s 'd = list(range(2**61, 2**61 + 1))' 'sum(d)'
1000 loops, best of 5: 305 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 115 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
5000 loops, best of 5: 52.4 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
5000 loops, best of 5: 54 usec per loop
$ ./python -m timeit -s 'd = [0] * 1' 'sum(d)'
5000 loops, best of 5: 45.8 usec per loop

The results are a bit more mixed with PGO optimisation (I tried a couple of 
times), not sure why. Might just be normal fluctuation, bad benchmark value 
selection, or accidental PGO tuning, can't say. In any case, the 1-digit case 
(1, 2**29) is again about 28% faster and none of the other cases seems 
(visibly) slower.

I think this is a very clear net-win.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Stefan Behnel  added the comment:

Hmm, thanks for insisting, Serhiy. I was accidentally using a debug build this 
time. I'll make a PGO build and rerun the microbenchmarks.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

Thank you. Could you please test PGO builds?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Stefan Behnel  added the comment:

Original:
$ ./python -m timeit -s 'd = list(range(2**61, 2**61 + 1))' 'sum(d)'
500 loops, best of 5: 712 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 149 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
2000 loops, best of 5: 107 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
2000 loops, best of 5: 107 usec per loop

New:
$ ./python -m timeit -s 'd = list(range(2**61, 2**61 + 1))' 'sum(d)'
500 loops, best of 5: 713 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 148 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
5000 loops, best of 5: 77.4 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
5000 loops, best of 5: 77.2 usec per loop

Seems to be 28% faster for the single digit case and exactly as fast as before 
with larger integers.
Note that these are not PGO builds.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Change by Stefan Behnel :


--
Removed message: https://bugs.python.org/msg402301

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Stefan Behnel  added the comment:

Original:
$ ./python -m timeit -s 'd = list(range(2**61, 2**61 + 1))' 'sum(d)'
500 loops, best of 5: 712 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 149 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
2000 loops, best of 5: 107 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
2000 loops, best of 5: 107 usec per loop

New:
stefan@flup:~/ablage/software/Python/python-git$ ./python -m timeit -s 'd = 
list(range(2**61, 2**61 + 1))' 'sum(d)'
500 loops, best of 5: 713 usec per loop
$ ./python -m timeit -s 'd = list(range(2**30, 2**30 + 1))' 'sum(d)'
2000 loops, best of 5: 148 usec per loop
$ ./python -m timeit -s 'd = list(range(2**29, 2**29 + 1))' 'sum(d)'
5000 loops, best of 5: 77.4 usec per loop
$ ./python -m timeit -s 'd = list(range(1))' 'sum(d)'
5000 loops, best of 5: 77.2 usec per loop

Seems to be 28% faster for the single digit case and exactly as fast as before 
with larger integers.
Note that these are not PGO builds.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

What are microbenchmark results for PR 28469 in comparison with the baseline?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-21 Thread Stefan Behnel


Stefan Behnel  added the comment:


New changeset debd80403721b00423680328d6adf160a28fbff4 by scoder in branch 
'main':
bpo-24076: Inline single digit unpacking in the integer fastpath of sum() 
(GH-28469)
https://github.com/python/cpython/commit/debd80403721b00423680328d6adf160a28fbff4


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Guido van Rossum


Guido van Rossum  added the comment:

Sounds good, you have my blessing.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Stefan Behnel


Stefan Behnel  added the comment:

> The patch looks fine, but it looks a bit like benchmark chasing. Is the speed 
> of builtin sum() of a sequence of integers important enough to do this bit of 
> inlining?

Given that we already accepted essentially separate loops for the int, float 
and everything else cases, I think the answer is that it doesn't add much to 
the triplication.


> It may break if we change the internals of Py_Long, as Mark Shannon has been 
> wanting to do for a while

I would assume that such a structural change would come with suitable macros to 
unpack the special 0-2 digit integers. Those would then apply here, too. As it 
stands, there are already some modules distributed over the source tree that 
use direct digit access: ceval.c, _decimal.c, marshal.c. They are easy to find 
with grep and my PR just adds one more.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Guido van Rossum


Guido van Rossum  added the comment:

The patch looks fine, but it looks a bit like benchmark chasing. Is the speed 
of builtin sum() of a sequence of integers important enough to do this bit of 
inlining? (It may break if we change the internals of Py_Long, as Mark Shannon 
has been wanting to do for a while -- see 
https://github.com/faster-cpython/ideas/issues/42.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> I created a PR from my last patch, inlining the unpacking 
> of single digit integers. 

Thanks, that gets to the heart of the issue.

I marked the PR as approved (though there is a small coding nit you may want to 
fix).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Stefan Behnel


Stefan Behnel  added the comment:

I created a PR from my last patch, inlining the unpacking of single digit 
integers. Since most integers should fit into a single digit these days, this 
is as fast a path as it gets.

https://github.com/python/cpython/pull/28469

--
versions: +Python 3.11 -Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Stefan Behnel


Change by Stefan Behnel :


--
pull_requests: +26868
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/28469

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-20 Thread Raymond Hettinger


Raymond Hettinger  added the comment:

> OTOH on my Mac I still find that 3.10 with PGO is still 
> more than twice as slow than 2.7.

> Thinking about it that's a bit odd, since (presumably) 
> the majority of the work in sum() involves a long int result 
> (even though the values returned by range() all fit in 30 bits, 
> the sum quickly exceeds that).

The actual accumulation of a long int result is still as fast as it ever was.

The main difference from Py2.7 isn't the addition, it is that detecting and 
extracting a small int added has become expensive.  


-- Python 2 fastpath --

if (PyInt_CheckExact(item)) {   // Very cheap
long b = PyInt_AS_LONG(item);   // Very cheap
long x = i_result + b;  // Very cheap
if ((x^i_result) >= 0 || (x^b) >= 0) {  // Semi cheap
i_result = x;   // Zero cost 
Py_DECREF(item);// Most expensive step, 
but still cheap
continue;
}
}

-- Python 3 fastpath --

if (PyLong_CheckExact(item) || PyBool_Check(item)) { // Cheap
long b = PyLong_AsLongAndOverflow(item, );  // Super 
Expensive
if (overflow == 0 && // Branch 
predictable test
(i_result >= 0 ? (b <= LONG_MAX - i_result)  // Slower 
but better test  
   : (b >= LONG_MIN - i_result)))
{
i_result += b;// Very 
cheap
Py_DECREF(item);
continue;
}
}

-- Supporting function 

long
PyLong_AsLongAndOverflow(PyObject *vv, int *overflow) // OMG, 
this does a lot of work
{
/* This version by Tim Peters */
PyLongObject *v;
unsigned long x, prev;
long res;
Py_ssize_t i;
int sign;
int do_decref = 0; /* if PyNumber_Index was called */

*overflow = 0;
if (vv == NULL) {
PyErr_BadInternalCall();
return -1;
}

if (PyLong_Check(vv)) {
v = (PyLongObject *)vv;
}
else {
v = (PyLongObject *)_PyNumber_Index(vv);
if (v == NULL)
return -1;
do_decref = 1;
}

res = -1;
i = Py_SIZE(v);

switch (i) {
case -1:
res = -(sdigit)v->ob_digit[0];
break;
case 0:
res = 0;
break;
case 1:
res = v->ob_digit[0];
break;
default:
sign = 1;
x = 0;
if (i < 0) {
sign = -1;
i = -(i);
}
while (--i >= 0) {
prev = x;
x = (x << PyLong_SHIFT) | v->ob_digit[i];
if ((x >> PyLong_SHIFT) != prev) {
*overflow = sign;
goto exit;
}
}
/* Haven't lost any bits, but casting to long requires extra
 * care (see comment above).
 */
if (x <= (unsigned long)LONG_MAX) {
res = (long)x * sign;
}
else if (sign < 0 && x == PY_ABS_LONG_MIN) {
res = LONG_MIN;
}
else {
*overflow = sign;
/* res is already set to -1 */
}
}
  exit:
if (do_decref) {
Py_DECREF(v);
}
return res;
}

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2021-09-18 Thread Guido van Rossum


Guido van Rossum  added the comment:

@Stefan
> FWIW, a PGO build of Py3.7 is now about 20% *faster* here than my Ubuntu 
> 16/04 system Python 2.7

Does that mean we can close this issue? Or do I misunderstand what you are 
comparing? 32 vs. 64 bits? PGO vs. non-PGO?

OTOH on my Mac I still find that 3.10 with PGO is still more than twice as slow 
than 2.7.

Thinking about it that's a bit odd, since (presumably) the majority of the work 
in sum() involves a long int result (even though the values returned by range() 
all fit in 30 bits, the sum quickly exceeds that).

(earlier)
> I suspect that adding a free-list for single-digit PyLong objects (the most 
> common case) would provide some visible benefit.

If my theory is correct that wouldn't help this particular case, right?

FWIW just "for i in [x]range(15, 10**9, 15): pass" is about the same speed in 
Python 2.7 as in 3.11.

--
nosy: +gvanrossum

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2018-08-12 Thread Antoine Pitrou


Change by Antoine Pitrou :


--
nosy:  -pitrou

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2018-08-12 Thread Stefan Behnel


Stefan Behnel  added the comment:

FWIW, a PGO build of Py3.7 is now about 20% *faster* here than my Ubuntu 16/04 
system Python 2.7, and for some (probably unrelated) reason, the system Python 
3.5 is another 2% faster on my side.

IMHO, the only other thing that seems obvious to try would be to inline the 
unpacking of single digit PyLongs into sum(). I attached a simple patch that 
does that, in case someone wants to test it out. For non-PGO builds, it's about 
17% faster for me. Didn't take the time to benchmark PGO builds with it.

--
versions:  -Python 3.5
Added file: https://bugs.python.org/file47748/unpack_single_digits.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3 64-bit

2017-04-11 Thread Louie Lu

Changes by Louie Lu :


--
title: sum() several times slower on Python 3 -> sum() several times slower on 
Python 3 64-bit

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2017-04-11 Thread Louie Lu

Changes by Louie Lu :


--
versions: +Python 3.7

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-05-11 Thread Stefan Behnel

Stefan Behnel added the comment:

Issue 24165 was created to pursue the path of a free-list for PyLong objects.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-05-01 Thread Steven D'Aprano

Steven D'Aprano added the comment:

Antoine asked: 
 If someone has actual code that suffers from this, it would be good to know 
 about it.

You might have missed Łukasz' earlier comment: In this particular case, 
there's internal usage at Twitter that unearthed it. The example is just a 
simplified repro.

--
nosy: +steven.daprano

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-05-01 Thread Stefan Behnel

Stefan Behnel added the comment:

I tried implementing a freelist. Patch attached, mostly adapted from the one in 
dictobject.c, but certainly needs a bit of cleanup.

The results are not bad, about 10-20% faster:

Original:

$ ./python -m timeit 'sum(range(1, 10))'
1000 loops, best of 3: 1.86 msec per loop

$ ./python -m timeit -s 'l = list(range(1000, 1))' '[(i*2+5) // 7 for i in 
l]'
1000 loops, best of 3: 1.05 msec per loop


With freelist:

$ ./python -m timeit 'sum(range(1, 10))'
1000 loops, best of 3: 1.52 msec per loop

$ ./python -m timeit -s 'l = list(range(1000, 1))' '[(i*2+5) // 7 for i in 
l]'
1000 loops, best of 3: 931 usec per loop

--
keywords: +patch
Added file: http://bugs.python.org/file39245/pylong_freelist.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-05-01 Thread Stefan Behnel

Stefan Behnel added the comment:

I don't think it's irrelevant. Throw-away integers are really not uncommon. 
For-loops use them quite often, non-trivial arithmetic expressions can create a 
lot of intermediate temporaries. Speeding up the create-delete cycle of PyLong 
sounds like a very obvious thing to do.

Imagine some code that iterates over a list of integers, applies some 
calculation to them, and then stores them in a new list, maybe even using a 
list comprehension or so. If you could speed up the intermediate calculation by 
avoiding overhead in creating temporary PyLong objects, such code could benefit 
a lot.

I suspect that adding a free-list for single-digit PyLong objects (the most 
common case) would provide some visible benefit.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-05-01 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Le 01/05/2015 08:09, Stefan Behnel a écrit :
 
 I don't think it's irrelevant. Throw-away integers are really not
uncommon. For-loops use them quite often, non-trivial arithmetic
expressions can create a lot of intermediate temporaries. Speeding up
the create-delete cycle of PyLong sounds like a very obvious thing to do.

That may be a good thing indeed. I'm just saying that the benchmarks
people are worried about here are completely pointless.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-30 Thread Antoine Pitrou

Antoine Pitrou added the comment:

It seems we (like the benchmarks posted) are spending a whole lot of time on 
something that's probably not relevant to any real-world situation.

If someone has actual code that suffers from this, it would be good to know 
about it.
(note by the way that summing on a range() can be done O(1): it's just a 
variation on a arithmetic series)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Łukasz Langa

Łukasz Langa added the comment:

Serhiy, this is 64-bit specific. Antoine, as far as I can tell, the main use 
case is: Don't make it look like migrating to Python 3 is a terrible 
performance downgrade.

As we discussed on the language summit this year [1], we have to be at least 
not worse to look appealing. This might be a flawed benchmark but people will 
make them anyway. In this particular case, there's internal usage at Twitter 
that unearthed it. The example is just a simplified repro.

Some perf degradations were expected, like switching text to Unicode. In this 
case, the end result computed by both 2.7 and 3.4 is the same so we should be 
able to address this.

[1] http://lwn.net/Articles/640224/

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

If that's due to the different representation of Python 2's int type and Python 
3's int type then I don't see an easy solution to this.

--
versions:  -Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I reproduce under 64-bit Linux. So this may be because the Python long digit 
(30 bits) is smaller than the C long (64 bits).

Lukasz: is there a specific use case? Note you can use Numpy for such 
calculations.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Can't reproduce on 32-bit Linux.

$ time python2.7 -c print sum(xrange(3, 10**9, 3)) + sum(xrange(5, 10**9, 5)) 
- sum(xrange(15, 10**9, 15))
216668

real1m11.614s
user1m11.376s
sys 0m0.056s
$ time python3.4 -c print(sum(range(3, 10**9, 3)) + sum(range(5, 10**9, 5)) - 
sum(range(15, 10**9, 15)))
216668

real1m11.658s
user1m10.980s
sys 0m0.572s

$ python2.7 -m timeit -n1 -r1 sum(xrange(3, 10**9, 3)) + sum(xrange(5, 10**9, 
5)) - sum(xrange(15, 10**9, 15))
1 loops, best of 1: 72 sec per loop
$ python3.4 -m timeit -n1 -r1 sum(range(3, 10**9, 3)) + sum(range(5, 10**9, 
5)) - sum(range(15, 10**9, 15))
1 loops, best of 1: 72.5 sec per loop

$ python2.7 -m timeit -s a = list(range(10**6)) -- sum(a)
10 loops, best of 3: 114 msec per loop
$ python3.4 -m timeit -s a = list(range(10**6)) -- sum(a)
10 loops, best of 3: 83.5 msec per loop

What is sys.int_info on your build?

--
nosy: +serhiy.storchaka

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Łukasz Langa

New submission from Łukasz Langa:

I got a report that summing numbers is noticably slower on Python 3. This is 
easily reproducible:


$ time python2.7 -c print sum(xrange(3, 10**9, 3)) + sum(xrange(5, 10**9, 5)) 
- sum(xrange(15, 10**9, 15))
216668

real0m6.165s
user0m6.100s
sys 0m0.032s


$ time python3.4 -c print(sum(range(3, 10**9, 3)) + sum(range(5, 10**9, 5)) - 
sum(range(15, 10**9, 15)))
216668

real0m16.413s
user0m16.086s
sys 0m0.089s


I can't tell from initial poking what's the core issue here. Both examples 
produce equivalent bytecode, the builtin_sum() function is only noticably 
different in the fact that it uses PyLong_* across the board, including 
PyLong_AsLongAndOverlow. We'll need to profile this, which I didn't have time 
for yet.

--
components: Interpreter Core
messages: 242238
nosy: lukasz.langa, pitrou, rhettinger
priority: normal
severity: normal
stage: needs patch
status: open
title: sum() several times slower on Python 3
type: performance
versions: Python 3.4, Python 3.5

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Stefan Behnel

Stefan Behnel added the comment:

 there are three ingredients here - sum, (x)range and the integer addition 
 that sum will be performing at each iteration.

... not to forget the interpreter startup time on his machine. :)

I did a tiny bit of profiling and about 90% of the time seems to be spent 
creating and deallocating throw-away PyLong objects. My guess is that it simply 
lacks a free-list in _PyLong_New().

--
nosy: +scoder

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Mark Dickinson

Mark Dickinson added the comment:

Łukasz: there are three ingredients here - sum, (x)range and the integer 
addition that sum will be performing at each iteration.  Is there any chance 
you can separate the effects on your machine?

On my machine (OS X, 64-bit), I'm seeing *some* speed difference in the integer 
arithmetic, but not enough to explain the whole of the timing mismatch.

One thing we've lost in Python 3 is the fast path for small-int addition 
*inside* the ceval loop.  It may be possible to restore something there.

--
nosy: +mark.dickinson

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24076] sum() several times slower on Python 3

2015-04-29 Thread Mark Dickinson

Mark Dickinson added the comment:

Throwing out sum, I'm seeing significant slowdown simply from xrange versus 
range:

taniyama:Desktop mdickinson$ python2 -m timeit -s 'x = xrange(3, 10**9, 3)' 
'for e in x: pass'
10 loops, best of 3: 5.01 sec per loop
taniyama:Desktop mdickinson$ python3 -m timeit -s 'x = range(3, 10**9, 3)' 'for 
e in x: pass'
10 loops, best of 3: 8.62 sec per loop

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24076
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com