[issue24138] Speed up range() by caching and modifying long objects

2021-09-18 Thread Guido van Rossum


Guido van Rossum  added the comment:

(Never mind, that should have gone to https://bugs.python.org/issue24165, which 
is the integer freelist -- conclusion there in 2015 was also that it didn't do 
much.)

--


[issue24138] Speed up range() by caching and modifying long objects

2021-09-18 Thread Guido van Rossum


Guido van Rossum  added the comment:

Should we try the free list for integers again?

--
nosy: +gvanrossum


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Larry Hastings

Larry Hastings added the comment:

I'd rather have the general-purpose freelist for ints too.  How about we close
this issue now; assuming the freelist for ints goes in, we can abandon this
approach entirely.

--
resolution:  -> rejected
stage: patch review -> resolved
status: open -> closed


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

According to my and Larry's measurements [1], the distribution of created ints
by size while running the Python test suite is:

On 32-bit Linux:
int        42828741  13.40%
size        count        %   cum. %
  0         425353    0.99%   0.99%
  1       21399290   49.96%  50.96%
  2       10496856   24.51%  75.47%
  3        4873346   11.38%  86.85%
  4        1021563    2.39%  89.23%
  5        1246444    2.91%  92.14%
  6         733676    1.71%  93.85%
  7         123074    0.29%  94.14%
  8         139203    0.33%  94.47%
...

On 64-bit Linux:
int        47600237  13.53%
size        count        %   cum. %
  0         294964    0.62%   0.62%
  1       36135772   75.92%  76.53%
  2        4504046    9.46%  86.00%
  3        2109837    4.43%  90.43%
  4        1277995    2.68%  93.11%
  5         542775    1.14%  94.25%
  6         485451    1.02%  95.27%
...

86% of ints have size <= 3 on 32-bit and <= 2 on 64-bit. This is enough to 
represent 32-bit integers (as the Python 2 int did). I think we should support a 
free list not only for 1-digit ints, but at least for up to 3 digits on 32-bit 
builds and up to 2 digits on 64-bit builds. Another natural limit is 3 digits on 
a 64-bit build (enough to represent a 64-bit C long on Linux, or a pointer on 
all platforms). Larger integers are perhaps encountered mainly in tests.
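
A quick back-of-the-envelope check of those limits (assuming the usual CPython
digit widths of 15 bits per digit on 32-bit builds and 30 bits per digit on
64-bit builds; this little program is only an illustration):

/* Back-of-the-envelope check of the digit limits discussed above.
 * Assumes PYLONG_BITS_IN_DIGIT is 15 on 32-bit builds and 30 on 64-bit
 * builds, which is the usual default. */
#include <stdio.h>

int main(void)
{
    const struct { const char *build; int bits_per_digit; } cfg[] = {
        { "32-bit", 15 },
        { "64-bit", 30 },
    };
    for (int i = 0; i < 2; i++) {
        for (int ndigits = 1; ndigits <= 3; ndigits++) {
            /* n digits cover magnitudes below 2**(n * bits_per_digit) */
            printf("%s build: %d digit(s) -> |n| < 2**%d\n",
                   cfg[i].build, ndigits, ndigits * cfg[i].bits_per_digit);
        }
    }
    /* So 3 digits on 32-bit (45 bits) or 2 digits on 64-bit (60 bits)
     * comfortably cover every 32-bit value, and 3 digits on 64-bit
     * (90 bits) cover a 64-bit C long or a pointer. */
    return 0;
}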

Pre-allocating a block has a disadvantage: it is hard to free the allocated 
block. The program can create a lot of integers, then drop most of them and 
request the memory for other needs, but blocks once allocated for integers would 
never be freed. This is not a trivial design decision and should be discussed on 
Python-Dev and accepted by the BDFL.

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I'd rather stick to the simple freelist approach. People interested in 
(allegedly) more ambitious designs can open new issues, together with patches 
to evaluate their efficiency.

--



[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

I like the idea of adding a free list of longs in Python 3, but I think we 
should extend this somewhat to cover more ground, e.g. by pre-allocating a 
block of 1-digit long objects, like we did for Python 2 ints, and perhaps 
allocating up to 4k (= 1 memory page) towards such a free list.

The added cache locality of having the data in a pre-allocated block should 
provide some more performance.
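
A rough sketch of what that block approach could look like, loosely following
what Python 2's intobject.c did for ints; the block size, helper names and
linking trick below are illustrative assumptions, not a patch:

/* Sketch of block allocation for 1-digit longs, in the spirit of Python 2's
 * intobject.c.  All names and the block size are illustrative only. */
#include "Python.h"
#include "longintrepr.h"

#define BLOCK_SIZE 4096                       /* roughly one memory page */
#define N_LONGS (BLOCK_SIZE / sizeof(PyLongObject))

static PyLongObject *free_list = NULL;        /* linked through ob_type, Python 2 style */

static PyLongObject *
fill_free_list(void)
{
    /* Carve a page worth of 1-digit long slots and thread them into a LIFO
     * free list.  The block itself is never returned to the allocator. */
    PyLongObject *block = PyMem_Malloc(N_LONGS * sizeof(PyLongObject));
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i + 1 < N_LONGS; i++)
        Py_SET_TYPE(&block[i], (PyTypeObject *)&block[i + 1]);
    Py_SET_TYPE(&block[N_LONGS - 1], NULL);
    return block;
}

static PyLongObject *
alloc_one_digit_long(void)
{
    if (free_list == NULL && (free_list = fill_free_list()) == NULL)
        return (PyLongObject *)PyErr_NoMemory();
    PyLongObject *v = free_list;
    free_list = (PyLongObject *)Py_TYPE(v);   /* pop: next link hides in ob_type */
    _Py_NewReference((PyObject *)v);
    Py_SET_TYPE(v, &PyLong_Type);
    Py_SET_SIZE(v, 1);
    return v;                                 /* caller fills in ob_digit[0] */
}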

--
nosy: +lemburg


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

[1] http://comments.gmane.org/gmane.comp.python.devel/153078

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Marc-Andre Lemburg

Marc-Andre Lemburg added the comment:

On 11.05.2015 11:42, Serhiy Storchaka wrote:
>
> Pre-allocating a block has a disadvantage: it is hard to free the allocated
> block. The program can create a lot of integers, then drop most of them and
> request the memory for other needs, but blocks once allocated for integers
> would never be freed. This is not a trivial design decision and should be
> discussed on Python-Dev and accepted by the BDFL.

True, but if it's only 1-4k RAM, I don't think anyone would mind :-)

Python 2 is doing exactly that with 1k RAM for the integer free
list. I think it was one of the first free lists ever added to
Python, so done in a time when RAM was expensive ;-).

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-11 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Well, 1-4k RAM can be consumed just after the start up (it's only 100-300 
integers) and never freed. What next?

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-07 Thread Ethan Furman

Changes by Ethan Furman et...@stoneleaf.us:


--
nosy: +ethan.furman


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
keywords: +patch
Added file: http://bugs.python.org/file39308/long_free_list.patch


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Larry Hastings

Larry Hastings added the comment:

10**3 doesn't show off this hack as much as other numbers would; the hack only 
operates from 257 up to the maximum value that fits in a single long digit 
(32767 on 32-bit builds, 2**30 on 64-bit builds).

Anyway, a freelist for one-digit longs seems nearly as good, and it's a lot more 
general-purpose, so I'm much more interested in that.

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

This regression was just discussed in issue24076. A general suggestion of a free 
list for small ints was proposed there; it could speed up other integer 
computations, but comprehensive benchmarking results are needed.

See also the similar issue23507 for tuples. Perhaps we need a general solution 
for fast specialized free lists.

--


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Mark Lawrence

Mark Lawrence added the comment:

How does this apply to people like me who scarcely, if ever, use range() in 
standard for loops?  There are plenty of recipes using itertools that show you 
don't need range(), so is this really needed?  No axe to grind, just curious.

--
nosy: +BreamoreBoy


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Larry Hastings

New submission from Larry Hastings:

This probably shouldn't be checked in.  But it was an interesting experiment, 
and I did get it to work.

My brother forwarded me this question from Stack Overflow:

http://stackoverflow.com/questions/23453133/is-there-a-reason-python-3-enumerates-slower-than-python-2

The debate brought up a good point: a lot of the overhead of range() is in 
creating and destroying the long objects.  I wondered, could I get rid of that? 
 Long story short, yes.

rangeiterobject is a special-case range iterator used when you're iterating over 
integer values that all fit inside C longs.  (Otherwise range has to use the 
much slower, general-purpose longrangeiterobject.)  rangeiter_next is simple: it 
computes the new value, then returns PyLong_FromLong of that value.

First thought: cache a reference to the previous value.  If its reference count 
is 1, we have the only reference, so overwrite its value and return it.  But 
that doesn't help in the general case, because in "for x in range(1000)", x is 
holding a reference at the time __next__ is called on the iterator.

The trick: cache *two* old yielded objects.  In the general case, by the second 
iteration, everyone else has dropped their references to the older cached 
object and we can modify it safely.

The second trick: if the value you're yielding is one of the interned small 
ints, you *have* to return the interned small int.  (Otherwise you break 0 == 
0, I kid you not.)
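
A minimal sketch of the idea, for readers without the attached
larry.range.hack.1.txt; the cache[2]/cache_idx fields, the helper name and the
exact guards below are invented for illustration and are not the actual patch:

/* Sketch of the two-object cache trick described above -- NOT the attached
 * patch.  The extra fields and guards are assumptions for illustration. */
#include "Python.h"
#include "longintrepr.h"          /* PyLongObject internals, PyLong_MASK, digit */

typedef struct {
    PyObject_HEAD
    long index, start, step, len; /* same bookkeeping the real rangeiterobject keeps */
    PyObject *cache[2];           /* the last two values we yielded */
    int cache_idx;
} rangeiter_sketch;

static PyObject *
rangeiter_next_sketch(rangeiter_sketch *r)
{
    if (r->index >= r->len)
        return NULL;                              /* exhausted -> StopIteration */
    long value = r->start + r->index++ * r->step;

    PyObject *result;
    PyObject *old = r->cache[r->cache_idx];       /* yielded two calls ago */
    if (old != NULL &&
        Py_REFCNT(old) == 1 &&                    /* the cache holds the only reference */
        Py_SIZE(old) == 1 &&                      /* a positive, single-digit long */
        value > 256 &&                            /* never shadow an interned small int */
        value <= (long)PyLong_MASK)               /* new value still fits in one digit */
    {
        /* Nobody else can see this object any more: overwrite it in place. */
        ((PyLongObject *)old)->ob_digit[0] = (digit)value;
        result = old;
        Py_INCREF(result);
    }
    else {
        result = PyLong_FromLong(value);          /* normal slow path */
        if (result == NULL)
            return NULL;
    }

    /* Rotate the two-slot cache so 'result' becomes a candidate for reuse
     * two iterations from now. */
    Py_INCREF(result);
    Py_XDECREF(r->cache[r->cache_idx]);
    r->cache[r->cache_idx] = result;
    r->cache_idx ^= 1;
    return result;
}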

With this patch applied all regression tests pass.  And, on my desktop machine, 
the benchmark they used in the above link:

./python -mtimeit -n 5 -r 2 -s "cnt = 0" "for i in range(1000): cnt += 1"

drops from 410ms to 318ms (5 loops, best of 2: 318 msec per loop).


This implementation requires the rangeiterobject to have intimate knowledge of 
the implementation of PyLongObject, including copy-and-pasting some information 
that isn't otherwise exposed (the max small int, essentially).  At the very 
least, that information would need to be exposed properly so rangeiterobject 
could use it correctly before this could be checked in.

It might be cleaner for longobject.c to expose a private "set this long object 
to this value" function.  It would fail if it couldn't do so safely: if the 
value is an interned (small) int, or if the long is of the wrong size 
(Py_SIZE(o) != 1).
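
Such a helper might look roughly like this; the name, signature and guards are
hypothetical, not an existing longobject.c API:

/* Hypothetical shape of such a helper; everything here is made up to
 * illustrate the suggestion.  Try to overwrite 'obj' in place with 'value';
 * return 1 on success, 0 if the caller must allocate a new object instead. */
#include "Python.h"
#include "longintrepr.h"

static int
_PyLong_TrySetValue_sketch(PyObject *obj, long value)
{
    if (!PyLong_CheckExact(obj) ||
        Py_REFCNT(obj) != 1 ||            /* someone else could see the change */
        Py_SIZE(obj) != 1 ||              /* only positive one-digit longs */
        value <= 256 ||                   /* interned small ints must stay canonical */
        value > (long)PyLong_MASK)        /* would need more than one digit */
        return 0;
    ((PyLongObject *)obj)->ob_digit[0] = (digit)value;
    return 1;
}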


Is this interesting enough to pursue?  I'm happy to drop it.

--
assignee: larry
components: Interpreter Core
files: larry.range.hack.1.txt
messages: 242685
nosy: larry, pitrou, rhettinger, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Speed up range() by caching and modifying long objects
type: performance
versions: Python 3.5
Added file: http://bugs.python.org/file39307/larry.range.hack.1.txt


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Stefan Behnel

Stefan Behnel added the comment:

See issue 24076.

--
nosy: +scoder


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Mark Dickinson

Changes by Mark Dickinson dicki...@gmail.com:


--
nosy: +mark.dickinson


[issue24138] Speed up range() by caching and modifying long objects

2015-05-06 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch that adds a free list for 1-digit long objects.

$ ./python -m timeit -s "r = range(10**3)" -- "for i in r: pass"
Unpatched: 1 loops, best of 3: 54.4 usec per loop
With free list: 1 loops, best of 3: 38 usec per loop
With hacked range: 1 loops, best of 3: 34.5 usec per loop
Python 2.7: 1 loops, best of 3: 37.1 usec per loop

$ ./python -m timeit -s "r = list(range(10**3))" -- "for i in r: pass"
1 loops, best of 3: 30.7 usec per loop

In Python 2.7:
$ ./python -m timeit -s "r = xrange(10**3)" -- "for i in r: pass"
1 loops, best of 3: 41.4 usec per loop
$ ./python -m timeit -s "r = range(10**3)" -- "for i in r: pass"
1 loops, best of 3: 37.1 usec per loop
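
For readers without the attached long_free_list.patch, here is a minimal sketch
of what a bounded free list for 1-digit longs typically looks like, loosely
modelled on CPython's float free list; the cap and helper names are
assumptions, not the patch itself:

/* Sketch of a free list for 1-digit longs -- NOT the attached patch.
 * The cap and helper names are illustrative assumptions. */
#include "Python.h"
#include "longintrepr.h"

#define MAX_FREE_LONGS 100

static PyLongObject *free_list = NULL;   /* singly linked through ob_type */
static int numfree = 0;

static PyLongObject *
onedigit_long_alloc(void)
{
    if (free_list != NULL) {
        PyLongObject *v = free_list;
        free_list = (PyLongObject *)Py_TYPE(v);    /* pop; next link hides in ob_type */
        numfree--;
        _Py_NewReference((PyObject *)v);
        Py_SET_TYPE(v, &PyLong_Type);
        Py_SET_SIZE(v, 1);
        return v;                                  /* caller fills in ob_digit[0] */
    }
    return _PyLong_New(1);                         /* empty free list: allocate normally */
}

static void
onedigit_long_dealloc(PyLongObject *v)
{
    if (Py_SIZE(v) == 1 && numfree < MAX_FREE_LONGS) {
        Py_SET_TYPE(v, (PyTypeObject *)free_list); /* push; reuse ob_type as the link */
        free_list = v;
        numfree++;
    }
    else {
        Py_TYPE(v)->tp_free((PyObject *)v);        /* list is full: really free it */
    }
}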

--
