
[issue10227] Improve performance of MemoryView slicing

2012-02-13 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Sure.  Flagging this as fixed.  Can't close it until 10181 is closed due to 
some dependency thing. (Perhaps someone else knows what to do?)

--
resolution:  -> fixed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10227
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue10227] Improve performance of MemoryView slicing

2012-02-13 Thread Stefan Krah


Stefan Krah stefan-use...@bytereef.org added the comment:

Great. I removed the dependency since it's fixed in both cpython
and pep-3118.

--
dependencies:  -Problems with Py_buffer management in memoryobject.c (and 
elsewhere?)
stage:  -> committed/rejected
status: open -> closed


[issue10227] Improve performance of MemoryView slicing

2012-02-10 Thread Stefan Krah


Stefan Krah stefan-use...@bytereef.org added the comment:

Kristján, I ran the benchmarks from http://bugs.python.org/issue10227#msg143731
in the current cpython and pep-3118 repos. In both cases the differences between
Linux and Windows are far less pronounced than they used to be. All benchmarks
were run with the x64 builds.

I also ran the profile guided optimization build for Visual Studio. The
results are equal to (or better than) the non-pgo gcc results. In my
experience Visual Studio relies heavily on PGO for x64 builds. The
default optimizer is just not as good as gcc's.


If you can reproduce similar results, I think we can close this issue.


./python -m timeit -n 1000 -s "x = ((b'x'*1))" "x[:100]"

linux-cpython (4244e4348362): 0.102 usec
linux-pep-3118 (memoryview:534f6bbe5422): 0.098 usec

windows-cpython:      0.109 usec
windows-pep-3118:     0.112 usec
windows-pep-3118-pgo: 0.103 usec


./python -m timeit -n 1000 -s "x = (bytearray(b'x'*1))" "x[:100]"

linux-cpython (4244e4348362): 0.107 usec
linux-pep-3118 (memoryview:534f6bbe5422): 0.109 usec

windows-cpython:      0.127 usec
windows-pep-3118:     0.128 usec
windows-pep-3118-pgo: 0.106 usec


./python -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[:100]"

linux-cpython (4244e4348362): 0.127 usec
linux-pep-3118 (memoryview:534f6bbe5422): 0.12 usec

windows-cpython:      0.145 usec
windows-pep-3118:     0.14 usec
windows-pep-3118-pgo: 0.0984 usec

--


[issue10227] Improve performance of MemoryView slicing

2011-11-18 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

Updated single slice caching patch for latest Py3.3 hg tip.

--
Added file: http://bugs.python.org/file23727/slice-object-cache.patch


[issue10227] Improve performance of MemoryView slicing

2011-11-18 Thread Roundup Robot


Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset fa2f8dd077e0 by Antoine Pitrou in branch 'default':
Issue #10227: Add an allocation cache for a single slice object.
http://hg.python.org/cpython/rev/fa2f8dd077e0

--
nosy: +python-dev


[issue10227] Improve performance of MemoryView slicing

2011-11-18 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

Thanks Stefan. I'm leaving the issue open since the original topic is a bit 
different.

--


[issue10227] Improve performance of MemoryView slicing

2011-09-08 Thread Stefan Krah


Stefan Krah stefan-use...@bytereef.org added the comment:

Kristján, could you check out the new implementation over at #10181?
I have trouble reproducing a big speed difference between bytearray
and memoryview (Linux, 64-bit). Here are the timings I get for the
current and the new version:


Slicing
-------

1) ./python -m timeit -n 1000 -s "x = bytearray(b'x'*1)" "x[:100]"
2) ./python -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[:100]"

1) cpython: 0.137 usec   pep-3118: 0.138 usec
2) cpython: 0.132 usec   pep-3118: 0.132 usec


Slicing with overhead for multidimensional capabilities
-------------------------------------------------------

1) ./python -m timeit -n 1000 -s "import _testbuffer; x = _testbuffer.ndarray([ord('x') for _ in range(1)], shape=[1])" "x[:100]"
2) ./python -m timeit -n 1000 -s "import numpy; x = numpy.ndarray(buffer=bytearray(b'x'*1), shape=[1], dtype='B')" "x[:100]"

1) _testbuffer.c: 0.198 usec
2) numpy: 0.415 usec


Slice assignment
----------------

1) ./python -m timeit -n 1000 -s "x = bytearray(b'x'*1)" "x[5:10] = x[7:12]"
2) ./python -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[5:10] = x[7:12]"

1) cpython: 0.242 usec   pep-3118: 0.240 usec
2) cpython: 0.282 usec   pep-3118: 0.287 usec
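
The slice-assignment statement timed above copies between two overlapping ranges of the same buffer. A minimal sketch of what that statement does, assuming a 20-byte buffer chosen only for illustration (the benchmark's real buffer size is not recoverable from this archive); the helper name `overlapping_assign` is hypothetical:

```python
# Illustration only: the 20-byte buffers are an assumed size, not the
# size used in the benchmark above.
def overlapping_assign(target):
    """Run the benchmarked statement on any writable byte sequence."""
    target[5:10] = target[7:12]
    return bytes(target)


plain = bytearray(range(20))
view = memoryview(bytearray(range(20)))

plain_result = overlapping_assign(plain)  # right-hand side is copied first
view_result = overlapping_assign(view)    # right-hand side is a view

print(list(plain_result[:12]))
print(list(view_result[:12]))
```

Both objects should end up with the same contents even though source and destination overlap; the difference the benchmark measures is only in how much work each type does to get there.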


Slice assignment, overhead for multidimensional capabilities
------------------------------------------------------------

1) ./python -m timeit -n 1000 -s "import _testbuffer; x = _testbuffer.ndarray([ord('x') for _ in range(1)], shape=[1], flags=_testbuffer.ND_WRITABLE)" "x[5:10] = x[7:12]"
2) ./python -m timeit -n 1000 -s "import numpy; x = numpy.ndarray(buffer=bytearray(b'x'*1), shape=[1], dtype='B')" "x[5:10] = x[7:12]"

1) _testbuffer.c: 0.469 usec
2) numpy: 1.37 usec


tolist
------

1) ./python -m timeit -n 1 -s "import array; x = array.array('B', b'x'*1)" "x.tolist()"
2) ./python -m timeit -n 1 -s "x = memoryview(bytearray(b'x'*1))" "x.tolist()"

1) cpython, array:  104.0 usec
2) pep-3118, memoryview: 90.5 usec
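
The same tolist comparison can be run from a script instead of the command line. This is a rough sketch, assuming a 10,000-byte payload (the original size is truncated in this archive); `best_usec` is a hypothetical helper, and the absolute numbers will differ by machine and interpreter version:

```python
import array
import timeit

SIZE = 10_000  # assumed size; the value in the original commands is truncated

payload = b"x" * SIZE
as_array = array.array("B", payload)
as_view = memoryview(bytearray(payload))


def best_usec(func, number=100, repeat=3):
    """Best-of-`repeat` time per call in microseconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number * 1e6


timings = {
    "array.tolist": best_usec(as_array.tolist),
    "memoryview.tolist": best_usec(as_view.tolist),
}

for name, usec in timings.items():
    print(f"{name:<18} {usec:.1f} usec")
```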


tolist, struct module overhead
------------------------------

1) ./python -m timeit -n 1 -s "import _testbuffer; x = _testbuffer.ndarray([ord('x') for _ in range(1)], shape=[1])" "x.tolist()"
2) ./python -m timeit -n 1 -s "import numpy; x = numpy.ndarray(buffer=bytearray(b'x'*1), shape=[1], dtype='B')" "x.tolist()"

1) _testbuffer.c: 1.38 msec (yes, that's milliseconds!)
2) numpy: 104 usec

--
nosy: +skrah


[issue10227] Improve performance of MemoryView slicing

2011-09-08 Thread Stefan Krah


Changes by Stefan Krah stefan-use...@bytereef.org:


--
dependencies: +Problems with Py_buffer management in memoryobject.c (and 
elsewhere?)


[issue10227] Improve performance of MemoryView slicing

2011-09-08 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

I'm afraid I had put this matter _far_ out of my head :)  Seeing the amount of 
discussion on that other defect (stuff I had already come across and scratched 
my head over), I think there is a lot of catching up that I'd need to do, and I 
am unable to give this any priority at the moment.
My original patch sought to even out the slicing performance difference between 
bytes and bytearray.  bytes objects were very streamlined while the others were 
not.

python.exe -m timeit -n 1000 -s "x = ((b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.125 usec per loop

python.exe -m timeit -n 1000 -s "x = (bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.202 usec per loop

Did you take a look at this at all?

--


[issue10227] Improve performance of MemoryView slicing

2011-09-08 Thread Stefan Krah


Stefan Krah stefan-use...@bytereef.org added the comment:

I see. I thought this was mainly about memoryview performance, so
I did not specifically look at bytearray. The poor performance seems
to be Windows specific:

C:\Users\stefan\hg\pep-3118\PCbuild\amd64\python.exe -m timeit -n 1000 -s "x = ((b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.118 usec per loop

C:\Users\stefan\hg\pep-3118\PCbuild\amd64\python.exe -m timeit -n 1000 -s "x = (bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.191 usec per loop

C:\Users\stefan\hg\pep-3118\PCbuild\amd64\python.exe -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.146 usec per loop


Linux:

bytes: 10.9 usec   bytearray: 0.14 usec   memoryview: 0.14 usec

--


[issue10227] Improve performance of MemoryView slicing

2011-09-08 Thread Stefan Krah


Stefan Krah stefan-use...@bytereef.org added the comment:

With Stefan Behnel's slice-object-cache.patch, I get this (PEP-3118 branch):


Linux:   bytes: 0.097 usec  bytearray: 0.127 usec  memoryview: 0.12  usec
Windows: bytes: 0.11  usec  bytearray: 0.184 usec  memoryview: 0.139 usec


On Linux, that's quite a nice speedup.

--


[issue10227] Improve performance of MemoryView slicing

2011-06-25 Thread Mark Dickinson


Changes by Mark Dickinson dicki...@gmail.com:


--
assignee: mark.dickinson -> 


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

Here are some real micro benchmarks (note that the pybench benchmarks actually 
do lots of other stuff besides slicing):

base line:

$ ./python -m timeit -s 'l = list(range(100)); s=slice(None)' 'l[s]'
100 loops, best of 3: 0.464 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None)' 'l[s]'
1000 loops, best of 3: 0.149 usec per loop
$ ./python -m timeit -s 'l = list(range(10)); s=slice(None,1)' 'l[s]'
1000 loops, best of 3: 0.135 usec per loop


patched:

$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
1000 loops, best of 3: 0.158 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
100 loops, best of 3: 0.49 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:]'
100 loops, best of 3: 0.487 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:3]'
1000 loops, best of 3: 0.184 usec per loop

$ ./python -m timeit -s 'l = list(range(10))' 'l[:]'
1000 loops, best of 3: 0.185 usec per loop
$ ./python -m timeit -s 'l = list(range(10))' 'l[1:]'
1000 loops, best of 3: 0.181 usec per loop


original:

$ ./python -m timeit -s 'l = list(range(100))' 'l[:1]'
1000 loops, best of 3: 0.171 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[:]'
100 loops, best of 3: 0.499 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:]'
100 loops, best of 3: 0.509 usec per loop
$ ./python -m timeit -s 'l = list(range(100))' 'l[1:3]'
1000 loops, best of 3: 0.198 usec per loop

$ ./python -m timeit -s 'l = list(range(10))' 'l[:]'
1000 loops, best of 3: 0.188 usec per loop
$ ./python -m timeit -s 'l = list(range(10))' 'l[1:]'
100 loops, best of 3: 0.196 usec per loop


So the maximum impact seems to be 8% for very short slices (10) and it quickly 
goes down for longer slices where the copy impact clearly dominates. There's 
still some 2% for 100 items, though.

I find it interesting that the base line is way below the other timings. That 
makes me think it's actually worth caching constant slice instances, as CPython 
already does for tuples. Cython also caches both now. I would expect that 
constant slices like [:], [1:] or [:-1] are extremely common. As you can see 
above, caching them could speed up slicing by up to 30% for short lists, and 
still some 7% for a list of length 100.
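
The base-line comparison above (a slice object built once in the setup versus a slice literal evaluated on every loop) can be repeated from a script. This is a minimal sketch under stated assumptions: `best_usec` is a hypothetical helper, the loop count is arbitrary, and newer interpreters may handle slice literals differently, so the gap reported in this thread may not reproduce:

```python
import timeit


def best_usec(stmt, setup, number=200_000, repeat=3):
    """Best-of-`repeat` time per loop in microseconds, like `python -m timeit`."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


setup = "l = list(range(10)); s = slice(None)"

prebuilt = best_usec("l[s]", setup)  # slice object created once, in the setup
literal = best_usec("l[:]", setup)   # slice literal evaluated on every loop

print(f"prebuilt slice object: {prebuilt:.3f} usec per loop")
print(f"slice literal:         {literal:.3f} usec per loop")
```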

Stefan

--


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

Here's another base line test: slicing an empty list

patched:

$ ./python -m timeit -s 'l = []' 'l[:]'
1000 loops, best of 3: 0.0847 usec per loop

original:

$ ./python -m timeit -s 'l = []' 'l[:]'
1000 loops, best of 3: 0.0977 usec per loop

That's about 13% less overhead.
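
For reference, the percentage follows directly from the two timings quoted above. A small worked calculation (the function name `relative_saving` is only illustrative):

```python
def relative_saving(original, patched):
    """Percentage of the original time that the patched build saves."""
    return (original - patched) / original * 100.0


# Timings in usec per loop, copied from the empty-list runs above.
empty_list = relative_saving(original=0.0977, patched=0.0847)
print(f"{empty_list:.1f}% less overhead")
```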

--


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

> I find it interesting that the base line is way below the other
> timings. That makes me think it's actually worth caching constant
> slice instances, as CPython already does for tuples.

Indeed. I have never touched it, but I suppose it needs an upgrade of
the marshal format to support slices.
(of course, this will not help for other common cases such as l[x:x+2]).

--


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

> of course, this will not help for other common cases such as l[x:x+2]

... which is exactly what this slice caching patch is there for. ;-)

--


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

A quick test against the py3k stdlib:

find . -name "*.py" | while read -r file; do egrep '\[[-0-9]*:[-0-9]*\]' "$file"; done | wc -l

This finds 2096 lines in 393 files.
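
A rough Python equivalent of that search, for anyone who wants to repeat the count on another source tree. It is a sketch that reuses the same regular expression; the function names and the `Lib` directory in the usage note are assumptions, and the counts will differ from the figures above on any other checkout:

```python
import re
from pathlib import Path

# Same pattern as the egrep call: a subscript whose bounds are only
# digits, minus signs, or empty, e.g. [:], [1:], [:-1], [1:3].
CONSTANT_SLICE = re.compile(r"\[[-0-9]*:[-0-9]*\]")


def count_constant_slices(lines):
    """Count the lines that contain at least one constant slice."""
    return sum(1 for line in lines if CONSTANT_SLICE.search(line))


def scan_tree(root):
    """Return (matching lines, files with at least one match) under root."""
    total_lines = 0
    total_files = 0
    for path in Path(root).rglob("*.py"):
        with open(path, encoding="utf-8", errors="replace") as handle:
            hits = count_constant_slices(handle)
        total_lines += hits
        total_files += 1 if hits else 0
    return total_lines, total_files
```

Calling `scan_tree("Lib")` from the root of a CPython checkout would give the two numbers for that tree.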

--


[issue10227] Improve performance of MemoryView slicing

2011-02-03 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

Created follow-up issue 11107 for caching constant slice objects.

--


[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

Any benchmark numbers for the slice cache?
Also, is the call to PyObject_INIT necessary?

--


[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

> Any benchmark numbers for the slice cache?

I ran the list tests in pybench and got this:

Test            minimum run-time        average run-time
                this   other    diff    this   other    diff

ListSlicing:    66ms    67ms   -2.2%    67ms    68ms   -2.7%
 SmallLists:    61ms    64ms   -4.5%    61ms    65ms   -5.6%

Totals:        127ms   131ms   -3.3%   128ms   133ms   -4.1%

Repeating this gave me anything between 1.5% and 3.5% in total, with 2% for 
the small lists benchmark (which is the expected best case as slicing large 
lists obviously dominates the slice object creation).

IMHO, even 2% would be pretty good for such a small change.


> Also, is the call to PyObject_INIT necessary?

In any case, the ref-count needs to be re-initialised to 1. A call to 
_Py_NewReference() would be enough, though, following the example in 
listobject.c. So you can replace

    PyObject_INIT(obj, &PySlice_Type);

by

    _Py_NewReference((PyObject *)obj);

in the patch. New patch attached.
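
To make the re-initialisation point concrete, here is a conceptual Python model of a one-slot allocation cache. It is not the C code from the patch: `SlotObject`, `OneSlotCache`, and the `refcount` attribute are hypothetical stand-ins that only mirror the idea of parking one released object and resetting its reference count to 1 when it is handed out again.

```python
class SlotObject:
    """Stand-in for a small C object; `refcount` models its reference count."""

    def __init__(self):
        self.refcount = 1
        self.fields = None


class OneSlotCache:
    """Keeps at most one released object around for reuse."""

    def __init__(self):
        self._slot = None
        self.fresh_allocations = 0

    def allocate(self, fields):
        obj = self._slot
        if obj is not None:
            # Reuse the parked object: empty the slot and reset the count to 1.
            self._slot = None
            obj.refcount = 1
        else:
            obj = SlotObject()
            self.fresh_allocations += 1
        obj.fields = fields
        return obj

    def release(self, obj):
        obj.refcount = 0
        obj.fields = None
        if self._slot is None:
            self._slot = obj  # park it for the next allocation
        # Otherwise the slot is taken; a C implementation would free obj here.


cache = OneSlotCache()
first = cache.allocate((None, 100, None))
cache.release(first)
second = cache.allocate((5, 10, None))
print(second is first, cache.fresh_allocations)
```

In this model the second allocation returns the very object that was just released, so only one fresh allocation takes place.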

--
Added file: http://bugs.python.org/file20650/slice-object-cache.patch


[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

> I ran the list tests in pybench and got this:
>
> Test            minimum run-time        average run-time
>                 this   other    diff    this   other    diff
>
> ListSlicing:    66ms    67ms   -2.2%    67ms    68ms   -2.7%
>  SmallLists:    61ms    64ms   -4.5%    61ms    65ms   -5.6%
>
> Totals:        127ms   131ms   -3.3%   128ms   133ms   -4.1%
>
> Repeating this gave me anything between 1.5% and 3.5% in total, with
> 2% for the small lists benchmark (which is the expected best case as
> slicing large lists obviously dominates the slice object creation).
>
> IMHO, even 2% would be pretty good for such a small change.

Well, 3% on such micro-benchmarks (and, I assume, 0% on the rest) is
generally considered very small.
On the other hand, I agree the patch itself is quite simple.

> by
>
>     _Py_NewReference((PyObject *)obj);
>
> in the patch. New patch attached.

Don't you also need a _Py_ForgetReference() at the other end? Or have I
missed it?

--


[issue10227] Improve performance of MemoryView slicing

2011-02-02 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

There's a PyObject_Del(obj) in all code paths.

--


[issue10227] Improve performance of MemoryView slicing

2011-02-01 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

I've extracted and fixed the part of this patch that implements the slice 
object cache. In particular, PySlice_Fini() was incorrectly implemented. This 
patch applies cleanly for me against the latest py3k branch.

--
Added file: http://bugs.python.org/file20639/slice-object-cache.patch


[issue10227] Improve performance of MemoryView slicing

2011-01-03 Thread Antoine Pitrou


Changes by Antoine Pitrou pit...@free.fr:


--
assignee:  -> mark.dickinson
nosy: +mark.dickinson
versions: +Python 3.3 -Python 3.2


[issue10227] Improve performance of MemoryView slicing

2010-11-01 Thread Stefan Behnel


Stefan Behnel sco...@users.sourceforge.net added the comment:

I find it a lot easier to appreciate patches that implement a single change 
than those that mix different changes. There are three different things in your 
patch, which I would like to see in at least three different commits. I'd be 
happy if you could separate the changes into more readable feature patches. 
That makes it easier to accept them.

I'm generally happy about the slice changes, but you will have to benchmark the 
equivalent changes in Py3.2 to prove that they are similarly worth applying 
there.

--
nosy: +scoder


[issue10227] Improve performance of MemoryView slicing

2010-11-01 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

The benchmarks are from 3.2.
Also, I'll do a more relevant profiling session for 3.2.  This patch is based 
on profiling results from 2.7, so there might be more relevant optimization 
cases in 3.2.

--


[issue10227] Improve performance of MemoryView slicing

2010-11-01 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

In case I'm not clear enough:
The patch is for 3.2 and the benchmarks are from 3.2, but it was created based 
on 2.7 results, which may not fully apply to 3.2.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


New submission from Kristján Valur Jónsson krist...@ccpgames.com:

In a recent email exchange on python-dev, Antoine Pitrou mentioned that slicing 
memoryview objects (lazy slices) wasn't necessarily very efficient when dealing 
with short slices.  The data he posted was:


$ ./python -m timeit -s "x = b'x'*1" "x[:100]"
1000 loops, best of 3: 0.134 usec per loop
$ ./python -m timeit -s "x = memoryview(b'x'*1)" "x[:100]"
1000 loops, best of 3: 0.151 usec per loop

Actually, this is not a fair comparison.  A more realistic alternative to the 
memoryview is the bytearray, a mutable buffer.  My local tests gave these 
numbers:

python.exe -m timeit -n 1000 -s "x = ((b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.14 usec per loop

python.exe -m timeit -n 1000 -s "x = (bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.215 usec per loop

python.exe -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.163 usec per loop

In this case, lazy slicing is indeed faster than greedy slicing.  However, I 
was intrigued by how much these cases differ.  Why was slicing bytes objects so 
much faster?  Each should just result in the generation of a single object.
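
The difference between lazy and greedy slicing can be seen directly from Python. A minimal sketch, assuming a 1000-byte buffer chosen only for illustration: slicing the bytearray copies the bytes, while slicing the memoryview yields another view onto the same memory.

```python
source = bytearray(b"x" * 1000)  # arbitrary illustrative size

greedy = source[:100]            # new bytearray holding a copy of the bytes
lazy = memoryview(source)[:100]  # new view onto the same underlying memory

# Change the original buffer after both slices were taken.
source[0] = ord("y")

print(chr(greedy[0]))  # the copy still holds the old byte
print(chr(lazy[0]))    # the view reflects the change
```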

It turns out that the slicing operation for strings (and sequences) is very 
streamlined in the core.  To address this to some extent, I provide a patch with 
three main components:

1) There is now a single-object cache of slice objects.  These are generated by 
the core when slicing and immediately released.  Reusing them if possible is 
very beneficial.
2) PySlice_GetIndicesEx couldn't be optimized because of aliasing.  Fixing 
that function sped it up considerably.
3) A new API that creates a memory view from a base memory view and a slice is 
much faster.  The old way would do two copies of a Py_buffer, with adverse 
effects on cache performance.
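
The first two components concern objects that are normally invisible from Python code. As a small illustration (the `Probe` class is hypothetical), a subscript with a colon hands a temporary slice object to `__getitem__`, and `slice.indices()` is the Python-level way to clip such a slice to a given length, which is the kind of normalisation the slicing code has to perform on every call:

```python
class Probe:
    """Records whatever key the interpreter passes to __getitem__."""

    def __init__(self):
        self.keys = []

    def __getitem__(self, key):
        self.keys.append(key)
        return key


probe = Probe()
probe[:100]
start, stop = 5, 10
probe[start:stop]

print(probe.keys)

# Clip the first slice to a sequence of length 10: (start, stop, step).
normalised = probe.keys[0].indices(10)
print(normalised)
```

Each subscript produced a separate slice object that was only needed for the duration of the call, which is why reusing a single cached object can pay off for short slices.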

Applying this patch provides the following figures:

python.exe -m timeit -n 1000 -s "x = ((b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.125 usec per loop

python.exe -m timeit -n 1000 -s "x = (bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.202 usec per loop

python.exe -m timeit -n 1000 -s "x = memoryview(bytearray(b'x'*1))" "x[:100]"
1000 loops, best of 3: 0.138 usec per loop

In memoryobject.c there was a comment stating that there should be an API for 
this.  Now there is one, although it is internal only.
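
The three command-line measurements can also be collected in one script. This is a sketch under stated assumptions: the 10,000-byte buffer size is a guess (the sizes in the commands above are truncated in this archive), `best_usec` is a hypothetical helper, and the absolute numbers depend on the machine, compiler, and interpreter version.

```python
import timeit

SIZE = 10_000  # assumed buffer size; the original value is truncated

SETUPS = {
    "bytes": f"x = b'x' * {SIZE}",
    "bytearray": f"x = bytearray(b'x' * {SIZE})",
    "memoryview": f"x = memoryview(bytearray(b'x' * {SIZE}))",
}


def best_usec(setup, stmt="x[:100]", number=200_000, repeat=3):
    """Best-of-`repeat` time per loop in microseconds."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number * 1e6


results = {name: best_usec(setup) for name, setup in SETUPS.items()}

for name, usec in results.items():
    print(f"{name:<10} {usec:.3f} usec per loop")
```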

--
components: Interpreter Core
keywords: needs review, patch
messages: 119872
nosy: krisvale, pitrou
priority: normal
severity: normal
status: open
title: Improve performance of MemoryView slicing
type: performance
versions: Python 3.2


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

You forgot to attach your patch.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Oh dear.  Here it is.

--
Added file: http://bugs.python.org/file19410/memoryobj.patch


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

But then, perhaps implementing the sequence protocol for memoryviews might be 
more efficient still.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

The sequence protocol (if I'm not confused) only works with a PyObject ** array.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

As an additional point:  the PyMemoryObject has a base member that I think is 
redundant.  The view.obj should be sufficient.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

> As an additional point:  the PyMemoryObject has a base member that I
> think is redundant.  the view.obj should be sufficient.

Yes, that's what I think as well.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

In 2.x, strings are sliced using PySequence_GetSlice().  ceval.c in 3.0 is 
different: there is no apply_slice there (despite comments to that effect).  
I'd have to take another look with the profiler to figure out how bytes slicing 
in 3.0 works, but I suspect that it is somehow fast-tracked past the creation 
of slice objects, etc.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Antoine Pitrou


Antoine Pitrou pit...@free.fr added the comment:

> I'd have to take another look with the profiler to figure out how
> bytes slicing in 3.0 works, but I suspect that it is somehow
> fasttracked passed the creation of slice objects, etc.

I don't think it is fasttracked at all. 
Even plain indexing is not fasttracked either.

--


[issue10227] Improve performance of MemoryView slicing

2010-10-29 Thread Kristján Valur Jónsson


Kristján Valur Jónsson krist...@ccpgames.com added the comment:

Well then, it's back to the profiler for 3.2.  I did all of the profiling with 
2.7 for practical reasons (it was the only version I had available at the time) 
and then ported the change to 3.2 today.  But obviously there are different 
rules in 3.2 :)

--

