On 17.07.12 06:34, Eli Bendersky wrote:
> The second approach is consistently 10-20% faster than the first one
> (depending on input) for trunk Python 3.3
>
> Is there any reason for this to be so? What does BytesIO give us that
> the second approach does not (I tried adding more methods to the patched
> RawIOBase to make it more functional, like seekable() and tell(), and it
> doesn't affect performance)?
BytesIO resizes its underlying buffer when it overflows (overallocating
by 1/8 of the current size and copying the old content into the new
buffer). In total it makes about log[9/8](N) allocations and copies
about 8*N bytes (for large N). A list uses the same growth strategy,
but the number of chunks is usually much smaller than the number of
bytes. At the end all the chunks are concatenated by join(), which
computes the sum of the chunk lengths and allocates the resulting array
exactly once, with the desired size. That is why append/join is faster
than BytesIO in this case.
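
For illustration, here is a minimal sketch of the two strategies being
compared (the chunk size and count are made up for the example; the
actual speedup will vary with input):

import io
from timeit import timeit

chunks = [b"x" * 100] * 10000  # hypothetical workload: many small writes

def with_bytesio():
    # First approach: BytesIO grows one contiguous buffer, copying the
    # old content on every resize.
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()

def with_append_join():
    # Second approach: the list only stores references; join() then
    # allocates the final buffer once, at exactly the right size.
    data = []
    for chunk in chunks:
        data.append(chunk)
    return b"".join(data)

assert with_bytesio() == with_append_join()
print(timeit(with_bytesio, number=100))
print(timeit(with_append_join, number=100))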
There is one more note, about ElementTree.tostringlist(). Creating the
DataStream class on every function call is too expensive, which is why
the "monkeypatched" version is several times faster than the DataStream
version for small data. But it is faster for large data too, because
each data.append() call inside DataStream.write() costs one more
attribute lookup than the "monkeypatched" write = data.append, which
binds the method once.
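
To make the difference concrete, here is a minimal sketch of both
patterns (the chunks and the write loop are mine, standing in for the
real serializer; only the DataStream name comes from the patch under
discussion):

import io

chunks = [b"<tag>", b"text", b"</tag>"]  # stand-in for serializer output

def tostring_with_class():
    # The class statement runs on every call, rebuilding the class
    # object each time -- a fixed cost that dominates for small inputs.
    data = []

    class DataStream(io.RawIOBase):
        def writable(self):
            return True

        def write(self, b):
            # one extra Python frame plus a .append lookup per write
            data.append(b)
            return len(b)

    stream = DataStream()
    for chunk in chunks:
        stream.write(chunk)
    return data

def tostring_monkeypatched():
    # "Monkeypatched" variant: bind write = data.append once on a bare
    # RawIOBase instance; every write then calls data.append directly,
    # with no intermediate frame and no repeated lookup.
    data = []
    stream = io.RawIOBase()
    stream.writable = lambda: True
    stream.write = data.append
    for chunk in chunks:
        stream.write(chunk)
    return data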
> This also raises a "moral" question - should I be using the second
> approach deep inside the stdlib (ET.tostring) just because it's faster?
Please note that the previous version of Python already used this
"monkeypatching" approach.