On Tue, Jul 17, 2012 at 2:59 PM, Serhiy Storchaka <storch...@gmail.com> wrote:

> On 17.07.12 06:34, Eli Bendersky wrote:
>
>> The second approach is consistently 10-20% faster than the first one
>> (depending on input) for trunk Python 3.3
>>
>> Is there any reason for this to be so? What does BytesIO give us that
>> the second approach does not (I tried adding more methods to the patched
>> RawIOBase to make it more functional, like seekable() and tell(), and it
>> doesn't affect performance)?
>>
>
> BytesIO resizes the underlying buffer when it overflows (overallocating
> 1/8 of the size and copying the old contents into the new buffer). In
> total it makes about log[9/8](N) allocations and copies about 8*N bytes
> (for large N). A list uses the same strategy, but the number of chunks is
> usually much smaller than the number of bytes. At the end all the chunks
> are concatenated by join, which sums the chunk lengths and allocates the
> resulting array at exactly the desired size. That is why append/join is
> faster than BytesIO in this case.
>
>
I've created http://bugs.python.org/issue15381 to track this (optimizing
BytesIO).
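For anyone following along, here is a minimal sketch of the two approaches being compared (the chunk sizes and counts are made up for illustration; the timings in the original post were measured against trunk Python 3.3):

```python
import io

# Hypothetical payload: many small chunks, as a serializer might produce.
chunks = [b"x" * 50 for _ in range(1000)]

# Approach 1: BytesIO. Each write may trigger a resize of the underlying
# buffer (overallocate ~1/8 and copy the old contents), so the raw bytes
# get copied repeatedly as the buffer grows.
buf = io.BytesIO()
for chunk in chunks:
    buf.write(chunk)
result_bytesio = buf.getvalue()

# Approach 2: list append + join. The list overallocates too, but it
# resizes an array of pointers (one per chunk, not one per byte); join
# then sums the chunk lengths and allocates the output exactly once.
parts = []
for chunk in chunks:
    parts.append(chunk)
result_join = b"".join(parts)

assert result_bytesio == result_join
```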


> There is another note, about ElementTree.tostringlist(). Creating the
> DataStream class on every function call is too expensive, and that is why
> the "monkeypatched" version is several times faster than the DataStream
> version for small data. But it is faster for large data too, because
> data.append() is one lookup slower than the "monkeypatched"
> write=data.append.
>

I updated tostringlist() to use a class defined outside the function. This
brings performance back to that of the old code.
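To illustrate the lookup cost Serhiy mentions (a toy sketch, not the actual tostringlist() code): binding the bound method once avoids repeating the attribute lookup on every call.

```python
data = []

# Bind the bound method once; each write(...) call below skips the
# per-call attribute lookup that data.append(...) would perform.
write = data.append

for i in range(5):
    write(b"chunk%d" % i)

assert data[0] == b"chunk0" and len(data) == 5
```

In a hot loop that appends many chunks, this one-line binding is the whole "monkeypatched" trick.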

Eli
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev