[issue16311] Use _PyUnicodeWriter API in text decoders

2012-11-07 Thread STINNER Victor
STINNER Victor added the comment: Oh, I forgot my benchmark results. decodebench.py results on Linux 32 bits: (Linux-3.2.0-32-generic-pae-i686-with-debian-wheezy-sid) $ ./python bench-diff.py original writer ascii 'A'*1 4109 (-3%) 3974 latin1

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-11-06 Thread Roundup Robot
Roundup Robot added the comment: New changeset 7ed9993d53b4 by Victor Stinner in branch 'default': Close #16311: Use the _PyUnicodeWriter API in text decoders http://hg.python.org/cpython/rev/7ed9993d53b4 -- nosy: +python-dev resolution: -> fixed stage: -> committed/rejected status:

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I updated the patch to resolve the conflict with issue14625. -- Added file: http://bugs.python.org/file27806/codecs_writer_2.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue16311

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file27807/codecs_writer_2.patch

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Removed file: http://bugs.python.org/file27806/codecs_writer_2.patch

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com: Added file: http://bugs.python.org/file27808/decodebench.res

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: With the patch, the UTF-8 decoder is 20% slower for some data. The UTF-16 decoder is 20% faster for some data and 20% slower for other data. The UTF-32 decoder is slower for many inputs (even after some optimization; the naive code was up to 50% slower). The standard charmap decoder is 10%

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-31 Thread STINNER Victor
STINNER Victor added the comment: I ran the decodebench.py and bench-diff.py scripts from #14624; I just replaced repeat=10 with repeat=100 to get more reliable numbers. I only see some performance regressions between -5% and -1%, but there are some speedups on UTF-8 and UTF-32 (between +11% and
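
For readers without the scripts at hand, here is a minimal sketch of the kind of measurement discussed above. It is not the decodebench.py or bench-diff.py from #14624; the function names best_decode_time and percent_change, the sample data, and the sign convention are all assumptions made for illustration. It only shows why a larger repeat count helps: the best of many runs is less affected by noise from the rest of the system.

```python
import timeit


def best_decode_time(data, encoding, repeat=100, number=1000):
    """Best total time in seconds for `number` decode calls, over `repeat` runs.

    Hypothetical helper, not taken from decodebench.py.
    """
    timer = timeit.Timer(lambda: data.decode(encoding))
    # Taking the minimum discards runs disturbed by other system activity.
    return min(timer.repeat(repeat=repeat, number=number))


def percent_change(original, patched):
    """Relative change of `patched` against `original`, in percent.

    The sign convention is an assumption; bench-diff.py may differ.
    """
    return (patched - original) / original * 100.0


if __name__ == "__main__":
    sample = b"A" * 1000  # arbitrary sample input
    for encoding in ("ascii", "latin1", "utf-8"):
        seconds = best_decode_time(sample, encoding, repeat=100, number=10000)
        print(f"{encoding:8} {seconds:.6f} s")
```

To compare two builds, one would run the same script under each interpreter and pass the two measurements to percent_change.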

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I will do some experiments and review tomorrow.

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-29 Thread STINNER Victor
STINNER Victor added the comment: "Soon I'll post a patch which speeds up the unicode-escape and raw-unicode-escape decoders by 1.5-3x. There are also not-yet-reviewed patches for the UTF-32 (issue14625) and charmap (issue14850) decoders. There will be merge conflicts." codecs_writer.patch doesn't change

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-24 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@gmail.com: -- nosy: +loewis, serhiy.storchaka

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-24 Thread STINNER Victor
New submission from STINNER Victor: The attached patch modifies the text decoders to use the _PyUnicodeWriter API to factor out shared code. It removes the unicode_widen() and unicode_putchar() functions. * Don't overallocate by default (except for the raw-unicode-escape codec), enable overallocation on the

[issue16311] Use _PyUnicodeWriter API in text decoders

2012-10-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Soon I'll post a patch which speeds up the unicode-escape and raw-unicode-escape decoders by 1.5-3x. There are also not-yet-reviewed patches for the UTF-32 (issue14625) and charmap (issue14850) decoders. There will be merge conflicts. But I will review the patch.