Serhiy Storchaka added the comment:
I don't know of any application using UTF-32-LE or UTF-32-BE, so I don't want to
waste Python memory/code size on a heavily optimized decoder. Patch A
looks to be enough.
Agreed. I had the same doubts. That's why I proposed two patches for your
Roundup Robot added the comment:
New changeset 9badfe3a31a7 by Victor Stinner in branch 'default':
Close #14625: Rewrite the UTF-32 decoder. It is now 3x to 4x faster
http://hg.python.org/cpython/rev/9badfe3a31a7
--
nosy: +python-dev
resolution: -> fixed
stage: patch review ->
STINNER Victor added the comment:
I applied patch A with minor changes: I replaced the multiple gotos with classic
break/continue and if/else.
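To illustrate the control flow described above, here is a minimal pure-Python sketch of a UTF-32-LE decoding loop that handles errors with break/continue and if/else rather than jumps. It is not the committed C code: the function name decode_utf32_le, the simplified errors handling, and the choice to reject surrogates are all assumptions made for the example.

```python
def decode_utf32_le(data, errors="strict"):
    """Decode UTF-32-LE bytes one code unit at a time (illustrative only)."""
    out = []
    pos = 0
    end = len(data)
    while pos < end:
        if end - pos < 4:
            # Truncated trailing code unit: fail, or emit U+FFFD and stop.
            if errors == "strict":
                raise ValueError("truncated data at position %d" % pos)
            out.append("\ufffd")
            break
        code = int.from_bytes(data[pos:pos + 4], "little")
        pos += 4
        if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            # Out of range or a surrogate: fail, or emit U+FFFD and go on.
            if errors == "strict":
                raise ValueError("illegal code point at position %d" % (pos - 4))
            out.append("\ufffd")
            continue
        out.append(chr(code))
    return "".join(out)
```

Each error case is resolved inside the loop body, so the loop has a single exit path and no shared error label.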
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14625
Serhiy Storchaka added the comment:
I applied patch A with minor changes: I replaced the multiple gotos with
classic break/continue and if/else.
Looks good. Thanks.
STINNER Victor added the comment:
I suggest applying patch A to 3.3, as it fixes a performance
regression (2x) and is very simple.
ASCII and UTF-8 are the two most common codecs in the world, so it's justified
to have heavily optimized encoders and decoders.
I don't know of any application using
Changes by Serhiy Storchaka storch...@gmail.com:
--
stage: -> patch review
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file25278/decode_utf32_a.patch
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file25279/decode_utf32_b.patch
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file25537/decode_utf32_a_2.patch
Changes by Serhiy Storchaka storch...@gmail.com:
Removed file: http://bugs.python.org/file25538/decode_utf32_b_2.patch
Serhiy Storchaka added the comment:
Patches updated to 3.4.
--
keywords: +needs review
versions: +Python 3.4 -Python 3.3
Added file: http://bugs.python.org/file27638/decode_utf32_a_3.patch
Changes by Serhiy Storchaka storch...@gmail.com:
--
keywords: +3.3regression
Added file: http://bugs.python.org/file27639/decode_utf32_b_3.patch
Serhiy Storchaka added the comment:
I suggest applying patch A to 3.3, as it fixes a performance regression (2x) and is
very simple.
--
versions: +Python 3.3
Changes by Serhiy Storchaka storch...@gmail.com:
--
nosy: +georg.brandl
Georg Brandl added the comment:
Very simple? You're changing most of the code there.
Serhiy Storchaka added the comment:
The old code was too complicated. The patched code is actually smaller:
1 file changed, 71 insertions(+), 80 deletions(-)
The UTF-16 codec was modified in some way.
Georg Brandl added the comment:
That the new code is smaller is no guarantee that it's as correct :)
That is exactly the reason we don't put optimizations in bugfix releases.
Serhiy Storchaka storch...@gmail.com added the comment:
The patches are updated to conform stylistically to the UTF-8 decoder. Patch B is
significantly faster for aligned input data (i.e. almost always),
especially for natural order. The UTF-32 decoder can now be faster than the ASCII
decoder!
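The idea of a fast path for aligned input in the machine's own byte order can be sketched in Python by reinterpreting the buffer as 32-bit integers instead of assembling each code point from four bytes. This only illustrates the principle and is not patch B itself: decode_utf32_native and NATIVE_CODEC are hypothetical names, and the sketch assumes a platform where the C unsigned int is 4 bytes wide.

```python
import sys

# Codec name matching this machine's byte order (assumption for the example).
NATIVE_CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def decode_utf32_native(data):
    """Decode native-order UTF-32 by reading whole 32-bit units at once."""
    if len(data) % 4:
        raise ValueError("length is not a multiple of 4")
    # Reinterpret the bytes as unsigned 32-bit integers without copying.
    units = memoryview(data).cast("I")
    if units.itemsize != 4:
        raise RuntimeError("unsigned int is not 4 bytes on this platform")
    # chr() raises ValueError for values above 0x10FFFF.
    return "".join(map(chr, units))
```

For example, decode_utf32_native("abc".encode(NATIVE_CODEC)) gives back "abc". Input in the opposite byte order would need a byte swap per unit, which is why the native case is the cheapest one.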
Changes by Serhiy Storchaka storch...@gmail.com:
Added file: http://bugs.python.org/file25538/decode_utf32_b_2.patch
Changes by Serhiy Storchaka storch...@gmail.com:
--
components: +Unicode
nosy: +ezio.melotti
Serhiy Storchaka storch...@gmail.com added the comment:
Here are the results of benchmarking (numbers in MB/s).
On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:
Py2.7 Py3.2 Py3.3
patchA patchB
utf-32le 'A'*1
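Throughput figures of this kind can be reproduced roughly with timeit. The following is a minimal sketch, not the script used for the numbers in this thread: the helper name decode_throughput, the test string length, the repeat counts, and the definition of 1 MB as 10**6 bytes are all assumptions.

```python
import timeit


def decode_throughput(codec, text, repeat=5, number=20):
    """Return the best observed decoding speed in MB/s (1 MB = 10**6 bytes)."""
    data = text.encode(codec)
    # Best of several runs reduces noise from other processes.
    best = min(timeit.repeat(lambda: data.decode(codec),
                             repeat=repeat, number=number))
    return len(data) * number / best / 1e6


if __name__ == "__main__":
    sample = "A" * 10000  # assumed size; the original table is truncated
    for codec in ("utf-32-le", "utf-32-be", "utf-8", "ascii"):
        print("%-10s %10.1f MB/s" % (codec, decode_throughput(codec, sample)))
```

Running the same script under different interpreter builds gives one column per build, as in the table above.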
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com:
--
nosy: +Arfrever
Changes by Andrew Svetlov andrew.svet...@gmail.com:
--
nosy: +asvetlov
New submission from Serhiy Storchaka storch...@gmail.com:
I suggest two variants of a patch that accelerate the UTF-32 decoder. With PEP 393
the UTF-32 decoder slowed down by up to 2x; these patches return performance to the
level of Python 3.2 and even much higher (2-3x over 3.2). The variant A is
Changes by Serhiy Storchaka storch...@gmail.com:
Added file: http://bugs.python.org/file25279/decode_utf32_b.patch
STINNER Victor victor.stin...@gmail.com added the comment:
See also #14624 for UTF-16 decoder.
--
nosy: +haypo, pitrou