New submission from Serhiy Storchaka <storch...@gmail.com>:

As a companion to issue14625, here is a patch that speeds up UTF-32 encoding several times over. In addition, it fixes an unsafe integer overflow check.
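A rough check of the speedup claim is possible with the standard timeit module alone. The sketch below is not the benchmark tool from the linked repository: the helper name encode_rate, the repeat count, and the MB/s unit are assumptions made for illustration. The report does not state its units, so the numbers printed here are not directly comparable to the reported figures; only the ratio between two builds running the same script is meaningful.

```python
import timeit


def encode_rate(text, encoding, number=200):
    """Return a rough encoding rate in MB of output per second.

    Hypothetical helper for illustration; the unit is chosen for this
    sketch and is not taken from the original report.
    """
    data = text.encode(encoding)  # encode once to learn the output size
    seconds = timeit.timeit(lambda: text.encode(encoding), number=number)
    return len(data) * number / seconds / 1e6


# A few of the input shapes that appear in the report: ASCII, Latin-1,
# BMP, and non-BMP characters, 10000 code points each.
cases = ["A" * 10000, "\x80" * 10000, "\u0100" * 10000, "\U00010000" * 10000]

for encoding in ("utf-32le", "utf-32be"):
    for text in cases:
        rate = encode_rate(text, encoding)
        print(f"{rate:10.0f}  encode {encoding} {ascii(text[:1])}*{len(text)}")
```

Running the same script under an unpatched and a patched interpreter and dividing the two rates gives a speedup factor for each case.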
Here are the benchmark results. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository. Each percentage in parentheses is the gain of the patched build over that column's version.

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

Py2.7         Py3.2         Py3.3         patched

541 (+1032%)  541 (+1032%)  844 (+626%)   6125     encode utf-32le 'A'*10000
543 (+1056%)  541 (+1060%)  844 (+643%)   6275     encode utf-32le '\x80'*10000
544 (+1010%)  542 (+1014%)  843 (+616%)   6037     encode utf-32le '\x80'+'A'*9999
541 (+799%)   542 (+797%)   764 (+537%)   4864     encode utf-32le '\u0100'*10000
544 (+781%)   542 (+784%)   767 (+525%)   4793     encode utf-32le '\u0100'+'A'*9999
544 (+789%)   542 (+792%)   766 (+531%)   4834     encode utf-32le '\u0100'+'\x80'*9999
542 (+799%)   541 (+801%)   764 (+538%)   4874     encode utf-32le '\u8000'*10000
544 (+779%)   542 (+782%)   767 (+523%)   4780     encode utf-32le '\u8000'+'A'*9999
544 (+793%)   542 (+796%)   766 (+534%)   4859     encode utf-32le '\u8000'+'\x80'*9999
544 (+819%)   542 (+823%)   766 (+553%)   5001     encode utf-32le '\u8000'+'\u0100'*9999
430 (+867%)   427 (+874%)   860 (+383%)   4157     encode utf-32le '\U00010000'*10000
543 (+655%)   543 (+655%)   861 (+376%)   4101     encode utf-32le '\U00010000'+'A'*9999
543 (+658%)   543 (+658%)   861 (+378%)   4116     encode utf-32le '\U00010000'+'\x80'*9999
543 (+670%)   543 (+670%)   859 (+387%)   4180     encode utf-32le '\U00010000'+'\u0100'*9999
543 (+666%)   543 (+666%)   860 (+383%)   4158     encode utf-32le '\U00010000'+'\u8000'*9999

541 (+880%)   543 (+876%)   844 (+528%)   5300     encode utf-32be 'A'*10000
541 (+872%)   542 (+870%)   844 (+523%)   5256     encode utf-32be '\x80'*10000
544 (+843%)   542 (+846%)   843 (+509%)   5130     encode utf-32be '\x80'+'A'*9999
541 (+363%)   542 (+362%)   764 (+228%)   2505     encode utf-32be '\u0100'*10000
544 (+366%)   542 (+368%)   766 (+231%)   2534     encode utf-32be '\u0100'+'A'*9999
544 (+363%)   542 (+365%)   766 (+229%)   2519     encode utf-32be '\u0100'+'\x80'*9999
542 (+363%)   541 (+364%)   764 (+228%)   2509     encode utf-32be '\u8000'*10000
544 (+366%)   542 (+368%)   766 (+231%)   2534     encode utf-32be '\u8000'+'A'*9999
544 (+363%)   542 (+364%)   766 (+229%)   2517     encode utf-32be '\u8000'+'\x80'*9999
544 (+372%)   542 (+374%)   766 (+235%)   2568     encode utf-32be '\u8000'+'\u0100'*9999
430 (+428%)   427 (+432%)   860 (+164%)   2270     encode utf-32be '\U00010000'*10000
543 (+317%)   541 (+318%)   861 (+163%)   2262     encode utf-32be '\U00010000'+'A'*9999
543 (+320%)   541 (+321%)   861 (+165%)   2279     encode utf-32be '\U00010000'+'\x80'*9999
543 (+322%)   541 (+323%)   859 (+167%)   2290     encode utf-32be '\U00010000'+'\u0100'*9999
543 (+322%)   541 (+324%)   860 (+167%)   2292     encode utf-32be '\U00010000'+'\u8000'*9999

----------
components: Interpreter Core, Unicode
files: encode-utf32.patch
keywords: patch
messages: 162474
nosy: Arfrever, asvetlov, ezio.melotti, haypo, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster UTF-32 encoding
type: performance
versions: Python 3.3

Added file: http://bugs.python.org/file25857/encode-utf32.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15027>
_______________________________________