New submission from Serhiy Storchaka <storch...@gmail.com>: In pair to issue14624 here is a patch than speed up UTF-16 encoding in several times. In addition, it fixes an unsafe check of an integer overflow.
Here are the results of benchmarking. See benchmark tools in https://bitbucket.org/storchaka/cpython-stuff repository. On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz: Py2.7 Py3.2 Py3.3 patched 457 (+575%) 458 (+573%) 1077 (+186%) 3083 encode utf-16le 'A'*10000 457 (+579%) 493 (+529%) 1084 (+186%) 3102 encode utf-16le '\x80'*10000 489 (+534%) 458 (+577%) 1081 (+187%) 3102 encode utf-16le '\x80'+'A'*9999 457 (+1261%) 493 (+1161%) 1116 (+457%) 6219 encode utf-16le '\u0100'*10000 489 (+1266%) 458 (+1358%) 1126 (+493%) 6678 encode utf-16le '\u0100'+'A'*9999 489 (+1263%) 458 (+1355%) 1129 (+490%) 6666 encode utf-16le '\u0100'+'\x80'*9999 457 (+1240%) 493 (+1142%) 1118 (+448%) 6125 encode utf-16le '\u8000'*10000 489 (+1271%) 458 (+1363%) 1127 (+495%) 6702 encode utf-16le '\u8000'+'A'*9999 489 (+1271%) 458 (+1364%) 1129 (+494%) 6705 encode utf-16le '\u8000'+'\x80'*9999 489 (+1135%) 458 (+1218%) 1136 (+432%) 6038 encode utf-16le '\u8000'+'\u0100'*9999 498 (+128%) 505 (+125%) 630 (+80%) 1137 encode utf-16le '\U00010000'*10000 489 (+35%) 458 (+44%) 360 (+83%) 659 encode utf-16le '\U00010000'+'A'*9999 489 (+35%) 458 (+44%) 359 (+84%) 660 encode utf-16le '\U00010000'+'\x80'*9999 489 (+36%) 458 (+45%) 361 (+84%) 663 encode utf-16le '\U00010000'+'\u0100'*9999 489 (+36%) 458 (+45%) 361 (+84%) 663 encode utf-16le '\U00010000'+'\u8000'*9999 447 (+507%) 493 (+450%) 1086 (+150%) 2712 encode utf-16be 'A'*10000 447 (+513%) 493 (+456%) 1080 (+154%) 2739 encode utf-16be '\x80'*10000 489 (+458%) 458 (+496%) 1079 (+153%) 2729 encode utf-16be '\x80'+'A'*9999 447 (+498%) 494 (+441%) 1118 (+139%) 2672 encode utf-16be '\u0100'*10000 489 (+464%) 458 (+502%) 1128 (+144%) 2756 encode utf-16be '\u0100'+'A'*9999 489 (+463%) 458 (+502%) 1131 (+144%) 2755 encode utf-16be '\u0100'+'\x80'*9999 447 (+500%) 493 (+444%) 1119 (+139%) 2680 encode utf-16be '\u8000'*10000 489 (+463%) 458 (+502%) 1126 (+145%) 2755 encode utf-16be '\u8000'+'A'*9999 489 (+464%) 458 (+502%) 1129 (+144%) 2757 encode utf-16be '\u8000'+'\x80'*9999 489 (+479%) 458 (+518%) 1137 (+149%) 2829 encode utf-16be '\u8000'+'\u0100'*9999 499 (+102%) 506 (+99%) 630 (+60%) 1009 encode utf-16be '\U00010000'*10000 489 (+6%) 458 (+13%) 360 (+44%) 519 encode utf-16be '\U00010000'+'A'*9999 489 (+6%) 458 (+13%) 359 (+44%) 518 encode utf-16be '\U00010000'+'\x80'*9999 489 (+6%) 458 (+13%) 361 (+44%) 519 encode utf-16be '\U00010000'+'\u0100'*9999 489 (+6%) 458 (+13%) 361 (+44%) 519 encode utf-16be '\U00010000'+'\u8000'*9999 ---------- components: Interpreter Core, Unicode files: encode-utf16.patch keywords: patch messages: 162473 nosy: Arfrever, asvetlov, ezio.melotti, haypo, pitrou, storchaka priority: normal severity: normal status: open title: Faster UTF-16 encoding type: performance versions: Python 3.3 Added file: http://bugs.python.org/file25856/encode-utf16.patch _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue15026> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com