[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-09 Thread STINNER Victor
STINNER Victor added the comment: Buildbots still like this new API :-) (no test failures recently). I reworked the API a little to make it simpler to use in Unicode encoders. I have started opening new issues to use this new API in more functions that produce byte strings. I consider that this i

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread Roundup Robot
Roundup Robot added the comment:
New changeset 9cf89366bbcb by Victor Stinner in branch 'default': Issue #25318: Avoid sprintf() in backslashreplace() https://hg.python.org/cpython/rev/9cf89366bbcb
New changeset 0a522f68d275 by Victor Stinner in branch 'default': Issue #25318: Fix backslashrepla

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread STINNER Victor
STINNER Victor added the comment: The FreeBSD 9.x buildbot is grumpy. http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.x/builds/3495/steps/test/logs/stdio Assertion failed: (start[writer->allocated] == 0), function _PyBytesWriter_CheckConsistency, file Objects/bytesobject.c, l

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset e9c1404d6bd9 by Victor Stinner in branch 'default': Issue #25318: Fix compilation error https://hg.python.org/cpython/rev/e9c1404d6bd9

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread STINNER Victor
STINNER Victor added the comment: I created the issue #25349 "Use _PyBytesWriter for bytes%args".

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset c134eddcb347 by Victor Stinner in branch 'default': Issue #25318: Move _PyBytesWriter to bytesobject.c https://hg.python.org/cpython/rev/c134eddcb347

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset 59f4806a5add by Victor Stinner in branch 'default': Optimize backslashreplace error handler https://hg.python.org/cpython/rev/59f4806a5add

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread STINNER Victor
STINNER Victor added the comment: Oh, I was surprised to see the same or worse performance for UTF-8/backslashreplace. In fact, I forgot to enable overallocation. With overallocation, it is now faster ;-) I modified the API to put the "stack buffer" inside the _PyBytesWriter API directly. I also rew
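
The effect of overallocation mentioned in this comment can be illustrated with a small standalone C sketch. It is not CPython code: `count_grows` is a hypothetical helper, and the 25% growth margin is an assumption chosen for illustration. It counts how often a buffer has to grow when bytes are appended one at a time, with and without a growth margin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of CPython: count how many times a
   buffer must grow when `total` bytes are appended one at a time. */
static size_t count_grows(size_t total, int overallocate)
{
    size_t allocated = 0;  /* current capacity in bytes */
    size_t grows = 0;      /* number of (re)allocations performed */

    for (size_t size = 1; size <= total; size++) {
        if (size > allocated) {
            allocated = size;
            if (overallocate) {
                /* Assumed margin: reserve 25% more than requested. */
                allocated += allocated / 4;
            }
            grows++;
        }
        /* The buffer always has room for the bytes written so far. */
        assert(allocated >= size);
    }
    return grows;
}
```

Without the margin, every appended byte triggers a new allocation; with it, the number of allocations grows only logarithmically with the output size. That is one plausible reason why forgetting to enable overallocation would hurt an encoder whose output length is not known in advance.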

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-08 Thread Roundup Robot
Roundup Robot added the comment: New changeset 1a2175149c5e by Victor Stinner in branch 'default': Issue #25318: Add _PyBytesWriter API https://hg.python.org/cpython/rev/1a2175149c5e -- nosy: +python-dev

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
STINNER Victor added the comment: My previous abandoned attempt was issue #17742. "Add _PyBytesWriter API to optimize Unicode encoders" Oh, I forgot to mention that it may also be used to optimize bytes % args. More generally, any code generating a bytes object with an unknown length is a

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
STINNER Victor added the comment: A few months ago, I wrote a previous implementation of the _PyBytesWriter API which embedded the "current pointer" inside the _PyBytesWriter API. The problem was that GCC produced less efficient code than expected for the hotspot of the encoder. In the new implemen

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
Changes by STINNER Victor: keywords: +patch. Added file: http://bugs.python.org/file40685/bytes_writer.patch

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file40684/bench_ucs1_result.txt

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
STINNER Victor added the comment: Results of bench.py attached to issue #25227 (ASCII and Latin1 encoders): attached bench_ucs1_result.txt file.
Summary | ucs1_before | ucs1_after
--------+-------------+-----------
ascii   | 1.69 ms (*) | 1.69 ms
latin1  |

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
STINNER Victor added the comment: Result of bench.py attached to issue #25267: attached bench_utf8_result.txt.
Summary | utf8_before | utf8_after

[issue25318] Add _PyBytesWriter API to optimize Unicode encoders

2015-10-05 Thread STINNER Victor
New submission from STINNER Victor: The attached patch is the first step to optimize Unicode encoders: it adds a _PyBytesWriter API. This API is responsible for using the most efficient buffer depending on the need:
* it's possible to use a small buffer directly allocated on the C stack
* otherwise a
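
The buffer strategy described in this submission can be sketched in standalone C. This is a toy illustration, not the _PyBytesWriter implementation: the `toy_writer` type, its functions, and the 16-byte size are all hypothetical, and the fallback to heap memory is an assumption, since the message is cut off after the first bullet. The writer starts in a small buffer embedded in the struct (which lives on the C stack when the struct is a local variable) and moves to the heap only when the data no longer fits.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SMALL_SIZE 16  /* hypothetical size of the embedded buffer */

/* Hypothetical toy writer; NOT CPython's _PyBytesWriter. */
typedef struct {
    char small[TOY_SMALL_SIZE]; /* on the C stack if the struct is a local */
    char *heap;                 /* NULL while the small buffer is in use */
    size_t allocated;           /* capacity of the active buffer */
    size_t size;                /* bytes written so far */
} toy_writer;

static void toy_init(toy_writer *w)
{
    w->heap = NULL;
    w->allocated = TOY_SMALL_SIZE;
    w->size = 0;
}

static int toy_uses_heap(const toy_writer *w)
{
    return w->heap != NULL;
}

static const char *toy_data(const toy_writer *w)
{
    return w->heap ? w->heap : w->small;
}

/* Append n bytes. Returns 0 on success, -1 on allocation failure. */
static int toy_write(toy_writer *w, const char *src, size_t n)
{
    if (w->size + n > w->allocated) {
        size_t new_alloc = w->size + n;
        char *p;
        if (w->heap == NULL) {
            /* First overflow: move the small buffer's content to the heap. */
            p = malloc(new_alloc);
            if (p == NULL)
                return -1;
            memcpy(p, w->small, w->size);
        }
        else {
            p = realloc(w->heap, new_alloc);
            if (p == NULL)
                return -1;
        }
        w->heap = p;
        w->allocated = new_alloc;
    }
    char *dst = w->heap ? w->heap : w->small;
    memcpy(dst + w->size, src, n);
    w->size += n;
    /* Consistency check: never more data than capacity. */
    assert(w->size <= w->allocated);
    return 0;
}

static void toy_dealloc(toy_writer *w)
{
    free(w->heap);  /* free(NULL) is a no-op */
    w->heap = NULL;
}
```

A caller would declare a `toy_writer` as a local variable, call `toy_init`, append with `toy_write`, read the result through `toy_data`, and finish with `toy_dealloc`. Short outputs never touch the allocator, which is the point of starting with a small buffer.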