[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: I worked on the UTF-16 and UTF-32 encoders, but I'm away from my development computer right now. I'll provide an updated patch soon. I think that only the "surrogateescape" and "surrogatepass" error handlers need optimization, because they are used to interoperate with other

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-24 Thread STINNER Victor
STINNER Victor added the comment: I created the issue #25227 to optimize the ASCII and Latin1 *encoders* for surrogateescape.

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-22 Thread STINNER Victor
STINNER Victor added the comment: Ok, here is a patch which optimizes surrogatepass too. Result of bench_utf8.py:
Common platform:
CFLAGS: -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Timer info: namespace(adjustable=False, implementat
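For readers unfamiliar with the handler being optimized here, a minimal sketch of what "surrogatepass" does at the Python level may help. It is not taken from the patch or from bench_utf8.py; the variable names are illustrative only. It shows that a lone surrogate is rejected by the strict UTF-8 codec but passes through unchanged in both directions with "surrogatepass".

```python
# Illustrative sketch only; not part of the patch discussed in this thread.
lone = "\ud800"  # a lone high surrogate, not valid in well-formed UTF-8

# The strict handler (the default) refuses to encode a lone surrogate.
try:
    lone.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# "surrogatepass" encodes the surrogate code point as if it were a normal one.
encoded = lone.encode("utf-8", "surrogatepass")

# Decoding the same bytes also needs "surrogatepass"; strict decoding fails.
try:
    encoded.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

decoded = encoded.decode("utf-8", "surrogatepass")
print(encoded, decoded == lone)
```

Both directions go through the codec's error-handling path, which is presumably why the decoder and the encoder are benchmarked separately in this issue.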

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-22 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file40546/bench_utf8.py

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-22 Thread STINNER Victor
STINNER Victor added the comment: I pushed utf8.patch by mistake :-/ The advantage is that the buildbots found bugs. The attached utf8-2.patch fixes them. The bug was in how the "s" variable was set in the error handler. It's now set with: s = starts + endinpos; Bugs found by the buildbots: ===

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-22 Thread Roundup Robot
Roundup Robot added the comment: New changeset 8317796ca004 by Victor Stinner in branch 'default': Issue #24870: revert unwanted change https://hg.python.org/cpython/rev/8317796ca004

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-21 Thread STINNER Victor
STINNER Victor added the comment: Ok, I prepared the code for the UTF-8 optimization. @Serhiy: would you like to rebase your patch faster_surrogates_hadling.patch? Attached utf8.patch is a less optimal implementation which only changes PyUnicode_DecodeUTF8Stateful(). Maybe it's enough? I wou

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-21 Thread Roundup Robot
Roundup Robot added the comment: New changeset 2cf85e2834c2 by Victor Stinner in branch 'default': Issue #24870: Reuse the new _Py_error_handler enum https://hg.python.org/cpython/rev/2cf85e2834c2 New changeset aa247150a8b1 by Victor Stinner in branch 'default': Issue #24870: Add _PyUnicodeWrite

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-21 Thread STINNER Victor
STINNER Victor added the comment: I pushed a change to optimize the ASCII decoder. Attached bench.py script: a microbenchmark on the ASCII decoder. My results follow. Common platform: Platform: Linux-4.1.5-200.fc22.x86_64-x86_64-with-fedora-22-Twenty_Two Bits: int=32, long=64, long long=64, siz
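The attached bench.py is not reproduced in this archive. As a rough stand-in, a microbenchmark of the ASCII decoder could be sketched with the standard timeit module as below. The helper name bench_decode, the input sizes, and the choice of handlers are assumptions made for illustration, not the contents of the real script.

```python
import timeit

def bench_decode(data, errors, number=200):
    """Return the best observed time per call, in seconds, of
    data.decode("ascii", errors). Hypothetical helper, not from bench.py."""
    timer = timeit.Timer(lambda: data.decode("ascii", errors))
    return min(timer.repeat(repeat=3, number=number)) / number

# Pure ASCII input: the error handler is never invoked.
data_clean = b"a" * 10_000
# 1% undecodable bytes: the error handler runs 100 times per decode call.
data_mixed = (b"a" * 99 + b"\xff") * 100

results = {
    ("clean", "surrogateescape"): bench_decode(data_clean, "surrogateescape"),
    ("mixed", "surrogateescape"): bench_decode(data_mixed, "surrogateescape"),
    ("mixed", "replace"): bench_decode(data_mixed, "replace"),
}

for (kind, handler), seconds in results.items():
    print(f"{kind:5} {handler:16} {seconds * 1e6:8.2f} us per decode")
```

Comparing the "clean" and "mixed" rows on builds before and after the change gives a rough idea of how much of the cost comes from the error-handler path. Absolute numbers depend on the machine and the compiler flags.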

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-21 Thread Roundup Robot
Roundup Robot added the comment: New changeset 3c430259873e by Victor Stinner in branch 'default': Issue #24870: Optimize the ASCII decoder for error handlers: surrogateescape, https://hg.python.org/cpython/rev/3c430259873e -- nosy: +python-dev

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-09-10 Thread STINNER Victor
STINNER Victor added the comment: I rebased the patch because faster-decode-ascii-surrogateescape.patch was generated in git format, and the git format is not accepted by Rietveld (the Review button). -- Added file: http://bugs.python.org/file40429/faster-decode-ascii-surrogateescape.patc

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-08-19 Thread R. David Murray
R. David Murray added the comment: Since you already have to rewrite the string to do the escaping, I would judge it worth the extra effort to piece the string together as binary, but I can understand wanting to use % notation. The performance issue seems to prevent that, though, and there's no g

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-08-19 Thread INADA Naoki
INADA Naoki added the comment:
> Why are bytes being escaped in a binary blob? The reason to use
> surrogateescape is when you have data that is mostly text, should be
> processed as text, but can have occasional binary data. That wouldn't seem
> to apply to a database binary blob.
Since SQL
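The use case described in the quoted text (data that is mostly text but may contain occasional binary bytes) can be illustrated with a small sketch. The sample bytes below are made up for illustration and do not come from the thread.

```python
# Illustrative sketch; the sample data is invented, not from this thread.
raw = b"name=caf\xff"  # mostly ASCII text with one stray byte, 0xff

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN,
# so decoding never fails and no information is lost.
text = raw.decode("ascii", errors="surrogateescape")
print(ascii(text))  # 'name=caf\udcff'

# The result is an ordinary str and can be processed as text.
key, _, value = text.partition("=")

# Encoding with the same handler restores the original bytes exactly.
roundtrip = text.encode("ascii", errors="surrogateescape")
assert roundtrip == raw
```

The round trip is lossless, which is what makes the handler attractive for passing foreign byte data through text-oriented code.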

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-08-16 Thread INADA Naoki
INADA Naoki added the comment: I've stripped Serhiy's patch down to the ASCII part. Here is the benchmark result: https://gist.github.com/methane/2376ac5d20642c05a8b6#file-result-md Is there a chance of applying this patch to 3.5.1? -- Added file: http://bugs.python.org/file40195/faster-decode-ascii-su

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-08-15 Thread STINNER Victor
STINNER Victor added the comment: Oh. I restored the old title because I replied by email to an old message. -- title: surrogateescape is too slow -> Optimize coding with surrogateescape and surrogatepass error handlers

[issue24870] Optimize coding with surrogateescape and surrogatepass error handlers

2015-08-15 Thread Serhiy Storchaka
Changes by Serhiy Storchaka: -- keywords: +patch title: surrogateescape is too slow -> Optimize coding with surrogateescape and surrogatepass error handlers Added file: http://bugs.python.org/file40183/faster_surrogates_hadling.patch