STINNER Victor added the comment:

bench_translate.py: benchmark ASCII 1:1 but also ASCII 1:1 with deletion.

Results of the benchmark comparing tip (47b0c076e17d, which includes my latest optimization on deletion) and 6a347c0ffbfc + translate_cached_2.patch.
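
For reference, the two cases named above ("ASCII 1:1" and "ASCII 1:1 with deletion") can be illustrated roughly as follows. This is only a sketch: bench_translate.py itself is not included in this message, so the input string, the mapped characters and the bench() helper are all hypothetical and chosen for illustration.

```python
import timeit

# Hypothetical input: a pure-ASCII string of length 10**3.
# The real inputs used by bench_translate.py are not shown in this message.
text = "abcdefghij" * 100

# ASCII 1:1: every mapped character is replaced by exactly one ASCII character.
replace_table = str.maketrans("abcde", "ABCDE")

# ASCII 1:1 with deletion: same mapping, plus characters mapped to None,
# which str.translate() drops from the result.
remove_table = str.maketrans("abcde", "ABCDE", "fgh")

replaced = text.translate(replace_table)
removed = text.translate(remove_table)

# bytes.translate() equivalent, the baseline this issue compares against.
data = text.encode("ascii")
bytes_table = bytes.maketrans(b"abcde", b"ABCDE")
removed_bytes = data.translate(bytes_table, b"fgh")


def bench(table, number=10000):
    """Return the best per-call time (in seconds) of text.translate(table)."""
    timings = timeit.repeat(lambda: text.translate(table),
                            number=number, repeat=3)
    return min(timings) / number


if __name__ == "__main__":
    print("replace: %.0f ns" % (bench(replace_table) * 1e9))
    print("remove:  %.0f ns" % (bench(remove_table) * 1e9))
```

The absolute numbers printed by such a script depend on the machine and build, so they are not comparable to the table in this message.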

Common platform:
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
Platform: Linux-3.12.8-300.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer: time.perf_counter
Timer precision: 45 ns
Timer info: namespace(adjustable=False, implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, resolution=1e-09)

Platform of campaign remove:
SCM: hg revision=47b0c076e17d tag=tip branch=default date="2014-04-05 14:27 +0200"
Python version: 3.5.0a0 (default:47b0c076e17d, Apr 5 2014, 14:50:53) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:51:55

Platform of campaign cache:
SCM: hg revision=6a347c0ffbfc+ branch=default date="2014-04-05 11:56 +0200"
Python version: 3.5.0a0 (default:6a347c0ffbfc+, Apr 5 2014, 14:53:02) [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:53:12

---------------------------+-------------+----------------
Tests                      | remove      | cache
---------------------------+-------------+----------------
replace none, length=10    | 184 ns (*)  | 275 ns (+50%)
replace none, length=10**3 | 1.06 us (*) | 1.1 us
replace none, length=10**6 | 827 us (*)  | 792 us
replace 10%, length=10     | 207 ns (*)  | 298 ns (+44%)
replace 10%, length=10**3  | 1.08 us (*) | 1.12 us
replace 10%, length=10**6  | 828 us (*)  | 793 us
replace 50%, length=10     | 205 ns (*)  | 298 ns (+46%)
replace 50%, length=10**3  | 1.08 us (*) | 1.17 us (+7%)
replace 50%, length=10**6  | 827 us (*)  | 793 us
replace 90%, length=10     | 208 ns (*)  | 298 ns (+44%)
replace 90%, length=10**3  | 1.09 us (*) | 1.13 us
replace 90%, length=10**6  | 850 us (*)  | 793 us (-7%)
replace all, length=10     | 145 ns (*)  | 226 ns (+56%)
replace all, length=10**3  | 1.03 us (*) | 1.04 us
replace all, length=10**6  | 827 us (*)  | 792 us
remove none, length=10     | 184 ns (*)  | 274 ns (+49%)
remove none, length=10**3  | 1.07 us (*) | 1.09 us
remove none, length=10**6  | 836 us (*)  | 793 us (-5%)
remove 10%, length=10      | 223 ns (*)  | 408 ns (+83%)
remove 10%, length=10**3   | 1.45 us (*) | 9.13 us (+531%)
remove 10%, length=10**6   | 1.08 ms (*) | 8.73 ms (+706%)
remove 50%, length=10      | 221 ns (*)  | 407 ns (+84%)
remove 50%, length=10**3   | 1.23 us (*) | 8.28 us (+575%)
remove 50%, length=10**6   | 948 us (*)  | 7.9 ms (+734%)
remove 90%, length=10      | 230 ns (*)  | 375 ns (+63%)
remove 90%, length=10**3   | 1.57 us (*) | 3.86 us (+145%)
remove 90%, length=10**6   | 1.28 ms (*) | 3.49 ms (+173%)
remove all, length=10      | 139 ns (*)  | 266 ns (+92%)
remove all, length=10**3   | 1.24 us (*) | 2.46 us (+99%)
remove all, length=10**6   | 1.07 ms (*) | 2.13 ms (+100%)
---------------------------+-------------+----------------
Total                      | 9.38 ms (*) | 27 ms (+188%)
---------------------------+-------------+----------------

Your patch is always slower for the common case (ASCII => ASCII translation).

I implemented the most obvious optimization for the most common case (ASCII 1:1 and ASCII 1:1 with deletion). I consider that the current code is enough to close this issue.

@Josh Rosenberg: Thanks for the report. The current implementation should be almost as fast as bytes.translate() (the "60x" factor you mentioned in the title) for ASCII 1:1 mapping.

--

Serhiy: If you are interested in optimizing str.translate() for the general case (larger charset), please open a new issue. It will probably require a more complex "cache". You may take a look at the charmap codec, which has such a more complex cache (a cache with 3 levels), see my message msg215301.

IMO it's not interesting to invest time on optimizing str.translate(), it's not a common function.
It took some years before a user ran a benchmark on it :-)

----------
resolution:  -> fixed
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21118>
_______________________________________