STINNER Victor added the comment:

bench_translate.py: benchmarks ASCII 1:1 mapping, and also ASCII 1:1 with deletion. 
Results of the benchmark comparing tip (47b0c076e17d which includes my latest 
optimization on deletion) and 6a347c0ffbfc + translate_cached_2.patch.
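
For readers without the attachment, here is a minimal sketch of how the two
cases can be timed. It is not the actual bench_translate.py: the helper names
make_cases() and best_time(), the input string and the tables are assumptions
chosen only for illustration.

```python
import timeit


def make_cases(length):
    """Build a hypothetical ASCII-only input and two ASCII-only tables."""
    text = ("abcdefghij" * (length // 10 + 1))[:length]
    # ASCII 1:1 mapping: each listed character maps to one ASCII character.
    replace_table = str.maketrans("abcde", "ABCDE")
    # ASCII 1:1 with deletion: the listed characters are removed.
    remove_table = str.maketrans("", "", "abcde")
    return text, replace_table, remove_table


def best_time(text, table, repeat=5, number=1000):
    """Return the best per-call time (in seconds) of text.translate(table)."""
    timer = timeit.Timer(lambda: text.translate(table))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for length in (10, 10**3):
        text, replace_table, remove_table = make_cases(length)
        print("length=%d replace=%.3g s remove=%.3g s" % (
            length,
            best_time(text, replace_table),
            best_time(text, remove_table),
        ))
```

Taking the minimum over several repeats reduces noise from other processes;
absolute numbers depend on the machine and the build.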

Common platform:
Python unicode implementation: PEP 393
CFLAGS: -Wno-unused-result -Werror=declaration-after-statement -DNDEBUG -g 
-fwrapv -O3 -Wall -Wstrict-prototypes
Platform: Linux-3.12.8-300.fc20.x86_64-x86_64-with-fedora-20-Heisenbug
Bits: int=32, long=64, long long=64, size_t=64, void*=64
CPU model: Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
Timer: time.perf_counter
Timer precision: 45 ns
Timer info: namespace(adjustable=False, 
implementation='clock_gettime(CLOCK_MONOTONIC)', monotonic=True, 
resolution=1e-09)

Platform of campaign remove:
SCM: hg revision=47b0c076e17d tag=tip branch=default date="2014-04-05 14:27 
+0200"
Python version: 3.5.0a0 (default:47b0c076e17d, Apr 5 2014, 14:50:53) [GCC 4.8.2 
20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:51:55

Platform of campaign cache:
SCM: hg revision=6a347c0ffbfc+ branch=default date="2014-04-05 11:56 +0200"
Python version: 3.5.0a0 (default:6a347c0ffbfc+, Apr 5 2014, 14:53:02) [GCC 
4.8.2 20131212 (Red Hat 4.8.2-7)]
Date: 2014-04-05 14:53:12

---------------------------+-------------+----------------
Tests                      |      remove |           cache
---------------------------+-------------+----------------
replace none, length=10    |  184 ns (*) |   275 ns (+50%)
replace none, length=10**3 | 1.06 us (*) |          1.1 us
replace none, length=10**6 |  827 us (*) |          792 us
replace 10%, length=10     |  207 ns (*) |   298 ns (+44%)
replace 10%, length=10**3  | 1.08 us (*) |         1.12 us
replace 10%, length=10**6  |  828 us (*) |          793 us
replace 50%, length=10     |  205 ns (*) |   298 ns (+46%)
replace 50%, length=10**3  | 1.08 us (*) |   1.17 us (+7%)
replace 50%, length=10**6  |  827 us (*) |          793 us
replace 90%, length=10     |  208 ns (*) |   298 ns (+44%)
replace 90%, length=10**3  | 1.09 us (*) |         1.13 us
replace 90%, length=10**6  |  850 us (*) |    793 us (-7%)
replace all, length=10     |  145 ns (*) |   226 ns (+56%)
replace all, length=10**3  | 1.03 us (*) |         1.04 us
replace all, length=10**6  |  827 us (*) |          792 us
remove none, length=10     |  184 ns (*) |   274 ns (+49%)
remove none, length=10**3  | 1.07 us (*) |         1.09 us
remove none, length=10**6  |  836 us (*) |    793 us (-5%)
remove 10%, length=10      |  223 ns (*) |   408 ns (+83%)
remove 10%, length=10**3   | 1.45 us (*) | 9.13 us (+531%)
remove 10%, length=10**6   | 1.08 ms (*) | 8.73 ms (+706%)
remove 50%, length=10      |  221 ns (*) |   407 ns (+84%)
remove 50%, length=10**3   | 1.23 us (*) | 8.28 us (+575%)
remove 50%, length=10**6   |  948 us (*) |  7.9 ms (+734%)
remove 90%, length=10      |  230 ns (*) |   375 ns (+63%)
remove 90%, length=10**3   | 1.57 us (*) | 3.86 us (+145%)
remove 90%, length=10**6   | 1.28 ms (*) | 3.49 ms (+173%)
remove all, length=10      |  139 ns (*) |   266 ns (+92%)
remove all, length=10**3   | 1.24 us (*) |  2.46 us (+99%)
remove all, length=10**6   | 1.07 ms (*) | 2.13 ms (+100%)
---------------------------+-------------+----------------
Total                      | 9.38 ms (*) |   27 ms (+188%)
---------------------------+-------------+----------------

Your patch is always slower for the common case (ASCII => ASCII translation).

I implemented the most obvious optimization for the most common case (ASCII 1:1 
and ASCII 1:1 with deletion). I consider that the current code is enough to 
close this issue.

@Josh Rosenberg: Thanks for the report. The current implementation should be 
almost as fast as bytes.translate() (the "60x" factor you mentioned in the 
title) for ASCII 1:1 mapping.
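
To make the comparison concrete, the sketch below performs the same ASCII 1:1
mapping with deletion through str.translate() and through bytes.translate().
The helper names and the sample tables are assumptions for illustration only.

```python
def translate_str(text):
    """Map 'l' -> '0' and 'o' -> '1', and delete spaces, on a str."""
    table = str.maketrans("lo", "01", " ")
    return text.translate(table)


def translate_bytes(data):
    """Same mapping on bytes: the deletion set is a separate argument."""
    table = bytes.maketrans(b"lo", b"01")
    return data.translate(table, b" ")


if __name__ == "__main__":
    print(translate_str("hello world"))
    print(translate_bytes(b"hello world"))
```

Both calls produce the same characters for ASCII input; only the types of the
input and of the result differ.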

--

Serhiy: If you are interested in optimizing str.translate() for the general 
case (larger charset), please open a new issue. It will probably require a more 
complex cache. You may take a look at the charmap codec, which has such a cache 
(with 3 levels); see my message msg215301.
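
As a rough pure-Python illustration of the idea of a lookup cache for the ASCII
range with a fallback for larger code points, consider the sketch below. It is
neither translate_cached_2.patch nor the C code in CPython nor the 3-level
charmap cache; build_ascii_table() and translate_with_cache() are hypothetical
names.

```python
def build_ascii_table(mapping):
    """Precompute the result for code points 0..127.

    `mapping` is a str.translate()-style dict: keys are code points,
    values are a code point, a string, or None (meaning deletion).
    """
    table = []
    for code in range(128):
        if code not in mapping:
            table.append(chr(code))      # unmapped: character is unchanged
            continue
        value = mapping[code]
        if isinstance(value, int):
            value = chr(value)
        table.append(value)              # str, or None for deletion
    return table


def translate_with_cache(text, mapping):
    """Translate `text`, using the 128-entry table for ASCII characters."""
    table = build_ascii_table(mapping)
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            result = table[code]         # fast path: plain list index
        else:
            result = mapping.get(code, ch)   # general case: dict lookup
            if isinstance(result, int):
                result = chr(result)
        if result is not None:           # None means "delete this character"
            out.append(result)
    return "".join(out)
```

A cache for a larger charset would have to cover many more code points than
this flat 128-entry list, which is why a multi-level structure becomes
attractive.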

IMO it's not worth investing time in optimizing str.translate(); it's not a 
commonly used function. It took some years before a user ran a benchmark on it :-)

----------
resolution:  -> fixed
status: open -> closed

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue21118>
_______________________________________