Serhiy Storchaka added the comment:

> str_replace_1char.patch: why not implementing replace_1char_inplace() in
> stringlib, with one version per character type (UCS1, UCS2, UCS4)?

Because there is no benefit in doing so. The three versions (UCS1, UCS2, and UCS4) share no common code. Each kind of string gets the implementation that works best for it: UCS1 uses the fast memchr() (findchar() has some overhead here), UCS2 uses findchar(), and UCS4 uses a plain loop, because findchar() would be too inefficient here.

> I prefer unicode_2.patch algorithm because it's simpler: only one loop (vs
> two loops for str_replace_1char.patch, with a threshold of 10 different
> characters).

Yes, the UCS1 implementation in str_replace_1char.patch is more complicated, but it is faster for a wider range of input strings. memchr() is more efficient than a simple loop when the characters to replace are rare, but when they occur often, a simple loop is more efficient. The "attempts" counter determines how many characters are checked before switching to memchr(). This speeds up replacement in strings with frequent matches, but slightly slows down replacement in strings with rare matches. 10 is a compromise. str_replace_1char.patch speeds up not only the case where *each* character is replaced, but also the cases where 1/2, 1/3, 1/5, ... of the characters are replaced.

> Why do you changed your algorithm? Is str_replace_1char.patch algorithm
> more efficient than unicode_2.patch algorithm? Is the speedup really
> interesting?

You can run benchmarks and compare the results. str_replace_1char.patch does not give the best performance in every case, but it gives the most stable results across a wide variety of strings, and it has no regressions compared with 3.2.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue16061>
_______________________________________
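
For illustration, the UCS1 strategy described in the message above (check a few characters with a plain loop, then fall back to memchr() when matches look rare) could be sketched roughly as follows. This is not the code from str_replace_1char.patch; the function name replace_byte_inplace, its signature, and the ATTEMPTS macro are hypothetical, and only the threshold of 10 comes from the message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical threshold; the value 10 is the compromise named in the message. */
#define ATTEMPTS 10

/* Replace every occurrence of `from` with `to` in buf[0..len).
   Returns the number of replaced bytes.
   Illustrative sketch only; not the patch's actual implementation. */
static size_t
replace_byte_inplace(unsigned char *buf, size_t len,
                     unsigned char from, unsigned char to)
{
    unsigned char *s = buf;
    unsigned char *end = buf + len;
    size_t count = 0;

    while (s < end) {
        /* Phase 1: check up to ATTEMPTS bytes with a plain loop.
           This is cheap when matches are frequent. */
        int attempts = ATTEMPTS;
        while (s < end && *s != from && attempts > 0) {
            s++;
            attempts--;
        }
        if (s == end)
            break;
        if (*s != from) {
            /* Phase 2: no match within ATTEMPTS bytes, so matches look
               rare here; let memchr() skip ahead to the next one. */
            s = memchr(s, from, (size_t)(end - s));
            if (s == NULL)
                break;
        }
        *s++ = to;
        count++;
    }
    return count;
}
```

With dense matches the inner loop finds the next one within a few steps and memchr() is never called; with sparse matches at most ATTEMPTS bytes are checked by hand before memchr() takes over, which is the small extra cost for rare replacements that the message mentions.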