Hi,

> * Use a Py_UCS4 buffer and then convert to the canonical form (ASCII,
> UCS1 or UCS2). Approach taken by io.StringIO. io.StringIO is not only
> used to write, but also to read and so a Py_UCS4 buffer is a good
> compromise.
> * PyAccu API: optimized version of chunks=[]; for ...: ...
> chunks.append(text); return ''.join(chunks).
> * Two steps: compute the length and maximum character of the output
> string, allocate the output string and then write characters. str%args
> was using it.
> * Optimistic approach. Start with an ASCII buffer, enlarge and widen
> (to UCS2 and then UCS4) the buffer when new characters are written.
> Approach used by the UTF-8 decoder and by str%args since today.
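To make the "optimistic approach" quoted above concrete, here is a toy
model in pure Python. It is only a sketch of the idea, not the C code in
CPython: the class WriterSketch and all of its attributes are invented
for this illustration, and the 1-byte kind covers both ASCII and UCS1.

```python
class WriterSketch:
    """Toy model of an optimistic string writer (hypothetical, not CPython code)."""

    def __init__(self):
        self.kind = 1           # bytes per character: 1, 2 (UCS2) or 4 (UCS4)
        self.buf = bytearray()  # character storage, `kind` bytes per character
        self.widen_count = 0    # how many times the buffer had to be widened

    @staticmethod
    def _kind_for(text):
        # Smallest character width able to store every character of `text`.
        maxchar = max(map(ord, text), default=0)
        if maxchar < 0x100:
            return 1
        if maxchar < 0x10000:
            return 2
        return 4

    def _codes(self):
        # Decode the buffer back into a list of code points.
        k = self.kind
        return [int.from_bytes(self.buf[i:i + k], "little")
                for i in range(0, len(self.buf), k)]

    def _widen(self, new_kind):
        # Re-encode what was already written with a wider character size.
        codes = self._codes()
        self.buf = bytearray(
            b"".join(c.to_bytes(new_kind, "little") for c in codes))
        self.kind = new_kind
        self.widen_count += 1

    def write(self, text):
        needed = self._kind_for(text)
        if needed > self.kind:
            self._widen(needed)  # only pay for wide storage when required
        for ch in text:
            self.buf += ord(ch).to_bytes(self.kind, "little")

    def finish(self):
        return "".join(map(chr, self._codes()))


w = WriterSketch()
w.write("abc")         # stays in the 1-byte kind
w.write("\u20ac")      # forces a widen to 2 bytes per character
w.write("\U0001F600")  # forces a widen to 4 bytes per character
result = w.finish()
```

The point of the design is that pure ASCII output never pays for wide
storage; the re-encoding cost is only paid when a wider character
actually shows up.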
I ran extensive benchmarks on these 4 methods for str%args and
str.format(args).

The "two steps" method is not promising: parsing the format string
twice is slower than other methods.

The PyAccu API is faster than a Py_UCS4 buffer to concatenate a lot of
strings, but it is slower in many other cases.

I implemented the last method as the new internal "_PyUnicodeWriter"
API: resize / widen the string buffer when writing new characters. I
implemented more optimizations:

 * overallocate the buffer to limit the cost of realloc()
 * write characters directly in the buffer, avoid temporary buffers
   when possible (it is possible in most cases)
 * disable overallocation when formatting the last argument
 * don't copy by value but copy by reference if the result is just a
   string (optimization already implemented indirectly in the PyAccu
   API)

The _PyUnicodeWriter is the fastest method: it gives a speedup of 30%
over the Py_UCS4 / PyAccu in general, and from 60% to 100% in some
specific cases!

I also compared str%args and str.format() with Python 2.7 (byte
strings), 3.2 (UTF-16 or UCS-4) and 3.3 (PEP 393): Python 3.3 is as
fast as Python 2.7 and sometimes faster! (Whereas Python 3.2 is 10 to
30% slower than Python 2 in general.)

--

I wrote a tool to run benchmarks and to compare results:

https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
https://bitbucket.org/haypo/misc/src/tip/python/bench_str.py

Run the benchmark:
./python benchmark.py --file=FILE script bench_str.py

Compare results:
./python benchmark.py compare_to FILE1 FILE2 FILE3 ...

--

Python 2.7 vs 3.2 vs 3.3:

http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer
http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer
http://bugs.python.org/file25757/report_windows7

Warning: For the Windows benchmark, Python 3.3 is compiled in 32 bits,
whereas 2.7 and 3.2 are compiled in 64 bits (formatting integers is
slower in 32 bits).
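For a quick sanity check without the tool above, a micro-benchmark
along these lines can be run on each Python build being compared. This
is a minimal sketch using the standard timeit module; it is not
benchmark.py or bench_str.py, and the helper names (fmt_percent,
fmt_format, fmt_join, bench) are made up for the example.

```python
import timeit


def fmt_percent(a, b):
    # str % args
    return "%s-%s" % (a, b)


def fmt_format(a, b):
    # str.format(args)
    return "{}-{}".format(a, b)


def fmt_join(a, b):
    # chunks = [...]; ''.join(chunks), the pattern PyAccu optimizes
    return "".join([a, "-", b])


def bench(func, *args, repeat=3, number=10000):
    """Return the best time per call, in seconds, over `repeat` runs."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    # Mix ASCII with non-ASCII arguments so wider strings are exercised too.
    for a, b in [("abc", "def"), ("abc", "\u20ac"), ("abc", "\U0001F600")]:
        for func in (fmt_percent, fmt_format, fmt_join):
            print("%-12s %-12r %.1f ns"
                  % (func.__name__, b, bench(func, a, b) * 1e9))
```

Taking the minimum over several repeats reduces noise from other
processes, but absolute numbers still depend on the machine and on how
the interpreter was compiled, so only compare results from the same
machine.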
--

UCS4 vs PyAccu vs _PyUnicodeWriter:

http://bugs.python.org/file25686/REPORT_32BIT_3.3
http://bugs.python.org/file25688/REPORT_64BIT_3.3

Victor
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev