> Since the integration of PEP 393, str += str is no longer super-fast
> (but just fast).

Oh oh. str += str is now *1450x* slower than the ''.join() pattern. Here is a
benchmark (see attached script, bench_build_str.py):

Python 3.3
  str += str    : 14548 ms
  ''.join()     : 10 ms
  StringIO.write: 12 ms
  StringBuilder : 30 ms
  array('u')    : 67 ms

Python 3.2
  str += str    : 9 ms
  ''.join()     : 9 ms
  StringIO.write: 9 ms
  StringBuilder : 30 ms
  array('u')    : 77 ms

(FYI results are very different in Python 2)

I expect performance similar to StringIO.write if strarray is implemented
using a Py_UCS4 buffer, as io.StringIO is.

PyPy has a UnicodeBuilder class (in __pypy__.builders): it has append(),
append_slice() and build() methods. In PyPy, it is the fastest method to
build a string:

PyPy 1.6
  ''.join()     : 16 ms
  StringIO.join : 24 ms
  StringBuilder : 9 ms
  array('u')    : 66 ms

It is even faster if you specify the size to the constructor: 3 ms.

> I'm writing this email to ask you if this type solves a real issue, or if
> we can just prove the super-fast str.join(list of str).

Hum, it looks like "What is the most efficient string concatenation method
in python?" is a frequently asked question. There is a recent thread on the
python-ideas mailing list:

"Create a StringBuilder class and use it everywhere"
http://code.activestate.com/lists/python-ideas/11147/

(I just subscribed to this list.)

Another alternative is a "string-join" object. It is discussed (and
implemented) in the following issue, and PyPy also has an optional
implementation:

http://bugs.python.org/issue1569040
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects

Note: Python 2 has UserString.MutableString (and Python 3 has
collections.UserString).

Victor

import array
import io
import sys
import time

LOOPS = 100000
INITIAL = "initial value"
MORE = "more data"


class StringBuilder(object):
    """Use it instead of doing += for building unicode strings from pieces"""

    def __init__(self, val=""):
        self.val = val
        self.appended = []

    def __iadd__(self, other):
        # Defer the concatenation: only remember the piece.
        self.appended.append(other)
        return self

    def __str__(self):
        # Join all pending pieces at once, then reset the pending list.
        self.val += "".join(self.appended)
        self.appended = []
        return self.val


# Each function's docstring is used as its label in the report.

def main_pure(loops):
    "str += str"
    b = INITIAL
    for i in range(loops):
        b += MORE
    return b


def main_list_append(loops):
    "''.join()"
    b = [INITIAL]
    for i in range(loops):
        b.append(MORE)
    return "".join(b)


def main_string_builder(loops):
    "StringBuilder"
    b = StringBuilder(INITIAL)
    for i in range(loops):
        b += MORE
    return str(b)


def main_stringio(loops):
    "StringIO.write"
    b = io.StringIO(INITIAL)
    for i in range(loops):
        b.write(MORE)
    return b.getvalue()


def main_array(loops):
    "array('u')"
    b = array.array('u', INITIAL)
    for i in range(loops):
        b.extend(MORE)
    return b.tounicode()


ver = sys.version_info
print("Python %s.%s" % (ver.major, ver.minor))

funcs = (main_pure, main_list_append, main_stringio, main_string_builder,
         main_array)
width = 1 + max(len(func.__doc__) for func in funcs)

for func in funcs:
    # Time a single run of each strategy and report it in milliseconds.
    a = time.time()
    func(LOOPS)
    b = time.time()
    dt = b - a
    print("%s: %.0f ms" % (func.__doc__.ljust(width), dt * 1000))

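For comparison with the builder interface mentioned above (append(), append_slice() and build()), here is a minimal pure-Python sketch. The class name ListBuilder, the size_hint parameter and the list-plus-join storage are assumptions made for illustration; this is not PyPy's UnicodeBuilder, and it has not been benchmarked.

```python
class ListBuilder:
    """Hypothetical builder: collects pieces in a list and joins them once."""

    def __init__(self, size_hint=0):
        # size_hint only mirrors a constructor that preallocates a buffer;
        # a Python list cannot make use of it, so it is ignored here.
        self._pieces = []

    def append(self, s):
        # Remember the whole string; no copying happens yet.
        self._pieces.append(s)

    def append_slice(self, s, start, end):
        # Remember only s[start:end].
        self._pieces.append(s[start:end])

    def build(self):
        # A single join copies every piece exactly once.
        return "".join(self._pieces)


def build_with_builder(loops, initial="initial value", more="more data"):
    """Same workload as the benchmark functions, using ListBuilder."""
    b = ListBuilder()
    b.append(initial)
    for _ in range(loops):
        b.append(more)
    return b.build()
```

Because the pieces are only joined in build(), each character is copied once, which is the same reason the ''.join() pattern avoids the repeated copying of str += str.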
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com