> Since the integration of PEP 393, str += str is no longer super-fast
> (but just fast).

Oh oh. str += str is now about *1450x* slower than the ''.join() pattern. Here
is a benchmark (see the attached script, bench_build_str.py):

Python 3.3

str += str    : 14548 ms
''.join()     : 10 ms
StringIO.write: 12 ms
StringBuilder : 30 ms
array('u')    : 67 ms

Python 3.2

str += str    : 9 ms
''.join()     : 9 ms
StringIO.write: 9 ms
StringBuilder : 30 ms
array('u')    : 77 ms

(FYI results are very different in Python 2)
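For anyone who wants to repeat the two most interesting rows on their own interpreter, here is a minimal sketch using timeit instead of time.time(). The function names and the smaller COUNT are my own choices for illustration and are not part of the attached script, so absolute numbers will not match the tables above.

```python
import timeit

PIECE = "more data"
COUNT = 10000  # assumption: smaller than the attached script's 100000 loops


def build_with_iadd(count):
    # Repeated in-place concatenation; its cost depends on the interpreter.
    result = ""
    for _ in range(count):
        result += PIECE
    return result


def build_with_join(count):
    # Collect the pieces in a list, then concatenate once at the end.
    parts = []
    for _ in range(count):
        parts.append(PIECE)
    return "".join(parts)


if __name__ == "__main__":
    for func in (build_with_iadd, build_with_join):
        # Taking the best of 5 runs reduces noise from other processes.
        best = min(timeit.repeat(lambda: func(COUNT), number=1, repeat=5))
        print("%-16s: %.2f ms" % (func.__name__, best * 1000))
```

Both functions return the same string, so only the elapsed time differs.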

I expect performance similar to StringIO.write if strarray is implemented 
using a Py_UCS4 buffer, as io.StringIO is.

PyPy has a UnicodeBuilder class (in __pypy__.builders): it has append(), 
append_slice() and build() methods. In PyPy, it is the fastest method to build 
a string:

PyPy 1.6

''.join()     : 16 ms
StringIO.write: 24 ms
StringBuilder : 9 ms
array('u')    : 66 ms

It is even faster if you specify the size to the constructor: 3 ms.
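To make the shape of that interface concrete without requiring PyPy, here is a pure-Python stand-in. It is only a sketch: the class name ListBackedBuilder, the size_hint parameter and the (text, start, end) argument order are assumptions of mine, and it says nothing about how PyPy implements its builder or how fast it is.

```python
class ListBackedBuilder(object):
    """Hypothetical stand-in mirroring append(), append_slice() and build()."""

    def __init__(self, size_hint=0):
        # size_hint only mirrors a constructor that takes a size; a Python
        # list cannot make use of it here, so the value is ignored.
        self._parts = []

    def append(self, text):
        # Record one piece; nothing is concatenated yet.
        self._parts.append(text)

    def append_slice(self, text, start, end):
        # Record text[start:end] (argument order is an assumption).
        self._parts.append(text[start:end])

    def build(self):
        # Concatenate all recorded pieces in a single pass.
        return "".join(self._parts)


builder = ListBackedBuilder()
builder.append("initial value")
builder.append_slice("more data", 0, 4)
result = builder.build()
```

A real builder would presumably use the size to preallocate its buffer, which a list of pieces cannot imitate.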

> I'm writing this email to ask you if this type solves a real issue, or if
> we can just prove the super-fast str.join(list of str).

Hum, it looks like "What is the most efficient string concatenation method in 
python?" is a frequently asked question. There is a recent thread on the 
python-ideas mailing list:

"Create a StringBuilder class and use it everywhere"
http://code.activestate.com/lists/python-ideas/11147/
(I just subscribed to this list.)

Another alternative is a "string-join" object. It is discussed (and 
implemented) in the following issue, and PyPy also has an optional 
implementation:

http://bugs.python.org/issue1569040
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-join-objects
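As a rough illustration of the general idea (concatenation that records its pieces and only joins them when the text is actually needed), here is one way it could be sketched in pure Python. This is not the code from the issue above nor PyPy's implementation; the class name StringJoin and its attributes are hypothetical.

```python
class StringJoin(object):
    """Hypothetical sketch of a lazily joined string."""

    def __init__(self, parts, length=None):
        # parts may be shared between several StringJoin objects;
        # length says how many leading items belong to this one.
        self._parts = parts
        self._length = len(parts) if length is None else length

    def __add__(self, other):
        if self._length == len(self._parts):
            # Nobody has extended the shared list yet: append in place.
            self._parts.append(other)
            return StringJoin(self._parts, self._length + 1)
        # The list was already extended by another '+': copy our prefix.
        return StringJoin(self._parts[:self._length] + [other])

    def __str__(self):
        # The real concatenation happens only here.
        return "".join(self._parts[:self._length])


s = StringJoin(["initial value"])
for _ in range(3):
    s = s + "more data"
text = str(s)
```

In the common loop pattern each + is a list append, and the single join happens when str() is called.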

Note: Python 2 has UserString.MutableString (and Python 3 has 
collections.UserString).

Victor

Attached script (bench_build_str.py):

import array
import io
import sys
import time

LOOPS = 100000
INITIAL = "initial value"
MORE = "more data"

class StringBuilder(object):
    """Use it instead of doing += for building unicode strings from pieces"""
    def __init__(self, val=""):
        self.val = val
        self.appended = []

    def __iadd__(self, other):
        self.appended.append(other)
        return self

    def __str__(self):
        self.val += "".join(self.appended)
        self.appended = []
        return self.val

def main_pure(loops):
    "str += str"
    b = INITIAL
    for i in range(loops):
        b += MORE
    return b

def main_list_append(loops):
    "''.join()"
    b = [INITIAL]
    for i in range(loops):
        b.append(MORE)
    return "".join(b)

def main_string_builder(loops):
    "StringBuilder"
    b = StringBuilder(INITIAL)
    for i in range(loops):
        b += MORE
    return str(b)

def main_stringio(loops):
    "StringIO.write"
    b = io.StringIO(INITIAL)
    for i in range(loops):
        b.write(MORE)
    return b.getvalue()

def main_array(loops):
    "array('u')"
    b = array.array('u', INITIAL)
    for i in range(loops):
        b.extend(MORE)
    return b.tounicode()

ver = sys.version_info
print("Python %s.%s" % (ver.major, ver.minor))
funcs = (main_pure, main_list_append, main_stringio, main_string_builder, main_array)
width = 1 + max(len(func.__doc__) for func in funcs)
for func in funcs:
    a = time.time()
    func(LOOPS)
    b = time.time()
    dt = b - a
    print("%s: %.0f ms" % (func.__doc__.ljust(width), dt * 1000))
