Hi,

Since the integration of PEP 393, str += str is no longer super-fast (just fast). For example, appending a single character to a string has to copy all of its characters to a new string. I suppose that the performance of many applications that manipulate text may be affected by this issue, especially text templating libraries.
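To make the pattern concrete, here is a minimal sketch of three common ways to build a string piece by piece: repeated +=, collecting the pieces in a list and joining once, and writing to io.StringIO. The function names are only illustrative (they do not come from any library), and the sketch makes no claim about relative timings, which depend on the interpreter version.

```python
import io


def build_with_concat(parts):
    # Repeated str += str: each step may have to copy the whole
    # accumulated string into a new one.
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    # Collect the pieces in a list and concatenate them once at the end.
    chunks = []
    for part in parts:
        chunks.append(part)
    return "".join(chunks)


def build_with_stringio(parts):
    # Write the pieces into an in-memory text buffer.
    buffer = io.StringIO()
    for part in parts:
        buffer.write(part)
    return buffer.getvalue()


# Mix ASCII, BMP and non-BMP characters: all three must agree.
parts = ["a", "b", "\u20ac", "\U0001F600"]
assert build_with_concat(parts) == build_with_join(parts) == build_with_stringio(parts)
```

All three return the same str; they differ only in how much copying and how much temporary memory they need along the way.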
io.StringIO has also been changed to store characters as Py_UCS4 (4 bytes) instead of Py_UNICODE (2 or 4 bytes). This class doesn't benefit from the new PEP 393.

I propose to add a new builtin type to Python to address both issues (CPU and memory): *strarray*. This type would have the same API as str, except that:

 * it has append() and extend() methods
 * method results are strarray instead of str

I'm writing this email to ask you if this type solves a real issue, or if we can just rely on the super-fast str.join(list of str).

--

strarray is similar to bytearray, but different: strarray('abc')[0] is 'a', not 97, and strarray can store any Unicode character (not only integers in the range 0-255).

I wrote a quick and dirty implementation in Python just to be able to play with the API, and to get an idea of the amount of work required to implement it:

https://bitbucket.org/haypo/misc/src/tip/python/strarray.py

(Some methods are untested: see the included TODO list.)

--

Implementing strarray in C is not trivial, and it would be easier to do it in 3 steps:

 (a) Use a Py_UCS4 array
 (b) Make the array type depend on the content, for the best memory footprint, as in PEP 393
 (c) Use strarray to implement a new io.StringIO

Or we can just stop after step (a).

--

The strarray API has to be discussed. Most bytearray methods return a new object in most cases. I don't understand why; it's not efficient. I don't know if we can do in-place operations for strarray methods that have the same name as bytearray methods (which are not in-place methods).

str has some methods that bytes and bytearray don't have, like format. We may do in-place operations for these methods.

Victor