Hi,

Internally, CPython has a _PyUnicodeWriter, which is an efficient way to create a string by appending substrings or characters. _PyUnicodeWriter changes the internal storage format depending on the characters' code points (ASCII or Latin-1: 1 byte/character, BMP: 2 bytes/character, full UCS: 4 bytes/character). I once tried to expose it in Python, but I wasn't convinced by the performance. The overhead of method calls was quite significant, and I wasn't convinced by the performance of "writer += str" either. Maybe I should try again. PyPy also has such an object. It removes the need for the "str += str" hack in ceval.c, which exists to avoid very poor performance (_PyUnicodeWriter also uses overallocation, which can be controlled with multiple parameters, to reduce the number of reallocs).
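To make the storage widths above concrete, here is a rough pure-Python sketch. The names storage_width and ToyWriter are made up for this illustration and are not part of CPython; the real _PyUnicodeWriter is a private C API, and this sketch only tracks the width that would be needed rather than managing a real buffer.

```python
import io


def storage_width(text: str) -> int:
    """Bytes per character needed for the highest code point in text."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1  # ASCII or Latin-1
    if highest < 0x10000:
        return 2  # Basic Multilingual Plane
    return 4  # anything above the BMP


class ToyWriter:
    """Hypothetical stand-in for illustration, not the C _PyUnicodeWriter."""

    def __init__(self) -> None:
        self._buffer = io.StringIO()
        self.width = 1

    def write(self, text: str) -> None:
        # Record the widest storage needed so far; the width never shrinks.
        self.width = max(self.width, storage_width(text))
        self._buffer.write(text)

    def getvalue(self) -> str:
        return self._buffer.getvalue()


writer = ToyWriter()
widths = []
for piece in ("abc", "\xe9", "\u20ac", "\U0001f600"):
    writer.write(piece)
    widths.append(writer.width)
result = writer.getvalue()
```

In this sketch the recorded width only grows when a substring contains a code point that does not fit in the current width, which mirrors the idea of switching storage format on demand.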
Another alternative would be to add a "strarray" type, similar to the bytes/bytearray pair. Is this what you are looking for? Or do you really need the array.array API?

Victor

On Fri, 22 Mar 2019 at 08:38, Greg Ewing <greg.ew...@canterbury.ac.nz> wrote:
>
> A poster on comp.lang.python is asking about array.array('u').
> He wants an efficient mutable collection of unicode characters
> that can be initialised from a string.
>
> According to the docs, the 'u' code is deprecated and will be
> removed in 4.0, but no alternative is suggested.
>
> Why is this being deprecated, instead of keeping it and making
> it always 32 bits? It seems like useful functionality that can't
> be easily obtained another way.
>
> --
> Greg
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/vstinner%40redhat.com

--
Night gathers, and now my watch begins.
It shall not end until my death.