On Fri, Mar 22, 2019 at 08:31:33PM +1300, Greg Ewing wrote:

> A poster on comp.lang.python is asking about array.array('u').
> He wants an efficient mutable collection of unicode characters
> that can be initialised from a string.
>
> According to the docs, the 'u' code is deprecated and will be
> removed in 4.0, but no alternative is suggested.
>
> Why is this being deprecated, instead of keeping it and making
> it always 32 bits? It seems like useful functionality that can't
> be easily obtained another way.
I can't answer any of those questions, but perhaps the poster can do
this instead:

py> from array import array
py> a = array('L', 'ℍℰâѵÿ Ϻεταł'.encode('utf-32be'))
py> a
array('L', [220266496, 807469056, 3791650816, 1963196416, 4278190080,
536870912, 4194500608, 3036872704, 3288530944, 2969763840, 1107361792])

This relies on 'L' being 4 bytes wide, as it is in the session above.
Where array('L').itemsize is 8, the 44 bytes of encoded text are not a
multiple of the item size and the constructor raises ValueError; 'I' is
usually the 4-byte code on such builds.

Getting the string out again is no harder:

py> bytes(a).decode('utf-32be')
'ℍℰâѵÿ Ϻεταł'

But having said that, it would be nice to have an array code which
treated the values as single UTF-32 characters:

array('?', ['ℍ', 'ℰ', 'â', 'ѵ', 'ÿ', ' ', 'Ϻ', 'ε', 'τ', 'α', 'ł'])

if for no other reason than it looks nicer than a bunch of 32 bit ints.

-- 
Steven
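
The round trip through UTF-32 can also be written so that it does not depend on the width of 'L' and so that each array item is the actual code point rather than a byte-swapped value (on a little-endian machine, 'utf-32be' bytes read back as swapped integers, which is why the first item in the session is 220266496 and not 0x210D). What follows is only a sketch under stated assumptions: the helper names to_array and to_str are hypothetical, and it assumes that either 'I' or 'L' is exactly 4 bytes wide on the running build, which is common but not guaranteed.

```python
import sys
from array import array

# Assumption: one of 'I' or 'L' is exactly 4 bytes wide on this build.
# ('L' is 8 bytes on most 64-bit Unix builds, so it is tried second.)
# next() raises StopIteration if neither code qualifies.
TYPECODE = next(tc for tc in "IL" if array(tc).itemsize == 4)

# Encode in the machine's native byte order so that each array item
# is a real code point instead of a byte-swapped integer.
CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def to_array(text):
    """Return a mutable array of code points for text (hypothetical helper)."""
    return array(TYPECODE, text.encode(CODEC))


def to_str(arr):
    """Rebuild the string from an array made by to_array()."""
    return arr.tobytes().decode(CODEC)


a = to_array("ℍℰâѵÿ Ϻεταł")
a[0] = ord("H")                      # mutate one character in place
print([hex(cp) for cp in a[:3]])     # code points, not swapped ints
print(to_str(a))
```

Because the items are plain code points, a single character can be replaced with ord() and read back with chr(), which is close to what the 'u' code offered, at the cost of converting by hand at the boundaries.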