Re: Why is array.array('u') deprecated?
On 08/05/2015 15:40, jonathan.slend...@gmail.com wrote:
> On Friday, 8 May 2015 at 15:11:56 UTC+2, Peter Otten wrote:
> > > So, this works perfectly fine and fast. But it scares me that it's
> > > deprecated and Python 4 will not support it anymore.
> >
> > Hm, this doesn't even work with Python 3:
>
> My mistake. I should have tested better.
>
> > >>> data = array.array('u', u'x'*1000)
> > >>> data[100] = 'y'
> > >>> re.search('y', data)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "/usr/lib/python3.4/re.py", line 166, in search
> >     return _compile(pattern, flags).search(string)
> > TypeError: can't use a string pattern on a bytes-like object
> >
> > You can search for bytes
> >
> > >>> re.search(b'y', data)
> > <_sre.SRE_Match object; span=(400, 401), match=b'y'>
> > >>> data[101] = 'z'
> > >>> re.search(b'y', data)
> > <_sre.SRE_Match object; span=(400, 401), match=b'y'>
> > >>> re.search(b'yz', data)
> > >>> re.search(b'y\0\0\0z', data)
> > <_sre.SRE_Match object; span=(400, 405), match=b'y\x00\x00\x00z'>
> >
> > but if that is good enough you can use a bytearray in the first place.
>
> Maybe I'll try that. Thanks for the suggestions!
>
> Jonathan

Is http://sourceforge.net/projects/pyropes/ of any use to you?

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence

-- https://mail.python.org/mailman/listinfo/python-list
Why is array.array('u') deprecated?
Why is array.array('u') deprecated?

Will we get an alternative for a character array or mutable unicode string?

Thanks!
Jonathan
Re: Why is array.array('u') deprecated?
On Friday, 8 May 2015 at 12:29:15 UTC+2, Steven D'Aprano wrote:
> On Fri, 8 May 2015 07:14 pm, jonathan.slenders wrote:
> > Why is array.array('u') deprecated?
> >
> > Will we get an alternative for a character array or mutable unicode
> > string?
>
> Good question.
>
> Of the three main encodings for Unicode, two are variable-width:
>
> * UTF-8 uses 1-4 bytes per character
> * UTF-16 uses 2 or 4 bytes per character
>
> while UTF-32 is fixed-width (4 bytes per character). So you could try
> faking it with a 32-bit array and filling it with
> string.encode('utf-32').

I guess that doesn't work. I need to have something that I can pass to the
re module for searching through it. Creating new strings all the time is
not an option. (Think about gigabyte strings.)
Re: Why is array.array('u') deprecated?
On Fri, 8 May 2015 07:14 pm, jonathan.slend...@gmail.com wrote:
> Why is array.array('u') deprecated?
>
> Will we get an alternative for a character array or mutable unicode
> string?

Good question.

Of the three main encodings for Unicode, two are variable-width:

* UTF-8 uses 1-4 bytes per character
* UTF-16 uses 2 or 4 bytes per character

while UTF-32 is fixed-width (4 bytes per character). So you could try
faking it with a 32-bit array and filling it with string.encode('utf-32').

-- 
Steven
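A minimal sketch of the "32-bit array filled with UTF-32" idea from the message above. The helper names to_codepoints and to_text are hypothetical, not part of any library. The sketch assumes that the 'I' type code is 4 bytes wide on the platform (it checks this), and it uses a byte-order-specific codec rather than plain 'utf-32' so that no byte order mark ends up as the first array item.

```python
import array
import sys

# Byte-order-specific codec: encode() adds no BOM, and each array item
# holds exactly one code point in the machine's native byte order.
CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def to_codepoints(text):
    """Return a mutable array with one 32-bit item per code point (hypothetical helper)."""
    codes = array.array("I")
    if codes.itemsize != 4:
        raise RuntimeError("this sketch assumes a 4-byte 'I' type code")
    codes.frombytes(text.encode(CODEC))
    return codes


def to_text(codes):
    """Decode the array back into an immutable str (hypothetical helper)."""
    return codes.tobytes().decode(CODEC)


codes = to_codepoints("x" * 10)
codes[5] = ord("y")      # in-place change; no new string is built
result = to_text(codes)
```

Items are integers, so assignments go through ord(). Note that to_text does build a new string, so it only helps when the text is needed as a str at the very end.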
Re: Why is array.array('u') deprecated?
On Friday, 8 May 2015 at 15:11:56 UTC+2, Peter Otten wrote:
> > So, this works perfectly fine and fast. But it scares me that it's
> > deprecated and Python 4 will not support it anymore.
>
> Hm, this doesn't even work with Python 3:

My mistake. I should have tested better.

> >>> data = array.array('u', u'x'*1000)
> >>> data[100] = 'y'
> >>> re.search('y', data)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python3.4/re.py", line 166, in search
>     return _compile(pattern, flags).search(string)
> TypeError: can't use a string pattern on a bytes-like object
>
> You can search for bytes
>
> >>> re.search(b'y', data)
> <_sre.SRE_Match object; span=(400, 401), match=b'y'>
> >>> data[101] = 'z'
> >>> re.search(b'y', data)
> <_sre.SRE_Match object; span=(400, 401), match=b'y'>
> >>> re.search(b'yz', data)
> >>> re.search(b'y\0\0\0z', data)
> <_sre.SRE_Match object; span=(400, 405), match=b'y\x00\x00\x00z'>
>
> but if that is good enough you can use a bytearray in the first place.

Maybe I'll try that. Thanks for the suggestions!

Jonathan
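The b'y\0\0\0z' pattern quoted above works because each character occupies four bytes in the buffer. A rough sketch of how a literal search could be wrapped so the caller gets a character index back: find_literal is a hypothetical helper, it assumes an array of 4-byte items in native byte order, and it only handles literal text, not general regular expressions.

```python
import array
import re
import sys

CODEC = "utf-32-le" if sys.byteorder == "little" else "utf-32-be"


def find_literal(codes, needle):
    """Return the character index of the first aligned match, or -1 (hypothetical helper)."""
    # Encode the literal the same way the array stores it and escape it,
    # so that bytes such as b'.' or b'*' are not treated as regex syntax.
    pattern = re.compile(re.escape(needle.encode(CODEC)))
    raw = memoryview(codes).cast("B")   # byte view of the same memory, no copy
    pos = 0
    while True:
        match = pattern.search(raw, pos)
        if match is None:
            return -1
        if match.start() % 4 == 0:      # accept only whole-character offsets
            return match.start() // 4
        pos = match.start() + 1         # misaligned hit: keep looking


codes = array.array("I", [ord("x")] * 1000)
assert codes.itemsize == 4, "this sketch assumes a 4-byte 'I' type code"
codes[100] = ord("y")
codes[101] = ord("z")
index = find_literal(codes, "yz")
```

The alignment check matters because a byte sequence can match across the boundary of two neighbouring characters. Dividing the byte offset by four turns the span of 400 seen in the session above back into character index 100.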
Re: Why is array.array('u') deprecated?
jonathan.slend...@gmail.com wrote:
> > Can you expand a bit on how array("u") helps here? Are the matches in
> > the gigabyte range?
>
> I have a string of unicode characters, e.g.:
>
> data = array.array('u', u'x' * 100)
>
> Then I need to change some data in the middle of this string, for
> instance:
>
> data[50] = 'y'
>
> Then I want to use re to search in this text:
>
> re.search('y', data)
>
> This has to be fast. I really don't want to split and concatenate
> strings. Re should be able to process it and the expressions can be much
> more complex than this. (I think it should be anything that implements
> the buffer protocol).
>
> So, this works perfectly fine and fast. But it scares me that it's
> deprecated and Python 4 will not support it anymore.

Hm, this doesn't even work with Python 3:

>>> data = array.array('u', u'x'*1000)
>>> data[100] = 'y'
>>> re.search('y', data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/re.py", line 166, in search
    return _compile(pattern, flags).search(string)
TypeError: can't use a string pattern on a bytes-like object

You can search for bytes

>>> re.search(b'y', data)
<_sre.SRE_Match object; span=(400, 401), match=b'y'>
>>> data[101] = 'z'
>>> re.search(b'y', data)
<_sre.SRE_Match object; span=(400, 401), match=b'y'>
>>> re.search(b'yz', data)
>>> re.search(b'y\0\0\0z', data)
<_sre.SRE_Match object; span=(400, 405), match=b'y\x00\x00\x00z'>

but if that is good enough you can use a bytearray in the first place.
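For comparison, a short sketch of the bytearray route suggested at the end of the message above. It assumes the text fits a single-byte encoding such as ASCII or Latin-1, so that byte offsets and character offsets coincide; with UTF-8 data the reported offsets would be byte positions instead.

```python
import re

data = bytearray(b"x" * 1000)   # mutable, one byte per character
data[100] = ord("y")            # items are ints, so convert with ord()
data[101] = ord("z")

# A bytes pattern can search any bytes-like object, including a bytearray.
match = re.search(b"yz", data)
span = match.span()
```

Here the two-character pattern matches directly, without the padding zero bytes needed for the array('u') buffer.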
Re: Why is array.array('u') deprecated?
jonathan.slend...@gmail.com wrote:
> On Friday, 8 May 2015 at 12:29:15 UTC+2, Steven D'Aprano wrote:
> > On Fri, 8 May 2015 07:14 pm, jonathan.slenders wrote:
> > > Why is array.array('u') deprecated?
> > >
> > > Will we get an alternative for a character array or mutable unicode
> > > string?
> >
> > Good question.
> >
> > Of the three main encodings for Unicode, two are variable-width:
> >
> > * UTF-8 uses 1-4 bytes per character
> > * UTF-16 uses 2 or 4 bytes per character
> >
> > while UTF-32 is fixed-width (4 bytes per character). So you could try
> > faking it with a 32-bit array and filling it with
> > string.encode('utf-32').
>
> I guess that doesn't work. I need to have something that I can pass to
> the re module for searching through it. Creating new strings all the
> time is no option. (Think about gigabyte strings.)

Can you expand a bit on how array("u") helps here? Are the matches in the
gigabyte range?
Re: Why is array.array('u') deprecated?
> Can you expand a bit on how array("u") helps here? Are the matches in the
> gigabyte range?

I have a string of unicode characters, e.g.:

data = array.array('u', u'x' * 100)

Then I need to change some data in the middle of this string, for instance:

data[50] = 'y'

Then I want to use re to search in this text:

re.search('y', data)

This has to be fast. I really don't want to split and concatenate strings.
Re should be able to process it and the expressions can be much more
complex than this. (I think it should be anything that implements the
buffer protocol).

So, this works perfectly fine and fast. But it scares me that it's
deprecated and Python 4 will not support it anymore.