Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Ethan Furman Sat, 11 Jan 2014 11:31:39 -0800

On 01/11/2014 10:36 AM, Steven D'Aprano wrote:

> On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
>
>>    unicode to bytes
>>    bytes to unicode using latin1
>>    unicode to bytes
>
> Where do you get this from? I don't follow your logic. Start with a text
> template:
>
> template = """\xDE\xAD\xBE\xEF
> Name:\0\0\0%s
> Age:\0\0\0\0%d
> Data:\0\0\0%s
> blah blah blah
> """
>
> data = template % ("George", 42, blob.decode('latin-1'))
>
> Only the binary blobs need to be decoded. We don't need to encode the
> template to bytes, and the textual data doesn't get encoded until we're
> ready to send it across the wire or write it to disk.



And what if your name field has data not representable in latin-1?

--> '\xd1\x81\xd1\x80\xd0\x83'.decode('utf8')
u'\u0441\u0440\u0403'

--> '\xd1\x81\xd1\x80\xd0\x83'.decode('utf8').encode('latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-2: ordinal not in range(256)
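
For readers on Python 3, the same failure can be reproduced with a minimal sketch like the one below. The session above is Python 2, where a plain string literal is a byte string; here the input is written as an explicit bytes literal, and the variable names are only illustrative.

```python
# Python 3 sketch of the session above; variable names are illustrative.
raw = b'\xd1\x81\xd1\x80\xd0\x83'   # three Cyrillic characters, UTF-8 encoded
text = raw.decode('utf8')           # same code points as in the session above

try:
    text.encode('latin1')           # none of these code points is below 256
except UnicodeEncodeError as exc:
    error = exc                     # latin-1 cannot represent this text
else:
    error = None

print(repr(text), type(error).__name__)
```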

So really your example should be:

data = template % (
    "George".encode('some_non_ascii_encoding_such_as_cp1251').decode('latin-1'),
    42,
    blob.decode('latin-1'),
)

Which is a mess.
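
To make the extra step concrete, here is a rough Python 3 sketch of the whole round trip. It assumes cp1251 as the wire encoding for the name field, uses a made-up blob value, and drops the "blah blah blah" line from the template for brevity; none of those choices are prescribed anywhere in this thread.

```python
# Sketch only: cp1251 and the blob value are assumptions for illustration.
template = "\xDE\xAD\xBE\xEF\nName:\0\0\0%s\nAge:\0\0\0\0%d\nData:\0\0\0%s\n"

name = '\u0441\u0440\u0403'     # text that latin-1 cannot represent
blob = b'\x00\xff\x10'          # hypothetical binary payload

# Step 1: encode the name to its real wire encoding (unicode to bytes).
# Step 2: smuggle those bytes back into text via latin-1 (bytes to unicode).
name_as_text = name.encode('cp1251').decode('latin-1')

data = template % (name_as_text, 42, blob.decode('latin-1'))

# Step 3: encode the finished message for the wire (unicode to bytes).
wire = data.encode('latin-1')
```

The name field ends up on the wire as cp1251 bytes, but only because it was encoded once by hand and then passed through latin-1 twice, which is the three-step conversion listed at the top of this message.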

--
~Ethan~