On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano <st...@pearwood.info> wrote:
> On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
>
> > AFAIK (and just for the record), there could be both Latin1 text and UTF-16
> > in a PDF (and other encodings too), depending on the font used:
> [...]
> > In Python2, txt is just a str, but in Python3 handling everything as latin1
> > string obviously doesn't work for TTF in this case.
>
> Nobody is suggesting that you use Latin-1 for *everything*. We're
> suggesting that you use it for blobs of binary data that represent
> arbitrary bytes. First you have to get your binary data in the first
> place, using whatever technique is necessary.

Just to check I understood what you are saying. Instead of writing:

    content = b'\n'.join([
        b'header',
        b'part 2 %.3f' % number,
        binary_image_data,
        utf16_string.encode('utf-16be'),
        b'trailer'])

it should now look like:

    content = '\n'.join([
        'header',
        'part 2 %.3f' % number,
        binary_image_data.decode('latin-1'),
        utf16_string.encode('utf-16be').decode('latin-1'),
        'trailer']).encode('latin-1')

Correct?
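As a sanity check, here is a minimal sketch comparing the two forms. The values of `number`, `binary_image_data` and `utf16_string` are made-up placeholders (they are not defined anywhere in this thread), and the bytes-only form assumes an interpreter that supports %-formatting on bytes (Python 3.5 or later, per PEP 461). Under those assumptions both forms should produce identical bytes, because Latin-1 maps each byte 0-255 to the code point with the same number and back again.

```python
# Placeholder inputs, chosen only for illustration (assumptions, not thread data).
number = 3.14159
binary_image_data = bytes(range(256))   # every possible byte value
utf16_string = 'caf\u00e9 \u20ac'       # hypothetical text destined for a TTF font

# Bytes-only form: requires bytes %-formatting (Python 3.5+, PEP 461).
content_bytes = b'\n'.join([
    b'header',
    b'part 2 %.3f' % number,
    binary_image_data,
    utf16_string.encode('utf-16be'),
    b'trailer'])

# Latin-1 form: binary blobs are carried through str as code points
# U+0000..U+00FF, then turned back into the same bytes by the final encode.
content_latin1 = '\n'.join([
    'header',
    'part 2 %.3f' % number,
    binary_image_data.decode('latin-1'),
    utf16_string.encode('utf-16be').decode('latin-1'),
    'trailer']).encode('latin-1')

# The Latin-1 round trip is lossless for arbitrary bytes.
assert binary_image_data.decode('latin-1').encode('latin-1') == binary_image_data

# Both ways of building the content agree byte for byte.
assert content_bytes == content_latin1
```

The final `.encode('latin-1')` would raise UnicodeEncodeError if any piece joined in as text contained a character above U+00FF, which is why the UTF-16 string has to be encoded to bytes first and only then decoded as Latin-1.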