Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

Mariano Reingart Sat, 11 Jan 2014 14:15:45 -0800

On Fri, Jan 10, 2014 at 9:13 PM, Juraj Sukop <juraj.su...@gmail.com> wrote:


> On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou <solip...@pitrou.net> wrote:
>
>> Also, when you say you've never encountered UTF-16 text in PDFs, it
>> sounds like those people who've never encountered any non-ASCII data in
>> their programs.
>
>
> Let me clarify: one does not think in "writing text in Unicode"-terms in
> PDF. Instead, one records the sequence of "character codes" which
> correspond to "glyphs" or the glyph IDs directly. That's because one
> Unicode character may have more than one glyph and more characters can be
> shown as one glyph.
>

AFAIK (and just for the record), a PDF can contain both Latin-1 text and
UTF-16 text (and other encodings too), depending on the font used:

/Encoding /WinAnsiEncoding (mostly Latin-1 "standard" fonts)
/Encoding /Identity-H (generally for Unicode UTF-16 TrueType "embedded"
fonts)

For example, in PyFPDF (a PHP library ported to Python), the following code
writes out text that may be in either of two encodings:

s = sprintf("BT %.2f %.2f Td (%s) Tj ET", x*self.k, (self.h-y)*self.k, txt)

https://code.google.com/p/pyfpdf/source/browse/fpdf/fpdf.py#602

In Python 2, txt is just a str (a byte string), but in Python 3, handling
everything as a Latin-1 string obviously doesn't work for TTF in this case.
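
To make the point concrete, here is a rough sketch of how that line could
look with the proposed bytes % args formatting. It is not PyFPDF's actual
code: the names text_op, escape_pdf_string and unicode_font are made up for
illustration, k and h stand in for self.k and self.h, and it assumes an
interpreter where bytes supports % formatting.

```python
def escape_pdf_string(data):
    """Escape backslashes and parentheses inside a PDF literal string."""
    # Backslash must be escaped first so the later escapes are not doubled.
    return (data.replace(b"\\", b"\\\\")
                .replace(b"(", b"\\(")
                .replace(b")", b"\\)"))


def text_op(x, y, txt, k=1.0, h=297.0, unicode_font=False):
    """Build a text-showing operator as bytes (hypothetical helper).

    unicode_font is an assumed flag: True for an embedded TrueType font
    using /Identity-H, False for a standard font using /WinAnsiEncoding.
    """
    if unicode_font:
        payload = txt.encode("utf-16-be")
    else:
        payload = txt.encode("latin-1")
    # The numbers are formatted as ASCII; the payload is inserted as raw bytes.
    return b"BT %.2f %.2f Td (%s) Tj ET" % (
        x * k, (h - y) * k, escape_pdf_string(payload))
```

The same template then serves both fonts, and only the payload encoding
differs, which is hard to express if the whole line has to be a Latin-1 str.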

Best regards

Mariano Reingart
http://www.sistemasagiles.com.ar
http://reingart.blogspot.com

