On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop <juraj.su...@gmail.com> wrote:

> What this all means is that the PDF objects are expressed in ASCII,
> "stream" objects like images and fonts may have a binary part and I never
> saw those UTF-16 strings.
>

hmm -- I wonder if they are out there in the wild, though....


>> u"stream\n%s\nendstream\nendobj" % binary_data.decode('latin-1')
>
> The argument for dropping "%f" et al. has been that if something is a
> text, then it should be Unicode. Conversely, if it is not text, then it
> should not be Unicode.
>
>

????

What I'm trying to demonstrate / test is that you can use Unicode objects
for mixed binary + ASCII data, as long as you encode/decode using latin-1
(any others?). The idea is that the ASCII parts can be seen and used as text,
while the other bytes are preserved, and you can ignore whatever meaning
latin-1 gives them.
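A minimal sketch of the round trip described above, written so it runs on
Python 3 (the variable names are just for illustration). It relies on latin-1
mapping each byte value 0..255 to the code point with the same number, and it
does not try to settle whether other codecs would also qualify:

```python
# Build a byte string containing every possible byte value once.
all_bytes = bytes(bytearray(range(256)))

# Decoding as latin-1 never fails: byte N becomes code point N.
as_text = all_bytes.decode('latin-1')

# Encoding back with the same codec restores the original bytes.
round_tripped = as_text.encode('latin-1')
print(round_tripped == all_bytes)

# For contrast, ASCII rejects any byte above 0x7F, so it cannot carry
# arbitrary binary data through a Unicode object.
try:
    all_bytes.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)
```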

Using Unicode objects means that you can use the existing string formatting
(%s). If you want to pass in binary blobs, you decode them as latin-1, which
creates a Unicode object that gets interpolated into your Unicode template.
When the result is encoded back to latin-1, the original bytes are preserved.
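Spelled out as code, the steps above might look like the following sketch.
The helper name build_stream_object and the sample payload are made up for
illustration; only the template string comes from the snippet quoted earlier:

```python
def build_stream_object(binary_data):
    """Hypothetical helper: wrap raw bytes in PDF-style stream keywords."""
    template = u"stream\n%s\nendstream\nendobj"
    # Decode the blob as latin-1 so %s can interpolate it as text.
    text = template % binary_data.decode('latin-1')
    # Encode with the same codec to get the original bytes back.
    return text.encode('latin-1')

# A payload with bytes outside the ASCII range.
payload = bytes(bytearray([0x00, 0x7F, 0x80, 0xFF]))
result = build_stream_object(payload)

expected = b"stream\n" + payload + b"\nendstream\nendobj"
print(result == expected)
```

One caveat with this approach: if genuine text containing a code point above
U+00FF is interpolated into the same object, the final encode('latin-1') will
raise UnicodeEncodeError.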

I think this is confusing, as we are calling it latin-1 but not really
using it that way, but it seems it should work.

-Chris





-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov