On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop <juraj.su...@gmail.com> wrote:
> What this all means is that the PDF objects are expressed in ASCII,
> "stream" objects like images and fonts may have a binary part and I never
> saw those UTF-16 strings.
>

hmm -- I wonder if they are out there in the wild, though....

> u"stream\n%s\nendstream\nendobj"%binary_data.decode('latin-1')
>
> The argument for dropping "%f" et al. has been that if something is a
> text, then it should be Unicode. Conversely, if it is not text, then it
> should not be Unicode.
>

????

What I'm trying to demonstrate / test is that you can use unicode objects
for mixed binary + ascii, if you make sure to encode/decode using latin-1
(any others?). The idea is that the ascii part can be seen and used as
text, the other bytes are preserved, and you can ignore whatever meaning
latin-1 gives them.

Using unicode objects means that you can use the existing string
formatting (%s). If you want to pass in binary blobs, you need to decode
them as latin-1, creating a unicode object, which then gets interpolated
into your unicode template. When the result is encoded back to latin-1,
the original bytes are preserved.

I think this is confusing, as we are calling it latin-1 but not really
using it that way, but it seems it should work.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
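
P.S. Here is a quick sketch of the latin-1 round trip described above. It is only a toy case, not a real PDF writer: `binary_data` and the `/Length` header are made-up values for illustration, and it assumes Python 3.3+ (where the u"" prefix is accepted). It relies on the latin-1 codec mapping each byte 0-255 to the code point with the same number, so nothing is lost in either direction.

```python
# Toy illustration only: binary_data and the /Length header are made-up values.
binary_data = bytes(range(256))  # every possible byte value, not valid text

# Ordinary unicode template, so the existing %d / %s formatting just works.
template = u"<< /Length %d >>\nstream\n%s\nendstream\nendobj"

# Decode the blob as latin-1 so it can be interpolated as a unicode object.
# The "meaning" latin-1 assigns to the high bytes is simply ignored.
text = template % (len(binary_data), binary_data.decode('latin-1'))

# Encode back to latin-1 to get the bytes that would be written to the file.
output = text.encode('latin-1')

# The blob comes back byte for byte, surrounded by the ASCII framing.
expected = b"<< /Length 256 >>\nstream\n" + binary_data + b"\nendstream\nendobj"
assert output == expected
```

The catch is that everything interpolated into the template has to stay within code points 0-255; any real text outside that range would make the final encode() raise UnicodeEncodeError.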