Re: io module and pdf question

2013-06-26 Thread wxjmfauth
On Tuesday, 25 June 2013 06:18:44 UTC+2, jyou...@kc.rr.com wrote: Would like to get your opinion on this. Currently, to get the metadata out of a pdf file, I loop through the guts of the file. I know it's not the greatest idea to do this, but I'm trying to avoid extra modules, etc.

Re: io module and pdf question

2013-06-25 Thread rusi
On Tuesday, June 25, 2013 9:48:44 AM UTC+5:30, jyou...@kc.rr.com wrote:
1. Is there another way to get metadata out of a pdf without having to install another module?
2. Is it safe to assume pdf files should always be encoded as latin-1 (when trying to read it this way)? Is there a chance

Re: io module and pdf question

2013-06-25 Thread Christian Gollwitzer
On 25.06.13 08:33, rusi wrote:
On Tuesday, June 25, 2013 9:48:44 AM UTC+5:30, jyou...@kc.rr.com wrote:
1. Is there another way to get metadata out of a pdf without having to install another module?
2. Is it safe to assume pdf files should always be encoded as latin-1 (when trying to read it

RE: io module and pdf question

2013-06-25 Thread jyoung79
Thank you Rusi and Christian! So it sounds like I should read the pdf data in as binary:

    import os
    pdfPath = '~/Desktop/test.pdf'
    colorlistData = ''
    with open(os.path.expanduser(pdfPath), 'rb') as f:
        for i in f:
            if 'XYZ:colorList' in i:

Re: io module and pdf question

2013-06-25 Thread rusi
I guess the string constant 'XYZ:colorList' needs to be a byte string. Use the b prefix? I don't know for sure. Unicode is a black hole for me!
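
A minimal sketch of the b-prefix idea, assuming Python 3: a file opened with 'rb' yields bytes lines, so the marker has to be bytes as well. The in-memory stream and its contents are made up for illustration and stand in for a real pdf; find_marker_lines is a hypothetical helper, not something from the thread.

```python
import io

# Hypothetical stand-in for open(path, 'rb'); the contents are invented.
fake_pdf = io.BytesIO(
    b"%PDF-1.4\n"
    b"<XYZ:colorList>Red, Blue</XYZ:colorList>\n"
    b"%%EOF\n"
)


def find_marker_lines(stream, marker=b"XYZ:colorList"):
    """Return every line of a binary stream that contains the bytes marker."""
    return [line for line in stream if marker in line]


matches = find_marker_lines(fake_pdf)

# In Python 3, testing a str against a bytes line raises TypeError.
try:
    "XYZ:colorList" in b"<XYZ:colorList>"
    str_marker_works = True
except TypeError:
    str_marker_works = False
```

The same helper should work on a real file object opened in binary mode, since iterating it also yields bytes lines.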

Re: io module and pdf question

2013-06-25 Thread MRAB
On 25/06/2013 17:15, jyoun...@kc.rr.com wrote: Thank you Rusi and Christian! So it sounds like I should read the pdf data in as binary:

    import os
    pdfPath = '~/Desktop/test.pdf'
    colorlistData = ''
    with open(os.path.expanduser(pdfPath), 'rb') as f:
        for i in f:

Re: io module and pdf question

2013-06-25 Thread Dave Angel
On 06/25/2013 12:15 PM, jyoun...@kc.rr.com wrote: Thank you Rusi and Christian! Something I don't think was mentioned is that reading a text file in Python 3 and specifying latin-1 will work simply because every possible 8-bit byte is a character in Latin-1. That doesn't mean that those
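
The point about Latin-1 can be checked with a short sketch, assuming Python 3. It only shows that decoding never fails and that the bytes survive a round trip; it says nothing about whether the resulting characters are meaningful. The variable names are illustrative.

```python
# Every possible byte value, 0x00 through 0xFF.
all_bytes = bytes(range(256))

# Latin-1 maps each byte to the code point with the same number,
# so decoding cannot raise, whatever the bytes are.
text = all_bytes.decode("latin-1")
round_trip = text.encode("latin-1")

# UTF-8, by contrast, rejects many byte sequences (0x80 on its own, for one).
try:
    all_bytes.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```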