rbt <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> Not really a Python question... but here goes: Is there a way to read 
> the content of a PDF file and decode it with Python? I'd like to read 
> PDF's, decode them, and then search the data for certain strings.

I've had success with both:

  <http://www.boddie.org.uk/david/Projects/Python/pdftools/>

  <http://www.adaptive-enterprises.com.au/~d/software/pdffile/pdffile.py>

although I prefer the latter because it handles decryption
transparently. (I previously posted an enhancement that adds decryption
handling to the `pdftools` utility, but I now use the `pdffile` library
because it copes with encrypted files better.)

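How easy the text is to extract depends a lot on how the PDFs were
created: a scanned document may hold only page images and no text at
all, and some generators split or re-encode strings so that a plain
search no longer matches.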
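
For the simplest case (an unencrypted file whose page content is
Flate-compressed and whose text is stored as plain literal strings), a
rough sketch of the decode-and-search step using only the standard
library might look like the following. It does not use either library
above, the names `iter_decoded_streams` and `find_in_pdf` are made up
for illustration, and it assumes the search string appears unbroken
inside a single stream, which many real PDFs will not satisfy.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of each object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def iter_decoded_streams(pdf_bytes):
    """Yield every stream body, inflated when it is zlib (Flate) data."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the end-of-line bytes left after the data.
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data (another filter, or encrypted): keep the raw bytes.
            yield raw


def find_in_pdf(pdf_bytes, needle):
    """Return True if `needle` (bytes) occurs in any decoded stream."""
    return any(needle in data for data in iter_decoded_streams(pdf_bytes))
```

Typical use would be to read the file in binary mode and call
`find_in_pdf(data, b"some string")`. Encrypted files, hex-encoded
strings and text split across kerning arrays all need a real PDF parser.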

--Phil.