On Tue, Dec 29, 2009 at 5:49 PM, Shashwat Anand <[email protected]> wrote:
> How can we retrieve images from PDFs? I need both images and the text
> beneath the image to form a database. I was able to parse text via PDFMiner
> but was crippled when it came to images.

Searching my apt cache for "python pdf" shows a lot of libraries, some of
which claim to be able to manage the entire contents of the PDF file in
question. I have also come across a tool that breaks a PDF down into
HTML + image files (I don't remember its name anymore). It was free
software, so I'm sure it's doable.

--
~noufal
http://nibrahim.net.in
_______________________________________________
BangPypers mailing list
[email protected]
http://mail.python.org/mailman/listinfo/bangpypers
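
For the image half of the question, here is a rough, standard-library-only
sketch rather than a recommendation of any particular package. It relies on
one fact from the PDF format: an image stored with the DCTDecode filter keeps
its JPEG bytes unmodified between the "stream" and "endstream" keywords. The
function name extract_jpegs and the regular expressions are made up for this
illustration. The sketch assumes an unencrypted file whose image objects are
not packed inside object streams, and it ignores images stored with any other
filter.

```python
import re

# Hypothetical helper patterns for this sketch; not part of any PDF library.
# One indirect object: "<num> <gen> obj ... endobj".
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
# Split an object body into its dictionary text and its raw stream bytes.
STREAM_RE = re.compile(rb"(.*?)stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# Dictionary entries that mark a JPEG-encoded image XObject.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image")
JPEG_FILTER_RE = re.compile(rb"/Filter\s*(?:\[\s*)?/DCTDecode\s*\]?")


def extract_jpegs(pdf_bytes):
    """Return the raw JPEG payloads of DCTDecode image objects.

    Crude by design: it scans the file with regular expressions instead of
    parsing it, so encrypted files, object streams and non-JPEG image
    filters are not handled.
    """
    images = []
    for obj in OBJ_RE.finditer(pdf_bytes):
        match = STREAM_RE.match(obj.group(1))
        if match is None:
            continue  # object carries no stream
        header, data = match.group(1), match.group(2)
        if IMAGE_RE.search(header) and JPEG_FILTER_RE.search(header):
            images.append(data)
    return images


# Usage (the file name is a placeholder):
#
#     with open("input.pdf", "rb") as fh:
#         for i, jpeg in enumerate(extract_jpegs(fh.read())):
#             with open("image-%d.jpg" % i, "wb") as out:
#                 out.write(jpeg)
```

Associating each image with the text beneath it would still need layout
information from a real PDF parser, which this sketch does not attempt.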
