On Tue, Dec 29, 2009 at 3:21 PM, Shashwat Anand <[email protected]> wrote:
> I used PDFMiner and I was pretty satisfied with the text portions. I
> retrieved all the text and was able to manipulate it according to my wish.
> However I failed on Image part. So Technically my question reduces to 'If
> there is a PDF document and some verbose text below them and the pattern is
> followed i.e. per page of PDF there will be one image and some texts
> following it, how can I retrieve both the images and the text without loss'
> ?

You can use `pdftohtml' [http://pdftohtml.sf.net]. It is available on Ubuntu.

Regards,
Didar
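
In case it helps, here is a rough sketch of driving pdftohtml from Python. It assumes the basic invocation `pdftohtml <PDF-file> <output-prefix>` with no extra options, and it does not rely on any particular output file naming: it just sorts whatever lands in the output directory into HTML pages and images by extension. The function names, the "page" prefix and the extension lists are my own choices for illustration, so check `pdftohtml -h` on your system before relying on it.

```python
import shutil
import subprocess
from pathlib import Path

# Extensions treated as images / HTML pages (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}
HTML_SUFFIXES = {".html", ".htm"}


def build_command(pdf_path, out_prefix):
    """Return the argument list for a plain pdftohtml run (no options)."""
    return ["pdftohtml", str(pdf_path), str(out_prefix)]


def collect_outputs(out_dir):
    """Split the files in out_dir into (html_pages, images) by extension."""
    files = sorted(p for p in Path(out_dir).iterdir() if p.is_file())
    pages = [p for p in files if p.suffix.lower() in HTML_SUFFIXES]
    images = [p for p in files if p.suffix.lower() in IMAGE_SUFFIXES]
    return pages, images


def convert(pdf_path, out_dir):
    """Run pdftohtml on pdf_path, writing into out_dir, and list the results."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # "page" is a hypothetical output prefix, not something pdftohtml requires.
    subprocess.run(build_command(pdf_path, out_dir / "page"), check=True)
    return collect_outputs(out_dir)
```

Calling `convert("document.pdf", "out")` should then give you two lists of paths, one for the HTML files and one for the extracted images, which you can pair up page by page once you have looked at how your version names them.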
