Thx for the information!

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On behalf of Bruno Lowagie
> Sent: Monday, 27 November 2006 16:30
> To: Post all your questions about iText here
> Subject: Re: [iText-questions] How to extract data
>
> Bernhard Wellhöfer wrote:
> > Hello,
> >
> > I have a PDF document with ~900 pages. The document lists in a long
> > table the personal data of people. My job is now to extract the
> > first and last name into a structured document (e.g. XML, Excel, csv ...).
>
> I hope you didn't accept that assignment yet.
> Is there still time to turn it down?
>
> > I already managed to open the document via iText. After studying the
> > documentation I do not understand how to search in the document for
> > the table and then process each table cell. Can somebody send me a
> > hint or a link to the documentation on how to find and process the table?
>
> Is the PDF a Tagged PDF file?
> If not: congratulations, you have accepted a mission impossible!
> PDF is a Page Description Language, not a Word Processing format.
> If you add a table to a PDF file, it is painted on a canvas
> and all structure is lost (unless the PDF is tagged).
> It's similar to creating an image (GIF, JPG) from a table,
> and then asking somebody to convert the image to XLS or Word.
>
> If you want to know what Tagged PDF is, please download the
> free chapters from my book:
> http://www.manning.com/affiliate/idevaffiliate.php?id=223_53
> and read them carefully.
>
> Your best chances lie with a product that does OCR (but I
> don't know any free ones).
> best regards,
> Bruno
>
> -------------------------------------------------------------------------
> Take Surveys. Earn Cash. Influence the Future of IT
> Join SourceForge.net's Techsay panel and you'll get the chance to
> share your opinions on IT & business topics through brief
> surveys - and earn cash
> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
> Buy the iText book: http://itext.ugent.be/itext-in-action/
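
Since the reply above hinges on whether the file is a Tagged PDF, here is a rough first check that could be run before investing any time, offered only as a sketch. It uses nothing but the Java standard library and simply scans the raw bytes for the `/StructTreeRoot` key that a tagged document's catalog carries. The class and method names (`TaggedPdfHeuristic`, `containsToken`, `looksTagged`) are made up for this illustration and are not part of iText. The check is a crude heuristic: in files that store the catalog inside a compressed object stream (possible from PDF 1.5 on) the key is not visible in the raw bytes, so a "no" is inconclusive, and a "yes" still says nothing about how well the table is tagged.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not an iText class: a byte-level hint only.
class TaggedPdfHeuristic {

    /** Returns true if the ASCII token occurs anywhere in the raw bytes. */
    static boolean containsToken(byte[] data, String token) {
        byte[] needle = token.getBytes(StandardCharsets.US_ASCII);
        outer:
        for (int i = 0; i + needle.length <= data.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (data[i + j] != needle[j]) {
                    continue outer; // mismatch, try the next start offset
                }
            }
            return true;
        }
        return false;
    }

    /**
     * Heuristic: a Tagged PDF has a /StructTreeRoot entry in its catalog.
     * Assumption: the catalog is stored uncompressed, so the key is
     * visible in the raw file bytes. A false result proves nothing.
     */
    static boolean looksTagged(byte[] pdfBytes) {
        return containsToken(pdfBytes, "/StructTreeRoot");
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java TaggedPdfHeuristic <file.pdf>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(looksTagged(bytes)
                ? "Found /StructTreeRoot: the file may be tagged."
                : "No /StructTreeRoot in raw bytes: probably untagged (or catalog is compressed).");
    }
}
```

For a reliable answer the catalog should be read with a real PDF library rather than by scanning bytes; this only tells you quickly whether it is worth looking further.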
