Nope. PDF isn't a structured format, and extracting structured text from it is a very difficult problem. If the files all follow a similar layout, you may be able to use that knowledge to derive an algorithm that does it (see LocationTextExtractionStrategy). There have been other posts about this: search the listserv archives and you'll probably find other responses I've made to similar questions (I recall taking the time to outline the strategy such an algorithm might use).
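To give a rough idea of what such an algorithm involves, here is a minimal sketch in plain Java. It is not iText code and not the strategy class's actual implementation. The `Chunk` type, the `toText` method, and the two thresholds are hypothetical names chosen for illustration. It assumes you have already obtained each run of text together with its position on the page, and it only rebuilds reading order: group chunks into lines by baseline, sort each line left to right, and insert a space wherever the horizontal gap is large enough.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: rebuilds lines of text from positioned chunks. Not iText API. */
final class PositionedTextSketch {

    /** A run of text plus where it sits on the page (PDF y coordinates grow upward). */
    static final class Chunk {
        final String text;
        final float x;      // left edge of the run
        final float y;      // baseline of the run
        final float width;  // rendered width of the run

        Chunk(String text, float x, float y, float width) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    /**
     * Joins chunks into text in reading order.
     * lineTolerance: maximum baseline difference for two chunks to share a line (assumed value).
     * spaceGap: horizontal gap above which a space is inserted (assumed value).
     */
    static String toText(List<Chunk> chunks, float lineTolerance, float spaceGap) {
        List<Chunk> sorted = new ArrayList<>(chunks);
        // Top of the page first, i.e. largest y first.
        sorted.sort(Comparator.comparingDouble((Chunk c) -> c.y).reversed());

        // Group chunks whose baselines are close enough into the same line.
        List<List<Chunk>> lines = new ArrayList<>();
        for (Chunk c : sorted) {
            List<Chunk> last = lines.isEmpty() ? null : lines.get(lines.size() - 1);
            if (last != null && Math.abs(last.get(0).y - c.y) <= lineTolerance) {
                last.add(c);
            } else {
                List<Chunk> line = new ArrayList<>();
                line.add(c);
                lines.add(line);
            }
        }

        StringBuilder out = new StringBuilder();
        for (List<Chunk> line : lines) {
            // Within a line, read left to right.
            line.sort(Comparator.comparingDouble((Chunk c) -> c.x));
            if (out.length() > 0) {
                out.append('\n');
            }
            float prevEnd = Float.NaN;
            for (Chunk c : line) {
                // A visible gap between two runs is treated as a word break.
                if (!Float.isNaN(prevEnd) && c.x - prevEnd > spaceGap) {
                    out.append(' ');
                }
                out.append(c.text);
                prevEnd = c.x + c.width;
            }
        }
        return out.toString();
    }
}
```

Anything beyond this (columns, table cells, labelled fields) is where knowledge of your particular document layout has to come in, for example by testing each chunk's x position against known column boundaries before joining.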