iText's text extraction can definitely get you the text on each page. See
/itextpdf/src/main/java/com/itextpdf/text/pdf/parser/PdfContentReaderTool.java

Adding that text to Lucene is straightforward; check the Lucene project page
for ideas.
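
To make the per-page idea concrete, here is a rough sketch using only the JDK. PageIndex is a hypothetical stand-in for a real Lucene index, not anything from iText or Lucene. It assumes you already have each page's text as a String, for example from a per-page call such as PdfTextExtractor.getTextFromPage(reader, pageNumber) in iText 5 (check the exact signature against your iText version). The point is only that indexing one entry per page lets a hit map back to a page number.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for a Lucene index: maps each lowercased term
// to the set of page numbers on which it occurs.
class PageIndex {
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // pageText is assumed to be the text iText extracted for this page.
    void addPage(int pageNumber, String pageText) {
        // Split on anything that is not a letter or digit; a real index
        // would use a Lucene Analyzer instead of this crude tokenizer.
        String[] terms = pageText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        for (String term : terms) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(pageNumber);
            }
        }
    }

    // Returns the page numbers containing the term, in ascending order.
    List<Integer> pagesContaining(String term) {
        TreeSet<Integer> pages = postings.get(term.toLowerCase(Locale.ROOT));
        return pages == null ? new ArrayList<>() : new ArrayList<>(pages);
    }
}
```

With Lucene itself you would do the equivalent by adding one Document per page, with the page number stored in one field and the page text indexed in another.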

--
View this message in context: 
http://itext-general.2136553.n4.nabble.com/about-pdf-search-tp3338152p3339816.html
Sent from the iText - General mailing list archive at Nabble.com.

