Hi,

On Tue, Nov 24, 2009 at 5:37 PM, Paco Avila <[email protected]> wrote:
> I wonder if I can access the text produced by the TextExtractor from a
> document file (like a PDF, for example)

Jackrabbit doesn't store the extracted text anywhere; it is only used to add
the document to the inverted Lucene index. You can always use the text
extractor directly to get the text content. Check out
http://lucene.apache.org/tika/ for more details about the Tika toolkit that
we nowadays use for text extraction.

BR,

Jukka Zitting
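
To illustrate why the extracted text cannot be read back from the index, here is a minimal toy sketch of an inverted index in plain Java. It is not Lucene's or Jackrabbit's actual code; the class name ToyInvertedIndex and its methods are hypothetical and exist only for this example. The point it tries to show is that indexing keeps a term-to-document mapping and discards the text itself.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the IDs of the documents containing it.
// Hypothetical illustration only; this is not how Lucene is implemented.
class ToyInvertedIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Tokenize the extracted text and record term -> document ID.
    // The text itself is not kept once this method returns.
    void add(String docId, String extractedText) {
        for (String term : extractedText.toLowerCase().split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the IDs of documents containing the term (empty set if none).
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add("report.pdf", "Quarterly report for the Jackrabbit project");
        index.add("notes.txt", "Notes on the Tika project");
        System.out.println(index.search("project"));
        System.out.println(index.search("tika"));
    }
}
```

The index can answer "which documents contain this term", but there is nothing in it from which the original sentence could be returned, which is why the text has to be extracted again if you need it.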
