Re: How can I access to the TextExtractor result?

Jukka Zitting Tue, 24 Nov 2009 08:50:50 -0800

Hi,

On Tue, Nov 24, 2009 at 5:37 PM, Paco Avila <[email protected]> wrote:
> I wonder if I can access the text produced by the TextExtractor from a
> document file (like a PDF, for example)


Jackrabbit doesn't store the extracted text anywhere. It is only used
to add the document to the inverted Lucene index.

You can always use the text extractor directly to get the text
content. Check out http://lucene.apache.org/tika/ for more details
about the Tika toolkit that we nowadays use for text extraction.
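
As a rough sketch of what using the extractor directly could look like
with Tika, assuming the Tika core and parser jars are on the classpath
and that the org.apache.tika.Tika facade class is available in your
Tika version: the class name ExtractTextExample and the file path
argument are only placeholders for illustration, not anything
Jackrabbit provides.

```java
import java.io.File;
import java.io.IOException;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

class ExtractTextExample {

    // Returns the plain text that Tika extracts from the given file.
    // The document type (PDF, Word, ...) is auto-detected by Tika.
    static String extractText(File file) throws IOException, TikaException {
        Tika tika = new Tika();
        return tika.parseToString(file);
    }

    public static void main(String[] args) throws Exception {
        // args[0] is a hypothetical path to a local document, e.g. a PDF
        System.out.println(extractText(new File(args[0])));
    }
}
```

For a document that lives in the repository, the same facade also has
a parseToString(InputStream) variant, so the binary stream of the
file's jcr:data property could be passed in instead of a File. Note
that parseToString truncates very long documents by default; the
limit can be changed with setMaxStringLength.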

BR,

Jukka Zitting
