Hi,

On 10/18/07, Keith R. Bennett <[EMAIL PROTECTED]> wrote:
> After removing those things, the ParserPostProcessor doesn't do anything.
> Do you want to remove it altogether? We could also just not instantiate it
> -- in TikaConfig, we would add the parser implementation without wrapping it
> in a ParserPostProcessor.

I'd be OK replacing it with SummaryContentHandler and OutLinksContentHandler, i.e. ContentHandler classes that would extract the summary text and any matched URIs from the text content. This way we'd still have all the functionality in Tika.

BR,

Jukka Zitting
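
To make the proposal a bit more concrete, here is a rough sketch of what the two handlers could look like. Nothing here is existing Tika code: the constructor argument, the getSummary() and getOutLinks() accessors, and the URI regular expression are all assumptions made for illustration. The only real API used is the standard SAX DefaultHandler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: keeps the first maxLength characters of text as a summary. */
class SummaryContentHandler extends DefaultHandler {
    private final int maxLength; // assumed tunable, not taken from the thread
    private final StringBuilder summary = new StringBuilder();

    SummaryContentHandler(int maxLength) {
        this.maxLength = maxLength;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Append only as much text as still fits into the summary.
        int remaining = maxLength - summary.length();
        if (remaining > 0) {
            summary.append(ch, start, Math.min(length, remaining));
        }
    }

    String getSummary() {
        return summary.toString();
    }
}

/** Hypothetical handler: collects http(s) URIs that appear in the text content. */
class OutLinksContentHandler extends DefaultHandler {
    // Deliberately simple pattern (an assumption); it can pick up trailing punctuation.
    private static final Pattern URI = Pattern.compile("https?://[^\\s<>\"]+");
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Buffer everything, since a URI may be split across characters() calls.
        text.append(ch, start, length);
    }

    List<String> getOutLinks() {
        List<String> links = new ArrayList<>();
        Matcher m = URI.matcher(text);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }
}
```

With something along these lines, a caller would hand one of these handlers to the parser instead of wrapping the parser in a ParserPostProcessor, and read the result from the handler once parsing has finished. Buffering the whole text in the out-links handler is the simplest option; a streaming matcher would be needed for very large documents.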
