On Mon, 24 Aug 2015, Mikhail Titov wrote:
On Mon, Aug 24, 2015 at 6:14 PM, Mikhail Titov wrote:
While writing a reply, I came to the conclusion that in my particular case
I can move all the "detection" into parser code and wrap the standard parsers.

Is a parser decorator the way to go if I want to dig out a few more things
on top of the existing parser output?

A ContentHandler is another option.
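
As a rough illustration of the ContentHandler route, here is a minimal sketch that uses only the SAX classes bundled with the JDK. KeywordCountingHandler and its keywordCount() method are hypothetical names made up for this example, not part of Tika. The idea is that the handler forwards every SAX event to a downstream handler unchanged while keeping its own copy of the text for extra detection. With Tika, such a handler would presumably be passed as the ContentHandler argument of Parser.parse(...).

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical example class: decorates another ContentHandler.
// XMLFilterImpl forwards every SAX event to the handler set via
// setContentHandler(), so only the events of interest are overridden.
class KeywordCountingHandler extends XMLFilterImpl {
    private final String keyword;
    private final StringBuilder text = new StringBuilder();

    KeywordCountingHandler(ContentHandler downstream, String keyword) {
        if (keyword == null || keyword.isEmpty()) {
            throw new IllegalArgumentException("keyword must be non-empty");
        }
        this.keyword = keyword;
        setContentHandler(downstream);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        // Keep a copy for detection, then pass the event on untouched.
        // Accumulating the text also covers keywords split across chunks.
        text.append(ch, start, length);
        super.characters(ch, start, length);
    }

    // Number of non-overlapping occurrences of the keyword seen so far.
    int keywordCount() {
        int count = 0;
        int from = text.indexOf(keyword);
        while (from >= 0) {
            count++;
            from = text.indexOf(keyword, from + keyword.length());
        }
        return count;
    }
}

class ContentHandlerDemo {
    public static void main(String[] args) throws Exception {
        // The JDK's own SAX parser stands in for a Tika parser here.
        String xml = "<doc><p>Tika wraps Tika parsers</p></doc>";
        KeywordCountingHandler handler =
                new KeywordCountingHandler(new DefaultHandler(), "Tika");
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        System.out.println("keyword occurrences: " + handler.keywordCount());
    }
}
```

The downstream handler still sees the full event stream, so the extra detection does not change what the wrapped parser outputs.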

If so, is there an example anywhere
of how to write one? I tried following the CTAKESParser included
with Tika, but I'm getting a java.lang.StackOverflowError due to endless
recursion.
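
For reference, the general shape of a parser decorator can be sketched as below. This is not Tika code: SimpleParser, WordCountParser and DetectingParser are hypothetical stand-ins so the example compiles without Tika on the classpath (Tika's own base class for this is org.apache.tika.parser.ParserDecorator). The point it illustrates is that the wrapper holds a fixed reference to the parser it wraps. If the wrapper instead looks up its delegate in a way that can resolve back to the wrapper itself, parse() calls itself and overflows the stack. Whether that is what happened in the case above cannot be told from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a parser interface; NOT Tika's Parser API.
interface SimpleParser {
    void parse(String input, Map<String, String> metadata);
}

// Stand-in for an existing "standard" parser.
class WordCountParser implements SimpleParser {
    @Override
    public void parse(String input, Map<String, String> metadata) {
        String trimmed = input.trim();
        int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        metadata.put("wordCount", Integer.toString(words));
    }
}

// Decorator: runs the wrapped parser, then adds its own detection on top.
class DetectingParser implements SimpleParser {
    private final SimpleParser wrapped;

    DetectingParser(SimpleParser wrapped) {
        if (wrapped == null) {
            throw new IllegalArgumentException("wrapped parser must not be null");
        }
        // The delegate is fixed at construction time. It is never looked up
        // again at parse time, so it cannot resolve back to this wrapper.
        this.wrapped = wrapped;
    }

    @Override
    public void parse(String input, Map<String, String> metadata) {
        // 1. Let the existing parser do its normal work.
        wrapped.parse(input, metadata);
        // 2. Add extra findings on top of its output.
        boolean hasUrl = input.contains("http://") || input.contains("https://");
        metadata.put("containsUrl", Boolean.toString(hasUrl));
    }
}

class DecoratorDemo {
    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        SimpleParser parser = new DetectingParser(new WordCountParser());
        parser.parse("see http://example.org now", metadata);
        System.out.println(metadata);
    }
}
```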

I commented out my parser in
META-INF/services/org.apache.tika.parser.Parser and wrapped the default
parser in my parser in the config XML, without any other declarations,
similar to [1].

Can you post the config you tried? I can't work out exactly what you did from what you've written.

P.S. I'm using Tika 1.9 at the moment.

That's probably part of your issue. Please retry with 1.10, as we did quite a bit of work on the Tika config for parsers in that release.

Nick
