On Thu, 11 Aug 2016, Bob Paulin wrote:
I know it's been a little while since we talked about 2.0. We had discussed holding off while some API changes were under consideration. Has any progress been made on this?

I think we're still trying to come up with a plan for how to allow multiple parsers to report text for one document (either for main parser + fallback parser after error, or for two different kinds of parsers). That's then blocking some of the changes around fallback parsers, multiple parsers, etc. There are probably a few other API breaks / changes that'll fall out of that too.

How long we want to wait for a solution for that is a different question... (I'm not that keen on saying "if the content handler is a TikaContentHandler with some extra methods, great, otherwise throw an exception for >1 parser", which is about the only approach we've come up with so far.)
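
For illustration only, the approach in the parenthetical might look roughly like the sketch below. Everything in it is an assumption: no TikaContentHandler interface has been agreed, and the startParser method, the RecordingHandler class, and the MultiParserCheck helper are hypothetical names invented for the example. Only the SAX types from the JDK are real.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical interface: the name comes from the email, the method is invented.
interface TikaContentHandler extends ContentHandler {
    // Hypothetical extra method: tells the handler which parser is about to report text.
    void startParser(String parserName);
}

// Hypothetical handler that simply records which parsers announced themselves.
class RecordingHandler extends DefaultHandler implements TikaContentHandler {
    final List<String> parsers = new ArrayList<>();

    @Override
    public void startParser(String parserName) {
        parsers.add(parserName);
    }
}

// Hypothetical helper implementing the "otherwise throw an exception" rule.
final class MultiParserCheck {
    private MultiParserCheck() {
    }

    // Accepts any handler for a single parser; for more than one parser,
    // insists on the extended handler type and fails otherwise.
    static void check(ContentHandler handler, int parserCount) {
        if (parserCount > 1 && !(handler instanceof TikaContentHandler)) {
            throw new IllegalArgumentException(
                    "A TikaContentHandler is required when " + parserCount
                            + " parsers report text for one document");
        }
    }
}
```

A plain SAX handler would keep working for the single-parser case, but any existing caller that combines a main and a fallback parser would start failing at runtime, which is part of why I'm not keen on it.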

Nick
