On Tue, 12 Nov 2019, Katsuya Tomioka wrote:
I'm having trouble accessing encoding detectors in OSGi with Tika 1.22. AutoDetectParser returns "Failed to detect the character encoding of a document" for non-Latin text. We are migrating from 1.10, so I'm sure many things are different. My problem seems to be that, while all the detectors are in tika-parser, the code is loading from tika-core's. I see that parsers and detectors are tracked as services. Do I need to do something similar to load encoding detectors as well?

The things which are currently loaded via services are:
 * Parsers
 * Detectors (file type)
 * Translators
 * Encoding Detection
 * Language Detection
 * Probability-based type detectors

I think there might be helpers to assist with those. Hopefully one of our OSGi experts will be along shortly to advise!
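
In case it helps to picture the failure mode, here is a minimal sketch using only the JDK's java.util.ServiceLoader, not Tika's own loader. It assumes, as I understand it, that Tika's lookup is similarly driven by META-INF/services entries visible to whichever class loader does the loading. The class ServiceLookupSketch and the interface HypotheticalDetector are made up for illustration and are not Tika types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLookupSketch {

    /** Hypothetical stand-in for a detector interface; NOT a Tika type. */
    public interface HypotheticalDetector {
        String name();
    }

    /**
     * Collects every implementation of the given service that the given
     * class loader can discover through META-INF/services entries.
     */
    public static <T> List<T> discover(Class<T> service, ClassLoader loader) {
        List<T> found = new ArrayList<>();
        for (T impl : ServiceLoader.load(service, loader)) {
            found.add(impl);
        }
        return found;
    }

    public static void main(String[] args) {
        // Only providers visible to this class loader are returned. If the
        // provider entries live in another bundle that this loader cannot
        // see, the list comes back empty.
        ClassLoader loader = ServiceLookupSketch.class.getClassLoader();
        List<HypotheticalDetector> detectors =
                discover(HypotheticalDetector.class, loader);
        System.out.println("providers visible: " + detectors.size());
    }
}
```

With no provider entry for HypotheticalDetector on the class path, the lookup finds nothing. That would be consistent with the symptom described: if the loading happens with a class loader that cannot see the bundle holding the implementations, no encoding detectors are found. Whether that is what is actually happening in the 1.22 OSGi bundles is an assumption on my part.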

Nick
