Hello list.
I've got a problem using Tika from a multithreaded C++ application. I
wrote a simple test wrapper class in Java and call it from C++. It
looks like this:
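import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.log4j.SimpleLayout;
import org.apache.log4j.WriterAppender;
import org.apache.tika.Tika;
import org.apache.tika.metadata.Metadata;

public class TikaWrapper {
    static final int BUFFER_SIZE = 8192;

    public TikaWrapper() {
        // Send log4j output to stderr and report errors only
        BasicConfigurator.configure(
            new WriterAppender(new SimpleLayout(), System.err)
        );
        Logger.getRootLogger().setLevel(Level.ERROR);
    }

    public void Process(String inFileName, String outFileName)
            throws FileNotFoundException, IOException {
        // Open files
        InputStream in = new FileInputStream(inFileName);
        BufferedWriter out = new BufferedWriter(new FileWriter(outFileName));
        try {
            Tika tika = new Tika(); // <--- Problem is HERE
            Metadata metaData = new Metadata();
            Reader reader = tika.parse(in, metaData);
            // Read the text and write to the output file
            char[] buf = new char[BUFFER_SIZE];
            for (int size = reader.read(buf); size != -1; size = reader.read(buf)) {
                out.write(buf, 0, size);
            }
        }
        catch (Throwable e) {
            e.printStackTrace();
        }
        finally {
            // Close both files even if parsing failed
            in.close();
            out.close();
        }
    }
}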
I create an instance of TikaWrapper from C++ code through JNI and call
its Process() method.
TikaWrapper also has a main() method, and when it is run as a Java
application everything works fine. It also works when called from a
single-threaded C++ application. But in a multithreaded environment
I get an exception at the commented line:
sun.misc.ServiceConfigurationError: org.apache.tika.parser.Parser: Provider org.apache.tika.parser.asm.ClassParser not found
    at sun.misc.Service.fail(Service.java:129)
    at sun.misc.Service.access$000(Service.java:111)
    at sun.misc.Service$LazyIterator.next(Service.java:273)
    at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:147)
    at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:207)
    at org.apache.tika.Tika.<init>(Tika.java:81)
    at TikaWrapper.Process(TikaWrapper.java:75)
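One thing that might be worth checking against this trace, as an unconfirmed guess rather than a diagnosis: the failing lookup goes through sun.misc.Service, and provider lookups of that kind may depend on the calling thread's context class loader. A thread attached from native code may not have the same context class loader as the main thread. The sketch below is plain Java with no Tika dependency. The class name ContextLoaderSketch and the method callWithLoader are hypothetical names made up for illustration, and it uses lambdas, so it assumes Java 8 or later. It only shows how to print a thread's context class loader and set it temporarily around a call.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Tika or of the original wrapper.
public class ContextLoaderSketch {

    /**
     * Runs the task with the calling thread's context class loader set to
     * the given loader, then restores whatever loader was set before.
     */
    public static <T> T callWithLoader(ClassLoader loader, Supplier<T> task) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(loader);
        try {
            return task.get();
        } finally {
            // Always restore, even if the task throws
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: emulate a thread that has no context class loader,
        // which is the guessed state of a natively attached thread.
        Thread.currentThread().setContextClassLoader(null);
        System.err.println("before: "
            + Thread.currentThread().getContextClassLoader());

        // Use the loader that loaded this class for the duration of the call
        ClassLoader appLoader = ContextLoaderSketch.class.getClassLoader();
        String seen = callWithLoader(appLoader,
            () -> String.valueOf(Thread.currentThread().getContextClassLoader()));
        System.err.println("inside: " + seen);
    }
}
```

If the guess is right, printing Thread.currentThread().getContextClassLoader() at the top of Process() would show a different value in the failing threads, and wrapping the construction as callWithLoader(TikaWrapper.class.getClassLoader(), () -> new Tika()) might be enough to test it. This has not been tried against Tika.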
Unfortunately I'm new to Java and have no idea what's going on... :(
Also, the Tika documentation doesn't clearly explain the TikaConfig
class or what a Tika config file should look like.
Any help appreciated.
//wbr, alex