Hi tika-dev,
Does the default PDF parser, when invoked through the auto-detect parser,
require Tika to run in server mode? It seems to try to open an HTTP
connection to localhost:8080 by default. Can it run in-process?
...<snip>
FileInputStream stream = new FileInputStream("src/test/resources/somepdf.pdf");
// Works fine in-process with other document types.
Tika tika = new Tika();
tika.parseToString(stream);
...<snip>
24 Feb 2016 17:06:24 WARN PhaseInterceptorChain - Interceptor for {http://localhost:8080/processHeaderDocument}WebClient has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: No message body writer has been found for class org.apache.cxf.jaxrs.ext.multipart.MultipartBody, ContentType: multipart/form-data
at org.apache.cxf.jaxrs.client.WebClient$BodyWriter.doWriteBody(WebClient.java:1220)
at org.apache.cxf.jaxrs.client.AbstractClient$AbstractBodyWriter.handleMessage(AbstractClient.java:1044)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
at org.apache.cxf.jaxrs.client.AbstractClient.doRunInterceptorChain(AbstractClient.java:623)
at org.apache.cxf.jaxrs.client.WebClient.doChainedInvocation(WebClient.java:1084)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:883)
at org.apache.cxf.jaxrs.client.WebClient.doInvoke(WebClient.java:854)
at org.apache.cxf.jaxrs.client.WebClient.invoke(WebClient.java:320)
at org.apache.cxf.jaxrs.client.WebClient.post(WebClient.java:329)
at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:74)
at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:60)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at org.apache.tika.Tika.parseToString(Tika.java:496)
at org.apache.tika.Tika.parseToString(Tika.java:571)