Hi, Is there any way to check whether a file type is supported before submitting the file to Tika, maybe using file magic? In the example below, the file was bad and I got an HTTP 415 error code. I am using tika-python as a client, connecting to the Tika server. (Maybe the file was incomplete?)
Thanks, Latha.

2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.server.resource.TikaResource$1@53037cad
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:222)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:402)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:144)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:121)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at java.lang.reflect.Method.invoke(Method.java:498)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:179)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:96)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:193)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:103)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:205)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.Server.handle(Server.java:531)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at java.lang.Thread.run(Thread.java:748)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: Caused by: javax.ws.rs.WebApplicationException: HTTP 415 Unsupported Media Type
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.server.resource.TikaResource$1.parse(TikaResource.java:128)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
2019-02-26 23:41:48,387 LathaRel14-NOX-15 user.notice tika: #011... 41 more
2019-02-26 23:41:48,389 LathaRel14-NOX-15 user.notice tika: INFO rmeta/text (autodetecting type)
2019-02-26 23:41:48,393 LathaRel14-NOX-15 user.notice tika: WARN rmeta/text: Text extraction failed
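
To make the "file magic" idea in the question concrete, here is a minimal sketch of a client-side pre-check using only the Python standard library. It is an illustration under assumptions, not anything Tika or tika-python provides: the names `KNOWN_SIGNATURES`, `sniff_type`, `looks_submittable`, and the `min_size` threshold are hypothetical, and the signature table is a small hand-picked sample, not the set of types a given Tika server actually supports.

```python
from pathlib import Path

# Hypothetical allow-list mapping leading "magic" bytes to a media type.
# This is a hand-picked sample for illustration, NOT Tika's supported list.
KNOWN_SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"PK\x03\x04": "application/zip",  # also the container for docx/xlsx/odt
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "application/x-ole-storage",  # legacy Office files
}


def sniff_type(head: bytes):
    """Return the media type whose signature prefixes `head`, or None."""
    for magic, media_type in KNOWN_SIGNATURES.items():
        if head.startswith(magic):
            return media_type
    return None


def looks_submittable(path, min_size=8):
    """Cheap pre-check: file exists, is not tiny, and has a known signature.

    `min_size` is an arbitrary assumed threshold for rejecting stub files.
    """
    p = Path(path)
    if not p.is_file() or p.stat().st_size < min_size:
        return False
    with p.open("rb") as fh:
        head = fh.read(16)  # longest signature above is 8 bytes
    return sniff_type(head) is not None
```

A check like this only inspects the first few bytes, so a truncated file with an intact header would still pass; it can filter out obviously unknown or empty input, but it cannot confirm that the server will parse the file successfully.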
