Alan Gibson created TIKA-2481:
---------------------------------

             Summary: PUTing to /tika/main with fileUrl always returns 415 Unsupported Media Type
                 Key: TIKA-2481
                 URL: https://issues.apache.org/jira/browse/TIKA-2481
             Project: Tika
          Issue Type: Bug
    Affects Versions: 1.16
            Reporter: Alan Gibson
            Priority: Minor

While trying to get a feel for what Tika outputs, I noticed that using Boilerpipe with fileUrl requests seems to always fail.

{noformat}
user@host:~/dev$ curl -v -X PUT -H "fileUrl:http://tika.apache.org/" -H "Accept: text/plain" http://localhost:9998/tika/main
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 9998 (#0)
> PUT /tika/main HTTP/1.1
> Host: localhost:9998
> User-Agent: curl/7.52.1
> fileUrl:http://tika.apache.org/
> Accept: text/plain
>
< HTTP/1.1 415 Unsupported Media Type
< Content-Length: 0
< Server: Jetty(8.y.z-SNAPSHOT)
<
* Curl_http_done: called premature == 0
* Connection #0 to host localhost left intact
{noformat}

{noformat}
user@host:~/dev$ java -jar tika-server-1.16.jar -enableUnsecureFeatures -enableFileUrl --log debug --includeStack
[...]
INFO  tika/main (autodetecting type)
WARN  tika/main: Text extraction failed
org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.server.resource.TikaResource$1@22a12aed
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
    at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:322)
    at org.apache.tika.server.resource.TikaResource$4.write(TikaResource.java:392)
    at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:169)
    at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1389)
    at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:243)
    at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:119)
    at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:82)
    at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
    at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:83)
    at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
    at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
    at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:252)
    at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
    at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:76)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:973)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1035)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:641)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:231)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.ws.rs.WebApplicationException: HTTP 415 Unsupported Media Type
    at org.apache.tika.server.resource.TikaResource$1.parse(TikaResource.java:120)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    ... 32 more
ERROR Problem with writing the data, class org.apache.tika.server.resource.TikaResource$4, ContentType: text/plain
{noformat}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
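
For anyone who would rather reproduce the failing request without curl, it can be sketched in Python using only the standard library. This is a minimal sketch, not part of Tika: `build_request` and `send` are hypothetical helper names, and the endpoint and target URL are the values from the curl command in the report. Note that `urllib` normalises header-name capitalisation (so `fileUrl` is stored as `Fileurl`), which should not matter because HTTP header names are case-insensitive.

```python
import urllib.error
import urllib.request

# Values copied from the curl command in the report; adjust for your own setup.
TIKA_ENDPOINT = "http://localhost:9998/tika/main"
TARGET_URL = "http://tika.apache.org/"


def build_request(endpoint=TIKA_ENDPOINT, file_url=TARGET_URL):
    """Build a body-less PUT carrying the fileUrl and Accept headers."""
    return urllib.request.Request(
        endpoint,
        method="PUT",
        headers={"fileUrl": file_url, "Accept": "text/plain"},
    )


def send(request, timeout=30.0):
    """Send the request and return (status, body) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response.status, body
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the status is still available.
        return err.code, err.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    status, body = send(build_request())
    print(status)
    print(body[:200])
```

Against a tika-server 1.16 instance started as shown above, this would be expected to report the same 415 status that curl shows, since it sends the same method and headers.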