Hi,
I'm seeing the errors below when using the apache/tika:latest docker image. I'm using Tika solely as a backend to Apache Solr and haven't configured Tika at all: I've just told Solr to load its 'extraction' module, and it's finding Tika on the default port.

I also tried installing tika from tika-server-standard-3.2.3-bin.tgz and running it via systemd with the following in the service file:
    Environment=TIKA_INCLUDE=/etc/default/tika.in.sh
    ExecStart=/usr/bin/java -jar /opt/tika/tika-server.jar

I've seen the same error messages running Tika both ways, and with both Solr 9.10.1 and 10.0.

A quick web search suggests it might be an issue with non-thread-safe code, but I'm not familiar with Java, so that's just a guess. Is there any configuration I need to do for Tika that will help resolve this, or any other suggestions?
Many thanks,
Carl


WARN [qtp2128961136-26] 10:35:10,759 org.apache.tika.server.core.resource.TikaResource tika: Text extraction failed (4th year transition block slides - prereadingv1.pptx)
org.apache.tika.exception.TikaException: TIKA-237: Illegal SAXException from org.apache.tika.parser.DefaultParser@58dce8f3
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:310)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:204)
        at org.apache.tika.server.core.resource.TikaResource.parse(TikaResource.java:365)
        at org.apache.tika.server.core.resource.TikaResource.lambda$produceOutput$2(TikaResource.java:659)
        at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:176)
        at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1651)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:249)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:122)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:84)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:244)
        at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:80)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:178)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.Server.handle(Server.java:563)
        at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
        at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
        at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: org.apache.tika.sax.TaggedSAXException: org.eclipse.jetty.io.EofException
        at org.apache.tika.sax.TaggedContentHandler.handleException(TaggedContentHandler.java:113)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:134)
        at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:201)
        at org.apache.tika.sax.XHTMLContentHandler.endElement(XHTMLContentHandler.java:257)
        at org.apache.tika.sax.XHTMLContentHandler.endElement(XHTMLContentHandler.java:290)
        at org.apache.tika.parser.csv.TextAndCSVParser.handleText(TextAndCSVParser.java:135)
        at org.apache.tika.parser.csv.TextAndCSVParser.handleText(TextAndCSVParser.java:258)
        at org.apache.tika.parser.csv.TextAndCSVParser.parse(TextAndCSVParser.java:179)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
        ... 35 more
Caused by: org.apache.tika.sax.TaggedSAXException: org.eclipse.jetty.io.EofException
        at org.apache.tika.sax.TaggedContentHandler.handleException(TaggedContentHandler.java:113)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:134)
        ... 44 more
Caused by: org.xml.sax.SAXException: org.eclipse.jetty.io.EofException
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream.outputCharacters(ToStream.java:1523)
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream$CharacterBuffer$1.flush(ToStream.java:3417)
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream$CharacterBuffer.flush(ToStream.java:3506)
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream.flushCharactersBuffer(ToStream.java:1559)
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream.endElement(ToStream.java:2092)
        at java.xml/com.sun.org.apache.xalan.internal.xsltc.trax.TransformerHandlerImpl.endElement(TransformerHandlerImpl.java:282)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:134)
        at org.apache.tika.sax.ExpandedTitleContentHandler.endElement(ExpandedTitleContentHandler.java:69)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:134)
        at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:241)
        at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:134)
        ... 45 more
Caused by: org.eclipse.jetty.io.EofException
        at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:116)
        at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
        at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:275)
        at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:254)
        at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:386)
        at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:843)
        at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:243)
        at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:224)
        at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:600)
        at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:1051)
        at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:1123)
        at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:271)
        at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:255)
        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:859)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination$JettyOutputStream.write(JettyHTTPDestination.java:319)
        at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
        at java.base/java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:284)
        at java.base/java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:232)
        at java.base/java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:148)
        at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
        at org.apache.cxf.io.AbstractThresholdOutputStream.write(AbstractThresholdOutputStream.java:69)
        at java.base/sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:309)
        at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:381)
        at java.base/sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:357)
        at java.base/sun.nio.cs.StreamEncoder.lockedWrite(StreamEncoder.java:158)
        at java.base/sun.nio.cs.StreamEncoder.write(StreamEncoder.java:139)
        at java.base/java.io.OutputStreamWriter.write(OutputStreamWriter.java:219)
        at java.xml/com.sun.org.apache.xml.internal.serializer.ToStream.outputCharacters(ToStream.java:1515)
        ... 55 more
Caused by: java.io.IOException: Broken pipe
        at java.base/sun.nio.ch.SocketDispatcher.writev0(Native Method)
        at java.base/sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:66)
        at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:227)
        at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:158)
        at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:574)
        at java.base/java.nio.channels.SocketChannel.write(SocketChannel.java:660)
        at org.eclipse.jetty.io.SocketChannelEndPoint.flush(SocketChannelEndPoint.java:110)
        ... 82 more
ERROR [qtp2128961136-26] 10:35:10,769 org.apache.cxf.jaxrs.utils.JAXRSUtils Problem with writing the data, class org.apache.tika.server.core.resource.TikaResource$$Lambda/0x00007f71fc2a6208, ContentType: text/xml
WARN [qtp2128961136-26] 10:35:10,770 org.apache.cxf.phase.PhaseInterceptorChain Interceptor for {http://resource.core.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: Could not send Message.
        at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:67)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:244)
        at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:80)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:178)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.Server.handle(Server.java:563)
        at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
        at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
        at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
        at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: org.eclipse.jetty.io.EofException: Closed
        at org.eclipse.jetty.server.HttpOutput.checkWritable(HttpOutput.java:757)
        at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:781)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination$JettyOutputStream.write(JettyHTTPDestination.java:319)
        at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
        at java.base/java.util.zip.GZIPOutputStream.finish(GZIPOutputStream.java:172)
        at java.base/java.util.zip.DeflaterOutputStream.close(DeflaterOutputStream.java:267)
        at org.apache.cxf.io.AbstractWrappedOutputStream.close(AbstractWrappedOutputStream.java:77)
        at org.apache.cxf.io.AbstractThresholdOutputStream.close(AbstractThresholdOutputStream.java:102)
        at org.apache.cxf.transport.AbstractConduit.close(AbstractConduit.java:56)
        at org.apache.cxf.transport.http.AbstractHTTPDestination$BackChannelConduit.close(AbstractHTTPDestination.java:766)
        at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:63)
        ... 27 more
WARN [qtp2128961136-26] 10:35:10,771 org.apache.cxf.phase.PhaseInterceptorChain Interceptor for {http://resource.core.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now
java.lang.NullPointerException: Deflater has been closed
        at java.base/java.util.zip.Deflater.ensureOpen(Deflater.java:902)
        at java.base/java.util.zip.Deflater.deflate(Deflater.java:564)
        at java.base/java.util.zip.Deflater.deflate(Deflater.java:464)
        at java.base/java.util.zip.DeflaterOutputStream.deflate(DeflaterOutputStream.java:282)
        at java.base/java.util.zip.DeflaterOutputStream.write(DeflaterOutputStream.java:232)
        at java.base/java.util.zip.GZIPOutputStream.write(GZIPOutputStream.java:148)
        at org.apache.cxf.io.AbstractWrappedOutputStream.write(AbstractWrappedOutputStream.java:51)
        at org.apache.cxf.io.AbstractThresholdOutputStream.write(AbstractThresholdOutputStream.java:69)
        at com.ctc.wstx.io.UTF8Writer.flush(UTF8Writer.java:100)
        at com.ctc.wstx.sw.BufferingXmlWriter.flush(BufferingXmlWriter.java:242)
        at com.ctc.wstx.sw.BaseStreamWriter.flush(BaseStreamWriter.java:260)
        at org.apache.cxf.jaxrs.interceptor.JAXRSDefaultFaultOutInterceptor.handleMessage(JAXRSDefaultFaultOutInterceptor.java:104)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.interceptor.AbstractFaultChainInitiatorObserver.onMessage(AbstractFaultChainInitiatorObserver.java:112)
        at org.apache.cxf.phase.PhaseInterceptorChain.wrapExceptionAsFault(PhaseInterceptorChain.java:376)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:334)
        at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:267)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:244)
        at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:80)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
        at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:178)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
        at org.eclipse.jetty.server.Server.handle(Server.java:563)
        at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
        at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
        at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
        at java.base/java.lang.Thread.run(Thread.java:1583)
