[ https://issues.apache.org/jira/browse/TIKA-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-1953:
------------------------------------
    Fix Version/s:     (was: 1.15)
                       1.16

> tika-server NullPointerException while processing rtfs
> ------------------------------------------------------
>
>                 Key: TIKA-1953
>                 URL: https://issues.apache.org/jira/browse/TIKA-1953
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.12
>         Environment: Python 2.7.11 :: Anaconda 4.0.0 (64-bit)
> Red Hat Enterprise Linux Server release 6.7 (Santiago)
> java version "1.7.0_95"
> OpenJDK Runtime Environment (rhel-2.6.4.0.el6_7-x86_64 u95-b00)
> OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
>            Reporter: Ravi
>            Assignee: Tim Allison
>              Labels: newbie, rtf, tika-python, tika-server, xmlContent,
>             Fix For: 1.16
>
>         Attachments: officeinstallations3.rtf
>
>
> Looks like the xmlContent=True flag causes tika.py: Warn: Tika server
> returned status: 422 error
> I start the tika server and then run the following code in the python kernel
> at bash
> import tika
> from tika import parser
> parsed = parser.from_file('/path/to/file.rtf', 'http://localhost:9003', xmlContent=True)
> I get: tika.py: Warn: Tika server returned status: 422
> Looking at the tika-server log I get the following dump:
> Note: The parser seems to work fine without the xmlContent=True flag set.
> I get the right output but setting this flag creates the NullPointerException
> below
> ------------------------------------------------------------------------------
> Apr 15, 2016 2:36:55 PM org.apache.tika.server.resource.TikaResource logRequest
> INFO: rmeta/xml (autodetecting type)
> Apr 15, 2016 2:36:55 PM org.apache.tika.server.resource.TikaResource parse
> WARNING: rmeta/xml: Text extraction failed
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.rtf.RTFParser@21f0dbb9
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:282)
>       at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:177)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>       at org.apache.tika.parser.RecursiveParserWrapper.parse(RecursiveParserWrapper.java:158)
>       at org.apache.tika.server.resource.TikaResource.parse(TikaResource.java:281)
>       at org.apache.tika.server.resource.RecursiveMetadataResource.parseMetadata(RecursiveMetadataResource.java:138)
>       at org.apache.tika.server.resource.RecursiveMetadataResource.getMetadata(RecursiveMetadataResource.java:119)
>       at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.cxf.service.invoker.AbstractInvoker.performInvocation(AbstractInvoker.java:181)
>       at org.apache.cxf.service.invoker.AbstractInvoker.invoke(AbstractInvoker.java:97)
>       at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:200)
>       at org.apache.cxf.jaxrs.JAXRSInvoker.invoke(JAXRSInvoker.java:99)
>       at org.apache.cxf.interceptor.ServiceInvokerInterceptor$1.run(ServiceInvokerInterceptor.java:59)
>       at org.apache.cxf.interceptor.ServiceInvokerInterceptor.handleMessage(ServiceInvokerInterceptor.java:96)
>       at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
>       at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
>       at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251)
>       at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
>       at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70)
>       at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
>       at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
>       at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>       at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>       at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>       at org.eclipse.jetty.server.Server.handle(Server.java:370)
>       at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
>       at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)
>       at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)
>       at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
>       at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
>       at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
>       at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
>       at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>       at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>       at org.apache.tika.sax.ToXMLContentHandler$ElementInfo.access$000(ToXMLContentHandler.java:38)
>       at org.apache.tika.sax.ToXMLContentHandler.endElement(ToXMLContentHandler.java:195)
>       at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>       at org.apache.tika.sax.SecureContentHandler.endElement(SecureContentHandler.java:256)
>       at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>       at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>       at org.apache.tika.sax.ContentHandlerDecorator.endElement(ContentHandlerDecorator.java:136)
>       at org.apache.tika.sax.SafeContentHandler.endElement(SafeContentHandler.java:273)
>       at org.apache.tika.sax.XHTMLContentHandler.endDocument(XHTMLContentHandler.java:226)
>       at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:478)
>       at org.apache.tika.parser.rtf.TextExtractor.extract(TextExtractor.java:439)
>       at org.apache.tika.parser.rtf.RTFParser.parse(RTFParser.java:87)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>       ... 38 more
> ------------------------------------------------------------------------------

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
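
Since the report notes that parsing succeeds without xmlContent=True, a client-side workaround until the fix ships could look roughly like the sketch below. It is not part of the issue or of tika-python: parse_with_fallback is a hypothetical helper, and it assumes the dict returned by parser.from_file carries a 'status' key holding the HTTP status, which may not hold for every tika-python release.

```python
def parse_with_fallback(path, endpoint, from_file):
    """Request XHTML output first; retry as plain text if the server rejects it.

    `from_file` is expected to behave like tika.parser.from_file, i.e. accept
    (filename, serverEndpoint, xmlContent=...) and return a dict.
    ASSUMPTION: that dict has a 'status' key with the HTTP status code.
    Returns (parsed_dict, got_xml).
    """
    parsed = from_file(path, endpoint, xmlContent=True)
    if parsed.get("status") == 200:
        return parsed, True
    # The XHTML request failed (for example with the 422 described above),
    # so fall back to plain-text extraction, which the report says works.
    return from_file(path, endpoint, xmlContent=False), False


# Usage against a real server (needs tika-python and a running tika-server):
#   from tika import parser
#   parsed, got_xml = parse_with_fallback(
#       '/path/to/file.rtf', 'http://localhost:9003', parser.from_file)
```

Passing from_file in as an argument keeps the helper independent of a running server, so the fallback logic can be exercised with a stub.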