[ https://issues.apache.org/jira/browse/SOLR-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445694#comment-13445694 ]
Uwe Schindler commented on SOLR-3775:
-------------------------------------

Hi,

as the exception suggests, this issue has nothing to do with Apache Solr; it is caused by the Apache TIKA library, which is bundled with the extracting module to do the file parsing. We cannot fix this issue here, so it would be better to report it to the [TIKA|https://issues.apache.org/jira/browse/TIKA] project. It would also be good to attach the .doc file that causes this to their issue.

> Unexpected RuntimeException
> ---------------------------
>
> Key: SOLR-3775
> URL: https://issues.apache.org/jira/browse/SOLR-3775
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.0-BETA
> Reporter: Alex C
>
> Hi. I'm using Solr 4.0 Beta (no modifications to default installation) to index, and it's blowing up on Word *.DOC files:
> {code}curl "http://localhost:8983/solr/update/extract?literal.id=doc15&commit=true" -F "myfile=@15.doc"{code}
> Here's the exception. And the same files go through Solr 3.6.1 just fine.
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">500</int><int name="QTime">18</int></lst><lst name="error"><str name="msg">org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce</str><str name="trace">org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:230)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>     at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>     at org.eclipse.jetty.server.Server.handle(Server.java:351)
>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>     at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:642)
>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>     at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:224)
>     ... 31 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>     at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:163)
>     at org.apache.poi.hwpf.model.Colorref.<init>(Colorref.java:81)
>     at org.apache.poi.hwpf.model.types.SHDAbstractType.fillFields(SHDAbstractType.java:56)
>     at org.apache.poi.hwpf.usermodel.ShadingDescriptor.<init>(ShadingDescriptor.java:38)
>     at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.unCompressCHPOperation(CharacterSprmUncompressor.java:582)
>     at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:65)
>     at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:288)
>     at org.apache.poi.hwpf.model.StyleSheet.<init>(StyleSheet.java:121)
>     at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:346)
>     at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:77)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:185)
>     at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:160)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>     ... 34 more
> </str><int name="code">500</int></lst>
> </response>{noformat}
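
The point that the failure originates outside Solr can be read from the innermost "Caused by" entry of the trace above. The following is a minimal sketch of that reasoning in plain Java, not code from Solr or Tika: the class `RootCauseOrigin`, its methods, and the package-to-project table are hypothetical names introduced only for illustration. It walks the cause chain to the deepest exception and reports which project's package owns the first matching frame. In the trace quoted here the deepest frames belong to `org.apache.poi`, which is reached through Tika's `WordExtractor`, so the origin lies outside Solr either way.

```java
import java.util.Arrays;
import java.util.List;

public class RootCauseOrigin {
    // Hypothetical lookup table (package prefix -> project name), for illustration only.
    private static final List<String[]> PROJECTS = Arrays.asList(
        new String[] {"org.apache.poi.", "Apache POI"},
        new String[] {"org.apache.tika.", "Apache Tika"},
        new String[] {"org.apache.solr.", "Apache Solr"});

    // Follow getCause() until the innermost exception is reached.
    public static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Return the project owning the first known frame of the root cause, top frame first.
    public static String originProject(Throwable t) {
        for (StackTraceElement frame : rootCause(t).getStackTrace()) {
            for (String[] project : PROJECTS) {
                if (frame.getClassName().startsWith(project[0])) {
                    return project[1];
                }
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // Shortened stand-in for the chain in the report; plain Exceptions replace
        // SolrException and TikaException so the sketch needs no external jars.
        RuntimeException inner = new ArrayIndexOutOfBoundsException("7");
        inner.setStackTrace(new StackTraceElement[] {
            new StackTraceElement("org.apache.poi.util.LittleEndian", "getInt", "LittleEndian.java", 163),
            new StackTraceElement("org.apache.tika.parser.microsoft.WordExtractor", "parse", "WordExtractor.java", 77)
        });
        Exception middle = new Exception("TikaException stand-in", inner);
        Exception outer = new Exception("SolrException stand-in", middle);
        System.out.println(originProject(outer)); // prints "Apache POI"
    }
}
```

A check like this only indicates where to look first; the .doc file itself is still what the parsing project would need to reproduce the problem.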