[ 
https://issues.apache.org/jira/browse/SOLR-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445694#comment-13445694
 ] 

Uwe Schindler edited comment on SOLR-3775 at 8/31/12 4:49 PM:
--------------------------------------------------------------

Hi,
as the exception suggests, this issue has nothing to do with Apache Solr. It is 
caused by the library [Apache TIKA|http://tika.apache.org/], which is bundled 
with the extraction module to do the file parsing and itself uses 
[Apache POI|http://poi.apache.org/]. We cannot fix this issue here; it would be 
better to report it to the 
[TIKA issue tracker|https://issues.apache.org/jira/browse/TIKA]. They may in 
turn forward you to Apache POI, which is the root cause of this issue, so it is 
probably best to open the issue directly on 
[their side|https://issues.apache.org/bugzilla/buglist.cgi?product=POI]. It 
would also be good to attach the .doc file that triggers this to their issue.
                
      was (Author: thetaphi):
    Hi,
as the exception suggests, this issue has nothing to do with Apache Solr. It is 
caused by the library Apache TIKA, which is bundled with the extraction 
module to do the file parsing. We cannot fix this issue here; it would be 
better to report it to the [TIKA|https://issues.apache.org/jira/browse/TIKA] 
project. It would also be good to attach the .doc file that triggers this to 
their issue.
                  
> Unexpected RuntimeException
> ---------------------------
>
>                 Key: SOLR-3775
>                 URL: https://issues.apache.org/jira/browse/SOLR-3775
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0-BETA
>            Reporter: Alex C
>
> Hi. I'm using Solr 4.0 Beta (no modifications to default installation) to 
> index, and it's blowing up on Word *.DOC files:
> {code}
> curl "http://localhost:8983/solr/update/extract?literal.id=doc15&commit=true" -F "myfile=@15.doc"
> {code}
> Here's the exception. And the same files go through Solr 3.6.1 just fine.
> {noformat}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">500</int><int name="QTime">18</int></lst>
> <lst name="error"><str name="msg">org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce</str>
> <str name="trace">org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>         at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:230)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>         at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
>         at org.eclipse.jetty.server.Server.handle(Server.java:351)
>         at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
>         at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
>         at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:890)
>         at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:944)
>         at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:642)
>         at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:230)
>         at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
>         at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:254)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:599)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:534)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@328c62ce
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>         at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:224)
>         ... 31 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:163)
>         at org.apache.poi.hwpf.model.Colorref.&lt;init&gt;(Colorref.java:81)
>         at org.apache.poi.hwpf.model.types.SHDAbstractType.fillFields(SHDAbstractType.java:56)
>         at org.apache.poi.hwpf.usermodel.ShadingDescriptor.&lt;init&gt;(ShadingDescriptor.java:38)
>         at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.unCompressCHPOperation(CharacterSprmUncompressor.java:582)
>         at org.apache.poi.hwpf.sprm.CharacterSprmUncompressor.uncompressCHP(CharacterSprmUncompressor.java:65)
>         at org.apache.poi.hwpf.model.StyleSheet.createChp(StyleSheet.java:288)
>         at org.apache.poi.hwpf.model.StyleSheet.&lt;init&gt;(StyleSheet.java:121)
>         at org.apache.poi.hwpf.HWPFDocument.&lt;init&gt;(HWPFDocument.java:346)
>         at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:77)
>         at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:185)
>         at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:160)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>         ... 34 more
> </str><int name="code">500</int></lst>
> </response>
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
