[ 
https://issues.apache.org/jira/browse/SOLR-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177290#comment-13177290
 ] 

Hoss Man commented on SOLR-2990:
--------------------------------

have you tried parsing these docs using tika on the command line?

https://tika.apache.org/1.0/gettingstarted.html#Using_Tika_as_a_command_line_utility

...nothing in these stack traces seems to suggests a problem specifically in 
Solr

(It's completely possible that Solr is doing something inefficient (memory 
wise) when using Tika that is contributing the the OOM, but if you're getting 
errors on these docs even when you don't get OOM that suggests a more 
fundamental underlying problem)
                
> solr OOM issues
> ---------------
>
>                 Key: SOLR-2990
>                 URL: https://issues.apache.org/jira/browse/SOLR-2990
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 4.0
>         Environment: CentOS 5.x/6.x
> Solr Build apache-solr-4.0-2011-11-04_09-29-42 (includes tika 1.0)
> java -server -Xms2G -Xmx2G -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/var/log/oom/solr.dump.1 -Dsolr.data.dir=/opt/solr.data 
> -Djava.util.logging.config.file=solr-logging.properties -DSTOP.PORT=8907 
> -DSTOP.KEY=STOP -jar start.jar
>            Reporter: Rob Tulloh
>
> We see intermittent issues with OutOfMemory caused by tika failing to process 
> content. Here is an example:
> Dec 29, 2011 7:12:05 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.OutOfMemoryError: Java heap space
>         at 
> org.apache.poi.hmef.attribute.TNEFAttribute.<init>(TNEFAttribute.java:50)
>         at 
> org.apache.poi.hmef.attribute.TNEFAttribute.create(TNEFAttribute.java:76)
>         at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:74)
>         at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
>         at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
>         at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
>         at org.apache.poi.hmef.HMEFMessage.<init>(HMEFMessage.java:63)
>         at 
> org.apache.tika.parser.microsoft.TNEFParser.parse(TNEFParser.java:79)
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>         at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
>         at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:129)
>         at 
> org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:195)
>         at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
>         at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>         at 
> org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1478)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
>         at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>         at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>         at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>         at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>         at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>         at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to