[
https://issues.apache.org/jira/browse/SOLR-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177290#comment-13177290
]
Hoss Man commented on SOLR-2990:
--------------------------------
Have you tried parsing these docs using Tika on the command line?
https://tika.apache.org/1.0/gettingstarted.html#Using_Tika_as_a_command_line_utility
...nothing in these stack traces seems to suggest a problem specifically in
Solr.
(It's completely possible that Solr is doing something inefficient
(memory-wise) when using Tika that is contributing to the OOM, but if you're
getting errors on these docs even when you don't get an OOM, that suggests a
more fundamental underlying problem.)
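
For anyone who wants to run that check over a batch of suspect files, here is a rough sketch in Python that shells out to the Tika command line app once per document. The jar location (tika-app-1.0.jar in the current directory), the 2g heap (picked to match the -Xmx2G in the report) and the helper names are assumptions for illustration, not anything taken from the issue. It also assumes that a failed parse shows up as a non-zero exit status with an exception on stderr, which is worth confirming by hand on one known-bad file first.

```python
import subprocess
import sys

# Assumed location of the standalone Tika app jar; adjust to your download.
TIKA_JAR = "tika-app-1.0.jar"


def tika_command(doc_path, jar=TIKA_JAR, heap="2g"):
    """Build the command line that asks Tika for the plain text of one file."""
    return ["java", f"-Xmx{heap}", "-jar", jar, "--text", doc_path]


def first_error_line(stderr_text):
    """Return the first stderr line naming an Exception or Error, else ''."""
    for line in stderr_text.splitlines():
        if "Exception" in line or "Error" in line:
            return line.strip()
    return ""


def check_document(doc_path, jar=TIKA_JAR, heap="2g", timeout=300):
    """Run Tika on one file outside Solr; return (ok, short failure detail)."""
    try:
        proc = subprocess.run(
            tika_command(doc_path, jar, heap),
            stdout=subprocess.DEVNULL,  # only success or failure matters here
            stderr=subprocess.PIPE,
            text=True,
            errors="replace",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, first_error_line(proc.stderr)


if __name__ == "__main__":
    # Usage: python check_tika.py doc1 doc2 ...
    for path in sys.argv[1:]:
        ok, detail = check_document(path)
        status = "OK  " if ok else "FAIL"
        print(f"{status} {path} {detail}".rstrip())
```

Files that fail here, with no Solr involved, point at Tika or POI rather than at Solr.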
> solr OOM issues
> ---------------
>
> Key: SOLR-2990
> URL: https://issues.apache.org/jira/browse/SOLR-2990
> Project: Solr
> Issue Type: Bug
> Components: clients - java
> Affects Versions: 4.0
> Environment: CentOS 5.x/6.x
> Solr Build apache-solr-4.0-2011-11-04_09-29-42 (includes tika 1.0)
> java -server -Xms2G -Xmx2G -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/var/log/oom/solr.dump.1 -Dsolr.data.dir=/opt/solr.data
> -Djava.util.logging.config.file=solr-logging.properties -DSTOP.PORT=8907
> -DSTOP.KEY=STOP -jar start.jar
> Reporter: Rob Tulloh
>
> We see intermittent issues with OutOfMemory caused by tika failing to process
> content. Here is an example:
> Dec 29, 2011 7:12:05 AM org.apache.solr.common.SolrException log
> SEVERE: java.lang.OutOfMemoryError: Java heap space
> at org.apache.poi.hmef.attribute.TNEFAttribute.<init>(TNEFAttribute.java:50)
> at org.apache.poi.hmef.attribute.TNEFAttribute.create(TNEFAttribute.java:76)
> at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:74)
> at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
> at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
> at org.apache.poi.hmef.HMEFMessage.process(HMEFMessage.java:98)
> at org.apache.poi.hmef.HMEFMessage.<init>(HMEFMessage.java:63)
> at org.apache.tika.parser.microsoft.TNEFParser.parse(TNEFParser.java:79)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:129)
> at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:195)
> at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58)
> at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
> at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:244)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1478)
> at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353)
> at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248)
> at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)