Hi all, I tried a few changes in the DefaultSolrHighlighter.doHighlighting method to avoid OOM errors. The code changes work fine with a 256 MB max heap.
.....
        // searcher.readDocs(readDocs, docs, fset);
        // Commented out the readDocs call; this method was fetching the
        // stored fields for all rows.
.....
        // Highlight each document
        DocIterator iterator = docs.iterator();
        for (int i = 0; i < docs.size(); i++) {
            int docId = iterator.nextDoc();
            // Document doc = readDocs[i];
            Document doc = searcher.doc(docId, fset);
            // Commented out the line that read the Document from the readDocs
            // array. Instead, searcher.doc() is now called to fetch each
            // Document one at a time.
....
....

With the above changes, memory usage is drastically reduced. One more change was required so that highlighting on the alternate field also works without OOM:

...
altList.add( len + altText.length() > alternateFieldLen ?
    altText.substring( 0, alternateFieldLen - len ) : altText );
...

I modified the above line to:

altList.add( len + altText.length() > alternateFieldLen ?
    new String( altText.substring( 0, alternateFieldLen - len ) ) : altText );

The substring is wrapped in a new String object so that no reference is held to the entire original string.

Please let me know if this is a valid fix. Should I open a JIRA issue for it? One thing I observed is that the search now takes around 20-25 seconds, maybe because we are reading 1 MB of text for 500 documents.

Thanks,
Siddharth

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Tuesday, April 28, 2009 4:35 PM
To: solr-u...@lucene.apache.org; solr-dev@lucene.apache.org
Subject: RE: OutofMemory on Highlightling

Is it possible to read only maxAnalyzedChars from the stored field instead of reading the complete field into memory? For instance, in my case, is it possible to read only the first 50K characters instead of the complete 1 MB of stored text? That would help minimize the memory usage (though it would still take 50K * 500 * 2 = 50 MB for 500 results). I would really appreciate some feedback on this issue...

Thanks,
Siddharth

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Friday, April 24, 2009 10:46 AM
To: solr-u...@lucene.apache.org
Subject: RE: OutofMemory on Highlightling

I am not sure whether lazy loading should help solve this problem. I have set enableLazyFieldLoading to true, but it is not helping. I went through the code and observed that DefaultSolrHighlighter.doHighlighting reads all the documents and the fields needed for highlighting (in my case, the 1 MB stored field is read for every document). I am also confused by the following code in the SolrIndexSearcher.doc() method:

if (!enableLazyFieldLoading || fields == null) {
    d = searcher.getIndexReader().document(i);
} else {
    d = searcher.getIndexReader().document(i,
            new SetNonLazyFieldSelector(fields));
}

Are we setting the fields as non-lazy even if lazy loading is enabled?
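For what it's worth, my (possibly wrong) reading of that selector, written out as a simplified sketch rather than the actual Solr source, is that the explicitly requested fields are loaded eagerly while every other field is left lazy:

import java.util.Set;

import org.apache.lucene.document.FieldSelector;
import org.apache.lucene.document.FieldSelectorResult;

// Simplified sketch of how I understand SetNonLazyFieldSelector to behave
// (hypothetical name and shape, not copied from the Solr source): fields
// that were explicitly requested are read eagerly, everything else is only
// read lazily on first access.
class RequestedFieldsSelector implements FieldSelector {
    private final Set<String> requested;

    RequestedFieldsSelector(Set<String> requested) {
        this.requested = requested;
    }

    public FieldSelectorResult accept(String fieldName) {
        return requested.contains(fieldName)
            ? FieldSelectorResult.LOAD       // requested field: read the stored value now
            : FieldSelectorResult.LAZY_LOAD; // other fields: defer until the value is asked for
    }
}

If that reading is right, the teaser field is still fully materialized for every hit because it is the field being highlighted, which would explain why enableLazyFieldLoading does not help here.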
Thanks,
Siddharth

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Wednesday, April 22, 2009 11:12 AM
To: solr-u...@lucene.apache.org
Subject: RE: OutofMemory on Highlightling

Here is the stack trace:

SEVERE: java.lang.OutOfMemoryError: Java heap space
        at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:133)
        at java.lang.StringCoding.decode(StringCoding.java:173)
        at java.lang.String.<init>(String.java:444)
        at org.apache.lucene.store.IndexInput.readString(IndexInput.java:125)
        at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:390)
        at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:230)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:892)
        at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:277)
        at org.apache.solr.search.SolrIndexReader.document(SolrIndexReader.java:176)
        at org.apache.solr.search.SolrIndexSearcher.doc(SolrIndexSearcher.java:457)
        at org.apache.solr.search.SolrIndexSearcher.readDocs(SolrIndexSearcher.java:482)
        at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:253)
        at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:84)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Wednesday, April 22, 2009 9:29 AM
To: solr-u...@lucene.apache.org
Subject: RE: OutofMemory on Highlightling

I tried disabling the documentCache, but I still see the same issue.

<documentCache class="solr.LRUCache"
                size="0"
                initialSize="0"
                autowarmCount="0"/>

-----Original Message-----
From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
Sent: Monday, April 20, 2009 4:38 PM
To: solr-u...@lucene.apache.org
Subject: Re: OutofMemory on Highlightling

Gargate, Siddharth wrote:
> Is anybody facing the same issue? Following is my configuration:
> ...
> <field name="content" type="text" indexed="true" stored="false"
>        multiValued="true"/>
> <field name="teaser" type="text" indexed="false" stored="true"/>
> <copyField source="content" dest="teaser" maxChars="1000000" />
> ...
>
> ...
> <requestHandler name="standard" class="solr.SearchHandler" default="true">
>   <lst name="defaults">
>     <str name="echoParams">explicit</str>
>
>     <int name="rows">500</int>
>     <str name="hl">true</str>
>     <str name="fl">id,score</str>
>     <str name="hl.fl">teaser</str>
>     <str name="hl.alternateField">teaser</str>
>     <int name="hl.fragsize">200</int>
>     <int name="hl.maxAlternateFieldLength">200</int>
>     <int name="hl.maxAnalyzedChars">500</int>
>   </lst>
> </requestHandler>
> ...
>
> Search works fine if I disable highlighting, and it brings back 500
> results. But if I enable highlighting and set the number of rows to
> just 20, I get an OOME.
>
How about switching documentCache off?

Koji
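For rough scale, here is a back-of-the-envelope estimate based on the quoted configuration (a sketch only: the class name is just for illustration, and it assumes the teaser copyField is actually filled to its 1,000,000-char maxChars and that highlighting materializes the stored teaser for every row, as described earlier in this thread):

public class TeaserHeapEstimate {
    public static void main(String[] args) {
        // Assumptions taken from the quoted configuration, not measured:
        long charsPerTeaser = 1000000L; // copyField maxChars for the "teaser" field
        long bytesPerChar   = 2L;       // Java stores String data as UTF-16 chars
        long rows           = 500L;     // "rows" default in the request handler

        long bytes = charsPerTeaser * bytesPerChar * rows;
        // Prints roughly 953 MB, before any decoding buffers or extra copies,
        // which is already far beyond a 256 MB heap.
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}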