I have opened a JIRA issue to track this: https://issues.apache.org/jira/browse/SOLR-1150
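To make the second part of the patch easier to follow, here is a minimal standalone sketch of the retention problem behind the new String(...) change quoted below. It assumes a pre-Java-7 JVM, where String.substring returns a String that shares the parent's backing char[]; the class name and the constant are made up for this illustration.

    // SubstringRetention.java - illustration only, names are hypothetical.
    public class SubstringRetention {
        static final int ALTERNATE_FIELD_LEN = 100; // stands in for alternateFieldLen

        public static void main(String[] args) {
            // Simulate a 1 MB stored field value.
            StringBuilder sb = new StringBuilder(1 << 20);
            for (int i = 0; i < (1 << 20); i++) {
                sb.append('x');
            }
            String altText = sb.toString();

            // On pre-Java-7 JVMs this snippet shares altText's 1 MB char[],
            // keeping the whole buffer reachable as long as the snippet lives.
            String leaky = altText.substring(0, ALTERNATE_FIELD_LEN);

            // Copying into a new String allocates a fresh 100-char array,
            // so the 1 MB buffer can be garbage-collected.
            String safe = new String(altText.substring(0, ALTERNATE_FIELD_LEN));

            System.out.println(leaky.length() + " / " + safe.length());
        }
    }

On such a JVM, a heap dump taken after altText goes out of scope would still show the 1 MB char[] reachable through leaky, but not through safe.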
-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Monday, May 04, 2009 10:07 AM
To: solr-dev@lucene.apache.org
Subject: RE: OutOfMemory on Highlighting

Hi all,

I tried a few changes in the DefaultSolrHighlighter.doHighlighting method to avoid OOM errors. The changes work fine with a 256 MB max heap.

    .....
    // searcher.readDocs(readDocs, docs, fset);
    // Commented out the readDocs call, which was fetching the stored
    // fields for all rows up front.
    .....
    // Highlight each document
    DocIterator iterator = docs.iterator();
    for (int i = 0; i < docs.size(); i++) {
      int docId = iterator.nextDoc();
      // Document doc = readDocs[i];
      // Instead of reading the Document from the readDocs array, call
      // searcher.doc() to fetch each Document one at a time.
      Document doc = searcher.doc(docId, fset);
      ....
    }
    ....

With the above changes, memory usage is drastically reduced. One more change was required so that highlighting on the alternate field also works without OOM:

    ...
    altList.add( len + altText.length() > alternateFieldLen ?
        altText.substring( 0, alternateFieldLen - len ) : altText );
    ...

Modified the above line to:

    altList.add( len + altText.length() > alternateFieldLen ?
        new String(altText.substring( 0, alternateFieldLen - len )) : altText );

The substring is wrapped in a new String object so that no reference is held to the entire original string. Please let me know if this is a valid fix. Should I open a JIRA issue for it?

One thing I observed is that search still takes around 20-25 seconds, probably because we are reading 1 MB of text for each of the 500 documents.

Thanks,
Siddharth

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Tuesday, April 28, 2009 4:35 PM
To: solr-u...@lucene.apache.org; solr-dev@lucene.apache.org
Subject: RE: OutOfMemory on Highlighting

Is it possible to read only maxAnalyzedChars from the stored field instead of reading the complete field into memory? For instance, in my case, could only the first 50K characters be read instead of the complete 1 MB stored text? That would help minimize memory usage, though at 2 bytes per character it would still take 50K * 500 * 2 = 50 MB for 500 results.

I would really appreciate some feedback on this issue...

Thanks,
Siddharth

-----Original Message-----
From: Gargate, Siddharth [mailto:sgarg...@ptc.com]
Sent: Friday, April 24, 2009 10:46 AM
To: solr-u...@lucene.apache.org
Subject: RE: OutOfMemory on Highlighting

I am not sure whether lazy loading should help solve this problem. I have set enableLazyFieldLoading to true, but it is not helping. I went through the code and observed that DefaultSolrHighlighter.doHighlighting reads all the documents and the fields needed for highlighting (in my case, the 1 MB stored field is read for every document). I am also confused by the following code in the SolrIndexSearcher.doc() method:

    if (!enableLazyFieldLoading || fields == null) {
      d = searcher.getIndexReader().document(i);
    } else {
      d = searcher.getIndexReader().document(i,
          new SetNonLazyFieldSelector(fields));
    }

Are we setting the fields as non-lazy even when lazy loading is enabled?

Thanks,
Siddharth
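P.S. For what it's worth, my reading of the Lucene FieldSelector contract is roughly the sketch below (an illustration with a made-up class name, not the actual SetNonLazyFieldSelector source): fields in the requested set are loaded eagerly, everything else is deferred.

    import java.util.Set;

    import org.apache.lucene.document.FieldSelector;
    import org.apache.lucene.document.FieldSelectorResult;

    // Sketch of a set-based selector: requested fields are loaded
    // eagerly, all other fields are deferred until first access.
    class NonLazySetFieldSelector implements FieldSelector {
        private final Set<String> fieldsToLoadEagerly;

        NonLazySetFieldSelector(Set<String> fieldsToLoadEagerly) {
            this.fieldsToLoadEagerly = fieldsToLoadEagerly;
        }

        public FieldSelectorResult accept(String fieldName) {
            return fieldsToLoadEagerly.contains(fieldName)
                    ? FieldSelectorResult.LOAD       // asked-for field: read fully now
                    : FieldSelectorResult.LAZY_LOAD; // everything else: defer
        }
    }

If that reading is right, then enableLazyFieldLoading only defers the fields that were not requested; the field being highlighted is always in the requested set, so its full 1 MB value is read eagerly regardless of the setting.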