RE: Memory use during merges (OOM)

2010-12-18 Thread Burton-West, Tom
Thanks Robert, We will try the termsIndexInterval as a workaround. I have also opened a JIRA issue: https://issues.apache.org/jira/browse/SOLR-2290. Hope I found the right sections of the Lucene code. I'm just now in the process of looking at the Solr IndexReaderFactory and SolrIndexWriter

Re: Memory use during merges (OOM)

2010-12-16 Thread Upayavira
How long does it take to reach this OOM situation? Is it possible for you to try a merge with each setting in turn, and evaluate what impact they each have? That is, indexing speed and memory consumption? It might be interesting to watch garbage collection too while it is running with jstat, as

Re: Memory use during merges (OOM)

2010-12-16 Thread Michael McCandless
RAM usage for merging is tricky. First off, merging must hold open a SegmentReader for each segment being merged. However, it's not necessarily a full segment reader; for example, merging doesn't need the terms index or norms. But it will load deleted docs. But, if you are doing deletions (or

RE: Memory use during merges (OOM)

2010-12-16 Thread Robert Petersen
understand the conclusion below. -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, December 16, 2010 2:51 AM To: solr-user@lucene.apache.org Subject: Re: Memory use during merges (OOM) RAM usage for merging is tricky. First off, merging must hold

Re: Memory use during merges (OOM)

2010-12-16 Thread Michael McCandless
that is bad? I didn't really understand the conclusion below. -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, December 16, 2010 2:51 AM To: solr-user@lucene.apache.org Subject: Re: Memory use during merges (OOM) RAM usage for merging

RE: Memory use during merges (OOM)

2010-12-16 Thread Burton-West, Tom
Thanks Mike, "But, if you are doing deletions (or updateDocument, which is just a delete + add under-the-hood), then this will force the terms index of the segment readers to be loaded, thus consuming more RAM." Out of 700,000 docs, by the time we get to doc 600,000, there is a good chance a few

RE: Memory use during merges (OOM)

2010-12-16 Thread Robert Petersen
settings I had not considered before. Rob -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, December 16, 2010 10:24 AM To: solr-user@lucene.apache.org Subject: Re: Memory use during merges (OOM) It's not that it's bad, it's just that Lucene

Re: Memory use during merges (OOM)

2010-12-16 Thread Michael McCandless
On Thu, Dec 16, 2010 at 2:09 PM, Burton-West, Tom tburt...@umich.edu wrote: Thanks Mike, "But, if you are doing deletions (or updateDocument, which is just a delete + add under-the-hood), then this will force the terms index of the segment readers to be loaded, thus consuming more RAM." Out of

Re: Memory use during merges (OOM)

2010-12-16 Thread Michael McCandless
McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, December 16, 2010 10:24 AM To: solr-user@lucene.apache.org Subject: Re: Memory use during merges (OOM) It's not that it's bad, it's just that Lucene must do extra work to check if these deletes are real or not, and that extra work

Re: Memory use during merges (OOM)

2010-12-16 Thread Robert Muir
On Thu, Dec 16, 2010 at 2:09 PM, Burton-West, Tom tburt...@umich.edu wrote: I always get confused about the two different divisors and their names in the solrconfig.xml file. This one (for the writer) isn't configurable by Solr. Want to open an issue? We are setting termInfosIndexDivisor,
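
A rough illustration of why the interval and divisor settings discussed in this thread matter for RAM. It assumes the usual Lucene behaviour that only every Nth term is written to the terms index (the interval, set at write time) and that a reader may then load only every Mth of those entries (the divisor, set at read time). The class name, method name, and the term count used in the usage note are made up for illustration; none of them come from the thread, and the result is an estimate of entry counts, not of bytes.

```java
/** Back-of-the-envelope estimate of terms-index entries held in RAM (illustrative only). */
final class TermsIndexEstimate {

    private TermsIndexEstimate() {}

    /**
     * @param uniqueTerms   total unique terms in the segment (hypothetical input)
     * @param indexInterval every indexInterval-th term is written to the terms index
     * @param indexDivisor  a reader loads every indexDivisor-th entry of that index
     * @return approximate number of index entries the reader keeps in memory
     */
    static long loadedIndexTerms(long uniqueTerms, int indexInterval, int indexDivisor) {
        if (uniqueTerms < 0 || indexInterval < 1 || indexDivisor < 1) {
            throw new IllegalArgumentException("counts must be non-negative, interval and divisor >= 1");
        }
        // Ceiling division: entries written to the on-disk terms index.
        long written = (uniqueTerms + indexInterval - 1) / indexInterval;
        // Ceiling division again: entries a reader with this divisor actually loads.
        return (written + indexDivisor - 1) / indexDivisor;
    }
}
```

For a hypothetical segment with 1,000,000,000 unique terms, an interval of 128 with divisor 1 gives 7,812,500 loaded entries, while interval 128 with divisor 8, or interval 1024 with divisor 1, both give 976,563. That is the sense in which a larger interval can stand in for a divisor that the merging reader does not honour.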

RE: Memory use during merges (OOM)

2010-12-16 Thread Burton-West, Tom
Your setting isn't being applied to the reader IW uses during merging... it's only for readers Solr opens from directories explicitly. I think you should open a JIRA issue! Do I understand correctly that this setting in theory could be applied to the reader IW uses during merging but is not

Re: Memory use during merges (OOM)

2010-12-16 Thread Robert Muir
On Thu, Dec 16, 2010 at 4:03 PM, Burton-West, Tom tburt...@umich.edu wrote: Your setting isn't being applied to the reader IW uses during merging... it's only for readers Solr opens from directories explicitly. I think you should open a JIRA issue! Do I understand correctly that this setting in

Re: Memory use during merges (OOM)

2010-12-16 Thread Yonik Seeley
On Thu, Dec 16, 2010 at 5:51 AM, Michael McCandless luc...@mikemccandless.com wrote: If you are doing false deletions (calling .updateDocument when in fact the Term you are replacing cannot exist) it'd be best if possible to change the app to not call .updateDocument if you know the Term
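
One way the "avoid false deletions" advice above might be applied on the application side, sketched without any Lucene dependency. DocSink is a hypothetical stand-in for the writer: add() plays the role of addDocument and update() the role of updateDocument. All class and method names here are invented for illustration. The sketch assumes the index is built from scratch in a single run, so an in-memory set of ids really does cover everything already indexed; if the index already contains documents, the set cannot prove an id is new and update() would still be needed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the two writer calls discussed in the thread. */
interface DocSink {
    void add(String id, String body);     // plays the role of addDocument
    void update(String id, String body);  // plays the role of updateDocument (delete + add)
}

/** Sends a document to add() when its id is known to be new, otherwise to update(). */
class FalseDeleteAvoider {
    private final Set<String> seenIds = new HashSet<>();
    private final DocSink sink;

    FalseDeleteAvoider(DocSink sink) {
        this.sink = sink;
    }

    void index(String id, String body) {
        // Set.add returns true only the first time an id is seen in this run.
        if (seenIds.add(id)) {
            sink.add(id, body);      // no delete is issued, so no false deletion
        } else {
            sink.update(id, body);   // a real replacement: the delete is genuine
        }
    }
}

/** Records which call was made for each id, so the routing can be inspected. */
class RecordingSink implements DocSink {
    final List<String> calls = new ArrayList<>();

    @Override
    public void add(String id, String body) {
        calls.add("add:" + id);
    }

    @Override
    public void update(String id, String body) {
        calls.add("update:" + id);
    }
}
```

Indexing ids "1", "2", "1" in that order through a RecordingSink records add:1, add:2, update:1. The id set itself costs heap, so this trades one kind of memory use for another and is only a sketch of the idea, not a recommendation from the thread.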