Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Hoping I can get a better response with a more directed question:

With facet queries and the fields used, what qualifies as a large number
of values?  The wiki uses U.S. states as an example, so the number of unique
values = 50.  More to the point, is there an algorithm that I can use to
estimate the cache consumption rate for facet queries?

-- j




On 4/1/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:


I've read through the list entries here, the Lucene list, and the wiki
docs, and still haven't resolved a major pain point for us.  We've been trying
to determine what could possibly cause us to hit this in our environment,
and I'm hoping more eyes on this issue can help.

Our scenario: 150MB index, ~144K documents, read/write servers in place
using standard replication.  Running Tomcat 5.5.17 on Red Hat Enterprise
Linux 4.  Java is configured to start with -Xmx1024m.  We encounter java heap
out-of-memory issues on the read server at staggered times, but usually once
every 48 hours.  Search request load is roughly 2 searches every 3 seconds,
with some spikes here and there.  We are using facets: 3 are based on type
integer, one is based on type string.  We are using sorts: 1 is based on
type sint, 2 are based on type date.  Caching is disabled.  The Solr bits
are from September 2006.

Is there anything in that configuration that we should interrogate?

thanks,
j



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Mike Klaas

On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

Hoping I can get a better response with a more directed question:


I haven't answered your original question, as it seems that general
Java memory-debugging techniques would be the most useful thing here.


With facet queries and the fields used, what qualifies as a large number
of values?  The wiki uses U.S. states as an example, so the number of unique
values = 50.  More to the point, is there an algorithm that I can use to
estimate the cache consumption rate for facet queries?


The cache consumption rate is one entry per unique value across all
faceted fields, excluding fields whose faceting is satisfied via the
FieldCache (single-valued fields with exactly one token per document).

The size of each cached filter is num docs / 8 bytes, unless the
number of matching docs is less than the useHashSet threshold in
solrconfig.xml.
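
To put rough numbers on that, here's a back-of-the-envelope sketch (my own
illustration, not Solr code; it assumes every unique value ends up as a full
BitSet filter and ignores the cheaper HashDocSet case):

// Worst-case facet filter memory: one BitSet per unique value,
// each sized at one bit per document in the index.
public class FacetFilterEstimate {
    static long facetFilterBytes(long maxDoc, long uniqueValues) {
        long bytesPerFilter = maxDoc / 8;   // one bit per doc
        return uniqueValues * bytesPerFilter;
    }

    public static void main(String[] args) {
        // The wiki's U.S. states example on a 10M-doc index:
        // 50 * (10,000,000 / 8) = 62.5MB
        System.out.println(facetFilterBytes(10000000L, 50));
    }
}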

Sorting requires FieldCache population, which consists of an integer
per document plus the sum of the lengths of the unique values in the
field (less for pure int/float fields, but I'm not sure if Solr's sint
qualifies).
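
Again a rough sketch only (it ignores per-object and per-array overhead,
which can be substantial for short values; this method could sit next to the
one above):

// Approximate FieldCache memory for sorting on a string-like field:
// one int ord per document, plus the unique value strings themselves
// (Java chars are 2 bytes each).
// e.g. 1M docs, 100K unique values averaging 10 chars:
// 4,000,000 + 2,000,000 = ~6MB
static long sortFieldCacheBytes(long maxDoc, long uniqueValues,
                                long avgValueLength) {
    long ords = maxDoc * 4L;                          // one int per doc
    long values = uniqueValues * avgValueLength * 2L; // UTF-16 chars
    return ords + values;
}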

Neither faceting nor sorting should consume more memory after its data
structures have been built, so it would be odd to see an OOM after 48
hours if they were the cause.

-Mike


Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Yonik Seeley

On 4/1/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

Our scenario: 150MB index, ~144K documents, read/write servers in place
using standard replication.  Running Tomcat 5.5.17 on Red Hat Enterprise
Linux 4.  Java is configured to start with -Xmx1024m.  We encounter java heap
out-of-memory issues on the read server at staggered times, but usually once
every 48 hours.


Could you do a grep through your server logs for WARNING, to
eliminate the possibility of multiple overlapping searchers causing
the OOM issue?

Are you doing incremental updates?  If so, try lowering your
mergeFactor for the index, or optimize more frequently.  As an index
is incrementally updated, old docs are marked as deleted and new docs
are added.  This leaves holes in the document id space, which can
increase memory usage.  Both BitSet filters and FieldCache entries
are sized proportionally to maxDoc (the maximum internal docid in
the index).
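
For a hypothetical index with maxDoc = 1,000,000, for example:

  per cached BitSet filter:      1,000,000 / 8       = 125,000 bytes
  FieldCache ints (per field):   1,000,000 * 4 bytes = 4,000,000 bytes

and those costs don't shrink when docs are deleted, only when the holes
are squeezed out by an optimize.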

You can see maxDoc from the statistics page... there might be a correlation.

-Yonik


Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/1/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 Our scenario: 150MB index, ~144K documents, read/write servers in place
 using standard replication.  Running Tomcat 5.5.17 on Red Hat Enterprise
 Linux 4.  Java is configured to start with -Xmx1024m.  We encounter java
 heap out-of-memory issues on the read server at staggered times, but
 usually once every 48 hours.

Could you do a grep through your server logs for WARNING, to
eliminate the possibility of multiple overlapping searchers causing
the OOM issue?



We're not seeing warnings for overlapping searchers prior to the OOM
events.  Only SEVERE -- java.lang.OutOfMemoryError: Java heap space.

Are you doing incremental updates?  If so, try lowering your
mergeFactor for the index, or optimize more frequently.  As an index
is incrementally updated, old docs are marked as deleted and new docs
are added.  This leaves holes in the document id space, which can
increase memory usage.  Both BitSet filters and FieldCache entries
are sized proportionally to maxDoc (the maximum internal docid in
the index).

You can see maxDoc from the statistics page... there might be a
correlation.



We are doing incremental updates, and we optimize quite a bit.  mergeFactor
is presently set to 10.
maxDoc count = 144156
numDocs count = 144145
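
(If I'm following the math, each full BitSet filter for us should only be
about 144156 / 8 = ~18K bytes, and with maxDoc - numDocs = 11 we have almost
no holes in the id space, so deleted docs don't look like our problem.)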


Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Thanks for the pointers, Mike.  I'm trying to work out the math behind some
strange numbers we're seeing.  Here are the top dozen lines from a jmap
analysis of a heap dump:

Size        Count     Class description
---------------------------------------------------
428246064   1792204   int[]
93175176    3213131   char[]
77195040    3216460   java.lang.String
67479112    3945      long[]
53073888    1658559   java.util.LinkedHashMap$Entry
39668352    1652848   org.apache.solr.search.HashDocSet
28195280    27131     byte[]
27165456    1697841   org.apache.lucene.index.Term
27024016    1689001   org.apache.lucene.search.TermQuery
22265920    695810    org.apache.lucene.document.Field
4931568     5974      java.lang.Object[]
4366768     77978     org.apache.lucene.store.FSIndexInput

I see the HashDocSet numbers (count = 1.65 million), assume they hold
references to the int arrays (count = 1.79 million), and wonder how I could
have so many of those in memory.  A few more data tidbits:

- Facet field Id1 = type int, unique values = 2710
- Facet field Id2 = type int, unique values = 65
- Facet field Id3 = type string, unique values = 15179
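
Some back-of-the-envelope division on the dump, in case my reading is off:
428246064 / 1792204 = ~239 bytes per int[], i.e. roughly 55 ints each once
you subtract array overhead, which is about the size I'd expect for
HashDocSet doc-id arrays.  But 1,652,848 HashDocSets is roughly 90x our
total unique facet values (2710 + 65 + 15179 = 17954), and that's the part
I can't account for.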

Thanks for the extra eyes on this, much appreciated.

-- j



On 4/2/07, Mike Klaas [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 With facet queries and the fields used, what qualifies as a large number
 of values?  The wiki uses U.S. states as an example, so the number of unique
 values = 50.  More to the point, is there an algorithm that I can use to
 estimate the cache consumption rate for facet queries?

The cache consumption rate is one entry per unique value across all
faceted fields, excluding fields whose faceting is satisfied via the
FieldCache (single-valued fields with exactly one token per document).

The size of each cached filter is num docs / 8 bytes, unless the
number of matching docs is less than the useHashSet threshold in
solrconfig.xml.

Sorting requires FieldCache population, which consists of an integer
per document plus the sum of the lengths of the unique values in the
field (less for pure int/float fields, but I'm not sure if Solr's sint
qualifies).

Neither faceting nor sorting should consume more memory after its data
structures have been built, so it would be odd to see an OOM after 48
hours if they were the cause.

-Mike



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Yonik Seeley

On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

We are doing incremental updates, and we optimize quite a bit.  mergeFactor
is presently set to 10.
maxDoc count = 144156
numDocs count = 144145


What version of Solr are you using?  Another potential OOM (multiple
threads generating the same FieldCache entry) was fixed in later
versions of Lucene included with Solr.
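
For illustration, the failure mode is the classic unsynchronized-cache race
(a schematic sketch, not Lucene's actual FieldCache code):

import java.util.HashMap;
import java.util.Map;

// Two threads can both see a miss and both build the large entry,
// briefly doubling the memory needed for that field.
class RacyFieldCache {
    private final Map<String, int[]> cache = new HashMap<String, int[]>();
    private final int maxDoc = 1000000; // illustrative index size

    int[] getOrds(String field) {
        int[] entry = cache.get(field);
        if (entry == null) {          // threads A and B can both get here...
            entry = new int[maxDoc];  // ...and both allocate maxDoc ints
            cache.put(field, entry);
        }
        return entry;
    }
}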

-Yonik


Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Sorry for the confusion.  We do have caching disabled.  I was asking the
question because I wasn't certain whether the configurable cache settings
applied throughout, or whether Lucene's FieldCache still came into play.

The two integer-based facets are single valued per document.  The
string-based facet is multiValued.



On 4/2/07, Chris Hostetter [EMAIL PROTECTED] wrote:



: values = 50.  More to the point, is there an algorithm that I can use to
: estimate the cache consumption rate for facet queries?

I'm confused ... I thought you said in your original mail that you had
all the caching disabled?  (except for FieldCache, which is so low-level in
Lucene that it's always used)

are the fields you are faceting on multiValued or single valued?


-Hoss




Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

The major version is 1.0.  The bits are from a nightly build from early
September 2006.

We do have plans to upgrade Solr soon.

On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 We are doing incremental updates, and we optimize quite a bit.
 mergeFactor is presently set to 10.
 maxDoc count = 144156
 numDocs count = 144145

What version of Solr are you using?  Another potential OOM (multiple
threads generating the same FieldCache entry) was fixed in later
versions of Lucene included with Solr.

-Yonik



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Jeff Rodenburg

Yonik - is this the JIRA entry you're referring to?

http://issues.apache.org/jira/browse/LUCENE-754



On 4/2/07, Yonik Seeley [EMAIL PROTECTED] wrote:


On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:
 We are doing incremental updates, and we optimize quite a bit.
 mergeFactor is presently set to 10.
 maxDoc count = 144156
 numDocs count = 144145

What version of Solr are you using?  Another potential OOM (multiple
threads generating the same FieldCache entry) was fixed in later
versions of Lucene included with Solr.

-Yonik



Re: Troubleshooting java heap out-of-memory

2007-04-02 Thread Yonik Seeley

On 4/2/07, Jeff Rodenburg [EMAIL PROTECTED] wrote:

Yonik - is this the JIRA entry you're referring to?

http://issues.apache.org/jira/browse/LUCENE-754


Yes.  But from the heap dump you provided, that doesn't look like the issue.

-Yonik