Thanks Mike,
> OK. It would be good to know where all your RAM is being consumed,
> and how much of that is really the terms index: it ought to be a very
> small part of it.
>
I made a bunch of heap dumps. I just watched with jconsole and ran jmap -histo
when memory use got high.
I've appended a bit more of the error trace and the top memory users
from one of the heap dumps below.
I tried to send a bunch of heap dumps to the mailing list but the message
got rejected. I'll send them directly to you.
Tom
----
java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.<init>(FreqProxTermsWriterPerField.java:212)
    at org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray.newInstance(FreqProxTermsWriterPerField.java:230)
    at org.apache.lucene.index.ParallelPostingsArray.grow(ParallelPostingsArray.java:48)
    at org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray.grow(TermsHashPerField.java:252)
    at org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:292)
    at org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:151)
    at org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:659)
    at org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:359)
---
Top memory users from one of the heap dumps (jmap -histo columns: num, #instances, #bytes, class name):
 1:  1131932  2546933736  [B
 2:   308670   743033280  [I
 3:   696803   203038680  [C
 4:   383039    36771744  org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter$IntBlockTermState
 5:  1089864    26156736  org.apache.lucene.util.AttributeSource$State
 6:   544870    26153760  org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl
 7:   687500    16500000  org.apache.lucene.util.BytesRef
 8:   135820     9779040  org.apache.lucene.util.fst.FST$Arc
 9:   382519     9180456  org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$PendingTerm
10:   382037     9168888  org.apache.lucene.codecs.TermStats
11:   544952     8719232  org.apache.lucene.util.BytesRefBuilder
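
For anyone reading the histogram: [B, [I and [C are the JVM's descriptors for byte[], int[] and char[], so the top three entries are all primitive arrays. Here is a minimal sketch for totalling them, assuming each histogram row has four whitespace-separated columns (num, #instances, #bytes, class name). The class HistoSummary and its helper methods are hypothetical names made up for illustration; they are not part of Lucene or the JDK.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for summarizing rows copied from `jmap -histo` output.
public class HistoSummary {

    /** Translates the JVM array descriptors seen in the histogram into readable names. */
    static String readable(String className) {
        switch (className) {
            case "[B": return "byte[]";
            case "[I": return "int[]";
            case "[C": return "char[]";
            default:   return className; // ordinary class names pass through unchanged
        }
    }

    /** Returns the #bytes column (third field) of one histogram row. */
    static long bytesOf(String row) {
        String[] cols = row.trim().split("\\s+");
        return Long.parseLong(cols[2]);
    }

    /** Sums the #bytes column over all given rows. */
    static long totalBytes(List<String> rows) {
        long total = 0;
        for (String row : rows) {
            total += bytesOf(row);
        }
        return total;
    }

    public static void main(String[] args) {
        // The three largest entries from the heap histogram in this message.
        List<String> top = Arrays.asList(
                "1: 1131932 2546933736 [B",
                "2: 308670 743033280 [I",
                "3: 696803 203038680 [C");

        for (String row : top) {
            String[] cols = row.trim().split("\\s+");
            System.out.println(readable(cols[3]) + ": " + cols[2] + " bytes");
        }
        double gib = totalBytes(top) / (1024.0 * 1024.0 * 1024.0);
        System.out.println("total of top three: " + gib + " GiB");
    }
}
```

The histogram only shows which types hold the memory, not who owns the arrays; attributing them to the terms index or to the indexing buffers would need the full heap dumps.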