I'm running Lucene 2.3.1 with Java 1.5.0_14 on 64-bit Linux. We have fairly large collections (roughly 1 GB of collection files, ~1,000,000 documents). When I load test our application with 50 users, all doing simple searches via a web interface, we quickly get an OutOfMemoryError. When I take a jmap dump of the heap, this is what I see:
Size       Count    Class description
-------------------------------------------------------
195818576  4263822  char[]
190889608    13259  byte[]
172316640  4307916  java.lang.String
164813120  4120328  org.apache.lucene.index.TermInfo
131823104  4119472  org.apache.lucene.index.Term
 37729184      604  org.apache.lucene.index.TermInfo[]
 37729184      604  org.apache.lucene.index.Term[]

So four of the top seven memory consumers are Term-related. We have 2 GB of RAM available on the system, but we get OOM errors no matter what Java heap settings we use. Has anyone seen this issue, and does anyone know how to solve it?

We use a separate MultiSearcher instance for each search. (We actually have two collections that we search via a MultiSearcher.) We tried using a singleton searcher instance, but our collections are constantly being updated, and a singleton searcher only returns results as of the time it was opened. Creating new searcher objects at search time gives up-to-the-minute results. A rough sketch of this per-search pattern is at the end of this message.

I've seen some postings referring to an index divisor setting that could reduce the number of Terms held in memory, but I have not seen how to set this value in Lucene.

Any help would be greatly appreciated.

Rich
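For reference, this is roughly what our per-search setup looks like. The paths, field names, and class name below are placeholders rather than our real configuration, and the snippet is simplified from our actual web code:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MultiSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searchable;

public class PerRequestSearch {

    // Opens fresh searchers on every request so results include the latest updates.
    public String[] search(String queryText) throws Exception {
        IndexSearcher s1 = new IndexSearcher("/indexes/collectionA");  // placeholder path
        IndexSearcher s2 = new IndexSearcher("/indexes/collectionB");  // placeholder path
        MultiSearcher searcher = new MultiSearcher(new Searchable[] { s1, s2 });
        try {
            Query query = new QueryParser("contents", new StandardAnalyzer()).parse(queryText);
            Hits hits = searcher.search(query);
            // Copy what the web layer needs before closing, since Hits loads documents lazily.
            String[] ids = new String[Math.min(hits.length(), 50)];
            for (int i = 0; i < ids.length; i++) {
                ids[i] = hits.doc(i).get("id");  // placeholder stored field
            }
            return ids;
        } finally {
            searcher.close();  // also closes the two underlying IndexSearchers
        }
    }
}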