Hi all,
When we run the first query after starting up Solr, memory use goes up from
about 1GB to 15GB and never goes below that level. In debugging a recent OOM
problem I ran jmap with the output appended below. Not surprisingly, given the
size of our indexes, it looks like the TermInfo and Term data structures which
are the in-memory representation of the tii file are taking up most of the
memory. This is Solr running under Tomcat with 16GB allocated to the JVM, and 3
shards, each with a tii file of about 600MB.
Total index size is about 400GB for each shard (we are indexing about 600,000
full-text books in each shard).
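We haven't changed the term index interval or divisor from the defaults. If it's
relevant, my understanding is that the divisor can be raised in solrconfig.xml
roughly like this (an untested sketch; the factory class and parameter name are
my reading of the Solr 1.4 docs, so please correct me if that's wrong):

```xml
<!-- Load only every Nth entry of the tii file into memory.
     A divisor of 4 should cut term-index memory roughly 4x,
     at the cost of somewhat slower term lookups. -->
<indexReaderFactory name="IndexReaderFactory"
                    class="solr.StandardIndexReaderFactory">
  <int name="setTermIndexDivisor">4</int>
</indexReaderFactory>
```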
In interpreting the jmap output, can we assume that the entries for character
arrays ("[C"), java.lang.String, long arrays ("[J"), and int arrays ("[I") are
all part of the data structures involved in representing the tii file in
memory?
Tom Burton-West
http://www.hathitrust.org/blogs/large-scale-search
(jmap output, commas in numbers added)
 num     #instances          #bytes  class name
------------------------------------------------------
   1:    82,496,803   4,273,137,904  [C
   2:    82,498,673   3,299,946,920  java.lang.String
   3:    27,810,887   1,112,435,480  org.apache.lucene.index.TermInfo
   4:    27,533,080   1,101,323,200  org.apache.lucene.index.TermInfo
   5:    27,115,577   1,084,623,080  org.apache.lucene.index.TermInfo
   6:    27,810,894     889,948,608  org.apache.lucene.index.Term
   7:    27,533,088     881,058,816  org.apache.lucene.index.Term
   8:    27,115,589     867,698,848  org.apache.lucene.index.Term
   9:           148     659,685,520  [J
  10:             2     222,487,072  [Lorg.apache.lucene.index.Term;
  11:             2     222,487,072  [Lorg.apache.lucene.index.TermInfo;
  12:             2     220,264,600  [Lorg.apache.lucene.index.Term;
  13:             2     220,264,600  [Lorg.apache.lucene.index.TermInfo;
  14:             2     216,924,560  [Lorg.apache.lucene.index.Term;
  15:             2     216,924,560  [Lorg.apache.lucene.index.TermInfo;
  16:       737,060     155,114,960  [I
  17:       627,793      35,156,408  java.lang.ref.SoftReference
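For anyone double-checking our interpretation, here is the back-of-envelope
arithmetic on the counts above (a rough sketch; it assumes one text String,
backed by one char[], per Term, which I believe holds since field-name Strings
are interned and so contribute few extra instances):

```python
# jmap instance counts from the listing above.
term_counts = [27_810_894, 27_533_088, 27_115_589]  # Term instances per shard
terms = sum(term_counts)
strings = 82_498_673       # java.lang.String instances
char_arrays = 82_496_803   # [C instances
char_bytes = 4_273_137_904 # total bytes in char arrays

print(f"Terms: {terms:,}  Strings: {strings:,}  char[]: {char_arrays:,}")
# The three counts agree to within a fraction of a percent, which is
# consistent with the term dictionary accounting for nearly all Strings
# and char arrays in the heap.
print(f"Strings per Term: {strings / terms:.4f}")
print(f"avg bytes per char[]: {char_bytes / char_arrays:.1f}")
```

If that reading is right, almost all of the String and char[] memory belongs
to the in-memory tii representation.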