We do have a custom query parser. It expands the user's input query into a set of prefix, phrase and regular boolean queries, in a manner similar to what DisMax does.
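To make the clause fan-out of such an expansion concrete, here is a minimal sketch. It is not the actual parser from this setup: the class name, the field names and the exact expansion rules (one term clause per word, one prefix clause on the last word, one phrase clause for multi-word input) are assumptions for illustration, and it builds Lucene query-syntax strings rather than real Query objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: hypothetical expansion rules, plain strings instead of Lucene Query objects. */
final class QueryExpansionSketch {

    /**
     * Expands raw user input into a flat list of optional clauses.
     * Per field: one term clause per word, one prefix clause on the last
     * word, and one phrase clause when the input has more than one word.
     * No escaping of query-syntax special characters is attempted.
     */
    static List<String> expand(String userInput, List<String> fields) {
        List<String> clauses = new ArrayList<>();
        String trimmed = userInput.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return clauses; // nothing to expand
        }
        String[] words = trimmed.split("\\s+");
        for (String field : fields) {
            for (String word : words) {
                clauses.add(field + ":" + word);                          // regular term clause
            }
            clauses.add(field + ":" + words[words.length - 1] + "*");     // prefix clause
            if (words.length > 1) {
                clauses.add(field + ":\"" + String.join(" ", words) + "\""); // phrase clause
            }
        }
        return clauses;
    }

    /** Joins the clauses into one OR-ed query string. */
    static String toQueryString(List<String> clauses) {
        return String.join(" OR ", clauses);
    }
}
```

Under these assumed rules, a call such as `QueryExpansionSketch.expand("solr gc", Arrays.asList("title", "body"))` yields four clauses per field, so the clause count grows with both the number of words and the number of fields searched.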
Analyzing the heap with jhat/YourKit is on my list of things to do, but I haven't gotten around to it yet. Our big heap size (13G) makes it a little difficult to do a full-blown heap dump analysis.

Thanks a ton for the reply, Otis!

Prasanna

On Mon, Nov 12, 2012 at 5:42 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Hi,
>
> I've never seen this. You don't have a custom query parser or anything
> else custom, do you?
> Have you tried dumping and analyzing heap? YourKit has a 7 day eval, or
> you can use things like jhat, which may be included on your machine already
> (see http://docs.oracle.com/javase/6/docs/technotes/tools/share/jhat.html).
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Mon, Nov 12, 2012 at 8:35 PM, Prasanna R <plistma...@gmail.com> wrote:
>
> > We have been using Solr in a custom setup where we generate results for
> > user queries by expanding it to a large boolean query consisting of
> > multiple prefix queries. There have been some GC issues recently with the
> > Old/tenured generation becoming nearly 100% full leading to near constant
> > full GC cycles.
> >
> > We are running Solr 3.1 on servers with 13G of heap.
> > jmap live object histogram is as follows:
> >
> >  num   #instances      #bytes  class name
> > ----------------------------------------------
> >    1:    27441222  1550723760  [Ljava.lang.Object;
> >    2:    23546318   879258496  [C
> >    3:    23813405   762028960  java.lang.String
> >    4:    22700095   726403040  org.apache.lucene.search.BooleanQuery
> >    5:    27431515   658356360  java.util.ArrayList
> >    6:    22911883   549885192  org.apache.lucene.search.BooleanClause
> >    7:    21651039   519624936  org.apache.lucene.index.Term
> >    8:     6876651   495118872  org.apache.lucene.index.FieldsReader$LazyField
> >    9:    11354214   363334848  org.apache.lucene.search.PrefixQuery
> >   10:     4281624   137011968  java.util.HashMap$Entry
> >   11:     3466680    83200320  org.apache.lucene.search.TermQuery
> >   12:     1987450    79498000  org.apache.lucene.search.PhraseQuery
> >   13:      631994    70148624  [Ljava.util.HashMap$Entry;
> > .....
> >
> > I have looked at the Solr cache settings multiple times but am not able to
> > figure out how/why the high number of BooleanQuery and BooleanClause object
> > instances stay alive. These objects are live and do not get collected even
> > when the traffic is disabled and a manual GC is triggered which indicates
> > that someone is holding onto references.
> >
> > Can anyone provide more details on the circumstances under which these
> > objects stay alive and/or cached? If they are cached then is the caching
> > configurable?
> >
> > Any and all tips/suggestions/pointers will be much appreciated.
> >
> > Thanks,
> >
> > Prasanna
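
Short of a full heap dump, the histogram quoted above can be totalled per package to see how much of the heap the retained query objects account for. The following is a minimal sketch, assuming the raw `jmap -histo:live` rows (without the mail quoting) are available as a list of strings; the class and method names are made up for illustration.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper: sums the #bytes column of jmap histogram rows by class-name prefix. */
final class HistogramSummary {

    // Matches rows such as "   4:  22700095  726403040  org.apache.lucene.search.BooleanQuery".
    // Anything after the class name (newer JDKs append a module column) is ignored.
    private static final Pattern ROW =
            Pattern.compile("^\\s*\\d+:\\s+(\\d+)\\s+(\\d+)\\s+(\\S+).*$");

    /** Returns the total bytes of all rows whose class name starts with classPrefix. */
    static long bytesForPrefix(List<String> lines, String classPrefix) {
        long total = 0;
        for (String line : lines) {
            Matcher m = ROW.matcher(line);
            // Header, separator and "....." lines do not match and are skipped.
            if (m.matches() && m.group(3).startsWith(classPrefix)) {
                total += Long.parseLong(m.group(2));
            }
        }
        return total;
    }
}
```

Calling `HistogramSummary.bytesForPrefix(lines, "org.apache.lucene.search.")` on the thirteen rows shown adds up the BooleanQuery, BooleanClause, PrefixQuery, TermQuery and PhraseQuery entries. This counts shallow sizes only, so the strings, terms and lists those queries reference are not included.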