We do have a custom query parser. It expands the user's input query into a
set of prefix, phrase, and regular boolean queries, in a manner similar to
DisMax.

Analyzing the heap with jhat or YourKit is on my list of things to do, but I
haven't gotten around to it yet. Our large heap (13G) makes a full-blown
heap dump analysis a little difficult.
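
Until a full dump is practical, a lighter-weight check is to total the #bytes
column of the jmap histogram for one package, to see roughly how much of the
heap the query objects account for. Below is a minimal sketch, assuming rows
in the format of the histogram quoted further down. HistoSummary and
bytesForPrefix are hypothetical names, not part of Solr or the JDK, and the
figures are shallow sizes only, so they understate what is actually retained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Solr or the JDK): totals the #bytes column
// of jmap histogram rows whose class name starts with a given prefix.
class HistoSummary {

    /** Sums #bytes over histogram rows whose class name starts with prefix. */
    static long bytesForPrefix(List<String> lines, String prefix) {
        long total = 0;
        for (String line : lines) {
            // Expected row shape: "<rank>:  <#instances>  <#bytes>  <class name>"
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 4 || !cols[0].endsWith(":")) {
                continue; // header, separator, or blank line
            }
            if (cols[3].startsWith(prefix)) {
                total += Long.parseLong(cols[2]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Sample rows copied from the histogram in this thread.
        List<String> sample = Arrays.asList(
            "   3:      23813405      762028960  java.lang.String",
            "   4:      22700095      726403040  org.apache.lucene.search.BooleanQuery",
            "   6:      22911883      549885192  org.apache.lucene.search.BooleanClause",
            "   9:      11354214      363334848  org.apache.lucene.search.PrefixQuery",
            "  11:       3466680       83200320  org.apache.lucene.search.TermQuery",
            "  12:       1987450       79498000  org.apache.lucene.search.PhraseQuery");
        // Prints 1802321400, i.e. about 1.8 GB of shallow size in query objects.
        System.out.println(bytesForPrefix(sample, "org.apache.lucene.search."));
    }
}
```

Running the same method over successive histograms taken a few minutes apart
would show whether that total keeps growing while traffic is disabled.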

Thanks a ton for the reply Otis!

Prasanna

On Mon, Nov 12, 2012 at 5:42 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Hi,
>
> I've never seen this.  You don't have a custom query parser or anything
> else custom, do you?
> Have you tried dumping and analyzing heap?  YourKit has a 7 day eval, or
> you can use things like jhat, which may be included on your machine already
> (see http://docs.oracle.com/javase/6/docs/technotes/tools/share/jhat.html).
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm/index.html
>
>
> On Mon, Nov 12, 2012 at 8:35 PM, Prasanna R <plistma...@gmail.com> wrote:
>
> > We have been using Solr in a custom setup where we generate results for
> > user queries by expanding each one into a large boolean query consisting
> > of multiple prefix queries. There have been some GC issues recently, with
> > the Old/tenured generation becoming nearly 100% full, leading to
> > near-constant full GC cycles.
> >
> > We are running Solr 3.1 on servers with 13G of heap. jmap live object
> > histogram is as follows:
> >
> > num     #instances         #bytes  class name
> > ----------------------------------------------
> >    1:      27441222     1550723760  [Ljava.lang.Object;
> >    2:      23546318      879258496  [C
> >    3:      23813405      762028960  java.lang.String
> >    4:      22700095      726403040  org.apache.lucene.search.BooleanQuery
> >    5:      27431515      658356360  java.util.ArrayList
> >    6:      22911883      549885192
>  org.apache.lucene.search.BooleanClause
> >    7:      21651039      519624936  org.apache.lucene.index.Term
> >    8:       6876651      495118872
> > org.apache.lucene.index.FieldsReader$LazyField
> >    9:      11354214      363334848  org.apache.lucene.search.PrefixQuery
> >   10:       4281624      137011968  java.util.HashMap$Entry
> >   11:       3466680       83200320  org.apache.lucene.search.TermQuery
> >   12:       1987450       79498000  org.apache.lucene.search.PhraseQuery
> >   13:        631994       70148624  [Ljava.util.HashMap$Entry;
> > .....
> >
> > I have looked at the Solr cache settings multiple times but am not able
> > to figure out how/why the high number of BooleanQuery and BooleanClause
> > object instances stay alive. These objects are live and do not get
> > collected even when the traffic is disabled and a manual GC is triggered
> > which indicates that someone is holding onto references.
> >
> > Can anyone provide more details on the circumstances under which these
> > objects stay alive and/or cached? If they are cached then is the caching
> > configurable?
> >
> > Any and all tips/suggestions/pointers will be much appreciated.
> >
> > Thanks,
> >
> > Prasanna
> >
>
