Re: request handler and caches

Chris Hostetter Wed, 10 May 2006 21:56:57 -0700

I was so preoccupied with trying to understand why your cache wasn't
working that I didn't even register what you said about how you are using
it...


: My cache is really just a static cache of BitSet's for a fixed set of
: fields and their values.  With my current index size, creating the
: cache is incredibly fast (a second or so), but the index will grow
: much larger.

        ...

: For a fixed set of fields (currently 4 or so of them) I'm building a
: HashMap keyed by field name, with the values of each key also a
: HashMap, keyed by term value.  The value of the inner HashMap is a
: BitSet representing all documents that have that value for that
: field.  These BitSets are used for a faceted browser and ANDed
: together based on user criteria, as well as combined with full-text
: queries using QueryFilter's BitSet.  Nothing fancy, and perhaps
: something Solr already helps provide?

Solr definitely makes this easier.  All you really need to keep track of
(either in your user cache, or in hardcoded logic) is the Queries you want
to have faceting on (TermQueries grouped by field, it sounds like).  If you
want to know how many docs any two facets have in common (or that
your user's query has in common with a facet), use...

    int count = searcher.numDocs(facetQ1, facetQ2);

...or if you just want to know the number of docs in a single facet, use
searcher.getDocSet(q).size().  (There's also getDocSet(List<Query>) if you
have an arbitrary number of facets you want to intersect.)
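
For comparison, here is a rough sketch of what those counts amount to,
written against plain java.util.BitSet the way your current cache works.
It is only an illustration: the class name FacetCountSketch and its
methods are made up for this example, and it says nothing about how Solr
implements DocSets internally.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the hand-rolled cache described above:
// field name -> term value -> BitSet of matching document ids.
public class FacetCountSketch {
    private final Map<String, Map<String, BitSet>> facets = new HashMap<>();

    // Record that docId has the given value for the given field.
    public void add(String field, String value, int docId) {
        facets.computeIfAbsent(field, f -> new HashMap<>())
              .computeIfAbsent(value, v -> new BitSet())
              .set(docId);
    }

    // Return a copy of the doc set for one facet (empty if unknown),
    // so callers can AND it without corrupting the cached bits.
    public BitSet docs(String field, String value) {
        Map<String, BitSet> byValue = facets.get(field);
        BitSet bits = (byValue == null) ? null : byValue.get(value);
        return (bits == null) ? new BitSet() : (BitSet) bits.clone();
    }

    // Number of documents two doc sets have in common.
    public static int intersectionCount(BitSet a, BitSet b) {
        BitSet both = (BitSet) a.clone();
        both.and(b);
        return both.cardinality();
    }

    public static void main(String[] args) {
        FacetCountSketch cache = new FacetCountSketch();
        for (int doc : new int[] {0, 1, 3}) cache.add("color", "red", doc);
        for (int doc : new int[] {1, 2, 3}) cache.add("size", "large", doc);

        // Docs 1 and 3 are both red and large.
        System.out.println(intersectionCount(
                cache.docs("color", "red"), cache.docs("size", "large")));
    }
}
```

With Solr you would not maintain this map yourself; the numDocs and
getDocSet calls above give you the same counts.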

Just about all of the methods in SolrIndexSearcher will automatically
cache the resulting DocSet in the filterCache, so that any time you do
anything involving those Queries no actual search is done, and
the cache will be autowarmed whenever a newSearcher is opened.

If you size the filterCache big enough, and register a seed query in the
firstSearcher listener, you'll never spend time waiting for any of the
facet DocSets to be calculated.



-Hoss
