On Fri, Oct 3, 2014 at 4:35 PM, Shawn Heisey <apa...@elyograg.org> wrote:
> On 10/3/2014 1:57 PM, Yonik Seeley wrote:
>> On Fri, Oct 3, 2014 at 3:42 PM, Peter Keegan <peterlkee...@gmail.com> wrote:
>>> Say I have a boolean field named 'hidden', and less than 1% of the
>>> documents in the index have hidden=true.
>>> Do both these filter queries use the same docset cache size? :
>>> fq=hidden:false
>>> fq=!hidden:true
>>
>> Nope... !hidden:true will be smaller in the cache (it will be cached
>> as hidden:true and then inverted)
>> The downside is that you'll pay the cost of that inversion.
>
> I would think that unless it's using hashDocSet, the cached data for
> every filter would always be the same size. The wiki says that
> hashDocSet is no longer used for filter caching as of 1.4.0. Is that
> actually true?

Yes, SortedIntDocSet is used instead. It stores an int per match
(i.e. 4 bytes per match). This change was made so in-order traversal
could be done efficiently.

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
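
To put rough numbers on the "4 bytes per match" figure, here is a back-of-the-envelope sketch in Python. It is not Solr code. The index size (10 million documents) and the 1% match rate are hypothetical values chosen for illustration, and the one-bit-per-document bitset used as the dense comparison is an assumption that is not stated in this thread.

```python
# Back-of-the-envelope estimate of cached filter sizes.
# Only the "4 bytes per match" figure comes from the thread; everything
# else here (index size, match rate, bitset layout) is an assumption.

BYTES_PER_INT = 4


def sorted_int_bytes(num_matches: int) -> int:
    """Size of a sorted int list: one 4-byte int per matching document."""
    return num_matches * BYTES_PER_INT


def bitset_bytes(max_doc: int) -> int:
    """Size of a dense bitset: one bit per document in the index,
    rounded up to whole 64-bit words (assumed layout)."""
    words = (max_doc + 63) // 64
    return words * 8


max_doc = 10_000_000        # hypothetical index size
hidden_true = 100_000       # hypothetical: 1% of documents have hidden=true
hidden_false = max_doc - hidden_true

# fq=!hidden:true is cached as hidden:true, so it only pays for the few matches.
print("hidden:true  as sorted ints:", sorted_int_bytes(hidden_true), "bytes")

# fq=hidden:false matches almost everything; a sorted int list would be huge,
# so a dense representation is the cheaper of the two options modelled here.
print("hidden:false as sorted ints:", sorted_int_bytes(hidden_false), "bytes")
print("hidden:false as bitset:     ", bitset_bytes(max_doc), "bytes")
```

Under these assumptions the sparse filter costs about 400 KB, while the filter that matches nearly every document costs about 1.25 MB as a bitset, which is why the two fq forms do not occupy the same amount of cache.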