On Mon, Jun 29, 2009 at 9:33 AM, Nigel <nigelspl...@gmail.com> wrote:
> Ah, I was confused by the index divisor being 1 by default: I thought it
> meant that all terms were being loaded. I see now in SegmentTermEnum that
> the every-128th behavior is implemented at a lower level.
>
> But I'm even more confused about why we have so many terms in memory. A
> heap dump shows over 270 million TermInfos, so if that's only 1/128th of
> the total then we REALLY have a lot of terms. (-: We do have a lot of docs
> (about 250 million), and we do have a couple of unique per-document values,
> but even so I can't see how we could get to 270 million x 128 terms. (The
> heap dump numbers are stable across the index close-and-reopen cycle, so I
> don't think we're leaking.)
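Right: the divisor multiplies that every-128th sampling, so only every
(128 * divisor)th term is held in RAM. If the term index itself turns out
to be the memory hog, you can raise the divisor when you open a reader.
A minimal sketch, assuming a release that has
IndexReader.setTermInfosIndexDivisor (check the javadocs for your version;
the class name and index path below are placeholders):

  import java.io.IOException;
  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.store.FSDirectory;

  public class TermIndexDivisorDemo {
    public static void main(String[] args) throws IOException {
      // Open a reader over the index (path is a placeholder).
      IndexReader reader = IndexReader.open(FSDirectory.getDirectory("/path/to/index"));
      // Load only every 4th indexed term: with the default index interval
      // of 128 that keeps every 512th unique term in RAM, at the cost of
      // slower term lookups. Set it before the first term lookup, since
      // the in-memory term index is loaded lazily on first use.
      reader.setTermInfosIndexDivisor(4);
      // ... run searches as usual ...
      reader.close();
    }
  }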
You could use CheckIndex to see how many terms are actually in your index.

If you take the heap dump right after opening a fresh reader, before
running any searches, do you still see 270 million TermInfos?
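If it helps, CheckIndex can be run from the command line, something like
this (the jar name and index path are placeholders; adjust the classpath
for your setup):

  java -cp lucene-core.jar org.apache.lucene.index.CheckIndex /path/to/index

Each segment's report should include the unique term count in its
"test: terms, freq, prox..." line.

Mike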