One thing to keep in mind about using the field cache for filter
caching is memory.
The filter bitset cache at worst holds 8 documents per byte (and with
bitset compression this can be even more efficient).
The field cache, by contrast, costs bytes per document: most likely at
least an order of magnitude more, and maybe two for string fields
(object overhead, length of the string, plus the management
fields...).
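
To put rough numbers on that, here is a back-of-the-envelope sketch in
plain Java (no Lucene classes). The 10M document count and the 40 bytes
per string entry are assumptions for illustration only, not measurements.

```java
class CacheMemoryEstimate {
    // Uncompressed bitset: one bit per document, rounded up to whole bytes.
    static long bitsetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // Field cache modeled as a fixed cost per document.
    // bytesPerDoc is an assumed figure, not a measured one.
    static long fieldCacheBytes(long maxDoc, long bytesPerDoc) {
        return maxDoc * bytesPerDoc;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L;    // hypothetical index size
        long intBytesPerDoc = 4;      // one int per document
        long stringBytesPerDoc = 40;  // assumed: object header + length + chars + reference

        System.out.println("bitset filter:      " + bitsetBytes(maxDoc) + " bytes");
        System.out.println("int field cache:    " + fieldCacheBytes(maxDoc, intBytesPerDoc) + " bytes");
        System.out.println("string field cache: " + fieldCacheBytes(maxDoc, stringBytesPerDoc) + " bytes");
    }
}
```

Under these assumptions the int cache is 32 times the size of the
bitset and the string cache 320 times, per cached field.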
This is why I think for many users the field cache is not the best
solution. If you have lots of documents but searches that return
relatively few hits, then using filters and sorting the results using
stored fields is far more efficient.
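
A minimal sketch of that filter-then-sort idea, in plain Java rather
than against the Lucene API. The class name, the BitSet standing in for
a cached filter, and the loadStoredField callback are all hypothetical
stand-ins; the point is only that sort values are loaded for the hits
and not for every document in the index.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

class FilterThenSort {
    // Returns the doc ids set in the filter, ordered ascending by the
    // stored value that loadStoredField returns for each of them.
    static List<Integer> sortMatches(BitSet filter, IntFunction<String> loadStoredField) {
        List<Integer> hits = new ArrayList<>();
        Map<Integer, String> values = new HashMap<>();
        for (int doc = filter.nextSetBit(0); doc >= 0; doc = filter.nextSetBit(doc + 1)) {
            hits.add(doc);
            // Load the stored value once per hit; memory grows with the
            // number of hits, not with the size of the index.
            values.put(doc, loadStoredField.apply(doc));
        }
        hits.sort((a, b) -> values.get(a).compareTo(values.get(b)));
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical stored field values, indexed by doc id.
        String[] stored = {"d", "h", "c", "x", "y", "a", "z", "b"};
        BitSet filter = new BitSet(stored.length);
        filter.set(2);
        filter.set(5);
        filter.set(7);
        System.out.println(sortMatches(filter, doc -> stored[doc]));
    }
}
```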
It seems to me that the field cache is only appropriate when the
documents have very few fields in play (1-3?); otherwise cached
range filters will work better. If we also have partitioned (trie
query) and compressed filters, then the cache is only useful for
sorting.
The most important use for the field cache seems to be the case where
a query returns lots of documents, say by date range, AND you want
the most recent ones to score higher (needing the sort) - basically
using the cache for the selection and the sort.
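
As a rough illustration of that last case, the sketch below uses one
in-memory array for both the range selection and the recency sort. It is
plain Java, not the Lucene FieldCache API; the dateByDoc array, the
yyyymmdd encoding, and the method name are assumptions made for the
example.

```java
import java.util.ArrayList;
import java.util.List;

class RecentFirst {
    // dateByDoc plays the role of a field cache: one long per doc id.
    // Selects docs whose date lies in [from, to] and returns the topN
    // most recent of them, newest first.
    static List<Integer> selectAndSort(long[] dateByDoc, long from, long to, int topN) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < dateByDoc.length; doc++) {
            long d = dateByDoc[doc];
            if (d >= from && d <= to) {
                hits.add(doc);
            }
        }
        // The same array drives the sort: descending by date.
        hits.sort((a, b) -> Long.compare(dateByDoc[b], dateByDoc[a]));
        return new ArrayList<>(hits.subList(0, Math.min(topN, hits.size())));
    }

    public static void main(String[] args) {
        // Hypothetical dates encoded as yyyymmdd, indexed by doc id.
        long[] dates = {20081201L, 20081115L, 20081206L, 20071231L, 20081203L};
        System.out.println(selectAndSort(dates, 20081101L, 20081231L, 3));
    }
}
```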
On Dec 7, 2008, at 3:42 AM, Michael McCandless wrote:
Mark Miller wrote:
MultiSearcher has a few aspects I don't like.
Do you mean the score differences vs IndexSearcher(MultiReader),
or is there something else?
And rewrite does not work properly. And to get 30 docs over 3
indexes, you ask for 90. And sort twice.
I'm thinking we stick with MultiReader, but improve it so that when
sorting by fields it can use a [new FieldCache-like] API that
gives it the benefits that MultiSearcher has. I.e., best of both
worlds.
Mike
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]