Hi,
I have a search request that combines a couple of filters in a bool filter
with "must" clauses, and I'm trying to optimize it as much as possible.
- Some filters are used by all users of my platform, but aren't very
selective.
- Some filters are very specific to individual users, and are highly
selective.
I've read that I should use the most selective filters first, to ease the
work performed by the subsequent filters.
However, one thing that's not 100% clear to me is how the filter cache
bitmaps work. Does each bitmap store the result of its filter evaluated
across the entire dataset, or only the result of applying it to the previous
filter's output?
Example: querying the paid invoices of an account. All users use the "status"
filter, but it's not very selective; the "account" filter is specific to one
user.
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "status": "paid" } },
            { "term": { "account": "123456" } }
          ]
        }
      }
    }
  }
}
Following the advice of using the most highly selective filter first, I
should place the "account" filter first. On the other hand I want to be
sure that all users will re-use the cached output of the "status" filter.
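To make the two orderings concrete, here is a minimal Python sketch that only
builds the request body from the example above; it does not talk to a cluster.
The function name build_invoice_query and its selective_first parameter are
hypothetical names chosen for illustration.

```python
import json


def build_invoice_query(account_id, selective_first=True):
    """Build the filtered/bool/must request body from the example above.

    selective_first is a hypothetical switch: it only changes the order of
    the two term filters inside "must", nothing else.
    """
    status_filter = {"term": {"status": "paid"}}         # shared by all users, not very selective
    account_filter = {"term": {"account": account_id}}   # specific to one user, highly selective
    if selective_first:
        must = [account_filter, status_filter]
    else:
        must = [status_filter, account_filter]
    return {"query": {"filtered": {"filter": {"bool": {"must": must}}}}}


if __name__ == "__main__":
    # Print both variants so they can be compared side by side.
    print(json.dumps(build_invoice_query("123456", selective_first=True), indent=2))
    print(json.dumps(build_invoice_query("123456", selective_first=False), indent=2))
```

Both variants contain exactly the same two term filters; the only difference
is their position in the "must" array.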
Question: will the "status" filter cache contain *all* paid invoices of all
accounts, no matter in which order I use the filters?
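To spell out the two possibilities I'm asking about, here is a toy Python
model using plain sets. The four documents are made up, and this is not a
claim about how Elasticsearch actually behaves; it only shows what a cached
"status" entry would contain under each interpretation.

```python
# Toy data: (doc_id, status, account). Made up purely for illustration.
docs = [
    (0, "paid", "123456"),
    (1, "open", "123456"),
    (2, "paid", "999999"),
    (3, "paid", "123456"),
]


def matching_ids(candidates, field_index, value):
    """Return the ids of the candidate docs whose field equals value."""
    return {d[0] for d in candidates if d[field_index] == value}


# Interpretation A: the "status" filter is evaluated over the whole dataset,
# so its cached result could be reused for any account.
status_over_all = matching_ids(docs, 1, "paid")

# Interpretation B: the "status" filter only sees documents that already
# passed the "account" filter, so its result is specific to that account.
account_docs = [d for d in docs if d[2] == "123456"]
status_over_account = matching_ids(account_docs, 1, "paid")

# The final result set is the same either way; only the reusable part differs.
final = status_over_all & matching_ids(docs, 2, "123456")

if __name__ == "__main__":
    print("A (whole dataset):", sorted(status_over_all))
    print("B (after account filter):", sorted(status_over_account))
    print("final result:", sorted(final))
```

Under interpretation A the cached entry covers paid invoices of every account;
under B it would only be useful to the one account that produced it.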
The above code is just an example. I'm trying to optimize the request for a
dataset of 1B+ documents, so please take this into consideration.
Thanks,
Lasse