Term filters already use Lucene's term dictionary as an index, as does almost everything else Elasticsearch does. In fact, term queries are so fast that Elasticsearch switched them from being cached by default to uncached by default (I don't have the version number handy). For the most part I wouldn't worry about them. When I have a request that is slow, I tend to remove parts of it until I find the slow bit. That works well if the speed issue is CPU-related, which most seem to be.
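As a rough sketch of that "remove parts until it gets fast" approach, something like the Python below could do it. The names `build_request`, `time_without_each_filter` and `run_search` are made up for illustration; `run_search` stands in for whatever function actually sends the request body to your cluster.

```python
import time
from typing import Callable, List, Tuple


def build_request(filters: List[dict]) -> dict:
    """Wrap the given filters in a filtered query with bool + must."""
    if not filters:
        # Nothing left to filter on, so fall back to matching everything.
        return {"query": {"match_all": {}}}
    return {"query": {"filtered": {"filter": {"bool": {"must": list(filters)}}}}}


def time_without_each_filter(
    filters: List[dict],
    run_search: Callable[[dict], object],
    repeats: int = 3,
) -> List[Tuple[dict, float]]:
    """Time the request once per filter, each time with that filter left out.

    Returns (removed_filter, best_seconds) pairs, fastest first. The filter
    whose removal gives the fastest request is the likeliest slow bit.
    """
    results = []
    for i, removed in enumerate(filters):
        remaining = filters[:i] + filters[i + 1:]
        body = build_request(remaining)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_search(body)  # hypothetical: sends the body to the cluster
            best = min(best, time.perf_counter() - start)
        results.append((removed, best))
    results.sort(key=lambda pair: pair[1])
    return results
```

Taking the best of several runs means the numbers mostly reflect warm caches, so this is only a first pass for CPU-bound slowness, not a proper benchmark.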
Nik

On Nov 11, 2014 4:47 PM, "Lasse Schou" <[email protected]> wrote:
> Thanks for the explanation.
>
> A follow-up question. If caching the filter for a specific value, say
> "{ "term": { "status": "paid" } }", will this somehow magically speed up the
> query if searching for "status": "unpaid"? I'm not talking about a "not"
> operation, but simply replacing the value with something else (like when
> creating an index in a RDBMS).
>
> 2014-11-11 21:35 GMT+01:00 Ivan Brusic <[email protected]>:
>
>> The status filter cache will indeed contain all entries. And technically,
>> the cache is per segment, and not across all documents, but this should be
>> transparent.
>>
>> Caching is enabled by default for the term filters, but disabled for the
>> bool filter. You can enable it if you think users will be reusing the
>> filter.
>>
>> --
>> Ivan
>>
>> On Tue, Nov 11, 2014 at 3:23 AM, Lasse Schou <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I have a search request that uses a couple of filters. I'm using
>>> bool+must, and I'm trying to optimize the request as much as possible.
>>>
>>> - Some filters are used by all users of my platform, but aren't very
>>>   selective.
>>> - Some filters are very specific to individual users, and are highly
>>>   selective.
>>>
>>> I've read that I should use the most selective filters first, to ease
>>> the work performed by the subsequent filters.
>>>
>>> However one thing that's not 100% clear is how the filter cache bitmaps
>>> works. Do they store the result of a filter if performed across the entire
>>> dataset, or does it store the filtered result of the previous filter's
>>> output?
>>>
>>> Example.
>>> Querying the paid invoices of an account:
>>>
>>> { "query":
>>>   { "filtered":
>>>     { "filter":
>>>       { "bool":
>>>         { "must": [
>>>           { "term": { "status": "paid" } },   (all users use this, but it's not very selective)
>>>           { "term": { "account": "123456" } }
>>>         ]}
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>> Following the advice of using the most highly selective filter first, I
>>> should place the "account" filter first. On the other hand I want to be
>>> sure that all users will re-use the cached output of the "status" filter.
>>>
>>> Question: will the "status" filter cache contain *all* paid invoices of
>>> all accounts, no matter in which order I use the filters?
>>>
>>> The above code is just an example - I'm trying to optimize the code for
>>> a dataset for 1B+ documents, so please take this into consideration.
>>>
>>> Thanks,
>>> Lasse
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/7ea47711-38c1-4bc7-bc7c-41d85fb5cf81%40googlegroups.com
>>> <https://groups.google.com/d/msgid/elasticsearch/7ea47711-38c1-4bc7-bc7c-41d85fb5cf81%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>> For more options, visit https://groups.google.com/d/optout.
