Hi,

I have a search request that uses a couple of filters. I'm using bool+must, 
and I'm trying to optimize the request as much as possible.

- Some filters are used by all users of my platform, but aren't very 
selective.
- Some filters are very specific to individual users, and are highly 
selective.

I've read that I should use the most selective filters first, to ease the 
work performed by the subsequent filters.

However, one thing that's not 100% clear to me is how the filter cache 
bitmaps work. Does each bitmap store the result of its filter applied across 
the entire dataset, or does it store the result of the filter applied to the 
previous filter's output?

Example. Querying the paid invoices of an account:

{ "query":
  { "filtered":
    { "filter":
      { "bool":
        { "must": [
          { "term": { "status": "paid" } },     (all users use this, but it's not very selective)
          { "term": { "account": "123456" } }
        ]}
      }
    }
  }
}   

Following the advice of using the most highly selective filter first, I 
should place the "account" filter first. On the other hand I want to be 
sure that all users will re-use the cached output of the "status" filter.

Question: will the "status" filter cache contain *all* paid invoices of all 
accounts, no matter in which order I use the filters?
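
To make the question concrete, here is a toy Python model of the two 
behaviours I can imagine. It is not how Elasticsearch is implemented: plain 
sets of document ids stand in for the cached bitmaps, and the invoice data 
and all function names are made up for illustration.

```python
# Toy model only: sets of doc ids stand in for cached filter bitmaps.
# The invoices and all function names here are hypothetical.
INVOICES = {
    1: {"status": "paid", "account": "123456"},
    2: {"status": "open", "account": "123456"},
    3: {"status": "paid", "account": "999999"},
    4: {"status": "paid", "account": "123456"},
}


def matches(field, value, doc_ids):
    """Return the ids in doc_ids whose field equals value."""
    return {i for i in doc_ids if INVOICES[i][field] == value}


def cache_per_filter(filters):
    """Possibility A: each filter is cached against the entire dataset."""
    all_ids = set(INVOICES)
    return {f: matches(f[0], f[1], all_ids) for f in filters}


def cache_chained(filters):
    """Possibility B: each filter is cached against the previous filter's output."""
    cache, current = {}, set(INVOICES)
    for f in filters:
        current = matches(f[0], f[1], current)
        cache[f] = current
    return cache


STATUS = ("status", "paid")
ACCOUNT = ("account", "123456")

# Under A, the cached "status" entry is the same for either ordering.
a_first = cache_per_filter([STATUS, ACCOUNT])[STATUS]
a_second = cache_per_filter([ACCOUNT, STATUS])[STATUS]

# Under B, the cached "status" entry depends on where the filter sits.
b_first = cache_chained([STATUS, ACCOUNT])[STATUS]
b_second = cache_chained([ACCOUNT, STATUS])[STATUS]

print(a_first, a_second)
print(b_first, b_second)
```

If the cache behaves like possibility A, I can order the filters purely by 
selectivity without hurting reuse. If it behaves like possibility B, putting 
"account" first would leave a "status" entry that is only useful to that one 
account.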

The above code is just an example - I'm trying to optimize the code for a 
dataset of 1B+ documents, so please take this into consideration.

Thanks,
Lasse
