Hi folks,

I'm optimising our queries based on the advice in Zachary Tong's presentation: https://speakerdeck.com/polyfractal/elasticsearch-query-optimization

So far, just switching all our query elements to filters has given a 6x speedup on a monster query (65K chars of compact JSON), which is very encouraging :-)

All our queries are auto-generated from our own query syntax, though, so if we switch to filters it's going to have to be pretty much across the board (all terminals in the query AST, or all boolean nodes, or some similarly blunt instrument). That makes me worry about cache churn.

So I have two questions:

1. Can I monitor the *filter* cache size and eviction rate somehow? (REST for preference, but JMX would be fine too.) I only seem to see documentation for the field data cache.

2. Any advice for caching/not caching the intermediate boolean nodes in a complex query? In our case many of these intermediate nodes *will* recur in other queries, so my default feeling is to cache them, but that has to be balanced against the extra cache usage (and risk of churn). So I guess the question is: just how fast is the bitset bool filter (we frequently have ANDs and ORs with 10 to 20 children) compared to caching the node? Should I even be considering caching these, or is the bitset combination fast enough to make it a no-brainer?

Cheers,
Tikitu
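
P.S. To make question 2 concrete, here is a toy sketch of what I mean by "bitset combination" versus "caching the node". It assumes each cached leaf filter is a bitset with one bit per document, and that an uncached boolean node is computed by ANDing/ORing its children's bitsets on every query. This is only an illustration and not how Elasticsearch implements it internally; all names (`make_bitset`, `combine_and`, `leaf_a`, and so on) are made up for the example.

```python
from functools import reduce
from operator import and_, or_


def make_bitset(doc_ids):
    """Build a bitset (stored as a Python int) with one bit set per matching doc id."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def combine_and(bitsets):
    """AND node: a doc matches only if every child matches (needs at least one child)."""
    return reduce(and_, bitsets)


def combine_or(bitsets):
    """OR node: a doc matches if any child matches."""
    return reduce(or_, bitsets, 0)


def matching_docs(bits):
    """Decode a bitset back into a sorted list of doc ids."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Hypothetical cached leaf filters (the terminals of the query AST).
leaf_a = make_bitset([1, 2, 3, 5, 8])
leaf_b = make_bitset([2, 3, 5, 7])
leaf_c = make_bitset([3, 5, 8, 9])

# Uncached intermediate nodes: recomputed from the children on every query.
and_node = combine_and([leaf_a, leaf_b, leaf_c])
or_node = combine_or([leaf_a, leaf_b, leaf_c])

print(matching_docs(and_node))
print(matching_docs(or_node))
```

Caching an intermediate node would amount to storing `and_node` or `or_node` as one more bitset, which is exactly the extra cache usage I'm worried about; not caching it means paying for the combine step (over 10 to 20 children in our case) each time.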
