There is a short-term and a longer-term option here:

1) Short-term: enable "doc values" in your index mappings so the field values are read from disk instead of being loaded into the Elasticsearch fielddata cache, which is what trips the CircuitBreakingException. You are then more reliant on the OS file-system cache for speed.

2) Longer-term: we're working on a sample-based option for significant terms [1].
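
For the short-term option, here is a rough sketch of what the mapping and the aggregation request could look like, written as plain Python dicts so nothing is sent to a cluster. The field names ("tags", "body"), the "outage" search text, and the helper function names are made up for illustration. The exact mapping syntax for doc values depends on your Elasticsearch version, so check the mapping docs for your release before copying this.

```python
import json


def doc_values_mapping(field):
    """Build a mapping that stores `field` as on-disk doc values.

    Assumption: an ES 1.x-style string field. Doc values there need a
    not_analyzed string (or a numeric/date field), not analyzed text.
    """
    return {
        "properties": {
            field: {
                "type": "string",
                "index": "not_analyzed",
                "doc_values": True,
            }
        }
    }


def significant_terms_request(field, query):
    """Build a search body running significant_terms on `field`."""
    return {
        "query": query,
        "size": 0,  # only the aggregation result is wanted, not the hits
        "aggregations": {
            "significant": {"significant_terms": {"field": field}}
        },
    }


# Hypothetical field names and query, for illustration only.
mapping = doc_values_mapping("tags")
body = significant_terms_request("tags", {"match": {"body": "outage"}})

print(json.dumps(mapping, indent=2))
print(json.dumps(body, indent=2))
```

Doc values are written at index time, so documents that are already indexed have to be reindexed before the aggregation stops loading fielddata for that field.
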
[1] https://github.com/elasticsearch/elasticsearch/pull/6796

On Friday, September 5, 2014 9:19:13 AM UTC+1, Christoffer Vig wrote:
>
> The significant terms aggregation is a really great feature that allows
> for some really interesting data analysis. We quite often experience out of
> memory errors, "CircuitBreakingException: Data too large, data would be
> larger than limit"
> Which is not hard to understand, due to the amount of data and the speed
> requirements.
>
> I think it would be interesting if it was possible to "trade off" speed to
> allow deeper analysis. To run significant terms, and possibly other
> aggregations, allow them to run for as long as needed, just to return some
> (presumably correct) results.
>
