There is a short-term and a longer-term option here:

1) Short-term: enable "doc values" in your index mappings so that field 
values are read from disk instead of being loaded into the Elasticsearch 
fielddata cache on the heap, which is what triggers the 
CircuitBreakingException. You are then more reliant on the OS file-system 
cache for speed.
2) Longer-term: we're working on a sample-based option for significant 
terms [1].


[1] https://github.com/elasticsearch/elasticsearch/pull/6796
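
As a rough illustration of option 1, here is a minimal sketch of a mapping 
body with doc values switched on for one field. It assumes a 1.x-style 
mapping in which a string field is "not_analyzed" and carries a 
"doc_values" flag; the type name "report" and field name "tags" are made up 
for the example, so check the mapping reference for your version before 
relying on it.

```python
import json


def doc_values_mapping(type_name, field_name):
    """Build a mapping body that keeps one string field in doc values.

    type_name and field_name are hypothetical placeholders; the
    "doc_values" flag is assumed to be supported by your version.
    """
    return {
        type_name: {
            "properties": {
                field_name: {
                    "type": "string",
                    # Assumption: doc values on strings need an un-analyzed field.
                    "index": "not_analyzed",
                    # Store values on disk rather than in the fielddata cache.
                    "doc_values": True,
                }
            }
        }
    }


mapping = doc_values_mapping("report", "tags")
# Print the JSON body that would be sent when creating the index mapping.
print(json.dumps(mapping, indent=2))
```

Doc values are written at index time, so a change like this would most 
likely only help after the affected data has been reindexed.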


On Friday, September 5, 2014 9:19:13 AM UTC+1, Christoffer Vig wrote:
>
> The significant terms aggregation is a really great feature that allows 
> for some really interesting data analysis. We quite often run into out of 
> memory errors: "CircuitBreakingException: Data too large, data would be 
> larger than limit".
> That is not hard to understand, given the amount of data and the speed 
> requirements. 
>
> I think it would be interesting if it were possible to "trade off" speed 
> for deeper analysis: let significant terms, and possibly other 
> aggregations, run for as long as needed, as long as they eventually return 
> some (presumably correct) results. 
>
>
