Hi,

We are using a terms aggregation on a high-cardinality field and limiting 
the results to 5000 (using the "size" parameter). We also have a cardinality 
sub-aggregation under this terms aggregation to get the number of unique 
values of a separate field for each term returned. This combination of 
aggregations requires a lot of memory, and we are getting an out-of-memory 
(OOM) error. 

We tried the new "collect_mode" option with the "breadth_first" setting, but 
without success: memory consumption is the same and the OOM still occurs. 
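
For reference, here is a minimal sketch of the request shape described above, 
built as a Python dict and printed as JSON. The field names ("user_id", 
"session_id") and the aggregation names ("by_term", "unique_values") are 
hypothetical placeholders, not the real mapping; only "size", "collect_mode" 
and the terms/cardinality nesting come from the description.

```python
import json

# Hypothetical field names, used only for illustration.
TERMS_FIELD = "user_id"      # the high-cardinality field
UNIQUE_FIELD = "session_id"  # the field whose distinct values are counted


def build_request(size=5000):
    """Build a terms aggregation with a cardinality sub-aggregation."""
    return {
        "size": 0,  # only aggregation results are needed, no search hits
        "aggs": {
            "by_term": {
                "terms": {
                    "field": TERMS_FIELD,
                    "size": size,
                    "collect_mode": "breadth_first",
                },
                "aggs": {
                    "unique_values": {
                        "cardinality": {"field": UNIQUE_FIELD}
                    }
                },
            }
        },
    }


request_body = build_request()
print(json.dumps(request_body, indent=2))
```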

We found that almost all of the memory is consumed by a ByteArray object in 
the HyperLogLogPlusPlus class. This object is created in the 
HyperLogLogPlusPlus constructor and sized as "initialBucketCount << p" 
(where initialBucketCount is the estimated bucket count passed from the 
terms aggregation and p is the precision). We believe that with the 
"breadth_first" setting the initial bucket count should be limited to 5000 
(the value we use to limit the terms aggregation results). What we see 
instead is that the initial bucket count is much greater than 5000 
(235000 in our case) and is the same as without the "breadth_first" setting.
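
To give a sense of scale, the sketch below evaluates "initialBucketCount << p" 
for both bucket counts. It assumes p = 14 and one byte per ByteArray element; 
both are assumptions for illustration, since the actual precision depends on 
the cardinality aggregation's settings. Under these assumptions the array 
takes about 3.85 GB for 235000 buckets versus about 82 MB for 5000.

```python
def hll_array_bytes(initial_bucket_count, p):
    """Size of the array allocated as initialBucketCount << p.

    Assumes one byte per element, so the element count equals the byte count.
    """
    return initial_bucket_count << p


P = 14  # assumed precision, not taken from the actual configuration

observed = hll_array_bytes(235000, P)  # bucket count actually seen
expected = hll_array_bytes(5000, P)    # bucket count hoped for with breadth_first

for label, size in (("observed", observed), ("expected", expected)):
    print(f"{label}: {size:,} bytes (~{size / 2**20:,.0f} MiB)")

print(f"ratio: {observed / expected:.0f}x")
```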

Is this the correct behavior for a cardinality sub-aggregation? Is there any 
way to run this set of aggregations without an OOM? 

Thanks in advance,

Mikalai
