Hi,
We are using a terms aggregation on a high-cardinality field and limiting the results to 5000 (using the "size" parameter). We also have a cardinality sub-aggregation under this terms aggregation to get the number of unique values of a separate field for each term returned.

This combination of aggregations requires a lot of memory, and we are getting an Out Of Memory error. We tried the new "collect_mode" option with the "breadth_first" setting, but without success: memory consumption is the same and the OOM is still there.

We identified that almost all of the memory is consumed by a ByteArray object in the HyperLogLogPlusPlus class. This object is created in the HyperLogLogPlusPlus constructor and sized as "initialBucketCount << p" (where initialBucketCount is the estimated bucket count passed down from the terms aggregation and p is the precision).

We believe that with the "breadth_first" setting the initial bucket count should be limited to 5000 (the value we use to limit the terms aggregation results). What we see instead is that the initial bucket count is much greater than 5000 and is the same as without the "breadth_first" setting (235000 in our case).

Is this the correct behavior for a cardinality sub-aggregation? Is there any way to run this set of aggregations without an OOM?

Thanks in advance,
Mikalai
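
P.S. To make the numbers concrete, below is a rough sketch of the request shape and of the allocation size implied by "initialBucketCount << p". The field names ("user_id", "session_id") and aggregation names are hypothetical placeholders, and p = 14 is only an illustrative precision, not a value taken from our cluster. The estimate assumes the ByteArray holds one byte per element, so it is an approximation of the initial allocation rather than an exact account of what the class does.

```python
import json

# Shape of the request described above. Field and aggregation names are
# hypothetical; only "size", "collect_mode" and the two aggregation types
# come from the original description.
request_body = {
    "size": 0,
    "aggs": {
        "by_term": {
            "terms": {
                "field": "user_id",          # hypothetical high-cardinality field
                "size": 5000,
                "collect_mode": "breadth_first",
            },
            "aggs": {
                "unique_values": {
                    "cardinality": {"field": "session_id"}  # hypothetical field
                }
            },
        }
    },
}


def hll_initial_bytes(initial_bucket_count: int, p: int) -> int:
    """Size implied by `initialBucketCount << p`, assuming one byte per element."""
    return initial_bucket_count << p


def to_mib(num_bytes: int) -> float:
    return num_bytes / (1024 * 1024)


P = 14  # illustrative precision only (assumption)

observed = hll_initial_bytes(235000, P)  # bucket count we actually see
expected = hll_initial_bytes(5000, P)    # bucket count we expected with breadth_first

print(json.dumps(request_body, indent=2))
print(f"observed: {to_mib(observed):.3f} MiB")  # observed: 3671.875 MiB
print(f"expected: {to_mib(expected):.3f} MiB")  # expected: 78.125 MiB
```

With these assumed values the difference is a factor of 47 (235000 / 5000) regardless of p, which is why the initial bucket count matters so much for us.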
