If you take a terms aggregation, the heavy lifting is done on each data node, shard by shard, and the partial results are then combined on the node coordinating the request (what I'm loosely calling the "master" here; it is not the elected cluster master). So if you have thousands of nodes and very high-cardinality nested aggs, the merging step may become a bottleneck, but in most cases the cost of doing the actual aggregation is far higher than the cost of merging results from a reasonable number of shards. So in practice I think it balances pretty well. Of course, you are not limited to a single node for handling concurrent requests: any node can coordinate a request.
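
To make the collect/merge split concrete, here is a rough toy model in plain Python. It is not ES code: the shards are just lists of terms that I made up, and `shard_terms` / `reduce_terms` are hypothetical helper names. Only `size` and `shard_size` are borrowed from the real terms aggregation parameters. It also shows how the merged counts can come out wrong when each shard returns too few buckets.

```python
from collections import Counter

def shard_terms(docs, shard_size):
    """Collect step: a shard counts its own terms and returns only its
    top `shard_size` buckets (hypothetical helper, not an ES API)."""
    return Counter(docs).most_common(shard_size)

def reduce_terms(shard_results, size):
    """Merge step: the coordinating node sums the per-shard buckets and
    keeps the top `size` terms (hypothetical helper, not an ES API)."""
    merged = Counter()
    for buckets in shard_results:
        for term, count in buckets:
            merged[term] += count
    return merged.most_common(size)

# Made-up data: "b" is the most frequent term overall (12 docs),
# but it is never the top term on any single shard.
shards = [
    ["a"] * 7 + ["b"] * 4 + ["c"] * 1,
    ["a"] * 1 + ["b"] * 4 + ["c"] * 6,
    ["a"] * 1 + ["b"] * 4 + ["d"] * 5,
]

# Ground truth, as if every document were counted in one place.
exact = Counter(term for docs in shards for term in docs).most_common(1)

# Each shard returns a single bucket: "b" never reaches the merge step.
narrow = reduce_terms([shard_terms(docs, 1) for docs in shards], size=1)

# Each shard returns two buckets: "b" survives and its count is complete.
wide = reduce_terms([shard_terms(docs, 2) for docs in shards], size=1)

print(exact)   # [('b', 12)]
print(narrow)  # [('a', 7)]
print(wide)    # [('b', 12)]
```

The merge only ever sees a handful of buckets per shard, which is why it is cheap compared to the counting. The same truncation is where the imprecision comes from, and raising `shard_size` is the usual way to trade a bit more merge work for better accuracy.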
On Wednesday, December 17, 2014 4:12:44 PM UTC-5, Yifan Wang wrote:
>
> I thought ES only "Collect" on individual shards, and "Reduce" on Client
> Node (master if you call it), nothing is done at the data node level.
>
> On Tuesday, December 16, 2014 1:31:30 PM UTC-5, AlexR wrote:
>>
>> ES already doing aggregations on each node. it is not like it is shipping
>> row level query data back to master for aggregation.
>> In fact, one unpleasant effect of it is that aggregation results are not
>> guaranteed to be precise due to distributed nature of the aggregation for
>> multibucket aggs ordered by count such as terms
