I've also just noticed that this breaks an assumption of groupBy v2. In groupBy v2, the broker assumes that the intermediate aggregates are always sorted by the grouping keys, so that it can perform merge-sorted aggregation. However, calling `QueryRunnerFactory.mergeRunners()` internally performs hash-aggregation (or array-based aggregation) and then sorts the result again, which is inefficient. For groupBy v2, the merge-sorted aggregation should instead be performed in parallel. Maybe we need to add a new method to `QueryToolChest` that is different from both the merge in historicals and the final merge in brokers.
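To make the distinction concrete, here is a minimal single-threaded sketch of what I mean by merge-sorted aggregation: each input is already sorted by grouping key, so a k-way merge can combine equal keys as they arrive, without building a hash table or re-sorting. This is only an illustration, not Druid code. The class `MergeSortedAggregationSketch`, the method `mergeAggregate`, the `String` keys, and the `long` sum aggregate are all hypothetical stand-ins.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch; none of these names exist in Druid.
final class MergeSortedAggregationSketch {

  // One cursor per sorted input: the current row plus the remaining rows.
  private static final class Cursor {
    Map.Entry<String, Long> head;
    final Iterator<Map.Entry<String, Long>> rest;

    Cursor(Iterator<Map.Entry<String, Long>> rest) {
      this.rest = rest;
      this.head = rest.next();
    }
  }

  // Assumes every input list is already sorted by key (ascending) and keys are non-null.
  // Rows with equal keys are combined by summing their values.
  static List<Map.Entry<String, Long>> mergeAggregate(
      List<List<Map.Entry<String, Long>>> sortedInputs) {
    // The heap always exposes the cursor whose current key is smallest.
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>((a, b) -> a.head.getKey().compareTo(b.head.getKey()));
    for (List<Map.Entry<String, Long>> input : sortedInputs) {
      if (!input.isEmpty()) {
        heap.add(new Cursor(input.iterator()));
      }
    }

    List<Map.Entry<String, Long>> merged = new ArrayList<>();
    String currentKey = null;
    long currentSum = 0L;
    while (!heap.isEmpty()) {
      Cursor cursor = heap.poll();
      String key = cursor.head.getKey();
      long value = cursor.head.getValue();
      // Keys arrive in ascending order, so a new key means the previous group is complete.
      if (currentKey != null && !currentKey.equals(key)) {
        merged.add(new SimpleEntry<>(currentKey, currentSum));
        currentSum = 0L;
      }
      currentKey = key;
      currentSum += value;
      // Advance this cursor and put it back if it still has rows.
      if (cursor.rest.hasNext()) {
        cursor.head = cursor.rest.next();
        heap.add(cursor);
      }
    }
    if (currentKey != null) {
      merged.add(new SimpleEntry<>(currentKey, currentSum));
    }
    return merged;
  }
}
```

The output stays sorted by key and only one group is held in memory at a time, which is the property the broker relies on. A parallel version would presumably merge disjoint subsets of the inputs concurrently and then merge those results the same way, but that part is not shown here.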
We've recently had a discussion about this on the dev mailing list. See https://lists.apache.org/thread.html/b4c1cbe0c97e52ae5a137f4315af6a202a24d3034f53ce92c0d30150@%3Cdev.druid.apache.org%3E for more details.

[ Full content available at: https://github.com/apache/incubator-druid/pull/5913 ]
