songwdfu opened a new pull request, #16336: URL: https://github.com/apache/pinot/pull/16336
Implemented partitioned group-by combine for the non-order-by, non-trim case. The technique is taken from DuckDB's [parallel grouped aggregate](https://duckdb.org/2022/03/07/aggregate-hashtable.html).

The algorithm has two phases. In the first phase, per-segment results are radix-partitioned. In the second phase, each worker thread picks up a partition and merges that partition's results into a single hash table. The resulting hash tables, which are still radix-partitioned, are then logically stitched together; this is safe because no key can appear in more than one partition. This enables full inter-thread parallelism by eliminating contention between worker threads, in contrast to the previous approach where every thread writes into the same shared indexedTable.

In principle, this is applicable to the broker as well, since the combine output is still radix-partitioned.

To be tested. Will consider the order-by / trim case. Will add detailed optimizations later.
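The two phases described above can be illustrated with a minimal, standalone sketch. It is not the code in this PR: the class `PartitionedCombineSketch` and its methods are hypothetical, group keys are plain strings, the only aggregation is a sum, and phase 1 runs sequentially for brevity. It assumes the partition count is a power of two so the partition id can be taken from the low bits of the key hash.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; names do not come from the Pinot code base.
final class PartitionedCombineSketch {
  private PartitionedCombineSketch() {}

  // Radix partition id: low bits of a mixed hash. numPartitions must be a power of two.
  static int partitionOf(Object key, int numPartitions) {
    int h = key.hashCode();
    h ^= (h >>> 16); // fold high bits into low bits before masking
    return h & (numPartitions - 1);
  }

  // Phase 1: split one segment's group-by result into numPartitions buckets.
  static List<Map<String, Long>> partition(Map<String, Long> segmentResult, int numPartitions) {
    List<Map<String, Long>> buckets = new ArrayList<>(numPartitions);
    for (int i = 0; i < numPartitions; i++) {
      buckets.add(new HashMap<>());
    }
    for (Map.Entry<String, Long> e : segmentResult.entrySet()) {
      buckets.get(partitionOf(e.getKey(), numPartitions)).put(e.getKey(), e.getValue());
    }
    return buckets;
  }

  // Phase 2: one task per partition merges that partition's buckets from every segment
  // into a private hash table, so no two tasks ever write to the same table.
  static List<Map<String, Long>> combine(List<Map<String, Long>> segmentResults,
      int numPartitions, int numThreads) {
    if (numPartitions <= 0 || Integer.bitCount(numPartitions) != 1) {
      throw new IllegalArgumentException("numPartitions must be a power of two");
    }
    List<List<Map<String, Long>>> partitioned = new ArrayList<>();
    for (Map<String, Long> segment : segmentResults) {
      partitioned.add(partition(segment, numPartitions));
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Map<String, Long>>> futures = new ArrayList<>();
      for (int p = 0; p < numPartitions; p++) {
        final int partitionId = p;
        Callable<Map<String, Long>> task = () -> {
          Map<String, Long> merged = new HashMap<>();
          for (List<Map<String, Long>> buckets : partitioned) {
            buckets.get(partitionId).forEach((k, v) -> merged.merge(k, v, Long::sum));
          }
          return merged;
        };
        futures.add(pool.submit(task));
      }
      // "Stitching" is just keeping the per-partition tables side by side, in partition order.
      List<Map<String, Long>> tables = new ArrayList<>(numPartitions);
      for (Future<Map<String, Long>> f : futures) {
        tables.add(f.get());
      }
      return tables;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  // A key can only live in the table of its own partition, so a lookup touches one table.
  static Long lookup(List<Map<String, Long>> tables, String key) {
    return tables.get(partitionOf(key, tables.size())).get(key);
  }
}
```

For example, `PartitionedCombineSketch.combine(segments, 4, 2)` returns four disjoint tables, and `lookup(tables, key)` reads only the table that owns `key`. Because the output keeps the same partitioning, a later stage could merge partition by partition in the same way, which is the property the description relies on for the broker.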
