GitHub user hbdeshmukh commented on the issue:

https://github.com/apache/incubator-quickstep/pull/96

Sure. Right now we use a single-threaded implementation of the FinalizeAggregation operator. It does two things:

1. Merges the multiple aggregation hash tables into a single global hash table. The same group-by key can be present in different hash tables, hence the merging step is necessary.
2. Finalizes the aggregate values. For example, in the case of AVG, we delay computing the actual average until the very end; until that point we keep only the running sum and count.

There are multiple ways in which FinalizeAggregation can be parallelized. One of them is to partition the aggregation hash tables so that each hash table belongs to exactly one partition. We partition each input tuple on its group-by key attributes and route it to the appropriate hash table. Now no key appears in more than one hash table, so step 1 above is no longer necessary. Step 2 depends only on the output of step 1, so it can be done independently for each hash table. Therefore the work of steps 1 and 2 can be parallelized across the hash tables.
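To make the partitioned approach concrete, here is a minimal C++ sketch for AVG. It is not Quickstep code: `AvgState`, `Tuple`, `BuildPartitionedTables`, and `FinalizeInParallel` are hypothetical names made up for illustration, the group-by key is assumed to be a single string, and the build phase is kept single-threaded for brevity. It only shows that once tuples are routed by a hash of the group-by key, each hash table can be finalized on its own thread with no merge step.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical types for illustration; these are not Quickstep classes.
struct AvgState {
  double sum = 0.0;       // running sum of the aggregated attribute
  std::size_t count = 0;  // number of tuples seen for this key
};

using Tuple = std::pair<std::string, double>;  // (group-by key, value)
using PartitionTable = std::unordered_map<std::string, AvgState>;
using ResultTable = std::unordered_map<std::string, double>;

// Route each tuple to the hash table that owns its group-by key, so that
// no key can appear in more than one table.
std::vector<PartitionTable> BuildPartitionedTables(
    const std::vector<Tuple> &tuples, std::size_t num_partitions) {
  assert(num_partitions > 0);
  std::vector<PartitionTable> tables(num_partitions);
  std::hash<std::string> hasher{};
  for (const Tuple &t : tuples) {
    AvgState &state = tables[hasher(t.first) % num_partitions][t.first];
    state.sum += t.second;
    ++state.count;
  }
  return tables;
}

// Finalize every partition on its own thread. No merge is needed because
// the key sets are disjoint, and each thread writes only to its own slot
// of the results vector.
std::vector<ResultTable> FinalizeInParallel(
    const std::vector<PartitionTable> &tables) {
  std::vector<ResultTable> results(tables.size());
  std::vector<std::thread> workers;
  workers.reserve(tables.size());
  for (std::size_t p = 0; p < tables.size(); ++p) {
    workers.emplace_back([&tables, &results, p]() {
      for (const auto &entry : tables[p]) {
        // AVG is computed only now, from the running sum and count.
        results[p][entry.first] = entry.second.sum / entry.second.count;
      }
    });
  }
  for (std::thread &w : workers) {
    w.join();
  }
  return results;
}
```

Calling `BuildPartitionedTables(tuples, 3)` followed by `FinalizeInParallel(tables)` yields one result table per partition; concatenating them gives the final output, since no key is shared between partitions. One thread per partition is a simplification here; a real operator would more likely hand each partition to an existing worker pool.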