rtpsw commented on PR #34885: URL: https://github.com/apache/arrow/pull/34885#issuecomment-1496464554
> @rtpsw Not sure I follow what you are doing - seems like a lot of refactor is done. Can you explain your approach? In this PR, the main goal is a single method `MakeOutputSchema` providing the output schema for an aggregation. The problem is that the original code has two classes , `ScalarAggregateNode` and `GroupByNode`, for aggregation that do not share much code between them for the purpose of constructing the output schema. To prepare the stage, I started with refactoring the original code to make them share code for this purpose.For this, I needed to encapsulate certain differences between them: - Get kernel: This is [encapsulated by `GetKernel`](https://github.com/apache/arrow/pull/34885/files#diff-806b2b905d7908f2ce860f38eca2090b9504753ebd72d5fef5f97bd2611359b1R101-R102). The two implementations are `GetScalarAggregateKernel` and `GetHashAggregateKernel`. The latter has the function dispatch on the [types extended with the group-id](https://github.com/apache/arrow/pull/34885/files#diff-806b2b905d7908f2ce860f38eca2090b9504753ebd72d5fef5f97bd2611359b1R81-R87). - Init kernel: This is [encapsulated by `InitKernel`](https://github.com/apache/arrow/pull/34885/files#diff-806b2b905d7908f2ce860f38eca2090b9504753ebd72d5fef5f97bd2611359b1R143-R144). The two implementations are `InitScalarAggregateKernel` and `InitHashAggregateKernel`. The latter has the kernel-args configured using the types extended with the group-id. - Resolve kernels: This is [encapsulated by `ResolveKernels`](https://github.com/apache/arrow/pull/34885/files#diff-806b2b905d7908f2ce860f38eca2090b9504753ebd72d5fef5f97bd2611359b1R218-R221). The two implementations are `ResolveScalarAggregateKernels` and `ResolveHashAggregateKernels`. The latter resolves each kernel using the types extended with the group-id. Additional parts of the refactoring are: - Adding `MakeAggregateNodeArgs` as a common method for setting up the arguments needed for constructing an aggregation node, whether it is a `ScalarAggregateNode` or a `GroupByNode`. - Cleaning up `ScalarAggregateNode::Make` and `GroupByNode::Make` to use the above consistently. - Adding `MakeOutputSchema` that uses `MakeAggregateNodeArgs` to return the output schema that the aggregation node is constructed with. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
