alamb opened a new pull request, #8766: URL: https://github.com/apache/arrow-datafusion/pull/8766
Draft

## Which issue does this PR close?

Closes https://github.com/apache/arrow-datafusion/issues/8738

## Rationale for this change

This is a second sketch of how to close https://github.com/apache/arrow-datafusion/issues/8738

https://github.com/apache/arrow-datafusion/pull/8291 / https://github.com/apache/arrow-datafusion/issues/7647 changed DataFusion's Grouping operator so that it never dictionary encodes the output grouping columns. Previously, the types of the output grouping columns were the same as the types of the input grouping expressions.

The idea, I think, is that since the values in the group columns are unique, there is no reason to dictionary encode them (each dictionary entry would be used by only a single row). I am actually not sure about this, for reasons I will explain shortly.

## What changes are included in this PR?

This PR changes `LogicalPlan::Aggregate` and `LogicalPlan::Distinct` (which both use the HashAggregate `ExecutionPlan`) to report a schema whose grouping columns are not dictionary encoded.

## Are these changes tested?

Yes, there is a regression test.

## Are there any user-facing changes?
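To illustrate the rationale above, here is a minimal standalone sketch of why dictionary encoding gains little on a grouping output column. It uses only the Rust standard library. `DictColumn`, `dict_encode`, and `group_keys` are hypothetical names made up for this example; they are not Arrow or DataFusion APIs, and the real operator works on Arrow arrays rather than string slices.

```rust
use std::collections::{HashMap, HashSet};

/// Toy dictionary-encoded string column: `keys[i]` is an index into `values`.
/// (Hypothetical type for illustration only; not Arrow's `DictionaryArray`.)
#[derive(Debug, PartialEq)]
struct DictColumn {
    keys: Vec<usize>,
    values: Vec<String>,
}

/// Dictionary-encode a plain column, assigning keys in first-seen order.
fn dict_encode(column: &[&str]) -> DictColumn {
    let mut index: HashMap<&str, usize> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys = Vec::with_capacity(column.len());
    for &v in column {
        // Key that a previously unseen value would receive.
        let next = values.len();
        let key = *index.entry(v).or_insert_with(|| {
            values.push(v.to_string());
            next
        });
        keys.push(key);
    }
    DictColumn { keys, values }
}

/// Distinct values of a column in first-seen order, i.e. what a GROUP BY
/// on that column would emit as its group column.
fn group_keys<'a>(column: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    column.iter().copied().filter(|v| seen.insert(*v)).collect()
}

fn main() {
    // Input column with repeats: the dictionary is smaller than the key list.
    let input = ["a", "b", "a", "a", "b", "c"];
    let encoded_input = dict_encode(&input);
    println!(
        "input:  {} keys over {} dictionary values",
        encoded_input.keys.len(),
        encoded_input.values.len()
    );

    // Group column: every value is unique, so each dictionary entry is
    // referenced by exactly one key and the encoding saves nothing.
    let groups = group_keys(&input);
    let encoded_groups = dict_encode(&groups);
    println!(
        "groups: {} keys over {} dictionary values",
        encoded_groups.keys.len(),
        encoded_groups.values.len()
    );
}
```

In this toy model the encoded group column always has as many dictionary values as keys, which is the argument for emitting the plain value type. It does not capture other considerations, such as downstream operators that expect the output type to match the input type.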
