jonahgao opened a new pull request, #9228: URL: https://github.com/apache/arrow-datafusion/pull/9228
## Which issue does this PR close?

Closes #9162.

## Rationale for this change

When a column referenced by GROUP BY exists both in the select list and in the input, the one from the input should be given priority. In issue 9162, there are two references with the same name: one is an unqualified `"t.a"` and the other is a qualified `t.a`.

Giving priority to the input column is the practice of many databases, including PostgreSQL, Oracle, MySQL, and DuckDB. The PostgreSQL documentation has an [explanation](https://www.postgresql.org/docs/current/sql-select.html#SQL-GROUPBY) of it:

> An expression used inside a grouping_element can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a GROUP BY name will be interpreted as an **input-column** name rather than an output column name.

## What changes are included in this PR?

Prioritize searching the schema of the base plan when generating GROUP BY expressions.

## Are these changes tested?

Yes

## Are there any user-facing changes?

No
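The resolution order described above can be illustrated with a small, standalone Python sketch. It is a toy model only and is not DataFusion's planner code: the function `resolve_group_by_name`, the column set, and the alias map are all hypothetical names introduced for illustration. It models a query shaped like `SELECT a + 1 AS a, b * 2 AS b2 FROM t GROUP BY ...`, which is an assumed example and not the query from issue 9162.

```python
def resolve_group_by_name(name, input_columns, select_aliases):
    """Resolve a GROUP BY name, trying the input schema before the select list.

    input_columns: set of column names produced by the base plan (hypothetical).
    select_aliases: dict mapping output alias -> expression text (hypothetical).
    Returns a (source, expression) tuple.
    """
    # Input columns win when the name is ambiguous.
    if name in input_columns:
        return ("input", name)
    # Fall back to select-list aliases only if no input column matches.
    if name in select_aliases:
        return ("output", select_aliases[name])
    raise KeyError(f"column {name!r} not found in input or select list")


# Assumed example: table t has columns a and b; the select list
# aliases `a + 1` as `a` and `b * 2` as `b2`.
input_columns = {"a", "b"}
select_aliases = {"a": "a + 1", "b2": "b * 2"}

ambiguous = resolve_group_by_name("a", input_columns, select_aliases)
alias_only = resolve_group_by_name("b2", input_columns, select_aliases)
```

In this sketch, `a` is ambiguous and resolves to the input column, while `b2` exists only as an alias and so resolves to the select-list expression.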
