DaZuiZui commented on issue #17797: URL: https://github.com/apache/iotdb/issues/17797#issuecomment-4608886769
Hi! I would like to work on **Part 1** of this issue: allowing SELECT column aliases to be referenced in `GROUP BY` and `ORDER BY`. I will keep **Lateral Column Alias references inside the SELECT list** out of scope for this PR and handle that as a separate follow-up after Part 1 is merged. ## Feature Definition ### Scope This feature applies to the **table model SQL analyzer**. It allows an explicit alias defined by a SELECT item to be used as a grouping or ordering key. ### Supported Syntax ```sql SELECT <expression> [AS] <alias>, ... FROM <table> GROUP BY <alias> ORDER BY <alias> [ASC | DESC] [NULLS FIRST | NULLS LAST]; ``` Components <expression>: any SELECT expression already supported by the table model analyzer. <alias>: an explicit SELECT item alias, for example date_bin(1h, time) AS hour_time. GROUP BY <alias>: resolves the alias to the corresponding SELECT expression and uses that expression as the grouping key. ORDER BY <alias>: resolves the alias to the corresponding SELECT expression and uses that expression as the ordering key. ASC, DESC, NULLS FIRST, and NULLS LAST: keep the existing ORDER BY semantics. Type Rules The alias itself does not introduce a new data type. Its type is the type of the SELECT expression it refers to. For GROUP BY <alias>: The resolved expression must be comparable, following the existing GROUP BY type rules. The resolved expression must not contain aggregate, window, or grouping functions. Existing semantic checks should run after alias resolution. For ORDER BY <alias>: The resolved expression must be orderable, following the existing ORDER BY type rules. Existing semantic checks should run after alias resolution. In other words, supported data types are the same as the currently supported comparable/orderable types in the table model analyzer. Non-comparable or non-orderable types should continue to be rejected. Name Resolution Rules For a single unqualified identifier in GROUP BY: Try to resolve it as an input column first. If it does not resolve to an input column, try to resolve it as a SELECT alias. If multiple SELECT aliases match, report an ambiguity error. For a single unqualified identifier in ORDER BY: Try to resolve it as a SELECT output alias first. If no SELECT alias matches, fall back to the existing ORDER BY name resolution behavior. If multiple SELECT aliases match, report an ambiguity error. This follows the convention discussed in this issue: GROUP BY prefers the input column. ORDER BY prefers the SELECT output alias. Out of Scope This PR will not introduce SELECT aliases into: WHERE HAVING later SELECT items, namely Lateral Column Alias references Lateral Column Alias support will be handled in a separate follow-up PR. Examples Basic GROUP BY / ORDER BY alias SELECT date_bin(1h, time) AS hour_time, AVG(temperature) AS avg_temp FROM table1 GROUP BY hour_time ORDER BY hour_time; This should be analyzed as if the user wrote: SELECT date_bin(1h, time) AS hour_time, AVG(temperature) AS avg_temp FROM table1 GROUP BY date_bin(1h, time) ORDER BY date_bin(1h, time); Existing behavior should remain unchanged GROUP BY 1 ORDER BY 1 GROUP BY date_bin(1h, time) ORDER BY AVG(temperature) GROUP BY name collision prefers input column If table1 has an input column named hour_time: SELECT date_bin(1h, time) AS hour_time, AVG(temperature) FROM table1 GROUP BY hour_time; GROUP BY hour_time should resolve to the input column hour_time, not the SELECT alias. ORDER BY name collision prefers SELECT alias If table1 has an input column named hour_time: SELECT date_bin(1h, time) AS hour_time FROM table1 ORDER BY hour_time; ORDER BY hour_time should resolve to the SELECT alias. Duplicate alias ambiguity SELECT s1 AS x, s2 AS x FROM table1 ORDER BY x; This should fail with a clear ambiguity error. Aggregate alias in GROUP BY should fail SELECT AVG(temperature) AS avg_temp FROM table1 GROUP BY avg_temp; This should fail because GROUP BY cannot use an aggregate expression as a grouping key. Test Plan I plan to add tests for: GROUP BY alias ORDER BY alias alias/input-column collision behavior duplicate alias ambiguity aggregate alias used in GROUP BY should fail existing ordinal references should keep working existing full-expression references should keep working SELECT DISTINCT with ORDER BY alias, if applicable Could you please assign Part 1 of this issue to me? Thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
