Hi Bryan, The definition looks good to me, I'll assign this issue to you.
Best regards, ------------------------- Yuan Tian On Wed, Jun 3, 2026 at 11:43 AM Bryan Yang <[email protected]> wrote: > Hi Apache IoTDB Community, > > I would like to work on Part 1 of issue #17797: > https://github.com/apache/iotdb/issues/17797 > > This mail is to clarify the feature definition before implementation. > > Scope > > This PR will focus only on allowing SELECT column aliases to be referenced > in GROUP BY and ORDER BY in the table model SQL analyzer. > > Lateral Column Alias references inside the SELECT list are out of scope for > this PR and will be handled in a separate follow-up PR. > > Syntax > > The supported syntax is: > > SELECT <expression> [AS] <alias>, ... > FROM <table> > GROUP BY <alias> > ORDER BY <alias> [ASC | DESC] [NULLS FIRST | NULLS LAST]; > > Components > > - <expression>: any valid SELECT expression already supported by the table > model analyzer. > - <alias>: an explicit alias defined by a SELECT item, for example: > date_bin(1h, time) AS hour_time > - GROUP BY <alias>: uses the corresponding SELECT expression as the > grouping key. > - ORDER BY <alias>: uses the corresponding SELECT expression as the > ordering key. > - ASC / DESC / NULLS FIRST / NULLS LAST keep the existing ORDER BY > semantics. > > Type Rules > > The alias itself does not introduce a new data type. Its type is the type > of the SELECT expression it refers to. > > For GROUP BY alias: > - The resolved expression must be comparable. > - The resolved expression must not contain aggregate, window, or grouping > functions. > - Existing GROUP BY semantic checks should run after alias resolution. > > For ORDER BY alias: > - The resolved expression must be orderable. > - Existing ORDER BY semantic checks should run after alias resolution. > > Name Resolution Rules > > For a single unqualified identifier in GROUP BY: > > 1. Try to resolve it as an input column first. > 2. If it does not resolve to an input column, try to resolve it as a SELECT > alias. > 3. If multiple SELECT aliases match, report an ambiguity error. > > For a single unqualified identifier in ORDER BY: > > 1. Try to resolve it as a SELECT output alias first. > 2. If no SELECT alias matches, fall back to the existing ORDER BY name > resolution behavior. > 3. If multiple SELECT aliases match, report an ambiguity error. > > This follows the convention mentioned in the issue: > - GROUP BY prefers the input column. > - ORDER BY prefers the SELECT output alias. > > Existing Behavior To Preserve > > The following existing forms should keep working unchanged: > > GROUP BY 1 > ORDER BY 1 > GROUP BY date_bin(1h, time) > ORDER BY AVG(temperature) > > Aliases will not be introduced into WHERE or HAVING in this PR. > > Examples > > Example 1: GROUP BY and ORDER BY alias > > SELECT date_bin(1h, time) AS hour_time, AVG(temperature) AS avg_temp > FROM table1 > GROUP BY hour_time > ORDER BY hour_time; > > This should be analyzed as if the user wrote: > > SELECT date_bin(1h, time) AS hour_time, AVG(temperature) AS avg_temp > FROM table1 > GROUP BY date_bin(1h, time) > ORDER BY date_bin(1h, time); > > Example 2: GROUP BY collision prefers input column > > If table1 has an input column named hour_time: > > SELECT date_bin(1h, time) AS hour_time, AVG(temperature) > FROM table1 > GROUP BY hour_time; > > GROUP BY hour_time should resolve to the input column hour_time, not the > SELECT alias. > > Example 3: ORDER BY collision prefers SELECT alias > > If table1 has an input column named hour_time: > > SELECT date_bin(1h, time) AS hour_time > FROM table1 > ORDER BY hour_time; > > ORDER BY hour_time should resolve to the SELECT alias. > > Example 4: Duplicate alias ambiguity > > SELECT s1 AS x, s2 AS x > FROM table1 > ORDER BY x; > > This should fail with a clear ambiguity error. > > Example 5: Aggregate alias in GROUP BY should fail > > SELECT AVG(temperature) AS avg_temp > FROM table1 > GROUP BY avg_temp; > > This should fail because GROUP BY cannot use an aggregate expression as a > grouping key. > > Implementation Plan > > - Collect explicit SELECT aliases during SELECT analysis, keeping the alias > name, original expression, and position. > - During GROUP BY analysis, resolve a single identifier against SELECT > aliases only when it does not resolve to an input column first. > - During ORDER BY analysis, resolve a single identifier against SELECT > aliases before falling back to the existing ORDER BY scope behavior. > - Run the existing type, aggregation, and expression validation after alias > resolution. > - Keep Lateral Column Alias support out of this PR. > > Test Plan > > I plan to add tests for: > > - GROUP BY alias > - ORDER BY alias > - GROUP BY alias with date_bin/date_bin_gapfill-style expressions > - alias/input-column collision behavior > - duplicate alias ambiguity > - aggregate alias used in GROUP BY should fail > - existing ordinal references should keep working > - existing full-expression references should keep working > - SELECT DISTINCT with ORDER BY alias, if applicable > > Could you please confirm whether this feature definition looks reasonable? > If there are no objections, I would like to work on Part 1 of this issue. > > Thank > Bryan Yang(杨易达) >
