[ https://issues.apache.org/jira/browse/FLINK-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262540#comment-16262540 ]
Fabian Hueske commented on FLINK-6101: -------------------------------------- You are right that SQL doesn't support `AS`. But also SQL is not a perfect language and has shortcomings. IMO, the Table API should follow SQL semantics and where it applies also SQL syntax. Having optional support for `AS` in `groupBy()` would not be a problem in that regard because it does not change the semantics and is a nice shortcut. > GroupBy fields with arithmetic expression (include UDF) can not be selected > --------------------------------------------------------------------------- > > Key: FLINK-6101 > URL: https://issues.apache.org/jira/browse/FLINK-6101 > Project: Flink > Issue Type: Bug > Components: Table API & SQL > Reporter: lincoln.lee > Assignee: lincoln.lee > Priority: Minor > > currently the TableAPI do not support selecting GroupBy fields with > expression either using original field name or the expression > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b, 'c.min, 'e, 'a.avg, 'd.count) > {code} > caused > {code} > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [e, ('b % 3), TMP_0, TMP_1, TMP_2]. > {code} > (BTW, this syntax is invalid in RDBMS which will indicate the selected column > is invalid in the select list because it is not contained in either an > aggregate function or the GROUP BY clause in SQL Server.) > and > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b%3, 'c.min, 'e, 'a.avg, 'd.count) > {code} > will also cause > {code} > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [e, ('b % 3), TMP_0, TMP_1, TMP_2]. > {code} > and add an alias in groupBy clause "group(e, 'b%3 as 'b)" work without avail. > and apply an UDF doesn’t work either > {code} > table.groupBy('a, Mod('b, 3)).select('a, Mod('b, 3), 'c.count, 'c.count, > 'd.count, 'e.avg) > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [a, org.apache.flink.table.api.scala.batch.table.Mod$('b, 3), TMP_0, > TMP_1, TMP_2]. > {code} > the only way to get this work can be > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .select('a, 'b%3 as 'b, 'c, 'd, 'e) > .groupBy('e, 'b) > .select('b, 'c.min, 'e, 'a.avg, 'd.count) > {code} > One way to solve this is to add support alias in groupBy clause ( it seems a > bit odd against SQL though TableAPI has a different groupBy grammar), > and I prefer to support select original expressions and UDF in groupBy > clause(make consistent with SQL). > as thus: > {code} > // use expression > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b % 3, 'c.min, 'e, 'a.avg, 'd.count) > // use UDF > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, Mod('b,3)) > .select(Mod('b,3), 'c.min, 'e, 'a.avg, 'd.count) > {code} > After had a look into the code, found there was a problem in the groupBy > implementation, validation hadn't considered the expressions in groupBy > clause. it should be noted that a table has been actually changed after > groupBy operation ( a new Table) and the groupBy keys replace the original > field reference in essence. > > What do you think? -- This message was sent by Atlassian JIRA (v6.4.14#64029)