[
https://issues.apache.org/jira/browse/FLINK-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235335#comment-15235335
]
Fabian Hueske commented on FLINK-3723:
--------------------------------------
Hi [~yijieshen], welcome to the Flink community. Great that you are interested
in contributing :-)
The Table API is currently under heavy development and we can certainly use
some help here.
Thanks for the TPC-H Q1 example. How would you add the {{l_returnflag}} or
{{l_linestatus}} fields to the output? Or do you assume that these will be
implicitly added because they are grouping fields?
I am not sure about the benefits of the proposed {{agg}} method compared to the
existing {{select}} method.
In {{select}} we also check for non-grouped and non-aggregated columns, so
it is not possible to have nondeterministic fields in the result.
In addition, {{select}} allows you to explicitly add (or leave out) grouped fields
or to directly apply expressions on grouped or aggregated fields. Of course this
would also be possible by using {{agg}} followed by {{select}}. {{agg}} would
make {{select}} more specific and maybe easier to use. On the other hand, the
current implementation of {{select}} is closer to the original SQL notation.
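To make the check described above concrete, here is a minimal sketch in plain Scala. It is a toy model and not Flink's actual validation code: the expression types ({{Field}}, {{Max}}, {{Min}}, {{Plus}}) and the {{GroupedSelectCheck}} object are hypothetical names introduced only for illustration. It shows how a grouped {{select}} can accept grouping keys, aggregates, and expressions over them, while rejecting a bare column such as {{name}}.

```scala
// Toy model only. These types are hypothetical and are NOT part of the Flink Table API.
sealed trait Expr
case class Field(name: String) extends Expr
case class Max(child: Expr) extends Expr
case class Min(child: Expr) extends Expr
case class Plus(left: Expr, right: Expr) extends Expr

object GroupedSelectCheck {
  /** Fields referenced outside of an aggregate that are not grouping keys. */
  def invalidFields(expr: Expr, groupKeys: Set[String]): Set[String] = expr match {
    case Field(n) =>
      if (groupKeys.contains(n)) Set.empty[String] else Set(n)
    case Max(_) | Min(_) =>
      // Anything wrapped in an aggregate yields one value per group.
      Set.empty[String]
    case Plus(l, r) =>
      // A scalar expression is valid only if both operands are valid.
      invalidFields(l, groupKeys) ++ invalidFields(r, groupKeys)
  }

  /** Right(exprs) if every expression is deterministic per group, Left(message) otherwise. */
  def validate(exprs: Seq[Expr], groupKeys: Set[String]): Either[String, Seq[Expr]] = {
    val bad = exprs.flatMap(e => invalidFields(e, groupKeys)).distinct
    if (bad.isEmpty) Right(exprs)
    else Left(s"Fields neither grouped nor aggregated: ${bad.mkString(", ")}")
  }
}
```

Under these assumptions, validating {{Seq(Field("deptno"), Max(Field("age")))}} against the grouping key {{deptno}} succeeds, whereas adding {{Field("name")}} to the list is rejected. This is the same guarantee a separate {{agg}} method would give, without splitting the API.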
> Aggregate Functions and scalar expressions shouldn't be mixed in select
> -----------------------------------------------------------------------
>
> Key: FLINK-3723
> URL: https://issues.apache.org/jira/browse/FLINK-3723
> Project: Flink
> Issue Type: Improvement
> Components: Table API
> Affects Versions: 1.0.1
> Reporter: Yijie Shen
>
> When we type {code}select deptno, name, max(age) from dept group by
> deptno;{code} in Calcite or Oracle, it will complain {code}Expression 'NAME'
> is not being grouped{code} or {code}Column 'dept.name' is invalid in the
> select list because it is not contained in either an aggregate function or
> the GROUP BY clause.{code} because of the nondeterministic result.
> Therefore, I suggest separating the current functionality of `select` into
> two APIs: a new `select` that only handles scalar expressions, and an `agg`
> that accepts aggregates.
> {code}
> def select(exprs: Expression*)
> def agg(aggs: Aggregation*)
> ....
> tbl.groupBy('deptno)
> .agg('age.max, 'age.min)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)