[ 
https://issues.apache.org/jira/browse/FLINK-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235085#comment-15235085
 ] 

Yijie Shen commented on FLINK-3723:
-----------------------------------

Hi [~fhueske], thanks for your reply.
I'm new to Flink. I'm currently reading the Table API implementation (on the 
master branch) and would like to contribute.
What I'm suggesting is (using TPC-H Q1 as an example):
{code}
lineitem.filter('l_shipdate <= "1998-09-02")
          .groupBy('l_returnflag, 'l_linestatus)
          .agg(
            sum('l_quantity),
            sum('l_extendedprice),
            sum('l_extendedprice * (lit(1) - 'l_discount)),
            sum('l_extendedprice * (lit(1) - 'l_discount) * (lit(1) + 'l_tax)),
            avg('l_quantity),
            avg('l_extendedprice),
            avg('l_discount),
            count(lit(1)))
          .sort('l_returnflag, 'l_linestatus)
{code}
Since `agg` is applied to a `GroupedTable` (aggregating directly on a Table is 
just shorthand for grouping by an empty key), I think we can safely leave the 
grouping fields out of the `agg` method.
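
To make the idea concrete, here is a minimal, self-contained sketch of the semantics I have in mind. It is a toy model, not Flink's actual Table API: `AggSketch`, `Row`, `Table`, `GroupedTable`, `Aggregation` and the string-based column names are all made up for illustration. The point is only that `agg` can add the group keys to the output itself, so the caller lists nothing but aggregates.

```scala
// Toy model of the proposed select/agg split; NOT Flink's actual Table API.
// All names here are hypothetical and exist only for illustration.
object AggSketch {
  type Row = Map[String, Double]

  // An aggregation reduces all rows of one group to a single named value.
  final case class Aggregation(name: String, f: Seq[Row] => Double)

  def sum(col: String): Aggregation =
    Aggregation(s"sum($col)", rows => rows.map(_(col)).sum)
  def max(col: String): Aggregation =
    Aggregation(s"max($col)", rows => rows.map(_(col)).max)
  def count: Aggregation =
    Aggregation("count", rows => rows.size.toDouble)

  final case class GroupedTable(rows: Seq[Row], keys: Seq[String]) {
    // The group keys are added to each output row automatically,
    // so callers pass only aggregates.
    def agg(aggs: Aggregation*): Seq[Row] =
      rows.groupBy(r => keys.map(r)).toSeq.map { case (keyVals, group) =>
        keys.zip(keyVals).toMap ++ aggs.map(a => a.name -> a.f(group))
      }
  }

  final case class Table(rows: Seq[Row]) {
    def groupBy(keys: String*): GroupedTable = GroupedTable(rows, keys)
    // Aggregating a whole table is the same as grouping by an empty key.
    def agg(aggs: Aggregation*): Seq[Row] = groupBy().agg(aggs: _*)
  }
}
```

With this model, `Table(rows).groupBy("deptno").agg(max("age"), count)` yields one row per `deptno` that contains `deptno`, `max(age)` and `count`, even though `deptno` was never passed to `agg`. One simplification to be aware of: on an empty input this sketch returns no rows, whereas a global aggregate in SQL returns a single row.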

> Aggregate Functions and scalar expressions shouldn't be mixed in select
> -----------------------------------------------------------------------
>
>                 Key: FLINK-3723
>                 URL: https://issues.apache.org/jira/browse/FLINK-3723
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API
>    Affects Versions: 1.0.1
>            Reporter: Yijie Shen
>
> When we run {code}select deptno, name, max(age) from dept group by 
> deptno;{code} in Calcite or Oracle, it complains {code}Expression 'NAME' 
> is not being grouped{code} or {code}Column 'dept.name' is invalid in the 
> select list because it is not contained in either an aggregate function or 
> the GROUP BY clause.{code} because the result would be nondeterministic.
> Therefore, I suggest splitting the current functionality of `select` into 
> two APIs: the new `select` handles only scalar expressions, and `agg` accepts 
> aggregates.
> {code}
> def select(exprs: Expression*)
> def agg(aggs: Aggregation*)
> ....
> tbl.groupBy('deptno)
>    .agg('age.max, 'age.min)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
