[jira] [Commented] (FLINK-5266) Eagerly project unused fields when selecting aggregation fields

ASF GitHub Bot (JIRA) Fri, 09 Dec 2016 02:24:08 -0800

    [ https://issues.apache.org/jira/browse/FLINK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734912#comment-15734912 ]


ASF GitHub Bot commented on FLINK-5266:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2961#discussion_r91691224
  
    --- Diff: flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/table.scala ---
    @@ -881,24 +883,21 @@ class GroupWindowedTable(
         * }}}
         */
       def select(fields: Expression*): Table = {
    --- End diff --
    
    Watermarks and timestamps should not be affected by this change. Flink
    treats them as metadata, not as part of the schema. Also, watermarks and
    timestamps should be assigned before the query. We do not support assigning
    watermarks within a query.
    
    I also had a quick look into it. One problem I found is that a window
    alias is handled as an `UnresolvedFieldReference` in `select` here and is
    therefore added to the projection. However, the input does not have a field
    with that name, so validation fails.
    
    During validation, the window alias is correctly recognized. Maybe it makes
    more sense to add the projection at that point by injecting an additional
    `Project` with the `RelBuilder`. Another solution could be a `RelOptRule`.
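
To make the window alias problem concrete, here is a minimal sketch in plain Scala. It is not Flink code: the expression classes are toy stand-ins that only borrow names such as `UnresolvedFieldReference`, and `projectableFields` is a hypothetical helper. It assumes the input schema is known when the projection is built, so a name that is not an input field (like a window alias) can be left out of the projection.

```scala
object WindowAliasProjection {
  // Toy expression tree. These are NOT Flink's classes, only stand-ins
  // that reuse some of the names for readability.
  sealed trait Expression
  case class UnresolvedFieldReference(name: String) extends Expression
  case class Sum(child: Expression) extends Expression
  case class WindowEnd(window: Expression) extends Expression
  case class Alias(child: Expression, name: String) extends Expression

  // Collect every name referenced anywhere inside an expression.
  def referencedNames(e: Expression): Set[String] = e match {
    case UnresolvedFieldReference(n) => Set(n)
    case Sum(c)                      => referencedNames(c)
    case WindowEnd(w)                => referencedNames(w)
    case Alias(c, _)                 => referencedNames(c)
  }

  // Hypothetical helper: keep only names that exist in the input schema,
  // in schema order. A window alias such as "w" is not an input field, so it
  // is dropped instead of being added to the projection.
  def projectableFields(select: Seq[Expression], inputFields: Seq[String]): Seq[String] = {
    val referenced = select.flatMap(referencedNames).toSet
    inputFields.filter(referenced.contains)
  }
}
```

For a select list modelled as `Alias(Sum(UnresolvedFieldReference("a")), "s")` and `WindowEnd(UnresolvedFieldReference("w"))` over an input with fields `a`, `b`, `c`, the helper keeps only `a`. Without the schema check, `w` would also be projected, which is the validation failure described above.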


> Eagerly project unused fields when selecting aggregation fields
> ---------------------------------------------------------------
>
>                 Key: FLINK-5266
>                 URL: https://issues.apache.org/jira/browse/FLINK-5266
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Kurt Young
>            Assignee: Kurt Young
>
> When we call a table's {{select}} method and it contains aggregations, 
> the fields are projected only after the aggregation. It would be better to 
> project out unused fields before the aggregation, which also leaves the 
> opportunity to push the projection into the scan.
> For example, the current logical plan of a simple query:
> {code}
> table.select('a.sum as 's, 'a.max)
> {code}
> is
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($5)], TMP_1=[MAX($5)])
>     LogicalTableScan(table=[[supplier]])
> {code}
> It would be better to project out unused fields right after the scan, so 
> that the plan looks like this:
> {code}
> LogicalProject(s=[$0], EXPR$1=[$0])
>   LogicalAggregate(group=[{}], EXPR$1=[SUM($0)])
>     LogicalProject(a=[$5])
>       LogicalTableScan(table=[[supplier]])
> {code}
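
The rewrite proposed in the description can be sketched on a toy plan model. This is an illustration only: `Scan`, `Project`, `Aggregate`, `AggCall` and `projectBeforeAggregate` are hypothetical stand-ins, not Calcite's `RelNode` classes or Flink's implementation, and grouping keys are left out because the example uses `group=[{}]`.

```scala
object EagerProjection {
  // Toy logical plan nodes; hypothetical, not Calcite's or Flink's classes.
  sealed trait Plan
  case class Scan(table: String) extends Plan
  case class Project(input: Plan, fields: Seq[Int]) extends Plan
  case class AggCall(function: String, arg: Int)
  case class Aggregate(input: Plan, calls: Seq[AggCall]) extends Plan

  // Insert a Project below the Aggregate that keeps only the input fields
  // read by the aggregate calls, and remap each call argument to the
  // position of its field in the narrowed input.
  def projectBeforeAggregate(agg: Aggregate): Aggregate = {
    val used = agg.calls.map(_.arg).distinct.sorted
    val newIndex = used.zipWithIndex.toMap
    Aggregate(
      Project(agg.input, used),
      agg.calls.map(c => c.copy(arg = newIndex(c.arg))))
  }
}
```

Applied to `Aggregate(Scan("supplier"), Seq(AggCall("SUM", 5), AggCall("MAX", 5)))`, the sketch yields a `Project` over field 5 directly above the scan, with both calls reading `$0`. In this toy model the narrow `Project` sits next to the scan, which is what would let a later step push it into the scan.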



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
