[jira] [Commented] (FLINK-7206) Implementation of DataView to support state access for UDAGG

ASF GitHub Bot (JIRA) Fri, 28 Jul 2017 09:15:58 -0700

    [ https://issues.apache.org/jira/browse/FLINK-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105203#comment-16105203 ]


ASF GitHub Bot commented on FLINK-7206:
---------------------------------------

Github user fhueske commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4355#discussion_r130126823
  
    --- Diff: flink-libraries/flink-table/src/test/java/org/apache/flink/table/runtime/utils/JavaUserDefinedAggFunctions.java ---
    @@ -135,4 +138,172 @@ public void retract(WeightedAvgAccum accumulator, int iValue, int iWeight) {
                        accumulator.count -= iWeight;
                }
        }
    +
    +   /**
    +    * CountDistinct accumulator.
    +    */
    +   public static class CountDistinctAccum {
    +           public MapView<String, Integer> map;
    +           public long count;
    +   }
    +
    +   /**
    +    * CountDistinct aggregate.
    +    */
    +   public static class CountDistinct extends AggregateFunction<Long, CountDistinctAccum> {
    --- End diff --
    
    I don't think we should implement `COUNT DISTINCT` as a special 
`AggregateFunction`, at least not in the long term.
    
    I think it would be better to handle distinctness inside the 
`GeneratedAggregations`, so that only distinct values are accumulated into and 
retracted from the user-defined aggregate functions. With this approach, any 
aggregate function can be used with `DISTINCT`, and the state that tracks 
distinct values can be shared across multiple aggregate functions. PR #3783 
already started work on this approach.
    
    For now this is fine, but in the long run we should go for something like 
PR #3783 (which also requires the `GeneratedAggregations.initialize()` method).
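    
    To illustrate the idea, here is a rough, self-contained sketch in plain 
Java. It is not the code from PR #3783 and does not use Flink's API: 
`SimpleAgg`, `CountAgg`, `SumAgg`, and `DistinctAggregations` are hypothetical 
names made up for this example. The wrapper plays the role of the generated 
aggregation code: it keeps one shared map of occurrence counts and forwards a 
value to the aggregates only on its first occurrence (accumulate) or when its 
last occurrence is removed (retract).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a user-defined aggregate; NOT Flink's AggregateFunction API. */
interface SimpleAgg {
    void accumulate(long value);
    void retract(long value);
    long getValue();
}

/** Plain COUNT that knows nothing about DISTINCT. */
class CountAgg implements SimpleAgg {
    private long count;
    public void accumulate(long value) { count++; }
    public void retract(long value) { count--; }
    public long getValue() { return count; }
}

/** Plain SUM that knows nothing about DISTINCT. */
class SumAgg implements SimpleAgg {
    private long sum;
    public void accumulate(long value) { sum += value; }
    public void retract(long value) { sum -= value; }
    public long getValue() { return sum; }
}

/** Hypothetical wrapper that filters for distinct values before calling the aggregates. */
class DistinctAggregations {
    // Shared distinct state: value -> number of occurrences currently accumulated.
    private final Map<Long, Long> occurrences = new HashMap<>();
    private final SimpleAgg[] aggs;

    DistinctAggregations(SimpleAgg... aggs) {
        this.aggs = aggs;
    }

    void accumulate(long value) {
        long seen = occurrences.merge(value, 1L, Long::sum);
        if (seen == 1L) {
            // First occurrence of this value: forward it to every aggregate.
            for (SimpleAgg agg : aggs) {
                agg.accumulate(value);
            }
        }
    }

    void retract(long value) {
        Long seen = occurrences.get(value);
        if (seen == null) {
            return; // value was never accumulated, nothing to retract
        }
        if (seen == 1L) {
            // Last occurrence removed: the value is no longer present at all.
            occurrences.remove(value);
            for (SimpleAgg agg : aggs) {
                agg.retract(value);
            }
        } else {
            occurrences.put(value, seen - 1);
        }
    }
}
```

    In this sketch `CountAgg` and `SumAgg` stay unaware of `DISTINCT`, and both 
share a single occurrence map. In the actual runtime that map would presumably 
live in managed state (for example something like the `MapView` in the diff 
above) rather than in a heap `HashMap`; that part is an assumption, not 
something this sketch shows.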


> Implementation of DataView to support state access for UDAGG
> ------------------------------------------------------------
>
>                 Key: FLINK-7206
>                 URL: https://issues.apache.org/jira/browse/FLINK-7206
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Kaibo Zhou
>            Assignee: Kaibo Zhou
>
> Implementation of MapView and ListView to support state access for UDAGG.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
