[jira] [Commented] (FLINK-6373) Add runtime support for distinct aggregation over grouped windows

ASF GitHub Bot (JIRA) Fri, 28 Apr 2017 00:51:24 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15988376#comment-15988376
 ]


ASF GitHub Bot commented on FLINK-6373:
---------------------------------------

Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/3765
  
    Hi @haohui, 
    
    I suggested before that PR #3771 might be used for DISTINCT group window 
functions. However, this does not work because we cannot register state for an 
AggregateFunction. The benefit of the approach of #3771 would have been that it 
does not need to deserialize the Map every time a record is accumulated (or 
retracted). Instead, the distinct values are kept in a MapState that can be 
accessed (and deserialized) per lookup key. But this approach does not work 
with the AggregateFunction that we use for early aggregation. 
    
    To be honest, I'm a bit concerned about the performance of the approach of 
this PR because the state of the DistinctAccumulator (i.e., the 
complete map) will be de/serialized every time we access it. 
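
    To make the concern concrete, here is a minimal sketch of a map-backed 
distinct accumulator. It is not the code of this PR: the class name 
DistinctCountSketch and its methods are hypothetical, and it uses a plain 
HashMap rather than any Flink state API. The point is only that the whole map 
is the accumulator, so a backend that stores the accumulator as a single 
serialized value has to read and write all of it on every accumulate or 
retract.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical accumulator, not Flink's DistinctAccumulator: it tracks distinct
// values by keeping a reference count per value in an ordinary map.
class DistinctCountSketch<T> {
    // The whole map is the accumulator state. If the accumulator is stored as
    // one serialized value, every access has to de/serialize all entries.
    private final Map<T, Long> refCounts = new HashMap<>();

    // Returns true if the value was not present before (a new distinct value).
    boolean accumulate(T value) {
        return refCounts.merge(value, 1L, Long::sum) == 1L;
    }

    // Returns true if the last occurrence of the value was removed.
    boolean retract(T value) {
        Long count = refCounts.get(value);
        if (count == null) {
            return false; // nothing to retract
        }
        if (count == 1L) {
            refCounts.remove(value);
            return true;
        }
        refCounts.put(value, count - 1L);
        return false;
    }

    // Number of distinct values currently held.
    long distinctCount() {
        return refCounts.size();
    }
}
```

    With a keyed MapState, each of these map operations would touch only the 
entry for the given value, which is the benefit described above.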
    
    I think we can use this approach for now, but we should look into whether we 
can use an approach similar to the batch side, where distinct aggregations (on 
different keys) are translated into multiple aggregations which are later 
joined together (the join would be rather cheap because it's a 1-to-1 join).
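
    As a rough illustration of that idea (plain Java collections, not the 
actual batch translation; the class SplitDistinctSketch, its methods, and the 
row layout {groupKey, a, b} are assumptions made for this example), each 
distinct aggregation is computed on its own and the per-group results are then 
joined on the group key. Both sides have exactly one row per group, so the join 
is 1-to-1.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: COUNT(DISTINCT a) and COUNT(DISTINCT b) per group,
// computed as two independent aggregations and joined on the group key.
class SplitDistinctSketch {
    // Each row is assumed to be {groupKey, a, b}; 'column' selects a (1) or b (2).
    static Map<String, Integer> countDistinct(List<String[]> rows, int column) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (String[] row : rows) {
            seen.computeIfAbsent(row[0], k -> new HashSet<>()).add(row[column]);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }

    // 1-to-1 join: both inputs hold one entry per group key, derived from the
    // same rows, so every key on the left has exactly one match on the right.
    static Map<String, int[]> joinOnGroupKey(Map<String, Integer> left,
                                             Map<String, Integer> right) {
        Map<String, int[]> joined = new TreeMap<>();
        for (Map.Entry<String, Integer> e : left.entrySet()) {
            joined.put(e.getKey(), new int[] {e.getValue(), right.get(e.getKey())});
        }
        return joined;
    }
}
```

    Joining countDistinct(rows, 1) with countDistinct(rows, 2) then yields one 
{distinct a, distinct b} pair per group.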
    
    I'll have a look at this PR later today.
    Thanks, Fabian


> Add runtime support for distinct aggregation over grouped windows
> -----------------------------------------------------------------
>
>                 Key: FLINK-6373
>                 URL: https://issues.apache.org/jira/browse/FLINK-6373
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>
> This is a follow up task for FLINK-6335. FLINK-6335 enables parsing the 
> distinct aggregations over grouped windows. This jira tracks the effort of 
> adding runtime support for the query.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
