[
https://issues.apache.org/jira/browse/FLINK-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991998#comment-15991998
]
ASF GitHub Bot commented on FLINK-6373:
---------------------------------------
Github user sunjincheng121 commented on the issue:
https://github.com/apache/flink/pull/3765
Hi @haohui @fhueske, I am very interested in `DISTINCT`. Let me share some
ideas about this:
First, in a standard database there are two places where the `DISTINCT`
keyword can be used:
* in the `SELECT` clause, e.g.: `SELECT DISTINCT name FROM table`
* in the `AGG` clause, e.g.: `COUNT([ALL|DISTINCT] expression)`,
`SUM([ALL|DISTINCT] expression)`, etc.
In this post, we talk about the `AGG` clause. The `DISTINCT` keyword tells the
database system to aggregate only the distinct, or unique, values within the
scope of the aggregate function, i.e. the database system handles the
`DISTINCT` keyword and passes only the unique values into `AGG`. Based on this
understanding, I think the Flink framework (not the AGG) should deal with the
`DISTINCT` keyword, so we do not need `DistinctAccumulator.java`. For GROUP
WINDOW, I would like to check whether the data is duplicated in
`XXXWindowFunction` and `DataSetXXXAggFunction`, and add a boolean variable
`isFirstTimeProcess`, which identifies whether the data is duplicated, as a
parameter of `GeneratedAggregationsFunction`. `GeneratedAggregationsFunction`
then processes data according to `aggCall.isDistinct` and `isFirstTimeProcess`.
What do you think? @haohui @fhueske
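A minimal sketch of this idea in plain Java, only to illustrate the proposal.
It is not Flink code: `DistinctSketch`, `WindowState` and `SumAgg` are
hypothetical names, and the duplicate check is assumed to be a simple
in-memory set per window. The aggregate itself knows nothing about `DISTINCT`;
the surrounding "framework" code decides whether to forward each value.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Flink code: the framework tracks the values already
// seen in a window and forwards only first-time values to a DISTINCT aggregate.
public class DistinctSketch {

    /** Stand-in for a plain aggregate (here SUM) that is unaware of DISTINCT. */
    static final class SumAgg {
        long sum = 0;

        void accumulate(long value) {
            sum += value;
        }
    }

    /** Stand-in for the per-window state a window function might hold (assumption). */
    static final class WindowState {
        final Set<Long> seen = new HashSet<>();
        final SumAgg agg = new SumAgg();
        final boolean isDistinct; // plays the role of aggCall.isDistinct

        WindowState(boolean isDistinct) {
            this.isDistinct = isDistinct;
        }

        void process(long value) {
            // Set.add returns true only if the value was not already present.
            boolean isFirstTimeProcess = seen.add(value);
            if (!isDistinct || isFirstTimeProcess) {
                agg.accumulate(value);
            }
        }
    }

    /** Aggregates all values of one window, with or without DISTINCT semantics. */
    public static long sum(boolean isDistinct, long... values) {
        WindowState state = new WindowState(isDistinct);
        for (long v : values) {
            state.process(v);
        }
        return state.agg.sum;
    }

    public static void main(String[] args) {
        System.out.println(sum(false, 1, 2, 2, 3)); // like SUM(ALL x)
        System.out.println(sum(true, 1, 2, 2, 3));  // like SUM(DISTINCT x)
    }
}
```

In this sketch the set of seen values is kept per aggregate call; a real
implementation would need that state per window and per distinct aggregate,
and would have to store it in managed state rather than on the heap.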
Best,
SunJincheng
> Add runtime support for distinct aggregation over grouped windows
> -----------------------------------------------------------------
>
> Key: FLINK-6373
> URL: https://issues.apache.org/jira/browse/FLINK-6373
> Project: Flink
> Issue Type: Bug
> Components: Table API & SQL
> Reporter: Haohui Mai
> Assignee: Haohui Mai
>
> This is a follow up task for FLINK-6335. FLINK-6335 enables parsing the
> distinct aggregations over grouped windows. This jira tracks the effort of
> adding runtime support for the query.