[jira] [Commented] (FLINK-5655) Add event time OVER RANGE BETWEEN x PRECEDING aggregation to SQL

ASF GitHub Bot (JIRA) Tue, 28 Mar 2017 20:34:05 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946474#comment-15946474
 ]


ASF GitHub Bot commented on FLINK-5655:
---------------------------------------

Github user sunjincheng121 commented on the issue:

    https://github.com/apache/flink/pull/3629
  
    Hi @fhueske, thanks a lot for your review. I have updated the PR according 
to your comments. 
    There is one point I would like you to explain, namely:
    `Also, I realized that this implementation (other OVER windows are probably 
affected as well) will not discard state if the key space evolves. We should 
add a JIRA to add a configuration parameter to remove state if no row was 
received for a certain amount of time.`
    IMO, in the event-time case the state is managed in a data-driven way: 
whenever a row is processed, the expired rows are cleaned up at the same time. 
If data arrives continuously, the state is cleaned up promptly. If no data 
arrives for a long time, the state does not grow. When we cannot know when the 
next row will arrive, I think we should not clear the state, because removing 
it would make the next calculation incorrect. If the state supported a TTL 
setting, users could configure the TTL to clear the state in a convenient way. 
If I understand you correctly, the configuration parameter you mention is such 
a TTL config. Is that correct? If not, I would appreciate it if you could 
share your thoughts in more detail.
    
    Thanks,
    SunJincheng
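
    The data-driven cleanup described in the comment above, and the idle-state 
concern raised in the review, can be sketched roughly as follows. This is a 
minimal illustration in plain Python, not Flink code: `RangeOverState`, 
`range_ms`, `idle_retention_ms` and `expire_idle` are hypothetical names, and 
the idle-retention parameter only stands in for the kind of configuration 
option under discussion. The sketch also assumes that rows of a key arrive in 
timestamp order, which a real operator would have to ensure via watermarks.

```python
from collections import defaultdict, deque


class RangeOverState:
    """Per-key row buffer for a bounded event-time RANGE window (illustrative)."""

    def __init__(self, range_ms, idle_retention_ms=None):
        self.range_ms = range_ms
        self.idle_retention_ms = idle_retention_ms  # hypothetical config parameter
        self.rows = defaultdict(deque)  # key -> deque of (timestamp, value)
        self.last_seen = {}             # key -> clock time of the last row

    def add(self, key, ts, value, now_ms):
        """Buffer a row, evict rows outside the range, return SUM over the window."""
        buf = self.rows[key]
        buf.append((ts, value))
        # Data-driven cleanup: only a new row for this key evicts expired rows.
        while buf and buf[0][0] < ts - self.range_ms:
            buf.popleft()
        self.last_seen[key] = now_ms
        return sum(v for _, v in buf)

    def expire_idle(self, now_ms):
        """Drop the state of keys that received no row for idle_retention_ms."""
        if self.idle_retention_ms is None:
            return []
        idle = [k for k, seen in self.last_seen.items()
                if now_ms - seen >= self.idle_retention_ms]
        for k in idle:
            del self.rows[k]
            del self.last_seen[k]
        return idle


state = RangeOverState(range_ms=3_600_000, idle_retention_ms=86_400_000)
state.add("k1", 0, 5, now_ms=0)
state.add("k2", 0, 1, now_ms=2)          # "k2" never receives another row
state.add("k1", 4_000_000, 1, now_ms=3)  # evicts the "k1" row at ts=0
```

    In this sketch the buffer of an idle key such as "k2" does not grow, but 
it is also never released unless something like `expire_idle` is called, which 
is the situation that arises when the key space evolves.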


> Add event time OVER RANGE BETWEEN x PRECEDING aggregation to SQL
> ----------------------------------------------------------------
>
>                 Key: FLINK-5655
>                 URL: https://issues.apache.org/jira/browse/FLINK-5655
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Fabian Hueske
>            Assignee: sunjincheng
>
> The goal of this issue is to add support for OVER RANGE aggregations on event 
> time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY rowTime()
>     RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY rowTime()
>     RANGE BETWEEN INTERVAL '1' HOUR PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single 
> threaded execution).
> - The ORDER BY clause may only have rowTime() as parameter. rowTime() is a 
> parameterless scalar function that just indicates event time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5658)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER RANGE aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).
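
For reference, the result the example query is meant to produce can be 
sketched in plain Python. This only illustrates the window semantics (with 
inclusive bounds, as in standard SQL RANGE frames) and is not the proposed 
DataStream operator; the function name `over_range` and the row layout 
`(a, b, c, ts)` with `ts` in milliseconds are assumptions made for the example.

```python
HOUR_MS = 60 * 60 * 1000


def over_range(rows, range_ms=HOUR_MS):
    """For each row (a, b, c, ts) return (a, sumB, minB), aggregated over the
    rows of the same partition c whose ts lies in [ts - range_ms, ts]."""
    result = []
    for a, b, c, ts in rows:
        # The window always contains the current row, so min() is never empty.
        window = [rb for _, rb, rc, rts in rows
                  if rc == c and ts - range_ms <= rts <= ts]
        result.append((a, sum(window), min(window)))
    return result


rows = [
    ("x", 4, "p", 0),
    ("y", 2, "p", 1_800_000),
    ("z", 9, "q", 1_800_000),
    ("w", 5, "p", 3_700_000),  # the row at ts=0 is more than one hour older
]
for out in over_range(rows):
    print(out)
```

The quadratic scan keeps the sketch short; an actual operator would keep 
per-partition state and update the aggregates incrementally.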



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
