[
https://issues.apache.org/jira/browse/FLINK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946474#comment-15946474
]
ASF GitHub Bot commented on FLINK-5655:
---------------------------------------
Github user sunjincheng121 commented on the issue:
https://github.com/apache/flink/pull/3629
Hi @fhueske, thanks a lot for your review. I have updated the PR according
to your comments.
There is one point I would like you to explain, namely:
`Also, I realized that this implementation (other OVER windows are probably
affected as well) will not discard state if the key space evolves. We should
add a JIRA to add a configuration parameter to remove state if no row was
received for a certain amount of time.`
IMO, in the event-time case we manage state in a data-driven way: each
incoming row triggers the timely cleanup of expired rows. If data arrives
continuously, expired state is removed promptly. If no data arrives for a
long time, the state does not grow. If we do not know when the next row will
arrive, I think we should not clear the state, because removing it would make
the next calculation incorrect. If the state had a TTL setting, users could
configure the TTL to clear the state cleanly. If I understand you correctly,
the configuration parameter you mentioned is such a TTL config. Is that
correct? If not, I would appreciate it if you could share your thoughts in
more detail.
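To make the two positions concrete, here is a minimal sketch in plain Python.
It is a toy model, not Flink code: the class RangeOverState, its methods, and
the idle_ms parameter are all hypothetical names, and rows are assumed to
arrive in timestamp order per key (no watermarks, no late data, no handling
of peers with equal timestamps). process() shows the data-driven cleanup
described above, and expire_idle_keys() shows what a configurable idle
timeout could look like for keys that stop receiving rows.

```python
from collections import defaultdict, deque

HOUR_MS = 60 * 60 * 1000


class RangeOverState:
    """Toy model of per-key state for an event-time OVER RANGE window.

    Not Flink's API: all names here are hypothetical, and rows are assumed
    to arrive in timestamp order per key (no watermarks, no late data).
    """

    def __init__(self, range_ms):
        self.range_ms = range_ms
        self.rows = defaultdict(deque)  # key -> deque of (timestamp, value)
        self.last_seen = {}             # key -> timestamp of the latest row

    def process(self, key, ts, value):
        """Add a row, evict rows older than ts - range_ms, return (sum, min)."""
        buf = self.rows[key]
        buf.append((ts, value))
        # Data-driven cleanup: each incoming row retires expired rows of its key.
        while buf[0][0] < ts - self.range_ms:
            buf.popleft()
        self.last_seen[key] = ts
        values = [v for _, v in buf]
        return sum(values), min(values)

    def expire_idle_keys(self, now, idle_ms):
        """Drop all state of keys that received no row for at least idle_ms."""
        idle = [k for k, seen in self.last_seen.items() if now - seen >= idle_ms]
        for k in idle:
            del self.rows[k]
            del self.last_seen[k]
        return idle


state = RangeOverState(HOUR_MS)
state.process("c1", 0, 5)
state.process("c1", 30 * 60 * 1000, 3)
# The row at t=0 is more than one hour older than this row, so it is evicted.
result = state.process("c1", 90 * 60 * 1000, 7)
```

In this model the number of buffered rows per key stays bounded by the range,
but a key that never receives another row keeps its last rows indefinitely.
Only something like expire_idle_keys() removes them, at the cost of an
incomplete result if that key does show up again within the range.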
Thanks,
SunJincheng
> Add event time OVER RANGE BETWEEN x PRECEDING aggregation to SQL
> ----------------------------------------------------------------
>
> Key: FLINK-5655
> URL: https://issues.apache.org/jira/browse/FLINK-5655
> Project: Flink
> Issue Type: Sub-task
> Components: Table API & SQL
> Reporter: Fabian Hueske
> Assignee: sunjincheng
>
> The goal of this issue is to add support for OVER RANGE aggregations on event
> time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT
> a,
> SUM(b) OVER (PARTITION BY c ORDER BY rowTime() RANGE BETWEEN INTERVAL '1'
> HOUR PRECEDING AND CURRENT ROW) AS sumB,
> MIN(b) OVER (PARTITION BY c ORDER BY rowTime() RANGE BETWEEN INTERVAL '1'
> HOUR PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single
> threaded execution).
> - The ORDER BY clause may only have rowTime() as parameter. rowTime() is a
> parameterless scalar function that just indicates event time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5658)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some
> of the restrictions are trivial to address, we can add the functionality in
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER RANGE aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with
> RexOver expression).
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)