[jira] [Commented] (FLINK-5654) Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL

ASF GitHub Bot (JIRA) Wed, 08 Mar 2017 14:47:04 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902113#comment-15902113
 ]


ASF GitHub Bot commented on FLINK-5654:
---------------------------------------

Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/3459
  
    Hi @huawei-flink, sorry for the long delay. I focused on adding support for 
retraction to the new UDAGG interface (PR #3470), support for time mode 
functions (PR #3370, #3425), and the most simple OVER window case (PR #3397) 
first. These additions should serve as a good basis for the more advanced OVER 
window cases. 
    
    I had a look this PR and noticed a few things:
    - The Table API is implemented in Scala. We have a few Java classes (code 
copied & fixed from Calcite, Java examples, and tests) but do not intend to add 
more. The core parts like operators, translation, and runtime code have to be 
implemented in Scala. Mixing languages makes the maintenance more difficult.
    - Please squash your commits. Usually, PRs are opened with a single commit 
and new commits are added when feedback is addressed. Before merging a 
committer squashes these commits. However, it is too much effort to squash more 
than 50 commits including merge commits which can cause trouble. If multiple 
contributors worked on a PR, figure out how you can separate the work into one 
commit per author.
    - We have recently reworked our aggregation interface which will also serve 
as the interface for user-defined aggregations which should also be usable in 
OVER windows. Please use these aggregation functions. When designing the 
interface we had their use in OVER windows in mind. If you find that the 
interface is lacking a method, please start a discussion. However, we cannot 
have several incompatible aggregation interfaces in the Table API / SQL. Please 
rebase to the current master and use the new aggregation functions.
    - A couple of days ago, we added `PROCTIME()` and `ROWTIME()` methods which 
should be used to identify the time mode.
    - The first OVER window aggregation should serve as a blueprint for future 
OVER window implementations. We should try to keep the implementations as close 
as possible to share common code and make the overall maintenance of the code 
easier.
    
    Minor comments:
    - Why did you check modifications to the `.gitignore` files?
    - Please do not remove tests without replacing them with equivalent tests.
    - Please do not reformat classes (see FunctionCatalog). These changes cause 
additional effort when reviewing a PR.
    - Do not change ScalaDoc comments to JavaDoc comments in Scala code. 
    ScalaDoc:
    ```
    /**
      * Comment
      */
    ```
    JavaDoc:
    ```
    /**
     * Comment
     */
    ```
    
    Thanks, Fabian



> Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL
> ---------------------------------------------------------------------
>
>                 Key: FLINK-5654
>                 URL: https://issues.apache.org/jira/browse/FLINK-5654
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Fabian Hueske
>            Assignee: radu
>
> The goal of this issue is to add support for OVER RANGE aggregations on 
> processing time streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT 
>   a, 
>   SUM(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN INTERVAL '1' 
> HOUR PRECEDING AND CURRENT ROW) AS sumB,
>   MIN(b) OVER (PARTITION BY c ORDER BY procTime() RANGE BETWEEN INTERVAL '1' 
> HOUR PRECEDING AND CURRENT ROW) AS minB
> FROM myStream
> {code}
> The following restrictions should initially apply:
> - All OVER clauses in the same SELECT clause must be exactly the same.
> - The PARTITION BY clause is optional (no partitioning results in single 
> threaded execution).
> - The ORDER BY clause may only have procTime() as parameter. procTime() is a 
> parameterless scalar function that just indicates processing time mode.
> - UNBOUNDED PRECEDING is not supported (see FLINK-5657)
> - FOLLOWING is not supported.
> The restrictions will be resolved in follow up issues. If we find that some 
> of the restrictions are trivial to address, we can add the functionality in 
> this issue as well.
> This issue includes:
> - Design of the DataStream operator to compute OVER ROW aggregates
> - Translation from Calcite's RelNode representation (LogicalProject with 
> RexOver expression).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (FLINK-5654) Add processing time OVER RANGE BETWEEN x PRECEDING aggregation to SQL

Reply via email to