[
https://issues.apache.org/jira/browse/CALCITE-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166673#comment-17166673
]
Julian Hyde commented on CALCITE-3272:
--------------------------------------
Let's not use {{CURRENT_TIMESTAMP}} for these purposes. Per the SQL standard,
it is fixed for the duration of the query. Maybe a new function,
{{WATERMARK_TIMESTAMP}}, is needed.
I don't think it makes sense to have expressions int the {{EMIT}} clause, in
general. Expressions have to be evaluated on data, the data may not have
arrived yet, and {{EMIT}} clause controls how long we wait for the data to
arrive, so there's a chicken-and-egg situation.
Let's stick to metadata-driven conditions like "EMIT AFTER WATERMARK". It's
possible that we will need to run the data through the table function more than
once, as new data arrives. Or make the table function "watermark aware", and
allow it to re-state its results as new data arrives, consume watermarks, and
emit watermarks.
> TUMBLE Table-valued Function
> ----------------------------
>
> Key: CALCITE-3272
> URL: https://issues.apache.org/jira/browse/CALCITE-3272
> Project: Calcite
> Issue Type: Sub-task
> Reporter: Rui Wang
> Assignee: Rui Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.22.0
>
> Time Spent: 19h 20m
> Remaining Estimate: 0h
>
> Define a builtin TVF: Tumble (data , timecol , dur, [ offset ])
> The return value of Tumble is a relation that includes all columns of data as
> well as additional event time columns window_start and window_end.
> Examples of TUMBLE TVF are (from https://s.apache.org/streaming-beam-sql ):
> 8:21> SELECT * FROM Bid;
> --------------------------
> | bidtime | price | item |
> --------------------------
> | 8:07 | $2 | A |
> | 8:11 | $3 | B |
> | 8:05 | $4 | C |
> | 8:09 | $5 | D |
> | 8:13 | $1 | E |
> | 8:17 | $6 | F |
> --------------------------
> 8:21> SELECT *
> FROM TABLE Tumble (
> data => TABLE Bid ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10' MINUTES ,
> offset => INTERVAL '0' MINUTES );
> ------------------------------------------
> | window_start | window_end | bidtime | price | item |
> ------------------------------------------
> | 8:00 | 8:10 | 8:07 | $2 | A |
> | 8:10 | 8:20 | 8:11 | $3 | B |
> | 8:00 | 8:10 | 8:05 | $4 | C |
> | 8:00 | 8:10 | 8:09 | $5 | D |
> | 8:10 | 8:20 | 8:13 | $1 | E |
> | 8:10 | 8:20 | 8:17 | $6 | F |
> ------------------------------------------
> 8:21> SELECT MAX ( window_start ) , window_end , SUM ( price )
> FROM TABLE Tumble (
> data => TABLE ( Bid ) ,
> timecol => DESCRIPTOR ( bidtime ) ,
> dur => INTERVAL '10 ' MINUTES )
> GROUP BY wend;
> -------------------------
> | window_start | window_end | price |
> -------------------------
> | 8:00 | 8:10 | $11 |
> | 8:10 | 8:20 | $10 |
> -------------------------
--
This message was sent by Atlassian Jira
(v8.3.4#803005)