[ 
https://issues.apache.org/jira/browse/CALCITE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169696#comment-17169696
 ] 

Rui Wang commented on CALCITE-4146:
-----------------------------------

I believe that without an EMIT specification, the default behavior is to emit 
when the window is closed, which by definition means the watermark has passed 
the end of the window.
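
To make that default concrete, here is a minimal sketch of the "window is closed" check. The names (Window, is_closed, windows_to_emit) are hypothetical and not Calcite APIs, and it assumes half-open windows [start, end) with integer event-time timestamps, so "passes the end" is read as watermark >= end.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Window:
    start: int  # inclusive event-time bound (assumption: integer millis)
    end: int    # exclusive event-time bound


def is_closed(window: Window, watermark: int) -> bool:
    # Assumption: with an exclusive end, the window is closed once the
    # watermark reaches the end, since no on-time event with
    # timestamp < end is expected any more.
    return watermark >= window.end


def windows_to_emit(windows: List[Window], watermark: int) -> List[Window]:
    # Default behavior described above: emit only the closed windows.
    return [w for w in windows if is_closed(w, watermark)]
```

With windows [0, 10) and [10, 20), a watermark of 9 emits nothing and a watermark of 10 emits only the first window.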

Regarding the types of emit strategies: due to page limits, that paper only 
lists two strategies, but Calcite should support at least four categories:
1. Event time triggers. Emitting depends on the relationship between the
watermark and the event timestamps of events. Handling late data is also
included in this category.
2. Processing time triggers. Emitting depends on the system clock. This is
a natural way of emitting, e.g. emit the current result every hour without
considering whether the data in a window is already complete.
3. Data-driven triggers. Emitting depends on the data itself, e.g. emit once
1000 events have accumulated.
4. Composite triggers. There is a need to combine 1, 2, and 3 with OR and AND
to achieve better latency control.
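
The four categories can be sketched as composable predicates. This is only an illustration under assumptions: PaneState and all function names below are hypothetical, not Calcite or Beam APIs, and time values are plain integers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PaneState:
    window_end: int     # event-time end of the window (exclusive)
    watermark: int      # current event-time watermark
    now: int            # current processing time (system clock)
    last_emit: int      # processing time of the previous emit
    element_count: int  # elements accumulated since the previous emit


Trigger = Callable[[PaneState], bool]


def after_watermark() -> Trigger:
    # 1. Event time trigger: fire once the watermark reaches the window end.
    return lambda s: s.watermark >= s.window_end


def after_processing_time(interval: int) -> Trigger:
    # 2. Processing time trigger: fire every `interval` units of wall clock.
    return lambda s: s.now - s.last_emit >= interval


def after_count(threshold: int) -> Trigger:
    # 3. Data-driven trigger: fire once enough elements have accumulated.
    return lambda s: s.element_count >= threshold


def any_of(*triggers: Trigger) -> Trigger:
    # 4a. Composite trigger: OR of the given triggers.
    return lambda s: any(t(s) for t in triggers)


def all_of(*triggers: Trigger) -> Trigger:
    # 4b. Composite trigger: AND of the given triggers.
    return lambda s: all(t(s) for t in triggers)
```

For example, any_of(after_watermark(), after_count(1000)) would fire early on a busy window without waiting for the watermark, which is the kind of latency control a composite trigger is meant to give.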


I am not familiar with Flink's concept of time/count windows, so I don't have 
any comments on it.


Lastly, the watermark is a natural concept that is bound to data sources, 
because only sources can give estimates of the completeness of data in terms 
of event timestamps. I believe the implementation will need to introduce it in 
Calcite.
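
As one illustration of a source-side estimate, the sketch below uses a bounded-delay heuristic: the source assumes events arrive at most max_delay behind the largest timestamp seen so far. This heuristic and the names BoundedDelaySource and combined_watermark are assumptions for illustration only, not something proposed in the issue or present in Calcite.

```python
from typing import List, Optional


class BoundedDelaySource:
    """Hypothetical source that estimates its own watermark."""

    def __init__(self, max_delay: int) -> None:
        self.max_delay = max_delay          # assumed bound on out-of-orderness
        self.max_ts: Optional[int] = None   # largest event timestamp seen

    def observe(self, event_ts: int) -> None:
        # Track the largest event timestamp; late events do not move it back.
        if self.max_ts is None or event_ts > self.max_ts:
            self.max_ts = event_ts

    def watermark(self) -> Optional[int]:
        # Estimate: everything older than (max_ts - max_delay) has arrived.
        if self.max_ts is None:
            return None
        return self.max_ts - self.max_delay


def combined_watermark(sources: List[BoundedDelaySource]) -> Optional[int]:
    # A query reading several sources can only advance as far as the
    # slowest one; unknown (None) if any source has no estimate yet.
    marks = [s.watermark() for s in sources]
    if not marks or any(m is None for m in marks):
        return None
    return min(marks)
```

The point of the sketch is only that the estimate lives in the source; the query side consumes the resulting watermark without knowing how it was derived.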

> Implement EMIT AFTER WATERMARK
> ------------------------------
>
>                 Key: CALCITE-4146
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4146
>             Project: Calcite
>          Issue Type: Sub-task
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>
> The goal is to support the following syntax
> {code:sql}
> SELECT clause
> FROM TUMBLE/HOP/SESSION
> EMIT AFTER WATERMARK
> {code}
> Note that "EMIT AFTER WATERMARK" is the new part.
> "EMIT AFTER WATERMARK" is proposed in [1]. The idea is to give streaming SQL 
> queries a way to control materialization latency. More specifically, it 
> means emitting the elements in a window once the watermark passes the end of 
> that window.
> More context is discussed in [2][3].
> [1]: https://arxiv.org/pdf/1905.12133.pdf
> [2]: https://issues.apache.org/jira/browse/CALCITE-3272?focusedCommentId=17166580&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17166580
> [3]: https://lists.apache.org/thread.html/r5bd9a6f7af2c0cd81aecd4de512fd889fbf15f112cc3704f188b1d4f%40%3Cdev.calcite.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
