[ 
https://issues.apache.org/jira/browse/CALCITE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170523#comment-17170523
 ] 

Rui Wang edited comment on CALCITE-4146 at 8/4/20, 3:16 AM:
------------------------------------------------------------

The second.

The "emit every 1 minute" strategy will propagate to both TableFunctionScanRel 
nodes. Each TableFunctionScanRel will emit data for each window (if there is 
any) every 1 minute. The JOIN will then be applied to data in the same window 
from both sides. 

Note that this should be the easier way to implement the EMIT syntax. If we 
instead apply EMIT strategies to every relational node (join, sort, aggregation, 
and so on), the implementation will become very complicated.
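
To make the propagation concrete, here is a minimal Python sketch of the idea, not Calcite code. It assumes 60-second tumbling windows and an equi-join on a key; the names `emit_windows`, `join_same_window`, `orders`, and `payments` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

WINDOW = 60  # assumed tumbling window size, in seconds


def window_start(ts, size=WINDOW):
    """Start of the tumbling window that contains timestamp ts."""
    return ts - ts % size


def emit_windows(events, now, size=WINDOW):
    """Simulate one periodic emit of a windowed scan.

    Groups the (ts, key, value) events seen so far (ts <= now) by window start.
    """
    panes = defaultdict(list)
    for ts, key, value in events:
        if ts <= now:
            panes[window_start(ts, size)].append((key, value))
    return dict(panes)


def join_same_window(left, right):
    """Join rows on key, but only rows that fall in the same window."""
    out = []
    for w in sorted(left.keys() & right.keys()):
        for lk, lv in left[w]:
            for rk, rv in right[w]:
                if lk == rk:
                    out.append((w, lk, lv, rv))
    return out


# Hypothetical input streams: (event timestamp, key, value).
orders = [(5, "a", 10), (70, "a", 20), (75, "b", 30)]
payments = [(10, "a", "paid"), (80, "b", "pending"), (130, "a", "paid")]


def emit_and_join(now):
    """Both scans emit at the same tick; the join runs on their output."""
    return join_same_window(emit_windows(orders, now),
                            emit_windows(payments, now))
```

Here the emit strategy lives only in the two scans. The join has no trigger logic of its own and simply matches whatever both sides have emitted for the same window.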



[~julianhyde] I converted this JIRA to an umbrella issue to host high-level 
discussions. Smaller tasks will be created as sub-tasks.  



> Implement EMIT Syntax
> ---------------------
>
>                 Key: CALCITE-4146
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4146
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>
> The goal is to support the following syntax:
> {code:sql}
> SELECT clause
> FROM TUMBLE/HOP/SESSION
> [EMIT] 
> {code}
> The EMIT syntax is proposed in [One SQL to Rule Them 
> All|https://arxiv.org/pdf/1905.12133.pdf]. It gives streaming SQL queries a 
> way to control materialization latency.
> Regarding the types of emit strategies: due to page limits, the paper lists 
> only two strategies, but Calcite should support at least four categories:
> 1. Event time triggers. Emitting depends on the relationship between the
> watermark and the event timestamps of events. Handling late data is also
> included in this category.
> 2. Processing time triggers. Emitting depends on the system clock. This is
> the most intuitive way to emit, e.g. emit the current result every hour
> without considering whether the data in a window is already complete.
> 3. Data-driven triggers. Emitting depends on the data itself, e.g. emit
> once 1000 events have accumulated.
> 4. Composite triggers. Triggers from categories 1, 2, and 3 need to be
> combined with OR and AND to achieve better latency control.
> More context is discussed in 
> [CALCITE-3272|https://issues.apache.org/jira/browse/CALCITE-3272?focusedCommentId=17166580&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17166580]
>  and the [EMIT syntax proposal for event-timestamp semantic 
> windowing|https://lists.apache.org/thread.html/r5bd9a6f7af2c0cd81aecd4de512fd889fbf15f112cc3704f188b1d4f%40%3Cdev.calcite.apache.org%3E]
>  email thread.
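
The four trigger categories in the description can be sketched as composable predicates. This is only an illustrative Python model under stated assumptions, not a proposed Calcite API: `WindowState` and the helpers `after_watermark`, `every`, `after_count`, `any_of`, and `all_of` are hypothetical names, and late-data handling is not modeled.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class WindowState:
    """Hypothetical snapshot of one window at the moment a trigger is checked."""
    window_end: int  # event-time end of the window
    watermark: int   # current event-time watermark
    now: int         # current processing time
    last_emit: int   # processing time of the previous emit
    count: int       # events accumulated since the previous emit


Trigger = Callable[[WindowState], bool]


def after_watermark() -> Trigger:
    """1. Event time trigger: fire once the watermark passes the window end."""
    return lambda s: s.watermark >= s.window_end


def every(period: int) -> Trigger:
    """2. Processing time trigger: fire when `period` has elapsed on the clock."""
    return lambda s: s.now - s.last_emit >= period


def after_count(n: int) -> Trigger:
    """3. Data-driven trigger: fire once at least n events have accumulated."""
    return lambda s: s.count >= n


def any_of(*triggers: Trigger) -> Trigger:
    """4. Composite trigger: OR of the given triggers."""
    return lambda s: any(t(s) for t in triggers)


def all_of(*triggers: Trigger) -> Trigger:
    """4. Composite trigger: AND of the given triggers."""
    return lambda s: all(t(s) for t in triggers)


# Emit when the window is complete, OR when an hour has passed AND
# at least 1000 events have accumulated.
trigger = any_of(after_watermark(), all_of(every(3600), after_count(1000)))
```

Treating each strategy as a predicate over window state keeps the composite case simple, because AND and OR become ordinary boolean combinators.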



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
