Hi, fellows, long time no see on the mailing list ~

I'd like to start a discussion on the join syntax for our recently introduced
window table functions ~

For example, we can define a tumbling window function of 5 minutes size as:

Tumble(table T, descriptor(T.ts), INTERVAL '5' MINUTE)
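
To make the window assignment concrete, here is a minimal Python sketch (not
Flink code) of which 5-minute window a given timestamp falls into. The function
name tumble_window and the alignment of window starts to the Unix epoch are
assumptions made for illustration only.

```python
from datetime import datetime, timedelta
from typing import Tuple

EPOCH = datetime(1970, 1, 1)

def tumble_window(ts: datetime, size: timedelta) -> Tuple[datetime, datetime]:
    """Return the [window_start, window_end) tumbling window that contains ts."""
    # Assumption: windows are aligned to multiples of `size` counted from the epoch.
    start = EPOCH + ((ts - EPOCH) // size) * size
    return start, start + size

# A row with ts = 10:07:30 lands in the window [10:05:00, 10:10:00).
print(tumble_window(datetime(2020, 1, 1, 10, 7, 30), timedelta(minutes=5)))
```

Every row gets exactly one window, and window_start / window_end are the two
extra columns that the join condition later refers to.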

Then we can select from it. Moreover, I want to support joining two window
functions in streaming queries.

The semantics of the windowed stream join is:

• The 2 window inputs should have the same window arguments (except for the 
table name), e.g. for TUMBLE the size should be equal; for HOP, both the slide 
interval and the size should be equal
• We first window each input stream, then join the two windowed data sets that 
belong to the same TimeWindow
• The join is triggered by the watermark of the stream
• The join does not produce retractions, which is the main difference from a 
normal two-stream join
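
The semantics above can be modelled with a small Python sketch. This is a toy
model, not the Flink operator: the class WindowJoin, its method names, and the
rule "a window fires once the watermark reaches window_end" are assumptions for
illustration, and late data is not handled at all.

```python
from collections import defaultdict

SIZE_MS = 5 * 60 * 1000  # 5-minute tumbling window, identical on both inputs

def window_of(ts_ms, size_ms=SIZE_MS):
    """Return (window_start, window_end) for a timestamp in milliseconds."""
    start = ts_ms - ts_ms % size_ms
    return (start, start + size_ms)

class WindowJoin:
    """Buffer rows per window; join a window once the watermark passes its end."""

    def __init__(self):
        self.left = defaultdict(list)   # window -> rows from the left input
        self.right = defaultdict(list)  # window -> rows from the right input

    def add_left(self, ts_ms, key, value):
        self.left[window_of(ts_ms)].append((key, value))

    def add_right(self, ts_ms, key, value):
        self.right[window_of(ts_ms)].append((key, value))

    def on_watermark(self, watermark_ms):
        """Emit inner-join rows for every window with window_end <= watermark."""
        out = []
        windows = set(self.left) | set(self.right)
        for w in sorted(x for x in windows if x[1] <= watermark_ms):
            # The window is complete: join it once, then drop its state.
            lrows, rrows = self.left.pop(w, []), self.right.pop(w, [])
            for lk, lv in lrows:
                for rk, rv in rrows:
                    if lk == rk:
                        out.append((lk, lv, rv, w[0], w[1]))
        return out

j = WindowJoin()
j.add_left(60_000, "a", "l1")    # window [0, 300000)
j.add_right(120_000, "a", "r1")  # same window, same key
j.add_right(310_000, "a", "r2")  # next window [300000, 600000)
print(j.on_watermark(299_999))   # []
print(j.on_watermark(300_000))   # [('a', 'l1', 'r1', 0, 300000)]
```

Because each window is joined exactly once, after it is complete, every output
row is final. That is why no retractions are needed, unlike a regular
two-stream join that must update earlier results when new rows arrive.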

And I want to propose a join syntax as:

SELECT L.f0, R.f2, L.window_start, L.window_end
FROM
Tumble(table T1, descriptor(T1.ts), INTERVAL '5' MINUTE) L
JOIN
Tumble(table T2, descriptor(T2.ts), INTERVAL '5' MINUTE) R
ON
L.f0 = R.f0 AND L.window_start = R.window_start AND L.window_end = R.window_end

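The window equality part of the ON condition (L.window_start = R.window_start 
AND L.window_end = R.window_end, marked in red in the original message) is 
what I want to discuss. The condition seems too verbose because users need to 
declare it every time.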


• Should we make it optional?
• Is there a better syntax to describe these window join semantics?


Best,
Danny Chan
