>Is it reasonable to apply TUMBLE to TUMBLE? If so, would people > generally want two sets of window_start, window_end columns?
I think it is reasonable to apply TUMBLE to TUMBLE or even TUMBLE to HOP join, as long as there is a real requirement there. The window starts/ends are not duplicates. For example TUMBLE as L JOIN HOP as R, L offers a window start and a window end, same for R. This is no different from a normal JOIN case where both JOIN sides have the same column names (but they are not considered duplicates). The SQL rule is still applying: within a scope there shouldn't be ambiguous column names (e.g. duplicated column name). For JOIN duplicate names from JOIN inputs are differentiated by table alias. Regarding https://issues.apache.org/jira/browse/CALCITE-4274, this is an interesting case that is different from the JOIN case, and I also think this is a general case (not limited to TUMBLE). Think about that for any query that uses table function of the pattern in CALCITE-4274. The first table function generates column A and then it becomes the input for the second table function, which also wants to append a column named "A". How should Calcite handle this case? -Rui On Wed, Sep 23, 2020 at 9:13 AM Julian Hyde <[email protected]> wrote: > I think we should also discuss > https://issues.apache.org/jira/browse/CALCITE-4274 here. > > We've never discussed what should happen if you apply TUMBLE to TUMBLE > (or TUMBLE to HOP, etc.). What happens now is that you get duplicate > columns. > > Is it reasonable to apply TUMBLE to TUMBLE? If so, would people > generally want two sets of window_start, window_end columns? > > Julian > > On Wed, Sep 23, 2020 at 2:41 AM Danny Chan <[email protected]> wrote: > > > > Thanks for the feedback, I agree we should keep the verbose part > > > > **L.window_start = R.window_start AND L.window_end =R.window_end** > > > > Which would make the semantic more clear ~ > > > > Best, > > Danny Chan > > 在 2020年9月23日 +0800 PM3:24,Viliam Durina <[email protected]>,写道: > > > You can also use > > > > > > SELECT L.f0, R.f2, L.window_start, L.window_end > > > FROM > > > Tumble(table T1, descriptor(T1.ts), INTERVAL ‘5’ MINUTE) L > > > JOIN > > > Tumble(table T2, descriptor(T2.ts), INTERVAL ‘5’ MINUTE) R > > > USING (f0, window_start) > > > > > > Viliam > > > > > > On Wed, 23 Sep 2020 at 08:02, Rui Wang <[email protected]> wrote: > > > > > > > Regarding to **L.window_start = R.window_start AND L.window_end = > > > > R.window_end**: > > > > > > > > In general, the current table function windowing model is to append > window > > > > metadata to table directly, thus window metadata becomes a part of > table > > > > (or call it data). So as a part of table, these two columns should be > > > > treated as normal columns thus they should be in the join on > condition. > > > > > > > > If you want to make it optional, it makes window start/end columns > special > > > > and has a semantic binding with special table functions (TUMBLE, HOP, > > > > SESSION), which then becomes really not a SQL thing. For example, we > can > > > > allow users to define their own windowing table function. In that > case, how > > > > will you utilize window start/end produced by a customized windowing > table > > > > function? What if users produce wired windows that have overlapped > window > > > > starts or window ends? > > > > > > > > Keeping windows start/end as a part of the table, treating them no > > > > different from other columns, could give a consistent behavior for > either > > > > built-in table function or user-defined table function. > > > > > > > > If you think it is too verbose, there are two options to optimize: > > > > > > > > 1. for TUMBLE/HOP/SESSION, to identify a unique window, you will > only need > > > > either window start or end, so you can simplify it, for example, to > > > > L.window_start = R.window_start only. > > > > 2. (not recommended), you can cut off **L.window_start = > R.window_start AND > > > > L.window_end = R.window_end**, but add window metadata comparison to > join > > > > implicitly by execution engine. E.g. you can make up the join > condition in > > > > your JoinRel if two inputs are TUMBLE. > > > > > > > > > > > > > > > > -Rui > > > > > > > > > > > > > > > > > > > > On Tue, Sep 22, 2020 at 10:27 PM Danny Chan <[email protected]> > wrote: > > > > > > > > > Yes, the red part is **L.window_start = R.window_start AND > L.window_end = > > > > > R.window_end** > > > > > > > > > > > Is this a limitation for "triggered by the watermark of the > stream”? > > > > > > > > > > No, because in most of the cases, there is no need to output the > > > > > intermediate/partial join records then send retractions. > > > > > > > > > > > > > > > So, how do you think about the condition syntax **L.window_start = > > > > > R.window_start AND L.window_end = R.window_end** ? > > > > > > > > > > Best, > > > > > Danny Chan > > > > > 在 2020年9月23日 +0800 PM12:47,[email protected],写道: > > > > > > > > > > > > L.window_start = R.window_start AND L.window_end = R.window_end > > > > > > > > > > > > > > > > > > -- > > > Viliam Durina > > > Jet Developer > > > hazelcast® > > > > > > <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo, CA > 94402 | > > > USA > > > +1 (650) 521-5453 <(650)%20521-5453> | hazelcast.com < > https://www.hazelcast.com> > > > > > > -- > > > This message contains confidential information and is intended only > for the > > > individuals named. If you are not the named addressee you should not > > > disseminate, distribute or copy this e-mail. Please notify the sender > > > immediately by e-mail if you have received this e-mail by mistake and > > > delete this e-mail from your system. E-mail transmission cannot be > > > guaranteed to be secure or error-free as information could be > intercepted, > > > corrupted, lost, destroyed, arrive late or incomplete, or contain > viruses. > > > The sender therefore does not accept liability for any errors or > omissions > > > in the contents of this message, which arise as a result of e-mail > > > transmission. If verification is required, please request a hard-copy > > > version. -Hazelcast >
