>Is it reasonable to apply TUMBLE to TUMBLE? If so, would people
> generally want two sets of window_start, window_end columns?

I think it is reasonable to apply TUMBLE to TUMBLE or even TUMBLE to HOP
join, as long as there is a real requirement there. The window starts/ends
are not duplicates. For example TUMBLE as L JOIN HOP as R, L offers a
window start and a window end, same for R. This is no different from a
normal JOIN case where both JOIN sides have the same column names (but they
are not considered duplicates). The SQL rule is still applying: within a
scope there shouldn't be ambiguous column names (e.g. duplicated column
name). For JOIN duplicate names from JOIN inputs are differentiated by
table alias.


Regarding https://issues.apache.org/jira/browse/CALCITE-4274, this is an
interesting case that is different from the JOIN case, and I also think
this is a general case (not limited to TUMBLE).

Think about that for any query that uses table function of the pattern
in CALCITE-4274. The first table function generates column A and then it
becomes the input for the second table function, which also wants to append
a column named "A". How should Calcite handle this case?


-Rui



On Wed, Sep 23, 2020 at 9:13 AM Julian Hyde <[email protected]> wrote:

> I think we should also discuss
> https://issues.apache.org/jira/browse/CALCITE-4274 here.
>
> We've never discussed what should happen if you apply TUMBLE to TUMBLE
> (or TUMBLE to HOP, etc.). What happens now is that you get duplicate
> columns.
>
> Is it reasonable to apply TUMBLE to TUMBLE? If so, would people
> generally want two sets of window_start, window_end columns?
>
> Julian
>
> On Wed, Sep 23, 2020 at 2:41 AM Danny Chan <[email protected]> wrote:
> >
> > Thanks for the feedback, I agree we should keep the verbose part
> >
> > **L.window_start = R.window_start AND L.window_end =R.window_end**
> >
> > Which would make the semantic more clear ~
> >
> > Best,
> > Danny Chan
> > 在 2020年9月23日 +0800 PM3:24,Viliam Durina <[email protected]>,写道:
> > > You can also use
> > >
> > > SELECT L.f0, R.f2, L.window_start, L.window_end
> > > FROM
> > > Tumble(table T1, descriptor(T1.ts), INTERVAL ‘5’ MINUTE) L
> > > JOIN
> > > Tumble(table T2, descriptor(T2.ts), INTERVAL ‘5’ MINUTE) R
> > > USING (f0, window_start)
> > >
> > > Viliam
> > >
> > > On Wed, 23 Sep 2020 at 08:02, Rui Wang <[email protected]> wrote:
> > >
> > > > Regarding to **L.window_start = R.window_start AND L.window_end =
> > > > R.window_end**:
> > > >
> > > > In general, the current table function windowing model is to append
> window
> > > > metadata to table directly, thus window metadata becomes a part of
> table
> > > > (or call it data). So as a part of table, these two columns should be
> > > > treated as normal columns thus they should be in the join on
> condition.
> > > >
> > > > If you want to make it optional, it makes window start/end columns
> special
> > > > and has a semantic binding with special table functions (TUMBLE, HOP,
> > > > SESSION), which then becomes really not a SQL thing. For example, we
> can
> > > > allow users to define their own windowing table function. In that
> case, how
> > > > will you utilize window start/end produced by a customized windowing
> table
> > > > function? What if users produce wired windows that have overlapped
> window
> > > > starts or window ends?
> > > >
> > > > Keeping windows start/end as a part of the table, treating them no
> > > > different from other columns, could give a consistent behavior for
> either
> > > > built-in table function or user-defined table function.
> > > >
> > > > If you think it is too verbose, there are two options to optimize:
> > > >
> > > > 1. for TUMBLE/HOP/SESSION, to identify a unique window, you will
> only need
> > > > either window start or end, so you can simplify it, for example, to
> > > > L.window_start = R.window_start only.
> > > > 2. (not recommended), you can cut off **L.window_start =
> R.window_start AND
> > > > L.window_end = R.window_end**, but add window metadata comparison to
> join
> > > > implicitly by execution engine. E.g. you can make up the join
> condition in
> > > > your JoinRel if two inputs are TUMBLE.
> > > >
> > > >
> > > >
> > > > -Rui
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Sep 22, 2020 at 10:27 PM Danny Chan <[email protected]>
> wrote:
> > > >
> > > > > Yes, the red part is **L.window_start = R.window_start AND
> L.window_end =
> > > > > R.window_end**
> > > > >
> > > > > > Is this a limitation for "triggered by the watermark of the
> stream”?
> > > > >
> > > > > No, because in most of the cases, there is no need to output the
> > > > > intermediate/partial join records then send retractions.
> > > > >
> > > > >
> > > > > So, how do you think about the condition syntax **L.window_start =
> > > > > R.window_start AND L.window_end = R.window_end** ?
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > > > 在 2020年9月23日 +0800 PM12:47,[email protected],写道:
> > > > > >
> > > > > > L.window_start = R.window_start AND L.window_end = R.window_end
> > > > >
> > > >
> > >
> > >
> > > --
> > > Viliam Durina
> > > Jet Developer
> > > hazelcast®
> > >
> > > <https://www.hazelcast.com> 2 W 5th Ave, Ste 300 | San Mateo, CA
> 94402 |
> > > USA
> > > +1 (650) 521-5453 <(650)%20521-5453> | hazelcast.com <
> https://www.hazelcast.com>
> > >
> > > --
> > > This message contains confidential information and is intended only
> for the
> > > individuals named. If you are not the named addressee you should not
> > > disseminate, distribute or copy this e-mail. Please notify the sender
> > > immediately by e-mail if you have received this e-mail by mistake and
> > > delete this e-mail from your system. E-mail transmission cannot be
> > > guaranteed to be secure or error-free as information could be
> intercepted,
> > > corrupted, lost, destroyed, arrive late or incomplete, or contain
> viruses.
> > > The sender therefore does not accept liability for any errors or
> omissions
> > > in the contents of this message, which arise as a result of e-mail
> > > transmission. If verification is required, please request a hard-copy
> > > version. -Hazelcast
>

Reply via email to