Many compute engines do not use Calcite's EnumerableXXX operators and only use
the logical nodes for planning. After all, the Enumerable implementations are
specific to Calcite. I still think Calcite needs to give a more precise
definition of what an equi-join is.
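
To make the two readings in the quoted discussion concrete, here is a minimal
Java sketch. It is a toy model under stated assumptions, not Calcite code: the
Expr and Keys classes and the analyze and pushIntoProjection methods are
hypothetical stand-ins invented for illustration, and field indices are local
to each join input. It shows that a strict "field = field" analysis rejects
cast(A.a as int) = B.b, and that moving the cast into a projection under the
join turns the same condition into plain key lists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical stand-ins, not Calcite's RexNode or EquiJoin API.
class EquiJoinSketch {
  /** One operand of "left = right": a plain field reference or a call such as CAST. */
  static final class Expr {
    final String callName; // null for a plain field reference
    final int field;       // index of the referenced field, local to its input
    Expr(String callName, int field) { this.callName = callName; this.field = field; }
    static Expr ref(int field) { return new Expr(null, field); }
    static Expr call(String name, int field) { return new Expr(name, field); }
    boolean isRef() { return callName == null; }
  }

  /** Analysis result, in the spirit of the leftKeys / rightKeys fields. */
  static final class Keys {
    final List<Integer> leftKeys = new ArrayList<>();
    final List<Integer> rightKeys = new ArrayList<>();
    boolean isEqui = true;
  }

  /** Strict reading: each pair is {left operand, right operand} of one equality. */
  static Keys analyze(List<Expr[]> equalities) {
    Keys keys = new Keys();
    for (Expr[] eq : equalities) {
      if (eq[0].isRef() && eq[1].isRef()) {
        keys.leftKeys.add(eq[0].field);
        keys.rightKeys.add(eq[1].field);
      } else {
        keys.isEqui = false; // a call has no field index that could serve as a key
      }
    }
    return keys;
  }

  /** Workaround: append a call to the left projection and refer to the new field. */
  static Expr pushIntoProjection(Expr operand, List<Expr> leftProjection) {
    if (operand.isRef()) {
      return operand;
    }
    leftProjection.add(operand);
    return Expr.ref(leftProjection.size() - 1);
  }

  public static void main(String[] args) {
    // Option 2: the cast stays inside the join condition.
    List<Expr[]> cond = new ArrayList<>();
    cond.add(new Expr[] {Expr.call("CAST_INT", 0), Expr.ref(0)});
    System.out.println("option 2 equi? " + analyze(cond).isEqui);

    // Option 1: the left input becomes Project($0=a, $1=cast($0 as int)).
    List<Expr> leftProjection = new ArrayList<>();
    leftProjection.add(Expr.ref(0));
    List<Expr[]> rewritten = new ArrayList<>();
    rewritten.add(new Expr[] {
        pushIntoProjection(cond.get(0)[0], leftProjection), cond.get(0)[1]});
    Keys keys = analyze(rewritten);
    System.out.println("option 1 equi? " + keys.isEqui
        + " leftKeys=" + keys.leftKeys + " rightKeys=" + keys.rightKeys);
  }
}
```

In this toy model the right key is 0 because B.b is the first field of the
right input, whereas the join condition in the plans quoted below numbers
fields across both inputs and therefore shows it as $2.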

Best,
Danny Chan
On April 16, 2019 at 12:19 AM +0800, Ruben Q L <[email protected]> wrote:
> Danny,
> I have seen the full picture and I have actually changed my mind:
>
> If I am not mistaken, the current way to make your example (and mine) work
> as an EquiJoin is to use intermediate projections (so that the RexCall
> / RexFieldAccess "becomes" a RexInputRef):
>
> Select A.a, B.b from A join B on cast(A.a as int) = B.b
>
> => option 1 (analyzed as equijoin)
> Project($0, $2)
>   Join(condition: $1 = $2) -- i.e. cast(A.a as int) = B.b
>     Project($0=a; $1=cast($0 as int))
>       Scan(A)
>     Scan(B)
>
> => option 2 (analyzed as non-equijoin)
> Project($0, $1)
>   Join(condition: cast($0 as int) = $1) -- i.e. cast(A.a as int) = B.b
>     Scan(A)
>     Scan(B)
>
> It might seem "wrong", but the thing is, the Enumerable implementations
> that extend EquiJoin (i.e. EnumerableJoin, EnumerableMergeJoin,
> EnumerableSemiJoin) are based on the EquiJoin fields:
> public final ImmutableIntList leftKeys;
> public final ImmutableIntList rightKeys;
>
> They rely on the fact that these fields represent an equality on the
> leftKeys and rightKeys field indices, and that we can directly generate
> accessors for these fields without any extra computation (i.e. without any
> extra call). That's the reason why EquiJoin cannot support RexCall
> / RexFieldAccess: they cannot be translated to a key (i.e. to a field
> index).
>
> With this situation, we could improve this logic to support more complex
> equijoin conditions; but I think this will not be worth it, because the
> alternative is quite simple: add a projection for the RexCall
> / RexFieldAccess and keep the existing (simple) logic.
>
> For this reason, I think we should stick to the current logic (*an equi-join
> is "field = field", not "expression = field"*), and I should abandon and
> close https://issues.apache.org/jira/browse/CALCITE-2898
>
> Best,
> Ruben
>
>
> On Mon, Apr 15, 2019 at 14:13, Yuzhao Chen <[email protected]> wrote:
>
> > Thanks Ruben, the issue really answers my questions. I encountered this
> > while working on CALCITE-2969. While refactoring SemiJoinRule, I came to
> > think that not only RexFieldAccess but any RexCall should fit this case,
> > as long as the RexCall function is deterministic. What do you think?
> >
> > Best,
> > Danny Chan
> > On April 15, 2019 at 7:48 PM +0800, Ruben Q L <[email protected]> wrote:
> > > Danny,
> > > In the context of https://issues.apache.org/jira/browse/CALCITE-2898, a
> > > discussion about this topic was started. In that ticket I pointed out that
> > > Calcite does not recognize "RexFieldAccess = RexInputRef" as an EquiJoin
> > > condition (even though the RexFieldAccess itself is referencing a
> > > RexInputRef), which is somewhat similar to the situation that you propose,
> > > "RexCall = RexInputRef". According to Julian Hyde's comment on that ticket:
> > > *'For our purposes, an equi-join is "field = field", not "expression = field".
> > > Even if that expression is a reference to sub-field'.* However, I agree
> > > with you and maybe this definition should be reviewed (I believe your
> > > example and my example should be valid cases of EquiJoin), but possibly
> > > this will break some pieces of the current code, so the modification might
> > > not be straightforward.
> > >
> > > Best,
> > > Ruben
> > >
> > >
> > > On Mon, Apr 15, 2019 at 13:25, Xiening Dai <[email protected]> wrote:
> > >
> > > > I think Calcite always pushes down equi-join conditions. In
> > > > SqlToRelConverter.createJoin(), before it returns, it calls
> > > > RelOptUtil.pushDownJoinConditions(). So in your example, the cast
> > > > expression will be pushed down and it will still be an equi-join.
> > > >
> > > > > On Apr 15, 2019, at 5:40 PM, Yuzhao Chen <[email protected]> wrote:
> > > > >
> > > > > If we check the Javadoc for Calcite's EquiJoin, it gives this
> > > > > definition:
> > > > > > for any join whose condition is based on column equality
> > > > >
> > > > > But what if there are function calls in the operands of the equi
> > > > > condition? For example, should we consider
> > > > >
> > > > > Select A.a, B.b from A join B on cast(A.a as int) = B.b
> > > > >
> > > > > as an equi join ?
> > > > >
> > > > > Calcite currently thinks it is not, which I think loses some
> > > > > opportunities for improving the SQL plan, e.g. join condition
> > > > > push-down.
> > > > >
> > > > > Best,
> > > > > Danny Chan
> > > >
> > > >
> >
