Danny,
I have seen the full picture and I have actually changed my mind:
If I am not mistaken, currently the way to make your example (and mine)
work as an EquiJoin is to use intermediate projections (so that the RexCall
/ RexFieldAccess "becomes" a RexInputRef):
Select A.a, B.b from A join B on cast(A.a as int) = B.b
=> option 1 (analyzed as equijoin)
Project($0, $2)
  Join(condition: $1 = $2) -- i.e. cast(A.a as int) = B.b
    Project($0=a; $1=cast($0 as int))
      Scan(A)
    Scan(B)
=> option 2 (analyzed as non-equijoin)
Project($0, $1)
  Join(condition: cast($0 as int) = $1) -- i.e. cast(A.a as int) = B.b
    Scan(A)
    Scan(B)
It might seem "wrong", but the thing is, the Enumerable implementations
that extend EquiJoin (i.e. EnumerableJoin, EnumerableMergeJoin,
EnumerableSemiJoin) are based on the EquiJoin fields:
public final ImmutableIntList leftKeys;
public final ImmutableIntList rightKeys;
They rely on the fact that these fields represent an equality between the
leftKeys and rightKeys field indices, and that we can directly generate
accessors for these fields without any extra computation (i.e. without any
extra call). That is the reason why EquiJoin cannot support RexCall
/ RexFieldAccess: they cannot be translated to a key (i.e. to a
field index).
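To illustrate the point, here is a minimal sketch in plain Java. It is not
Calcite code: the class KeyIndexJoinSketch and its methods are hypothetical
names made up for this mail, and SQL NULL semantics are ignored for brevity.
It shows a hash join driven only by key indices, where the key is obtained by
a plain field lookup, so there is nowhere to evaluate an expression such as a
cast:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Calcite implementation.
public class KeyIndexJoinSketch {
  /** Builds a join key by reading fields at the given indices; nothing is computed. */
  static List<Object> key(Object[] row, int[] keys) {
    List<Object> k = new ArrayList<>();
    for (int i : keys) {
      k.add(row[i]);
    }
    return k;
  }

  /** Inner hash join on left[leftKeys[i]] = right[rightKeys[i]] for every i. */
  static List<Object[]> hashJoin(List<Object[]> left, List<Object[]> right,
      int[] leftKeys, int[] rightKeys) {
    // Index the right input by its key fields.
    Map<List<Object>, List<Object[]>> index = new HashMap<>();
    for (Object[] r : right) {
      index.computeIfAbsent(key(r, rightKeys), k -> new ArrayList<>()).add(r);
    }
    // Probe with the left input and concatenate matching rows.
    List<Object[]> out = new ArrayList<>();
    for (Object[] l : left) {
      List<Object[]> matches =
          index.getOrDefault(key(l, leftKeys), Collections.emptyList());
      for (Object[] r : matches) {
        Object[] joined = Arrays.copyOf(l, l.length + r.length);
        System.arraycopy(r, 0, joined, l.length, r.length);
        out.add(joined);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Object[]> left = Arrays.asList(
        new Object[] {"x", 1}, new Object[] {"y", 2});
    List<Object[]> right = Arrays.asList(
        new Object[] {2, "p"}, new Object[] {3, "r"});
    // Join on left field 1 = right field 0.
    for (Object[] row : hashJoin(left, right, new int[] {1}, new int[] {0})) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```

Supporting "expression = field" in such an operator would mean replacing the
int[] keys with something that can be evaluated per row, which is the extra
complexity discussed here.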
Given this situation, we could extend this logic to support more complex
equijoin conditions, but I think it would not be worth it, because the
alternative is quite simple: add a projection for the RexCall
/ RexFieldAccess and keep the existing (simple) logic.
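As a rough sketch of that alternative (again plain Java with hypothetical
names, not Calcite code, and assuming A.a is a string column holding valid
integers), the cast is evaluated once in a projection below the join, so the
join itself only compares fields:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "option 1": Project, then a field = field join.
public class ProjectThenJoinSketch {
  /** Project($0=a; $1=cast($0 as int)): the only place the expression runs. */
  static List<Object[]> projectA(List<String> a) {
    List<Object[]> out = new ArrayList<>();
    for (String v : a) {
      out.add(new Object[] {v, Integer.valueOf(v.trim())});
    }
    return out;
  }

  /** Select A.a, B.b from A join B on cast(A.a as int) = B.b */
  static List<Object[]> query(List<String> a, List<Integer> b) {
    // Index B on its single field (right key index 0).
    Map<Object, List<Integer>> index = new HashMap<>();
    for (Integer v : b) {
      index.computeIfAbsent(v, k -> new ArrayList<>()).add(v);
    }
    List<Object[]> out = new ArrayList<>();
    for (Object[] row : projectA(a)) {
      // Join(condition: $1 = $2): left key index 1, plain field lookup.
      List<Integer> matches = index.get(row[1]);
      if (matches == null) {
        continue;
      }
      for (Integer m : matches) {
        // Final Project($0, $2): keep A.a and B.b, drop the helper column.
        out.add(new Object[] {row[0], m});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<Object[]> rows =
        query(Arrays.asList("1", "2", "7"), Arrays.asList(2, 7, 9));
    for (Object[] row : rows) {
      System.out.println(Arrays.toString(row));
    }
  }
}
```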
For this reason, I think we should stick to the current logic, *an equi-join
is "field = field", not "expression = field"*, and I should abandon and
close https://issues.apache.org/jira/browse/CALCITE-2898
Best,
Ruben
On Mon, Apr 15, 2019 at 14:13, Yuzhao Chen <[email protected]> wrote:
> Thanks Ruben, the issue really answers my questions. I encountered this while
> working on CALCITE-2969: when refactoring SemiJoinRule, I came to think that
> not only RexFieldAccess but any RexCall should fit this case, provided that
> the RexCall function is deterministic. What do you think?
>
> Best,
> Danny Chan
> On Apr 15, 2019, at 7:48 PM +0800, Ruben Q L <[email protected]> wrote:
> > Danny,
> > In the context of https://issues.apache.org/jira/browse/CALCITE-2898, a
> > discussion about this topic was started. In that ticket I pointed out that
> > Calcite does not recognize "RexFieldAccess = RexInputRef" as an EquiJoin
> > condition (even though the RexFieldAccess itself is referencing a
> > RexInputRef), which is somewhat similar to the situation that you propose,
> > "RexCall = RexInputRef". According to Julian Hyde's comment on that
> > ticket: *'For our purposes, an equi-join is "field = field", not
> > "expression = field". Even if that expression is a reference to
> > sub-field'.* However, I agree with you, and maybe this definition should
> > be reviewed (I believe your example and my example should be valid cases
> > of EquiJoin), but this would possibly break some pieces of the current
> > code, so the modification might not be straightforward.
> >
> > Best,
> > Ruben
> >
> >
> > On Mon, Apr 15, 2019 at 13:25, Xiening Dai <[email protected]> wrote:
> >
> > > I think Calcite always pushes down equal join conditions. In
> > > SqlToRelConverter.createJoin(), before the function returns, it calls
> > > RelOptUtil.pushDownJoinConditions(). So in your example, the cast
> > > expression will be pushed down and it will still be an equal join.
> > >
> > > > On Apr 15, 2019, at 5:40 PM, Yuzhao Chen <[email protected]> wrote:
> > > >
> > > > If we check the Javadoc for Calcite's EquiJoin, there is a definition for it:
> > > > > for any join whose condition is based on column equality
> > > >
> > > > But what if there are function calls in the operands of the equi
> > > > condition? For example, should we consider
> > > >
> > > > Select A.a, B.b from A join B on cast(A.a as int) = B.b
> > > >
> > > > as an equi join ?
> > > >
> > > > Currently Calcite thinks it is not, which I think loses some
> > > > opportunities for improving the SQL plan, e.g. join condition push-down.
> > > >
> > > > Best,
> > > > Danny Chan
> > >
> > >
>