[
https://issues.apache.org/jira/browse/CALCITE-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790390#comment-16790390
]
Ruben Quesada Lopez edited comment on CALCITE-2898 at 3/12/19 9:49 AM:
-----------------------------------------------------------------------
Agree, probably the equijoin assertion needs to be added to some other places.
But considering the current discussion, maybe I'm missing something, but IMHO
the current verification does not make much sense regarding RexFieldAccess,
because the same expression will be / will not be considered as an equijoin,
the "trick" will be if we previously perform a projection of the
"expression.field" to get a RexInputRef for it, e.g.:
{code}
SELECT a1.* from authors a1 WHERE a1.birthPlace.country IN
(SELECT a2.birthPlace.country FROM authors a2
WHERE ...)
{code}
This semijoin will NOT work (condition not considered as equijoin condition):
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
RexNode condition = relBuilder.equals(
relBuilder.field(relBuilder.field(2, 0, "birthPlace"), "country"),
relBuilder.field(relBuilder.field(2, 1, "birthPlace"), "country"));
relBuilder.semiJoin(condition); // NOT considered as equijoin
{code}
But this semijoin, on the same fields will work, just because we have performed
a projection to be able to access 'country' via an "artificial" RexInputRef
instead of the "default" RexFieldAccess.RexInputRef:
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a1", "birthPlace"),
"country"));
// a1.birthPlace.country projected as a1's field 4
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a2", "birthPlace"),
"country"));
// a2.birthPlace.country projected as a2's field 4
RexNode condition = relBuilder.equals(
relBuilder.field(2, 0, 4),
relBuilder.field(2, 1, 4));
relBuilder.semiJoin(condition); // considered as equijoin
{code}
I think this behavioral difference is confusing and prevents us from defining
valid semijoins on structs' fields.
was (Author: rubenql):
Maybe I'm missing something, but IMHO the current verification does not make
much sense regarding RexFieldAccess, because the same expression will be / will
not be considered as an equijoin, the "trick" will be if we previously perform
a projection of the "expression.field" to get a RexInputRef for it, e.g.:
{code}
SELECT a1.* from authors a1 WHERE a1.birthPlace.country IN
(SELECT a2.birthPlace.country FROM authors a2
WHERE ...)
{code}
This semijoin will NOT work (condition not considered as equijoin condition):
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
RexNode condition = relBuilder.equals(
relBuilder.field(relBuilder.field(2, 0, "birthPlace"), "country"),
relBuilder.field(relBuilder.field(2, 1, "birthPlace"), "country"));
relBuilder.semiJoin(condition); // NOT considered as equijoin
{code}
But this semijoin, on the same fields will work, just because we have performed
a projection to be able to access 'country' via an "artificial" RexInputRef
instead of the "default" RexFieldAccess.RexInputRef:
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a1", "birthPlace"),
"country"));
// a1.birthPlace.country projected as a1's field 4
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a2", "birthPlace"),
"country"));
// a2.birthPlace.country projected as a2's field 4
RexNode condition = relBuilder.equals(
relBuilder.field(2, 0, 4),
relBuilder.field(2, 1, 4));
relBuilder.semiJoin(condition); // considered as equijoin
{code}
I think this behavioral difference is confusing and prevents us from defining
valid semijoins on structs' fields.
> RelOptUtil#splitJoinCondition must consider RexFieldAccess referencing
> RexInputRef
> ----------------------------------------------------------------------------------
>
> Key: CALCITE-2898
> URL: https://issues.apache.org/jira/browse/CALCITE-2898
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.18.0
> Reporter: Ruben Quesada Lopez
> Assignee: Ruben Quesada Lopez
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> {{RelOptUtil#splitJoinCondition}} "Splits out the equi-join components of a
> join condition, and returns what's left (remaining join filters that are not
> equijoins)". This works fine in case of RexInputRef operands in the condition
> (e.g. $0 = $1), but if any of the operands is a RexFieldAccess referencing a
> RexInputRef (e.g. $0 = $1.id), then the condition will NOT be detected as an
> equi-join and will be returned as if it were a non-equijoin.
> This can lead to undesired consequences, e.g {{JoinInfo#of}} would return a
> NonEquiJoinInfo object instead of an EquiJoinInfo object, which can generate
> problems if, for example, we are creating a SemiJoin (which requires an
> EquiJoinInfo object)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)