[ 
https://issues.apache.org/jira/browse/CALCITE-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790390#comment-16790390
 ] 

Ruben Quesada Lopez edited comment on CALCITE-2898 at 3/12/19 9:44 AM:
-----------------------------------------------------------------------

Maybe I'm missing something, but IMHO the current verification does not make 
much sense regarding RexFieldAccess, because the same expression will be / will 
not be considered as an equijoin, the "trick" will be if we previously perform 
a projection of the "expression.field" to get a RexInputRef for it, e.g.:
{code}
SELECT a1.* from authors a1 WHERE a1.birthPlace.country IN
  (SELECT a2.birthPlace.country FROM authors a2
   WHERE ...)
{code}

This semijoin will NOT work (condition not considered as equijoin condition):
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
RexNode condition = relBuilder.equals(
    relBuilder.field(relBuilder.field(2, "a1", "birthPlace"), "country"),
    relBuilder.field(relBuilder.field(2, "a2", "birthPlace"), "country"));
relBuilder.semiJoin(condition);  // NOT considered as equijoin
{code}

But this semijoin, on the same fields will work, just because we have performed 
a projection to be able to access 'country' via an "artificial" RexInputRef 
instead of the "default" RexFieldAccess.RexInputRef:
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a1", "birthPlace"), 
"country")); 
// a1.birthPlace.country projected as a1's field 4
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a2", "birthPlace"), 
"country")); 
// a2.birthPlace.country projected as a2's field 4
RexNode condition = relBuilder.equals(
    relBuilder.field(2, 0, 4),
    relBuilder.field(2, 1, 4));
relBuilder.semiJoin(condition);  // considered as equijoin
{code}

I think this behavioral difference is confusing and prevents us from defining 
valid semijoins on structs' fields.


was (Author: rubenql):
Maybe I'm missing something, but IMHO the current verification does not make 
much sense regarding RexFieldAccess, because the same expression will be / will 
not be considered as an equijoin, the "trick" will be if we previously perform 
a projection of the "expression.field" to get a RexInputRef for it, e.g.:
{code}
SELECT a1.* from authors a1 WHERE a1.birthPlace.country IN
  (SELECT a2.birthPlace.country FROM authors a2
   WHERE ...)
{code}

This semijoin will NOT work (condition not considered as equijoin condition):
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
RexNode condition = relBuilder.equals(
    relBuilder.field(relBuilder.field(2, "a1", "birthPlace"), "country"),
    relBuilder.field(relBuilder.field(2, "a2", "birthPlace"), "country"));
relBuilder.semiJoin(condition);  // NOT considered as equijoin
{code}

But this semijoin, on the same fields will work, just because we have performed 
a projection to be able to access 'country' via an "artificial" RexInputRef 
instead of the "default" RexFieldAccess.RexInputRef:
{code:java}
relBuilder.scan("authors").as("a1");
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a1", "birthPlace"), 
"country")); // birthPlace.country projected as a1's field 4
relBuilder.scan("authors").as("a2");
relBuilder.filter(...);
relBuilder.projectPlus(relBuilder.field(relBuilder.field("a2", "birthPlace"), 
"country")); // birthPlace.country projected as a2's field 4
RexNode condition = relBuilder.equals(
    relBuilder.field(2, 0, 4),
    relBuilder.field(2, 1, 4));
relBuilder.semiJoin(condition);  // considered as equijoin
{code}

I think this behavioral difference is confusing and prevents us from defining 
valid semijoins on structs' fields.

> RelOptUtil#splitJoinCondition must consider RexFieldAccess referencing 
> RexInputRef
> ----------------------------------------------------------------------------------
>
>                 Key: CALCITE-2898
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2898
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.18.0
>            Reporter: Ruben Quesada Lopez
>            Assignee: Ruben Quesada Lopez
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{RelOptUtil#splitJoinCondition}} "Splits out the equi-join components of a 
> join condition, and returns what's left (remaining join filters that are not 
> equijoins)". This works fine in case of RexInputRef operands in the condition 
> (e.g. $0 = $1), but if any of the operands is a RexFieldAccess referencing a 
> RexInputRef (e.g. $0 = $1.id), then the condition will NOT be detected as an 
> equi-join and will be returned as if it were a non-equijoin. 
> This can lead to undesired consequences, e.g {{JoinInfo#of}} would return a 
> NonEquiJoinInfo object instead of an EquiJoinInfo object, which can generate 
> problems if, for example, we are creating a SemiJoin (which requires an 
> EquiJoinInfo object)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to