[
https://issues.apache.org/jira/browse/CALCITE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ruben Q L updated CALCITE-3585:
-------------------------------
Description:
As of today EnumerableJoinRule transforms a LogicalJoin into
EnumerableConvention by producing two types of join operators:
- EnumerableHashJoin: if the join condition is totally or partially an equi-join
- EnumerableNestedLoopJoin: otherwise, i.e. if the join condition is completely
a non-equi-join
This distinction stems from the original implementation of
EnumerableHashJoin, which only supported equi-joins. However, since the
implementation of CALCITE-2973, EnumerableHashJoin supports all types of
conditions, not just equi-joins, so EnumerableHashJoin could be generated
systematically.
Moreover, with the implementation of CALCITE-3576, which allows FilterJoinRule
to be applied in EnumerableConvention, the HashJoin vs NestedLoopJoin
distinction in EnumerableJoinRule can be "flawed". Let us consider the
following plan:
{code}
-- Select * FROM emp, dept WHERE emp.deptId = dept.id
Filter (condition: emp.deptId = dept.id)
  Join (condition: true)
    Scan (table: emp)
    Scan (table: dept)
{code}
In this case the Join condition is just true, i.e. it has no equi-join part, so
an EnumerableNestedLoopJoin would be created; but then FilterJoinRule could be
applied, pushing the filter into the NestedLoopJoin as its join condition, and
ending up with an equi-join (emp.deptId = dept.id) "wrongly" implemented as an
EnumerableNestedLoopJoin.
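The ordering problem can be illustrated with a small toy model. This is only a sketch and not Calcite code: JoinRuleSketch, Condition and chooseOperator are hypothetical names made up for the illustration, and the condition is reduced to a single "has an equi-join part" flag.

```java
public class JoinRuleSketch {
    /** The two physical operators discussed in this issue. */
    public enum Operator { HASH_JOIN, NESTED_LOOP_JOIN }

    /** Toy join condition (assumption): only records whether it has an equi-join part. */
    public static final class Condition {
        public static final Condition TRUE = new Condition("true", false);

        public final String text;
        public final boolean hasEquiPart;

        public Condition(String text, boolean hasEquiPart) {
            this.text = text;
            this.hasEquiPart = hasEquiPart;
        }
    }

    /** Mimics the decision described above: hash join if there is any equi part. */
    public static Operator chooseOperator(Condition condition) {
        return condition.hasEquiPart ? Operator.HASH_JOIN : Operator.NESTED_LOOP_JOIN;
    }

    public static void main(String[] args) {
        // Step 1: the rule fires while the join condition is still "true".
        Operator chosen = chooseOperator(Condition.TRUE);

        // Step 2: the filter is pushed into the join afterwards; the condition
        // is now an equi-join, but the operator picked in step 1 is kept.
        Condition pushed = new Condition("emp.deptId = dept.id", true);

        System.out.println("operator picked at conversion time: " + chosen);
        System.out.println("operator the same rule would pick for '" + pushed.text
            + "': " + chooseOperator(pushed));
    }
}
```

In this model the two printed operators differ, which is the mismatch described above: the decision was taken before the condition reached its final form.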
For these reasons, and since EnumerableHashJoin and EnumerableNestedLoopJoin
can now fully support all types of conditions, it could be better to deprecate
EnumerableJoinRule and create two rules instead:
- EnumerableHashJoinRule, which always creates an EnumerableHashJoin
- EnumerableNestedLoopJoinRule, which always creates an
EnumerableNestedLoopJoin
This would also be consistent with the other existing join-related rules,
EnumerableMergeJoinRule and EnumerableBatchNestedLoopJoinRule, each of which
always creates the same join operator.
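A minimal sketch of the proposed split, again as a toy model rather than Calcite code: SplitJoinRulesSketch, JoinRule and alternatives are hypothetical names, and the two constants only stand in for the proposed EnumerableHashJoinRule and EnumerableNestedLoopJoinRule. It assumes both rules would be registered together, so both operators are offered regardless of the condition.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitJoinRulesSketch {
    public enum Operator { HASH_JOIN, NESTED_LOOP_JOIN }

    /** Hypothetical stand-in for a rule: maps a join condition to one operator. */
    public interface JoinRule {
        Operator convert(String condition);
    }

    // Each rule ignores the shape of the condition and always creates
    // the same operator.
    public static final JoinRule HASH_JOIN_RULE = condition -> Operator.HASH_JOIN;
    public static final JoinRule NESTED_LOOP_JOIN_RULE = condition -> Operator.NESTED_LOOP_JOIN;

    /** Applies both rules (assumed to be registered together) to one condition. */
    public static List<Operator> alternatives(String condition) {
        List<Operator> result = new ArrayList<>();
        result.add(HASH_JOIN_RULE.convert(condition));
        result.add(NESTED_LOOP_JOIN_RULE.convert(condition));
        return result;
    }

    public static void main(String[] args) {
        // The same alternatives are produced before and after the filter
        // is pushed into the join.
        System.out.println("condition 'true': " + alternatives("true"));
        System.out.println("condition 'emp.deptId = dept.id': "
            + alternatives("emp.deptId = dept.id"));
    }
}
```

Because neither rule inspects the condition, the result no longer depends on whether FilterJoinRule has already been applied.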
was:
As of today EnumerableJoinRule transforms a LogicalJoin into
EnumerableConvention by producing two types of join operators:
- EnumerableHashJoin: if the join condition is totally or partially an equi-join
- EnumerableNestedLoopJoin: otherwise, i.e. if the join condition is completely
a non-equi-join
This distinction has its cause in the original implementation of
EnumerableHashJoin, which only supported equi-join. However, with the
implementation of CALCITE-2973, now EnumerableHashJoin supports all type of
conditions, not just equi-join, so EnumerableHashJoin could be generated
systematically.
Moreover, with the implementation of CALCITE-3576, which allows FilterJoinRule
to be applied in EnumerableConvention, the HashJoin vs NestedLoopJoin
distinction in EnumerableJoinRule can be "flawed". Let us considered the
following plan:
{code}
-- Select * FROM emp, dept WHERE emp.deptId = dept.id
Filter (condition: emp.deptId = dept.id)
Join (condition: true)
Scan (table: emp)
Scan (table: dept)
{code}
In this case (non-equi join), an EnumerableNestedLoopJoin would be created; but
then FilterJoinRule could be applied, inserting the filter as a join condition,
and ending up with an equi-join, "wrongly" implemented as an
EnumerableNestedLoopJoin.
For these reasons, and since EnumerableHashJoin and EnumerableNestedLoopJoin
can now fully support all types of conditions, it could be better to deprecated
EnumerableJoinRule and create instead two rules:
- EnumerableHashJoinRule, which always creates an EnumerableHashJoin
- EnumerableNestedLoopJoinRule, which always creates an
EnumerableNestedLoopJoin
> Deprecate EnumerableJoinRule in favor of EnumerableHashJoinRule +
> EnumerableNestedLoopJoinRule
> ----------------------------------------------------------------------------------------------
>
> Key: CALCITE-3585
> URL: https://issues.apache.org/jira/browse/CALCITE-3585
> Project: Calcite
> Issue Type: Task
> Reporter: Ruben Q L
> Priority: Minor