[
https://issues.apache.org/jira/browse/CALCITE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17944874#comment-17944874
]
Mihai Budiu commented on CALCITE-6930:
--------------------------------------
I have tried this rule with the above query, and unfortunately it does not work.
The condition for the query is deemed to be too complex, since it contains CAST
and SEARCH operators.
isValidCond returns 'false' for the operands of the OR.
Here is a fragment of the condition:
OR(AND(=($16, $1), =(CAST($19):VARCHAR(10) NOT NULL, 'Brand#22'), SEARCH($22,
Sarg['SM BOX ', 'SM CASE ', 'SM PACK ', 'SM PKG ']:CHAR(10)),
SEARCH($4, Sarg[[8.00:DECIMAL(15, 2)..18.00:DECIMAL(15, 2)]]:DECIMAL(15, 2)),
SEARCH($21, Sarg[[1..5]]), SEARCH($14, Sarg['AIR ', 'AIR REG
']:CHAR(10)), =(CAST($13):VARCHAR(25) NOT NULL, 'DELIVER IN PERSON')), ...)))
However, you would think that this condition should be optimizable.
> Implementing JoinConditionOrExpansionRule
> -----------------------------------------
>
> Key: CALCITE-6930
> URL: https://issues.apache.org/jira/browse/CALCITE-6930
> Project: Calcite
> Issue Type: New Feature
> Reporter: Zhen Chen
> Assignee: Zhen Chen
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.40.0
>
> Attachments: image-2025-04-03-13-36-23-352.png,
> image-2025-04-03-13-37-16-575.png
>
>
> JoinConditionOrExpansionRule transforms a join with OR conditions into a
> UNION ALL of multiple joins.
> For example, the SQL:
> {code:java}
> SELECT * FROM emp JOIN dept
> ON emp.deptno = dept.deptno OR emp.deptno = dept.mgr {code}
> Will be transformed into:
> {code:java}
> SELECT * FROM emp JOIN dept ON emp.deptno = dept.deptno
> UNION ALL
> SELECT * FROM emp JOIN dept ON emp.deptno = dept.mgr {code}
> The original plan:
> {code:java}
> LogicalJoin(condition=[OR(=(deptno, deptno), =(deptno, mgr))])
> LogicalTableScan(table=emp)
> LogicalTableScan(table=dept) {code}
> Is transformed to:
> {code:java}
> LogicalUnion(all=[true])
> LogicalJoin(condition=[=(deptno, deptno)])
> LogicalTableScan(table=emp)
> LogicalTableScan(table=dept)
> LogicalJoin(condition=[=(deptno, mgr)])
> LogicalTableScan(table=emp)
> LogicalTableScan(table=dept) {code}
> Benefits:
> # Each individual join can use hash-join algorithm, which is more efficient
> than nested-loop joins required for OR conditions.
> # Each join branch can independently choose its optimal join implementation.
> # The joins can be executed in parallel.
> # Individual joins may be able to use indexes that would not be usable with
> OR conditions.
> The rule only fires when:
> # The OR condition contains only simple equi-join conditions (comparing
> columns between left and right inputs).
> Limitations and considerations:
> # May not improve performance if individual joins produce large results that
> need to be union-ed.
> # Multiple hash tables may cause memory pressure.
> # The optimizer should consider statistics and cost estimates when applying
> this rule.
>
> supplement:
> # The inner join will perform the conversion as described above.
> # The outer join will perform the conversion more complexly.
> left join:
> !image-2025-04-03-13-36-23-352.png|width=442,height=228!
> full join:
> !image-2025-04-03-13-37-16-575.png|width=501,height=199!
--
This message was sent by Atlassian Jira
(v8.20.10#820010)