xiedeyantu commented on code in PR #4557:
URL: https://github.com/apache/calcite/pull/4557#discussion_r2387311732
##########
core/src/test/resources/sql/planner.iq:
##########
@@ -223,15 +224,16 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
!ok
-EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
- EnumerableNestedLoopJoin(condition=[OR(AND(IS NULL(CAST($0):DECIMAL(11, 1)),
IS NULL(CAST($1):DECIMAL(11, 1))), =(CAST($0):DECIMAL(11, 1),
CAST($1):DECIMAL(11, 1)))], joinType=[anti])
- EnumerableAggregate(group=[{0}])
- EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL,
CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0
}]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1 }, { 2 }]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1])
+EnumerableCalc(expr#0..1=[{inputs}], expr#2=[CAST($t0):DECIMAL(11, 1)],
A=[$t2])
+ EnumerableHashJoin(condition=[=($1, $3)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
proj#0..1=[{exprs}])
+ EnumerableAggregate(group=[{0}])
+ EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT
NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, {
5.0 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1], A0=[$t1])
Review Comment:
Thank you very much for your detailed reply! I think we actually agree on
the issue:
1. I completely agree that the multiple layers of redundant `CAST`
operations in `IntersectToSemiJoinRule` are problematic.
2. It's also excellent that lossless `CAST` further enhances Calcite's
optimization capabilities.
We agree on the root cause. I only wanted to explain why `A0` appeared:
lossless `CAST` improved expression simplification, which in turn allowed
`JoinPushExpressionsRule` to push the join expressions down, and that push-down
is what produced the extra column. So when I first read your comment, I was
somewhat confused: if the implementation of `IntersectToSemiJoinRule` was
problematic, why was `A0` still successfully pushed down without any
modifications? I believe this is a sound line of analysis, so I added my
conclusions and the analysis process below so that others can better
understand the nature of this issue.
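To make the `A0` explanation concrete, here is a toy model of the push-down
described above. It is only a sketch and not Calcite's implementation: `Ref`,
`Cast`, `simplify` and `push_equality` are hypothetical names, and the input
types (including `DECIMAL(2, 1)` for the left input) are assumed for
illustration. It shows how pushing both sides of an equality into projections
adds one column per input and rewrites the condition to plain column
references, and how a cast to a type the column already has collapses into a
duplicate of that column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Column reference: an index into a row."""
    index: int


@dataclass(frozen=True)
class Cast:
    """CAST(operand AS type_name) over a single column reference."""
    operand: Ref
    type_name: str


def simplify(expr, input_types):
    """Toy rule: drop a cast whose operand already has the target type."""
    if isinstance(expr, Cast) and input_types[expr.operand.index] == expr.type_name:
        return expr.operand
    return expr


def push_equality(left_types, right_types, lhs, rhs):
    """Push both sides of `lhs = rhs` into projections over the join inputs.

    `lhs` uses the left input's column numbering, `rhs` the right input's.
    Returns both projection lists and the rewritten condition as a pair of
    Refs into the new join row (left projection first, then right).
    """
    left_proj = [Ref(i) for i in range(len(left_types))]
    right_proj = [Ref(i) for i in range(len(right_types))]
    # Append each (simplified) expression as an extra projected column.
    left_proj.append(simplify(lhs, left_types))
    right_proj.append(simplify(rhs, right_types))
    condition = (Ref(len(left_proj) - 1),
                 Ref(len(left_proj) + len(right_proj) - 1))
    return left_proj, right_proj, condition


# Assumed shapes: one column per input; the right input is already DECIMAL(11, 1).
left_proj, right_proj, condition = push_equality(
    ["DECIMAL(2, 1)"],
    ["DECIMAL(11, 1)"],
    Cast(Ref(0), "DECIMAL(11, 1)"),
    Cast(Ref(0), "DECIMAL(11, 1)"),
)
print(left_proj)   # original column plus the cast
print(right_proj)  # original column plus a duplicate of it (the "A0" analogue)
print(condition)   # equality between the two appended columns
```

In this sketch the rewritten condition compares columns 1 and 3 of the new join
row, which has the same shape as `=($1, $3)` in the updated plan, and the right
projection carries the same column twice, like `A=[$t1], A0=[$t1]`.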