xiedeyantu commented on code in PR #4557:
URL: https://github.com/apache/calcite/pull/4557#discussion_r2387311732


##########
core/src/test/resources/sql/planner.iq:
##########
@@ -223,15 +224,16 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
 
 !ok
 
-EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
-  EnumerableNestedLoopJoin(condition=[OR(AND(IS NULL(CAST($0):DECIMAL(11, 1)), IS NULL(CAST($1):DECIMAL(11, 1))), =(CAST($0):DECIMAL(11, 1), CAST($1):DECIMAL(11, 1)))], joinType=[anti])
-    EnumerableAggregate(group=[{0}])
-      EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
-        EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT NULL], A=[$t1])
-          EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0 }]])
-        EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT NULL], A=[$t1])
-          EnumerableValues(tuples=[[{ 1 }, { 2 }]])
-    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
+EnumerableCalc(expr#0..1=[{inputs}], expr#2=[CAST($t0):DECIMAL(11, 1)], A=[$t2])
+  EnumerableHashJoin(condition=[=($1, $3)], joinType=[anti])
+    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], proj#0..1=[{exprs}])
+      EnumerableAggregate(group=[{0}])
+        EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
+          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT NULL], A=[$t1])
+            EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0 }]])
+          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT NULL], A=[$t1])
+            EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1], A0=[$t1])

Review Comment:
   Thank you very much for your detailed reply! I think we actually agree on 
the issue:
   1. I completely agree that the multiple layers of redundant `CAST` 
operations in `IntersectToSemiJoinRule` are problematic.
   2. It's also great that lossless `CAST` handling further improves Calcite's 
optimization capabilities.
   
   I just wanted to explain why `A0` appeared: lossless `CAST` handling 
improved expression simplification, which in turn allowed 
`JoinPushExpressionsRule` to push expressions down, and that pushdown is what 
produced `A0`. When I first read your comment, I was somewhat confused: if the 
implementation of `IntersectToSemiJoinRule` was problematic, why was `A0` 
still pushed down successfully without any modifications? I think working 
through that question was useful, so I have added my conclusions and analysis 
below so that others can better understand the nature of this issue.
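
   To make the pushdown concrete, here is a toy sketch in plain Java. It is 
not Calcite's implementation and does not use Calcite's API; the class 
`PushJoinExprSketch` and the methods `pushKey` and `pushedCondition` are 
hypothetical names, and each join input is modeled only as a list of projected 
expression strings. It only illustrates the general idea: a non-trivial 
join-key expression is added as an extra column on its input, so the join 
condition can refer to plain columns.

```java
import java.util.ArrayList;
import java.util.List;

public class PushJoinExprSketch {
    // Toy model (assumption): an input is just its list of projected expressions.
    // Appends keyExpr as an extra projected column and returns its index.
    static int pushKey(List<String> projections, String keyExpr) {
        projections.add(keyExpr);
        return projections.size() - 1;
    }

    // Pushes both key expressions into their inputs and returns an equi-join
    // condition that references only column positions.
    static String pushedCondition(List<String> left, String leftKey,
                                  List<String> right, String rightKey) {
        int l = pushKey(left, leftKey);
        int r = pushKey(right, rightKey);
        // Right-side columns are numbered after all left-side columns.
        return "=($" + l + ", $" + (left.size() + r) + ")";
    }

    public static void main(String[] args) {
        List<String> left = new ArrayList<>(List.of("A"));
        List<String> right = new ArrayList<>(List.of("A"));
        String cond = pushedCondition(left, "CAST(A)", right, "CAST(A)");
        System.out.println(cond + " left=" + left + " right=" + right);
    }
}
```

   In this toy model each input ends up with two columns and the condition 
becomes `=($1, $3)`, the same shape as the `EnumerableHashJoin` condition in 
the new plan, where the extra right-hand column shows up as `A0`.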



