asolimando commented on code in PR #4557:
URL: https://github.com/apache/calcite/pull/4557#discussion_r2386928135


##########
core/src/test/resources/sql/planner.iq:
##########
@@ -223,15 +224,16 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
 
 !ok
 
-EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
-  EnumerableNestedLoopJoin(condition=[OR(AND(IS NULL(CAST($0):DECIMAL(11, 1)), 
IS NULL(CAST($1):DECIMAL(11, 1))), =(CAST($0):DECIMAL(11, 1), 
CAST($1):DECIMAL(11, 1)))], joinType=[anti])
-    EnumerableAggregate(group=[{0}])
-      EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL, 
CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
-        EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT 
NULL], A=[$t1])
-          EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0 
}]])
-        EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT 
NULL], A=[$t1])
-          EnumerableValues(tuples=[[{ 1 }, { 2 }]])
-    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], 
A=[$t1])
+EnumerableCalc(expr#0..1=[{inputs}], expr#2=[CAST($t0):DECIMAL(11, 1)], 
A=[$t2])
+  EnumerableHashJoin(condition=[=($1, $3)], joinType=[anti])
+    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], 
proj#0..1=[{exprs}])
+      EnumerableAggregate(group=[{0}])
+        EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT 
NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
+          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) 
NOT NULL], A=[$t1])
+            EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 
5.0 }]])
+          EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) 
NOT NULL], A=[$t1])
+            EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+    EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], 
A=[$t1], A0=[$t1])

Review Comment:
   > This wasn't happening before due to the more complex join condition, which 
included an OR with IS NULL (LogicalJoin(condition=[OR(AND(IS 
NULL(CAST($0):DECIMAL(11, 1)), IS NULL(CAST($1):DECIMAL(11, 1))), 
=(CAST($0):DECIMAL(11, 1), CAST($1):DECIMAL(11, 1)))], joinType=[semi]) vs 
LogicalJoin(condition=[=(CAST($0):DECIMAL(11, 1), CAST($1):DECIMAL(11, 1))], 
joinType=[semi]) now due to simplifications allowed by the improved lossless 
cast detection).
   >
   > It suffices to compare the logs to find JoinPushExpressionsRule being 
applied in the new code
   
   We agree that is the extra simplification that allows more rule 
applications, and that allows the extra cast to be pushed down, and when it's 
being pushed down and referenced, it's there to stay.
   
   I did implement quickly CALCITE-7203 to verify this (that's why I assigned 
it to myself, I just need to polish it, but the code is there already), and all 
the spurious aliases go away, so the root cause is really the presence of the 
extra casts, and the problem wasn't just surfacing before due to limitations in 
simplifications that didn't allow `JoinPushExpressionsRule` to trigger.
   
   Thanks for taking a look, I will merge this now that we have a clearer idea 
of what's going on, cleanup will follow shortly with CALCITE-7203 fixing the 
root cause.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to