asolimando commented on code in PR #4557:
URL: https://github.com/apache/calcite/pull/4557#discussion_r2386928135
##########
core/src/test/resources/sql/planner.iq:
##########
@@ -223,15 +224,16 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
!ok
-EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
- EnumerableNestedLoopJoin(condition=[OR(AND(IS NULL(CAST($0):DECIMAL(11, 1)),
IS NULL(CAST($1):DECIMAL(11, 1))), =(CAST($0):DECIMAL(11, 1),
CAST($1):DECIMAL(11, 1)))], joinType=[anti])
- EnumerableAggregate(group=[{0}])
- EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL,
CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0
}]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1 }, { 2 }]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1])
+EnumerableCalc(expr#0..1=[{inputs}], expr#2=[CAST($t0):DECIMAL(11, 1)],
A=[$t2])
+ EnumerableHashJoin(condition=[=($1, $3)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
proj#0..1=[{exprs}])
+ EnumerableAggregate(group=[{0}])
+ EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT
NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, {
5.0 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1], A0=[$t1])
Review Comment:
> This wasn't happening before due to the more complex join condition, which
included an OR with IS NULL (LogicalJoin(condition=[OR(AND(IS
NULL(CAST($0):DECIMAL(11, 1)), IS NULL(CAST($1):DECIMAL(11, 1))),
=(CAST($0):DECIMAL(11, 1), CAST($1):DECIMAL(11, 1)))], joinType=[semi]) vs
LogicalJoin(condition=[=(CAST($0):DECIMAL(11, 1), CAST($1):DECIMAL(11, 1))],
joinType=[semi]) now due to simplifications allowed by the improved lossless
cast detection).
>
> It suffices to compare the logs to find JoinPushExpressionsRule being
applied in the new code
We agree that is the extra simplification that allows more rule
applications, and that allows the extra cast to be pushed down, and when it's
being pushed down and referenced, it's there to stay.
I did implement quickly CALCITE-7203 to verify this (that's why I assigned
it to myself, I just need to polish it, but the code is there already), and all
the spurious aliases go away, so the root cause is really the presence of the
extra casts, and the problem wasn't just surfacing before due to limitations in
simplifications that didn't allow `JoinPushExpressionsRule` to trigger.
Thanks for taking a look, I will merge this now that we have a clearer idea
of what's going on, cleanup will follow shortly with CALCITE-7203 fixing the
root cause.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]