xiedeyantu commented on code in PR #4557:
URL: https://github.com/apache/calcite/pull/4557#discussion_r2384206942
##########
core/src/test/resources/sql/planner.iq:
##########
@@ -223,15 +224,16 @@ select a from (values (1.0), (4.0), (null)) as t3 (a);
!ok
-EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)], A=[$t1])
- EnumerableNestedLoopJoin(condition=[OR(AND(IS NULL(CAST($0):DECIMAL(11, 1)),
IS NULL(CAST($1):DECIMAL(11, 1))), =(CAST($0):DECIMAL(11, 1),
CAST($1):DECIMAL(11, 1)))], joinType=[anti])
- EnumerableAggregate(group=[{0}])
- EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT NULL,
CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, { 5.0
}]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1) NOT
NULL], A=[$t1])
- EnumerableValues(tuples=[[{ 1 }, { 2 }]])
- EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1])
+EnumerableCalc(expr#0..1=[{inputs}], expr#2=[CAST($t0):DECIMAL(11, 1)],
A=[$t2])
+ EnumerableHashJoin(condition=[=($1, $3)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
proj#0..1=[{exprs}])
+ EnumerableAggregate(group=[{0}])
+ EnumerableNestedLoopJoin(condition=[=(CAST($0):DECIMAL(11, 1) NOT
NULL, CAST($1):DECIMAL(11, 1) NOT NULL)], joinType=[anti])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1.0 }, { 2.0 }, { 3.0 }, { 4.0 }, {
5.0 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)
NOT NULL], A=[$t1])
+ EnumerableValues(tuples=[[{ 1 }, { 2 }]])
+ EnumerableCalc(expr#0=[{inputs}], expr#1=[CAST($t0):DECIMAL(11, 1)],
A=[$t1], A0=[$t1])
Review Comment:
@asolimando Thank you very much for the explanation. After carefully
debugging, I found that the issue reflected by the `IntersectToSemiJoinRule` is
only superficial. The root cause is that you implemented lossless casting for
the `Decimal` type, which enhances expression simplification capabilities. For
example, in the case of `AND(IS NULL(CAST($0):DECIMAL(11, 1)), IS
NULL(CAST($1):DECIMAL(11, 1)))`, it can be simplified to `false`. This enables
the `JoinPushExpressionsRule` to be successfully applied.
The issue is not actually about multiple casts in the
`IntersectToSemiJoinRule`. I believe that even if multiple repeated `cast`s are
performed, they should be successfully eliminated. Of course, I agree with your
point about optimizing the problem of generating `cast`s at every layer in the
`IntersectToSemiJoinRule`—this should be addressed at another level. Generating
`cast`s at every layer indeed complicates the plan unnecessarily.
Finally, I want to say that your PR is truly very useful. If there are no
further comments, I'm looking forward to this PR being merged as soon as
possible.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]