silundong commented on code in PR #4392:
URL: https://github.com/apache/calcite/pull/4392#discussion_r2098009043


##########
core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml:
##########
@@ -17826,16 +17911,16 @@ LogicalProject(EMPNO=[$0])
     </Resource>
     <Resource name="planAfter">
       <![CDATA[
-LogicalProject(EMPNO=[$19])
-  LogicalJoin(condition=[AND(=($19, $0), =(+($8, $0), +($17, $13)))], joinType=[inner])
-    LogicalTableScan(table=[[CATALOG, SALES, EMP_ADDRESS]])
+LogicalProject(EMPNO=[$16])

Review Comment:
   I think this is caused by double-precision floating-point rounding.
   
   Some background first: in the previous PR, to make `RexInputRef` convenient 
to represent during enumeration, a unique alias was generated for each input 
before enumeration (that is, an additional Project operator was generated). 
   However, relying on strings and the input rowType causes trouble when 
processing semi/anti joins (the right child is not projected). Therefore, 
in this PR, I changed it to use the node index and the relative position of 
the field within the node to represent `RexInputRef`.
   
   So, even when the join order is essentially the same, the cost of a plan 
during enumeration differs between this PR and the previous PR (the previous 
PR has an additional Project).
   
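   Specifically:
   In the previous PR:
   t1 join t2 cost=154.21564 (with additional Project)
   t2 join t1 cost=154.21564 (with additional Project)     **_winner_** (see 
`chooseBetterPlan` in DpHyp.java)
   
   In this PR:
   t1 join t2 cost=92.21564              **_winner_**
   t2 join t1 cost=92.21564000000001
   
   In the previous PR the two costs were exactly equal. In this PR they differ 
only in the last digit, which is floating-point rounding error, and that is 
enough to change the winner.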
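   
   A minimal sketch of the effect, not the actual Calcite cost code: the class 
`CostOrderSketch` and its methods are hypothetical, and the assumption is only 
that the two join orders end up adding the same cost components in a different 
order. Double addition is not associative, so the two totals can differ in the 
last bit, and a strict comparison then prefers one plan.

```java
// Hypothetical illustration; names are not taken from Calcite.
final class CostOrderSketch {
  // Adds the cost components from first to last.
  static double sumForward(double[] parts) {
    double total = 0.0;
    for (int i = 0; i < parts.length; i++) {
      total += parts[i];
    }
    return total;
  }

  // Adds the same cost components from last to first.
  static double sumBackward(double[] parts) {
    double total = 0.0;
    for (int i = parts.length - 1; i >= 0; i--) {
      total += parts[i];
    }
    return total;
  }

  public static void main(String[] args) {
    // Stand-in values; the real cost components are not shown in this thread.
    double[] parts = {0.1, 0.2, 0.3};
    double a = sumForward(parts);
    double b = sumBackward(parts);
    System.out.println("forward  = " + a);
    System.out.println("backward = " + b);
    // The totals are mathematically equal but differ as doubles,
    // so a strict comparison is no longer a tie.
    System.out.println("backward < forward: " + (b < a));
  }
}
```

   With mathematically equal inputs the two totals are not bit-identical, 
which matches the 92.21564 vs 92.21564000000001 pair above.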
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
