comphead commented on code in PR #1424:
URL: https://github.com/apache/datafusion-comet/pull/1424#discussion_r1964288717


##########
spark/src/main/scala/org/apache/comet/rules/RewriteJoin.scala:
##########
@@ -67,4 +83,21 @@ object RewriteJoin extends JoinSelectionHelper {
       }
     case _ => plan
   }
+
+  def getOptimalBuildSide(join: Join): BuildSide = {
+    val leftSize = join.left.stats.sizeInBytes
+    val rightSize = join.right.stats.sizeInBytes
+    val leftRowCount = join.left.stats.rowCount
+    val rightRowCount = join.right.stats.rowCount
+    if (leftSize == rightSize && rightRowCount.isDefined && leftRowCount.isDefined) {

Review Comment:
   Maybe I'm missing something? The `leftSize == rightSize` condition looks very
unlikely to hold, so with this logic the row counts would almost never be
considered and the code would always fall back to sizes.
   
   We could perhaps use something like
https://docs.pingcap.com/tidb/stable/join-reorder#example-the-greedy-algorithm-of-join-reorder
   and, if the row counts are not available, fall back to sizes.
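
   A minimal sketch of that ordering, under one reading of the suggestion: compare row counts first when both sides report them, and compare sizes otherwise. `Stats`, `BuildSideSketch`, and `chooseBuildSide` are hypothetical names, and `BuildSide` is redefined locally so the snippet runs without Spark; none of this is the PR's code or a Comet/Spark API.

```scala
// Local stand-ins for Spark's BuildSide and plan statistics (assumptions for this sketch).
sealed trait BuildSide
case object BuildLeft extends BuildSide
case object BuildRight extends BuildSide

final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt])

object BuildSideSketch {
  // Pick the side to build the hash table from.
  def chooseBuildSide(left: Stats, right: Stats): BuildSide =
    (left.rowCount, right.rowCount) match {
      // Both row counts are known and differ: build from the side with fewer rows.
      case (Some(l), Some(r)) if l != r =>
        if (r < l) BuildRight else BuildLeft
      // Row counts are missing or equal: fall back to comparing sizes in bytes.
      case _ =>
        if (right.sizeInBytes <= left.sizeInBytes) BuildRight else BuildLeft
    }
}
```

   Preferring the right side when sizes are equal is an arbitrary choice for this sketch. In the real rule the inputs would come from `join.left.stats` and `join.right.stats`.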
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

