andygrove commented on PR #1424: URL: https://github.com/apache/datafusion-comet/pull/1424#issuecomment-2673152457
> Do we know why Spark's decision is so bad to start with? Spark has the same logic here: https://github.com/apache/spark/blob/fb17856a22be6968b2ed55ccbd7cf72111920bea/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L506

Spark is building a SortMergeJoin and we are replacing it with ShuffledHashJoin. Our new logic in this PR seems to match the Spark logic you linked to.