andygrove commented on PR #1211: URL: https://github.com/apache/datafusion-comet/pull/1211#issuecomment-2568581478
Something odd is going on. I have more data that will maybe help us understand this. I ran q21 with the code in this PR and then again with all SMJs wrapped in a `CoalesceBatchesExec`. PR version took 25 minutes, and the AntiSemi and LeftAnti SMJs produced more than a billion rows each. With `CoalesceBatchesExec`, the query took 5.5 minutes, and the AntiSemi and LeftAnti SMJs produced the correct row counts. The final two Inner SMJs produced the correct row count in both cases.