Dandandan commented on PR #22652: URL: https://github.com/apache/datafusion/pull/22652#issuecomment-4634634541
> Replacing an inner join with a left semi-join, where the inner join would have produced at most one matching row for each left tuple. Exactly the same intermediate result sets, but should be slightly faster due to less join overhead etc (plus it lets the planner place tighter bounds on the size of the join output set). I haven't done detailed microbenchmarks yet, but you're right that it seems we aren't seeing major wins from this and it might merit further investigation.

I think there is currently only a limited difference in this situation, since both join types use mostly the same code paths. There is probably some further specialization we can do for different join types.
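
For readers unfamiliar with the rewrite under discussion, here is a minimal sketch in plain Python (not DataFusion code) of why the substitution is safe only when each left row has at most one match. The helper names `inner_join` and `left_semi_join`, and the sample rows, are hypothetical and exist only for illustration. It also assumes that no right-side columns are needed downstream, since a semi-join emits left columns only.

```python
def inner_join(left, right, key):
    """Nested-loop inner join: one output pair per matching (left, right) row."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def left_semi_join(left, right, key):
    """Left semi-join: emit each left row at most once if any right row matches."""
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] in right_keys]


# Hypothetical sample data.
left = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
right_unique = [{"id": 1}, {"id": 3}]          # at most one match per left row
right_dup = [{"id": 1}, {"id": 1}, {"id": 3}]  # id 1 matches twice

# With unique right keys, the left rows surviving both joins are identical.
inner_left_unique = [l for l, _ in inner_join(left, right_unique, "id")]
semi_unique = left_semi_join(left, right_unique, "id")

# With duplicate right keys, the inner join repeats a left row,
# while the semi-join does not, so the rewrite would change the result.
inner_left_dup = [l for l, _ in inner_join(left, right_dup, "id")]
semi_dup = left_semi_join(left, right_dup, "id")
```

The semi-join can also never emit more rows than the left input has, which is the tighter output bound the quoted comment refers to.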
