ygf11 commented on PR #5087: URL: https://github.com/apache/arrow-datafusion/pull/5087#issuecomment-1407567003
> I think we can change the required_input_distribution to optimize the performance for the Full or other join types, if we change the implementation like your code in this PR and bring in a global visited bitmap in the state: Arc<Mutex<NestedLoopJoinState>>. But the global lock on the state will bring some negative effects.

Compared to the improvement, I think the negative effects of the global lock are trivial, because the global state will not be visited very frequently:

1. For joins that do not need global state (inner/right/right-semi/right-anti), the `NestedLoopJoinStream` will never visit the global state.
2. The global state will be locked and updated when a partition finishes, and it is visited only once per partition.

So the lock contention will not be very intense.
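To make the locking pattern concrete, here is a minimal standalone sketch using only the standard library. It is not the code from this PR: the fields of `NestedLoopJoinState` and the helpers `finish_partition` and `unmatched_left_rows` are hypothetical names chosen for illustration, and the "join" is a toy equality match over integers. Each partition tracks matches in a local bitmap without locking, then takes the global lock exactly once when it finishes.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared state (illustrative fields, not DataFusion's):
/// one visited flag per left-side row, plus the number of partitions
/// that have not finished yet.
struct NestedLoopJoinState {
    visited_left: Vec<bool>,
    remaining_partitions: usize,
}

/// Merge a partition's local bitmap into the global one. The lock is taken
/// once per partition. Returns the unmatched left row indices if this was
/// the last partition to finish, otherwise `None`.
fn finish_partition(
    state: &Arc<Mutex<NestedLoopJoinState>>,
    local_visited: &[bool],
) -> Option<Vec<usize>> {
    let mut guard = state.lock().unwrap();
    for (global, local) in guard.visited_left.iter_mut().zip(local_visited) {
        *global |= *local;
    }
    guard.remaining_partitions -= 1;
    if guard.remaining_partitions == 0 {
        let unmatched = guard
            .visited_left
            .iter()
            .enumerate()
            .filter(|(_, visited)| !**visited)
            .map(|(idx, _)| idx)
            .collect();
        Some(unmatched)
    } else {
        None
    }
}

/// Toy driver: probe each right partition on its own thread and return the
/// indices of left rows that matched nothing in any partition.
fn unmatched_left_rows(left: Vec<i32>, right_partitions: Vec<Vec<i32>>) -> Vec<usize> {
    if right_partitions.is_empty() {
        // No partition will ever finish, so every left row is unmatched.
        return (0..left.len()).collect();
    }
    let state = Arc::new(Mutex::new(NestedLoopJoinState {
        visited_left: vec![false; left.len()],
        remaining_partitions: right_partitions.len(),
    }));
    let left = Arc::new(left);
    let handles: Vec<_> = right_partitions
        .into_iter()
        .map(|partition| {
            let state = Arc::clone(&state);
            let left = Arc::clone(&left);
            thread::spawn(move || {
                // Lock-free while probing: matches go into a local bitmap.
                let mut local = vec![false; left.len()];
                for r in &partition {
                    for (idx, l) in left.iter().enumerate() {
                        if l == r {
                            local[idx] = true;
                        }
                    }
                }
                // Single visit to the global state for this partition.
                finish_partition(&state, &local)
            })
        })
        .collect();
    handles
        .into_iter()
        .filter_map(|handle| handle.join().unwrap())
        .next()
        .unwrap_or_default()
}
```

In this sketch the number of lock acquisitions equals the number of partitions, regardless of how many rows each partition probes. A join type that does not need the global bitmap would skip the `finish_partition` call entirely.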
