RatulDawar opened a new pull request, #21632: URL: https://github.com/apache/datafusion/pull/21632
## Summary

This PR fixes an indefinite wait situation in Hash Join when dynamic filtering is enabled and some partitions have zero rows.

### The Issue

The short-circuit optimization in `state_after_build_ready` was marking empty partitions as `Completed` immediately. However, when dynamic filtering is used, all partitions must report their build-side information to a shared accumulator (which uses a barrier). Empty partitions were skipping this report, causing the barrier to never be reached and leading to a deadlock.

Additionally, once the short-circuit was disabled to allow for coordination, a `debug_assert!` in `process_probe_batch` was being triggered for join types that produce empty results when the build side is empty.

### The Fix

1. Modified `state_after_build_ready` to only short-circuit if no `build_accumulator` is present. This ensures that empty partitions still report their (empty) filters to the shared accumulator before completing.
2. Updated `process_probe_batch` to handle empty build sides gracefully when coordination is enabled, bypassing the join logic and the `debug_assert!`.

## Test plan

1. Run a hash join with dynamic filtering enabled and a high number of partitions such that some partitions are empty (e.g. TPC-H Q18 with target_partitions=24).
2. Verify that the query completes instead of hanging or panicking.

Made with [Cursor](https://cursor.com)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
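As a rough illustration of the first change listed under "The Fix" in the PR description, the short-circuit decision might be sketched as below. This is a simplified model, not the actual DataFusion code: the `NextState` enum, its variant names, and the three boolean parameters are hypothetical stand-ins for the real stream state and for the presence of a `build_accumulator`.

```rust
/// Hypothetical, simplified stand-in for the hash join stream state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NextState {
    /// Nothing left to do for this partition.
    Completed,
    /// Report build-side information to the shared accumulator first.
    ReportToAccumulator,
    /// Start pulling probe-side batches.
    FetchProbeBatch,
}

/// Decide what a partition does once its build side is ready.
///
/// All three parameters are assumptions made for this sketch:
/// - `build_is_empty`: the partition's build side has zero rows.
/// - `empty_build_yields_empty_result`: the join type produces no output
///   when the build side is empty.
/// - `has_build_accumulator`: dynamic filtering is enabled, so a shared
///   accumulator (guarded by a barrier) expects a report from every partition.
fn state_after_build_ready(
    build_is_empty: bool,
    empty_build_yields_empty_result: bool,
    has_build_accumulator: bool,
) -> NextState {
    if has_build_accumulator {
        // Every partition, including an empty one, must report. Skipping the
        // report would leave the other partitions waiting on the barrier.
        return NextState::ReportToAccumulator;
    }
    if build_is_empty && empty_build_yields_empty_result {
        // No coordination is needed, so short-circuiting is safe.
        return NextState::Completed;
    }
    NextState::FetchProbeBatch
}
```

In this model the short-circuit to `Completed` is only reachable when no accumulator is present, which mirrors the condition the PR describes.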
