2010YOUY01 commented on code in PR #22865:
URL: https://github.com/apache/datafusion/pull/22865#discussion_r3392962858
##########
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##########
@@ -2368,29 +2434,11 @@ impl NestedLoopJoinStream {
// Early return if join type can't have unmatched rows
let join_type_no_produce_left =
!need_produce_result_in_final(self.join_type);
- // Early return if another thread is already processing unmatched rows.
- //
- // The shared probe-threads counter must be decremented exactly once
per
- // probe stream. This function can be re-entered with `left_emit_idx`
- // still 0 (e.g. when a ready batch was flushed via an early return in
- // `handle_emit_left_unmatched` before the state advanced), so guard
the
- // decrement with `probe_completed_reported` instead of relying solely
on
- // `left_emit_idx == 0`. Decrementing twice would drive the counter to
- // zero prematurely and let a partition emit unmatched-left rows before
- // all partitions finished probing, producing spurious NULL-padded
rows.
- let handled_by_other_partition = if self.probe_completed_reported {
- // Already counted this stream's completion; if we're the
designated
- // emitter we have `left_emit_idx > 0` (or are mid-emit) and
continue,
- // otherwise another partition is handling emission.
- self.left_emit_idx == 0
- } else {
- self.probe_completed_reported = true;
- self.left_emit_idx == 0 && !left_data.report_probe_completed()
- };
// Stop processing unmatched rows, the caller will go to the next state
let finished = self.left_emit_idx >= left_batch.num_rows();
- if join_type_no_produce_left || handled_by_other_partition || finished
{
+ // `ProbeEnd` already recorded whether this stream emits
unmatched-left rows.
+ if join_type_no_produce_left || !self.is_unmatched_left_emitter ||
finished {
Review Comment:
Here `self.is_unmatched_left_emitter` seems like always true, if a partition
has entered the current `emit left` state? If it's true, we could instead
`debug_assert(self.is_unmatched_left_emitter)` here
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]