kumarUjjawal commented on code in PR #22230:
URL: https://github.com/apache/datafusion/pull/22230#discussion_r3296333431


##########
datafusion/physical-plan/src/joins/sort_merge_join/materializing_stream.rs:
##########
@@ -997,6 +1112,7 @@ impl MaterializingSortMergeJoinStream {
                             .unwrap(); // Operation only return None if no batches are spilled, here we ensure that at least one batch is spilled
 
                         buffered_batch.batch = BufferedBatchState::Spilled(spill_file);
+                        self.spilled_batch_count += 1;

Review Comment:
   This also needs a matching decrement in the dequeue path, so that `spilled_batch_count` stays in sync with the batches that are still spilled.
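
   A minimal sketch of the pairing I have in mind, not the actual DataFusion code. `BufferedQueue`, `BatchState`, `push_spilled`, and `dequeue` are hypothetical stand-ins; only the `spilled_batch_count` name comes from the diff. The point is that every increment on the spill path has a decrement where a spilled batch leaves the queue.

   ```rust
   use std::collections::VecDeque;

   /// Hypothetical stand-in for the buffered batch state in the diff.
   #[derive(Debug)]
   enum BatchState {
       /// Batch still held in memory (rows simplified to integers).
       InMemory(Vec<i32>),
       /// Batch written to disk (spill file simplified to a path string).
       Spilled(String),
   }

   /// Hypothetical queue that tracks how many queued batches are spilled.
   #[derive(Default)]
   struct BufferedQueue {
       batches: VecDeque<BatchState>,
       spilled_batch_count: usize,
   }

   impl BufferedQueue {
       fn push_in_memory(&mut self, rows: Vec<i32>) {
           self.batches.push_back(BatchState::InMemory(rows));
       }

       /// Spill path: the counter goes up by one per spilled batch.
       fn push_spilled(&mut self, path: &str) {
           self.batches.push_back(BatchState::Spilled(path.to_string()));
           self.spilled_batch_count += 1;
       }

       /// Dequeue path: the counter goes down when a spilled batch leaves.
       fn dequeue(&mut self) -> Option<BatchState> {
           let batch = self.batches.pop_front()?;
           if matches!(batch, BatchState::Spilled(_)) {
               self.spilled_batch_count -= 1;
           }
           Some(batch)
       }
   }
   ```

   Without the decrement in `dequeue`, the counter would only ever grow and would stop reflecting how many batches are currently spilled.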



##########
datafusion/physical-plan/src/joins/sort_merge_join/bitwise_stream.rs:
##########
@@ -785,24 +798,44 @@ impl BitwiseSortMergeJoinStream {
         )
         .count_ones();
 
-        // Process spilled inner batches first (read back from disk).
-        if let Some(spill_file) = &self.inner_key_spill {
-            let file = BufReader::new(File::open(spill_file.path())?);
-            let reader = StreamReader::try_new(file, None)?;
-            for batch_result in reader {
-                let inner_slice = batch_result?;
-                matched_count = eval_filter_for_inner_slice(
-                    self.outer_is_left,
-                    filter,
-                    &outer_slice,
-                    &inner_slice,
-                    &mut self.matched,
-                    self.outer_offset,
-                    outer_group_len,
-                    matched_count,
-                )?;
-                if matched_count == outer_group_len {
-                    break;
+        // Process spilled inner batches first asynchronously.
+        if self.inner_key_spill.is_some() || self.spill_stream.is_some() {

Review Comment:
   We should guard the stream creation.
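
   One possible reading of this, sketched under assumptions: create the stream only when none is active and a spill file exists, and reuse it on later calls. `JoinState`, `SpillReader`, `spill_reader`, and `streams_opened` are hypothetical; only the `inner_key_spill` and `spill_stream` field names come from the diff, and the real types there are different.

   ```rust
   /// Hypothetical stand-in for an open stream over a spill file.
   struct SpillReader {
       path: String,
       /// Position of the next batch to read (simplified to an index).
       next: usize,
   }

   /// Hypothetical slice of the join stream state.
   #[derive(Default)]
   struct JoinState {
       inner_key_spill: Option<String>,
       spill_stream: Option<SpillReader>,
       /// Counts stream creations, only to make the guard observable.
       streams_opened: usize,
   }

   impl JoinState {
       /// Returns the active reader, opening one only if none exists yet
       /// and there is a spill file to read from.
       fn spill_reader(&mut self) -> Option<&mut SpillReader> {
           if self.spill_stream.is_none() {
               // No spill file: nothing to open, so return None early.
               let path = self.inner_key_spill.as_ref()?.clone();
               self.spill_stream = Some(SpillReader { path, next: 0 });
               self.streams_opened += 1;
           }
           self.spill_stream.as_mut()
       }
   }
   ```

   With a guard like this, re-entering the branch after a pending poll would resume the existing stream instead of reopening the file from the start.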



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

