jonathanc-n commented on code in PR #20478:
URL: https://github.com/apache/datafusion/pull/20478#discussion_r2839310905


##########
datafusion/physical-plan/src/joins/sort_merge_join/stream.rs:
##########
@@ -153,17 +156,15 @@ impl StreamedBatch {
             idx: 0,
             join_arrays: vec![],
             output_indices: vec![],
+            num_output_rows: 0,
             buffered_batch_idx: None,
             join_filter_matched_idxs: HashSet::new(),
         }
     }
 
     /// Number of unfrozen output pairs in this streamed batch
     fn num_output_rows(&self) -> usize {

Review Comment:
   nit: i think this function can be removed, we can call 
`num_unfrozen_pairs()` as `streamed_batch.num_output_rows`. and add a small 
comment for the `num_output_rows` field to declare that it represents unfrozen 
pairs



##########
datafusion/physical-plan/src/joins/sort_merge_join/stream.rs:
##########
@@ -1437,6 +1433,7 @@ impl SortMergeJoinStream {
         }
 
         self.streamed_batch.output_indices.clear();
+        self.streamed_batch.num_output_rows = 0;

Review Comment:
   I think we should probably encapsulate this reset pattern into its own 
function? (self.reset() calls .clear() and sets num_output_rows = 0)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to