viirya commented on code in PR #9163:
URL: https://github.com/apache/arrow-datafusion/pull/9163#discussion_r1482641685


##########
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##########
@@ -1254,6 +1261,20 @@ impl SMJStream {
 
                         // For full join, we also need to output the null joined rows from the buffered side
                         if matches!(self.join_type, JoinType::Full) {
+                            // Handle not mask for buffered side further.
+                            // For buffered side, we want to output the rows that are not null joined with
+                            // the streamed side. i.e. the rows that are not null in the `buffered_indices`.
+                            let not_mask = if buffered_indices.null_count() > 0 {
+                                let nulls = buffered_indices.nulls().unwrap();
+                                let mask = not_mask.values() & nulls.inner();
+                                BooleanArray::new(mask, None)

Review Comment:
   For a full outer join, we need to output the buffered rows that fail the
join filter. But in `output_batch`, we only care about the rows whose
`buffered_indices` are not null. The rows with null indices are the ones that
failed the equijoin predicates.
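
To make the masking concrete, here is a minimal sketch of the idea in plain Rust. It is only a model and not the code in the PR: it uses `Vec<bool>` and `Option<u64>` in place of Arrow's `BooleanArray` and validity buffer, and the function name `buffered_not_mask` is hypothetical. It assumes `not_mask[i]` is true when row `i` failed the join filter, and that a `None` index marks a row with no buffered match.

```rust
/// Hypothetical helper modelling the masking step with std types only.
///
/// `not_mask[i]` is assumed to be true when row `i` of the output batch
/// failed the join filter. `buffered_indices[i]` is `None` when the row has
/// no buffered-side match (it failed the equijoin predicates).
///
/// A row stays selected only if it failed the filter AND has a non-null
/// buffered index, mirroring `not_mask.values() & nulls.inner()` in the diff.
fn buffered_not_mask(not_mask: &[bool], buffered_indices: &[Option<u64>]) -> Vec<bool> {
    assert_eq!(
        not_mask.len(),
        buffered_indices.len(),
        "mask and indices must describe the same rows"
    );
    not_mask
        .iter()
        .zip(buffered_indices.iter())
        .map(|(failed_filter, idx)| *failed_filter && idx.is_some())
        .collect()
}
```

With `not_mask = [true, true, false, false]` and indices `[Some(0), None, Some(2), None]`, only the first row remains selected: the second row failed the filter but has no buffered row to emit, and the last two passed the filter.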



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
