comphead commented on code in PR #10304:
URL: https://github.com/apache/datafusion/pull/10304#discussion_r1599216844


##########
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##########
@@ -1363,6 +1380,57 @@ fn get_filter_column(
     filter_columns
 }
 
+// Get buffered data slice by specific batch index and for specified column indices only
+#[inline(always)]
+fn get_buffered_columns(
+    buffered_data: &BufferedData,
+    buffered_batch_idx: usize,
+    buffered_indices: &UInt64Array,
+) -> Result<Vec<ArrayRef>, ArrowError> {
+    buffered_data.batches[buffered_batch_idx]
+        .batch
+        .columns()
+        .iter()
+        .map(|column| take(column, &buffered_indices, None))
+        .collect::<Result<Vec<_>, ArrowError>>()
+}
+
+// Calculate join filter bit mask considering join type specifics
+fn get_filtered_join_mask(
+    join_type: JoinType,
+    streamed_indices: UInt64Array,
+    mask: &BooleanArray,
+) -> Option<BooleanArray> {
+    // for LeftSemi Join the filter mask should be calculated in its own way:
+    // if we find at least one matching row for specific streaming index
+    // we don't need to check any others for the same index
+    if matches!(join_type, JoinType::LeftSemi) {
+        // have we seen a filter match for a streaming index before
+        let mut seen_as_true: bool = false;
+        let streamed_indices_length = streamed_indices.len();
+        let mut corrected_mask: Vec<bool> = vec![false; streamed_indices_length];

Review Comment:
   Done. By the way, I wonder why `BooleanArray` doesn't support being created
   with a given capacity and default values, to achieve the same as
   ```
   vec![false; streamed_indices_length];
   ```
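
For context, the LeftSemi rule described in the hunk's comments (keep only the first filter match per streamed index) can be sketched roughly as below. This is a minimal illustration, not the PR's code: the function name `first_match_mask` is hypothetical, plain slices and `Vec<bool>` stand in for `UInt64Array` and `BooleanArray`, and the loop body is an assumption because the quoted hunk stops before it. It also assumes rows for the same streamed index are adjacent.

```rust
/// Hypothetical helper: returns a mask that keeps only the first `true`
/// of `mask` within each run of equal values in `streamed_indices`.
fn first_match_mask(streamed_indices: &[u64], mask: &[bool]) -> Vec<bool> {
    assert_eq!(streamed_indices.len(), mask.len());

    // Pre-filled with `false`, like `vec![false; streamed_indices_length]`.
    let mut corrected = vec![false; mask.len()];
    // Have we already seen a filter match for the current streamed index?
    let mut seen_as_true = false;

    for i in 0..mask.len() {
        // A new streamed index starts here, so forget earlier matches.
        if i > 0 && streamed_indices[i] != streamed_indices[i - 1] {
            seen_as_true = false;
        }
        // Only the first match for a streamed index survives.
        if mask[i] && !seen_as_true {
            corrected[i] = true;
            seen_as_true = true;
        }
    }
    corrected
}
```

With streamed indices `[0, 0, 1, 1, 1, 2]` and filter mask `[true, true, false, true, true, false]`, this sketch yields `[true, false, false, true, false, false]`: one surviving match each for streamed rows 0 and 1, none for row 2.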



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

