comphead commented on code in PR #13134:
URL: https://github.com/apache/datafusion/pull/13134#discussion_r1821466636


##########
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##########
@@ -784,6 +790,29 @@ fn get_corrected_filter_mask(
             corrected_mask.extend(vec![Some(false); null_matched]);
             Some(corrected_mask.finish())
         }
+        JoinType::LeftMark => {
+            for i in 0..row_indices_length {
+                let last_index =
+                    last_index_for_row(i, row_indices, batch_ids, row_indices_length);
+                if filter_mask.value(i) && !seen_true {

Review Comment:
   Please correct me if I'm wrong, but there seem to be two ways to achieve LeftMark:
   
   EXISTS
   ```
   SELECT 
       a.id,
       a.name,
       CASE 
           WHEN EXISTS (SELECT 1 FROM table_B AS b WHERE a.id = b.table_a_id) THEN 1
           ELSE 0
       END AS has_match
   FROM 
       table_A AS a;
   ```
   
   LeftOuter
   ```
   SELECT 
       a.id,
       a.name,
       CASE 
           WHEN b.table_a_id IS NOT NULL THEN 1
           ELSE 0
       END AS has_match
   FROM 
       table_A AS a
   LEFT JOIN 
       table_B AS b
   ON 
       a.id = b.table_a_id;
   ```
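   
   To make the comparison concrete, here is a minimal Rust sketch of the two interpretations over in-memory rows. `RowA`, `has_match_exists`, and `has_match_left_join` are hypothetical names used only for illustration; none of this is DataFusion code. The two functions agree when each `a.id` matches at most one row in table_B; with several matching rows, the LEFT JOIN form emits one output row per match.
   
   ```rust
   // Hypothetical in-memory stand-in for table_A; table_B is reduced to its
   // table_a_id column. This is an illustration, not DataFusion's implementation.
   #[derive(Debug, Clone, PartialEq)]
   struct RowA {
       id: i64,
       name: &'static str,
   }

   /// EXISTS-style: exactly one output row per row of table_A,
   /// flagged 1 if any row of table_B references it, else 0.
   fn has_match_exists(a: &[RowA], b_table_a_ids: &[i64]) -> Vec<(i64, &'static str, i32)> {
       a.iter()
           .map(|row| {
               let matched = b_table_a_ids.iter().any(|&k| k == row.id);
               (row.id, row.name, matched as i32)
           })
           .collect()
   }

   /// LEFT JOIN-style: one output row per (a, b) match,
   /// or a single row flagged 0 when nothing in table_B matches.
   fn has_match_left_join(a: &[RowA], b_table_a_ids: &[i64]) -> Vec<(i64, &'static str, i32)> {
       let mut out = Vec::new();
       for row in a {
           let matches = b_table_a_ids.iter().filter(|&&k| k == row.id).count();
           if matches == 0 {
               // Corresponds to b.table_a_id IS NULL in the SQL above.
               out.push((row.id, row.name, 0));
           } else {
               for _ in 0..matches {
                   out.push((row.id, row.name, 1));
               }
           }
       }
       out
   }

   fn main() {
       let a = vec![RowA { id: 1, name: "x" }, RowA { id: 2, name: "y" }];
       let b = vec![1];
       println!("{:?}", has_match_exists(&a, &b));
       println!("{:?}", has_match_left_join(&a, &b));
   }
   ```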
   
   The EXISTS approach looks like it adds overhead, but the LeftOuter interpretation looks to me like a
   pretty standard approach without extra performance issues?
   
   I'm just wondering what value introducing a new join type would bring. 🤔


