comphead commented on code in PR #13134:
URL: https://github.com/apache/datafusion/pull/13134#discussion_r1821466636
##########
datafusion/physical-plan/src/joins/sort_merge_join.rs:
##########
@@ -784,6 +790,29 @@ fn get_corrected_filter_mask(
corrected_mask.extend(vec![Some(false); null_matched]);
Some(corrected_mask.finish())
}
+ JoinType::LeftMark => {
+ for i in 0..row_indices_length {
+ let last_index =
+ last_index_for_row(i, row_indices, batch_ids, row_indices_length);
+ if filter_mask.value(i) && !seen_true {
Review Comment:
Please correct me if I'm wrong, but as I understand it there are two ways to
achieve what LeftMark does:
EXISTS
```
SELECT
    a.id,
    a.name,
    CASE
        WHEN EXISTS (SELECT 1 FROM table_B AS b WHERE a.id = b.table_a_id)
        THEN 1
        ELSE 0
    END AS has_match
FROM
    table_A AS a;
```
LeftOuter
```
SELECT
    a.id,
    a.name,
    CASE
        WHEN b.table_a_id IS NOT NULL THEN 1
        ELSE 0
    END AS has_match
FROM
    table_A AS a
LEFT JOIN
    table_B AS b
ON
    a.id = b.table_a_id;
```
The EXISTS approach looks heavyweight, while the LeftOuter formulation looks
pretty standard to me and shouldn't add extra performance cost?
I'm just wondering what value introducing a new join type would bring. 🤔
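
One thing that may be worth checking when comparing the two formulations is how they behave when `table_B` has more than one row per `table_a_id`. Below is a minimal sketch of both semantics over plain in-memory slices, assuming non-null integer keys. The function names `mark_join` and `left_outer_has_match` are hypothetical and are not part of DataFusion; this is only an illustration, not how the operator is implemented.

```rust
// Illustrative only: plain slices stand in for table_A.id and table_B.table_a_id.
// Assumption: keys are non-null integers. None of these names come from DataFusion.

/// EXISTS / mark-style semantics: exactly one output row per left id,
/// paired with a flag saying whether any right key matches it.
fn mark_join(left_ids: &[i32], right_keys: &[i32]) -> Vec<(i32, bool)> {
    left_ids
        .iter()
        .map(|&id| (id, right_keys.contains(&id)))
        .collect()
}

/// LEFT OUTER JOIN + `IS NOT NULL` semantics: one output row per matching
/// right row, or a single row flagged `false` when nothing matches.
fn left_outer_has_match(left_ids: &[i32], right_keys: &[i32]) -> Vec<(i32, bool)> {
    let mut out = Vec::new();
    for &id in left_ids {
        let matches = right_keys.iter().filter(|&&k| k == id).count();
        if matches == 0 {
            // No right row matched: the outer join pads with NULL, so has_match = 0.
            out.push((id, false));
        } else {
            // Every matching right row yields its own output row.
            for _ in 0..matches {
                out.push((id, true));
            }
        }
    }
    out
}
```

With left ids `[1, 2, 3]` and right keys `[1, 1, 3]`, `mark_join` returns three rows, while `left_outer_has_match` returns four because id 1 matches twice. So the LEFT JOIN query would need a `DISTINCT` or `GROUP BY` on top to give the same result as the EXISTS query whenever `table_B` can contain duplicate keys.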