morrySnow opened a new pull request, #64718:
URL: https://github.com/apache/doris/pull/64718
## Problem
The USING JOIN merge key selection was always using the left-side USING
column as the output key, which leads to bugs:
| Issue | Join Type | Bug Behavior |
|-------|-----------|-------------|
| **DORIS-25927** | RIGHT/FULL OUTER JOIN | Right-only rows output NULL for
the merge key (left side is NULL) |
| **DORIS-26080** | RIGHT/OUTER JOIN | Merge key incorrectly selected from
wrong side |
| **DORIS-26085** | RIGHT SEMI JOIN | Merge key dropped from output entirely
|
## Root Cause
In `BindExpression.bindUsingJoin()`, the `exceptAsteriskOutputs` was always
set to the right-side USING columns and passed to `LogicalJoin`, regardless of
join type. This meant the left-side column always served as the merge key in
`SELECT *`.
## Solution
Move merge key logic from `LogicalJoin` to a `LogicalProject` wrapping the
join:
1. **Remove `exceptAsteriskOutputs` from `LogicalJoin`** — the join should
output all columns normally.
2. **Add `exceptAsteriskOutputs` to `LogicalProject`** with
`DiffOutputInAsterisk` support.
3. **In `bindUsingJoin()`**, wrap the `LogicalJoin` with a `LogicalProject`
that selects the correct merge key based on join type:
- LEFT/INNER/LEFT SEMI/LEFT ANTI → `Alias(left_slot, name)`
- RIGHT/RIGHT SEMI/RIGHT ANTI → `Alias(right_slot, name)` (preserved side)
- FULL OUTER → `Alias(Coalesce(left_slot, right_slot), name)`
- Qualified USING columns (`t1.a`, `t2.a`) are excluded from asterisk
output; only the unqualified merge key appears in `SELECT *`.
## Files Changed
| File | Change |
|------|--------|
| `LogicalJoin.java` | Remove `exceptAsteriskOutputs` field and related
methods |
| `LogicalProject.java` | Add `exceptAsteriskOutputs` field +
`DiffOutputInAsterisk` support |
| `BindExpression.java` | Rewrite `bindUsingJoin()` to add merge-key Project
on top of Join |
| `LogicalPlanDeepCopier.java` | Copy `exceptAsteriskOutputs` when
deep-copying LogicalProject |
| `BindUsingJoinTest.java` | New test file with 14 test scenarios |
## Behavior
For `t1(a, b) RIGHT JOIN t2(a, c) USING(a)` with a right-only row (t2.a=5,
t2.c=10, no match in t1):
| Reference | Result |
|-----------|--------|
| `SELECT *` | `a=5, b=NULL, c=10` |
| `SELECT a` | `5` (unqualified → merge key from right preserved side) |
| `SELECT t1.a` | `NULL` (qualified → join original value) |
| `SELECT t2.a` | `5` (qualified → join original value) |
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]