neilconway commented on code in PR #22998:
URL: https://github.com/apache/datafusion/pull/22998#discussion_r3430260510


##########
datafusion/expr/src/expr_rewriter/mod.rs:
##########
@@ -80,28 +91,82 @@ pub fn normalize_col(expr: Expr, plan: &LogicalPlan) -> 
Result<Expr> {
     .data()
 }
 
+/// Resolve a normalized column to itself, or -- when it is the merged key of a
+/// `RIGHT` / `FULL` `USING` / `NATURAL` join referenced *unqualified* -- to 
the
+/// equivalent `COALESCE(left, right)`.

Review Comment:
   This results in handling `LEFT` and `RIGHT` in an asymmetric manner, which 
seems odd to me. What if we instead did:
   
   ```
     LEFT/INNER -> left key
     RIGHT      -> right key
     FULL       -> COALESCE(left key, right key)
   ```



##########
datafusion/expr/src/expr_rewriter/mod.rs:
##########
@@ -80,28 +91,82 @@ pub fn normalize_col(expr: Expr, plan: &LogicalPlan) -> 
Result<Expr> {
     .data()
 }
 
+/// Resolve a normalized column to itself, or -- when it is the merged key of a
+/// `RIGHT` / `FULL` `USING` / `NATURAL` join referenced *unqualified* -- to 
the
+/// equivalent `COALESCE(left, right)`.
+///
+/// The COALESCE is built as `CASE WHEN key IS NOT NULL THEN key ELSE other 
END`
+/// (the same form `coalesce` is simplified to, and buildable here without
+/// depending on the functions crate), aliased to the key name so the output
+/// column keeps its name.
+pub(crate) fn merged_using_key_or_column(
+    col: Column,
+    was_unqualified: bool,
+    outer_using_keys: &[(Column, Column)],
+) -> Result<Expr> {
+    if was_unqualified
+        && let Some((l, r)) = outer_using_keys
+            .iter()
+            .find(|(l, r)| l == &col || r == &col)
+    {
+        let other = if l == &col { r } else { l };
+        let case = CaseBuilder::new(
+            None,
+            vec![Expr::Column(col.clone()).is_not_null()],
+            vec![Expr::Column(col.clone())],
+            Some(Box::new(Expr::Column(other.clone()))),
+        )
+        .end()?;
+        return Ok(case.alias(col.name.clone()));
+    }
+    Ok(Expr::Column(col))
+}
+
 /// See [`Column::normalize_with_schemas_and_ambiguity_check`] for usage
 pub fn normalize_col_with_schemas_and_ambiguity_check(
     expr: Expr,
     schemas: &[&[&DFSchema]],
     using_columns: &[HashSet<Column>],
+) -> Result<Expr> {
+    normalize_col_with_schemas_ambiguity_and_outer_using(
+        expr,
+        schemas,
+        using_columns,
+        &[],
+    )
+}
+
+/// Like [`normalize_col_with_schemas_and_ambiguity_check`], but additionally
+/// resolves the merged key of a `RIGHT` / `FULL` `USING` / `NATURAL` join 
(given
+/// as `(left, right)` column pairs) to `COALESCE(left, right)`.
+pub fn normalize_col_with_schemas_ambiguity_and_outer_using(

Review Comment:
   Seems unfortunate to leak an implementation detail ("`USING` needs special 
handling for outer joins") into the public API surface.



##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -525,6 +525,39 @@ impl LogicalPlan {
         Ok(using_columns)
     }
 
+    /// Returns the `(left, right)` join-key column pairs of every `RIGHT` /
+    /// `FULL` `USING` / `NATURAL` join in this plan.
+    ///
+    /// For these join types the left key is NULL-padded on rows that exist 
only
+    /// on the right, so an unqualified reference to the merged key must 
resolve
+    /// to `COALESCE(left, right)` rather than to the left column alone.
+    pub fn outer_using_key_pairs(

Review Comment:
   The function name could be improved, since we handle some outer joins 
(right, full) but not others (left).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to