alamb commented on code in PR #5509:
URL: https://github.com/apache/arrow-datafusion/pull/5509#discussion_r1131527905
##########
datafusion/expr/src/expr_rewriter.rs:
##########
@@ -365,6 +377,23 @@ pub fn normalize_col_with_schemas(
})
}
+pub fn normalize_col_with_schemas_and_ambiguity_check(
Review Comment:
Maybe we can leave a reference to
`Column::normalize_with_schemas_and_ambiguity_check` here:
```suggestion
/// See [`Column::normalize_with_schemas_and_ambiguity_check`] for usage
pub fn normalize_col_with_schemas_and_ambiguity_check(
```
##########
datafusion/common/src/column.rs:
##########
@@ -154,6 +158,105 @@ impl Column {
.collect(),
}))
}
+
+ /// Qualify column if not done yet.
+ ///
+ /// If this column already has a [relation](Self::relation), it will be returned as is and the given parameters are
+ /// ignored. Otherwise this will search through the given schemas to find the column.
+ ///
+ /// Will check for ambiguity at each level of `schemas`.
+ ///
+ /// A schema matches if there is a single column that -- when unqualified -- matches this column. There is an
+ /// exception for `USING` statements, see below.
+ ///
+ /// # Using columns
+ /// Take the following SQL statement:
+ ///
+ /// ```sql
+ /// SELECT id FROM t1 JOIN t2 USING(id)
+ /// ```
+ ///
+ /// In this case, both `t1.id` and `t2.id` will match unqualified column `id`. To express this possibility, use
+ /// `using_columns`. Each entry in this array is a set of columns that are bound together via a `USING` clause. So
+ /// in this example this would be `[{t1.id, t2.id}]`.
+ ///
+ /// Regarding ambiguity check, `schemas` is structured to allow levels of schemas to be passed in.
+ /// For example:
+ ///
+ /// ```text
+ /// schemas = &[
+ /// &[schema1, schema2], // first level
+ /// &[schema3, schema4], // second level
+ /// ]
+ /// ```
+ ///
+ /// Will search for a matching field in all schemas in the first level. If a matching field according to above
+ /// mentioned conditions is not found, then will check the next level. If found more than one matching column across
+ /// all schemas in a level, that isn't a USING column, will return an error due to ambiguous column.
+ ///
+ /// If checked all levels and couldn't find field, will return field not found error.
+ pub fn normalize_with_schemas_and_ambiguity_check(
+ self,
+ schemas: &[&[&DFSchema]],
Review Comment:
This syntax is nasty, but in my opinion it makes the call sites better.
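To make the level-by-level lookup from the doc comment concrete, here is a minimal sketch of how a nested `&[&[&Schema]]` argument can be walked. It is not the code in this PR: `Schema` and `qualify` are hypothetical stand-ins for `DFSchema` and `normalize_with_schemas_and_ambiguity_check`, columns are plain `relation.name` strings, and the USING handling is simplified to "all matches belong to one USING set".

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for `DFSchema`: one relation and its field names.
#[derive(Debug)]
pub struct Schema {
    pub relation: &'static str,
    pub fields: Vec<&'static str>,
}

/// Simplified sketch of the lookup: returns the qualified name "relation.name".
/// `schemas` is a list of levels; each level is a list of schemas.
pub fn qualify(
    name: &str,
    schemas: &[&[&Schema]],
    using_columns: &[BTreeSet<String>],
) -> Result<String, String> {
    for level in schemas {
        // Collect every qualified match across all schemas in this level.
        let matches: Vec<String> = level
            .iter()
            .filter(|s| s.fields.iter().any(|f| *f == name))
            .map(|s| format!("{}.{}", s.relation, name))
            .collect();

        if matches.is_empty() {
            // Nothing at this level, so fall through to the next one.
            continue;
        }
        if matches.len() == 1 {
            return Ok(matches[0].clone());
        }
        // Several matches are acceptable only if one USING set binds them all.
        let bound_by_using = using_columns
            .iter()
            .any(|set| matches.iter().all(|m| set.contains(m)));
        if bound_by_using {
            return Ok(matches[0].clone());
        }
        return Err(format!("ambiguous column: {}", name));
    }
    // Every level was checked without finding the field.
    Err(format!("field not found: {}", name))
}
```

With this shape, a call site can spell out the levels inline, for example `qualify("id", &[&[&t1, &t2], &[&outer]], &[])`, which is roughly the readability benefit the nested-slice signature buys.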