xanderbailey opened a new issue, #17294:
URL: https://github.com/apache/datafusion/issues/17294
### Describe the bug
```rust
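use datafusion::logical_expr::{col, lit, JoinType, LogicalPlanBuilder};

// Two single-row VALUES relations with identical columns, aliased "left" and "right".
let values = vec![vec![
    lit(1).alias("column1"),
    lit("hello").alias("column2"),
]];
let left = LogicalPlanBuilder::values(values.clone())?
    .alias("left")?
    .build()?;
let right = LogicalPlanBuilder::values(values)?
    .alias("right")?
    .build()?;

// Left join on column1.
let join = LogicalPlanBuilder::from(left)
    .join_with_expr_keys(
        right,
        JoinType::Left,
        (vec![col("left.column1")], vec![col("right.column1")]),
        None,
    )?
    .build()?;

// Projecting an unqualified literal aliased "column1" next to the
// qualified `left.column1` is the step that fails.
let plan = LogicalPlanBuilder::from(join)
    .project(vec![lit("hello").alias("column1"), col("left.column1")])?
    .build()?;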
```
Fails with:
```
Error: SchemaError(AmbiguousReference { field: Column { relation: Some(Bare { table: "left" }), name: "column1" } }, Some(""))
```
This is particularly important when DataFusion converts Substrait plans, since
column names and aliases are stripped.
Consider the following case:
Create a null string column called "column1" before a join, join the
dataset, and then construct a new null string column called "column2" in a
projection. Because the names are stripped, both columns fall back to the
default literal name, so the schema after the join has `UTF8(NULL)` from the
left (relation: `left`) and another `UTF8(NULL)` with no relation.
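
The clash described above can be pictured with a small, std-only model. This is a simplified sketch and not DataFusion's actual code: `Field` and `find_ambiguous` are hypothetical names introduced here, and the rule it encodes (a qualified field may not share its name with an unqualified one) is an assumption that is merely consistent with the reported `AmbiguousReference` error.

```rust
use std::collections::BTreeSet;

/// A schema field: an optional relation qualifier plus a name.
/// (Hypothetical type for illustration, not a DataFusion struct.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Field {
    pub relation: Option<String>,
    pub name: String,
}

impl Field {
    pub fn new(relation: Option<&str>, name: &str) -> Self {
        Field {
            relation: relation.map(str::to_string),
            name: name.to_string(),
        }
    }
}

/// Returns the first qualified field whose name is also used by an
/// unqualified field, or `None` if there is no such clash.
pub fn find_ambiguous(fields: &[Field]) -> Option<Field> {
    // Collect the names of all fields that carry no relation.
    let unqualified: BTreeSet<&str> = fields
        .iter()
        .filter(|f| f.relation.is_none())
        .map(|f| f.name.as_str())
        .collect();

    // A qualified field with one of those names is ambiguous.
    fields
        .iter()
        .find(|f| f.relation.is_some() && unqualified.contains(f.name.as_str()))
        .cloned()
}
```

Under this model, a schema holding `Field::new(Some("left"), "UTF8(NULL)")` and `Field::new(None, "UTF8(NULL)")` is flagged, while giving either field a different name makes `find_ambiguous` return `None`.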
We could fix this for the Substrait case by aliasing literals with a UUID,
but the same problem could still occur for any expression whose default name
does not depend on the columns it uses (the current timestamp, perhaps?).
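
A minimal sketch of the aliasing idea, assuming any source of unique suffixes is acceptable: it swaps the UUID for a process-wide counter so that it needs only the standard library. `unique_alias` and `NEXT_SUFFIX` are hypothetical names, not part of DataFusion or its Substrait consumer.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide counter; a stand-in for the UUID suggested above
/// (assumption: any source of unique suffixes would do).
static NEXT_SUFFIX: AtomicU64 = AtomicU64::new(0);

/// Appends a unique numeric suffix to an expression's default name,
/// so that two otherwise identical literals get distinct names.
pub fn unique_alias(default_name: &str) -> String {
    let suffix = NEXT_SUFFIX.fetch_add(1, Ordering::Relaxed);
    format!("{default_name}#{suffix}")
}
```

The returned string could then be passed to `.alias(...)` on the literal before it enters the plan. As the issue notes, this only covers expressions the consumer knows to rename.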
### To Reproduce
_No response_
### Expected behavior
_No response_
### Additional context
_No response_