gruuya opened a new issue, #8008:
URL: https://github.com/apache/arrow-datafusion/issues/8008

   ### Is your feature request related to a problem or challenge?
   
   The problem I'm encountering is related to #7981. Namely: 
   - when a plan gets optimized into an equivalent representation using other 
primitive plans it is necessary to have the optimized output match the schema 
of the originally created plan
   - the primitive plan combination in question can involve columnized 
expressions as output field names (e.g. with aggregations), so in order to 
align it with the original schema aliasing is required
   - therein lies the problem: all logical plans with a schema use 
`exprlist_to_fields` to generate the initial schema, however this function will 
always 
[result](https://github.com/apache/arrow-datafusion/blob/656c6a93fadcec7bc43a8a881dfaf55388b0b5c6/datafusion/expr/src/expr_schema.rs#L285-L305)
 in an unqualified field for `Expr::Alias` unlike for `Expr::Column`, hence the 
schemas can never match
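   
   A toy model of the mismatch, as a sketch only. `Field`, `Expr` and `to_field` below are simplified stand-ins with plain `String` qualifiers, not the real DataFusion definitions; they only mimic the behaviour described above, where a column keeps its qualifier and an alias loses it:

   ```rust
   // Simplified stand-ins for illustration; NOT the real DataFusion types.
   #[derive(Debug, Clone, PartialEq)]
   pub struct Field {
       pub qualifier: Option<String>,
       pub name: String,
   }

   #[derive(Debug, Clone)]
   pub enum Expr {
       Column { relation: Option<String>, name: String },
       Alias { expr: Box<Expr>, name: String },
   }

   // Mimics the described behaviour: a column propagates its qualifier,
   // while an alias always produces an unqualified field.
   pub fn to_field(expr: &Expr) -> Field {
       match expr {
           Expr::Column { relation, name } => Field {
               qualifier: relation.clone(),
               name: name.clone(),
           },
           Expr::Alias { name, .. } => Field {
               qualifier: None,
               name: name.clone(),
           },
       }
   }
   ```

   In this model, aliasing an expression back to the name `a` can never reproduce the field of the column `t.a`, because the qualifier `t` has nowhere to live.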
   
   ### Describe the solution you'd like
   
   Introduce a new enum variant along the lines of:
   ```rust
   pub struct QualifiedAlias {
       pub expr: Box<Expr>,
       pub relation: OwnedTableReference,
       pub name: String,
   }
   ```
   or alternatively extend the existing alias expression to accommodate an 
optional relation:
   ```rust
   pub struct Alias {
       pub expr: Box<Expr>,
       pub relation: Option<OwnedTableReference>,
       pub name: String,
   }
   ```
   
   This would allow for 1-1 mapping between fields and column aliases.
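   
   A rough sketch of how the second variant could restore that mapping. The types here are simplified stand-ins (a plain `String` replaces `OwnedTableReference`, and `Field`/`to_field` are hypothetical helpers), so this shows the idea rather than the actual implementation:

   ```rust
   // Simplified stand-ins for illustration; NOT the real DataFusion types.
   #[derive(Debug, Clone, PartialEq)]
   pub struct Field {
       pub qualifier: Option<String>,
       pub name: String,
   }

   #[derive(Debug, Clone)]
   pub struct Alias {
       pub expr: Box<Expr>,
       // Assumption: String stands in for OwnedTableReference.
       pub relation: Option<String>,
       pub name: String,
   }

   #[derive(Debug, Clone)]
   pub enum Expr {
       Column { relation: Option<String>, name: String },
       Alias(Alias),
   }

   // With an optional relation on the alias, the qualifier can be
   // carried over into the resulting field.
   pub fn to_field(expr: &Expr) -> Field {
       match expr {
           Expr::Column { relation, name } => Field {
               qualifier: relation.clone(),
               name: name.clone(),
           },
           Expr::Alias(alias) => Field {
               qualifier: alias.relation.clone(),
               name: alias.name.clone(),
           },
       }
   }
   ```

   Leaving `relation` as `None` keeps today's unqualified behaviour for ordinary aliases.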
   
   ### Describe alternatives you've considered
   
   Everything else seems like a hack:
   - always strip qualifiers with `DFSchema::strip_qualifiers` when building 
the schema in plan constructors; this will result in conflicts for joins
   - push the qualifier down into the field name like in 
a5cff4e622d4cb7ec51ab46dacd5710b45a5985d; the issue here is the unexpected 
derived column naming
   - make `ExprSchemable::to_field` try to parse the qualifier and name from an 
alias name; seems error prone, confusing and potentially problematic for column 
names with dots
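   
   To illustrate why the last alternative is fragile, here is a hypothetical helper (`split_alias` is made up for this example and is not part of DataFusion) that naively splits an alias name on the first dot:

   ```rust
   // Hypothetical helper: treat everything before the first dot as the
   // qualifier and the rest as the column name.
   pub fn split_alias(alias: &str) -> (Option<&str>, &str) {
       match alias.split_once('.') {
           Some((qualifier, name)) => (Some(qualifier), name),
           None => (None, alias),
       }
   }
   ```

   An alias of `t.a` splits as intended, but an unqualified column that is literally named `my.col` would be misread as column `col` of relation `my`, and nothing in the string itself can tell the two cases apart.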
   
   ### Additional context
   
   It seems like previously this could have been done with 
`Projection::try_new_with_schema`, but it looks like the overriding schema 
isn't being propagated through the optimizers after #7919.


