Thank you for raising this, Marko. For anyone else following along, there is some great discussion on the ticket, and I would encourage anyone who is interested to share their thoughts there as well.

On Wed, Nov 8, 2023 at 7:24 PM Jeremy Dyer <jdy...@gmail.com> wrote:

> From an Arrow Datafusion Python consumer standpoint this would make using
> datafusion so so much easier. Can’t speak to the ramifications to upstream
> or other projects but would love to see this and help if needed
>
> Thanks,
> Jeremy Dyer
>
> Get Outlook for iOS<https://aka.ms/o0ukef>
> ________________________________
> From: Marko Grujic <mark...@gmail.com>
> Sent: Wednesday, November 8, 2023 3:45:56 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: [DataFusion] Introduce qualified alias expressions
>
> Hi all,
>
> I would like to propose extending the `datafusion_expr::Expr` enum so as to
> introduce a capability for aliases which can map to qualified
> schemas/fields. Currently this is not possible, since the existing
> `Expr::Alias` always maps to an unqualified field[1].
>
> The detailed reasoning behind this is laid out in the accompanying issue
> I've opened[2], and has to do with a problem I encountered while working on
> a recent PR[3]. In brief, when a plan gets transformed during optimizations
> into some other plan(s) which involve aggregation, there is currently no
> way to abide by the original (qualified) schema, which is asserted as a
> sanity check in-between each optimization[4].
>
> I thus propose either adding a new enum variant (e.g.
> `Expr::QualifiedAlias` in parallel with `Expr::QualifiedWildcard`), or
> extending the existing `Expr::Alias` with an optional relation/qualifier.
>
> I would love to know any thoughts on this, and whether this discussion is
> perhaps better suited to the Discord channel or someplace else.
>
> Thanks,
> Marko
>
> [1]
> https://github.com/apache/arrow-datafusion/blob/656c6a93fadcec7bc43a8a881dfaf55388b0b5c6/datafusion/expr/src/expr_schema.rs#L285-L305
> [2] https://github.com/apache/arrow-datafusion/issues/8008
> [3] https://github.com/apache/arrow-datafusion/pull/7981
> [4]
> https://github.com/apache/arrow-datafusion/blob/724bafd4de98eff8d6ffd67942d29d0f9faf2aa3/datafusion/optimizer/src/optimizer.rs#L427-L452
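
For anyone skimming the thread, here is a rough sketch of the second option from the proposal (extending the alias with an optional relation/qualifier). It is a toy model only: the `Expr`, `Field`, `col`, `alias_qualified`, and `to_field` names below are simplified stand-ins made up for illustration and are not the real `datafusion_expr` definitions or signatures.

```rust
// Toy model only: simplified stand-ins, NOT the real DataFusion types.

/// The (qualifier, name) pair an expression contributes to an output schema.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

/// A tiny expression tree with just enough variants to show the problem.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// Column reference with an optional relation qualifier, e.g. `t.a`.
    Column { relation: Option<String>, name: String },
    /// Stand-in for an aggregate introduced by an optimizer rewrite.
    Max(Box<Expr>),
    /// Alias carrying an optional qualifier (hypothetical extension).
    Alias { expr: Box<Expr>, relation: Option<String>, name: String },
}

/// Hypothetical helper: build a qualified column reference.
fn col(relation: &str, name: &str) -> Expr {
    Expr::Column { relation: Some(relation.to_string()), name: name.to_string() }
}

impl Expr {
    /// Unqualified alias: the output field has no qualifier.
    fn alias(self, name: &str) -> Expr {
        Expr::Alias { expr: Box::new(self), relation: None, name: name.to_string() }
    }

    /// Qualified alias: the output field keeps a relation qualifier.
    fn alias_qualified(self, relation: &str, name: &str) -> Expr {
        Expr::Alias {
            expr: Box::new(self),
            relation: Some(relation.to_string()),
            name: name.to_string(),
        }
    }

    /// Derive the output field for this expression.
    fn to_field(&self) -> Field {
        match self {
            Expr::Column { relation, name } => Field {
                qualifier: relation.clone(),
                name: name.clone(),
            },
            Expr::Max(inner) => {
                // The aggregate gets a fresh, unqualified display name.
                let f = inner.to_field();
                let arg = match f.qualifier {
                    Some(q) => format!("{}.{}", q, f.name),
                    None => f.name,
                };
                Field { qualifier: None, name: format!("MAX({})", arg) }
            }
            Expr::Alias { relation, name, .. } => Field {
                qualifier: relation.clone(),
                name: name.clone(),
            },
        }
    }
}
```

In this toy model, rewriting `t.a` into `MAX(t.a)` and then calling `.alias("a")` yields an unqualified field `a`, which no longer equals the original qualified field `t.a`. Calling `.alias_qualified("t", "a")` instead restores a matching field, which is the behaviour the between-pass schema check would need.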