From an Arrow DataFusion Python consumer standpoint, this would make using
DataFusion so much easier. I can’t speak to the ramifications for upstream or
other projects, but I would love to see this and am happy to help if needed.

Thanks,
Jeremy Dyer

________________________________
From: Marko Grujic <mark...@gmail.com>
Sent: Wednesday, November 8, 2023 3:45:56 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: [DataFusion] Introduce qualified alias expressions

Hi all,

I would like to propose extending the `datafusion_expr::Expr` enum so as to
introduce a capability for aliases which can map to qualified
schemas/fields. Currently this is not possible, since the existing
`Expr::Alias` always maps to an unqualified field[1].

The detailed reasoning behind this is laid out in the accompanying issue
I've opened[2], and has to do with a problem I encountered while working on
a recent PR[3]. In brief, when a plan gets transformed during optimizations
into some other plan(s) which involve aggregation, there is currently no
way to abide by the original (qualified) schema, which is asserted as a
sanity check in-between each optimization[4].

I thus propose either adding a new enum variant (e.g.
`Expr::QualifiedAlias` in parallel with `Expr::QualifiedWildcard`), or
extending the existing `Expr::Alias` with an optional relation/qualifier.
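
To make the second option concrete, here is a minimal sketch using toy
stand-in types. It is not DataFusion's actual `Expr` definition: the
`relation` field on `Alias` and the helpers `qualify`, `output_field_name`
and `alias_qualified` are hypothetical names chosen purely for illustration.

```rust
// Toy model only: these are NOT DataFusion's real types or signatures.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A column reference, optionally qualified by a relation name.
    Column { relation: Option<String>, name: String },
    /// Hypothetical alias that carries an optional relation/qualifier.
    Alias { expr: Box<Expr>, relation: Option<String>, name: String },
}

/// Render a field name, prefixing the qualifier when one is present.
fn qualify(relation: &Option<String>, name: &str) -> String {
    match relation {
        Some(r) => format!("{}.{}", r, name),
        None => name.to_string(),
    }
}

/// Name of the output field an expression would map to in this toy model.
fn output_field_name(expr: &Expr) -> String {
    match expr {
        Expr::Column { relation, name } => qualify(relation, name),
        Expr::Alias { relation, name, .. } => qualify(relation, name),
    }
}

/// Hypothetical constructor: wrap `expr` in an alias with an optional qualifier.
fn alias_qualified(expr: Expr, relation: Option<&str>, name: &str) -> Expr {
    Expr::Alias {
        expr: Box::new(expr),
        relation: relation.map(|r| r.to_string()),
        name: name.to_string(),
    }
}
```

In this sketch, aliasing a rewritten aggregate expression with
`alias_qualified(expr, Some("t"), "a")` yields an output field named `t.a`,
so the rewritten plan could keep the original qualified field, while passing
`None` reproduces today's unqualified alias behaviour.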

I would love to know any thoughts on this, and whether this discussion is
perhaps better suited to the Discord channel or someplace else.

Thanks,
Marko

[1]
https://github.com/apache/arrow-datafusion/blob/656c6a93fadcec7bc43a8a881dfaf55388b0b5c6/datafusion/expr/src/expr_schema.rs#L285-L305
[2] https://github.com/apache/arrow-datafusion/issues/8008
[3] https://github.com/apache/arrow-datafusion/pull/7981
[4]
https://github.com/apache/arrow-datafusion/blob/724bafd4de98eff8d6ffd67942d29d0f9faf2aa3/datafusion/optimizer/src/optimizer.rs#L427-L452
