Jefffrey commented on issue #8379:
URL:
https://github.com/apache/arrow-datafusion/issues/8379#issuecomment-1836817802
Actually, I think I was off the mark on what `ExprId` is intended to do. It
seems it would be more useful if there were a new logical expr variant such as
`AttributeReference`, which would refer to an expr from the parent plan by
`ExprId`.
For example, given this logical plan:
```
Projection: a.int_col, b.double_col, CAST(a.date_string_col AS Utf8)
  Inner Join: a.int_col = b.int_col
    SubqueryAlias: a
      Projection: alltypes_plain.int_col, alltypes_plain.date_string_col
        Filter: alltypes_plain.id > Int32(1)
          TableScan: alltypes_plain projection=[id, int_col, date_string_col], partial_filters=[alltypes_plain.id > Int32(1)]
    SubqueryAlias: b
      Projection: alltypes_plain.int_col, alltypes_plain.double_col
        Filter: CAST(alltypes_plain.tinyint_col AS Float64) < alltypes_plain.double_col
          TableScan: alltypes_plain projection=[tinyint_col, int_col, double_col], partial_filters=[CAST(alltypes_plain.tinyint_col AS Float64) < alltypes_plain.double_col]
```
In that top-level projection, for example, `a.int_col` is a `Column`, which
has to be looked up by name in the parent schema when the plan is turned into
a physical plan:
https://github.com/apache/arrow-datafusion/blob/a6e6d3fab083839239ef81cf3a3546dd8929a541/datafusion/core/src/physical_planner.rs#L879-L891
With expr IDs, `a.int_col` could instead be an `AttributeReference` that
points into the parent's expr list and identifies the expr it refers to by ID.
I think each new expr would then get a new ID.
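To make the idea a bit more concrete, here is a rough sketch of what I mean. Nothing here is existing DataFusion API: `ExprId`, `SketchExpr`, `NamedExpr` and `resolve` are all hypothetical names, and the global counter is just one assumed way of handing out a fresh ID per expr.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical unique identifier for an expr (not an existing DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ExprId(u64);

/// Assumed ID source: a process-wide counter.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

impl ExprId {
    /// Every newly created expr gets an ID that has not been handed out before.
    fn fresh() -> Self {
        ExprId(NEXT_ID.fetch_add(1, Ordering::Relaxed))
    }
}

/// Hypothetical, heavily reduced stand-in for the logical expr enum.
#[derive(Debug, Clone, PartialEq)]
enum SketchExpr {
    /// Today's behaviour: a column identified by (optional) relation and name.
    Column { relation: Option<String>, name: String },
    /// Proposed: points at an expr of the parent plan by ID.
    AttributeReference(ExprId),
}

/// An expr in the parent's expr list together with its ID.
struct NamedExpr {
    id: ExprId,
    expr: SketchExpr,
}

/// Find the position of the referenced expr by ID, with no name matching.
fn resolve(parent_exprs: &[NamedExpr], id: ExprId) -> Option<usize> {
    parent_exprs.iter().position(|e| e.id == id)
}
```

With something like this, the top-level `a.int_col` would be `SketchExpr::AttributeReference(id)`, and physical planning would call `resolve` on the parent's exprs instead of searching the schema by name.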
Honestly I could be way off the mark here on the usages/benefits of exprid 😅
It's just something I was thinking about, especially given how verbose it can
be to check whether two columns are the same once you take the table, schema
and catalog parts of a column identifier into account.
- See the troubles with the ambiguity check here:
https://github.com/apache/arrow-datafusion/issues/6012
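As a rough illustration of that verbosity (a simplified sketch, not DataFusion's actual matching logic; `QualifiedName`, `names_may_match` and `ExprId` are hypothetical names), compare matching on partially qualified names with matching on IDs:

```rust
/// Hypothetical column identifier with optional qualifier parts.
#[derive(Debug, Clone, PartialEq, Eq)]
struct QualifiedName {
    catalog: Option<String>,
    schema: Option<String>,
    table: Option<String>,
    name: String,
}

/// Small constructor for columns qualified by table only (or not at all).
fn col(table: Option<&str>, name: &str) -> QualifiedName {
    QualifiedName {
        catalog: None,
        schema: None,
        table: table.map(str::to_string),
        name: name.to_string(),
    }
}

/// A qualifier part is compatible when either side omits it or both are equal.
fn part_matches(a: &Option<String>, b: &Option<String>) -> bool {
    match (a, b) {
        (Some(x), Some(y)) => x == y,
        _ => true,
    }
}

/// Name-based check: every part has to be compared, and an unqualified
/// reference can be compatible with several different columns.
fn names_may_match(a: &QualifiedName, b: &QualifiedName) -> bool {
    a.name == b.name
        && part_matches(&a.table, &b.table)
        && part_matches(&a.schema, &b.schema)
        && part_matches(&a.catalog, &b.catalog)
}

/// Hypothetical expr ID; equality is the whole check.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ExprId(u64);

fn ids_match(a: ExprId, b: ExprId) -> bool {
    a == b
}
```

In the plan above, an unqualified `int_col` would be name-compatible with both `a.int_col` and `b.int_col`, which is where an ambiguity check is needed; with IDs the two join inputs would simply carry different IDs.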
So instead of having to find the original column of a projected column by name
during logical optimization and physical planning, that lookup could be done
once in an analyzer rule pass, with expr IDs used from then on.
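A minimal sketch of such a one-off resolution step, assuming the parent's output is available as `(name, id)` pairs. `resolve_names` and `ExprId` are hypothetical, and a real analyzer rule would work on the actual plan and schema types rather than plain strings:

```rust
use std::collections::HashMap;

/// Hypothetical expr ID (not an existing DataFusion type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ExprId(u64);

/// Resolve each referenced column name against the parent's output once.
/// Later passes can then compare and look up by ID only.
fn resolve_names(
    parent_output: &[(String, ExprId)],
    names: &[&str],
) -> Result<Vec<ExprId>, String> {
    // Index the parent's output by name; a name may map to several IDs.
    let mut by_name: HashMap<&str, Vec<ExprId>> = HashMap::new();
    for (name, id) in parent_output {
        by_name.entry(name.as_str()).or_default().push(*id);
    }

    names
        .iter()
        .map(|name| match by_name.get(*name).map(Vec::as_slice) {
            // Exactly one candidate: the reference is unambiguous.
            Some([id]) => Ok(*id),
            // More than one candidate: report the ambiguity here, once.
            Some(_) => Err(format!("ambiguous column reference: {name}")),
            None => Err(format!("unknown column: {name}")),
        })
        .collect()
}
```

Any ambiguity error would surface in this pass, and everything after it could treat a reference as just an ID.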