pepijnve commented on issue #21231:
URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4151111735
My first thought was to add something like this
```
pub trait PhysicalExpr {
    /// Returns the column index if this expression is a plain column reference.
    fn column_index(&self) -> Option<usize> {
        None
    }
}
```
to avoid the downcast to `Column`.
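To make the difference concrete, here is a minimal sketch using a toy stand-in trait, not DataFusion's actual `PhysicalExpr`. The `Literal` type and the two helper functions `index_via_downcast` and `index_via_trait` are hypothetical names introduced only for this illustration.

```rust
use std::any::Any;

// Toy stand-in for `PhysicalExpr`; only the parts needed for the comparison.
trait PhysicalExpr {
    fn as_any(&self) -> &dyn Any;

    // Proposed hook: `Some(index)` for column references, `None` otherwise.
    fn column_index(&self) -> Option<usize> {
        None
    }
}

struct Column {
    index: usize,
}

// Hypothetical non-column expression that keeps the default `None`.
struct Literal;

impl PhysicalExpr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn column_index(&self) -> Option<usize> {
        Some(self.index)
    }
}

impl PhysicalExpr for Literal {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Downcast style: the caller has to know about the concrete `Column` type.
fn index_via_downcast(expr: &dyn PhysicalExpr) -> Option<usize> {
    expr.as_any().downcast_ref::<Column>().map(|c| c.index)
}

// Trait-method style: the caller only asks the expression itself.
fn index_via_trait(expr: &dyn PhysicalExpr) -> Option<usize> {
    expr.column_index()
}
```

Both helpers agree on `Column` and `Literal` here; the trait method just moves the knowledge of what counts as a column reference into the expression.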
For something like this (not real syntax, just making some educated guesses
based on what you said)
```
case x@0
when y@1 then array_transform(z@2, l@3 -> l@3 + 1)
end
```
where the lambda variable is an extra index compared to the parent scope, or
```
case x@0
when y@1 then array_transform(z@2, l@0' -> l@0' + 1)
end
```
where `0'` is a redefined index, that's not going to be sufficient though.
Without knowledge of the existence of lambda functions you can't really do the
right thing. In the first alternative you would end up trying to project a
column index that doesn't exist from the perspective of the record batch coming
into the case expression. In the second alternative you would end up retaining
an existing column index, but that's actually a misinterpretation of the index.
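Both failure modes can be reproduced with a toy expression tree. This is only a sketch: the `Expr` enum and the helpers `naive_indices`, `array_transform` and `case_expr` are hypothetical, and it assumes the incoming record batch has exactly the three columns `x`, `y` and `z`.

```rust
use std::collections::BTreeSet;

// Toy expression tree. `Column(i)` stands for any `name@i` reference and
// `Node` for any expression with children (case, function call, lambda, `+`).
enum Expr {
    Column(usize),
    Literal(i64),
    Node(Vec<Expr>),
}

// Lambda-unaware traversal: every `Column` is assumed to index the input batch.
fn collect(expr: &Expr, out: &mut BTreeSet<usize>) {
    match expr {
        Expr::Column(i) => {
            out.insert(*i);
        }
        Expr::Literal(_) => {}
        Expr::Node(children) => {
            for child in children {
                collect(child, out);
            }
        }
    }
}

// Sorted, de-duplicated indices as seen by the lambda-unaware traversal.
fn naive_indices(expr: &Expr) -> Vec<usize> {
    let mut out = BTreeSet::new();
    collect(expr, &mut out);
    out.into_iter().collect()
}

// array_transform(z@2, l@N -> l@N + 1), with the lambda variable at index N.
fn array_transform(lambda_index: usize) -> Expr {
    let body = Expr::Node(vec![Expr::Column(lambda_index), Expr::Literal(1)]);
    let lambda = Expr::Node(vec![Expr::Column(lambda_index), body]);
    Expr::Node(vec![Expr::Column(2), lambda])
}

// case x@0 when y@1 then <then_branch> end
fn case_expr(then_branch: Expr) -> Expr {
    Expr::Node(vec![Expr::Column(0), Expr::Column(1), then_branch])
}
```

With the first encoding, `naive_indices(&case_expr(array_transform(3)))` yields `[0, 1, 2, 3]`, and index 3 does not exist in the assumed three-column batch. With the second encoding, the `then` branch on its own, `naive_indices(&array_transform(0))`, yields `[0, 2]`, so column 0 is retained even though only `z@2` is read from the batch.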
One way around that might be to push the recursion into `PhysicalExpr` with
something like
```
// Needs `use itertools::Itertools;` for `sorted()` and `unique()`.
fn required_column_indices(&self) -> Vec<usize> {
    self.children()
        .iter()
        .flat_map(|child| child.required_column_indices())
        .sorted()
        .unique()
        .collect()
}
```
but I don't think that's going to get us to a solution for lambda functions
either. I only just started looking at the lambda PR. I'll continue with that
tomorrow; maybe I can find some inspiration there.
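For reference, the recursive `required_column_indices` idea can be sketched in a self-contained form. This uses a toy trait rather than DataFusion's real `PhysicalExpr`, swaps the itertools calls for a standard-library `BTreeSet`, and assumes that `Column` overrides the default to report its own index. `BinaryExpr`, `col` and `binary` are hypothetical helpers for the example.

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

// Toy stand-in trait; only `children` and the proposed method are modelled.
trait PhysicalExpr {
    fn children(&self) -> Vec<&Arc<dyn PhysicalExpr>>;

    // Default: union of the children's requirements, sorted and de-duplicated.
    fn required_column_indices(&self) -> Vec<usize> {
        let set: BTreeSet<usize> = self
            .children()
            .iter()
            .flat_map(|child| child.required_column_indices())
            .collect();
        set.into_iter().collect()
    }
}

struct Column {
    index: usize,
}

struct BinaryExpr {
    left: Arc<dyn PhysicalExpr>,
    right: Arc<dyn PhysicalExpr>,
}

impl PhysicalExpr for Column {
    fn children(&self) -> Vec<&Arc<dyn PhysicalExpr>> {
        vec![]
    }

    // Base case of the recursion: a column requires exactly its own index.
    fn required_column_indices(&self) -> Vec<usize> {
        vec![self.index]
    }
}

impl PhysicalExpr for BinaryExpr {
    fn children(&self) -> Vec<&Arc<dyn PhysicalExpr>> {
        vec![&self.left, &self.right]
    }
}

// Hypothetical constructors to keep the usage short.
fn col(index: usize) -> Arc<dyn PhysicalExpr> {
    Arc::new(Column { index })
}

fn binary(left: Arc<dyn PhysicalExpr>, right: Arc<dyn PhysicalExpr>) -> Arc<dyn PhysicalExpr> {
    Arc::new(BinaryExpr { left, right })
}
```

Here `binary(binary(col(2), col(0)), col(2)).required_column_indices()` returns `[0, 2]`. A lambda expression could in principle override the method to hide its own variables, but the sketch does not attempt that.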
Short term, we might just have to disable the 'project' code path. It could
be turned into an optimisation instead with a dedicated 'project' wrapper
expression. That way you could at least opt out if it's getting in the way (or
alternatively make it opt-in only).