pepijnve commented on issue #21231:
URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4151111735

   My first thought was to add something like this
   
   ```
   pub trait PhysicalExpr {
       fn column_index(&self) -> Option<usize> {
           None
       }
   }
   ```
   
   to avoid the downcast to `Column`.
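
To make the comparison concrete, here is a minimal sketch of both lookups side by side. It assumes a stripped-down stand-in trait named `Expr` (hypothetical, not DataFusion's actual `PhysicalExpr`), and the `Column` and `Literal` types and both helper functions are likewise made up for illustration.

```rust
use std::any::Any;

// Hypothetical, stripped-down stand-in for a physical expression trait.
trait Expr {
    fn as_any(&self) -> &dyn Any;

    // Proposed hook: only column references override this default.
    fn column_index(&self) -> Option<usize> {
        None
    }
}

// Toy column reference, like `x@0`.
struct Column {
    index: usize,
}

// Toy non-column expression.
struct Literal;

impl Expr for Column {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn column_index(&self) -> Option<usize> {
        Some(self.index)
    }
}

impl Expr for Literal {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Downcast-based lookup: the caller has to know the concrete `Column` type.
fn index_via_downcast(expr: &dyn Expr) -> Option<usize> {
    expr.as_any().downcast_ref::<Column>().map(|c| c.index)
}

// Trait-based lookup: works for any implementor without naming its type.
fn index_via_trait(expr: &dyn Expr) -> Option<usize> {
    expr.column_index()
}
```

Both helpers agree on `Column` and on non-column expressions; the difference is only that the second one no longer depends on the concrete type.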
   
   For something like this (not real syntax, just making some educated guesses 
based on what you said)
   
   ```
   case x@0
   when y@1 then array_transform(z@2, l@3 -> l@3 + 1)
   end
   ```
   
   where the lambda variable is an extra index compared to the parent scope, or
   
   ```
   case x@0
   when y@1 then array_transform(z@2, l@0' -> l@0' + 1)
   end
   ```
   
   where `0'` is a redefined index, a `column_index` method like that is not 
going to be sufficient though. Without knowledge of the existence of lambda 
functions you can't really do the right thing. In the first alternative you 
would end up trying to project a column index that doesn't exist from the 
perspective of the record batch coming into the case expression. In the second 
alternative you would end up retaining an existing column index, but that would 
be a misinterpretation of the index, since it refers to the lambda variable 
rather than to a column of the batch.
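
Both failure modes can be reproduced with a toy model. The sketch below assumes a hypothetical `Expr` enum (not DataFusion's types) and a scope-unaware collector that treats every index as a batch column; `array_transform` here is just a helper that builds the `then` branch from the examples above, with the literal `1` left out.

```rust
use std::collections::BTreeSet;

// Hypothetical toy expression tree; none of these types are DataFusion's.
enum Expr {
    // Reference by index, like `z@2` or the lambda variable `l@3`.
    Column(usize),
    // Lambda body; it refers to the lambda variable by index.
    Lambda(Box<Expr>),
    // Any other node with child expressions.
    Call(Vec<Expr>),
}

// Scope-unaware collection: every index is assumed to be a batch column.
fn naive_indices(expr: &Expr, out: &mut BTreeSet<usize>) {
    match expr {
        Expr::Column(i) => {
            out.insert(*i);
        }
        Expr::Lambda(body) => naive_indices(body, out),
        Expr::Call(children) => {
            for child in children {
                naive_indices(child, out);
            }
        }
    }
}

// Builds `array_transform(z@2, l@<var> -> l@<var> + 1)`.
fn array_transform(lambda_var: usize) -> Expr {
    Expr::Call(vec![
        Expr::Column(2),
        Expr::Lambda(Box::new(Expr::Call(vec![Expr::Column(lambda_var)]))),
    ])
}

// Returns the collected indices in ascending order.
fn collect(expr: &Expr) -> Vec<usize> {
    let mut out = BTreeSet::new();
    naive_indices(expr, &mut out);
    out.into_iter().collect()
}
```

With `array_transform(3)` the collector returns `[2, 3]`, and index 3 does not exist if the incoming batch only has columns 0 to 2. With `array_transform(0)` it returns `[0, 2]`, so column 0 is retained for this branch even though the branch never reads it.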
   
   One way around that might be to push the recursion into `PhysicalExpr` with 
something like
   ```
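   // Sketch of a default method on `PhysicalExpr`. `sorted` and `dedup` come
   // from `itertools::Itertools`; a leaf such as `Column` would have to
   // override this to report its own index.
   fn required_column_indices(&self) -> Vec<usize> {
       self.children()
           .iter()
           .flat_map(|child| child.required_column_indices())
           .sorted()
           .dedup()
           .collect()
   }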
   ```
   but I don't think that's going to get us to a solution for lambda functions 
either. I only just started looking at the lambda PR. I'll continue with that 
tomorrow; maybe I can find some inspiration there.
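
For reference, the recursion itself could look roughly like the following self-contained sketch. It again uses a hypothetical stand-in trait `Expr` rather than the real `PhysicalExpr`, the `Column` and `BinaryExpr` types and the `col` helper are made up, and a `BTreeSet` replaces the itertools calls so that only the standard library is needed.

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

// Hypothetical, stripped-down stand-in for a physical expression trait.
trait Expr {
    fn children(&self) -> Vec<&Arc<dyn Expr>>;

    // Default: union of the children's indices, sorted and deduplicated.
    fn required_column_indices(&self) -> Vec<usize> {
        let set: BTreeSet<usize> = self
            .children()
            .into_iter()
            .flat_map(|child| child.required_column_indices())
            .collect();
        set.into_iter().collect()
    }
}

// Leaf: reports its own index instead of recursing.
struct Column(usize);

// Inner node: relies on the default implementation.
struct BinaryExpr {
    left: Arc<dyn Expr>,
    right: Arc<dyn Expr>,
}

impl Expr for Column {
    fn children(&self) -> Vec<&Arc<dyn Expr>> {
        vec![]
    }

    fn required_column_indices(&self) -> Vec<usize> {
        vec![self.0]
    }
}

impl Expr for BinaryExpr {
    fn children(&self) -> Vec<&Arc<dyn Expr>> {
        vec![&self.left, &self.right]
    }
}

// Convenience constructor for a column reference.
fn col(index: usize) -> Arc<dyn Expr> {
    Arc::new(Column(index))
}
```

The default works for nodes whose children all live in the same scope as the parent. A node that introduces its own variables, such as a lambda, would still need some way to say which of its children's indices belong to the outer batch, which this sketch does not attempt.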
   
   Short term, we might just have to disable the 'project' code path. It could 
be turned into an optimisation instead, using a dedicated 'project' wrapper 
expression. That way you could at least opt out if it's getting in the way (or 
alternatively we could make it opt-in only).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

