pepijnve commented on PR #18055:
URL: https://github.com/apache/datafusion/pull/18055#issuecomment-3406549307

   > There is similar code for filtering here (namely that evaluates the filter 
expression first, and then only calls `filter` with columns that are needed)
   
   This touches on one of the things I was struggling with a bit while working in 
the `PhysicalExpr` context rather than `ExecutionPlan`. While each `ExecutionPlan` 
is aware of its own input and output schema, a `PhysicalExpr` is not. Instead, 
the `Schema` is passed in as an argument to functions like `nullable`, and for 
`evaluate` specifically, you get it via the `RecordBatch`. The consequence is 
that I have to call `RecordBatch::project`, which ends up deriving the same 
schema on every invocation. I wasn't sure how we could fix this.
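   To make the repeated schema derivation concrete, here is a minimal sketch 
using stand-in types. `Schema`, `RecordBatch`, `project_with_schema`, and 
`sample_batch` below are simplified or hypothetical illustrations, not the Arrow 
or DataFusion definitions; the only point is to contrast deriving the projected 
schema on each call with deriving it once and reusing it.

```rust
use std::sync::Arc;

// Stand-in types for illustration only; these are NOT the Arrow definitions.
#[derive(Debug, PartialEq)]
struct Schema {
    fields: Vec<String>,
}

#[derive(Debug)]
struct RecordBatch {
    schema: Arc<Schema>,
    columns: Vec<Arc<Vec<i64>>>,
}

impl Schema {
    // Builds a new schema containing only the selected fields.
    fn project(&self, indices: &[usize]) -> Schema {
        Schema {
            fields: indices.iter().map(|&i| self.fields[i].clone()).collect(),
        }
    }
}

impl RecordBatch {
    // Mirrors the pattern described above: the projected schema is
    // re-derived from the batch's schema on every call.
    fn project(&self, indices: &[usize]) -> RecordBatch {
        RecordBatch {
            schema: Arc::new(self.schema.project(indices)),
            columns: indices.iter().map(|&i| Arc::clone(&self.columns[i])).collect(),
        }
    }

    // Hypothetical alternative: reuse a schema that was derived once up front.
    fn project_with_schema(&self, indices: &[usize], projected: &Arc<Schema>) -> RecordBatch {
        RecordBatch {
            schema: Arc::clone(projected),
            columns: indices.iter().map(|&i| Arc::clone(&self.columns[i])).collect(),
        }
    }
}

// Hypothetical helper that builds a three-column batch for the example.
fn sample_batch() -> RecordBatch {
    RecordBatch {
        schema: Arc::new(Schema {
            fields: vec!["a".to_string(), "b".to_string(), "c".to_string()],
        }),
        columns: vec![
            Arc::new(vec![1, 2]),
            Arc::new(vec![3, 4]),
            Arc::new(vec![5, 6]),
        ],
    }
}
```

   In this model, two calls to `project` with the same indices yield equal but 
separately allocated schemas, while `project_with_schema` hands back the same 
`Arc` each time. Whether the real Arrow types allow an equally cheap path is not 
something this sketch establishes.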
   
   I already need the schema anyway to decide whether it makes sense to 
project. One simple solution is to just keep a reference to that schema. 
But things get a bit weird when a `PhysicalExpr` holds a reference to a schema 
but also receives one externally when `nullable` and friends are called.
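
   The awkwardness of having two schema sources can be sketched as follows. 
Everything here is hypothetical: `Schema` is a stand-in type, and 
`CachedSchemaExpr` with its `planned_schema` field is an invented example rather 
than the DataFusion `PhysicalExpr` trait. Returning an error on a mismatch is 
just one way to surface the ambiguity, not something proposed in this PR.

```rust
use std::sync::Arc;

// Stand-in schema type for illustration; not the Arrow `Schema`.
#[derive(Debug, PartialEq)]
struct Schema {
    // (column name, nullable) pairs
    fields: Vec<(String, bool)>,
}

// Hypothetical expression that keeps the schema it was planned against.
struct CachedSchemaExpr {
    planned_schema: Arc<Schema>,
    column: usize,
}

impl CachedSchemaExpr {
    // The caller also supplies a schema, as with `nullable`. The two sources
    // can disagree, so this sketch reports a mismatch instead of silently
    // picking one of them.
    fn nullable(&self, input_schema: &Schema) -> Result<bool, String> {
        if *self.planned_schema != *input_schema {
            return Err("stored schema differs from the schema passed in".to_string());
        }
        Ok(input_schema.fields[self.column].1)
    }
}
```

   When the stored and supplied schemas agree, the stored copy is redundant; when 
they differ, it is unclear which one should win, which is the oddity described 
above.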



