jorgecarleitao commented on pull request #9976:
URL: https://github.com/apache/arrow/pull/9976#issuecomment-817243506


   Isn't the filename that a column came from uniquely identified by the 
logical plan? If two physical plans arrive at different conclusions about a 
column's provenance, then those physical plans are using two different data 
origins, which implies different semantics.
   
   This is the rationale I was using to recommend addressing this at the DAG 
level. I do agree that the logical plan may not have all the information about 
how the source is partitioned and the exact file names (as those may even 
change over time), but I would expect that to be resolved as part of query 
execution (just like we resolve the physical plan when we run `SHOW PLAN`).
   
   @seddonm1, I think that a physical expression does not need to be a 
`ScalarFunctionExpr`: `ScalarFunctionExpr` is useful for cases where the 
physical operation can be described by a simple function and signature. 
Check, e.g., how the binary operators are defined: they have their own 
custom `struct` that implements `PhysicalExpr`.
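
   To illustrate the distinction, here is a minimal, self-contained sketch. It 
does not use DataFusion's real API: the `Batch`, `Value`, and `PhysicalExpr` 
definitions below are simplified stand-ins, and `SourceFileName` is a 
hypothetical expression invented for this example. The point is only that a 
function-style expression sees nothing but its argument values, while a custom 
`struct` can read other context from the batch it evaluates.

```rust
// Simplified stand-ins (assumptions for illustration, NOT DataFusion's types).
struct Batch {
    source_file: String,               // file the rows were read from
    columns: Vec<(String, Vec<i64>)>,  // named integer columns
}

impl Batch {
    fn num_rows(&self) -> usize {
        self.columns.first().map(|(_, v)| v.len()).unwrap_or(0)
    }
}

#[derive(Debug, PartialEq)]
enum Value {
    Int(Vec<i64>),
    Text(Vec<String>),
}

// Hypothetical, stripped-down version of a physical expression trait.
trait PhysicalExpr {
    fn evaluate(&self, batch: &Batch) -> Result<Value, String>;
}

// Reads a column by name.
struct Column {
    name: String,
}

impl PhysicalExpr for Column {
    fn evaluate(&self, batch: &Batch) -> Result<Value, String> {
        batch
            .columns
            .iter()
            .find(|(n, _)| n == &self.name)
            .map(|(_, v)| Value::Int(v.clone()))
            .ok_or_else(|| format!("no column named {}", self.name))
    }
}

// Function-style expression: a plain function applied to evaluated arguments.
// The function only ever sees its argument values, never the batch itself.
struct ScalarFunctionExpr {
    fun: fn(&[Value]) -> Result<Value, String>,
    args: Vec<Box<dyn PhysicalExpr>>,
}

impl PhysicalExpr for ScalarFunctionExpr {
    fn evaluate(&self, batch: &Batch) -> Result<Value, String> {
        let args: Result<Vec<Value>, String> =
            self.args.iter().map(|e| e.evaluate(batch)).collect();
        (self.fun)(&args?)
    }
}

// Example scalar function: negate a single integer column.
fn negate(args: &[Value]) -> Result<Value, String> {
    match args {
        [Value::Int(v)] => Ok(Value::Int(v.iter().map(|x| -x).collect())),
        _ => Err("negate expects one integer argument".to_string()),
    }
}

// A binary operator with its own struct, implementing the trait directly.
struct Add {
    left: Box<dyn PhysicalExpr>,
    right: Box<dyn PhysicalExpr>,
}

impl PhysicalExpr for Add {
    fn evaluate(&self, batch: &Batch) -> Result<Value, String> {
        match (self.left.evaluate(batch)?, self.right.evaluate(batch)?) {
            (Value::Int(l), Value::Int(r)) if l.len() == r.len() => Ok(Value::Int(
                l.iter().zip(r.iter()).map(|(a, b)| a + b).collect(),
            )),
            _ => Err("Add expects two integer columns of equal length".to_string()),
        }
    }
}

// Hypothetical custom expression that needs batch-level context (the source
// file), which does not fit a "function of argument values" signature.
struct SourceFileName;

impl PhysicalExpr for SourceFileName {
    fn evaluate(&self, batch: &Batch) -> Result<Value, String> {
        Ok(Value::Text(vec![batch.source_file.clone(); batch.num_rows()]))
    }
}
```

   In this sketch, `SourceFileName` could not be written as a 
`ScalarFunctionExpr` without changing the function signature, which is the 
kind of case where a dedicated `struct` is the more natural fit.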

