ethan-tyler commented on PR #20071:
URL: https://github.com/apache/datafusion/pull/20071#issuecomment-3825022132

   > I think it should be much simpler than this: create a UDF and pass it 
through `ProjectionExprs::transform_exprs(|expr| expr.transform(|expr| // if 
expr is our ScalarUDF, replace with literal filename))`
   
   Thanks @adriangb - really appreciate the feedback! 
   
   I've been working on implementing your approach where `input_file_name()` 
stays as a 0-arg scalar UDF, then `ProjectionOpener::open(partitioned_file)` 
does `expr.transform(...)` to replace the `ScalarFunctionExpr` with 
`Literal(Utf8(partitioned_file.location))`.
   
   
   Does this direction look right to you?
   
   One question on plumbing: in many plans (e.g. `SELECT input_file_name() FROM 
t WHERE a > 1`), a top `ProjectionExec` can't push through `FilterExec` without 
dropping columns needed by the filter. This causes the UDF to stay above the 
scan and hit the runtime error guard.
   
   
   For this first PR, which is better?
   1. Minimal opener rewrite only: works only when projection naturally pushes 
into the scan
   2. Include analyzer/optimizer/planner glue: marks scan as needing the 
reserved column, injects UDF into file source projection so it works with 
filters/sorts/limits


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to