AdamGS opened a new pull request, #22978:
URL: https://github.com/apache/datafusion/pull/22978

   ## Which issue does this PR close?
   
   - Closes #6051.
   - Part of #20135.
   
   ## Rationale for this change
   
   Adds useful metadata functions to DataFusion. This PR builds on 
@ethan-tyler's #20071. 
   
   ## What changes are included in this PR?
   
   1. A new `input_file_name()` UDF, which reports a UTF8 return_type. Like 
`file_row_index()`, it errors when evaluated out of context.
   2. Rewrites in the `FileOpener` (both in `ParquetOpener` and 
`ProjectionOpener`) that turn each `input_file_name()` call into a literal 
holding the file's path.
   3. New public rewrite helper `rewrite_input_file_name_in_projection` in 
`datafusion-physical-expr-adapter`.
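
The rewrite described in the list above can be sketched as follows. This is a minimal, standard-library-only illustration and not the code in this PR: the `Expr` enum, `rewrite_input_file_name`, and `eval_const` are hypothetical stand-ins for DataFusion's physical expression types and for the real `rewrite_input_file_name_in_projection` helper, whose signature is not shown here.

```rust
/// Toy expression tree (hypothetical; not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A column reference by name.
    Column(String),
    /// A UTF-8 string literal.
    Utf8Literal(String),
    /// A scalar function call such as `input_file_name()`.
    Call { name: String, args: Vec<Expr> },
}

/// Hypothetical helper: walk the tree and replace every zero-argument
/// `input_file_name()` call with a literal holding `path`.
fn rewrite_input_file_name(expr: Expr, path: &str) -> Expr {
    match expr {
        Expr::Call { name, args } if name.as_str() == "input_file_name" && args.is_empty() => {
            Expr::Utf8Literal(path.to_string())
        }
        // Recurse into the arguments of any other call.
        Expr::Call { name, args } => Expr::Call {
            name,
            args: args
                .into_iter()
                .map(|arg| rewrite_input_file_name(arg, path))
                .collect(),
        },
        // Columns and literals are returned unchanged.
        other => other,
    }
}

/// Evaluate an expression without any row data. An `input_file_name()` call
/// that was never rewritten has no file to report, so it is an error here,
/// mirroring the "errors when evaluated out of context" behaviour.
fn eval_const(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Utf8Literal(s) => Ok(s.clone()),
        Expr::Call { name, args } if name.as_str() == "input_file_name" && args.is_empty() => {
            Err("input_file_name() evaluated outside a file scan".to_string())
        }
        // `upper` is an assumed one-argument function used only for this demo.
        Expr::Call { name, args } if name.as_str() == "upper" && args.len() == 1 => {
            Ok(eval_const(&args[0])?.to_uppercase())
        }
        Expr::Call { name, .. } => Err(format!("unsupported function: {name}")),
        Expr::Column(c) => Err(format!("column {c} needs row data")),
    }
}
```

In this sketch, evaluating `upper(input_file_name())` directly returns an error, while running `rewrite_input_file_name` first with the path of the file being opened yields `upper('<path>')`, which evaluates normally. Doing the substitution per file at open time is what lets the function behave like a constant within a single file scan.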
   
   ## Are these changes tested?
   
   1. Unit tests at all rewrite sites and for the core rewrite logic
   2. New SLT tests working on CSV
   3. New SLT tests covering metadata functions specifically on Parquet, which 
has its own opener. These include `file_row_index()`.
   
   ## Are there any user-facing changes?
   - The new UDF
   - New public function in `datafusion-physical-expr-adapter` - 
`rewrite_input_file_name_in_projection`.
   
   AI was used in this PR, mostly to help come up with test cases.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

