zml1206 opened a new pull request, #6517: URL: https://github.com/apache/incubator-gluten/pull/6517
## What changes were proposed in this pull request?

The Spark implementation of `input_file_name` uses a thread local to stash the file name, and the function retrieves it from there. If the plan from the `Project` containing `input_file_name` down to the scan includes a transform node, `input_file_name` returns an empty string. For example, reading a Delta Lake table unions the checkpoint Parquet file with the JSON files and then orders by `input_file_name` to get the Parquet data files; with an empty file name, the resulting Parquet file list is wrong. So we should push `input_file_name` down into the transformer scan, or, when the scan falls back, add a project directly above the fallback scan.

## How was this patch tested?

UT
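
As a rough illustration of the thread-local behaviour described in this PR, the sketch below models the problem in plain Java. It is not Spark or Gluten code: the class `InputFileNameSketch`, the holder `CURRENT_FILE`, and the methods `vanillaScan` and `offloadedScan` are hypothetical names introduced only for this example, and the "offloaded" scan is assumed to never update the JVM-side holder.

```java
public class InputFileNameSketch {
    // Hypothetical stand-in for the per-thread holder that stashes the current file name.
    // The default value is an empty string, which is what a reader sees if nothing was set.
    private static final ThreadLocal<String> CURRENT_FILE = ThreadLocal.withInitial(() -> "");

    // Models a vanilla JVM scan: it records the file name before producing rows.
    static void vanillaScan(String fileName) {
        CURRENT_FILE.set(fileName);
    }

    // Models an offloaded (transformer) scan: the file is read elsewhere,
    // so, by assumption, the JVM-side holder is never updated.
    static void offloadedScan(String fileName) {
        // intentionally does not touch CURRENT_FILE
    }

    // Models the function evaluated in a JVM Project: it only reads the holder.
    static String inputFileName() {
        return CURRENT_FILE.get();
    }

    static String vanillaScanThenProject(String fileName) {
        CURRENT_FILE.remove();          // start from a clean holder
        vanillaScan(fileName);
        return inputFileName();
    }

    static String offloadedScanThenProject(String fileName) {
        CURRENT_FILE.remove();          // start from a clean holder
        offloadedScan(fileName);
        return inputFileName();
    }

    public static void main(String[] args) {
        System.out.println("vanilla:   '" + vanillaScanThenProject("part-0.parquet") + "'");
        System.out.println("offloaded: '" + offloadedScanThenProject("part-0.parquet") + "'");
    }
}
```

In this toy model the second call yields an empty string because the reader and the component that knows the file name never share the holder, which is why evaluating the expression inside the scan itself, or right next to a fallback scan, avoids the problem.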
