aokolnychyi edited a comment on pull request #35395: URL: https://github.com/apache/spark/pull/35395#issuecomment-1074348488
@rdblue @cloud-fan, I assumed Spark would explicitly pass the delete condition (not negated) to both scan builders. For instance, if the delete condition is `part_col = 'a' and id = 1`, Spark would push it to the main scan builder and then provide an extra predicate on the filter attributes (e.g. `_file_name IN (...)`). Since the scan condition will be the same, data sources may cache and reuse some information between the scans. I can also see data sources delaying the actual split planning in the main scan until they receive the runtime filter too. I guess there are a number of ways data sources can behave.
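
To make the flow described in this comment concrete, here is a minimal toy model in Python. It is only a sketch and does not use Spark's connector API: `Condition`, `ToyFile`, `ToySource`, `ToyScan`, and `files_with_matches` are all hypothetical names invented for illustration. It assumes a source that prunes files by partition value, caches that pruned listing per scan condition, and defers split planning in the main scan until the runtime filter on `_file_name` has arrived.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Condition:
    """Toy stand-in for the delete condition `part_col = 'a' and id = 1`."""
    part_col: str
    id: int


@dataclass
class ToyFile:
    part_col: str
    ids: List[int]


@dataclass
class ToySource:
    """Hypothetical data source; not a Spark class."""
    files: Dict[str, ToyFile]
    listing_calls: int = 0
    _cache: Dict[Condition, List[str]] = field(default_factory=dict)

    def candidate_files(self, cond: Condition) -> List[str]:
        # Partition pruning is cached per condition, so two scans built
        # with the same condition share the result.
        if cond not in self._cache:
            self.listing_calls += 1
            self._cache[cond] = sorted(
                name for name, f in self.files.items()
                if f.part_col == cond.part_col
            )
        return self._cache[cond]


@dataclass
class ToyScan:
    source: ToySource
    cond: Condition
    runtime_file_names: Optional[Set[str]] = None

    def apply_runtime_filter(self, file_names: Set[str]) -> None:
        # Models the extra predicate `_file_name IN (...)`.
        self.runtime_file_names = set(file_names)

    def plan_splits(self) -> List[str]:
        # Split planning happens only here, after any runtime filter is set.
        candidates = self.source.candidate_files(self.cond)
        if self.runtime_file_names is None:
            return candidates
        return [n for n in candidates if n in self.runtime_file_names]


def files_with_matches(scan: ToyScan) -> Set[str]:
    """Read the planned files and keep those containing a matching row."""
    return {
        name for name in scan.plan_splits()
        if scan.cond.id in scan.source.files[name].ids
    }


source = ToySource({
    "f1": ToyFile("a", [1, 2]),
    "f2": ToyFile("a", [3, 4]),
    "f3": ToyFile("b", [1, 5]),
})
cond = Condition(part_col="a", id=1)

# The same condition goes to both scans.
matching_scan = ToyScan(source, cond)
main_scan = ToyScan(source, cond)

matched = files_with_matches(matching_scan)
main_scan.apply_runtime_filter(matched)
splits = main_scan.plan_splits()
```

In this sketch the second scan reuses the cached file listing because its condition is identical, and the main scan plans splits only for the files named by the runtime filter. A real data source could choose a different strategy.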
