huaxingao commented on a change in pull request #33650:
URL: https://github.com/apache/spark/pull/33650#discussion_r686106971
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala
##########
@@ -40,37 +40,43 @@ object PushDownUtils extends PredicateHelper {
def pushFilters(
scanBuilder: ScanBuilder,
filters: Seq[Expression]): (Seq[sources.Filter], Seq[Expression]) = {
+ // A map from translated data source leaf node filters to original catalyst filter
+ // expressions. For a `And`/`Or` predicate, it is possible that the predicate is partially
+ // pushed down. This map can be used to construct a catalyst filter expression from the
+ // input filter, or a superset(partial push down filter) of the input filter.
Review comment:
This method returns the pushed-down `sources.Filter`s and the post-scan filter
`Expression`s. In the returned post-scan filter expressions, we want the
partition filters to have already been removed, so that we don't need a second
rule (`PruneFileSourcePartitions`) to prune the partition filters.
We will separate the two types of filters for `FileScanBuilder` and pass only
the data filters to `ScanBuilder.pushFilters`. The separated partition filters
are set on `FileScanBuilder` as `Expression`s and are used for partition
pruning in
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/FileScan.scala#L138
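
To illustrate the separation I have in mind, here is a minimal sketch. It is a toy model and not the code in this PR: `SimpleFilter`, `FilterSplitSketch`, and `splitFilters` are hypothetical names, and each filter is reduced to the set of column names it references. The assumption is that a filter counts as a partition filter only when it references at least one column and every referenced column is a partition column.

```scala
// Toy stand-in for a Catalyst filter expression (hypothetical, not a Spark class):
// only the SQL text and the referenced column names are modelled.
final case class SimpleFilter(sql: String, references: Set[String])

object FilterSplitSketch {
  /**
   * Splits `filters` into (partitionFilters, dataFilters).
   *
   * Assumption: a filter is a partition filter only if it references at
   * least one column and all referenced columns are partition columns.
   * Everything else, including filters that mix partition and data
   * columns, is treated as a data filter.
   */
  def splitFilters(
      filters: Seq[SimpleFilter],
      partitionColumns: Set[String]): (Seq[SimpleFilter], Seq[SimpleFilter]) = {
    filters.partition { f =>
      f.references.nonEmpty && f.references.subsetOf(partitionColumns)
    }
  }
}
```

With partition column `year`, a filter on `year` alone would land in the first group, while a filter on `amount`, or one that mixes `year` and `amount`, would land in the second. Only the second group would then be handed to `pushFilters`; the first group would be kept aside for partition pruning.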
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]