sunchao commented on a change in pull request #33584:
URL: https://github.com/apache/spark/pull/33584#discussion_r680263893
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanRelationPushDown.scala
##########
@@ -57,7 +57,11 @@ object V2ScanRelationPushDown extends Rule[LogicalPlan] with PredicateHelper {
       // `postScanFilters` and `pushedFilters` can overlap, e.g. the parquet row group filter.
       val (pushedFilters, postScanFiltersWithoutSubquery) = PushDownUtils.pushFilters(
         sHolder.builder, normalizedFiltersWithoutSubquery)
-      val postScanFilters = postScanFiltersWithoutSubquery ++ normalizedFiltersWithSubquery
+      var postScanFilters = postScanFiltersWithoutSubquery ++ normalizedFiltersWithSubquery
+      val partitionFilters = PushDownUtils
+        .pushPartitionFilters(sHolder.builder, sHolder.relation, normalizedFiltersWithoutSubquery)
+      postScanFilters =
+        (ExpressionSet(postScanFilters) -- partitionFilters.filter(_.references.nonEmpty)).toSeq
Review comment:
I'm not sure why we can't do this in the old code, where `filters`
contains both partition filters and data filters. Suppose we had a method
`DataSourceUtils.getPartitionKeyFiltersAndDataFilters`; couldn't we then
simply do:
```scala
val (partitionFilters, dataFilters) =
  DataSourceUtils.getPartitionKeyFiltersAndDataFilters(...)
if (dataFilters.isEmpty) {
  // pushdown aggregates
}
```
?
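To make the suggestion concrete, here is a minimal self-contained sketch of the split, written without Spark on the classpath. Everything in it is an assumption for illustration: `Filter` is a toy stand-in for a Catalyst expression, `splitFilters` is a hypothetical helper in the spirit of the `getPartitionKeyFiltersAndDataFilters` method proposed above, and the rule that a partition filter must reference at least one column, all of them partition columns, is a modeling choice of this sketch rather than the PR's logic.

```scala
// Toy stand-in for a Catalyst predicate: only the referenced column names matter here.
final case class Filter(sql: String, references: Set[String])

object FilterSplit {
  // Hypothetical helper (not an existing Spark API): returns
  // (partitionFilters, dataFilters). In this sketch a filter is a partition
  // filter when it references at least one column and every referenced
  // column is a partition column.
  def splitFilters(
      partitionColumns: Set[String],
      filters: Seq[Filter]): (Seq[Filter], Seq[Filter]) =
    filters.partition { f =>
      f.references.nonEmpty && f.references.subsetOf(partitionColumns)
    }

  // Mirrors the `if (dataFilters.isEmpty)` check from the comment:
  // aggregates are only pushed down when no data filter remains.
  def canPushDownAggregates(
      partitionColumns: Set[String],
      filters: Seq[Filter]): Boolean = {
    val (_, dataFilters) = splitFilters(partitionColumns, filters)
    dataFilters.isEmpty
  }
}
```

With partition columns `dt` and `region`, a filter on `dt` alone would leave `dataFilters` empty and allow the pushdown, while adding a filter on a non-partition column such as `amount` would block it.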