steven-aerts commented on a change in pull request #33191:
URL: https://github.com/apache/spark/pull/33191#discussion_r665567882
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala
##########
@@ -120,7 +120,7 @@ private[sql] object PruneFileSourcePartitions
case op @ PhysicalOperation(projects, filters,
v2Relation @ DataSourceV2ScanRelation(_, scan: FileScan, output))
- if filters.nonEmpty && scan.readDataSchema.nonEmpty =>
Review comment:
@cloud-fan I tried what you proposed and added
`&& scan.readPartitionSchema.nonEmpty`.
The problem is that this prevents any data filter from being pushed down when
there is no partition filter, because the right-hand part of the condition at
line 128, `|| (dataFilters.nonEmpty && scan.dataFilters.isEmpty)`, can then
never be true.
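
To make the interaction concrete, here is a minimal, Spark-free sketch of the two conditions. `ScanModel`, `wouldPushDown`, and `requirePartitionSchema` are hypothetical names used only for illustration, filters are modelled as plain column names, and the guard without the extra check is assumed to be just `filters.nonEmpty`. It shows the case I take to be the problematic one: a scan with an empty `readPartitionSchema`.

```scala
object GuardSketch {
  // Hypothetical stand-in for the parts of FileScan the rule looks at.
  final case class ScanModel(
      readPartitionSchema: Seq[String], // partition column names
      dataFilters: Seq[String])         // data filters already pushed to the scan

  // Returns true when the rule would rewrite the scan with pushed-down filters.
  // A filter is modelled as the name of the column it references.
  def wouldPushDown(
      filters: Seq[String],
      scan: ScanModel,
      requirePartitionSchema: Boolean): Boolean = {
    // Pattern guard; the optional part mirrors `&& scan.readPartitionSchema.nonEmpty`.
    val guard =
      filters.nonEmpty && (!requirePartitionSchema || scan.readPartitionSchema.nonEmpty)
    if (!guard) {
      false
    } else {
      // Split filters into partition-key filters and data filters.
      val (partitionKeyFilters, dataFilters) =
        filters.partition(f => scan.readPartitionSchema.contains(f))
      // Condition from line 128: data filters are pushed down only once.
      partitionKeyFilters.nonEmpty || (dataFilters.nonEmpty && scan.dataFilters.isEmpty)
    }
  }

  def main(args: Array[String]): Unit = {
    val unpartitioned = ScanModel(readPartitionSchema = Seq.empty, dataFilters = Seq.empty)
    val filters = Seq("value")
    // Without the extra check the data filter is pushed down.
    println(wouldPushDown(filters, unpartitioned, requirePartitionSchema = false)) // true
    // With the extra check the guard fails first, so line 128 is never evaluated.
    println(wouldPushDown(filters, unpartitioned, requirePartitionSchema = true))  // false
  }
}
```

In this model the data-filter branch is unreachable whenever the partition schema is empty, which matches the behaviour I saw.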
The extra condition also causes [some regression tests in the AvroSuite to
fail](https://github.com/steven-aerts/spark/runs/3007716757?check_suite_focus=true).
So I rolled back to the original proposal.
Is this ok for you?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]