CodingCat commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-701500142


   > Probably, you'd better describe this a bit more in the PR description;
   > for example: currently, actual partition pruning is executed in the
   > optimizer phase (`PruneFileSourcePartitions`) if an input relation has a
   > catalog file index. The current code assumes the same partition filters
   > are generated again in `FileSourceStrategy` and passed into
   > `FileSourceScanExec`. `FileSourceScanExec` uses the partition filters when
   > listing files, but [the filters do
   > nothing](https://github.com/apache/spark/blob/cc06266ade5a4eb35089501a3b32736624208d4c/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L211-L213)
   > because unnecessary partitions are already pruned in advance, so in this
   > case the filters are mainly used for explain output. If a `WHERE` clause
   > has DNF-ed predicates, `FileSourceStrategy` cannot extract the same
   > filters as `PruneFileSourcePartitions`, and then `PartitionFilters` is
   > not shown in the explain output. In this PR, brabrabra....
   
   sure, added
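
   For readers following along, here is a minimal sketch (using hypothetical
   classes, not Spark's actual Catalyst expressions) of why a DNF-ed predicate
   defeats filter extraction: splitting on top-level `AND`s, as
   `splitConjunctivePredicates`-style helpers do, returns the whole `OR` as a
   single conjunct that mixes partition and data columns, so no
   partition-only filter can be pulled out.

   ```scala
   // Simplified expression tree; names are illustrative only.
   sealed trait Expr
   case class Attr(name: String) extends Expr
   case class EqualTo(a: Attr, v: Int) extends Expr
   case class And(l: Expr, r: Expr) extends Expr
   case class Or(l: Expr, r: Expr) extends Expr

   // Split a predicate into conjuncts by recursing only through AND nodes.
   def splitConjuncts(e: Expr): Seq[Expr] = e match {
     case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
     case other     => Seq(other)
   }

   // WHERE (p = 1 AND x = 0) OR (p = 2 AND x = 0), with p a partition column.
   val dnf: Expr = Or(
     And(EqualTo(Attr("p"), 1), EqualTo(Attr("x"), 0)),
     And(EqualTo(Attr("p"), 2), EqualTo(Attr("x"), 0)))

   // The whole OR comes back as one conjunct referencing both p and x, so a
   // per-column partition filter cannot be extracted from it.
   println(splitConjuncts(dnf).length)  // 1
   ```

   Factoring the predicate into CNF first, e.g. `(p = 1 OR p = 2) AND x = 0`,
   would let the partition-only conjunct be extracted.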


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
