[GitHub] [spark] LuciferYang opened a new pull request #30663: [SPARK-33700][SQL] Avoid file meta reading when enableFilterPushDown is true and filters is empty for Parquet and Orc

GitBox Mon, 21 Dec 2020 18:25:57 -0800


LuciferYang opened a new pull request #30663:
URL: https://github.com/apache/spark/pull/30663



   ### What changes were proposed in this pull request?
   Parquet supports filter pushdown, but this optimization reads file 
metadata from external storage even when `filters` is empty. ORC has a 
similar problem. 
   
   This PR adds an extra `filters.nonEmpty` check when 
`spark.sql.parquet.filterPushdown` or `spark.sql.orc.filterPushdown` is 
true, so the file metadata is read only when there are filters to push down.
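
A minimal sketch of the guard, to illustrate the idea rather than reproduce the actual patch. Every name here (`PushdownGuardSketch`, `Filter`, `readFooter`, `pushedPredicate`, `footerReads`) is a hypothetical stand-in, not Spark's reader code; the counter exists only to make the skipped read observable.

```scala
// Illustrative model only: these names are hypothetical stand-ins,
// not Spark's actual reader internals.
object PushdownGuardSketch {
  final case class Filter(column: String, value: Int)

  // Counts simulated footer reads so the effect of the guard is observable.
  var footerReads: Int = 0

  // Stand-in for an expensive metadata read from external storage.
  def readFooter(path: String): Map[String, String] = {
    footerReads += 1
    Map("path" -> path)
  }

  // Reads the footer only when pushdown is enabled AND there is a filter to push.
  def pushedPredicate(
      path: String,
      enableFilterPushDown: Boolean,
      filters: Seq[Filter]): Option[String] = {
    if (enableFilterPushDown && filters.nonEmpty) {
      val footer = readFooter(path)
      val predicate = filters.map(f => s"${f.column} = ${f.value}").mkString(" AND ")
      Some(s"${footer("path")}: $predicate")
    } else {
      None
    }
  }
}
```

With pushdown enabled and an empty `filters`, `pushedPredicate` returns `None` without calling `readFooter`, which is the case this change targets.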
   
   ### Why are the changes needed?
   To avoid unnecessary reads of file metadata from external storage.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Pass the Jenkins or GitHub Actions checks.
   




