Hi Spark devs,

I'd like to call a vote on the SPIP: *DSV2 Enhanced Partition Stats
Filtering*.

*Summary:*

The DataSource V2 (DSV2) framework does not currently provide full
data-skipping capabilities comparable to Spark-native sources, primarily
due to limitations in Catalyst expression evaluation.

This SPIP bridges that gap to achieve partition-skipping parity. To support
this, Spark will push new *PartitionPredicate* objects that encapsulate
Catalyst partition filter expressions and the evaluation logic, allowing
data sources to skip irrelevant partitions effectively.
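
To make the idea concrete, here is a minimal, self-contained Java sketch of the
pattern described above. It is not the proposed API: the nested
`PartitionPredicate` interface, its `eval` method, and the `prune` helper are
hypothetical stand-ins, and the actual interface is the one defined in the SPIP
doc and the POC PR. The sketch only illustrates that the data source can call
an opaque predicate per partition without interpreting the Catalyst expression
itself.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionSkippingSketch {

    // Hypothetical stand-in for the proposed PartitionPredicate; the real
    // interface is defined in the SPIP doc and POC PR and may differ.
    interface PartitionPredicate {
        // Returns true if a partition with these values may contain matching rows.
        boolean eval(Map<String, String> partitionValues);
    }

    // Data-source side: keep only the partitions that every pushed predicate accepts.
    static List<Map<String, String>> prune(
            List<Map<String, String>> partitions,
            List<PartitionPredicate> pushed) {
        return partitions.stream()
                .filter(p -> pushed.stream().allMatch(pred -> pred.eval(p)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> partitions = List.of(
                Map.of("dt", "2024-01-01"),
                Map.of("dt", "2024-01-02"),
                Map.of("dt", "2024-01-03"));

        // Spark side (simulated): a filter such as dt >= '2024-01-02', wrapped so
        // the source can evaluate it without understanding the expression.
        PartitionPredicate pred = p -> p.get("dt").compareTo("2024-01-02") >= 0;

        // Only partitions accepted by the predicate remain to be scanned.
        System.out.println(prune(partitions, List.of(pred)));
    }
}
```

In this sketch the evaluation logic travels with the predicate, so the source
needs no expression translation layer of its own.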

*Relevant Links:*

   - *SPIP Doc:*
   https://docs.google.com/document/d/17vcw411PxSRLWoK-BiLI56UiNdokLWtovF8JZUlDTOo/edit?usp=sharing
   - *Discuss Thread:*
   https://lists.apache.org/thread/p2cwngj9bmtcbmyplds833s9lwts8bwc
   - *JIRA:* SPARK-55596 <https://issues.apache.org/jira/browse/SPARK-55596>
   - *POC PR:* PR 54459 <https://github.com/apache/spark/pull/54459>

*The vote will be open for at least 72 hours.* Please vote:

[ ] +1: Accept the proposal as an official SPIP

[ ] +0

[ ] -1: I don't think this is a good idea because ...

Thanks,
Gengliang Wang
