Re: [VOTE] SPIP: DSV2 Enhanced Partition Stats Filtering

Gengliang Wang Wed, 25 Feb 2026 14:47:21 -0800

Starting with my own +1.

On Wed, Feb 25, 2026 at 2:44 PM Gengliang Wang <[email protected]> wrote:


> Hi Spark devs,
>
> I'd like to call a vote on the SPIP: *DSV2 Enhanced Partition Stats
> Filtering*.
>
> *Summary:*
>
> The DataSource V2 (DSV2) framework does not currently provide full
> data-skipping capabilities comparable to Spark-native sources, primarily
> due to limitations in Catalyst expression evaluation.
>
> This SPIP bridges that gap to achieve partition-skipping parity. To
> support this, Spark will push new *PartitionPredicate* objects that
> encapsulate Catalyst partition filter expressions and the evaluation logic,
> allowing data sources to skip irrelevant partitions effectively.
>
> *Relevant Links:*
>
>    - *SPIP Doc:*
>      https://docs.google.com/document/d/17vcw411PxSRLWoK-BiLI56UiNdokLWtovF8JZUlDTOo/edit?usp=sharing
>    - *Discuss Thread:*
>      https://lists.apache.org/thread/p2cwngj9bmtcbmyplds833s9lwts8bwc
>    - *JIRA:* SPARK-55596 <https://issues.apache.org/jira/browse/SPARK-55596>
>    - *POC PR:* PR 54459 <https://github.com/apache/spark/pull/54459>
>
> *The vote will be open for at least 72 hours.* Please vote:
>
> [ ] +1: Accept the proposal as an official SPIP
>
> [ ] +0
>
> [ ] -1: I don't think this is a good idea because ...
>
> Thanks,
> Gengliang Wang
>
>
