Hi Apollo,

Thanks for reporting this issue in dev.

I think we can solve this without introducing new predicates.

EqualNullSafe =>

1. If the literal is null, just convert it to IsNull.
2. If the literal is not null, convert it to (col is not null and col = literal); see the sketch below.

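Concretely, in SparkV2FilterConverter.convert it could look something like
the sketch below. This is only a rough sketch: I am assuming the builder
helper also exposes isNotNull, and that the two predicates can be combined
with PredicateBuilder.and.

case EQUAL_NULL_SAFE =>
  sparkPredicate match {
    case BinaryPredicate(transform, literal) =>
      if (literal == null) {
        // col <=> null is just an IS NULL check.
        builder.isNull(transform)
      } else {
        // col <=> literal with a non-null literal behaves like
        // col IS NOT NULL AND col = literal.
        PredicateBuilder.and(
          builder.isNotNull(transform),
          builder.equal(transform, literal))
      }
    case _ =>
      throw new UnsupportedOperationException(
        s"Convert $sparkPredicate is unsupported.")
  }

The negated form should then need no new operator either: pushing NOT
through gives (col is null or col <> literal), which is true for null
rows, matching Spark's semantics for !(col <=> literal).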
What do you think?

Best,
Jingsong

On Thu, Jan 1, 2026 at 12:42 PM Apollo Elon <[email protected]> wrote:
>
> Hi Paimon Dev Team,
> In GitHub Issue #6931 <https://github.com/apache/paimon/issues/6931>, we
> identified a correctness issue in filter pushdown when Spark SQL uses the
> null-safe equality operator (<=>).
> The root cause is that Paimon currently treats both regular equality (=)
> and null-safe equality (<=>) as the same Equal operator during predicate
> pushdown. Moreover, their negations are uniformly simplified into a
> generic NotEqual predicate, which does not account for the distinct
> semantics of !(col <=> literal), particularly the fact that it can be
> true when the column contains NULL.
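>
> To see the difference in Spark itself (a quick spark-shell illustration;
> this is standard Spark behavior, not specific to Paimon):
>
> > // Regular equality propagates NULL; null-safe equality never returns NULL.
> > spark.sql("SELECT NULL = NULL").show()                    // NULL
> > spark.sql("SELECT NULL <=> NULL").show()                  // true
> > // The negation of <=> is true for a NULL input, which is exactly the
> > // case that a rewrite into a plain NotEqual loses.
> > spark.sql("SELECT NOT (CAST(NULL AS INT) <=> 1)").show()  // true
>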
> To address this, I propose introducing two new filter operators:
>
>    - SafeEqual: Semantically identical to Equal (used when the literal is
>    non-null).
>    - NotSafeEqual: Specifically for !(col <=> literal), with a test()
>    method that respects null-safe semantics:
>
> > @Override
> > public boolean test(
> >         DataType type, long rowCount, Object min, Object max,
> >         Long nullCount, Object literal) {
> >     // According to the semantics of NotSafeEqual, as long as the file
> >     // contains nulls it may contain matching rows, so it cannot be skipped.
> >     if (nullCount != null && nullCount > 0) {
> >         return true;
> >     }
> >     // Otherwise the file is skippable only when every value equals the
> >     // literal, i.e. min == literal == max.
> >     return compareLiteral(type, literal, min) != 0
> >             || compareLiteral(type, literal, max) != 0;
> > }
> >
> I’ve also updated SparkV2FilterConverter.convert to properly route
> EQUAL_NULL_SAFE:
>
> > case EQUAL_NULL_SAFE =>
> >   sparkPredicate match {
> >     case BinaryPredicate(transform, literal) =>
> >       if (literal == null) {
> >         builder.isNull(transform)
> >       } else {
> >         // previously: builder.equal(transform, literal)
> >         builder.safeEqual(transform, literal)
> >       }
> >     case _ =>
> >       throw new UnsupportedOperationException(
> >         s"Convert $sparkPredicate is unsupported.")
> >   }
> >
> This ensures:
>
>    - col <=> null → isNull(col)
>    - col <=> value → safeEqual(col, value)
>    - !(col <=> value) → notSafeEqual(col, value) (see the sketch below)
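>
> For the third mapping, the NOT branch of the converter needs to recognize
> the wrapped <=> and route it to the new operator. A rough sketch (it reuses
> the BinaryPredicate extractor from above; the NOT constant and the
> notSafeEqual builder method are part of this proposal, and the existing
> handling for other negated predicates is elided):
>
> > case NOT =>
> >   sparkPredicate.children()(0) match {
> >     case child: Predicate if child.name() == EQUAL_NULL_SAFE =>
> >       child match {
> >         case BinaryPredicate(transform, literal) =>
> >           if (literal == null) {
> >             // !(col <=> null) is simply col IS NOT NULL.
> >             builder.isNotNull(transform)
> >           } else {
> >             // !(col <=> literal) must stay true for NULL rows,
> >             // unlike a plain NotEqual.
> >             builder.notSafeEqual(transform, literal)
> >           }
> >         case _ =>
> >           throw new UnsupportedOperationException(
> >             s"Convert $sparkPredicate is unsupported.")
> >       }
> >     // ... existing handling for other negated predicates ...
> >   }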
>
> With these changes, file skipping becomes both correct and efficient,
> aligning Paimon’s behavior with Spark’s evaluation semantics.
>
> I’m happy to submit a PR for this fix and welcome any feedback on the
> design.
>
> Best regards 😀
