Cool! Feel free to open a PR for this.
Best,
Jingsong

On Thu, Jan 1, 2026 at 7:06 PM Apollo Elon <[email protected]> wrote:
>
> Hi Jingsong,
>
> Thank you so much for your reply and suggestions, even during the
> holiday! I really appreciate it.
>
> Initially, my intention in introducing new operators was to make the
> logic under the EQUAL_NULL_SAFE branch more self-explanatory and
> readable. However, I overlooked the fact that this approach would add
> extra files, most of whose logic overlaps with existing ones. Thank you
> for pointing this out!
>
> I now realize that the same goal can be achieved simply by modifying
> the existing code like this:
>
> > case EQUAL_NULL_SAFE =>
> >   BinaryPredicate.unapply(sparkPredicate) match {
> >     case Some((fieldName, literal)) =>
> >       val index = fieldIndex(fieldName)
> >       if (literal == null) {
> >         builder.isNull(index)
> >       } else {
> >         // previously just: builder.equal(index, convertLiteral(index, literal))
> >         PredicateBuilder.and(
> >           builder.isNotNull(index),
> >           builder.equal(index, convertLiteral(index, literal)))
> >       }
> >     case None =>
> >       throw new UnsupportedOperationException(s"Convert $sparkPredicate is unsupported.")
> >   }
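>
> As a quick sanity check, here is a small self-contained sketch (plain
> Scala, no Paimon or Spark APIs, names are mine for illustration only)
> showing that for a non-null literal this rewrite preserves null-safe
> equality semantics, including for a null column value:
>
> > // Model a nullable column as Option[Int]; v is a non-null literal.
> > // Spark's (col <=> v): true iff col is non-null and equals v.
> > def nullSafeEqual(col: Option[Int], v: Int): Boolean = col.contains(v)
> > // The rewrite: col IS NOT NULL AND col = v.
> > def rewritten(col: Option[Int], v: Int): Boolean = col.isDefined && col.get == v
> > // The two agree for every column value, including null.
> > for (col <- Seq(None, Some(1), Some(2)))
> >   assert(nullSafeEqual(col, 1) == rewritten(col, 1))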
>
> Will this modification achieve the desired result? Thanks again for
> your valuable feedback!
>
> Best regards,
> Apollo
>
> On Thu, Jan 1, 2026 at 5:16 PM Jingsong Li <[email protected]> wrote:
> >
> > Hi Apollo,
> >
> > Thanks for reporting this issue in dev.
> >
> > I think we can solve this without introducing new predicates.
> >
> > EqualNullSafe =>
> >
> > 1. If the literal is null, just convert it to IsNull.
> > 2. If the literal is not null, convert it to (col IS NOT NULL AND
> >    col = literal).
> >
> > What do you think?
> >
> > Best,
> > Jingsong
> >
> > On Thu, Jan 1, 2026 at 12:42 PM Apollo Elon <[email protected]> wrote:
> > >
> > > Hi Paimon Dev Team,
> > >
> > > In GitHub Issue #6931 <https://github.com/apache/paimon/issues/6931>,
> > > we identified a correctness issue in filter pushdown when Spark SQL
> > > uses the null-safe equality operator (<=>).
> > >
> > > The root cause is that Paimon currently treats both regular
> > > equality (=) and null-safe equality (<=>) as the same Equal operator
> > > during predicate pushdown. Moreover, their negations are uniformly
> > > simplified into a generic NotEqual predicate, which does not account
> > > for the distinct semantics of !(col <=> literal), in particular the
> > > fact that it can be true when the column contains NULL.
> > >
> > > To address this, I propose introducing two new filter operators:
> > >
> > > - SafeEqual: semantically identical to Equal (used when the literal
> > >   is non-null).
> > > - NotSafeEqual: specifically for !(col <=> literal), with a test()
> > >   method that respects null-safe semantics:
> > >
> > > > @Override
> > > > public boolean test(
> > > >     DataType type, long rowCount, Object min, Object max,
> > > >     Long nullCount, Object literal) {
> > > >   // According to the semantics of NotSafeEqual, as long as the
> > > >   // file contains nulls, some rows may satisfy the predicate.
> > > >   if (nullCount != null && nullCount > 0) {
> > > >     return true;
> > > >   }
> > > >   return compareLiteral(type, literal, min) != 0
> > > >       || compareLiteral(type, literal, max) != 0;
> > > > }
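> > >
> > > To make the intended file-skipping behavior concrete, here is a
> > > tiny plain-Scala restatement of that test() logic (a sketch with
> > > Int stats and illustrative names, not the actual Paimon signature):
> > >
> > > > // Keep a file iff some row in it may satisfy !(col <=> literal),
> > > > // given per-file stats (min, max, nullCount).
> > > > def notSafeEqualKeepsFile(
> > > >     min: Int, max: Int, nullCount: Option[Long], literal: Int): Boolean =
> > > >   if (nullCount.exists(_ > 0)) true // null rows always satisfy it
> > > >   else literal != min || literal != max // skip only if every row equals the literal
> > > >
> > > > assert(notSafeEqualKeepsFile(5, 10, Some(3L), 7)) // kept: the 3 null rows match
> > > > assert(!notSafeEqualKeepsFile(7, 7, Some(0L), 7)) // skipped: every row is exactly 7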
> > >
> > > I've also updated SparkV2FilterConverter.convert to properly route
> > > EQUAL_NULL_SAFE:
> > >
> > > > case EQUAL_NULL_SAFE =>
> > > >   sparkPredicate match {
> > > >     case BinaryPredicate(transform, literal) =>
> > > >       if (literal == null) {
> > > >         builder.isNull(transform)
> > > >       } else {
> > > >         // previously: builder.equal(transform, literal)
> > > >         builder.safeEqual(transform, literal)
> > > >       }
> > > >     case _ =>
> > > >       throw new UnsupportedOperationException(s"Convert $sparkPredicate is unsupported.")
> > > >   }
> > >
> > > This ensures:
> > >
> > > - col <=> null → isNull(col)
> > > - col <=> value → safeEqual(col, value)
> > > - !(col <=> value) → notSafeEqual(col, value)
> > >
> > > With these changes, file skipping becomes both correct and
> > > efficient, aligning Paimon's behavior with Spark's evaluation
> > > semantics.
> > >
> > > I'm happy to submit a PR for this fix and welcome any feedback on
> > > the design.
> > >
> > > Best regards 😀