huan233usc opened a new pull request, #2592:
URL: https://github.com/apache/iceberg-rust/pull/2592

   ## Which issue does this PR close?
   
   Closes #2154.
   
   ## What changes are included in this PR?
   
   Previously, scalar-function pushdown only handled `isnan(...)` when the 
argument was a bare column reference; complex numeric arguments such as 
`isnan(qux + 1)` were silently dropped.
   
   This PR adds `resolve_nan_preserving_reference`, which resolves an `isnan` 
argument down to a single column `Reference` through transformations that 
preserve NaN-ness, so `isnan(<expr>)` can be pushed down as `<col> IS NAN`:
   
   - negation `-x`
   - `abs(x)`
   - numeric casts (date casts still rejected)
   - `x + c`, `c + x`, `x - c`, `c - x` for a finite literal `c`
   - `x * c`, `c * x`, `x / c` for a finite, non-zero literal `c`
   - arbitrary nesting of the above (e.g. `isnan(-(abs(qux) + 1) * 3)`)
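
The rules in the list above can be sketched on a toy expression tree. This is a minimal illustration, not the code in this PR: the `Expr` enum and the function `resolve_nan_preserving_column` are hypothetical stand-ins for DataFusion's expression type and the PR's `resolve_nan_preserving_reference`, and casts are left out for brevity.

```rust
// Toy expression tree; hypothetical, NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(f64),
    Neg(Box<Expr>),
    Abs(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    Sub(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

/// A literal that can appear next to `*` or `/` without creating a NaN.
fn finite_non_zero(c: f64) -> bool {
    c.is_finite() && c != 0.0
}

/// Returns the single column whose NaN-ness equals that of `expr`,
/// or `None` when the expression falls outside the supported shapes.
fn resolve_nan_preserving_column(expr: &Expr) -> Option<&str> {
    match expr {
        Expr::Column(name) => Some(name.as_str()),
        Expr::Literal(_) => None,
        // -x and abs(x) are NaN exactly when x is NaN.
        Expr::Neg(inner) | Expr::Abs(inner) => resolve_nan_preserving_column(inner),
        // x + c, c + x, x - c, c - x with a finite literal c.
        Expr::Add(l, r) | Expr::Sub(l, r) => match (&**l, &**r) {
            (e, Expr::Literal(c)) | (Expr::Literal(c), e) if c.is_finite() => {
                resolve_nan_preserving_column(e)
            }
            _ => None,
        },
        // x * c, c * x with a finite, non-zero literal c.
        Expr::Mul(l, r) => match (&**l, &**r) {
            (e, Expr::Literal(c)) | (Expr::Literal(c), e) if finite_non_zero(*c) => {
                resolve_nan_preserving_column(e)
            }
            _ => None,
        },
        // Only x / c is accepted; c / x is rejected.
        Expr::Div(l, r) => match (&**l, &**r) {
            (e, Expr::Literal(c)) if finite_non_zero(*c) => resolve_nan_preserving_column(e),
            _ => None,
        },
    }
}
```

In this sketch, a tree for `-(abs(qux) + 1) * 3` resolves to `qux`, while `qux * 0`, `1 / qux`, and `qux + qux` all resolve to `None`.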
   
   ### Why is this sound?
   
   Filter pushdown is reported as `Inexact`, so DataFusion re-applies the 
original predicate after scanning. The pushed-down predicate therefore only 
needs to be implied by the original filter (it may match extra rows, but must 
never drop a matching one). Every supported transformation keeps NaN-ness 
*exactly* equivalent — the result is NaN iff the wrapped column is NaN — so 
both `isnan(...)` and `NOT isnan(...)` remain correct.
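
The `Inexact` contract described above can be illustrated with a small simulation. This is a sketch under simplifying assumptions: rows are plain `f64` values, and `scan_then_refilter` is a hypothetical helper, not a DataFusion or Iceberg API.

```rust
/// Simulates an inexact pushdown: the scan applies `pushed`, then the
/// engine re-applies `original` to whatever rows the scan returned.
fn scan_then_refilter(
    rows: &[f64],
    pushed: impl Fn(f64) -> bool,
    original: impl Fn(f64) -> bool,
) -> Vec<f64> {
    rows.iter()
        .copied()
        .filter(|&x| pushed(x)) // work done by the table scan
        .filter(|&x| original(x)) // re-check done by the query engine
        .collect()
}
```

With `original = |x| (x + 1.0).is_nan()` and `pushed = |x| x.is_nan()`, the result equals filtering by `original` alone. A looser `pushed` (for example `|_| true`) gives the same result at a higher scan cost, while a stricter one (for example `|_| false`) would lose matching rows, which is the failure mode the soundness argument rules out.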
   
   Cases that do **not** preserve NaN-ness are intentionally rejected (and 
covered by tests): `x * 0` (`±inf * 0` is NaN), `x / 0` and `c / x` (`0 / 0` is 
NaN), multi-column expressions, and unknown functions.
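
The floating-point facts behind the accepted and rejected cases can be spot-checked with plain `f64` arithmetic. The helper below is a hypothetical illustration, not part of the PR's test suite, and a check over a handful of samples is evidence rather than a proof.

```rust
/// Sample inputs: zeros, ordinary values, extremes, infinities, and NaN.
const SAMPLES: [f64; 9] = [
    0.0,
    -0.0,
    1.5,
    -3.0,
    f64::MAX,
    f64::MIN_POSITIVE,
    f64::INFINITY,
    f64::NEG_INFINITY,
    f64::NAN,
];

/// True when `f(x)` is NaN exactly when `x` is NaN, for every sample input.
fn preserves_nan_ness(f: impl Fn(f64) -> f64) -> bool {
    SAMPLES.iter().all(|&x| f(x).is_nan() == x.is_nan())
}
```

For instance, `preserves_nan_ness(|x| -(x.abs() + 1.0) * 3.0)` holds, whereas `|x| x * 0.0`, `|x| x / 0.0`, and `|x| 0.0 / x` each fail because an infinite or zero input produces NaN.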
   
   This is intentionally more aggressive than Spark's `SparkV2Filters` (which 
only pushes Iceberg transform functions with ref/literal args), because that is 
exactly what the issue asks for, and Iceberg's `Reference` term cannot 
represent transform terms anyway.
   
   ## Are these changes tested?
   
   Yes. Updated the existing `isnan(qux + 1)` "unsupported" test and added unit 
tests covering negation, `abs`, additive and multiplicative forms, nested 
expressions, combination with other predicates, and the rejected-as-unsound 
cases. All unit tests in `expr_to_predicate` pass, along with `clippy` and 
`rustfmt`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

