comphead opened a new pull request, #22140:
URL: https://github.com/apache/datafusion/pull/22140

   ## Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and 
enhancements and this helps us generate change logs for our releases. You can 
link an issue to this PR using the GitHub syntax. For example `Closes #123` 
indicates that this PR will close issue #123.
   -->
   
   - Closes #22137 .
   
    ## Rationale for this change
   
     `RANGE` window frames with a value offset (e.g. `RANGE BETWEEN 4 PRECEDING 
AND 4 FOLLOWING`) panicked with `attempt to add/subtract with overflow` 
whenever the boundary target (`value ± delta`) wrapped
      past the type's representable range. Affected inputs include values close 
to `i64::MAX`/`i64::MIN`, `u64::MAX`, and any analogous boundary for other 
integer/decimal/timestamp types.
   
     Two `// TODO: Handle ... overflows.` markers in 
`WindowFrameStateRange::calculate_index_of_row` had been left for this case; 
the unchecked `ScalarValue::add` / `sub` silently wrapped the target, after
     which `search_in_slice` was handed a nonsensical (wrapped) value and 
downstream code tripped a debug-assert subtraction in 
`functions-window/src/nth_value.rs`.
   
     Semantically, an overflowed boundary is *unbounded* with respect to the 
data in the partition — every real value lies strictly inside the wrapped 
sentinel — so the correct behavior is to collapse the
     search to the appropriate partition edge rather than to search with a 
wrapped target.
   
     ## What changes are included in this PR?
   
     `datafusion/expr/src/window_state.rs`
   
     - Replace `ScalarValue::add` / `sub` with their `*_checked` counterparts 
in the boundary computation.
     - On overflow, short-circuit to the correct partition edge: `search_start` 
for `PRECEDING`-direction searches, `length` for `FOLLOWING`-direction 
searches. The collapse direction depends only on the
     const-generic `SEARCH_SIDE` (the add branch and sub branch both reduce to 
`!SEARCH_SIDE` once you expand the `SEARCH_SIDE == is_descending` invariant 
that selects each arithmetic branch).
     - The pre-existing `value.is_unsigned() && value < delta` clamp-to-zero 
path for unsigned subtraction is preserved — it produces a valid polymorphic 
zero, not an overflow sentinel.
     - No behavior change on the non-overflow path.
   
     `datafusion/sqllogictest/test_files/window.slt`
   
     Regression coverage for positive and negative overflow, across:
     - `ASC` + `FOLLOWING` / `ASC` + `PRECEDING` / `DESC` + `PRECEDING` / 
`DESC` + `FOLLOWING` (each overflow direction occurs on both sort orders 
depending on which arithmetic branch is taken)
     - Symmetric `N PRECEDING AND N FOLLOWING` frames where only one side 
overflows
     - Signed (`i64`) and unsigned (`u64`) ordering columns
     - `first_value` and `last_value` both exercised to verify both frame edges
     - `ROWS` frame regression guard to document that the pre-existing 
`saturating_sub` / `min(length)` saturation behavior is unchanged.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to