sweb opened a new pull request, #22892:
URL: https://github.com/apache/datafusion/pull/22892

   Closes #22687
   
   ## Rationale for this change
   
   The distance API in `datafusion/common/src/scalar/mod.rs` previously 
returned `Option<usize>`. The width of `usize` depends on the compilation 
target, so it does not represent value-domain cardinality and could lead to 
target-dependent behavior on large integer/temporal ranges. Additionally, 
downstream callers like `interval_arithmetic.rs` had to convert the distance 
back to `u64` to compute cardinality.
   
   Exposing an overflow-aware `u64`-oriented contract (`distance_u64`) removes 
this target dependence and aligns the API with value-domain semantics.
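
To illustrate the target dependence described above, here is a minimal sketch using plain `i64` values. The helper names `int_distance_u64` and `int_distance_usize` are hypothetical stand-ins for illustration only and are not DataFusion's implementation; the sketch only assumes the standard library's `i64::abs_diff`.

```rust
/// Hypothetical helper: distance between two i64 values as a u64.
/// `abs_diff` cannot overflow, because every i64 difference fits in u64.
fn int_distance_u64(a: i64, b: i64) -> Option<u64> {
    Some(a.abs_diff(b))
}

/// Hypothetical helper: the same distance narrowed to usize.
/// The conversion fails whenever the distance exceeds `usize::MAX`,
/// which depends on the pointer width of the compilation target.
fn int_distance_usize(a: i64, b: i64) -> Option<usize> {
    usize::try_from(a.abs_diff(b)).ok()
}

fn main() {
    // The full signed range is representable as a u64 distance.
    assert_eq!(int_distance_u64(i64::MIN, i64::MAX), Some(u64::MAX));

    // A distance above u32::MAX is representable as usize only on
    // targets where usize is at least 64 bits wide.
    let narrow = int_distance_usize(0, 5_000_000_000);
    assert_eq!(narrow.is_some(), usize::BITS >= 64);
}
```

The same pair of inputs yields `Some(..)` or `None` from the `usize` variant depending on the target, while the `u64` variant gives the same answer everywhere.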
   
   ## What changes are included in this PR?
   
   - Added `distance_u64`: a new public method `distance_u64(&self, 
other: &ScalarValue) -> Option<u64>` on `ScalarValue`.
   - Deprecated `distance`: the original `distance(&self, other: 
&ScalarValue) -> Option<usize>` method is marked deprecated and now delegates 
to `distance_u64`.
   - Interval Cardinality: Migrated the cardinality calculation in 
`datafusion/expr-common/src/interval_arithmetic.rs` to use `distance_u64` 
directly.
   - Selectivity / Stats Overlap: Migrated the overlap calculations in 
`datafusion/common/src/stats.rs` to use `distance_u64`.
   - Boundary/Overflow Tests: Added `test_scalar_distance_u64_boundaries` in 
`scalar/mod.rs` to verify edge cases:
     - Full signed range edge (`i64::MIN` to `i64::MAX`)
     - Full unsigned range edge (`u64::MIN` to `u64::MAX`)
     - Large temporal range edge (`TimestampSecond` and `Date32` boundaries)
     - Overflow-to-None behavior (exceeding `u64::MAX` for Float, `Decimal128`, 
and `Decimal256` values)
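
The overflow-to-None behavior and the deprecation shim listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `Wide` is a hypothetical wrapper around `i128` standing in for a wide type such as `Decimal128`, and the way the deprecated method narrows the result to `usize` is assumed.

```rust
/// Hypothetical wide value, standing in for a type whose range exceeds u64.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Wide(i128);

impl Wide {
    /// Returns the absolute difference, or None if it does not fit in u64.
    fn distance_u64(&self, other: &Wide) -> Option<u64> {
        // i128::abs_diff yields a u128 and cannot overflow;
        // the narrowing conversion reports overflow as None.
        u64::try_from(self.0.abs_diff(other.0)).ok()
    }

    /// Legacy entry point kept for compatibility (narrowing is an assumption).
    #[deprecated(note = "use `distance_u64` instead")]
    fn distance(&self, other: &Wide) -> Option<usize> {
        self.distance_u64(other)
            .and_then(|d| usize::try_from(d).ok())
    }
}

fn main() {
    let hi = Wide(u64::MAX as i128);

    // Exactly u64::MAX apart: still representable.
    assert_eq!(Wide(0).distance_u64(&hi), Some(u64::MAX));

    // One step further apart: exceeds u64::MAX, so the result is None.
    assert_eq!(Wide(-1).distance_u64(&hi), None);

    // The deprecated method still works but emits a warning unless allowed.
    #[allow(deprecated)]
    let legacy = Wide(3).distance(&Wide(10));
    assert_eq!(legacy, Some(7));
}
```

Keeping the old method as a thin wrapper means existing callers keep compiling (with a deprecation warning) while new code uses the `u64` result directly.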
   
   ## Are these changes tested?
   
   Yes, they are covered by the new unit tests in `datafusion-common` and 
existing test suites in both `datafusion-common` and `datafusion-expr-common`.
   
   ## Are there any user-facing changes?
   
   Yes, `ScalarValue::distance` has been deprecated in favor of 
`ScalarValue::distance_u64`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

