sdf-jkl commented on code in PR #20008:
URL: https://github.com/apache/datafusion/pull/20008#discussion_r2728255986


##########
datafusion/expr/src/udf.rs:
##########
@@ -709,20 +709,49 @@ pub trait ScalarUDFImpl: Debug + DynEq + DynHash + Send + 
Sync {
         Ok(ExprSimplifyResult::Original(args))
     }
 
-    /// Returns the [preimage] for this function and the specified scalar 
value, if any.
+    /// Returns a single contiguous preimage for this function and the 
specified
+    /// scalar expression, if any.
     ///
-    /// A preimage is a single contiguous [`Interval`] of values where the 
function
-    /// will always return `lit_value`
+    /// # Return Value
     ///
-    /// Implementations should return intervals with an inclusive lower bound 
and
-    /// exclusive upper bound.
+    /// Implementations should return a half-open interval: inclusive lower
+    /// bound and exclusive upper bound. Note that this is slightly different

Review Comment:
   It's extra complexity for calculating the upper bound in a preimage impl.
   
   For example for date_part we would just subtract one time unit from each 
value.
   
   Something like:
   ```rust
   fn date_to_scalar(date: NaiveDate, target_type: &DataType) -> 
Option<ScalarValue> {
       // --snip--
       Some(match target_type {
           Date32 => ScalarValue::Date32(Some(days as i32 - 1)),
           Date64 => ScalarValue::Date64(Some(days * MILLISECONDS_IN_DAY - 1))
           // utc timestamps are easy too
           // still figuring out how to make tz timestamps work
   ```
   
   
   Half open interval relies on logic and would make it more simple to 
implement a preimage function without adding extra calculations and less 
error-prone.
   
   I also think it looks more clear when using exclusive upper range:
   `<'01-01-2025T00:00:00'` vs `<='12-31-2024T23:59:999' `in sql `explain`.
   Timestamps would still look the same though.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to