wjones127 commented on code in PR #7467:
URL: https://github.com/apache/arrow-datafusion/pull/7467#discussion_r1320893999


##########
datafusion/physical-expr/src/intervals/interval_aritmetic.rs:
##########
@@ -686,6 +715,231 @@ fn calculate_cardinality_based_on_bounds(
     }
 }
 
+/// The null status of an [NullableInterval].
+///
+/// This is an internal convenience that can be used in match statements
+/// (unlike Interval).
+#[derive(Debug, Clone, PartialEq, Eq)]
+enum NullStatus {
+    /// The interval is guaranteed to be non-null.
+    Never,
+    /// The interval is guaranteed to be null.
+    Always,
+    /// The interval isn't guaranteed to never be null or always be null.
+    Maybe,
+}
+
+/// An [Interval] that also tracks null status using a boolean interval.
+///
+/// This represents values that may be in a particular range or be null.
+///
+/// # Examples
+///
+/// ```
+/// use datafusion_physical_expr::intervals::{Interval, NullableInterval};
+/// use datafusion_common::ScalarValue;
+///
+/// // [1, 2) U {NULL}
+/// NullableInterval {
+///    values: Interval::make(Some(1), Some(2), (false, true)),
+///    is_valid: Interval::UNCERTAIN,
+/// }
+///
+/// // (0, ∞)
+/// NullableInterval {
+///   values: Interval::make(Some(0), None, (true, true)),
+///   is_valid: Interval::CERTAINLY_TRUE,
+/// }
+///
+/// // {NULL}
+/// NullableInterval::from(ScalarValue::Int32(None))
+///
+/// // {4}
+/// NullableInterval::from(ScalarValue::Int32(4))
+/// ```
+#[derive(Debug, Clone, PartialEq, Eq, Default)]
+pub struct NullableInterval {
+    /// The interval for the values
+    pub values: Interval,
+    /// A boolean interval representing whether the value is certainly valid
+    /// (not null), certainly null, or has an unknown validity. This takes
+    /// precedence over the values in the interval: if this field is equal to
+    /// [Interval::CERTAINLY_FALSE], then the interval is certainly null
+    /// regardless of what `values` is.
+    pub is_valid: Interval,
+}

Review Comment:
   Hi @metesynnada. You're idea of using a pair of intervals has worked quite 
well. Since it's a new struct, this shouldn't impact the performance of your 
existing `cp_solver` code.
   
   I ended up not moving the `Interval` struct into `datafusion-common`. First, 
`datafusion-optimizer` already depends on `datafusion-physical-epxr`, so I 
didn't need to move it. Plus it has some dependencies within this crate that 
make it not easy to move. I think if we wanted to, we might be able to move it 
to `datafusion-expr`, but I'd rather leave that to a different PR.
   
   cc @ozankabak 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to