Jefffrey commented on code in PR #22658:
URL: https://github.com/apache/datafusion/pull/22658#discussion_r3339686342
##########
datafusion/sqllogictest/test_files/array/make_array.slt:
##########
@@ -163,4 +163,38 @@ from arrays_values_without_nulls;
[[31, 32, 33, 34, 35, 26, 37, 38, 39, 40]] [[31, 32, 33, 34, 35, 26, 37, 38,
39, 40], [8, 9]] [[31, 32, 33, 34, 35, 26, 37, 38, 39, 40], [50, 51, 52]]
+# make_array over structs built from a literal (non-nullable fields) and a
Review Comment:
these tests succeed on main without this fix
##########
datafusion/functions-nested/src/make_array.rs:
##########
@@ -256,3 +306,45 @@ pub fn coerce_types_inner(arg_types: &[DataType], name:
&str) -> Result<Vec<Data
)
}
}
+
+#[cfg(test)]
+mod tests {
+ use super::*;
+ use arrow::array::{Int32Array, StructArray};
+ use arrow::datatypes::Fields;
+
+ /// Builds a single-row struct array `{ i: 1 }` whose `i` field has the
+ /// given nullability flag. The underlying data is identical regardless of
+ /// the flag.
+ fn struct_array_with_field_nullable(nullable: bool) -> ArrayRef {
+ let field = Arc::new(Field::new("i", DataType::Int32, nullable));
+ let values = Arc::new(Int32Array::from(vec![1])) as ArrayRef;
+ Arc::new(StructArray::new(
+ Fields::from(vec![field]),
+ vec![values],
+ None,
+ ))
+ }
+
+ /// Reproduces https://github.com/apache/datafusion/issues/22366:
+ /// `make_array` over structs that differ only in nested-field nullability
+ /// must succeed (widening nullable flags) rather than panic in
+ /// `MutableArrayData`.
+ #[test]
+ fn make_array_relaxes_nested_field_nullability() {
Review Comment:
we might need to move this test somewhere like here instead
https://github.com/apache/datafusion/blob/main/datafusion/core/tests/sql/select.rs
so that we go through the signature machinery properly; preferably this
would just be an SLT test but i can't figure a way to get the
non-nullable+nullable mix working, see related issue:
- https://github.com/apache/datafusion/issues/22709
##########
datafusion/functions-nested/src/make_array.rs:
##########
@@ -198,13 +198,26 @@ pub fn array_array<O: OffsetSizeTrait>(
return plan_err!("Array requires at least one argument");
}
+ // Widen the element type so inputs that differ only in nested-field
Review Comment:
the changes to `array_array` is only because this is what we're testing in
the unit test i think? ideally we'd only need to fix this in the coerce
function, instead of having it twice (correct me if i'm wrong here 😅)
##########
datafusion/functions-nested/src/make_array.rs:
##########
@@ -241,8 +254,45 @@ pub fn array_array<O: OffsetSizeTrait>(
)?))
}
+/// Merge `data_types` into a single type, OR-ing the nullable flag at every
+/// nesting level (struct fields, list elements, ...). `Null` inputs are
+/// ignored.
+///
+/// Returns `None` when there is no non-null input or the inputs are
+/// structurally incompatible (different field names, element types, ...), in
+/// which case callers fall back to their previous behavior.
+///
+/// This is what allows e.g. `make_array(named_struct('i', 1), some_column)` to
+/// succeed when the literal yields a non-nullable field and the column yields
a
+/// nullable one: the result widens both to nullable. See issue #22366.
+fn merge_nullability(data_types: impl IntoIterator<Item = DataType>) ->
Option<DataType> {
+ let mut merged: Option<Field> = None;
+ for data_type in data_types {
+ if data_type.is_null() {
+ continue;
+ }
+ let field = Field::new(Field::LIST_FIELD_DEFAULT_NAME, data_type,
true);
+ match merged {
+ None => merged = Some(field),
+ // `Field::try_merge` ORs nullability and recurses into nested
types;
+ // it errors only when the structures are incompatible.
+ Some(ref mut m) => m.try_merge(&field).ok()?,
+ }
+ }
+ merged.map(|f| f.data_type().clone())
+}
+
pub fn coerce_types_inner(arg_types: &[DataType], name: &str) ->
Result<Vec<DataType>> {
if let Ok(unified) = try_type_union_resolution_with_struct(arg_types) {
+ // `try_type_union_resolution_with_struct` coerces the inner field
types
+ // but preserves each argument's own nested-field nullability, so the
+ // returned struct types can still differ in nullability. Widen them to
+ // a single common type, both so the declared return type matches the
+ // value produced at runtime and so the arguments can actually be
+ // combined. See issue #22366.
+ if let Some(merged) = merge_nullability(unified.iter().cloned()) {
Review Comment:
i do wonder if we're better off trying to integrate this fix within
`try_type_union_resolution_with_struct` itself 🤔
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]