schenksj opened a new pull request, #4533: URL: https://github.com/apache/datafusion-comet/pull/4533
## Which issue does this PR close?

Closes #4528.

## Rationale for this change

DataFusion's `make_array` asserts strict element-type equality in `MutableArrayData::with_capacities` and panics on a mismatch. Spark's `CreateArray` coerces element types with `sameType`, which ignores nullability, so children that share a surface type but differ only in a nested struct field's nullability get no unifying cast. For example, `array(struct(a not null), struct(a nullable))` reaches native execution with two different struct types and panics:

```
native panic: assertion `left == right` failed: Arrays with inconsistent types passed to MutableArrayData
```

This is a standalone fix; it was surfaced while working on the Delta Lake contrib integration (Delta's CDC write path builds `array(struct(...), struct(...))` plans with one struct per change type, leaving a `_change_type` field's nullability divergent across arms), so prioritizing it helps that effort, but it applies to any such plan.

## What changes are included in this PR?

`CometCreateArray` now declines (falls back to Spark) when its children's types differ in a way `make_array` cannot handle. DataFusion **tolerates** container nullability differences (`ArrayType.containsNull` / `MapType.valueContainsNull` are coerced) but not a struct field's nullability, so the check normalizes container nullability before comparing and keeps struct field nullability significant, declining only the cases that actually panic. This avoids over-declining legitimate arrays of arrays/maps that differ only in `containsNull`.

This tracks upstream apache/datafusion#22366; the caller-side decline can be removed once that fix lands.

## How are these changes tested?

New test in `CometArrayExpressionSuite` builds `array(struct(id, ct not null), struct(id, ct nullable))` and asserts correct results. The test fails on `main` with the native `MutableArrayData` panic and passes with this change.
The full `CometArrayExpressionSuite` (40/40) passes, including `arrays_overlap - nested array null handling` which exercises arrays differing only in `containsNull` and must still run natively.
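
To illustrate the check described under "What changes are included in this PR?", here is a minimal sketch in Python. It is not the code in `CometCreateArray`: the dataclasses below are a hypothetical stand-in for Spark's `ArrayType`, `MapType`, and `StructType`, and the names `normalize` and `can_run_natively` are assumptions made for this example. The idea shown is only the one stated above: erase container nullability before comparing, and keep struct field nullability significant.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical, simplified model of Spark data types (not the Spark API).
@dataclass(frozen=True)
class Atomic:
    name: str


@dataclass(frozen=True)
class ArrayType:
    element: "DataType"
    contains_null: bool = True


@dataclass(frozen=True)
class MapType:
    key: "DataType"
    value: "DataType"
    value_contains_null: bool = True


@dataclass(frozen=True)
class StructField:
    name: str
    dtype: "DataType"
    nullable: bool = True


@dataclass(frozen=True)
class StructType:
    fields: Tuple[StructField, ...]


DataType = Union[Atomic, ArrayType, MapType, StructType]


def normalize(dt):
    """Erase container nullability; leave struct field nullability untouched."""
    if isinstance(dt, ArrayType):
        return ArrayType(normalize(dt.element), True)
    if isinstance(dt, MapType):
        return MapType(normalize(dt.key), normalize(dt.value), True)
    if isinstance(dt, StructType):
        return StructType(
            tuple(StructField(f.name, normalize(f.dtype), f.nullable) for f in dt.fields)
        )
    return dt


def can_run_natively(child_types):
    """True when all children agree once container nullability is ignored."""
    return len({normalize(t) for t in child_types}) <= 1


# Structs that differ only in a field's nullability: the case that panics.
ct_not_null = StructType(
    (StructField("id", Atomic("long")), StructField("ct", Atomic("string"), nullable=False))
)
ct_nullable = StructType(
    (StructField("id", Atomic("long")), StructField("ct", Atomic("string"), nullable=True))
)
struct_case = can_run_natively([ct_not_null, ct_nullable])

# Arrays that differ only in containsNull: tolerated, so no fallback is needed.
array_case = can_run_natively(
    [ArrayType(Atomic("int"), contains_null=False), ArrayType(Atomic("int"), contains_null=True)]
)
```

In this model `struct_case` is `False` (fall back to Spark) and `array_case` is `True` (stay native), which mirrors the two behaviors the PR description calls out. Whether the real check also normalizes containers nested inside struct fields, as this sketch does, is an assumption.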
