schenksj commented on issue #22366: URL: https://github.com/apache/datafusion/issues/22366#issuecomment-4581492644
Update from the Comet side, with a finding that narrows the scope of this.

**Empirical asymmetry in `MutableArrayData`:** while implementing the caller-side mitigation I found that `make_array` already tolerates **container** nullability differences: an `ArrayType.containsNull` / `MapType.valueContainsNull` mismatch is coerced and runs fine (e.g. `array(array(int) <non-null elem>, array(int) <nullable elem>)`). The panic fires **only** on a `StructType` field's nullability mismatch. So the strictness is really in the struct branch of `with_capacities`, not the list/map branches. The OR-merge-at-every-level proposal above is a clean superset and still correct, but the minimal fix may only need to touch struct field merging.

**Corrected caller-side reference:** the "Related caller-side mitigation" note links a placeholder commit. The actual Comet fix is apache/datafusion-comet#4533. It declines (falls back to Spark) **only** when the children still differ after normalizing container nullability, i.e. exactly the struct-field-nullability case, so it does not over-decline the container-only arrays that already work natively. It's still conservative (loses native execution for the struct case); upstreaming the relaxation here would remove the need for it.
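To make the two ideas concrete, here is a minimal sketch on a toy type model. It is not arrow's `DataType`, not the code in `with_capacities`, and not the code in apache/datafusion-comet#4533; every name in it (`Ty`, `Field`, `merge`, `normalize_containers`, `should_fall_back`, and the helper constructors) is hypothetical. `merge` illustrates OR-merging nullability at every level, and `should_fall_back` illustrates declining only when children still differ after container nullability is normalized.

```rust
// Toy model of nested types. All names here are hypothetical illustrations.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    // `contains_null` plays the role of Spark's ArrayType.containsNull.
    List { elem: Box<Ty>, contains_null: bool },
    // `value_contains_null` plays the role of MapType.valueContainsNull
    // (the key type is omitted to keep the sketch short).
    Map { value: Box<Ty>, value_contains_null: bool },
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    ty: Ty,
    nullable: bool,
}

/// OR-merge nullability at every level. Returns `None` when the two types
/// differ in shape (different variants, field counts, or field names).
fn merge(a: &Ty, b: &Ty) -> Option<Ty> {
    match (a, b) {
        (Ty::Int, Ty::Int) => Some(Ty::Int),
        (
            Ty::List { elem: ea, contains_null: na },
            Ty::List { elem: eb, contains_null: nb },
        ) => Some(Ty::List {
            elem: Box::new(merge(ea, eb)?),
            contains_null: *na || *nb,
        }),
        (
            Ty::Map { value: va, value_contains_null: na },
            Ty::Map { value: vb, value_contains_null: nb },
        ) => Some(Ty::Map {
            value: Box::new(merge(va, vb)?),
            value_contains_null: *na || *nb,
        }),
        (Ty::Struct(fa), Ty::Struct(fb)) if fa.len() == fb.len() => {
            let mut out = Vec::with_capacity(fa.len());
            for (x, y) in fa.iter().zip(fb) {
                if x.name != y.name {
                    return None;
                }
                out.push(Field {
                    name: x.name.clone(),
                    ty: merge(&x.ty, &y.ty)?,
                    // The struct-field case: nullable if either side is.
                    nullable: x.nullable || y.nullable,
                });
            }
            Some(Ty::Struct(out))
        }
        _ => None,
    }
}

/// Force container nullability to `true` everywhere, but leave struct field
/// nullability untouched, so only struct-field differences survive.
fn normalize_containers(t: &Ty) -> Ty {
    match t {
        Ty::Int => Ty::Int,
        Ty::List { elem, .. } => Ty::List {
            elem: Box::new(normalize_containers(elem)),
            contains_null: true,
        },
        Ty::Map { value, .. } => Ty::Map {
            value: Box::new(normalize_containers(value)),
            value_contains_null: true,
        },
        Ty::Struct(fields) => Ty::Struct(
            fields
                .iter()
                .map(|f| Field {
                    name: f.name.clone(),
                    ty: normalize_containers(&f.ty),
                    nullable: f.nullable,
                })
                .collect(),
        ),
    }
}

/// Caller-side check: decline only if the children still differ after
/// container nullability has been normalized.
fn should_fall_back(children: &[Ty]) -> bool {
    let mut normalized = children.iter().map(normalize_containers);
    match normalized.next() {
        Some(first) => normalized.any(|t| t != first),
        None => false,
    }
}

// Helper constructors for building example types.
fn list(elem: Ty, contains_null: bool) -> Ty {
    Ty::List { elem: Box::new(elem), contains_null }
}

fn one_field_struct(nullable: bool) -> Ty {
    Ty::Struct(vec![Field { name: "x".to_string(), ty: Ty::Int, nullable }])
}
```

In this model, `should_fall_back(&[list(Ty::Int, false), list(Ty::Int, true)])` is `false` (a container-only difference), while `should_fall_back(&[one_field_struct(false), one_field_struct(true)])` is `true` (a struct-field difference). `merge` succeeds in both cases and yields the nullable variant, which is the sense in which the OR-merge is a superset of the struct-only fix.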
