comphead opened a new pull request, #23192:
URL: https://github.com/apache/datafusion/pull/23192
## Rationale for this change
`array_compact(make_array(NULL, NULL, NULL))` returned `[NULL, NULL, NULL]`
instead of an empty array.
Root cause: `make_array(NULL, NULL, NULL)` has type `List(Null)`, whose
inner values are an Arrow `NullArray`. `NullArray::nulls()` returns `None` (it
has no validity buffer), so the default
`Array::is_null()` returns `false` for every index — even though every
element is logically null. The compaction loop saw "no nulls" and copied all
elements through unchanged.
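
The mismatch between the physical and logical null views can be illustrated with a small std-only model. This is a sketch under stated assumptions, not the arrow-rs source: `ModelArray`, `ModelNullArray`, and the two helper functions are hypothetical names, and the masks are plain `Vec<bool>` values (`true` = valid) rather than Arrow's `NullBuffer`.

```rust
// Simplified model only; these are NOT the real arrow-rs definitions.
trait ModelArray {
    fn len(&self) -> usize;

    /// Physical validity mask (`true` = valid); `None` means no buffer exists.
    fn nulls(&self) -> Option<Vec<bool>>;

    /// Default: with no physical mask, every slot is reported as non-null.
    fn is_null(&self, i: usize) -> bool {
        self.nulls().map(|mask| !mask[i]).unwrap_or(false)
    }

    /// Default: the logical nulls are just the physical ones.
    fn logical_nulls(&self) -> Option<Vec<bool>> {
        self.nulls()
    }
}

/// Hypothetical stand-in for an all-null array with no validity buffer.
struct ModelNullArray {
    len: usize,
}

impl ModelArray for ModelNullArray {
    fn len(&self) -> usize {
        self.len
    }

    // No physical validity buffer, so the default `is_null` sees "no nulls".
    fn nulls(&self) -> Option<Vec<bool>> {
        None
    }

    // Logically, every slot is null (all entries invalid).
    fn logical_nulls(&self) -> Option<Vec<bool>> {
        Some(vec![false; self.len])
    }
}

/// Null flags as seen through the default `is_null` (physical view).
fn null_flags_physical(a: &dyn ModelArray) -> Vec<bool> {
    (0..a.len()).map(|i| a.is_null(i)).collect()
}

/// Null flags as seen through `logical_nulls` (logical view).
fn null_flags_logical(a: &dyn ModelArray) -> Vec<bool> {
    let mask = a.logical_nulls();
    (0..a.len())
        .map(|i| mask.as_ref().map(|m| !m[i]).unwrap_or(false))
        .collect()
}
```

For a three-element `ModelNullArray`, the physical view reports no nulls at all, while the logical view reports every slot as null. That is the discrepancy described above.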
## What changes are included in this PR?
- In `compact_list`, resolve the values' null mask once via
`values.logical_nulls()` and use that buffer for both the fast-path check and
the per-element null test. This correctly treats a `NullArray` as all-null, and
also covers any other type whose nulls are logical rather than stored in a
physical validity buffer.
- Added a sqllogictest covering the untyped-NULL case:
`select array_compact(make_array(NULL, NULL, NULL))` → `[]`.
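
The "resolve the mask once, then use it for both checks" pattern from the first item can be sketched roughly as follows. This is a minimal std-only illustration, not the actual `compact_list` code: `compact_row` is a hypothetical helper, and a plain `&[bool]` slice (`true` = valid) stands in for the buffer returned by `logical_nulls()`.

```rust
/// Hypothetical helper: drop the logically null elements of one list row.
/// `logical_validity` plays the role of an already-resolved logical null
/// mask: `None` means "no nulls", `Some(mask)` marks valid slots with `true`.
fn compact_row<T: Clone>(values: &[T], logical_validity: Option<&[bool]>) -> Vec<T> {
    match logical_validity {
        // Fast path: no logical nulls, so every element is copied through.
        None => values.to_vec(),
        // Per-element test against the same mask the fast path consulted.
        Some(mask) => values
            .iter()
            .zip(mask)
            .filter(|(_, valid)| **valid)
            .map(|(v, _)| v.clone())
            .collect(),
    }
}
```

With an all-invalid mask, which is what the untyped-NULL case resolves to, the result is empty. With no mask, the input is returned unchanged.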
## Are these changes tested?
Yes — new test added in
`datafusion/sqllogictest/test_files/array/array_distinct.slt` alongside the
existing `array_compact` coverage. Existing `array_compact` tests continue to
pass.
## Are there any user-facing changes?
Yes — `array_compact` on a list of untyped NULLs now returns `[]`
(matching the typed-NULL behavior and user expectation) instead of preserving
the null elements. No API changes.