hcrosse opened a new pull request, #21689: URL: https://github.com/apache/datafusion/pull/21689
## Which issue does this PR close? - Closes #18020 ## Rationale for this change `concat(array, array, ...)` in SQL dispatches to the string `concat` UDF, which only handles string and binary types. With array arguments it falls through to a debug-only path and returns wrong results. Two earlier attempts were rejected. [#18137](https://github.com/apache/datafusion/pull/18137) changed `ConcatFunc`'s signature to accept arrays and broke `simplify_expressions`. [#18105](https://github.com/apache/datafusion/pull/18105) duplicated `array_concat` logic inside `ConcatFunc`. This PR rewrites `concat(array, ...)` to `array_concat(array, ...)` at the analyzer phase. Every logical plan gets the corrected behavior regardless of frontend, the string `concat` signature stays untouched, and no array logic is duplicated. ## What changes are included in this PR? A new `ConcatArrayRewrite` `FunctionRewrite` lives in `datafusion-functions-nested`. It detects calls to `ConcatFunc` by identity check via `Any::is::<ConcatFunc>`, so user-level shadowing of `concat` such as Spark's variant is unaffected. When all args resolve to `List`, `LargeList`, or `FixedSizeList` it rewrites to `array_concat_udf()`. Mixed array and non-array returns a `plan_err`. The rewrite is wired into `SessionStateDefaults::default_function_rewrites()` and registered on the analyzer in `SessionStateBuilder::with_default_features()`, which is the actual default-init path. It's also registered via `FunctionRegistry::register_function_rewrite` in `functions_nested::register_all` as a fallback for custom registries. ## Are these changes tested? New SLT coverage in `array/array_concat.slt`: - 2- and 3-argument array concat, table-column concat, arrays with NULLs, string arrays - `LargeList` and `FixedSizeList` inputs - `NULL::integer[]` at either position, all-NULL case - Two error cases for mixed array and non-array `explain.slt` is updated to reflect the new `apply_function_rewrites` line that now appears in `EXPLAIN VERBOSE` output. ## Are there any user-facing changes? `concat(array, ...)` now returns correct `array_concat` results instead of the prior wrong output. No public API changes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
