[ https://issues.apache.org/jira/browse/SPARK-55256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kent Yao reassigned SPARK-55256:
--------------------------------
Assignee: Kent Yao
> [SQL] Support IGNORE NULLS / RESPECT NULLS for array_agg and collect_list
> -------------------------------------------------------------------------
>
> Key: SPARK-55256
> URL: https://issues.apache.org/jira/browse/SPARK-55256
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Kent Yao
> Assignee: Kent Yao
> Priority: Major
> Labels: pull-request-available
>
> This PR adds support for the IGNORE NULLS and RESPECT NULLS clauses for
> array_agg and collect_list aggregate functions.
> The SQL standard and many databases (PostgreSQL, Snowflake, DuckDB, etc.)
> support the IGNORE NULLS / RESPECT NULLS syntax for aggregate functions.
> Currently, Spark supports this syntax only for window functions such as
> first, last, lead, lag, and nth_value.
> By adding this support to array_agg and collect_list, users can explicitly
> control whether null values should be included in the resulting array:
> - array_agg(col) IGNORE NULLS - skips null values (default behavior)
> - array_agg(col) RESPECT NULLS - includes null values in the result
> Implementation Details:
> 1. Added ignoreNulls: Boolean = true parameter to CollectList class
> 2. array_agg now uses CollectList because the two have identical behavior
> 3. Changed UnresolvedFunction.ignoreNulls from Boolean to Option[Boolean] to
> distinguish between None (use the function default), Some(true) (IGNORE
> NULLS), and Some(false) (RESPECT NULLS)
> 4. Consolidated ignoreNulls resolution logic in FunctionResolution with
> shared resolveIgnoreNulls and applyIgnoreNulls methods
> Users can now use IGNORE NULLS / RESPECT NULLS with array_agg and
> collect_list:
> SELECT array_agg(col IGNORE NULLS) FROM table;
> SELECT collect_list(col RESPECT NULLS) OVER (PARTITION BY id) FROM table;
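
The three-state resolution described in the implementation details can be modeled in a few lines of plain Python. This is only an illustrative sketch of the semantics, not Spark code: the names resolve_ignore_nulls and collect_list below are hypothetical stand-ins, Python's None plays the role of both a SQL NULL value and Scala's None, and the function default of True is taken from the description above.

```python
from typing import Any, List, Optional, Sequence


def resolve_ignore_nulls(clause: Optional[bool], function_default: bool) -> bool:
    """Model of the tri-state flag (hypothetical helper, not Spark's API).

    None  -> no clause was written, so use the function's own default
    True  -> IGNORE NULLS was written
    False -> RESPECT NULLS was written
    """
    return function_default if clause is None else clause


def collect_list(values: Sequence[Any], ignore_nulls: bool = True) -> List[Any]:
    """Plain-Python model of the aggregate: keep or drop nulls, preserve order."""
    if ignore_nulls:
        return [v for v in values if v is not None]
    return list(values)


rows = [1, None, 2, None, 3]

# No clause: falls back to the default (assumed True, i.e. nulls are skipped).
no_clause = collect_list(rows, resolve_ignore_nulls(None, function_default=True))

# Explicit IGNORE NULLS: same result as the default.
ignore = collect_list(rows, resolve_ignore_nulls(True, function_default=True))

# Explicit RESPECT NULLS: nulls are kept in the resulting array.
respect = collect_list(rows, resolve_ignore_nulls(False, function_default=True))

print(no_clause)
print(ignore)
print(respect)
```

Keeping "no clause" distinct from an explicit IGNORE NULLS is what allows each function to carry its own default while an explicit clause always wins.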