[
https://issues.apache.org/jira/browse/SPARK-49520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haejoon Lee updated SPARK-49520:
--------------------------------
Fix Version/s: 3.5.4
(was: 3.5.3)
> ArrayRemove() Function Need Remove NULL Value
> ---------------------------------------------
>
> Key: SPARK-49520
> URL: https://issues.apache.org/jira/browse/SPARK-49520
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.1
> Environment: *Spark Version: 3.2.1*
> Reporter: Feng Jie
> Priority: Major
> Labels: Function, SQL
> Fix For: 3.2.1, 3.3.0, 3.2.2, 3.2.3, 3.2.4, 3.3.3, 3.4.2, 3.3.2,
> 3.4.0, 3.4.1, 3.5.0, 3.5.1, 3.3.4, 3.5.2, 3.4.3, 3.4.4, 3.5.4
>
> Attachments: image-2024-09-06-09-52-45-339.png
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> I want to calculate the intersection of two arrays like this:
> {noformat}
> select
>   case when intersect_size > 0 then 1 else 0 end as is_include
> from (
>   select
>     size(array_intersect(array_a, array_b)) as intersect_size
>   from table_a
> )
> {noformat}
>
> However, a NULL element in the arrays affects the result:
> {code:sql}
> SELECT size(array_intersect(array(1, 2, 3, null), array(null)))
> -- Output: 1
> {code}
> So I want to remove the NULL from the first array by using {*}array_remove{*}, but the call returns NULL instead of the cleaned array:
> {code:sql}
> SELECT array_remove(array(1, 2, 3, null, 3), null)
> -- Output: null
> {code}
> I would like to add logic to *array_remove* so that it can remove NULL
> elements. Should I add an overload (perhaps array_remove(array_a, array_b,
> isIgnoreNull)) or change the behavior of the existing function?
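>
> As a possible workaround that needs no change to *array_remove*, the NULL
> elements can be dropped with the higher-order function *filter* before
> intersecting. This is only a sketch: it assumes *filter* with a lambda is
> available in the Spark version in use (it should be in 3.2.1), and on 3.4.0
> or later *array_compact* is assumed to do the same job. The literal arrays
> below are illustrative only.
> {code:sql}
> -- Drop NULL elements explicitly instead of calling array_remove(..., null)
> SELECT filter(array(1, 2, 3, null, 3), x -> x IS NOT NULL);
> -- Expected: [1,2,3,3]
>
> -- Intersection size once NULLs are removed from the first array
> SELECT size(array_intersect(filter(array(1, 2, 3, null), x -> x IS NOT NULL), array(null)));
> -- Expected: 0
> {code}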
>