[ 
https://issues.apache.org/jira/browse/SPARK-49520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-49520:
--------------------------------
    Fix Version/s: 3.5.4
                       (was: 3.5.3)

> ArrayRemove() Function Need Remove NULL Value
> ---------------------------------------------
>
>                 Key: SPARK-49520
>                 URL: https://issues.apache.org/jira/browse/SPARK-49520
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.1
>         Environment: *Spark Version: 3.2.1*
>            Reporter: Feng Jie
>            Priority: Major
>              Labels: Function, SQL
>             Fix For: 3.2.1, 3.3.0, 3.2.2, 3.2.3, 3.2.4, 3.3.3, 3.4.2, 3.3.2, 
> 3.4.0, 3.4.1, 3.5.0, 3.5.1, 3.3.4, 3.5.2, 3.4.3, 3.4.4, 3.5.4
>
>         Attachments: image-2024-09-06-09-52-45-339.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> I want to calculate the intersection of two arrays like this: 
> {noformat}
> select
>     case when intersect_size > 0 then 1 else 0 end as is_include
> from (
>     select
>         size(array_intersect(array_a, array_b)) as intersect_size
>     from table_a
> )
> {noformat}
>  
> However, a NULL element in the arrays affects the result:
> {code:java}
> SELECT size(array_intersect(array(1, 2, 3, null), array(null)))
> Output: 1 {code}
> So I want to remove the NULL from the first array by using {*}array_remove{*}:
> {code:java}
> SELECT array_remove(array(1, 2, 3, null, 3), null) 
> Output: null{code}
> I want to add extra logic to the *array_remove* function so that it removes
> NULL elements. Should I overload the function (perhaps as
> array_remove(array_a, array_b, isIgnoreNull)) or just fix the original
> function?
>  
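
A possible workaround that leaves array_remove unchanged, sketched under the assumption that the built-in higher-order function filter (Spark SQL 2.4 and later) is available on the affected version. array_compact is assumed to be available only on Spark 3.4.0 and later. The idea is to strip NULL elements from the array before intersecting:

{code:sql}
-- Drop NULL elements with filter(), then intersect with the second array.
SELECT size(
    array_intersect(
        filter(array(1, 2, 3, null), x -> x IS NOT NULL),
        array(null)
    )
) AS intersect_size;

-- Assumption: Spark 3.4.0+ only. array_compact removes NULL elements directly.
SELECT array_compact(array(1, 2, 3, null, 3)) AS without_nulls;
{code}

array_remove matches elements by equality, and in SQL a comparison against NULL evaluates to NULL rather than true, which is consistent with the NULL result reported above. Filtering with IS NOT NULL avoids that comparison altogether.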



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

