[ 
https://issues.apache.org/jira/browse/SPARK-42401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce Robbins updated SPARK-42401:
----------------------------------
    Summary: Incorrect results or NPE when inserting null value into array 
using array_insert/array_append  (was: Incorrect results or NPE when inserting 
null value using array_insert/array_append)

> Incorrect results or NPE when inserting null value into array using 
> array_insert/array_append
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-42401
>                 URL: https://issues.apache.org/jira/browse/SPARK-42401
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0, 3.5.0
>            Reporter: Bruce Robbins
>            Priority: Major
>              Labels: correctness
>
> Example:
> {noformat}
> create or replace temp view v1 as
> select * from values
> (array(1, 2, 3, 4), 5, 5),
> (array(1, 2, 3, 4), 5, null)
> as v1(col1,col2,col3);
> select array_insert(col1, col2, col3) from v1;
> {noformat}
> This produces an incorrect result:
> {noformat}
> [1,2,3,4,5]
> [1,2,3,4,0] <== should be [1,2,3,4,null]
> {noformat}
> A more succinct example:
> {noformat}
> select array_insert(array(1, 2, 3, 4), 5, cast(null as int));
> {noformat}
> This also produces an incorrect result:
> {noformat}
> [1,2,3,4,0] <== should be [1,2,3,4,null]
> {noformat}
> Another example:
> {noformat}
> create or replace temp view v1 as
> select * from values
> (array('1', '2', '3', '4'), 5, '5'),
> (array('1', '2', '3', '4'), 5, null)
> as v1(col1,col2,col3);
> select array_insert(col1, col2, col3) from v1;
> {noformat}
> The above query throws a {{NullPointerException}}:
> {noformat}
> 23/02/10 11:08:05 ERROR SparkSQLDriver: Failed in [select array_insert(col1, col2, col3) from v1]
> java.lang.NullPointerException
>       at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
>       at org.apache.spark.sql.execution.LocalTableScanExec.$anonfun$unsafeRows$1(LocalTableScanExec.scala:44)
> {noformat}
> {{array_append}} has the same issue:
> {noformat}
> spark-sql> select array_append(array(1, 2, 3, 4), cast(null as int));
> [1,2,3,4,0] <== should be [1,2,3,4,null]
> Time taken: 3.679 seconds, Fetched 1 row(s)
> spark-sql> select array_append(array('1', '2', '3', '4'), cast(null as string));
> 23/02/10 11:13:36 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
> java.lang.NullPointerException
>       at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.project_doConsume_0$(Unknown Source)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
> {noformat}
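
For reference, the expected results quoted above can be written down as a small plain-Python model. This is only a sketch of the semantics the report asks for (a null element stays null and is never replaced by 0 or dereferenced), not Spark's implementation. The function names mirror the SQL functions, but the Python signatures are assumptions made for illustration, and the sketch covers only positive 1-based positions up to one past the end of the array.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def array_insert(arr: List[Optional[T]], pos: int, item: Optional[T]) -> List[Optional[T]]:
    """Model of the expected result: insert item at 1-based pos, keeping None as None.

    Assumption: only 1 <= pos <= len(arr) + 1 is modelled here; other
    positions are rejected rather than guessed at.
    """
    if pos < 1 or pos > len(arr) + 1:
        raise ValueError("position outside the range covered by this sketch")
    result = list(arr)            # copy so the input array is not mutated
    result.insert(pos - 1, item)  # convert the 1-based position to a 0-based index
    return result


def array_append(arr: List[Optional[T]], item: Optional[T]) -> List[Optional[T]]:
    """Model of the expected result: append item, keeping None as None."""
    return list(arr) + [item]


if __name__ == "__main__":
    # Integer case from the report: the null must survive as None, not become 0.
    print(array_insert([1, 2, 3, 4], 5, None))
    # String case from the report: must return a list instead of raising.
    print(array_append(["1", "2", "3", "4"], None))
```

In this model the integer and string cases behave identically, which is the behaviour the report expects from both {{array_insert}} and {{array_append}}.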


