This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new c0e367a54b1b [SPARK-55747][SQL] Fix NPE when accessing elements from an array that is null
c0e367a54b1b is described below

commit c0e367a54b1bb1877d1a367af8d321aca59dff59
Author: Wenchen Fan <[email protected]>
AuthorDate: Sun Mar 1 23:06:47 2026 +0800

    [SPARK-55747][SQL] Fix NPE when accessing elements from an array that is null
    
    ### What changes were proposed in this pull request?
    The `GetArrayItem` expression incorrectly computed `nullable = false` when indexing into arrays with `containsNull = false` (e.g., from `split()`), even when the array itself could be null. This caused codegen to skip null checks, leading to an NPE on `array.numElements()` during bounds checking.
    
    ### Why are the changes needed?
    To fix a NullPointerException (NPE) in the Spark engine.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Tests in this PR.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #54546 from stevomitric/stevomitric/fix-npe-codegen.
    
    Lead-authored-by: Wenchen Fan <[email protected]>
    Co-authored-by: Stevo Mitric <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../sql/catalyst/expressions/complexTypeExtractors.scala  |  2 +-
 .../scala/org/apache/spark/sql/StringFunctionsSuite.scala | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index dba061eeb870..f40077c53311 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -431,7 +431,7 @@ trait GetArrayItemUtil {
           true
       }
     } else {
-      if (failOnError) arrayElementNullable else true
+      if (failOnError) arrayElementNullable || child.nullable else true
     }
   }
 }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
index ff0ee19ae971..7bfc8cf4fa61 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
@@ -1470,4 +1470,19 @@ class StringFunctionsSuite extends QueryTest with SharedSparkSession {
         Seq(Row("abc", "def")))
     }
   }
+
+  test("SPARK-55747: GetArrayItem NPE on null array from split() with ANSI enabled") {
+    // GetArrayItem.nullable was incorrectly computed as false when the array type has
+    // containsNull=false (e.g., from StringSplit) but the array itself can be null.
+    // This caused codegen to skip null checks, leading to NPE when calling
+    // array.numElements() on a null array during bounds checking.
+    withTable("t") {
+      sql("CREATE TABLE t (s STRING) USING parquet")
+      sql("INSERT INTO t VALUES ('a-b'), (null)")
+      checkAnswer(
+        sql("SELECT split(s, '-')[size(split(s, '-')) - 1] FROM t"),
+        Seq(Row("b"), Row(null))
+      )
+    }
+  }
 }
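
The one-line change in `GetArrayItemUtil` above can be illustrated with a small standalone model. This is only a sketch, not Spark code: `ArrayExpr`, `NullabilitySketch`, `nullableBefore`, and `nullableAfter` are hypothetical names introduced here, and only the two boolean expressions are taken from the removed and added lines of the diff. It models just the `else` branch touched by the hunk.

```scala
// Hypothetical stand-in for an array-typed Catalyst expression (not Spark's API).
// nullable:     whether the array value itself may be null
// containsNull: whether elements inside the array may be null
final case class ArrayExpr(nullable: Boolean, containsNull: Boolean)

object NullabilitySketch {
  // Mirrors the removed line: `if (failOnError) arrayElementNullable else true`
  def nullableBefore(arr: ArrayExpr, failOnError: Boolean): Boolean =
    if (failOnError) arr.containsNull else true

  // Mirrors the added line:
  // `if (failOnError) arrayElementNullable || child.nullable else true`
  def nullableAfter(arr: ArrayExpr, failOnError: Boolean): Boolean =
    if (failOnError) arr.containsNull || arr.nullable else true

  def main(args: Array[String]): Unit = {
    // Models split(s, '-') over a nullable column: the array can be null,
    // but its elements cannot (containsNull = false).
    val splitOfNullableString = ArrayExpr(nullable = true, containsNull = false)

    println(nullableBefore(splitOfNullableString, failOnError = true)) // false
    println(nullableAfter(splitOfNullableString, failOnError = true))  // true
  }
}
```

With the old expression the result is reported as non-nullable even though the array may be null, which matches the skipped null check described in the commit message. With the new expression the result is nullable whenever either the elements or the array itself may be null.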


