cloud-fan commented on code in PR #56237:
URL: https://github.com/apache/spark/pull/56237#discussion_r3338790209


##########
sql/core/src/test/resources/sql-tests/inputs/extract-value-resolution-edge-cases.sql:
##########
@@ -8,3 +8,13 @@ SELECT col1.a, a FROM t1 ORDER BY col1.a;
 SELECT split(col1, '-')[1] AS a FROM VALUES('a-b') ORDER BY split(col1, '-')[1];
 
 DROP TABLE t1;
+
+-- SPARK-57186: extracting a field/element/key from a NullType base returns NULL instead of
+-- throwing INVALID_EXTRACT_BASE_FIELD_TYPE (SQL NULL propagation; a NullType column can arise e.g.
+-- from schema evolution with missing columns). This applies uniformly to dotted field access
+-- (`col.a`) and the subscript forms (`col[0]`, `col['key']`), and is implemented at the
+-- user-facing resolution sites (ExtractValue.applyOrNull) without changing the shared
+-- ExtractValue.extractValue utility.
+SELECT col.a FROM (SELECT null AS col) t;
+SELECT col[0] FROM (SELECT null AS col) t;

Review Comment:
   Nit: `col[0]` / `col['key']` on a `NullType` base now produce an output 
column named `NULL` (analyzer plan: `Project [null AS NULL#x]`), whereas 
`col.a` is named `a` and a non-null `arr[0]` would be named `arr[0]`. Cosmetic 
only and an extreme edge case (NullType column + subscript), so not blocking -- 
just flagging the naming asymmetry in case a stable column name is preferable 
here.
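
To make the asymmetry concrete, here is a toy model in Python. It is only a sketch of the names reported above, not Spark's analyzer logic: `Field`, `Subscript`, and `output_name` are hypothetical names introduced for illustration, and the rules encoded are just the three cases from this comment (dotted access keeps the field name, a subscript on a non-null base prints as `base[index]`, and a subscript on a `NullType` base ends up named `NULL`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Dotted access such as `col.a` (hypothetical model type)."""
    base: str
    name: str


@dataclass(frozen=True)
class Subscript:
    """Integer subscript access such as `col[0]` (hypothetical model type)."""
    base: str
    index: int


def output_name(expr, base_is_null_type: bool) -> str:
    """Return the output column name for the cases described in the comment.

    This mirrors the reported names only; it does not reproduce how Spark
    actually derives them.
    """
    if isinstance(expr, Field):
        # Reported behavior: `col.a` is named `a`.
        return expr.name
    if base_is_null_type:
        # Reported behavior: the plan is `Project [null AS NULL#x]`,
        # so the original subscript text is lost from the name.
        return "NULL"
    # Reported behavior for a non-null base: `arr[0]` is named `arr[0]`.
    return f"{expr.base}[{expr.index}]"


if __name__ == "__main__":
    print(output_name(Field("col", "a"), base_is_null_type=True))
    print(output_name(Subscript("col", 0), base_is_null_type=True))
    print(output_name(Subscript("arr", 0), base_is_null_type=False))
```

In the model, a stable name would mean the second case returning `col[0]` rather than `NULL`; whether that is worth doing in the PR is the open question raised above.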



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

