(spark) branch master updated: [SPARK-55507][PYTHON] Add None check for field.metadata in is_geometry and is_geography

wenchen Tue, 17 Feb 2026 06:24:42 -0800

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 0931371c06a4 [SPARK-55507][PYTHON] Add None check for field.metadata in is_geometry and is_geography
0931371c06a4 is described below

commit 0931371c06a4a8de39c81eb3d1fbcfa63e2dd6e9
Author: Yicong Huang <[email protected]>
AuthorDate: Tue Feb 17 22:24:07 2026 +0800

    [SPARK-55507][PYTHON] Add None check for field.metadata in is_geometry and is_geography
    
    ### What changes were proposed in this pull request?
    
    Add `field.metadata is not None` check in `is_geometry()` and
    `is_geography()` before accessing `field.metadata` with the `in` operator.
    
    ### Why are the changes needed?
    
    PyArrow struct fields have `metadata=None` by default. When
    `from_arrow_type()` encounters a struct with a field named `wkb` that has
    no metadata, `is_geometry()` / `is_geography()` crash with
    `TypeError: argument of type 'NoneType' is not iterable` because the code
    does `b"geometry" in field.metadata` without checking for `None` first.
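
The failure mode described above can be reproduced without PyArrow. The following is a minimal sketch, not the code in `python/pyspark/sql/pandas/types.py`: `FakeField`, `is_geometry_unguarded`, and `is_geometry_guarded` are hypothetical names introduced for illustration, and the helpers take a plain list of fields rather than a `pa.DataType`. It only assumes what the description states, namely that a field's `metadata` is `None` unless set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FakeField:
    """Hypothetical stand-in for a PyArrow field; metadata defaults to None."""

    name: str
    metadata: Optional[Dict[bytes, bytes]] = None


def is_geometry_unguarded(fields: List[FakeField]) -> bool:
    # Mirrors the pre-fix condition: `in` is applied to metadata directly,
    # so a field named "wkb" with metadata=None raises TypeError.
    return any(
        f.name == "wkb"
        and b"geometry" in f.metadata
        and f.metadata[b"geometry"] == b"true"
        for f in fields
    )


def is_geometry_guarded(fields: List[FakeField]) -> bool:
    # Mirrors the post-fix condition: the None check short-circuits the
    # `and` chain before the membership test is evaluated.
    return any(
        f.name == "wkb"
        and f.metadata is not None
        and b"geometry" in f.metadata
        and f.metadata[b"geometry"] == b"true"
        for f in fields
    )


if __name__ == "__main__":
    plain = [FakeField("wkb"), FakeField("srid")]
    try:
        is_geometry_unguarded(plain)
    except TypeError as exc:
        print("unguarded check failed:", exc)
    print("guarded check:", is_geometry_guarded(plain))
```

Because `and` evaluates left to right and stops at the first falsy operand, placing the `None` check immediately before the `in` test is enough; fields whose name is not `wkb` never reach the metadata access in either version.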
    
    ### Does this PR introduce _any_ user-facing change?
    
    No
    
    ### How was this patch tested?
    
    Verified with a dedicated unit test (see commit a3fbd9b) that constructs a
    PyArrow struct with `wkb` and `srid` fields without metadata, confirming
    the `TypeError` before the fix and a clean pass after. The test was then
    removed (commit da08fdc) to reduce test burden since the fix is a trivial
    one-line defensive check.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #54295 from Yicong-Huang/SPARK-55507/fix/is-geometry-none-metadata.
    
    Authored-by: Yicong Huang <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
---
 python/pyspark/sql/pandas/types.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/python/pyspark/sql/pandas/types.py b/python/pyspark/sql/pandas/types.py
index 19fb5930c075..3b3bacc34db3 100644
--- a/python/pyspark/sql/pandas/types.py
+++ b/python/pyspark/sql/pandas/types.py
@@ -317,6 +317,7 @@ def is_geometry(at: "pa.DataType") -> bool:
     return any(
         (
             field.name == "wkb"
+            and field.metadata is not None
             and b"geometry" in field.metadata
             and field.metadata[b"geometry"] == b"true"
         )
@@ -333,6 +334,7 @@ def is_geography(at: "pa.DataType") -> bool:
     return any(
         (
             field.name == "wkb"
+            and field.metadata is not None
             and b"geography" in field.metadata
             and field.metadata[b"geography"] == b"true"
         )

