This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 39788da81acb [SPARK-55405][PYTHON][TESTS][FOLLOWUP] Skip PyArrow array cast tests when numpy < 2.0
39788da81acb is described below
commit 39788da81acb7b0ad1de659fbc215137aa0a15e7
Author: Yicong Huang <[email protected]>
AuthorDate: Wed Feb 11 09:23:24 2026 +0800
[SPARK-55405][PYTHON][TESTS][FOLLOWUP] Skip PyArrow array cast tests when numpy < 2.0
### What changes were proposed in this pull request?
Skip PyArrowScalarTypeCastTests and PyArrowNestedTypeCastTests when numpy < 2.0.0, because float16 formatting behavior differs in older numpy versions. This follows the same pattern used in test_pandas_udf_return_type.py.
Also simplified docstrings by removing numpy version-specific formatting
details, since we now only support numpy >= 2.0.0.
### Why are the changes needed?
Float16 handling in PyArrow < 21 varies across numpy versions, causing test failures. Rather than maintaining version-specific overrides, we skip these tests on numpy < 2.0, following the established pattern in the codebase.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Ran the test suite with numpy >= 2.0 to verify tests pass, and confirmed
tests are skipped on numpy < 2.0.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54228 from Yicong-Huang/SPARK-55405/fix/float16-override-numpy-compat.
Lead-authored-by: Yicong Huang <[email protected]>
Co-authored-by: Yicong-Huang <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
.../upstream/pyarrow/test_pyarrow_array_cast.py | 28 +++++++++++++---------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py b/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
index 00335d4ae897..1395cbb69131 100644
--- a/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
+++ b/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
@@ -65,13 +65,17 @@ from pyspark.loose_version import LooseVersion
from pyspark.testing.utils import (
have_pyarrow,
have_pandas,
+ have_numpy,
pyarrow_requirement_message,
pandas_requirement_message,
+ numpy_requirement_message,
)
from pyspark.testing.goldenutils import GoldenFileTestMixin
if have_pyarrow:
import pyarrow as pa
+if have_numpy:
+ import numpy as np
# ============================================================
@@ -215,8 +219,11 @@ class _PyArrowCastTestBase(GoldenFileTestMixin, unittest.TestCase):
@unittest.skipIf(
- not have_pyarrow or not have_pandas,
- pyarrow_requirement_message or pandas_requirement_message,
+ not have_pyarrow
+ or not have_pandas
+ or not have_numpy
+ or LooseVersion(np.__version__) < LooseVersion("2.0.0"),
+ pyarrow_requirement_message or pandas_requirement_message or numpy_requirement_message,
)
class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
"""
@@ -470,12 +477,9 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
Build overrides for known PyArrow version-dependent behaviors
(safe=True mode).
PyArrow < 21: str(scalar) for float16 uses numpy's formatting
- (via np.float16), which varies across numpy versions:
- - numpy >= 2.4: scientific notation (e.g. "3.277e+04")
- - numpy < 2.4: decimal (e.g. "32770.0")
+ (via np.float16), which may vary across numpy versions.
The golden file uses PyArrow >= 21 output (Python float).
- We compute the expected values dynamically to handle all
- numpy versions.
+ We compute the expected values dynamically to handle this difference.
"""
overrides = {}
if LooseVersion(pa.__version__) < LooseVersion("21.0.0"):
@@ -502,8 +506,7 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
Build overrides for known PyArrow version-dependent behaviors
(safe=False mode).
PyArrow < 21: str(scalar) for float16 uses numpy's formatting
- (via np.float16), which varies across numpy versions.
- Same dynamic computation approach as safe=True mode.
+ (via np.float16). Same dynamic computation approach as safe=True mode.
Additional overrides may be needed for different PyArrow versions
as safe=False behavior varies across versions.
"""
@@ -570,8 +573,11 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
@unittest.skipIf(
- not have_pyarrow or not have_pandas,
- pyarrow_requirement_message or pandas_requirement_message,
+ not have_pyarrow
+ or not have_pandas
+ or not have_numpy
+ or LooseVersion(np.__version__) < LooseVersion("2.0.0"),
+ pyarrow_requirement_message or pandas_requirement_message or numpy_requirement_message,
)
class PyArrowNestedTypeCastTests(_PyArrowCastTestBase):
"""
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]