This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 39788da81acb [SPARK-55405][PYTHON][TESTS][FOLLOWUP] Skip PyArrow array cast tests when numpy < 2.0
39788da81acb is described below
commit 39788da81acb7b0ad1de659fbc215137aa0a15e7
Author: Yicong Huang <[email protected]>
AuthorDate: Wed Feb 11 09:23:24 2026 +0800
[SPARK-55405][PYTHON][TESTS][FOLLOWUP] Skip PyArrow array cast tests when numpy < 2.0
### What changes were proposed in this pull request?
Skip PyArrowScalarTypeCastTests and PyArrowNestedTypeCastTests when numpy < 2.0.0, because float16 formatting behavior differs in older numpy versions. This follows the same pattern used in test_pandas_udf_return_type.py.
Also simplified docstrings by removing numpy version-specific formatting
details, since we now only support numpy >= 2.0.0.
### Why are the changes needed?
Float16 handling in PyArrow < 21 varies across numpy versions, causing test failures. Rather than maintaining version-specific overrides, we skip these tests on numpy < 2.0, following the established pattern in the codebase.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Ran the test suite with numpy >= 2.0 to verify tests pass, and confirmed
tests are skipped on numpy < 2.0.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54228 from Yicong-Huang/SPARK-55405/fix/float16-override-numpy-compat.
Lead-authored-by: Yicong Huang <[email protected]>
Co-authored-by: Yicong-Huang <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
.../upstream/pyarrow/test_pyarrow_array_cast.py | 28 +++++++++++++---------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py b/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
index 00335d4ae897..1395cbb69131 100644
--- a/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
+++ b/python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py
@@ -65,13 +65,17 @@ from pyspark.loose_version import LooseVersion
from pyspark.testing.utils import (
have_pyarrow,
have_pandas,
+ have_numpy,
pyarrow_requirement_message,
pandas_requirement_message,
+ numpy_requirement_message,
)
from pyspark.testing.goldenutils import GoldenFileTestMixin
if have_pyarrow:
import pyarrow as pa
+if have_numpy:
+ import numpy as np
# ============================================================
@@ -215,8 +219,11 @@ class _PyArrowCastTestBase(GoldenFileTestMixin, unittest.TestCase):
@unittest.skipIf(
- not have_pyarrow or not have_pandas,
- pyarrow_requirement_message or pandas_requirement_message,
+ not have_pyarrow
+ or not have_pandas
+ or not have_numpy
+ or LooseVersion(np.__version__) < LooseVersion("2.0.0"),
+ pyarrow_requirement_message or pandas_requirement_message or numpy_requirement_message,
)
class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
"""
@@ -470,12 +477,9 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
Build overrides for known PyArrow version-dependent behaviors
(safe=True mode).
PyArrow < 21: str(scalar) for float16 uses numpy's formatting
- (via np.float16), which varies across numpy versions:
- - numpy >= 2.4: scientific notation (e.g. "3.277e+04")
- - numpy < 2.4: decimal (e.g. "32770.0")
+ (via np.float16), which may vary across numpy versions.
The golden file uses PyArrow >= 21 output (Python float).
- We compute the expected values dynamically to handle all
- numpy versions.
+ We compute the expected values dynamically to handle this difference.
"""
overrides = {}
if LooseVersion(pa.__version__) < LooseVersion("21.0.0"):
@@ -502,8 +506,7 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
Build overrides for known PyArrow version-dependent behaviors
(safe=False mode).
PyArrow < 21: str(scalar) for float16 uses numpy's formatting
- (via np.float16), which varies across numpy versions.
- Same dynamic computation approach as safe=True mode.
+ (via np.float16). Same dynamic computation approach as safe=True mode.
Additional overrides may be needed for different PyArrow versions
as safe=False behavior varies across versions.
"""
@@ -570,8 +573,11 @@ class PyArrowScalarTypeCastTests(_PyArrowCastTestBase):
@unittest.skipIf(
- not have_pyarrow or not have_pandas,
- pyarrow_requirement_message or pandas_requirement_message,
+ not have_pyarrow
+ or not have_pandas
+ or not have_numpy
+ or LooseVersion(np.__version__) < LooseVersion("2.0.0"),
+ pyarrow_requirement_message or pandas_requirement_message or numpy_requirement_message,
)
class PyArrowNestedTypeCastTests(_PyArrowCastTestBase):
"""
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]