This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 94cfa3d0e431 [SPARK-55132][INFRA] Upgrade numpy version on lint image
94cfa3d0e431 is described below

commit 94cfa3d0e431522fb14d975b122a2907b885a163
Author: Tian Gao <[email protected]>
AuthorDate: Fri Jan 23 21:58:31 2026 +0800

    [SPARK-55132][INFRA] Upgrade numpy version on lint image
    
    ### What changes were proposed in this pull request?
    
    Upgrade the numpy version on the lint image and fix some minor lint failures.
    
    ### Why are the changes needed?
    
    When we run `pip install -r dev/requirements.txt` locally, we normally get
    the latest version of `numpy`. This creates a diff between our local dev
    environment and CI. We should keep the two as close as possible so we can
    rely on local mypy results.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    The mypy checks passed locally.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #53913 from gaogaotiantian/upgrade-lint-numpy.
    
    Authored-by: Tian Gao <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 dev/spark-test-image/lint/Dockerfile | 2 +-
 python/pyspark/pandas/frame.py       | 2 +-
 python/pyspark/pandas/series.py      | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/dev/spark-test-image/lint/Dockerfile b/dev/spark-test-image/lint/Dockerfile
index c76eb82b32b5..ea0d9ed3eb10 100644
--- a/dev/spark-test-image/lint/Dockerfile
+++ b/dev/spark-test-image/lint/Dockerfile
@@ -91,7 +91,7 @@ RUN python3.11 -m pip install \
     'jinja2' \
     'matplotlib' \
     'mypy==1.8.0' \
-    'numpy==2.0.2' \
+    'numpy==2.4.1' \
     'numpydoc' \
     'pandas' \
     'pandas-stubs' \
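
To reproduce the motivation above locally, one can mirror the lint image's pins by hand; a minimal sketch (version numbers copied from the Dockerfile hunk above, adjust them whenever the image changes):

```shell
# Install the same mypy/numpy pins as the lint image so local mypy
# results match what CI reports.
python3.11 -m pip install 'mypy==1.8.0' 'numpy==2.4.1'
python3.11 -m mypy --version  # sanity-check which mypy the interpreter sees
```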
diff --git a/python/pyspark/pandas/frame.py b/python/pyspark/pandas/frame.py
index 1d0c0fc638b1..1c66bbec37b7 100644
--- a/python/pyspark/pandas/frame.py
+++ b/python/pyspark/pandas/frame.py
@@ -11293,7 +11293,7 @@ defaultdict(<class 'list'>, {'col..., 'col...})]
         """
         # Rely on dtype rather than spark type because columns that consist of bools and
         # Nones should be excluded if bool_only is True
-        return [label for label in column_labels if is_bool_dtype(self._psser_for(label))]  # type: ignore[arg-type]
+        return [label for label in column_labels if is_bool_dtype(self._psser_for(label))]
 
     def _result_aggregated(
         self, column_labels: List[Label], scols: Sequence[PySparkColumn]
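
A side note on the `# type: ignore[arg-type]` removal above: with the newer numpy stubs, mypy accepts passing the series to `pandas.api.types.is_bool_dtype` directly. A small sketch of the behavior the adjacent comment describes, using plain pandas (not pyspark.pandas) so it stays self-contained:

```python
import pandas as pd
from pandas.api.types import is_bool_dtype

# A pure-bool column has dtype bool; mixing in None forces object dtype.
# is_bool_dtype is dtype-based, so the mixed column is excluded, which is
# exactly the bool_only filtering the comment in the diff refers to.
pure = pd.Series([True, False])
mixed = pd.Series([True, False, None])

print(pure.dtype, is_bool_dtype(pure))    # bool True
print(mixed.dtype, is_bool_dtype(mixed))  # object False
```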
diff --git a/python/pyspark/pandas/series.py b/python/pyspark/pandas/series.py
index 6407749c14fc..9c9ff94f2e16 100644
--- a/python/pyspark/pandas/series.py
+++ b/python/pyspark/pandas/series.py
@@ -1205,10 +1205,10 @@ class Series(Frame, IndexOpsMixin, Generic[T]):
                 else:
                     current = current.when(self.spark.column == F.lit(to_replace), value)
 
-            if hasattr(arg, "__missing__"):
-                tmp_val = arg[np._NoValue]  # type: ignore[attr-defined]
+            if isinstance(arg, dict) and hasattr(arg, "__missing__"):
+                tmp_val = arg[np._NoValue]
                 # Remove in case it's set in defaultdict.
-                del arg[np._NoValue]  # type: ignore[attr-defined]
+                del arg[np._NoValue]
                 current = current.otherwise(F.lit(tmp_val))
             else:
                 current = current.otherwise(F.lit(None).cast(self.spark.data_type))


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
