(spark) branch master updated: [SPARK-55954][PYTHON] Remove the incorrect overload type hint for fillna

gurwls223 Tue, 10 Mar 2026 21:47:53 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new c43328396085 [SPARK-55954][PYTHON] Remove the incorrect overload type hint for fillna
c43328396085 is described below

commit c433283960859e08292e24e44d88bbaa2e2f104e
Author: Tian Gao <[email protected]>
AuthorDate: Wed Mar 11 13:46:26 2026 +0900

    [SPARK-55954][PYTHON] Remove the incorrect overload type hint for fillna
    
    ### What changes were proposed in this pull request?
    
    Remove the incorrect (and unnecessary) type hint overload for `fillna`.
    
    ### Why are the changes needed?
    
    The current type hints are incorrect:
    
    ```python
        @overload
        def fillna(
            self,
            value: "LiteralType",
            subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
        ) -> ParentDataFrame:
            ...
    
        @overload
        def fillna(self, value: Dict[str, "LiteralType"]) -> ParentDataFrame:
            ...
    ```
    
    In Python, these overloads mean: if `subset` is provided, `value` has to be a `LiteralType`; a `Dict` value is accepted only when `subset` is omitted. This is not what the interface is.
    
    I believe the overloads were meant to say "if `value` is a dict, `subset` is ignored", but that's not what they express. `subset` can still be provided when `value` is a `Dict` (explicitly as `None`, or even as another value, because it will just be ignored), and even when `value` is a `LiteralType`, `subset` could be `None`.
    
    In this case, there's no way to write the overloads properly, because we simply accept any combination. We should just use the original type hint and throw away the overloads.
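
A minimal, self-contained sketch of the point above, using toy functions rather than PySpark itself. The names `fillna_overloaded`, `fillna_plain`, `_describe`, `Subset`, and the simplified `LiteralType` alias are hypothetical and exist only for illustration; the only behaviour taken from this message is that a dict `value` causes `subset` to be ignored.

```python
from typing import Dict, List, Optional, Tuple, Union, overload

# Assumption: a simplified stand-in for PySpark's "LiteralType" alias.
LiteralType = Union[bool, int, float, str]
# Hypothetical alias for the subset annotation quoted in the message.
Subset = Optional[Union[str, Tuple[str, ...], List[str]]]


def _describe(value, subset) -> str:
    """Toy runtime behaviour: a dict value makes `subset` irrelevant."""
    if isinstance(value, dict):
        return "dict value, subset ignored"
    if subset is None:
        return "scalar value, no subset"
    return f"scalar value, subset={subset!r}"


# Same shape as the removed overloads (as a plain function, no `self`).
@overload
def fillna_overloaded(value: LiteralType, subset: Subset = ...) -> str: ...


@overload
def fillna_overloaded(value: Dict[str, LiteralType]) -> str: ...


def fillna_overloaded(
    value: Union[LiteralType, Dict[str, LiteralType]],
    subset: Subset = None,
) -> str:
    return _describe(value, subset)


# Single signature: every combination the runtime accepts is also typed.
def fillna_plain(
    value: Union[LiteralType, Dict[str, LiteralType]],
    subset: Subset = None,
) -> str:
    return _describe(value, subset)


# Overloads are erased at runtime, so both functions behave identically.
print(fillna_overloaded({"a": 0}, subset=["a"]))
print(fillna_plain({"a": 0}, subset=["a"]))
print(fillna_plain(0, subset=None))
```

Both functions run the same code; only the static view differs. A type checker reading the overloads should reject the first call, because the `Dict` overload has no `subset` parameter, while the single signature accepts it.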
    
    That's also why we needed the `type: ignore`: our overloads are incorrect, so under them the function call is reported as wrong.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    mypy passed locally.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #54742 from gaogaotiantian/fillna-type-hint.
    
    Authored-by: Tian Gao <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/sql/classic/dataframe.py | 14 +-------------
 python/pyspark/sql/connect/dataframe.py |  2 +-
 python/pyspark/sql/dataframe.py         | 12 ------------
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/python/pyspark/sql/classic/dataframe.py b/python/pyspark/sql/classic/dataframe.py
index 5b3bcaa9598f..243fa8a37b5a 100644
--- a/python/pyspark/sql/classic/dataframe.py
+++ b/python/pyspark/sql/classic/dataframe.py
@@ -1293,18 +1293,6 @@ class DataFrame(ParentDataFrame, PandasMapOpsMixin, PandasConversionMixin):
 
         return DataFrame(self._jdf.na().drop(thresh, self._jseq(subset)), self.sparkSession)
 
-    @overload
-    def fillna(
-        self,
-        value: "LiteralType",
-        subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
-    ) -> ParentDataFrame:
-        ...
-
-    @overload
-    def fillna(self, value: Dict[str, "LiteralType"]) -> ParentDataFrame:
-        ...
-
     def fillna(
         self,
         value: Union["LiteralType", Dict[str, "LiteralType"]],
@@ -1906,7 +1894,7 @@ class DataFrameNaFunctions(ParentDataFrameNaFunctions):
         value: Union["LiteralType", Dict[str, "LiteralType"]],
         subset: Optional[List[str]] = None,
     ) -> ParentDataFrame:
-        return self.df.fillna(value=value, subset=subset)  # type: ignore[arg-type]
+        return self.df.fillna(value=value, subset=subset)
 
     @overload
     def replace(
diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index 114276f97bb0..9cb12d8388a5 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -2291,7 +2291,7 @@ class DataFrameNaFunctions(ParentDataFrameNaFunctions):
         value: Union["LiteralType", Dict[str, "LiteralType"]],
         subset: Optional[Union[str, Tuple[str, ...], List[str]]] = None,
     ) -> ParentDataFrame:
-        return self.df.fillna(value=value, subset=subset)  # type: ignore[arg-type]
+        return self.df.fillna(value=value, subset=subset)
 
     def drop(
         self,
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 72b86ca3f43e..2ee3e4e9d703 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -5131,18 +5131,6 @@ class DataFrame:
         """
         ...
 
-    @overload
-    def fillna(
-        self,
-        value: "LiteralType",
-        subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
-    ) -> "DataFrame":
-        ...
-
-    @overload
-    def fillna(self, value: Dict[str, "LiteralType"]) -> "DataFrame":
-        ...
-
     @dispatch_df_method
     def fillna(
         self,


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
