This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new c43328396085 [SPARK-55954][PYTHON] Remove the incorrect overload type hint for fillna
c43328396085 is described below
commit c433283960859e08292e24e44d88bbaa2e2f104e
Author: Tian Gao <[email protected]>
AuthorDate: Wed Mar 11 13:46:26 2026 +0900
[SPARK-55954][PYTHON] Remove the incorrect overload type hint for fillna
### What changes were proposed in this pull request?
Remove the incorrect (and unnecessary) type hint overload for `fillna`.
### Why are the changes needed?
The current type hints are incorrect:
```python
@overload
def fillna(
    self,
    value: "LiteralType",
    subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
) -> ParentDataFrame:
    ...

@overload
def fillna(self, value: Dict[str, "LiteralType"]) -> ParentDataFrame:
    ...
```
To a type checker, these overloads mean that `subset` may only be passed when
`value` is a `LiteralType`; when `value` is a `Dict`, `subset` cannot be passed
at all. This is not what the interface accepts.
I believe this overload was meant to say "if value is a dict, subset is
ignored", but that is not what the overload expresses. `subset` can still be
provided when `value` is a `Dict` (explicitly as `None`, or as any other value,
because it is simply ignored), and when `value` is a `LiteralType`, `subset`
can be `None`.
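
A minimal sketch of the mismatch, using a hypothetical `Frame` class rather than the real PySpark `DataFrame`. The `LiteralType` alias here is an assumed stand-in, and the runtime body only records its arguments. All three calls run, but a checker such as mypy is expected to reject the last one because neither overload matches it:

```python
from typing import Dict, List, Optional, Tuple, Union, overload

# Assumption: a simplified stand-in for PySpark's LiteralType.
LiteralType = Union[bool, int, float, str]


class Frame:
    """Hypothetical stand-in for DataFrame; only the signature shape matters."""

    @overload
    def fillna(
        self,
        value: LiteralType,
        subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
    ) -> "Frame": ...

    @overload
    def fillna(self, value: Dict[str, LiteralType]) -> "Frame": ...

    def fillna(
        self,
        value: Union[LiteralType, Dict[str, LiteralType]],
        subset: Optional[Union[str, Tuple[str, ...], List[str]]] = None,
    ) -> "Frame":
        # Record the call so the runtime behaviour can be inspected.
        self.last_call = (value, subset)
        return self


frame = Frame()
frame.fillna(0, subset=["a"])        # matches the first overload
frame.fillna({"a": 0})               # matches the second overload
frame.fillna({"a": 0}, subset=None)  # valid at runtime, matches neither overload
```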
In this case there is no way to write the overloads properly, because we
accept any combination. We should use the original type hint and drop the
overloads.
That is also why we needed the `type: ignore`: our overloads were incorrect,
so under them the type checker rejects an otherwise valid call.
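
As a rough illustration of the shape after this change, the sketch below uses hypothetical `Frame` and `NaFunctions` classes (not the real PySpark classes) and an assumed `LiteralType` alias. With a single signature on `fillna`, the forwarding call should type-check without a `type: ignore`:

```python
from typing import Dict, List, Optional, Tuple, Union

# Assumptions: simplified stand-ins for the PySpark type aliases.
LiteralType = Union[bool, int, float, str]
Subset = Optional[Union[str, Tuple[str, ...], List[str]]]


class Frame:
    """Hypothetical stand-in for DataFrame."""

    def fillna(
        self,
        value: Union[LiteralType, Dict[str, LiteralType]],
        subset: Subset = None,
    ) -> "Frame":
        # As described above, subset is ignored when value is a dict.
        self.last_call = (value, None if isinstance(value, dict) else subset)
        return self


class NaFunctions:
    """Hypothetical stand-in for DataFrameNaFunctions."""

    def __init__(self, df: Frame) -> None:
        self.df = df

    def fill(
        self,
        value: Union[LiteralType, Dict[str, LiteralType]],
        subset: Subset = None,
    ) -> Frame:
        # One signature accepts every combination, so no type: ignore is needed.
        return self.df.fillna(value=value, subset=subset)


na = NaFunctions(Frame())
na.fill({"a": 0}, subset=["b"])  # subset is accepted and ignored
na.fill(0.0)                     # subset defaults to None
```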
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
mypy passed locally.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54742 from gaogaotiantian/fillna-type-hint.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/sql/classic/dataframe.py | 14 +-------------
python/pyspark/sql/connect/dataframe.py | 2 +-
python/pyspark/sql/dataframe.py | 12 ------------
3 files changed, 2 insertions(+), 26 deletions(-)
diff --git a/python/pyspark/sql/classic/dataframe.py b/python/pyspark/sql/classic/dataframe.py
index 5b3bcaa9598f..243fa8a37b5a 100644
--- a/python/pyspark/sql/classic/dataframe.py
+++ b/python/pyspark/sql/classic/dataframe.py
@@ -1293,18 +1293,6 @@ class DataFrame(ParentDataFrame, PandasMapOpsMixin, PandasConversionMixin):
return DataFrame(self._jdf.na().drop(thresh, self._jseq(subset)), self.sparkSession)
- @overload
- def fillna(
- self,
- value: "LiteralType",
- subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
- ) -> ParentDataFrame:
- ...
-
- @overload
- def fillna(self, value: Dict[str, "LiteralType"]) -> ParentDataFrame:
- ...
-
def fillna(
self,
value: Union["LiteralType", Dict[str, "LiteralType"]],
@@ -1906,7 +1894,7 @@ class DataFrameNaFunctions(ParentDataFrameNaFunctions):
value: Union["LiteralType", Dict[str, "LiteralType"]],
subset: Optional[List[str]] = None,
) -> ParentDataFrame:
- return self.df.fillna(value=value, subset=subset)  # type: ignore[arg-type]
+ return self.df.fillna(value=value, subset=subset)
@overload
def replace(
diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index 114276f97bb0..9cb12d8388a5 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -2291,7 +2291,7 @@ class DataFrameNaFunctions(ParentDataFrameNaFunctions):
value: Union["LiteralType", Dict[str, "LiteralType"]],
subset: Optional[Union[str, Tuple[str, ...], List[str]]] = None,
) -> ParentDataFrame:
- return self.df.fillna(value=value, subset=subset)  # type: ignore[arg-type]
+ return self.df.fillna(value=value, subset=subset)
def drop(
self,
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index 72b86ca3f43e..2ee3e4e9d703 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -5131,18 +5131,6 @@ class DataFrame:
"""
...
- @overload
- def fillna(
- self,
- value: "LiteralType",
- subset: Optional[Union[str, Tuple[str, ...], List[str]]] = ...,
- ) -> "DataFrame":
- ...
-
- @overload
- def fillna(self, value: Dict[str, "LiteralType"]) -> "DataFrame":
- ...
-
@dispatch_df_method
def fillna(
self,