This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new e5a900a80b84 [SPARK-55955][PYTHON] Remove overload type hint for drop
e5a900a80b84 is described below
commit e5a900a80b849a8d5d05372ba6174235aa70b6b2
Author: Tian Gao <[email protected]>
AuthorDate: Wed Mar 11 13:43:11 2026 +0900
[SPARK-55955][PYTHON] Remove overload type hint for drop
### What changes were proposed in this pull request?
Removed incorrect overload type hints for `drop`.
Note that a type test is changed: the test assumed `drop(Column, Column)`
would fail, which does not make sense because that restriction was never
documented and the call actually works.
### Why are the changes needed?
```python
@overload
def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
    ...

@overload
def drop(self, *cols: str) -> ParentDataFrame:
    ...
```
This means that if only one argument is provided, it can be either `str` or
`Column`, but if multiple arguments are provided, they all have to be `str`.
That's not true. Nothing in our documentation indicates this, and our code
treats `str` and `Column` the same, so multiple `Column` arguments work fine.
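
As an illustration only, here is a minimal sketch of the single variadic signature that remains after this change. The `Column` and `DataFrame` classes are hypothetical stand-ins that only track column names, not the PySpark implementations; they are there so the snippet runs without Spark.

```python
from typing import List, Union


class Column:
    """Hypothetical stand-in for pyspark.sql.Column; only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


ColumnOrName = Union[Column, str]


class DataFrame:
    """Hypothetical stand-in for a DataFrame; only tracks column names."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = list(columns)

    # One variadic signature: any number of arguments, each a str or a Column.
    def drop(self, *cols: ColumnOrName) -> "DataFrame":
        to_drop = {c.name if isinstance(c, Column) else c for c in cols}
        return DataFrame([c for c in self.columns if c not in to_drop])


df = DataFrame(["id", "foo", "bar"])

# Two Column arguments: the removed overloads made a type checker reject this.
result_two_columns = df.drop(Column("id"), Column("foo"))

# Mixing str and Column is accepted by the same signature.
result_mixed = df.drop("id", Column("bar"))
```

With the old overloads, a type checker would flag the `drop(Column, Column)` call even though it runs; with the single `*cols: "ColumnOrName"` signature, the hint matches the runtime behavior.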
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Local mypy passed.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #54744 from gaogaotiantian/drop-overload.
Authored-by: Tian Gao <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
python/pyspark/sql/classic/dataframe.py | 10 +---------
python/pyspark/sql/connect/dataframe.py | 10 +---------
python/pyspark/sql/dataframe.py | 10 +---------
python/pyspark/sql/tests/typing/test_dataframe.yml | 7 -------
4 files changed, 3 insertions(+), 34 deletions(-)
diff --git a/python/pyspark/sql/classic/dataframe.py b/python/pyspark/sql/classic/dataframe.py
index 854ab0ff89a0..5b3bcaa9598f 100644
--- a/python/pyspark/sql/classic/dataframe.py
+++ b/python/pyspark/sql/classic/dataframe.py
@@ -1723,15 +1723,7 @@ class DataFrame(ParentDataFrame, PandasMapOpsMixin, PandasConversionMixin):
)
return DataFrame(self._jdf.withMetadata(columnName, jmeta), self.sparkSession)
- @overload
- def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
- ...
-
- @overload
- def drop(self, *cols: str) -> ParentDataFrame:
- ...
-
- def drop(self, *cols: "ColumnOrName") -> ParentDataFrame: # type: ignore[misc]
+ def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:
column_names: List[str] = []
java_columns: List["JavaObject"] = []
diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index f07a14403698..114276f97bb0 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -535,15 +535,7 @@ class DataFrame(ParentDataFrame):
res._cached_schema = self._cached_schema
return res
- @overload
- def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
- ...
-
- @overload
- def drop(self, *cols: str) -> ParentDataFrame:
- ...
-
- def drop(self, *cols: "ColumnOrName") -> ParentDataFrame: # type: ignore[misc]
+ def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:
_cols = list(cols)
if any(not isinstance(c, (str, Column)) for c in _cols):
raise PySparkTypeError(
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index a1607df4ecef..72b86ca3f43e 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -5923,15 +5923,7 @@ class DataFrame:
"""
...
- @overload
- def drop(self, cols: "ColumnOrName") -> "DataFrame":
- ...
-
- @overload
- def drop(self, *cols: str) -> "DataFrame":
- ...
-
- @dispatch_df_method # type: ignore[misc]
+ @dispatch_df_method
def drop(self, *cols: "ColumnOrName") -> "DataFrame":
"""
Returns a new :class:`DataFrame` without specified columns.
diff --git a/python/pyspark/sql/tests/typing/test_dataframe.yml b/python/pyspark/sql/tests/typing/test_dataframe.yml
index 93ee8d8852f9..5e4b20d3588c 100644
--- a/python/pyspark/sql/tests/typing/test_dataframe.yml
+++ b/python/pyspark/sql/tests/typing/test_dataframe.yml
@@ -118,15 +118,8 @@
df.drop("id")
df.drop("id", "foo")
df.drop(df.id)
-
df.drop(col("id"), col("foo"))
- out: |
- main:10: error: No overload variant of "drop" of "DataFrame" matches argument types "Column", "Column" [call-overload]
- main:10: note: Possible overload variants:
- main:10: note: def drop(self, cols: Column | str) -> DataFrame
- main:10: note: def drop(self, *cols: str) -> DataFrame
-
- case: fillNullValues
main: |