(spark) branch master updated: [SPARK-55955][PYTHON] Remove overload type hint for drop

gurwls223 Tue, 10 Mar 2026 21:43:29 -0700

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new e5a900a80b84 [SPARK-55955][PYTHON] Remove overload type hint for drop
e5a900a80b84 is described below

commit e5a900a80b849a8d5d05372ba6174235aa70b6b2
Author: Tian Gao <[email protected]>
AuthorDate: Wed Mar 11 13:43:11 2026 +0900

    [SPARK-55955][PYTHON] Remove overload type hint for drop
    
    ### What changes were proposed in this pull request?
    
    Removed incorrect overload type hints for `drop`.
    
    Note that a type test has changed: the old test does not make sense because
    it assumes `drop(Column, Column)` would fail, even though that restriction
    was never documented and the call actually works.
    
    ### Why are the changes needed?
    
    ```python
        @overload
        def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
            ...
    
        @overload
        def drop(self, *cols: str) -> ParentDataFrame:
            ...
    ```
    
    These overloads say that if only one argument is provided, it can be either
    `str` or `Column`, but if multiple arguments are provided, they must all be
    `str`. That's not true. Our documentation gives no such indication, and our
    code treats `str` and `Column` the same, so multiple `Column` arguments
    work fine.
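
As a rough illustration of the behavior described above, here is a minimal sketch that does not use PySpark. `Column` and `DataFrame` are hypothetical stand-ins that only track column names, and the body of `drop` is an assumption made for illustration, not the actual implementation. It shows a single variadic `*cols: ColumnOrName` signature accepting any mix of `str` and `Column` arguments.

```python
from typing import List, Union


class Column:
    """Hypothetical stand-in for pyspark.sql.Column; only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


ColumnOrName = Union[Column, str]


class DataFrame:
    """Hypothetical stand-in for a DataFrame; only tracks column names."""

    def __init__(self, columns: List[str]) -> None:
        self.columns = list(columns)

    # One variadic signature and no overloads: a type checker accepts
    # any number of str and Column arguments, in any combination.
    def drop(self, *cols: ColumnOrName) -> "DataFrame":
        to_drop = {c.name if isinstance(c, Column) else c for c in cols}
        return DataFrame([c for c in self.columns if c not in to_drop])


df = DataFrame(["id", "foo", "bar"])
one_name = df.drop("id").columns
two_columns = df.drop(Column("id"), Column("foo")).columns
mixed = df.drop("id", Column("bar")).columns
```

With the removed overloads, a type checker would have rejected the `two_columns` and `mixed` calls even though they run without error.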
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    Local mypy passed.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #54744 from gaogaotiantian/drop-overload.
    
    Authored-by: Tian Gao <[email protected]>
    Signed-off-by: Hyukjin Kwon <[email protected]>
---
 python/pyspark/sql/classic/dataframe.py            | 10 +---------
 python/pyspark/sql/connect/dataframe.py            | 10 +---------
 python/pyspark/sql/dataframe.py                    | 10 +---------
 python/pyspark/sql/tests/typing/test_dataframe.yml |  7 -------
 4 files changed, 3 insertions(+), 34 deletions(-)

diff --git a/python/pyspark/sql/classic/dataframe.py b/python/pyspark/sql/classic/dataframe.py
index 854ab0ff89a0..5b3bcaa9598f 100644
--- a/python/pyspark/sql/classic/dataframe.py
+++ b/python/pyspark/sql/classic/dataframe.py
@@ -1723,15 +1723,7 @@ class DataFrame(ParentDataFrame, PandasMapOpsMixin, PandasConversionMixin):
         )
         return DataFrame(self._jdf.withMetadata(columnName, jmeta), self.sparkSession)
 
-    @overload
-    def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
-        ...
-
-    @overload
-    def drop(self, *cols: str) -> ParentDataFrame:
-        ...
-
-    def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:  # type: ignore[misc]
+    def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:
         column_names: List[str] = []
         java_columns: List["JavaObject"] = []
 
diff --git a/python/pyspark/sql/connect/dataframe.py b/python/pyspark/sql/connect/dataframe.py
index f07a14403698..114276f97bb0 100644
--- a/python/pyspark/sql/connect/dataframe.py
+++ b/python/pyspark/sql/connect/dataframe.py
@@ -535,15 +535,7 @@ class DataFrame(ParentDataFrame):
         res._cached_schema = self._cached_schema
         return res
 
-    @overload
-    def drop(self, cols: "ColumnOrName") -> ParentDataFrame:
-        ...
-
-    @overload
-    def drop(self, *cols: str) -> ParentDataFrame:
-        ...
-
-    def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:  # type: ignore[misc]
+    def drop(self, *cols: "ColumnOrName") -> ParentDataFrame:
         _cols = list(cols)
         if any(not isinstance(c, (str, Column)) for c in _cols):
             raise PySparkTypeError(
diff --git a/python/pyspark/sql/dataframe.py b/python/pyspark/sql/dataframe.py
index a1607df4ecef..72b86ca3f43e 100644
--- a/python/pyspark/sql/dataframe.py
+++ b/python/pyspark/sql/dataframe.py
@@ -5923,15 +5923,7 @@ class DataFrame:
         """
         ...
 
-    @overload
-    def drop(self, cols: "ColumnOrName") -> "DataFrame":
-        ...
-
-    @overload
-    def drop(self, *cols: str) -> "DataFrame":
-        ...
-
-    @dispatch_df_method  # type: ignore[misc]
+    @dispatch_df_method
     def drop(self, *cols: "ColumnOrName") -> "DataFrame":
         """
         Returns a new :class:`DataFrame` without specified columns.
diff --git a/python/pyspark/sql/tests/typing/test_dataframe.yml b/python/pyspark/sql/tests/typing/test_dataframe.yml
index 93ee8d8852f9..5e4b20d3588c 100644
--- a/python/pyspark/sql/tests/typing/test_dataframe.yml
+++ b/python/pyspark/sql/tests/typing/test_dataframe.yml
@@ -118,15 +118,8 @@
     df.drop("id")
     df.drop("id", "foo")
     df.drop(df.id)
-
     df.drop(col("id"), col("foo"))
 
-  out: |
-    main:10: error: No overload variant of "drop" of "DataFrame" matches argument types "Column", "Column"  [call-overload]
-    main:10: note: Possible overload variants:
-    main:10: note:     def drop(self, cols: Column | str) -> DataFrame
-    main:10: note:     def drop(self, *cols: str) -> DataFrame
-
 
 - case: fillNullValues
   main: |


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
