santosh-d3vpl3x commented on code in PR #37335:
URL: https://github.com/apache/spark/pull/37335#discussion_r933285810


##########
python/pyspark/sql/dataframe.py:
##########
@@ -3244,10 +3244,14 @@ def drop(self, *cols: "ColumnOrName") -> "DataFrame":  # type: ignore[misc]
             else:
                 raise TypeError("col should be a string or a Column")
         else:
-            for col in cols:
-                if not isinstance(col, str):
-                    raise TypeError("each col in the param list should be a string")
-            jdf = self._jdf.drop(self._jseq(cols))
+            if all(isinstance(col, str) for col in cols):
+                jdf = self._jdf.drop(self._jseq(cols))
+            elif all(isinstance(col, Column) for col in cols):
+                jdf = self._jdf
+                for col in cols:
+                    jdf = jdf.drop(col._jc)  # type: ignore[union-attr]

Review Comment:
   Welp, with that change it breaks the existing `drop(col: Column)` public API for bindings.
   ```
   py4j.Py4JException: Method drop([class org.apache.spark.sql.Column]) does not exist
   ```
   I have confirmed that this is the case for both Python and SparkR.
   
   I believe adding `drop(col: Column, cols: Column*)` or `drop(col: Column*)` on the Scala side is quite hard at the moment while preserving compatibility.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

