[ 
https://issues.apache.org/jira/browse/SPARK-43439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17743548#comment-17743548
 ] 

Frederik Paradis commented on SPARK-43439:
------------------------------------------

I just went through the source code, and it seems that drop has different 
semantics when passed strings versus Column objects. It seems that the 
documentation will highlight this in the next version:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L2850
https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L5144

However, it seems odd to me that, in Python, the arguments of this function 
have different semantics from those of the select function, which converts 
everything into Column objects under the hood. One suggestion I would make is 
to create a drop_column function with the same semantics as select.

> Drop does not work when passed a string with an alias
> -----------------------------------------------------
>
>                 Key: SPARK-43439
>                 URL: https://issues.apache.org/jira/browse/SPARK-43439
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.4.0
>            Reporter: Frederik Paradis
>            Priority: Minor
>
> When passing a string to the drop method, if the string contains an alias, 
> the column is not dropped. However, passing a Column object with the same 
> name and alias works.
> {code:python}
> from pyspark.sql import SparkSession
> import pyspark.sql.functions as F
> spark = SparkSession.builder.master("local[1]").appName("local-spark-session").getOrCreate()
> df = spark.createDataFrame([(1, 10)], ["any", "hour"]).alias("a")
> j = df.drop("a.hour")
> print(j)  # DataFrame[any: bigint, hour: bigint]
> jj = df.drop(F.col("a.hour"))
> print(jj)  # DataFrame[any: bigint]
> {code}
>  
> Related issues:
> https://issues.apache.org/jira/browse/SPARK-31123
> https://issues.apache.org/jira/browse/SPARK-14759
>  



