Erik Parmann created SPARK-38554:
------------------------------------

             Summary: Dataframe API: withColumn "comment" parameter
                 Key: SPARK-38554
                 URL: https://issues.apache.org/jira/browse/SPARK-38554
             Project: Spark
          Issue Type: Wish
          Components: PySpark, Spark Core
    Affects Versions: 3.2.1
            Reporter: Erik Parmann


I often find that the right time to document a column in a dataframe is when I 
create it. It would be nice if withColumn took an optional comment parameter, 
so that one could write, e.g.:

 
{code:python}
df = df.withColumn("tax", F.col("salary")*F.col("tax_percentage"), comment="The amount of tax paid in dollars.")
{code}
It is possible to do something similar with alias, but as far as I know the 
equivalent would be the much clunkier:
{code:python}
df = df.withColumn("tax", (F.col("salary")*F.col("tax_percentage")).alias(
    "tax", metadata={"comment": "The amount of tax paid in dollars."}))
{code}
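Until such a parameter exists, a small wrapper can approximate it. The following is only a sketch: with_column_comment is a hypothetical helper name, not part of PySpark, and it assumes that the metadata attached through Column.alias is carried over by withColumn.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; the helper itself just calls methods
    # on the objects it is given.
    from pyspark.sql import Column, DataFrame


def with_column_comment(
    df: "DataFrame", name: str, col: "Column", comment: str
) -> "DataFrame":
    """Hypothetical helper: add column `name` with `comment` in its metadata."""
    # Column.alias accepts a `metadata` dict; the comment is stored under
    # the "comment" key, as in the alias-based workaround above.
    return df.withColumn(name, col.alias(name, metadata={"comment": comment}))
```

With this helper the call becomes with_column_comment(df, "tax", F.col("salary")*F.col("tax_percentage"), "The amount of tax paid in dollars."), and the comment should then be readable from df.schema["tax"].metadata.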


