[ https://issues.apache.org/jira/browse/SPARK-25333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-25333:
------------------------------------

    Assignee: Apache Spark

> Ability to add new columns in the beginning of a Dataset
> --------------------------------------------------------
>
>                 Key: SPARK-25333
>                 URL: https://issues.apache.org/jira/browse/SPARK-25333
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Walid Mellouli
>            Assignee: Apache Spark
>            Priority: Minor
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> When we add new columns to a Dataset, they are automatically appended at the end of the schema.
> {code:java}
> val df = sc.parallelize(Seq(1, 2, 3)).toDF
> df.printSchema
> root
>  |-- value: integer (nullable = true)
> {code}
> When we add a new column:
> {code:java}
> val newDf = df.withColumn("newColumn", col("value") + 1)
> newDf.printSchema
> root
>  |-- value: integer (nullable = true)
>  |-- newColumn: integer (nullable = true)
> {code}
> Generally, users want to add new columns either at the end or at the beginning, depending on the use case.
> In my case, for example, we add technical columns at the beginning of a Dataset and business columns at the end.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
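Editor's note: until such an API exists, the ordering described in the issue can already be achieved by following `withColumn` with an explicit `select`. This is a minimal sketch, not part of the issue; it assumes a running Spark session and reuses the `df` and `newColumn` names from the example above:

{code:java}
import org.apache.spark.sql.functions.col

// Append the column as usual, then move it to the front by selecting
// it before all of the original columns.
val newDf = df.withColumn("newColumn", col("value") + 1)
val reordered = newDf.select(col("newColumn") +: df.columns.map(col): _*)

// reordered.printSchema now lists newColumn first, then value.
{code}

Because `df.columns` is read from the original Dataset, this pattern generalizes to any number of pre-existing columns, whereas hard-coding `select("newColumn", "value")` only works for this two-column example.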