Walid Mellouli created SPARK-25333:
--------------------------------------

             Summary: Ability to add new columns in the beginning of a Dataset
                 Key: SPARK-25333
                 URL: https://issues.apache.org/jira/browse/SPARK-25333
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Walid Mellouli

When we add new columns to a Dataset, they are always appended at the end of the Dataset.

{code:java}
val df = sc.parallelize(Seq(1, 2, 3)).toDF
df.printSchema

root
 |-- value: integer (nullable = true)
{code}

When we add a new column:

{code:java}
val newDf = df.withColumn("newColumn", col("value") + 1)
newDf.printSchema

root
 |-- value: integer (nullable = true)
 |-- newColumn: integer (nullable = true)
{code}

Users generally want to add new columns either at the end or at the beginning, depending on the use case. In my case, for example, we add technical columns at the beginning of a Dataset and business columns at the end.
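
For reference, the requested ordering can already be approximated with the existing API by using select instead of withColumn. The following is only a minimal sketch of that workaround, not a proposed implementation: the object PrependColumnSketch and the helper withColumnFirst are hypothetical names that do not exist in Spark, and the local SparkSession settings are assumptions made so the example is self-contained.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object PrependColumnSketch {

  // Hypothetical helper (not part of the Spark API): returns a DataFrame
  // with `name` as the first column, followed by the existing columns.
  // If a column called `name` already exists, it is replaced, which
  // mirrors how withColumn treats an existing name.
  // Assumption: column names are simple (no dots or backticks).
  def withColumnFirst(df: DataFrame, name: String, expr: Column): DataFrame = {
    val existing = df.columns.filterNot(_ == name).map(c => df.col(c))
    df.select(expr.as(name) +: existing: _*)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("prepend-column-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1, 2, 3).toDF("value")
    val newDf = withColumnFirst(df, "newColumn", col("value") + 1)

    // The schema now lists newColumn before value.
    newDf.printSchema()

    spark.stop()
  }
}
```

This works for a single column, but it requires listing all existing columns on every call, which is the inconvenience a built-in option would remove.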