[ https://issues.apache.org/jira/browse/SPARK-7616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheng Lian updated SPARK-7616:
------------------------------
Description:
When a DataFrame is saved as a partitioned table, its partition columns are
appended after the data columns. However, the column names are not adjusted
accordingly, so values can show up under the wrong column names.
{code}
import sqlContext._
import sqlContext.implicits._
val df = (1 to 3).map(i => i -> i * 2).toDF("a", "b")
df.write
  .format("parquet")
  .mode("overwrite")
  .partitionBy("a")
  .saveAsTable("t")
table("t").orderBy('a).show()
{code}
Expected output:
{noformat}
+-+-+
|b|a|
+-+-+
|2|1|
|4|2|
|6|3|
+-+-+
{noformat}
Actual output:
{noformat}
+-+-+
|b|a|
+-+-+
|1|2|
|2|4|
|3|6|
+-+-+
{noformat}
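To make the expected behavior concrete, here is a minimal sketch in plain Scala collections, with no Spark involved. It is only a model of the intended result, not Spark's actual implementation; the object name PartitionColumnOrder and the helper reorder are hypothetical and exist only for this illustration. The point is that when partition columns are moved to the end, each name has to move together with its values.

```scala
// Hypothetical model of the expected column reordering; not Spark code.
object PartitionColumnOrder {
  // Move partition columns after data columns, keeping every name
  // attached to the values that belong to it.
  def reorder(
      names: Seq[String],
      rows: Seq[Seq[Int]],
      partitionCols: Seq[String]): (Seq[String], Seq[Seq[Int]]) = {
    // Split column indices into partition columns and data columns.
    val (partIdx, dataIdx) =
      names.indices.partition(i => partitionCols.contains(names(i)))
    // Data columns first, partition columns last.
    val order = dataIdx ++ partIdx
    // Apply the same index order to the names and to every row.
    (order.map(names), rows.map(row => order.map(row)))
  }

  def main(args: Array[String]): Unit = {
    val names = Seq("a", "b")
    val rows = (1 to 3).map(i => Seq(i, i * 2)) // same data as the repro
    val (newNames, newRows) = reorder(names, rows, Seq("a"))
    println(newNames.mkString("|"))
    newRows.foreach(r => println(r.mkString("|")))
  }
}
```

With the data from the reproduction above, this yields the names b, a and the rows (2, 1), (4, 2), (6, 3), which is the expected output listed in this report. Applying the new order to the names but not to the rows is what produces the mismatch shown under "Actual output".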
was:
{code}
import sqlContext._
import sqlContext.implicits._
import org.apache.spark.sql.SaveMode
val df = createDataFrame(Seq((1,2),(2,3),(3,4))).toDF("a", "b")
df.saveAsTable("test2", "parquet", SaveMode.Overwrite, Map.empty[String, String], Seq("b"))
table("test2").show
// You will see
// +-+-+
// |a|b|
// +-+-+
// |1|2|
// |2|3|
// |3|4|
// +-+-+
df.saveAsTable("test2", "parquet", SaveMode.Overwrite, Map.empty[String, String], Seq("a"))
table("test2").show
// You will see
// +-+-+
// |b|a|
// +-+-+
// |1|2|
// |2|3|
// |3|4|
// +-+-+
{code}
> Column order can be corrupted when saving DataFrame as a partitioned table
> --------------------------------------------------------------------------
>
> Key: SPARK-7616
> URL: https://issues.apache.org/jira/browse/SPARK-7616
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.0
> Reporter: Yin Huai
> Assignee: Cheng Lian
> Priority: Blocker