Charlie Evans created SPARK-15804:
-------------------------------------
Summary: Manually added metadata not saving with parquet
Key: SPARK-15804
URL: https://issues.apache.org/jira/browse/SPARK-15804
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.0.0
Reporter: Charlie Evans
Adding metadata to a column with col().as(_, metadata) and then saving the
resulting dataframe to parquet does not save the metadata. No error is thrown.
The schema contains the metadata before saving, but no longer contains it after
saving and reloading the dataframe.
{code}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

case class TestRow(a: String, b: Int)
val rows = TestRow("a", 0) :: TestRow("b", 1) :: TestRow("c", 2) :: Nil
val df = spark.createDataFrame(rows)

// Attach custom metadata to column "b" via an alias.
val md = new MetadataBuilder().putString("key", "value").build()
val dfWithMeta = df.select(col("a"), col("b").as("b", md))
println(dfWithMeta.schema.json)  // metadata for "b" is present

// Round-trip through parquet.
dfWithMeta.write.parquet("dfWithMeta")
val dfWithMeta2 = spark.read.parquet("dfWithMeta")
println(dfWithMeta2.schema.json)  // metadata for "b" is missing
{code}
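Instead of comparing the two printed schema JSON strings by eye, the round trip
can also be checked directly on the field metadata. The following is only a
minimal sketch, not part of the original report: the local SparkSession setup,
the temporary output directory, and the names mdBefore/mdAfter are assumptions
made for illustration.

{code}
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("spark-15804-check")
  .getOrCreate()
import spark.implicits._

val md = new MetadataBuilder().putString("key", "value").build()
val before = Seq(("a", 0), ("b", 1), ("c", 2)).toDF("a", "b")
  .select(col("a"), col("b").as("b", md))

// Assumption: a fresh temporary location, so the write does not hit an existing path.
val path = Files.createTempDirectory("spark-15804").resolve("dfWithMeta").toString
before.write.parquet(path)
val after = spark.read.parquet(path)

// Compare the metadata of column "b" before and after the parquet round trip.
val mdBefore = before.schema("b").metadata
val mdAfter = after.schema("b").metadata
println(s"before: ${mdBefore.json}")
println(s"after:  ${mdAfter.json}")
println(s"metadata preserved: ${mdBefore == mdAfter}")
{code}

If the behaviour described above is reproduced, the last line should report that
the metadata was not preserved.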