David Herskovics created SPARK-24733:
----------------------------------------
Summary: Dataframe saved to parquet can have different metadata than the resulting parquet file
Key: SPARK-24733
URL: https://issues.apache.org/jira/browse/SPARK-24733
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.3.0
Reporter: David Herskovics

See the repro using spark-shell below:

Suppose we have a DataFrame called *df_with_metadata* that has a column *name* with metadata.

{code:scala}
scala> df_with_metadata.schema.json
// Check that we have the metadata here.

scala> df_with_metadata.createOrReplaceTempView("input")

scala> val df2 = spark.sql("select case when true then name else null end as name from input")

scala> df2.schema.json
// We don't have the metadata anymore.

scala> df2.write.parquet("no_metadata_expected")

scala> val df3 = spark.read.parquet("no_metadata_expected")

scala> df3.schema.json
// The metadata is back, so the files under no_metadata_expected do contain the metadata.
{code}
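
The repro assumes that *df_with_metadata* already exists. A minimal sketch of one way to build such a DataFrame in spark-shell is shown here; the sample rows, the metadata key "comment", and its value are hypothetical placeholders and are not taken from the report.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.MetadataBuilder

// In spark-shell, getOrCreate() returns the session that already exists.
val session = SparkSession.builder()
  .master("local[*]")
  .appName("SPARK-24733-setup")
  .getOrCreate()

// Hypothetical metadata: the key and value are placeholders for illustration.
val nameMetadata = new MetadataBuilder()
  .putString("comment", "example metadata")
  .build()

// Build a one-column DataFrame, then re-alias the column to attach the metadata.
val df_with_metadata = session
  .createDataFrame(Seq(Tuple1("alice"), Tuple1("bob")))
  .toDF("name")
  .select(col("name").as("name", nameMetadata))

// Print the metadata attached to the "name" field of the schema.
println(df_with_metadata.schema("name").metadata.json)
```

With a DataFrame built this way, the steps in the repro can be run unchanged, since the session returned here is the same one that spark-shell exposes as spark.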