Repository: spark Updated Branches: refs/heads/master 44a71741d -> cf5c9c4b5
[SPARK-20937][DOCS] Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

## What changes were proposed in this pull request?
Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide.

## How was this patch tested?
N/A

Closes #22453 from seancxmao/SPARK-20937.

Authored-by: seancxmao <seancx...@gmail.com>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cf5c9c4b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cf5c9c4b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cf5c9c4b

Branch: refs/heads/master
Commit: cf5c9c4b550c3a8ed59d7ef9404f2689ea763fa9
Parents: 44a7174
Author: seancxmao <seancx...@gmail.com>
Authored: Wed Sep 26 22:14:14 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Wed Sep 26 22:14:14 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md                                 | 11 +++++++++++
 .../scala/org/apache/spark/sql/internal/SQLConf.scala         |  7 +++++--
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cf5c9c4b/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index c72fa3d..6de9de9 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1004,6 +1004,17 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </p>
   </td>
 </tr>
+<tr>
+  <td><code>spark.sql.parquet.writeLegacyFormat</code></td>
+  <td>false</td>
+  <td>
+    If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal values
+    will be written in Apache Parquet's fixed-length byte array format, which other systems such as
+    Apache Hive and Apache Impala use. If false, the newer format in Parquet will be used. For
+    example, decimals will be written in int-based format. If Parquet output is intended for use
+    with systems that do not support this newer format, set to true.
+  </td>
+</tr>
 </table>

 ## ORC Files


http://git-wip-us.apache.org/repos/asf/spark/blob/cf5c9c4b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index e7c9a83..2f4d660 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -451,8 +451,11 @@ object SQLConf {
     .createWithDefault(10)

   val PARQUET_WRITE_LEGACY_FORMAT = buildConf("spark.sql.parquet.writeLegacyFormat")
-    .doc("Whether to be compatible with the legacy Parquet format adopted by Spark 1.4 and prior " +
-      "versions, when converting Parquet schema to Spark SQL schema and vice versa.")
+    .doc("If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal " +
+      "values will be written in Apache Parquet's fixed-length byte array format, which other " +
+      "systems such as Apache Hive and Apache Impala use. If false, the newer format in Parquet " +
+      "will be used. For example, decimals will be written in int-based format. If Parquet " +
+      "output is intended for use with systems that do not support this newer format, set to true.")
     .booleanConf
     .createWithDefault(false)
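The documented behavior above can be exercised end-to-end. Below is a minimal sketch, not part of the patch, assuming a local Spark session; the object name and output path (`/tmp/legacy_decimals`) are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object WriteLegacyFormatDemo {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("writeLegacyFormatDemo")
      .getOrCreate()

    // With the flag enabled, decimals are written as Parquet fixed-length
    // byte arrays (the Spark 1.4-era layout that Hive and Impala read);
    // with the default (false), small decimals use the int-based encoding.
    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

    import spark.implicits._
    val df = Seq(BigDecimal("123.45"), BigDecimal("678.90")).toDF("amount")
    df.write.mode("overwrite").parquet("/tmp/legacy_decimals")

    spark.stop()
  }
}
```

The same flag can also be set at launch time, e.g. `--conf spark.sql.parquet.writeLegacyFormat=true` with `spark-submit`, since it is an ordinary SQL configuration.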