Repository: spark
Updated Branches:
  refs/heads/branch-2.4 3f203050a -> d44b863a2
[SPARK-20937][DOCS] Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

## What changes were proposed in this pull request?
Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide.

## How was this patch tested?
N/A

Closes #22453 from seancxmao/SPARK-20937.

Authored-by: seancxmao <seancx...@gmail.com>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>
(cherry picked from commit cf5c9c4b550c3a8ed59d7ef9404f2689ea763fa9)
Signed-off-by: hyukjinkwon <gurwls...@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d44b863a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d44b863a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d44b863a

Branch: refs/heads/branch-2.4
Commit: d44b863a2d58d3b57af1e8aa1550c6e925446032
Parents: 3f20305
Author: seancxmao <seancx...@gmail.com>
Authored: Wed Sep 26 22:14:14 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Wed Sep 26 22:14:27 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md                                   | 11 +++++++++++
 .../scala/org/apache/spark/sql/internal/SQLConf.scala           |  7 +++++--
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d44b863a/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index b5302bb..2546064 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1002,6 +1002,17 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
     </p>
   </td>
 </tr>
+<tr>
+  <td><code>spark.sql.parquet.writeLegacyFormat</code></td>
+  <td>false</td>
+  <td>
+    If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal values
+    will be written in Apache Parquet's fixed-length byte array format, which other systems such as
+    Apache Hive and Apache Impala use. If false, the newer format in Parquet will be used. For
+    example, decimals will be written in int-based format. If Parquet output is intended for use
+    with systems that do not support this newer format, set to true.
+  </td>
+</tr>
 </table>

 ## ORC Files


http://git-wip-us.apache.org/repos/asf/spark/blob/d44b863a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 68daf9d..bacd5e9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -431,8 +431,11 @@ object SQLConf {
     .createWithDefault(10)

   val PARQUET_WRITE_LEGACY_FORMAT = buildConf("spark.sql.parquet.writeLegacyFormat")
-    .doc("Whether to be compatible with the legacy Parquet format adopted by Spark 1.4 and prior " +
-      "versions, when converting Parquet schema to Spark SQL schema and vice versa.")
+    .doc("If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal " +
+      "values will be written in Apache Parquet's fixed-length byte array format, which other " +
+      "systems such as Apache Hive and Apache Impala use. If false, the newer format in Parquet " +
+      "will be used. For example, decimals will be written in int-based format. If Parquet " +
+      "output is intended for use with systems that do not support this newer format, set to true.")
     .booleanConf
     .createWithDefault(false)
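For readers who want to try out the property documented above, here is a minimal Scala sketch (not part of the commit) of setting it at runtime before writing Parquet output intended for systems such as Apache Hive or Impala. The application name, output path, and column names are hypothetical and only for illustration.

```scala
// Minimal sketch, assuming a local Spark 2.4 session; names and paths are hypothetical.
import org.apache.spark.sql.SparkSession

object WriteLegacyParquetExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-legacy-parquet-example")   // hypothetical app name
      .master("local[*]")                        // assumption: local run for illustration
      .getOrCreate()

    // Ask Spark to write decimals in Parquet's fixed-length byte array layout
    // (the Spark 1.4-era format) so that readers which do not understand the
    // int-based encoding can consume the files.
    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

    import spark.implicits._
    val df = Seq((1, BigDecimal("12.34")), (2, BigDecimal("56.78")))
      .toDF("id", "amount")                      // hypothetical column names

    df.write.mode("overwrite").parquet("/tmp/legacy_parquet_example")  // hypothetical path

    spark.stop()
  }
}
```

Since this is an ordinary SQL configuration, it can also be supplied at submit time (for example via `--conf spark.sql.parquet.writeLegacyFormat=true`); the default stays `false`, as the diff above shows.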