Repository: spark Updated Branches: refs/heads/master 44a71741d -> cf5c9c4b5
[SPARK-20937][DOCS] Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

## What changes were proposed in this pull request?
Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide.

## How was this patch tested?
N/A

Closes #22453 from seancxmao/SPARK-20937.

Authored-by: seancxmao <seancx...@gmail.com>
Signed-off-by: hyukjinkwon <gurwls...@apache.org>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cf5c9c4b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cf5c9c4b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cf5c9c4b

Branch: refs/heads/master
Commit: cf5c9c4b550c3a8ed59d7ef9404f2689ea763fa9
Parents: 44a7174
Author: seancxmao <seancx...@gmail.com>
Authored: Wed Sep 26 22:14:14 2018 +0800
Committer: hyukjinkwon <gurwls...@apache.org>
Committed: Wed Sep 26 22:14:14 2018 +0800

----------------------------------------------------------------------
 docs/sql-programming-guide.md                                 | 11 +++++++++++
 .../scala/org/apache/spark/sql/internal/SQLConf.scala         |  7 +++++--
 2 files changed, 16 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/cf5c9c4b/docs/sql-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md
index c72fa3d..6de9de9 100644
--- a/docs/sql-programming-guide.md
+++ b/docs/sql-programming-guide.md
@@ -1004,6 +1004,17 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession
   </p>
   </td>
 </tr>
+<tr>
+  <td><code>spark.sql.parquet.writeLegacyFormat</code></td>
+  <td>false</td>
+  <td>
+    If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal values
+    will be written in Apache Parquet's fixed-length byte array format, which other systems such as
+    Apache Hive and Apache Impala use. If false, the newer format in Parquet will be used. For
+    example, decimals will be written in int-based format. If Parquet output is intended for use
+    with systems that do not support this newer format, set to true.
+  </td>
+</tr>
 </table>

 ## ORC Files


http://git-wip-us.apache.org/repos/asf/spark/blob/cf5c9c4b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index e7c9a83..2f4d660 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -451,8 +451,11 @@ object SQLConf {
     .createWithDefault(10)

   val PARQUET_WRITE_LEGACY_FORMAT = buildConf("spark.sql.parquet.writeLegacyFormat")
-    .doc("Whether to be compatible with the legacy Parquet format adopted by Spark 1.4 and prior " +
-      "versions, when converting Parquet schema to Spark SQL schema and vice versa.")
+    .doc("If true, data will be written in a way of Spark 1.4 and earlier. For example, decimal " +
+      "values will be written in Apache Parquet's fixed-length byte array format, which other " +
+      "systems such as Apache Hive and Apache Impala use. If false, the newer format in Parquet " +
+      "will be used. For example, decimals will be written in int-based format. If Parquet " +
+      "output is intended for use with systems that do not support this newer format, set to true.")
     .booleanConf
     .createWithDefault(false)
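The documented behavior above can be exercised end-to-end. Below is a minimal sketch, not part of the patch, assuming a local Spark session; the object name and output path (`/tmp/legacy_decimals`) are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object WriteLegacyFormatDemo {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("writeLegacyFormatDemo")
      .getOrCreate()

    // With the flag enabled, decimals are written as Parquet fixed-length
    // byte arrays (the Spark 1.4-era layout that Hive and Impala read);
    // with the default (false), small decimals use the int-based encoding.
    spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

    import spark.implicits._
    val df = Seq(BigDecimal("123.45"), BigDecimal("678.90")).toDF("amount")
    df.write.mode("overwrite").parquet("/tmp/legacy_decimals")

    spark.stop()
  }
}
```

The same flag can also be set at launch time, e.g. `--conf spark.sql.parquet.writeLegacyFormat=true` with `spark-submit`, since it is an ordinary SQL configuration.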