Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/22453#discussion_r219719166
--- Diff: docs/sql-programming-guide.md ---
@@ -1002,6 +1002,15 @@ Configuration of Parquet can be done using the `setConf` method on `SparkSession`
</p>
</td>
</tr>
+<tr>
+ <td><code>spark.sql.parquet.writeLegacyFormat</code></td>
--- End diff --
@srowen, actually, this configuration is specifically related to
compatibility with other systems such as Impala (not only old Spark versions),
where decimals are written in a fixed-length binary format (nowadays Spark
writes them in an int-based format). If this configuration is not enabled,
those systems are unable to read what Spark wrote.
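For illustration, a minimal sketch of enabling the flag before a write; the
app name, the example DataFrame, and the output path are all hypothetical:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session; in spark-shell the existing `spark` would be used.
val spark = SparkSession.builder().appName("legacy-parquet-write").getOrCreate()

// Write Parquet in the legacy (Spark 1.4-and-earlier) format, so that
// decimals are stored as fixed-length byte arrays readable by Impala.
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

// Hypothetical example: a small decimal column written to a hypothetical path.
val df = spark.sql("SELECT CAST(1.23 AS DECIMAL(10, 2)) AS amount")
df.write.parquet("/tmp/legacy_decimals")
```

Without the flag, small-precision decimals end up as INT32/INT64 columns in
the Parquet schema, which is what trips up readers expecting the fixed-length
binary encoding.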
Given
https://stackoverflow.com/questions/44279870/why-cant-impala-read-parquet-files-after-spark-sqls-write
and JIRAs like
[SPARK-20297](https://issues.apache.org/jira/browse/SPARK-20297), I think this
configuration is fairly important; I actually expected more documentation
about it in the first place.
Personally, I have been thinking it would be better to keep this configuration
after 3.0 as well, for better compatibility.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]