[
https://issues.apache.org/jira/browse/SPARK-21786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598432#comment-16598432
]
Apache Spark commented on SPARK-21786:
--------------------------------------
User 'fjh100456' has created a pull request for this issue:
https://github.com/apache/spark/pull/22301
> The 'spark.sql.parquet.compression.codec' configuration doesn't take effect
> on tables with partition field(s)
> -------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-21786
> URL: https://issues.apache.org/jira/browse/SPARK-21786
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Jinhua Fu
> Assignee: Jinhua Fu
> Priority: Major
> Fix For: 2.3.0
>
>
> Since Hive 1.1, Hive allows users to set the Parquet compression codec via
> the table-level property parquet.compression. See the JIRA:
> https://issues.apache.org/jira/browse/HIVE-7858 . We already support
> orc.compression for ORC. Thus, for external users, it is more straightforward
> to support both. See the Stack Overflow question:
> https://stackoverflow.com/questions/36941122/spark-sql-ignores-parquet-compression-propertie-specified-in-tblproperties
> On the Spark side, our table-level compression conf, compression, was added by
> #11464 in Spark 2.0.
> We need to support both table-level confs. Users might also use the
> session-level conf spark.sql.parquet.compression.codec. The priority rule is:
> if a compression codec is also configured through Hive or Parquet properties,
> the order of precedence is compression, parquet.compression,
> spark.sql.parquet.compression.codec. Acceptable values include: none,
> uncompressed, snappy, gzip, lzo.
> With this change, the rule for Parquet is consistent with the one for ORC.
> Changes:
> 1. Added reading 'compressionCodecClassName' from parquet.compression; the
> precedence order is compression, parquet.compression,
> spark.sql.parquet.compression.codec, just like what we do in OrcOptions.
> 2. Changed spark.sql.parquet.compression.codec to support "none".
> ParquetOptions already treats "none" as equivalent to "uncompressed", but
> the configuration could not be set to "none".
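
The precedence rule described in the issue can be illustrated with a small standalone sketch. This is not Spark's ParquetOptions code: the function name resolve_parquet_codec, the two plain dicts standing in for table-level options and session conf, the case-insensitive matching, and the None result when nothing is set are all assumptions made for illustration. Only the three keys, their order, the accepted values, and the "none"/"uncompressed" equivalence come from the description above.

```python
# Illustrative sketch only; not Spark's actual implementation.
# Accepted values as listed in the issue description.
ACCEPTED_CODECS = {"none", "uncompressed", "snappy", "gzip", "lzo"}


def resolve_parquet_codec(table_options, session_conf):
    """Pick the Parquet codec using the precedence from the issue:
    compression > parquet.compression > spark.sql.parquet.compression.codec.

    Both arguments are plain dicts (hypothetical stand-ins for the table-level
    options and the session-level configuration).
    """
    candidates = (
        table_options.get("compression"),
        table_options.get("parquet.compression"),
        session_conf.get("spark.sql.parquet.compression.codec"),
    )
    for value in candidates:
        if value is None:
            continue  # not set at this level, fall through to the next one
        codec = value.lower()  # assumption: values are matched case-insensitively
        if codec not in ACCEPTED_CODECS:
            raise ValueError(f"Unsupported Parquet codec: {value!r}")
        # "none" is treated as an alias for "uncompressed"
        return "uncompressed" if codec == "none" else codec
    return None  # assumption: nothing configured at any level


if __name__ == "__main__":
    session = {"spark.sql.parquet.compression.codec": "snappy"}
    # The table-level parquet.compression wins over the session-level conf.
    print(resolve_parquet_codec({"parquet.compression": "gzip"}, session))
```

In this sketch the first configured level wins outright, so an invalid value at a higher-priority level raises an error instead of falling back to a lower one; whether Spark behaves the same way is not stated in the issue.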