Github user fjh100456 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19218#discussion_r158457806
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetOptions.scala ---
@@ -42,8 +43,15 @@ private[parquet] class ParquetOptions(
    * Acceptable values are defined in [[shortParquetCompressionCodecNames]].
    */
   val compressionCodecClassName: String = {
-    val codecName = parameters.getOrElse("compression",
-      sqlConf.parquetCompressionCodec).toLowerCase(Locale.ROOT)
+    // `compression`, `parquet.compression`(i.e., ParquetOutputFormat.COMPRESSION), and
+    // `spark.sql.parquet.compression.codec`
+    // are in order of precedence from highest to lowest.
+    val parquetCompressionConf = parameters.get(ParquetOutputFormat.COMPRESSION)
+    val codecName = parameters
+      .get("compression")
+      .orElse(parquetCompressionConf)
--- End diff --
Yes, it's new. I guess `ParquetOptions` was not used when writing Hive tables
before, because it was not visible to Hive. I changed it to `public`.
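
For illustration, here is a minimal sketch (not part of the diff above) of how this
precedence plays out on the write path; the local SparkSession setup and the output
path are just assumptions for the example:

// Minimal sketch, assuming a local SparkSession and a throwaway output path.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("parquet-compression-demo")
  .master("local[*]")
  .getOrCreate()

// Session-wide default: lowest precedence.
spark.conf.set("spark.sql.parquet.compression.codec", "snappy")

// `compression` has the highest precedence, then `parquet.compression`
// (i.e., ParquetOutputFormat.COMPRESSION), then the SQL conf,
// so this write ends up compressed with gzip.
spark.range(10).write
  .option("parquet.compression", "uncompressed")
  .option("compression", "gzip")
  .parquet("/tmp/parquet_compression_demo")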
---