Github user rtreffer commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6796#discussion_r33420742
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
    @@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
               .length(minBytesForPrecision(precision))
               .named(field.name)
     
    -      case dec @ DecimalType() if !followParquetFormatSpec =>
    -        throw new AnalysisException(
    -          s"Data type $dec is not supported. " +
    -            s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false," +
    -            "decimal precision and scale must be specified, " +
    -            "and precision must be less than or equal to 18.")
    -
           // =====================================
           // Decimals (follow Parquet format spec)
           // =====================================
     
    -      // Uses INT32 for 1 <= precision <= 9
    +      // Uses INT32 for 4 byte encodings / precision <= 9
           case DecimalType.Fixed(precision, scale)
    -        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
    +        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
    +          precision <= maxPrecisionForBytes(4) =>
    --- End diff --
    
    We had a debate about using the most compact storage type if possible.
    
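    By that measure INT32 loses to a 3-byte fixed-length array whenever the precision fits in 3 bytes, so the new guard only picks INT32 when 4 bytes are actually needed.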
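
    To make the arithmetic concrete, here is a rough, self-contained sketch. The names `maxPrecisionForBytes` and `minBytesForPrecision` come from the diff, but the bodies below are my own assumption of what they compute (not copied from Spark), and `usesInt32` is a hypothetical helper that only mirrors the guard:

    ```scala
    object DecimalStorageSketch {
      // Largest decimal precision whose unscaled values all fit into a signed
      // two's-complement integer of `numBytes` bytes (assumed semantics).
      def maxPrecisionForBytes(numBytes: Int): Int = {
        val maxValue = BigInt(2).pow(8 * numBytes - 1) - 1
        // maxValue has d digits, so every (d - 1)-digit number fits.
        maxValue.toString.length - 1
      }

      // Smallest byte count able to hold every unscaled value of `precision`.
      def minBytesForPrecision(precision: Int): Int = {
        require(precision >= 1, "precision must be positive")
        Iterator.from(1).find(n => maxPrecisionForBytes(n) >= precision).get
      }

      // Hypothetical helper mirroring the guard in the diff: INT32 is chosen
      // only when the precision does not fit in 3 bytes but does fit in 4.
      def usesInt32(precision: Int): Boolean =
        maxPrecisionForBytes(3) < precision && precision <= maxPrecisionForBytes(4)

      def main(args: Array[String]): Unit =
        (1 to 9).foreach { p =>
          println(s"precision=$p minBytes=${minBytesForPrecision(p)} int32=${usesInt32(p)}")
        }
    }
    ```

    Under these assumptions 3 bytes cover precision up to 6 and 4 bytes cover precision up to 9, so the guard selects INT32 only for precision 7 to 9 and leaves smaller precisions to the shorter fixed-length encoding.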
    
    On 28 June 2015 10:59:15 CEST, Cheng Lian <notificati...@github.com> wrote:
    >>        case DecimalType.Fixed(precision, scale)
    >> -        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
    >> +        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
    >> +          precision <= maxPrecisionForBytes(4) =>
    >
    >Why do we want `maxPrecisionForBytes(3) < precision` here? Did I miss
    >something?
    >
    >---
    >Reply to this email directly or view it on GitHub:
    >https://github.com/apache/spark/pull/6796/files#r33420647
    

