[GitHub] spark pull request: [SPARK-4176][WIP] Support decimal types with p...

rtreffer Sun, 28 Jun 2015 02:11:21 -0700

Github user rtreffer commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6796#discussion_r33420742
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/parquet/CatalystSchemaConverter.scala ---
    @@ -383,20 +386,14 @@ private[parquet] class CatalystSchemaConverter(
               .length(minBytesForPrecision(precision))
               .named(field.name)
     
    -      case dec @ DecimalType() if !followParquetFormatSpec =>
    -        throw new AnalysisException(
    -          s"Data type $dec is not supported. " +
    -            s"When ${SQLConf.PARQUET_FOLLOW_PARQUET_FORMAT_SPEC.key} is set to false," +
    -            "decimal precision and scale must be specified, " +
    -            "and precision must be less than or equal to 18.")
    -
           // =====================================
           // Decimals (follow Parquet format spec)
           // =====================================
     
    -      // Uses INT32 for 1 <= precision <= 9
    +      // Uses INT32 for 4 byte encodings / precision <= 9
           case DecimalType.Fixed(precision, scale)
    -        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
    +        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
    +          precision <= maxPrecisionForBytes(4) =>
    --- End diff --
    
    We had a debate about using the most compact storage type where possible.
    
    By that measure, INT32 loses to a 3-byte fixed-length array whenever the
    precision fits in 3 bytes, hence the lower bound on the INT32 case.
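    
    To make the size argument concrete, here is a minimal, self-contained
    sketch. It assumes that maxPrecisionForBytes(n) means "the largest number
    of decimal digits that always fits in a signed n-byte integer" and that
    minBytesForPrecision is its inverse; both bodies below are written for
    illustration and are not copied from the pull request.
    
    ```scala
    object DecimalStorageSketch {
      // Assumed semantics: number of decimal digits that always fit in a
      // signed two's-complement integer of `numBytes` bytes.
      def maxPrecisionForBytes(numBytes: Int): Int =
        math.floor(math.log10(math.pow(2, 8 * numBytes - 1) - 1)).toInt
    
      // Assumed semantics: smallest byte width able to hold `precision` digits.
      def minBytesForPrecision(precision: Int): Int =
        Iterator.from(1).find(n => maxPrecisionForBytes(n) >= precision).get
    
      def main(args: Array[String]): Unit = {
        // Print the narrowest width for every precision that INT32 could hold.
        (1 to maxPrecisionForBytes(4)).foreach { p =>
          println(s"precision $p -> ${minBytesForPrecision(p)} byte(s)")
        }
      }
    }
    ```
    
    Under these assumptions, maxPrecisionForBytes(3) is 6 and
    maxPrecisionForBytes(4) is 9, so the new guard selects INT32 only for
    precisions 7 through 9; anything up to 6 digits fits in a fixed-length
    array of 3 bytes or fewer.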
    
    On 28 June 2015 10:59:15 CEST, Cheng Lian <[email protected]> wrote:
    >>        case DecimalType.Fixed(precision, scale)
    >> -        if precision <= maxPrecisionForBytes(4) && followParquetFormatSpec =>
    >> +        if followParquetFormatSpec && maxPrecisionForBytes(3) < precision &&
    >> +          precision <= maxPrecisionForBytes(4) =>
    >
    >Why do we want `maxPrecisionForBytes(3) < precision` here? Did I miss
    >something?
    >
    >---
    >Reply to this email directly or view it on GitHub:
    >https://github.com/apache/spark/pull/6796/files#r33420647
    


