[
https://issues.apache.org/jira/browse/HUDI-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-2958:
--------------------------------------
Status: In Progress (was: Open)
> Automatically set spark.sql.parquet.writeLegacyFormat when using bulk insert
> to write data that contains a decimal type
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HUDI-2958
> URL: https://issues.apache.org/jira/browse/HUDI-2958
> Project: Apache Hudi
> Issue Type: Improvement
> Components: Spark Integration
> Reporter: tao meng
> Priority: Minor
> Labels: pull-request-available, query-eng, sev:high
> Fix For: 0.11.0
>
>
> By default, ParquetWriteSupport writes DecimalType values to parquet as
> int32/int64 when the precision of the DecimalType does not exceed
> Decimal.MAX_LONG_DIGITS, but the AvroParquetReader used by
> HoodieParquetReader cannot read int32/int64 values back as DecimalType.
> This leads to the following error:
> Caused by: java.lang.UnsupportedOperationException: org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainIntegerDictionary
>     at org.apache.parquet.column.Dictionary.decodeToBinary(Dictionary.java:41)
>     at org.apache.parquet.avro.AvroConverters$BinaryConverter.setDictionary(AvroConverters.java:75)
> ......
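>
> A minimal sketch of the manual workaround this issue proposes to automate,
> assuming a SparkSession spark, a DataFrame df that contains a DecimalType
> column, and a target path basePath (all hypothetical names); the record key,
> precombine field, and table name below are illustrative placeholders:
> {code:scala}
> // Force Spark to write decimals as FIXED_LEN_BYTE_ARRAY instead of
> // int32/int64, so the AvroParquetReader used by HoodieParquetReader
> // can read them back as DecimalType.
> spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
>
> df.write.format("hudi").
>   option("hoodie.datasource.write.operation", "bulk_insert").
>   option("hoodie.datasource.write.recordkey.field", "id").   // hypothetical
>   option("hoodie.datasource.write.precombine.field", "ts").  // hypothetical
>   option("hoodie.table.name", "decimal_tbl").                // hypothetical
>   mode("append").
>   save(basePath)
> {code}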
--
This message was sent by Atlassian Jira
(v8.20.1#820001)