[jira] [Updated] (HIVE-21987) Hive is unable to read Parquet int32 annotated with decimal

Nandor Kollar (JIRA) Thu, 11 Jul 2019 08:34:07 -0700


     [ https://issues.apache.org/jira/browse/HIVE-21987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Nandor Kollar updated HIVE-21987:
---------------------------------
    Description: 
When I tried to read a Parquet file through a Hive table (Tez execution 
engine) that has a small decimal column, I got the following exception:
{code}
Caused by: java.lang.UnsupportedOperationException: org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$1
        at org.apache.parquet.io.api.PrimitiveConverter.addInt(PrimitiveConverter.java:98)
        at org.apache.parquet.column.impl.ColumnReaderImpl$2$3.writeValue(ColumnReaderImpl.java:248)
        at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:367)
        at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226)
        ... 28 more
{code}
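
The top frame is {{PrimitiveConverter.addInt}}, whose default implementation in parquet-mr throws {{UnsupportedOperationException}} carrying the converter's class name. The trace therefore suggests that the converter Hive selects for this column does not override the int32 entry point. Below is a minimal Python model of that pattern. It is not Hive or Parquet code, and the names {{PrimitiveConverterModel}}, {{BinaryOnlyDecimalConverter}} and {{feed_int32}} are hypothetical:

```python
class PrimitiveConverterModel:
    """Hypothetical stand-in for a converter base class that rejects every input by default."""

    def add_int(self, value):
        # Default behaviour: refuse the value and report the concrete class name.
        raise NotImplementedError(type(self).__name__)

    def add_binary(self, value):
        raise NotImplementedError(type(self).__name__)


class BinaryOnlyDecimalConverter(PrimitiveConverterModel):
    """Hypothetical decimal converter that only accepts binary-encoded values."""

    def __init__(self):
        self.values = []

    def add_binary(self, value):
        # Interpret the bytes as a big-endian two's-complement unscaled integer.
        self.values.append(int.from_bytes(value, "big", signed=True))


def feed_int32(converter, value):
    """Model a column reader handing an int32 value to the converter."""
    try:
        converter.add_int(value)
        return "ok"
    except NotImplementedError as exc:
        return f"unsupported: {exc}"
```

In this model, feeding an int32 value to the binary-only converter fails in the same way as the trace above, while the binary path works.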

Steps to reproduce:
- Create a Hive table with a single decimal(4, 2) column
- Create a Parquet file with an int32 column annotated with the decimal(4, 2) 
logical type and put it into the table's location (or use the attached 
Parquet file)
- Execute a {{select *}} on this table

I suspect that similar problems can occur with int64 decimals as well. The 
[Parquet specification|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md] 
allows both of these cases.
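
For reference, the specification stores such a decimal as an unscaled integer: the value is unscaled * 10^(-scale), with int32 permitted for precision 1 to 9 and int64 for precision 1 to 18. The following is a hedged sketch of the decoding a reader has to perform. The function name {{decode_int_decimal}} and the {{MAX_PRECISION}} table are made up for illustration and are not Hive or Parquet APIs:

```python
from decimal import Decimal

# Maximum decimal precision per integer physical type (Parquet logical types spec).
MAX_PRECISION = {"int32": 9, "int64": 18}


def decode_int_decimal(unscaled, precision, scale, physical_type="int32"):
    """Turn an unscaled integer into a Decimal with the given scale."""
    if not 1 <= precision <= MAX_PRECISION[physical_type]:
        raise ValueError(f"precision {precision} not allowed for {physical_type}")
    if not 0 <= scale <= precision:
        raise ValueError(f"scale {scale} must be between 0 and precision {precision}")
    if abs(unscaled) >= 10 ** precision:
        raise ValueError(f"{unscaled} does not fit in precision {precision}")
    # scaleb(-scale) shifts the decimal point left by `scale` digits exactly.
    return Decimal(unscaled).scaleb(-scale)
```

With the decimal(4, 2) column from the steps above, {{decode_int_decimal(1234, 4, 2)}} yields {{Decimal('12.34')}}. The int64 case differs only in the allowed precision.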

  was:
When I tried to read a Parquet file through a Hive table (Tez execution 
engine) that has a small decimal column, I got the following exception:
{code}
Caused by: java.lang.UnsupportedOperationException: org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$8$1
        at org.apache.parquet.io.api.PrimitiveConverter.addInt(PrimitiveConverter.java:98)
        at org.apache.parquet.column.impl.ColumnReaderImpl$2$3.writeValue(ColumnReaderImpl.java:248)
        at org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:367)
        at org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:406)
        at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:226)
        ... 28 more
{code}

Steps to reproduce:
- Create a Hive table with a single decimal(4, 2) column
- Create a Parquet file with an int32 column annotated with the decimal(4, 2) 
logical type and put it into the table's location (or use the attached 
Parquet file)
- Execute a {{select *}} on this table


> Hive is unable to read Parquet int32 annotated with decimal
> -----------------------------------------------------------
>
>                 Key: HIVE-21987
>                 URL: https://issues.apache.org/jira/browse/HIVE-21987
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Nandor Kollar
>            Priority: Major
>         Attachments: 
> part-00000-d6ee992d-ef56-4384-8855-5a170d3e3660-c000.snappy.parquet



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
