[ https://issues.apache.org/jira/browse/HIVE-8909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220487#comment-14220487 ]
Brock Noland commented on HIVE-8909:
------------------------------------
Not sure which test this is from:
{noformat}
Caused by: parquet.io.ParquetDecodingException: Can not read value at 0 in block 0 in file pfile:/Users/noland/workspaces/hive-apache/hive/itests/qtest/target/warehouse/parquet_jointable2/000000_0
    at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:213)
    at parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:204)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:102)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:71)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:71)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)
    ... 16 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
    at org.apache.hadoop.hive.ql.io.parquet.convert.HiveStructConverter.set(HiveStructConverter.java:96)
    at org.apache.hadoop.hive.ql.io.parquet.convert.ETypeConverter$BinaryConverter.addBinary(ETypeConverter.java:219)
    at parquet.column.impl.ColumnReaderImpl$2$6.writeValue(ColumnReaderImpl.java:306)
    at parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:353)
    at parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:402)
    at parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:194)
    ... 21 more
{noformat}
All Parquet tests that failed with the patch:
{noformat}
<testcase name="testCliDriver_parquet_array_null_element" classname="org.apache.hadoop.hive.cli.TestCliDriver" time="4.945">
<testcase name="testCliDriver_parquet_create" classname="org.apache.hadoop.hive.cli.TestCliDriver" time="4.416">
<testcase name="testCliDriver_parquet_decimal" classname="org.apache.hadoop.hive.cli.TestCliDriver" time="5.478">
<testcase name="testCliDriver_parquet_join" classname="org.apache.hadoop.hive.cli.TestCliDriver" time="8.928">
<testcase name="testCliDriver_parquet_types" classname="org.apache.hadoop.hive.cli.TestCliDriver" time="4.094">
{noformat}
> Hive doesn't correctly read Parquet nested types
> ------------------------------------------------
>
> Key: HIVE-8909
> URL: https://issues.apache.org/jira/browse/HIVE-8909
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.13.1
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Attachments: HIVE-8909-1.patch, HIVE-8909-2.patch, HIVE-8909.2.patch,
> HIVE-8909.3.patch, parquet-test-data.tar.gz
>
>
> Parquet's Avro and Thrift object models don't produce the same parquet type
> representation for lists and maps that Hive does. In the Parquet community,
> we've defined what should be written and backward-compatibility rules for
> existing data written by parquet-avro and parquet-thrift in PARQUET-113. We
> need to implement those rules in the Hive Converter classes.