Ganesha Shreedhara created HIVE-22670:
-----------------------------------------
             Summary: ArrayIndexOutOfBoundsException when vectorized reader is used for reading a parquet file
                 Key: HIVE-22670
                 URL: https://issues.apache.org/jira/browse/HIVE-22670
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.3.6, 3.1.2
            Reporter: Ganesha Shreedhara
            Assignee: Ganesha Shreedhara


An ArrayIndexOutOfBoundsException is thrown while decoding the dictionaryIds of a row group in a Parquet file when vectorization is enabled.

*Exception stack trace:*

{code:java}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
	at org.apache.parquet.column.values.dictionary.PlainValuesDictionary$PlainBinaryDictionary.decodeToBinary(PlainValuesDictionary.java:122)
	at org.apache.hadoop.hive.ql.io.parquet.vector.ParquetDataColumnReaderFactory$DefaultParquetDataColumnReader.readString(ParquetDataColumnReaderFactory.java:95)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedPrimitiveColumnReader.decodeDictionaryIds(VectorizedPrimitiveColumnReader.java:467)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedPrimitiveColumnReader.readBatch(VectorizedPrimitiveColumnReader.java:68)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:410)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
	at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
	... 24 more
{code}

This issue seems to be caused by re-using the same dictionary column vector while reading consecutive row groups. It looks like a corner-case bug that occurs only for a certain distribution of dictionary-encoded and plain-encoded data, at the point where the underlying bit-packed dictionary data is read and populated into a column-vector-based data structure.
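
The suspected cause (a dictionary-id vector that is re-used across row groups) can be illustrated with a minimal, self-contained Java sketch. This is not the Hive or Parquet code path: the class `DictionaryReuseSketch`, its nested `Dictionary` type, and the methods `readDictionaryEncoded`, `decodeStaleIds`, and `startRowGroup` are all hypothetical names made up for this illustration. The sketch assumes that ids left over from one row group get decoded against the dictionary of the next one. The message `0` in the trace is the rejected index, which would be consistent with a lookup against an empty dictionary array, so the second dictionary here is empty.

```java
/** Hypothetical, simplified model of a reader that re-uses a dictionary-id vector. */
final class DictionaryReuseSketch {

    /** Stand-in for a per-row-group dictionary page: id -> value. */
    static final class Dictionary {
        private final String[] values;

        Dictionary(String... values) {
            this.values = values;
        }

        /** Throws ArrayIndexOutOfBoundsException if the id is not in this dictionary. */
        String decode(int id) {
            return values[id];
        }
    }

    // Scratch vector of dictionary ids, shared across row groups (the assumed problem).
    private final int[] dictionaryIds;
    private int idCount = 0;

    DictionaryReuseSketch(int capacity) {
        this.dictionaryIds = new int[capacity];
    }

    /** Fills the shared id vector from a dictionary-encoded page, then decodes it. */
    String[] readDictionaryEncoded(Dictionary dictionary, int[] encodedIds) {
        System.arraycopy(encodedIds, 0, dictionaryIds, 0, encodedIds.length);
        idCount = encodedIds.length;
        return decodeIds(dictionary);
    }

    /** Assumed buggy path: decodes whatever ids are still in the shared vector. */
    String[] decodeStaleIds(Dictionary dictionary) {
        return decodeIds(dictionary);
    }

    /** One possible guard: drop any leftover ids when a new row group starts. */
    void startRowGroup() {
        idCount = 0;
    }

    private String[] decodeIds(Dictionary dictionary) {
        String[] out = new String[idCount];
        for (int i = 0; i < idCount; i++) {
            out[i] = dictionary.decode(dictionaryIds[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        DictionaryReuseSketch reader = new DictionaryReuseSketch(4);

        // Row group 1: dictionary-encoded, ids are valid for this dictionary.
        Dictionary first = new Dictionary("a", "b", "c");
        System.out.println(String.join(",", reader.readDictionaryEncoded(first, new int[] {0, 2, 1})));

        // Row group 2: a different (here empty) dictionary, but the old ids are still around.
        Dictionary second = new Dictionary();
        try {
            reader.decodeStaleIds(second);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("stale ids failed: " + e.getMessage());
        }

        // With the guard, nothing stale is left to decode.
        reader.startRowGroup();
        System.out.println("decoded after reset: " + reader.decodeStaleIds(second).length);
    }
}
```

Resetting the shared state at the row-group boundary is only one way to avoid the stale lookup in this toy model; whether the actual fix belongs there or in the encoding-switch handling of the real reader is not established by this report.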