[
https://issues.apache.org/jira/browse/HIVE-25893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Soumyakanti Das reassigned HIVE-25893:
--------------------------------------
> NPE when reading Parquet data because ColumnVector isNull[] is not updated
> --------------------------------------------------------------------------
>
> Key: HIVE-25893
> URL: https://issues.apache.org/jira/browse/HIVE-25893
> Project: Hive
> Issue Type: Bug
> Reporter: Soumyakanti Das
> Assignee: Soumyakanti Das
> Priority: Major
>
> In
> [VectorizedListColumnReader.java|https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java]
> {{isNull[]}} is used in the comparison methods ( eg.
> [columnVectorsDifferNullForSameIndex
> |https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java#L524]
> ), however, {{isNull}} is always {{false}} as it is never updated in
> [getChildData|https://github.com/apache/hive/blob/595f3bc9d612f02581bd3377ee0107efd6553ae6/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedListColumnReader.java#L401].
> This could result in NullPointerException like,
> {code}
> Caused by: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.compareBytesColumnVector(VectorizedListColumnReader.java:506)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.compareColumnVector(VectorizedListColumnReader.java:432)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.setIsRepeating(VectorizedListColumnReader.java:367)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.convertValueListToListColumnVector(VectorizedListColumnReader.java:360)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedListColumnReader.readBatch(VectorizedListColumnReader.java:83)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedMapColumnReader.readBatch(VectorizedMapColumnReader.java:57)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:438)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:377)
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:100)
> at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:375)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)