[
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16520523#comment-16520523
]
Matt McCline commented on HIVE-19016:
-------------------------------------
Committed to master.
[~vihangk1] thank you for your code review. A Support enumeration was recently
added to VectorizedInputFormatInterface. So, a method for examining the
TypeInfo list might be a good idea. Ideally, I think the Parquet vectorized
input format code needs to just support everything that row mode does. And, it
also looked like it could be written for better performance. I should create a
follow on JIRA.
[~teddy.choi] thank you for your code review.
> Vectorization and Parquet: Disable vectorization for nested complex types
> -------------------------------------------------------------------------
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19016.01.patch, HIVE-19016.02.patch
>
>
> Original title: Vectorization and Parquet: When vectorized,
> parquet_nested_complex.q produces RuntimeException: Unsupported type used
>
> Adding "SET hive.vectorized.execution.enabled=true;" to
> parquet_nested_complex.q triggers this call stack:
> {noformat}
> Caused by: java.lang.RuntimeException: Unsupported type used in
> list:array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<array<int>>>>>>>>>>>>>>>>>>>>>>
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
> ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)