[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516871#comment-16516871 ] Matt McCline commented on HIVE-19016: - Adding full nested support for complex types is "complex" to say the least. For now, just disabling vectorization of PARQUET when nested complex types detected. > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-19016.01.patch > > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516336#comment-16516336 ] Vihang Karajgaonkar commented on HIVE-19016: Sure. Thanks for offering the help [~mmccline] > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516319#comment-16516319 ] Matt McCline commented on HIVE-19016: - [~vihangk1] [~jerrychenhf] Should I assign this to me? I like to get it finished soon. Thank You > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505733#comment-16505733 ] Matt McCline commented on HIVE-19016: - Any way I can help move this issue forward? > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Haifeng Chen >Priority: Critical > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469822#comment-16469822 ] Haifeng Chen commented on HIVE-19016: - [~vihangk1] I will assign to to me if you have not yet started the work. > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used
[ https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469808#comment-16469808 ] Haifeng Chen commented on HIVE-19016: - [~vihangk1] I have some research on this. And the nested complex type (nested struct, map and list) is not yet implement in Parquet vectorized reader. I got a study on the details and trying figure out a implementation. I will try to work out a patch. > Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces > RuntimeException: Unsupported type used > - > > Key: HIVE-19016 > URL: https://issues.apache.org/jira/browse/HIVE-19016 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 3.0.0 >Reporter: Matt McCline >Assignee: Vihang Karajgaonkar >Priority: Critical > > Adding "SET hive.vectorized.execution.enabled=true;" to > parquet_nested_complex.q triggers this call stack: > {noformat} > Caused by: java.lang.RuntimeException: Unsupported type used in > list:array> at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > {noformat} > FYI: [~vihangk1] -- This message was sent by Atlassian JIRA (v7.6.3#76005)