[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-06-19 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516871#comment-16516871
 ] 

Matt McCline commented on HIVE-19016:
-

Adding full nested support for complex types is "complex," to say the least.

For now, I am just disabling vectorization of PARQUET when nested complex
types are detected.
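
As a rough illustration of what "detecting nested complex types" could look
like, here is a minimal sketch. It is not taken from HIVE-19016.01.patch; the
class and method names are hypothetical, and it assumes column types arrive as
ordinary Hive type strings such as array<array<int>>, where every complex type
(array, map, struct, uniontype) opens one level of angle brackets.

```java
// Hypothetical helper for illustration only; not the code in the actual patch.
final class NestedComplexTypeCheck {

    private NestedComplexTypeCheck() {
    }

    /** Returns the deepest level of '<' nesting found in a Hive type string. */
    static int maxComplexDepth(String typeString) {
        int depth = 0;
        int max = 0;
        for (int i = 0; i < typeString.length(); i++) {
            char c = typeString.charAt(i);
            if (c == '<') {
                depth++;
                max = Math.max(max, depth);
            } else if (c == '>') {
                depth--;
            }
        }
        return max;
    }

    /** True when a complex type contains another complex type (depth above 1). */
    static boolean hasNestedComplexType(String typeString) {
        return maxComplexDepth(typeString) > 1;
    }

    /** True when no column type in the list is a nested complex type. */
    static boolean canVectorize(java.util.List<String> columnTypes) {
        for (String type : columnTypes) {
            if (hasNestedComplexType(type)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A flat list stays eligible; a list of lists does not.
        System.out.println(hasNestedComplexType("array<int>"));
        System.out.println(hasNestedComplexType("array<array<int>>"));
        System.out.println(canVectorize(
            java.util.Arrays.asList("int", "struct<a:int,b:map<string,int>>")));
    }
}
```

A planner-side check along these lines would let the query fall back to the
non-vectorized reader instead of failing at read time. Parenthesized
parameters such as decimal(10,2) do not count as nesting in this sketch.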

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-19016.01.patch
>
>
> Adding "SET hive.vectorized.execution.enabled=true;" to parquet_nested_complex.q triggers this call stack:
> {noformat}
> Caused by: java.lang.RuntimeException: Unsupported type used in list:array>
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkListColumnSupport(VectorizedParquetRecordReader.java:589) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:525) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> {noformat}
> FYI: [~vihangk1]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-06-18 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516336#comment-16516336
 ] 

Vihang Karajgaonkar commented on HIVE-19016:


Sure. Thanks for offering the help [~mmccline]

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Haifeng Chen
> Priority: Critical


[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-06-18 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516319#comment-16516319
 ] 

Matt McCline commented on HIVE-19016:
-

[~vihangk1] [~jerrychenhf] Should I assign this to myself?  I would like to
get it finished soon.  Thank you.

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Haifeng Chen
> Priority: Critical


[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-06-08 Thread Matt McCline (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16505733#comment-16505733
 ] 

Matt McCline commented on HIVE-19016:
-

Any way I can help move this issue forward?

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Haifeng Chen
> Priority: Critical


[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-05-09 Thread Haifeng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469822#comment-16469822
 ] 

Haifeng Chen commented on HIVE-19016:
-

[~vihangk1] I will assign this to myself if you have not yet started the work.

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Vihang Karajgaonkar
> Priority: Critical


[jira] [Commented] (HIVE-19016) Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces RuntimeException: Unsupported type used

2018-05-09 Thread Haifeng Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469808#comment-16469808
 ] 

Haifeng Chen commented on HIVE-19016:
-

[~vihangk1] I have done some research on this. The nested complex types
(nested struct, map, and list) are not yet implemented in the Parquet
vectorized reader. I have studied the details and am trying to figure out an
implementation. I will try to work out a patch.

> Vectorization and Parquet: When vectorized, parquet_nested_complex.q produces 
> RuntimeException: Unsupported type used
> -
>
> Key: HIVE-19016
> URL: https://issues.apache.org/jira/browse/HIVE-19016
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 3.0.0
> Reporter: Matt McCline
> Assignee: Vihang Karajgaonkar
> Priority: Critical