[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076226#comment-17076226 ]

Ganesha Shreedhara commented on HIVE-21492:
---

The test failures are unrelated; these tests passed in the previous run.

[~Ferd] Are we good to merge this fix to master?

> VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
> ---
>
> Key: HIVE-21492
> URL: https://issues.apache.org/jira/browse/HIVE-21492
> Project: Hive
> Issue Type: Bug
> Reporter: Ganesha Shreedhara
> Assignee: Ganesha Shreedhara
> Priority: Major
> Attachments: HIVE-21492.2.patch, HIVE-21492.3.patch, HIVE-21492.4.patch, HIVE-21492.5.patch, HIVE-21492.patch
>
>
> Take as an example a Parquet table with a column that is an array of integers:
> {code:java}
> CREATE EXTERNAL TABLE ( `list_of_ints` array<int>)
> STORED AS PARQUET
> LOCATION '{location}';
> {code}
> A Parquet file generated by Hive has the following schema for this Type:
> {code:java}
> group list_of_ints (LIST) {
>   repeated group bag {
>     optional int32 array;
>   };
> }
> {code}
> A Parquet file generated by thrift or any custom tool (using
> org.apache.parquet.io.api.RecordConsumer) may have the following schema for this Type:
> {code:java}
> required group list_of_ints (LIST) { repeated int32 list_of_ints_tuple }
> {code}
> VectorizedParquetRecordReader handles only Parquet files generated by Hive.
> Because of the changes made as part of HIVE-18553, it throws the following
> exception when it reads a Parquet file generated by thrift:
> {code:java}
> Caused by: java.lang.ClassCastException: repeated int32 list_of_ints_tuple is not a group
>   at org.apache.parquet.schema.Type.asGroupType(Type.java:207)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.getElementType(VectorizedParquetRecordReader.java:479)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:532)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:440)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
>   at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
>   at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> {code}
>
> I have made a small change to handle the case where the child type of the
> group type is a PrimitiveType.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
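The two list layouts in the issue description above can be told apart by checking whether the repeated child of the LIST group is primitive before treating it as a group. The following is only a minimal sketch of that check, not the attached patch: SchemaNode, PrimitiveNode, GroupNode, ListElementResolver and ListElementDemo are hypothetical stand-ins invented for this example so that it runs without the Parquet library, and they are not the real org.apache.parquet.schema classes. It also covers only the two layouts quoted in the description, not every legacy list layout a writer might produce.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Demo entry point (hypothetical name): builds both layouts from the issue text.
class ListElementDemo {
    public static void main(String[] args) {
        // Layout written by Hive: LIST group -> repeated group "bag" -> element.
        GroupNode hiveList = new GroupNode("list_of_ints",
            new GroupNode("bag", new PrimitiveNode("array", "int32")));
        // Layout written by thrift/custom tools: LIST group -> repeated primitive.
        GroupNode thriftList = new GroupNode("list_of_ints",
            new PrimitiveNode("list_of_ints_tuple", "int32"));

        System.out.println(ListElementResolver.elementType(hiveList).name);
        System.out.println(ListElementResolver.elementType(thriftList).name);
    }
}

// Hypothetical stand-ins for schema nodes; NOT the real Parquet Type classes.
abstract class SchemaNode {
    final String name;

    SchemaNode(String name) {
        this.name = name;
    }

    abstract boolean isPrimitive();
}

final class PrimitiveNode extends SchemaNode {
    final String physicalType; // for example "int32"

    PrimitiveNode(String name, String physicalType) {
        super(name);
        this.physicalType = physicalType;
    }

    @Override
    boolean isPrimitive() {
        return true;
    }
}

final class GroupNode extends SchemaNode {
    final List<SchemaNode> fields;

    GroupNode(String name, SchemaNode... fields) {
        super(name);
        this.fields = Collections.unmodifiableList(Arrays.asList(fields));
    }

    @Override
    boolean isPrimitive() {
        return false;
    }
}

final class ListElementResolver {
    private ListElementResolver() {
    }

    // Returns the node describing one list element, for either layout.
    static SchemaNode elementType(GroupNode listGroup) {
        if (listGroup.fields.size() != 1) {
            throw new IllegalArgumentException(
                "LIST group '" + listGroup.name + "' must have exactly one repeated field");
        }
        SchemaNode repeated = listGroup.fields.get(0);
        if (repeated.isPrimitive()) {
            // Repeated primitive: the repeated field itself is the element.
            // Casting it to a group here is what would fail.
            return repeated;
        }
        GroupNode wrapper = (GroupNode) repeated;
        if (wrapper.fields.isEmpty()) {
            throw new IllegalArgumentException(
                "Repeated group '" + wrapper.name + "' has no element field");
        }
        // Repeated wrapper group: the element is its first child.
        return wrapper.fields.get(0);
    }
}
```

The point of the sketch is the order of the checks: the primitive test comes before any cast to a group, so a repeated primitive child is returned as the element instead of triggering a ClassCastException.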
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074423#comment-17074423 ]

Hive QA commented on HIVE-21492:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12998682/HIVE-21492.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 18158 tests executed
*Failed tests:*
{noformat}
TestJdbcWithMiniLlapArrow - did not produce a TEST-*.xml file (likely timed out) (batchId=292)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[vector_outer_join5] (batchId=203)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21418/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21418/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21418/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12998682 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074393#comment-17074393 ]

Hive QA commented on HIVE-21492:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 41s{color} | {color:blue} ql in master has 1528 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 39s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-21418/dev-support/hive-personality.sh |
| git revision | master / 3ab174d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-21418/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074251#comment-17074251 ]

Ganesha Shreedhara commented on HIVE-21492:
---

For some reason the test run didn't pick up the patch (cancelling and resubmitting the patch didn't work). I have created a new patch to make sure that it gets picked up by the Hive QA job in the next run.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074194#comment-17074194 ]

Hive QA commented on HIVE-21492:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12998415/HIVE-21492.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21413/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21413/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21413/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12998415/HIVE-21492.3.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12998415 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17074169#comment-17074169 ]

Ferdinand Xu commented on HIVE-21492:
-

Yes, I manually started another job to double check this.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073324#comment-17073324 ]

Ganesha Shreedhara commented on HIVE-21492:
---

The test failures are unrelated; most of them occurred because the metastore server was down.
{code:java}
Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
  at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:686)
{code}
Can we rerun the tests?
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073305#comment-17073305 ]

Hive QA commented on HIVE-21492:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12998415/HIVE-21492.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 68 failed/errored test(s), 18162 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestMetastoreHousekeepingLeaderEmptyConfig.testHouseKeepingThreadExistence (batchId=252)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.alterTableBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.createTableInBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.dropTableBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.getAllTablesInBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.getMaterializedViewsInBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.getTableInBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.getTableObjectsByNameBogusCatalog[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.moveTablesBetweenCatalogsOnAlter[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.tablesInOtherCatalogs[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableAlreadyExists[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableCascade[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableChangeCols[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableChangingDatabase[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableEmptyTableNameInNew[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableExternalTableChangeLocation[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableExternalTable[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorAddPartitionColumns[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorAlterPartitionColumnName[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorInvalidColumnType[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorNullCols[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorNullColumnType[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorNullLocation[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorNullSerdeInfo[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidStorageDescriptorRemovePartitionColumn[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableInvalidTableNameInNew[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNoSuchDatabase[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNoSuchTableInThisDatabase[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNoSuchTable[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullDatabaseInNew[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullDatabase[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullNewTable[Remote] (batchId=230)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Remote] (batchId=230)
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073278#comment-17073278 ]

Hive QA commented on HIVE-21492:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 3m 41s{color} | {color:blue} ql in master has 1528 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 24m 32s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-21377/dev-support/hive-personality.sh |
| git revision | master / 709235c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-21377/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17073277#comment-17073277 ]

Ferdinand Xu commented on HIVE-21492:
-------------------------------------

LGTM +1 pending on the test
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072517#comment-17072517 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

[~Ferd] Please review the latest patch.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072471#comment-17072471 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

Done.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072466#comment-17072466 ]

Ferdinand Xu commented on HIVE-21492:
-------------------------------------

Could you update the indents below?
{code:java}
+ return childType.asGroupType().getFields().get(0)
+ .asPrimitiveType();
{code}
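For readers following the thread, here is a minimal sketch of the idea behind the fix as the issue description states it: the repeated child of a LIST group may itself be a primitive, so it has to be checked before it is treated as a group. This is not the attached patch. The nested `Type`, `GroupType`, and `PrimitiveType` classes are small stand-ins that model only the calls used here (`isPrimitive()`, `asGroupType()`, `asPrimitiveType()`, `getFields()`), and `listElementType` is a hypothetical helper name.

```java
import java.util.Arrays;
import java.util.List;

public class ListElementTypeSketch {

  // Stand-in for a Parquet schema type; only the calls needed below are modelled.
  static abstract class Type {
    final String name;

    Type(String name) {
      this.name = name;
    }

    abstract boolean isPrimitive();

    GroupType asGroupType() {
      if (isPrimitive()) {
        // Mirrors the failure mode reported in the issue.
        throw new ClassCastException(name + " is not a group");
      }
      return (GroupType) this;
    }

    PrimitiveType asPrimitiveType() {
      if (!isPrimitive()) {
        throw new ClassCastException(name + " is not primitive");
      }
      return (PrimitiveType) this;
    }
  }

  static final class PrimitiveType extends Type {
    PrimitiveType(String name) {
      super(name);
    }

    @Override
    boolean isPrimitive() {
      return true;
    }
  }

  static final class GroupType extends Type {
    private final List<Type> fields;

    GroupType(String name, Type... fields) {
      super(name);
      this.fields = Arrays.asList(fields);
    }

    @Override
    boolean isPrimitive() {
      return false;
    }

    List<Type> getFields() {
      return fields;
    }
  }

  // Hypothetical helper: resolve the element type of a LIST group for both layouts.
  static PrimitiveType listElementType(GroupType listType) {
    Type repeated = listType.getFields().get(0);
    if (repeated.isPrimitive()) {
      // Thrift/custom-tool layout: the repeated field is the element itself.
      return repeated.asPrimitiveType();
    }
    // Hive layout: the repeated field is a wrapper group holding the element.
    return repeated.asGroupType().getFields().get(0).asPrimitiveType();
  }

  public static void main(String[] args) {
    // group list_of_ints (LIST) { repeated group bag { optional int32 array; } }
    GroupType hive = new GroupType("list_of_ints",
        new GroupType("bag", new PrimitiveType("array")));
    // required group list_of_ints (LIST) { repeated int32 list_of_ints_tuple }
    GroupType thrift = new GroupType("list_of_ints",
        new PrimitiveType("list_of_ints_tuple"));

    System.out.println(listElementType(hive).name);
    System.out.println(listElementType(thrift).name);
  }
}
```

Without the `isPrimitive()` check, the second layout would reach `asGroupType()` on a primitive field, which is the `ClassCastException` shown in the stack trace.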
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072337#comment-17072337 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

[~Ferd] I could actually reproduce this problem using the available dataset _data/files/ThriftPrimitiveInList.parquet_ in the Hive repo. I have modified the parquet_thrift_array_of_primitives.q file accordingly.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072322#comment-17072322 ]

Hive QA commented on HIVE-21492:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12963424/HIVE-21492.patch

ERROR: -1 due to no test(s) being added or modified.

ERROR: -1 due to 1 failed/errored test(s), 18163 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[topnkey_grouping_sets] (batchId=1)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/21362/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/21362/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-21362/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12963424 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072308#comment-17072308 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

[~Ferd] This change is in a private method. It looks like there is no UT created for the VectorizedParquetRecordReader class.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072302#comment-17072302 ]

Hive QA commented on HIVE-21492:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 7s | master passed |
| +1 | compile | 1m 2s | master passed |
| +1 | checkstyle | 0m 39s | master passed |
| 0 | findbugs | 3m 39s | ql in master has 1529 extant Findbugs warnings. |
| +1 | javadoc | 0m 54s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 23s | the patch passed |
| +1 | compile | 0m 59s | the patch passed |
| +1 | javac | 0m 59s | the patch passed |
| +1 | checkstyle | 0m 39s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 3m 47s | the patch passed |
| +1 | javadoc | 0m 51s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 23m 49s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-21362/dev-support/hive-personality.sh |
| git revision | master / aa142d1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-21362/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071623#comment-17071623 ]

Ferdinand Xu commented on HIVE-21492:
-------------------------------------

[~ganeshas], could you create some UTs for this?
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071620#comment-17071620 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

This issue exists in the 3.x versions because of the changes done as part of HIVE-18553.

[~Ferd], [~vihangk1], [~ashutoshc] Could you please review the patch?
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16813673#comment-16813673 ]

Nitin commented on HIVE-21492:
------------------------------

[~Ferd] Can you please review the patch?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-21492) VectorizedParquetRecordReader can't to read parquet file generated using thrift/custom tool
[ https://issues.apache.org/jira/browse/HIVE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803742#comment-16803742 ]

Ganesha Shreedhara commented on HIVE-21492:
-------------------------------------------

Please review the patch.