[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360323#comment-16360323 ]

KaiXu commented on HIVE-18553:

Thanks for your email. I am taking annual leave, so email responses may be delayed. Sorry for any inconvenience.

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> VectorizedParquetReader throws an exception when trying to read from a parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +-----------+------------+----------+
> | col_name  | data_type  | comment  |
> +-----------+------------+----------+
> | t1        | tinyint    |          |
> | t2        | tinyint    |          |
> | i1        | int        |          |
> | i2        | int        |          |
> +-----------+------------+----------+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
>     at org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
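
The exception above shows the vectorized reader asking the row group's page store for `ts`, a column that the table schema now declares but that the already-written Parquet file does not contain (the store lists only `i1`, `i2`, `t1`, `t2`). A minimal Java sketch of one way a reader could guard against this follows. It is not the HIVE-18553 patch, and `MissingColumnPlanner`, `Source`, `ColumnPlan`, and `plan` are hypothetical names invented for illustration. It assumes the usual expectation that a column added after the data was written reads back as NULL for the old rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Hive or Parquet. */
final class MissingColumnPlanner {

  /** How a requested column should be populated for one data file. */
  enum Source { READ_FROM_FILE, FILL_WITH_NULLS }

  /** Pairs a requested column name with the chosen source. */
  static final class ColumnPlan {
    final String name;
    final Source source;

    ColumnPlan(String name, Source source) {
      this.name = name;
      this.source = source;
    }

    @Override
    public String toString() {
      return name + ":" + source;
    }
  }

  /**
   * Decides, for every column in the table schema, whether the file can
   * supply it. Columns absent from the file are planned as all-null instead
   * of being looked up in the page store, which is the lookup that throws
   * in the stack trace above.
   */
  static List<ColumnPlan> plan(List<String> tableColumns, List<String> fileColumns) {
    Set<String> inFile = new HashSet<>(fileColumns);
    List<ColumnPlan> plans = new ArrayList<>();
    for (String column : tableColumns) {
      Source source = inFile.contains(column)
          ? Source.READ_FROM_FILE
          : Source.FILL_WITH_NULLS;
      plans.add(new ColumnPlan(column, source));
    }
    return plans;
  }

  public static void main(String[] args) {
    // Table schema after "alter table test_p add columns (ts timestamp)".
    List<String> table = Arrays.asList("t1", "t2", "i1", "i2", "ts");
    // Columns reported by the store in the exception message.
    List<String> file = Arrays.asList("i1", "i2", "t1", "t2");
    for (ColumnPlan p : plan(table, file)) {
      System.out.println(p);
    }
  }
}
```

With the schema from the report, `ts` is the only column planned as `FILL_WITH_NULLS`. A real reader would need to compare full column paths and types rather than bare names, so treat this only as an outline of the idea.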
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359650#comment-16359650 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909934/HIVE-18553.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 34 failed/errored test(s), 13159 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.TestAcidTableSetup.testTransactionalValidation (batchId=224)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=257)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadEqualOneBatch (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadLessOneBatch (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadMoreOneBatch (batchId=272)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.templeton.TestConcurrentJobRequestsThreadsAndTimeout.ConcurrentListJobsVerifyExceptions (batchId=191)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerDagTotalTasks (batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9144/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9144/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9144/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 34 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909934 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359641#comment-16359641 ]

Hive QA commented on HIVE-18553:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 22 new + 211 unchanged - 87 fixed = 233 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 29s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / ddd4c9a |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9144/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9144/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> VectorizedParquetReader throws an exception when trying to read from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359059#comment-16359059 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909934/HIVE-18553.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 13147 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde] (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] (batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=222)
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote (batchId=226)
org.apache.hadoop.hive.metastore.TestMarkPartition.testMarkingPartitionSet (batchId=215)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=257)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadEqualOneBatch (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadLessOneBatch (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadMoreOneBatch (batchId=272)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill (batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9124/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9124/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9124/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909934 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch,
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359013#comment-16359013 ]

Hive QA commented on HIVE-18553:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 42s{color} | {color:red} ql: The patch generated 22 new + 211 unchanged - 87 fixed = 233 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 45s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 58bbfc7 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9124/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9124/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> VectorizedParquetReader throws an exception when trying to read from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357248#comment-16357248 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909816/HIVE-18553.6.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 118 failed/errored test(s), 12954 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=107) [join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,parquet_vectorization_2.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=116) [skewjoinopt3.q,skewjoinopt19.q,timestamp_comparison.q,join_merge_multi_expressions.q,union5.q,insert_into1.q,vectorization_4.q,parquet_vectorization_10.q,vector_left_outer_join.q,decimal_1_1.q,semijoin.q,skewjoinopt9.q,smb_mapjoin_3.q,stats10.q,rcfile_bigdata.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144) [groupby2_noskew_multi_distinct.q,load_dyn_part12.q,scriptfile1.q,join15.q,auto_join17.q,subquery_multiinsert.q,join_hive_626.q,tez_join_tests.q,parquet_vectorization_16.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q]
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_11] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_12] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_13] (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_14] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_15] (batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_16] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_17] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_2] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_5] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_6] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_7] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_8] (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_9] (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_not] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_project] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_varchar] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[schema_evol_par_vec_table_non_dictionary_encoding] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types] (batchId=67)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_types_vectorization] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357165#comment-16357165 ]

Hive QA commented on HIVE-18553:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 44s{color} | {color:red} ql: The patch generated 33 new + 214 unchanged - 84 fixed = 247 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 9s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / b8fdd13 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9096/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9096/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> VectorizedParquetReader fails after adding a new column to table
>
>                 Key: HIVE-18553
>                 URL: https://issues.apache.org/jira/browse/HIVE-18553
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 3.0.0, 2.4.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Ferdinand Xu
>            Priority: Major
>         Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> VectorizedParquetReader throws an exception when trying to read from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354916#comment-16354916 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909348/test_result_based_on_HIVE-18553.xlsx

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9060/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9060/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9060/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:44.316
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9060/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:44.320
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3972bf0 HIVE-18613: Extend JsonSerDe to support BINARY type (Jesus Camacho Rodriguez, reviewed by Prasanth Jayachandran)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 3972bf0 HIVE-18613: Extend JsonSerDe to support BINARY type (Jesus Camacho Rodriguez, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:46.829
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9060/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
fatal: unrecognized input
fatal: unrecognized input
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909348 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 2.4.0, 2.3.2
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
> VectorizedParquetReader throws an exception when trying to reading from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353488#comment-16353488 ] Vihang Karajgaonkar commented on HIVE-18553: Thanks [~Ferd] for the patch. I have left some minor comments on the review board. Rest looks good to me. > VectorizedParquetReader fails after adding a new column to table > > > Key: HIVE-18553 > URL: https://issues.apache.org/jira/browse/HIVE-18553 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.4.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu >Priority: Major > Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, > HIVE-18553.4.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx > > > VectorizedParquetReader throws an exception when trying to reading from a > parquet table on which new columns are added. Steps to reproduce below: > {code} > 0: jdbc:hive2://localhost:1/default> desc test_p; > +---++--+ > | col_name | data_type | comment | > +---++--+ > | t1| tinyint| | > | t2| tinyint| | > | i1| int| | > | i2| int| | > +---++--+ > 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none; > 0: jdbc:hive2://localhost:1/default> set > hive.vectorized.execution.enabled=true; > 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts > timestamp); > 0: jdbc:hive2://localhost:1/default> select * from test_p; > Error: Error while processing statement: FAILED: Execution Error, return code > 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2) > {code} > Following exception is seen in the logs > {code} > Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the > store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3 > at > org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142) > ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT] > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199) > 
~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?] > at > org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) > ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?] > at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) > ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?] > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) > ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?] > at
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352225#comment-16352225 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909194/HIVE-18553.4.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 12971 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9017/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9017/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9017/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12909194 - PreCommit-HIVE-Build > VectorizedParquetReader fails after adding a new column to table > > > Key: HIVE-18553 > URL: https://issues.apache.org/jira/browse/HIVE-18553 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.4.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu >Priority: Major > Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, > HIVE-18553.4.patch, HIVE-18553.patch > > > VectorizedParquetReader throws an exception when trying to reading from a > parquet table on which new columns are added. Steps to reproduce below: > {code} > 0: jdbc:hive2://localhost:1/default> desc test_p; > +---++--+ > | col_name | data_type | comment | > +---++--+ > | t1| tinyint| | > | t2| tinyint| | > | i1| int| | > | i2| int| | > +---++--+ > 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none; > 0: jdbc:hive2://localhost:1/default> set > hive.vectorized.execution.enabled=true; > 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts > timestamp); > 0: jdbc:hive2://localhost:1/default> select * from test_p; >
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352196#comment-16352196 ]

Hive QA commented on HIVE-18553:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} ql: The patch generated 0 new + 126 unchanged - 11 fixed = 126 total (was 137) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 37s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 0a328f0 |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9017/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> VectorizedParquetReader fails after adding a new column to table
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 2.4.0, 2.3.2
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.patch
>
> VectorizedParquetReader throws an exception when trying to reading from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352016#comment-16352016 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908946/HIVE-18553.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 12972 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_bmj_schema_evolution] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded] (batchId=206)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9013/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9013/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9013/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12908946 - PreCommit-HIVE-Build > VectorizedParquetReader fails after adding a new column to table > > > Key: HIVE-18553 > URL: https://issues.apache.org/jira/browse/HIVE-18553 > Project: Hive > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.4.0, 2.3.2 >Reporter: Vihang Karajgaonkar >Assignee: Ferdinand Xu >Priority: Major > Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.patch > > > VectorizedParquetReader throws an exception when trying to reading from a > parquet table on which new columns are added. Steps to reproduce below: > {code} > 0: jdbc:hive2://localhost:1/default> desc test_p; > +---++--+ > | col_name | data_type | comment | > +---++--+ > | t1| tinyint| | > | t2| tinyint| | > | i1| int| | > | i2| int| | > +---++--+ > 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none; > 0: jdbc:hive2://localhost:1/default> set > hive.vectorized.execution.enabled=true; > 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts > timestamp); > 0:
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351996#comment-16351996 ]

Hive QA commented on HIVE-18553:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 38s{color} | {color:red} ql: The patch generated 46 new + 137 unchanged - 1 fixed = 183 total (was 138) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 37s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 0a328f0 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-9013/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-9013/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

> VectorizedParquetReader fails after adding a new column to table
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 2.4.0, 2.3.2
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.patch
>
> VectorizedParquetReader throws an exception when trying to reading from a parquet table on which new columns are added.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346383#comment-16346383 ]

Ferdinand Xu commented on HIVE-18553:

As I currently understand it, column rename and type conversion may require index-based column access, so they may not work with the current workaround in the vectorization path. We can spend some time investigating full support. The current patch can be considered a quick workaround. Any thoughts on this?
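To make the index-access concern above concrete, here is a minimal, self-contained Java sketch. It is not the HIVE-18553 patch and uses no Hive or Parquet classes: `ColumnLookupSketch`, `resolveByName`, `resolveByIndex`, and the renamed column `i2_new` are hypothetical names introduced only for illustration. Under those assumptions it shows that a renamed column cannot be found by name in a file written before the rename, while a positional lookup still finds it, and that a newly added column is absent either way.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; none of these names come from Hive or Parquet. */
final class ColumnLookupSketch {

    /** Name-based lookup: position of the requested column in the file schema, or -1 if absent. */
    static int resolveByName(List<String> fileColumns, String requestedName) {
        return fileColumns.indexOf(requestedName);
    }

    /** Index-based lookup: the table position itself if the file has that many columns, else -1. */
    static int resolveByIndex(List<String> fileColumns, int tablePosition) {
        boolean inRange = tablePosition >= 0 && tablePosition < fileColumns.size();
        return inRange ? tablePosition : -1;
    }

    public static void main(String[] args) {
        // Columns of a file written before any schema change (names from the issue description).
        List<String> fileColumns = Arrays.asList("t1", "t2", "i1", "i2");
        // Hypothetical table schema afterwards: i2 renamed to i2_new, ts appended.
        List<String> tableColumns = Arrays.asList("t1", "t2", "i1", "i2_new", "ts");

        for (int pos = 0; pos < tableColumns.size(); pos++) {
            String name = tableColumns.get(pos);
            System.out.println(name
                + " byName=" + resolveByName(fileColumns, name)
                + " byIndex=" + resolveByIndex(fileColumns, pos));
        }
    }
}
```

In this toy model a result of -1 means the reader has nothing to read for that column in that file. Whether the real vectorized reader resolves columns by name or by position in a given configuration is not stated in this thread.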
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343725#comment-16343725 ]

Vihang Karajgaonkar commented on HIVE-18553:

Hi [~Ferd], thanks for the patch. Can you please test what happens with the patch when we change the column types? Does it work in non-vectorized code?
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342843#comment-16342843 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908077/HIVE-18553.2.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 23 failed/errored test(s), 12632 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] (batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestDropPartitions.testDropPartition[Embedded] (batchId=207)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded] (batchId=207)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8901/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8901/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8901/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908077 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342828#comment-16342828 ]

Hive QA commented on HIVE-18553:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 7m 47s | master passed |
| +1 | compile | 1m 8s | master passed |
| +1 | checkstyle | 0m 40s | master passed |
| +1 | javadoc | 1m 4s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 19s | the patch passed |
| +1 | javac | 1m 19s | the patch passed |
| -1 | checkstyle | 0m 45s | ql: The patch generated 10 new + 40 unchanged - 0 fixed = 50 total (was 40) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 1m 6s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 18s | The patch generated 6 ASF License warnings. |
| | | 16m 13s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 1dd863a |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus/diff-checkstyle-ql.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342813#comment-16342813 ]

Ferdinand Xu commented on HIVE-18553:

Thanks [~colinma] for the review. It should work for the add-column case, since the reader builder is invoked each time a row group is checked. However, the current patch does not support other cases such as type conversion; those may require further investigation.
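The point above about the reader being rebuilt per row group can be sketched as follows. This is a toy model under stated assumptions, not the HIVE-18553 patch: `MissingColumnSketch`, `RowGroup`, `readColumn`, and `readAll` are hypothetical stand-ins, a row group is modelled as an in-memory map, and the timestamp is a plain string. The idea illustrated is that if the requested column is looked up again for every row group, row groups written before `alter table ... add columns` can yield nulls while row groups written afterwards yield stored values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; these classes are hypothetical stand-ins, not Hive or Parquet APIs. */
final class MissingColumnSketch {

    /** Stand-in for one row group: the columns it physically stores and its row count. */
    static final class RowGroup {
        final Map<String, Object[]> columns = new HashMap<>();
        final int rowCount;

        RowGroup(int rowCount) {
            this.rowCount = rowCount;
        }

        RowGroup with(String name, Object... values) {
            columns.put(name, values);
            return this;
        }
    }

    /** Reads one column from one row group; a column the group lacks yields rowCount nulls. */
    static Object[] readColumn(RowGroup group, String requested) {
        Object[] stored = group.columns.get(requested);
        if (stored == null) {
            return new Object[group.rowCount]; // Java initialises every slot to null
        }
        return Arrays.copyOf(stored, group.rowCount);
    }

    /** Resolves the column again for every row group, so old and new data can be mixed. */
    static List<Object> readAll(List<RowGroup> groups, String requested) {
        List<Object> out = new ArrayList<>();
        for (RowGroup group : groups) {
            out.addAll(Arrays.asList(readColumn(group, requested)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Two rows written before the column was added, one row written afterwards.
        RowGroup beforeAlter = new RowGroup(2).with("i1", 10, 20);
        RowGroup afterAlter = new RowGroup(1).with("i1", 30).with("ts", "2018-01-29 00:00:00");
        System.out.println(readAll(Arrays.asList(beforeAlter, afterAlter), "ts"));
    }
}
```

The sketch deliberately ignores type conversion and rename, which the thread identifies as cases the patch does not handle.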
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342800#comment-16342800 ]

Colin Ma commented on HIVE-18553:

[~Ferd], I think the patch doesn't cover the case where a column is added and new values are then inserted into the table.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342736#comment-16342736 ]

Hive QA commented on HIVE-18553:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908040/HIVE-18553.patch

SUCCESS: +1 due to 1 test(s) being added or modified.

ERROR: -1 due to 25 failed/errored test(s), 12602 tests executed

*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=160)
	[vector_reduce_groupby_duplicate_cols.q,explainuser_1.q,multi_insert.q,tez_dml.q,cross_prod_4.q,vector_bround.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vector_groupby_grouping_sets2.q,cte_3.q,vector_reduce_groupby_decimal.q,vector_ptf_part_simple.q,vector_decimal_cast.q,groupby_grouping_id2.q,tez_smb_empty.q,schema_evol_text_vecrow_part_all_primitive_llap_io.q,orc_merge6.q,cte_mat_1.q,vector_char_mapjoin1.q,cte_5.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,vector_string_concat.q,vector_windowing_windowspec.q,sharedworkext.q,vectorized_context.q,auto_sortmerge_join_12.q,tez_union_multiinsert.q,mapjoin_hint.q]
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_comparison] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl] (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes] (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded] (batchId=207)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded] (batchId=207)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap (batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.common.TestHiveClientCache.testCloseAllClients (batchId=200)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8897/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8897/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908040 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342712#comment-16342712 ]

Hive QA commented on HIVE-18553:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 6m 12s | master passed |
| +1 | compile | 0m 56s | master passed |
| +1 | checkstyle | 0m 35s | master passed |
| +1 | javadoc | 0m 50s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 9s | the patch passed |
| +1 | compile | 0m 54s | the patch passed |
| +1 | javac | 0m 54s | the patch passed |
| -1 | checkstyle | 0m 35s | ql: The patch generated 10 new + 40 unchanged - 0 fixed = 50 total (was 40) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 55s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 12s | The patch generated 6 ASF License warnings. |
| | | 12m 33s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 1dd863a |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus/diff-checkstyle-ql.txt |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342434#comment-16342434 ]

Ferdinand Xu commented on HIVE-18553:

The patch only covers the add-column case. Renamed and deleted columns also need to be considered; we can file separate tickets if problems exist there.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342197#comment-16342197 ]

Ferdinand Xu commented on HIVE-18553:

For schema evolution, we can create a dummy reader to handle it, as the POC patch did. Any thoughts on it?
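The dummy-reader idea from the comment above could look roughly like the following sketch. It is not taken from the attached patch: `ColumnReader`, `StoredColumnReader`, `DummyColumnReader`, and `ReaderFactory` are hypothetical names, and plain Java arrays stand in for Hive's column vectors and Parquet's page store. The only point it illustrates is that a column absent from the file gets a reader that yields NULL for every row instead of throwing.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical stand-in for a vectorized column reader; not Hive's actual interface.
interface ColumnReader {
    // Fill out[0..count) with the next values of this column.
    void readBatch(int count, Object[] out);
}

// Reads values of a column that is physically present in the data file.
// Assumes the caller never asks for more values than remain.
final class StoredColumnReader implements ColumnReader {
    private final Object[] values;
    private int pos = 0;

    StoredColumnReader(Object[] values) {
        this.values = values;
    }

    @Override
    public void readBatch(int count, Object[] out) {
        for (int i = 0; i < count; i++) {
            out[i] = values[pos++];
        }
    }
}

// "Dummy" reader for a column that the table schema declares but the file lacks:
// every row is reported as NULL.
final class DummyColumnReader implements ColumnReader {
    @Override
    public void readBatch(int count, Object[] out) {
        Arrays.fill(out, 0, count, null);
    }
}

final class ReaderFactory {
    // Use a real reader when the file stores the column, otherwise the dummy one.
    // fileColumns is an assumed in-memory model: column name -> stored values.
    static ColumnReader forColumn(String name, Map<String, Object[]> fileColumns) {
        Object[] stored = fileColumns.get(name);
        return stored != null ? new StoredColumnReader(stored) : new DummyColumnReader();
    }
}
```

With a file that stores only `i1`, `ReaderFactory.forColumn("ts", fileColumns)` would return the dummy reader, so a batch for `ts` comes back as all NULLs, which matches what the non-vectorized path returns for old rows.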
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341484#comment-16341484 ]

Vihang Karajgaonkar commented on HIVE-18553:

Thanks [~colinma] for taking a look. I think there is an issue with your insert statement. The newly added column is a timestamp, but you are inserting 15. If you insert a timestamp value, it works in non-vectorized code.
{code}
insert into test_p values (1,2,3,4, '2018-01-01 01:01:01.123456');
select * from test_p;
+-----------+-----------+-------------+-----------+----------------------------+
| test_p.t1 | test_p.t2 | test_p.i1   | test_p.i2 | test_p.ts                  |
+-----------+-----------+-------------+-----------+----------------------------+
| -125      | 10        | 2147483647  | -10       | NULL                       |
| 125       | -10       | -2147483648 | 10        | NULL                       |
| 125       | 10        | 2147483647  | 10        | NULL                       |
| 1         | 2         | 3           | 4         | 2018-01-01 01:01:01.123456 |
+-----------+-----------+-------------+-----------+----------------------------+
{code}
In vectorized execution the select * fails.
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340731#comment-16340731 ]

Colin Ma commented on HIVE-18553:

Hi [~vihangk1], I did some investigation on this problem:

Step 1: without vectorization, check the impact on ORC and Parquet of adding a new column. These are the statements I ran:
{code:java}
create table test_p_parquet(t1 tinyint, t2 tinyint, i1 int, i2 int) stored as parquet;
create table test_p_orc(t1 tinyint, t2 tinyint, i1 int, i2 int) stored as orc;
insert into test_p_parquet values (1,2,3,4),(5,6,7,8);
insert into test_p_orc values (1,2,3,4),(5,6,7,8);
alter table test_p_parquet add columns (ts timestamp);
alter table test_p_orc add columns (ts timestamp);
select * from test_p_parquet;
1 2 3 4 NULL
5 6 7 8 NULL
select * from test_p_orc;
1 2 3 4 NULL
5 6 7 8 NULL
{code}
The result so far is as expected, but inserting new data shows a problem:
{code:java}
insert into test_p_parquet values (11,12,13,14,15);
insert into test_p_orc values (11,12,13,14,15);
select * from test_p_parquet;
1 2 3 4 NULL
5 6 7 8 NULL
11 12 13 14 NULL
select * from test_p_orc;
1 2 3 4 NULL
5 6 7 8 NULL
11 12 13 14 NULL
{code}
The new column is still NULL, so the new data is lost. From this result, I think that with Parquet and ORC, Hive can't add data to the new column.

Step 2: with vectorization, the Parquet result is what you describe, and the ORC result is also incorrect: it is empty.

Looking at the implementation of VectorizedParquetRecordReader, the exception is thrown because the new column doesn't exist in the Parquet file. That means the data file does not change when a new column is added. I think the root question is whether Hive supports adding a column to Parquet/ORC tables dynamically, and that this is not a problem in VectorizedParquetRecordReader itself.
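To make the mismatch described in Step 2 concrete, here is a minimal sketch, assuming column names can be compared as plain strings. `SchemaDiff` and `missingFromFile` are hypothetical names, not Hive or Parquet APIs; the column lists are the ones from the exception message in the issue description.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SchemaDiff {
    // Return the columns declared in the table schema that the data file does not store,
    // in table-schema order. These are the columns a reader cannot find in the file.
    static List<String> missingFromFile(List<String> tableColumns, List<String> fileColumns) {
        Set<String> inFile = new HashSet<>(fileColumns);
        List<String> missing = new ArrayList<>();
        for (String column : tableColumns) {
            if (!inFile.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }
}
```

For the table in this issue, the table schema after `alter table` is `t1, t2, i1, i2, ts`, while the existing file stores only `i1, i2, t1, t2`, so the sketch reports `ts` as the one column that would need special handling.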
[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table
[ https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340435#comment-16340435 ]

Vihang Karajgaonkar commented on HIVE-18553:

cc: [~Ferd]