[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-11 Thread KaiXu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360323#comment-16360323
 ] 

KaiXu commented on HIVE-18553:
--

Thanks for your email. I am taking annual leave, email responses can be 
delayed. Sorry for any inconveniences.


> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359650#comment-16359650
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909934/HIVE-18553.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 34 failed/errored test(s), 13159 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.TestAcidTableSetup.testTransactionalValidation 
(batchId=224)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=257)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadEqualOneBatch
 (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadLessOneBatch
 (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadMoreOneBatch
 (batchId=272)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.templeton.TestConcurrentJobRequestsThreadsAndTimeout.ConcurrentListJobsVerifyExceptions
 (batchId=191)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerDagTotalTasks 
(batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9144/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9144/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9144/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 34 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909934 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359641#comment-16359641
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 22 new + 211 unchanged - 87 
fixed = 233 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / ddd4c9a |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9144/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9144/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359059#comment-16359059
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909934/HIVE-18553.7.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 13147 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote.org.apache.hadoop.hive.metastore.TestAuthzApiEmbedAuthorizerInRemote
 (batchId=226)
org.apache.hadoop.hive.metastore.TestMarkPartition.testMarkingPartitionSet 
(batchId=215)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=257)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadEqualOneBatch
 (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadLessOneBatch
 (batchId=272)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedMapColumnReader.testMapReadMoreOneBatch
 (batchId=272)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveAndKill 
(batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9124/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9124/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9124/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909934 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-09 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359013#comment-16359013
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
42s{color} | {color:red} ql: The patch generated 22 new + 211 unchanged - 87 
fixed = 233 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 58bbfc7 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9124/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9124/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357248#comment-16357248
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909816/HIVE-18553.6.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 118 failed/errored test(s), 12954 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=107)

[join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,parquet_vectorization_2.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=116)

[skewjoinopt3.q,skewjoinopt19.q,timestamp_comparison.q,join_merge_multi_expressions.q,union5.q,insert_into1.q,vectorization_4.q,parquet_vectorization_10.q,vector_left_outer_join.q,decimal_1_1.q,semijoin.q,skewjoinopt9.q,smb_mapjoin_3.q,stats10.q,rcfile_bigdata.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=144)

[groupby2_noskew_multi_distinct.q,load_dyn_part12.q,scriptfile1.q,join15.q,auto_join17.q,subquery_multiinsert.q,join_hive_626.q,tez_join_tests.q,parquet_vectorization_16.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q]
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_0] 
(batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_10]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_11]
 (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_12]
 (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_13]
 (batchId=52)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_14]
 (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_15]
 (batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_16]
 (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_17]
 (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_2] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_5] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_6] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_7] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_8] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_9] 
(batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_not]
 (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_varchar]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[schema_evol_par_vec_table_non_dictionary_encoding]
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
 (batchId=67)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-08 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357165#comment-16357165
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 33 new + 214 unchanged - 84 
fixed = 247 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / b8fdd13 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9096/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9096/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354916#comment-16354916
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909348/test_result_based_on_HIVE-18553.xlsx

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9060/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9060/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9060/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:44.316
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-9060/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:44.320
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 3972bf0 HIVE-18613: Extend JsonSerDe to support BINARY type 
(Jesus Camacho Rodriguez, reviewed by Prasanth Jayachandran)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 3972bf0 HIVE-18613: Extend JsonSerDe to support BINARY type 
(Jesus Camacho Rodriguez, reviewed by Prasanth Jayachandran)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-02-07 03:40:46.829
+ rm -rf ../yetus
+ mkdir ../yetus
+ git gc
+ cp -R . ../yetus
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-9060/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
fatal: unrecognized input
fatal: unrecognized input
fatal: unrecognized input
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909348 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-05 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353488#comment-16353488
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Thanks [~Ferd] for the patch. I have left some minor comments on the review 
board. Rest looks good to me.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352225#comment-16352225
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12909194/HIVE-18553.4.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 12971 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] 
(batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9017/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9017/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9017/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 26 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12909194 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352196#comment-16352196
 ] 

Hive QA commented on HIVE-18553:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} ql: The patch generated 0 new + 126 unchanged - 11 
fixed = 126 total (was 137) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 0a328f0 |
| Default Java | 1.8.0_111 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9017/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352016#comment-16352016
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908946/HIVE-18553.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 27 failed/errored test(s), 12972 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] 
(batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_bmj_schema_evolution]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.TestTxnNoBuckets.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testCTAS (batchId=280)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9013/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9013/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9013/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 27 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908946 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-02-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351996#comment-16351996
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 46 new + 137 unchanged - 1 
fixed = 183 total (was 138) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 0a328f0 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9013/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9013/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-30 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346383#comment-16346383
 ] 

Ferdinand Xu commented on HIVE-18553:
-

At my current understanding, rename or type conversion may require index 
access. It may not work for the current workaround for vectorization path. We 
can spend some time to investigate it for fully support. The current patch can 
be considered as a quick workaround. Any thoughts on this?

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
>   

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-29 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16343725#comment-16343725
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Hi [~Ferd] Thanks for the patch. Can you please test with the patch what 
happens when we change the column types? Does it work in non-vectorized code?

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342843#comment-16342843
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908077/HIVE-18553.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 12632 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestDropPartitions.testDropPartition[Embedded]
 (batchId=207)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=207)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8901/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8901/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8901/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908077 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342828#comment-16342828
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
4s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 10 new + 40 unchanged - 0 
fixed = 50 total (was 40) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
18s{color} | {color:red} The patch generated 6 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 1dd863a |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8901/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342813#comment-16342813
 ] 

Ferdinand Xu commented on HIVE-18553:
-

Thanks [~colinma] for the review. It should work for adding cases since reader 
builder is involved each time for row group checking. But current patch surely 
not support other cases like type conversion. It may require further 
investigations.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342800#comment-16342800
 ] 

Colin Ma commented on HIVE-18553:
-

[~Ferd], I think the patch doesn't cover the case add a column and insert new 
value into table.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342736#comment-16342736
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12908040/HIVE-18553.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 12602 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=160)

[vector_reduce_groupby_duplicate_cols.q,explainuser_1.q,multi_insert.q,tez_dml.q,cross_prod_4.q,vector_bround.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vector_groupby_grouping_sets2.q,cte_3.q,vector_reduce_groupby_decimal.q,vector_ptf_part_simple.q,vector_decimal_cast.q,groupby_grouping_id2.q,tez_smb_empty.q,schema_evol_text_vecrow_part_all_primitive_llap_io.q,orc_merge6.q,cte_mat_1.q,vector_char_mapjoin1.q,cte_5.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,vector_string_concat.q,vector_windowing_windowspec.q,sharedworkext.q,vectorized_context.q,auto_sortmerge_join_12.q,tez_union_multiinsert.q,mapjoin_hint.q]
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_comparison] 
(batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] 
(batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=163)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded]
 (batchId=207)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=207)
org.apache.hadoop.hive.ql.exec.TestOperators.testNoConditionalTaskSizeForLlap 
(batchId=282)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=256)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.hcatalog.common.TestHiveClientCache.testCloseAllClients 
(batchId=200)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8897/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8897/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8897/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12908040 - PreCommit-HIVE-Build

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342712#comment-16342712
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
35s{color} | {color:red} ql: The patch generated 10 new + 40 unchanged - 0 
fixed = 50 total (was 40) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 6 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 1dd863a |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-8897/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-27 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342434#comment-16342434
 ] 

Ferdinand Xu commented on HIVE-18553:
-

The patch only covers the add case. For rename or deletion, we also need to 
consider it. We can file other tickets if problem exists.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-27 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16342197#comment-16342197
 ] 

Ferdinand Xu commented on HIVE-18553:
-

For schema evolution, we can create a dummy reader to handle it as the POC 
patch did. Any thoughts on it?

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.patch
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-26 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341484#comment-16341484
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Thanks [~colinma] for taking at look. I think there is an issue with your 
insert statement. The newly added column type is a timestamp but you are adding 
15. If you add a timestamp value it works in non-vectorized code.

{code}
insert into test_p values (1,2,3,4, '2018-01-01 01:01:01.123456');
select * from test_p;
+++--++-+
| test_p.t1  | test_p.t2  |  test_p.i1   | test_p.i2  |  test_p.ts  
|
+++--++-+
| -125   | 10 | 2147483647   | -10| NULL
|
| 125| -10| -2147483648  | 10 | NULL
|
| 125| 10 | 2147483647   | 10 | NULL
|
| 1  | 2  | 3| 4  | 2018-01-01 
01:01:01.123456  |
+++--++-+
{code}

In vectorized execution the select * fails.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-26 Thread Colin Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340731#comment-16340731
 ] 

Colin Ma commented on HIVE-18553:
-

Hi [~vihangk1], I did some investigation on this problem:

Step 1: with out vectorization, check the impacts for ORC and Parquet when 
adding a new column, the following is the related statements:
{code:java}
create table test_p_parquet(t1 tinyint, t2 tinyint, i1 int, i2 int) stored as 
parquet;
create table test_p_orc(t1 tinyint, t2 tinyint, i1 int, i2 int) stored as orc;
insert into test_p_parquet values (1,2,3,4),(5,6,7,8); 
insert into test_p_orc values (1,2,3,4),(5,6,7,8);
alter table test_p_parquet add columns (ts timestamp);
alter table test_p_orc add columns (ts timestamp);

select * from test_p_parquet;
1   2   3   4   NULL
5   6   7   8   NULL
select * from test_p_orc;
1   2   3   4   NULL
5   6   7   8   NULL{code}
 The result is what we expected by now, but when insert new data, there has 
some problems:
{code:java}
insert into test_p_parquet values (11,12,13,14,15);
insert into test_p_orc values (11,12,13,14,15);
select * from test_p_parquet;
1   2   3   4   NULL
5   6   7   8   NULL
11  12  13  14  NULL
select * from test_p_orc;
1   2   3   4   NULL
5   6   7   8   NULL
11  12  13  14  NULL
{code}
 The new column still is null, new data is lost.
 From this result, I think with Parquet and ORC, Hive can't add data to new 
column. 
  

Step 2: with vectorization, the result of Parquet is what you describe, and the 
result of ORC is also incorrect which is an empty result.

 Check the implementation of VectorizedParquetRecordReader, the exception is 
because of the new column doesn't exist in parquet file. That means the data 
file won't change when new column added.

  I think the root problem is if Hive support add column to Parquet/ORC 
dynamically, and it's not the problem of VectorizedParquetRecordReader.

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-18553) VectorizedParquetReader fails after adding a new column to table

2018-01-25 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340435#comment-16340435
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


cc: [~Ferd]

> VectorizedParquetReader fails after adding a new column to table
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Priority: Major
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
>