[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-15 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366491#comment-16366491
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Merged an addendum which removes the extraneous q.out file

commit 01f34e49b352bd06ad8e65a1da613de45773c1c6
Author: Vihang Karajgaonkar 
Date:   Thu Feb 15 17:04:44 2018 -0800

Addendum to HIVE-18553 : Support schema evolution in Parquet Vectorization 
reader. Removes extra q.out file

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-15 Thread KaiXu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366489#comment-16366489
 ] 

KaiXu commented on HIVE-18553:
--

Thanks for your email. I am taking annual leave, email responses can be 
delayed. Sorry for any inconveniences.


> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-15 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366488#comment-16366488
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Looks like {{schema_evol_par_vec_table.q.out}} doesn't need to be in this 
patch. There is a no corresponding .q file introduced and it looks like a copy 
of newly added {{schema_evol_par_vec_table_dictionary_encoding.q}}

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-14 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365071#comment-16365071
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Test failures are unrelated. Patch merged to master branch. Thanks for your 
contribution [~Ferd]

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364974#comment-16364974
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910631/HIVE-18553.91.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 13103 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=240)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=78)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=174)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=160)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=179)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=121)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=250)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=221)
org.apache.hadoop.hive.metastore.client.TestFunctions.testGetFunctionNullDatabase[Embedded]
 (batchId=205)
org.apache.hadoop.hive.metastore.client.TestTablesGetExists.testGetAllTablesCaseInsensitive[Embedded]
 (batchId=205)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=205)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=224)
org.apache.hadoop.hive.ql.TestTxnNoBucketsVectorized.testInsertFromUnion 
(batchId=280)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=187)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.alterIndex 
(batchId=242)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.createIndex 
(batchId=242)
org.apache.hive.hcatalog.listener.TestDbNotificationListener.dropIndex 
(batchId=242)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=234)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=234)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveConflictKill
 (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9218/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9218/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9218/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910631 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364892#comment-16364892
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} ql: The patch generated 0 new + 68 unchanged - 230 
fixed = 68 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 15m 52s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / a2d22b4 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9218/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9218/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-14 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364803#comment-16364803
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


precommit didn't trigger for this for some reason. Reattched the latest patch 
as HIVE-18553.91.patch

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.91.patch, 
> HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363373#comment-16363373
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


+1 (pending tests) LGTM.

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.11.patch, 
> HIVE-18553.2.patch, HIVE-18553.3.patch, HIVE-18553.4.patch, 
> HIVE-18553.5.patch, HIVE-18553.6.patch, HIVE-18553.7.patch, 
> HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> For schema evolution, it includes the following points:
> 1. column changes
> column reorder
> column add, column delete
> column rename
> 2. type conversion
> low precision to high precision
> type to String
> For 1st type, current the code is not supporting the column addition 
> operation. Detailed error is as follows:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363303#comment-16363303
 ] 

Ferdinand Xu commented on HIVE-18553:
-

[~vihangk1], they're related to the change of boolean type handling. A fix is 
included in 10th patch.

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.10.patch, HIVE-18553.2.patch, 
> HIVE-18553.3.patch, HIVE-18553.4.patch, HIVE-18553.5.patch, 
> HIVE-18553.6.patch, HIVE-18553.7.patch, HIVE-18553.8.patch, 
> HIVE-18553.9.patch, HIVE-18553.patch, test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52) 
> ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459) 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363018#comment-16363018
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910349/HIVE-18553.9.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 30 failed/errored test(s), 13154 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
 (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testExecuteJDOQL (batchId=226)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testListFSRoot (batchId=226)
org.apache.hadoop.hive.metastore.TestHiveMetaTool.testUpdateFSRootLocation 
(batchId=226)
org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocks (batchId=225)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestJdbcWithMiniLlap.testLlapInputFormatEndToEnd 
(batchId=236)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9195/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9195/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9195/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 30 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910349 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362958#comment-16362958
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} ql: The patch generated 0 new + 68 unchanged - 230 
fixed = 68 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
13s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 8cf36e7 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9195/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9195/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362928#comment-16362928
 ] 

Vihang Karajgaonkar commented on HIVE-18553:


Hi [~Ferd] are these failures related?

org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> {code}
> Caused by: java.lang.IllegalArgumentException: [ts] BINARY is not in the 
> store: [[i1] INT32, [i2] INT32, [t1] INT32, [t2] INT32] 3
> at 
> org.apache.parquet.hadoop.ColumnChunkPageReadStore.getPageReader(ColumnChunkPageReadStore.java:160)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.buildVectorizedParquetReader(VectorizedParquetRecordReader.java:479)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:432)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:393)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:345)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:88)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:167)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:52)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:229)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:142)
>  ~[hive-exec-3.0.0-SNAPSHOT.jar:3.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
>  ~[hadoop-mapreduce-client-core-3.0.0-alpha3-cdh6.x-SNAPSHOT.jar:?]
> at 
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185) 
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362924#comment-16362924
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910349/HIVE-18553.9.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 13178 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mapjoin_hook] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_1]
 (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.client.TestTablesCreateDropAlterTruncate.testAlterTableNullStorageDescriptorInNew[Embedded]
 (batchId=206)
org.apache.hadoop.hive.metastore.client.TestTablesList.testListTableNamesByFilterNullDatabase[Embedded]
 (batchId=206)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadEqualOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadLessOneBatch
 (batchId=271)
org.apache.hadoop.hive.ql.io.parquet.TestVectorizedListColumnReader.testListReadMoreOneBatch
 (batchId=271)
org.apache.hive.beeline.TestBeeLineWithArgs.testEscapeCRLFOffInDSVOutput 
(batchId=232)
org.apache.hive.beeline.TestBeeLineWithArgs.testQueryProgress (batchId=232)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestTriggersMoveWorkloadManager.testTriggerMoveBackKill 
(batchId=236)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9194/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9194/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9194/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910349 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1   

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362854#comment-16362854
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} ql: The patch generated 0 new + 68 unchanged - 230 
fixed = 68 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 5ddd585 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9194/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9194/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.9.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
> {code}
> Following exception is seen in the logs
> 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360527#comment-16360527
 ] 

Hive QA commented on HIVE-18553:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12910162/HIVE-18553.8.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 13169 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries]
 (batchId=241)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=79)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_move_tbl]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] 
(batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_opt_shuffle_serde]
 (batchId=180)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] 
(batchId=122)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query1] 
(batchId=251)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=222)
org.apache.hadoop.hive.metastore.TestMarkPartition.testMarkingPartitionSet 
(batchId=215)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=257)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.checkExpectedLocks 
(batchId=294)
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMetadataOperationLocks 
(batchId=294)
org.apache.hive.beeline.cli.TestHiveCli.testNoErrorDB (batchId=188)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=235)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=235)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=235)
org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithBinary.org.apache.hive.service.cli.thrift.TestThriftCLIServiceWithBinary
 (batchId=233)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/9169/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/9169/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-9169/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12910162 - PreCommit-HIVE-Build

> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing 

[jira] [Commented] (HIVE-18553) Support schema evolution in Parquet Vectorization reader

2018-02-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360461#comment-16360461
 ] 

Hive QA commented on HIVE-18553:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 1 new + 68 unchanged - 230 
fixed = 69 total (was 298) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 49 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 2338846 |
| Default Java | 1.8.0_111 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9169/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9169/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-9169/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Support schema evolution in Parquet Vectorization reader
> 
>
> Key: HIVE-18553
> URL: https://issues.apache.org/jira/browse/HIVE-18553
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0, 2.4.0, 2.3.2
>Reporter: Vihang Karajgaonkar
>Assignee: Ferdinand Xu
>Priority: Major
> Attachments: HIVE-18553.2.patch, HIVE-18553.3.patch, 
> HIVE-18553.4.patch, HIVE-18553.5.patch, HIVE-18553.6.patch, 
> HIVE-18553.7.patch, HIVE-18553.8.patch, HIVE-18553.patch, 
> test_result_based_on_HIVE-18553.xlsx
>
>
> VectorizedParquetReader throws an exception when trying to reading from a 
> parquet table on which new columns are added. Steps to reproduce below:
> {code}
> 0: jdbc:hive2://localhost:1/default> desc test_p;
> +---++--+
> | col_name  | data_type  | comment  |
> +---++--+
> | t1| tinyint|  |
> | t2| tinyint|  |
> | i1| int|  |
> | i2| int|  |
> +---++--+
> 0: jdbc:hive2://localhost:1/default> set hive.fetch.task.conversion=none;
> 0: jdbc:hive2://localhost:1/default> set 
> hive.vectorized.execution.enabled=true;
> 0: jdbc:hive2://localhost:1/default> alter table test_p add columns (ts 
> timestamp);
> 0: jdbc:hive2://localhost:1/default> select * from test_p;
> Error: Error while processing statement: FAILED: Execution Error, return code 
> 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask