[jira] [Updated] (HIVE-16976) DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

2019-01-02 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16976:
--
Attachment: HIVE-16976.5.patch

> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
>
> Key: HIVE-16976
> URL: https://issues.apache.org/jira/browse/HIVE-16976
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1, 3.0.0
>Reporter: Gopal V
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-16976.1.patch, HIVE-16976.2.patch, 
> HIVE-16976.3.patch, HIVE-16976.4.patch, HIVE-16976.5.patch
>
>
> Tez DPP does not kick in for scenarios where a user filters with a comparison 
> clause (<, >, BETWEEN) instead of a JOIN/IN clause.
> {code}
> explain select count(1) from store_sales where ss_sold_date_sk > (select 
> max(d_Date_sk) from date_dim where d_year = 2017);
> Warning: Map Join MAPJOIN[21][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2 vectorized, llap
>   File Output Operator [FS_36]
> Group By Operator [GBY_35] (rows=1 width=8)
>   Output:["_col0"],aggregations:["count(VALUE._col0)"]
> <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>   PARTITION_ONLY_SHUFFLE [RS_34]
> Group By Operator [GBY_33] (rows=1 width=8)
>   Output:["_col0"],aggregations:["count(1)"]
>   Select Operator [SEL_32] (rows=9600142089 width=16)
> Filter Operator [FIL_31] (rows=9600142089 width=16)
>   predicate:(_col0 > _col1)
>   Map Join Operator [MAPJOIN_30] (rows=28800426268 width=16)
> Conds:(Inner),Output:["_col0","_col1"]
>   <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_28]
>   Group By Operator [GBY_27] (rows=1 width=8)
> Output:["_col0"],aggregations:["max(VALUE._col0)"]
>   <-Map 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
> PARTITION_ONLY_SHUFFLE [RS_26]
>   Group By Operator [GBY_25] (rows=1 width=8)
> Output:["_col0"],aggregations:["max(d_date_sk)"]
> Select Operator [SEL_24] (rows=652 width=12)
>   Output:["d_date_sk"]
>   Filter Operator [FIL_23] (rows=652 width=12)
> predicate:(d_year = 2017)
> TableScan [TS_2] (rows=73049 width=12)
>   
> tpcds_bin_partitioned_newschema_orc_1@date_dim,date_dim,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk","d_year"]
>   <-Select Operator [SEL_29] (rows=28800426268 width=8)
>   Output:["_col0"]
>   TableScan [TS_0] (rows=28800426268 width=172)
> 
> tpcds_bin_partitioned_newschema_orc_1@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE
> {code}
> The SyntheticJoinPredicate is only injected for equi joins, not for < or > 
> scalar subqueries.
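The transitivity requested here can be sketched in a few lines. This is a minimal Python illustration, not Hive code; all names and values are hypothetical. It shows how a min/max bound computed on the subquery side (e.g. max(d_date_sk) where d_year = 2017) could be pushed to the probe side as a synthetic range predicate so that non-matching partitions are skipped:

```python
# Illustrative sketch only -- not Hive code; all names are hypothetical.
# A bound computed on the build/subquery side is pushed to the probe side
# as a synthetic range predicate, so partitions that cannot match are skipped.
def prune_partitions(partition_keys, op, bound):
    """Keep only partition keys that can satisfy `key <op> bound`."""
    predicates = {
        ">": lambda k: k > bound,
        "<": lambda k: k < bound,
        "between": lambda k: bound[0] <= k <= bound[1],
    }
    pred = predicates[op]
    return [k for k in partition_keys if pred(k)]

keys = [2452001, 2452002, 2452003, 2452004]
print(prune_partitions(keys, ">", 2452002))                   # [2452003, 2452004]
print(prune_partitions(keys, "between", (2452002, 2452003)))  # [2452002, 2452003]
```

With an equi-join the bound degenerates to a set-membership check, which is the case Hive already handles; the sketch shows why the same partition-elimination idea extends naturally to <, > and BETWEEN.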



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732721#comment-16732721
 ] 

Hive QA commented on HIVE-20960:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
40s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
37s{color} | {color:red} ql: The patch generated 2 new + 161 unchanged - 1 
fixed = 163 total (was 162) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15469/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15469/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15469/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20960.01.patch, HIVE-20960.02.patch, 
> HIVE-20960.03.patch
>
>
> Now that we have HIVE-20823, we can tell from a directory's name whether it 
> was produced by the compactor, so {{CompactorMR.createCompactorMarker()}} can 
> be removed.
>  
>  





[jira] [Commented] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732710#comment-16732710
 ] 

Hive QA commented on HIVE-21078:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953506/HIVE-21078.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 198 failed/errored test(s), 15761 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_subquery] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_create_temp_table]
 (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_7] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_partlvl] 
(batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnstats_quoting] 
(batchId=96)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[constprog3] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_transactional_full_acid]
 (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cte_mat_4] (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_tmp_table] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_locks] 
(batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[foldts] (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_tmp_table] 
(batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_10] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_9] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_acid_no_masking] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_all] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_cttas] (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mm_iow_temp] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partition_coltype_literals]
 (batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_create_table_temp_table]
 (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_partial_size] 
(batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_display_colstats_tbllvl]
 (batchId=84)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_gb1] 
(batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_join1] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_names] 
(batchId=53)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_options1] 
(batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_precedence] 
(batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_subquery1] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_truncate] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[temp_table_windowing_expressions]
 (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[tunable_ndv] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_const] 
(batchId=67)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join_no_keys]
 (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_tablesample_rows] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=25)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin3] 
(batchId=13)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl] 
(batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] 
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_cttas] 
(batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] 
(batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[temp_table_external]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[approx_distinct]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[check_constraint]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cte_mat_4] 
(batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[current_date_timestamp]
 (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3]
 (batchId=179)

[jira] [Updated] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21078:
--
Labels: pull-request-available  (was: )

> Replicate column and table level statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21078.01.patch
>
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables. The same for partitioned tables will be handled in a 
> separate sub-task.





[jira] [Commented] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732707#comment-16732707
 ] 

ASF GitHub Bot commented on HIVE-21078:
---

GitHub user ashutosh-bapat opened a pull request:

https://github.com/apache/hive/pull/511

HIVE-21078: Replicate column and table level statistics for unpartitioned 
Hive tables

@maheshk114, @sankarh can you please review?



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ashutosh-bapat/hive hive21078

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/hive/pull/511.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #511


commit db98502a44f69f255924231b03e2145248c9be0f
Author: Ashutosh Bapat 
Date:   2018-12-19T04:49:29Z

HIVE-21078: Replicate column and table level statistics for unpartitioned 
Hive tables

The column statistics are included as part of the Table object during 
bootstrap dump and loaded when the corresponding table is created on the 
replica.

During incremental dump and load, an UpdateTableColStats event is used to 
replicate the statistics.

In both cases, the statistics are replicated only when the data is replicated.

Ashutosh Bapat
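The gating rule in the commit message above can be sketched as follows. This is a hedged Python illustration of the described behavior only; the function and field names are hypothetical and are not actual Hive classes or APIs:

```python
# Hypothetical sketch of the rule described above -- not Hive code.
# Bootstrap: stats ride along with the Table object; incremental: an
# UpdateTableColStats-style event carries them. Either way, statistics
# are applied only when the data itself is being replicated.
def replicate_col_stats(phase, data_replicated, table, stats):
    if not data_replicated:
        return table  # statistics are skipped when data is not replicated
    if phase in ("bootstrap", "incremental"):
        table = dict(table, col_stats=stats)
    return table

t = replicate_col_stats("bootstrap", True, {"name": "t1"}, {"c1": {"ndv": 10}})
print(t.get("col_stats"))   # {'c1': {'ndv': 10}}
t2 = replicate_col_stats("incremental", False, {"name": "t1"}, {"c1": {}})
print(t2.get("col_stats"))  # None
```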




> Replicate column and table level statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21078.01.patch
>
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables. The same for partitioned tables will be handled in a 
> separate sub-task.





[jira] [Commented] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732699#comment-16732699
 ] 

Hive QA commented on HIVE-21078:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
15s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
2s{color} | {color:blue} standalone-metastore/metastore-server in master has 
188 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
41s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 10 new + 535 unchanged - 1 
fixed = 545 total (was 536) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} itests/hive-unit: The patch generated 3 new + 154 
unchanged - 0 fixed = 157 total (was 154) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
50s{color} | {color:red} ql generated 1 new + 2312 unchanged - 0 fixed = 2313 
total (was 2312) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
49s{color} | {color:red} standalone-metastore_metastore-common generated 1 new 
+ 17 unchanged - 0 fixed = 18 total (was 17) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} standalone-metastore_metastore-server generated 0 
new + 49 unchanged - 1 fixed = 49 total (was 50) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} hcatalog-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} hive-unit in the patch passed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 42m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Class org.apache.hadoop.hive.ql.plan.MoveWork defines non-transient 
non-serializable instance field replicationSpec  In MoveWork.java:instance 
field replicationSpec  In MoveWork.java |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 

[jira] [Commented] (HIVE-20911) External Table Replication for Hive

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732681#comment-16732681
 ] 

Hive QA commented on HIVE-20911:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953565/HIVE-20911.08.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 15694 tests 
executed
*Failed tests:*
{noformat}
TestAlterTableMetadata - did not produce a TEST-*.xml file (likely timed out) 
(batchId=250)
TestReplAcidTablesWithJsonMessage - did not produce a TEST-*.xml file (likely 
timed out) (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testBootstrapLoadMigrationManagedToAcid
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testBootstrapLoadMigrationToAcidWithMoveOptimization
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testIncrementalLoadMigrationManagedToAcid
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testIncrementalLoadMigrationManagedToAcidAllOp
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testIncrementalLoadMigrationManagedToAcidFailure
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testIncrementalLoadMigrationManagedToAcidFailurePart
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplTableMigrationWithJsonFormat.testIncrementalLoadMigrationToAcidWithMoveOptimization
 (batchId=250)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testBootstrapLoadMigrationManagedToAcid
 (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testBootstrapLoadMigrationToAcidWithMoveOptimization
 (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcid
 (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcidFailure
 (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationManagedToAcidFailurePart
 (batchId=244)
org.apache.hadoop.hive.ql.parse.TestReplicationWithTableMigration.testIncrementalLoadMigrationToAcidWithMoveOptimization
 (batchId=244)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15467/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15467/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15467/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953565 - PreCommit-HIVE-Build

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, 
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, 
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch, 
> HIVE-20911.08.patch, HIVE-20911.08.patch
>
>
> External tables are currently not replicated as part of Hive replication. As 
> part of this jira we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on the source cluster. It can be 
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to an external table's directories can happen without Hive 
> knowing about them, we cannot capture the relevant events whenever new data 
> is added or removed; instead, we will have to copy the data from the source 
> path to the target path for external tables every time we run incremental 
> replication.
> ** this will require incremental *repl dump*  to now create an additional 
> file *\_external\_tables\_info* with data in the following form 
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
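The proposed *\_external\_tables\_info* line format can be sketched in a few lines of Python. Only the file name and the `tableName,base64Encoded(tableDataLocation)` format come from this jira; the helper names and paths below are hypothetical, and this is not the actual Hive implementation:

```python
import base64

# Sketch of writing and reading one line of the proposed _external_tables_info
# file, whose lines look like "tableName,base64Encoded(tableDataLocation)".
# Illustrative only -- not the actual Hive implementation.
def encode_entry(table_name, data_location):
    encoded = base64.b64encode(data_location.encode("utf-8")).decode("ascii")
    return f"{table_name},{encoded}"

def decode_entry(line):
    # split on the first comma only; the base64 payload itself has no commas
    name, encoded = line.split(",", 1)
    return name, base64.b64decode(encoded).decode("utf-8")

line = encode_entry("ext_sales", "hdfs://src:8020/warehouse/ext_sales")
print(line)
print(decode_entry(line))  # ('ext_sales', 'hdfs://src:8020/warehouse/ext_sales')
```

Base64-encoding the location keeps the one-line-per-table format unambiguous even if a path were to contain a comma or other special characters.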
> In case there are different partitions in the table 

[jira] [Updated] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20960:
--
Attachment: HIVE-20960.03.patch

> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20960.01.patch, HIVE-20960.02.patch, 
> HIVE-20960.03.patch
>
>
> Now that we have HIVE-20823, we can tell from a directory's name whether it 
> was produced by the compactor, so {{CompactorMR.createCompactorMarker()}} can 
> be removed.
>  
>  





[jira] [Commented] (HIVE-20911) External Table Replication for Hive

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732678#comment-16732678
 ] 

Hive QA commented on HIVE-20911:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
48s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
40s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
24s{color} | {color:blue} testutils/ptest2 in master has 24 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  
8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 13 new + 431 unchanged - 12 
fixed = 444 total (was 443) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
20s{color} | {color:red} itests/hive-unit: The patch generated 55 new + 729 
unchanged - 47 fixed = 784 total (was 776) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 
44s{color} | {color:red} ql generated 2 new + 2311 unchanged - 1 fixed = 2313 
total (was 2312) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  The field 
org.apache.hadoop.hive.ql.exec.repl.ReplLoadWork.pathsToCopyIterator is 
transient but isn't set by deserialization  In ReplLoadWork.java:but isn't set 
by deserialization  In ReplLoadWork.java |
|  |  Write to static field 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.numIteration
 from instance method 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(DriverContext,
 Hive, Logger, ReplLoadWork, TaskTracker)  At 
IncrementalLoadTasksBuilder.java:from instance method 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(DriverContext,
 Hive, Logger, ReplLoadWork, TaskTracker)  At 
IncrementalLoadTasksBuilder.java:[line 100] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15467/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Commented] (HIVE-19521) Add docker support for standalone-metastore

2019-01-02 Thread Ke Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732653#comment-16732653
 ] 

Ke Zhu commented on HIVE-19521:
---

Now I use a Dockerfile like [this 
gist|https://gist.github.com/shawnzhu/d19507336145053e4b871dd1271aed32] to 
create a Docker image for a standalone metastore.

I'm interested in how Docker support works in the Hive code base, and I'd be 
glad to contribute to this issue to provide out-of-the-box support, as long as 
I have pointers to get started.

> Add docker support for standalone-metastore
> ---
>
> Key: HIVE-19521
> URL: https://issues.apache.org/jira/browse/HIVE-19521
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sriharsha Chintalapani
>Assignee: Sriharsha Chintalapani
>Priority: Major
>
> Build docker module to run hive metastore & registry servers



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20911) External Table Replication for Hive

2019-01-02 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-20911:
---
Attachment: HIVE-20911.08.patch

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, 
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, 
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch, 
> HIVE-20911.08.patch, HIVE-20911.08.patch
>
>
> External tables are currently not replicated as part of Hive replication. As 
> part of this jira we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. This base path will be prefixed 
> to the path of the same external table on the source cluster. It can be 
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to an external table's directories can happen without Hive 
> knowing about them, we cannot capture the relevant events whenever new data is 
> added or removed. We will therefore have to copy the data from the source path 
> to the target path for external tables every time we run incremental replication.
> ** this will require incremental *repl dump*  to now create an additional 
> file *\_external\_tables\_info* with data in the following form 
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If different partitions of a table point to different locations, there will be 
> multiple entries in the file for the same table name, each entry pointing to a 
> different partition location. Partitions created in a table without 
> specifying the _set location_ command reside within the table's data location 
> and hence do not get separate entries in the file above.
> ** *repl load* will read the  *\_external\_tables\_info* to identify what 
> locations are to be copied from source to target and create corresponding 
> tasks for them.
> * New external tables will be created metadata-only, with no data copied as 
> part of regular tasks during incremental or bootstrap load.
> * Bootstrap dump will also create  *\_external\_tables\_info*, which will be 
> used to copy data from source to target as part of bootstrap load.
> * Bootstrap load will create a DAG that can use parallelism in the execution 
> phase; the HDFS copy related tasks are created once the bootstrap phase is 
> complete.
> * Since incremental load results in a DAG with only sequential execution 
> (events applied in sequence), to effectively use the parallelism capability in 
> execution mode we create the tasks for HDFS copy along with the incremental DAG. 
> This requires a few basic calculations to approximately meet the configured 
> value in  "hive.repl.approx.max.load.tasks" 
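The *\_external\_tables\_info* line format described above ({{tableName,base64Encoded(tableDataLocation)}}, one line per table or partition location) can be sketched as follows. This is an illustration of the file format only, not the actual repl dump code; the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ExternalTablesInfo {
    // One line per table (or per partition location): tableName,base64Encoded(location)
    static String encode(String tableName, String location) {
        return tableName + ","
                + Base64.getEncoder().encodeToString(location.getBytes(StandardCharsets.UTF_8));
    }

    // Split on the first comma only; the base64 payload never contains commas
    static String[] decode(String line) {
        int comma = line.indexOf(',');
        String name = line.substring(0, comma);
        String loc = new String(Base64.getDecoder().decode(line.substring(comma + 1)),
                StandardCharsets.UTF_8);
        return new String[] { name, loc };
    }

    public static void main(String[] args) {
        String entry = encode("store_sales", "hdfs://src/warehouse/ext/store_sales");
        String[] out = decode(entry);
        System.out.println(out[0] + " -> " + out[1]);
    }
}
```

Because the base64 alphabet contains no commas, splitting on the first comma recovers both fields even when the table location itself contains commas.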



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732589#comment-16732589
 ] 

Hive QA commented on HIVE-17020:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953543/HIVE-17020.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 15759 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[implicit_cast_during_insert]
 (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge3] (batchId=63)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=159)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15461/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15461/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15461/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953543 - PreCommit-HIVE-Build

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch, HIVE-17020.4.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
>  /  \
> SEL[3]  SEL[4]
>   |       |
> RS[5]   FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-18884) Simplify Logging in Hive Metastore Client

2019-01-02 Thread Mani M (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732585#comment-16732585
 ] 

Mani M commented on HIVE-18884:
---

Hi Peter,
Thanks for the info.
I have tried with patches 03, 04 and 05. The test cases are failing in
classes other than the class file that was changed.
How can we avoid this? Any suggestions?
With Regards
M.Mani
+61 432 461 087





> Simplify Logging in Hive Metastore Client
> -
>
> Key: HIVE-18884
> URL: https://issues.apache.org/jira/browse/HIVE-18884
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Mani M
>Priority: Minor
>  Labels: noob
> Attachments: HIVE.18884.02.patch, HIVE.18884.03.patch, 
> HIVE.18884.04.patch, HIVE.18884.05.patch, HIVE.18884.patch
>
>
> https://github.com/apache/hive/blob/4047befe48c8f762c58d8854e058385c1df151c6/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> The current logging is:
> {code}
> 2018-02-26 07:02:44,883  INFO  hive.metastore: [HiveServer2-Handler-Pool: 
> Thread-65]: Trying to connect to metastore with URI 
> thrift://host.company.com:9083
> 2018-02-26 07:02:44,892  INFO  hive.metastore: [HiveServer2-Handler-Pool: 
> Thread-65]: Connected to metastore.
> 2018-02-26 07:02:44,892  INFO  hive.metastore: [HiveServer2-Handler-Pool: 
> Thread-65]: Opened a connection to metastore, current connections: 2
> {code}
> Please simplify to something like:
> {code}
> 2018-02-26 07:02:44,892  INFO  hive.metastore: [HiveServer2-Handler-Pool: 
> Thread-65]: Opened a connection to the Metastore Server (URI 
> thrift://host.company.com:9083), current connections: 2
> ... or ...
> 2018-02-26 07:02:44,892  ERROR  hive.metastore: [HiveServer2-Handler-Pool: 
> Thread-65]: Failed to connect to the Metastore Server (URI 
> thrift://host.company.com:9083)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20960:
--
Attachment: HIVE-20960.02.patch

> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20960.01.patch, HIVE-20960.02.patch
>
>
> Now that we have HIVE-20823, we can tell from a directory's name whether it 
> was produced by the compactor, and {{CompactorMR.createCompactorMarker()}} can be removed.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732566#comment-16732566
 ] 

Hive QA commented on HIVE-17020:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
41s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15461/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15461/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch, HIVE-17020.4.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
>  /  \
> SEL[3]  SEL[4]
>   |       |
> RS[5]   FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri updated HIVE-20977:
--
Attachment: HIVE-20977.4.patch

> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> 
>
> Key: HIVE-20977
> URL: https://issues.apache.org/jira/browse/HIVE-20977
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
> Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch, 
> HIVE-20977.3.patch, HIVE-20977.4.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as 
> get_partitions), but only if there is an event listener attached. However, 
> even then it is unnecessary if the event listener is not interested in the 
> read-table event.
> For example, the TransactionalValidationListener's onEvent looks like this
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, 
> NoSuchObjectException,
> InvalidOperationException {
>   switch (context.getEventType()) {
> case CREATE_TABLE:
>   handle((PreCreateTableEvent) context);
>   break;
> case ALTER_TABLE:
>   handle((PreAlterTableEvent) context);
>   break;
> default:
>   //no validation required..
>   }
> }{code}
>  
> Note that for read table events it is a no-op. The problem is that 
> get_table is evaluated when creating the PreReadTableEvent, only to be 
> ignored!
> Look at the code below: {{getMS().getTable(..)}} is evaluated irrespective 
> of whether the listener uses it or not.
> {code:java}
> private void fireReadTablePreEvent(String catName, String dbName, String 
> tblName)
> throws MetaException, NoSuchObjectException {
>   if(preListeners.size() > 0) {
> // do this only if there is a pre event listener registered (avoid 
> unnecessary
> // metastore api call)
> Table t = getMS().getTable(catName, dbName, tblName);
> if (t == null) {
>   throw new NoSuchObjectException(TableName.getQualified(catName, dbName, 
> tblName)
>   + " table not found");
> }
> firePreEvent(new PreReadTableEvent(t, this));
>   }
> }
> {code}
> This can be improved by using a {{Supplier}} and lazily evaluating the table 
> when needed (evaluated once, the first time it is called, and memoized after that).
> *Motivation*
> Whenever a partition call occurs (get_partition, etc.), we fire the 
> PreReadTableEvent. This affects performance since it fetches the table even 
> if it is not being used. This change will improve performance on the 
> get_partition calls.
>  
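The Supplier-based lazy evaluation proposed above can be sketched as a memoizing wrapper. This is an illustrative sketch, not the actual HiveMetaStore change (Guava's {{Suppliers.memoize}} offers equivalent behavior); the class name is hypothetical:

```java
import java.util.function.Supplier;

public class MemoizedSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private T value;          // cached result of the first evaluation
    private boolean computed; // whether the delegate has run yet

    public MemoizedSupplier(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
        if (!computed) {
            value = delegate.get(); // evaluated only on first use
            computed = true;
        }
        return value;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        Supplier<String> table = new MemoizedSupplier<>(() -> {
            calls[0]++; // stands in for the expensive getMS().getTable(...) call
            return "tbl";
        });
        System.out.println("before get(): " + calls[0]); // listener never calls get() -> no fetch
        table.get();
        table.get();
        System.out.println("after two get()s: " + calls[0]); // fetched exactly once
    }
}
```

A listener such as TransactionalValidationListener, which ignores read-table events, would then never trigger the metastore fetch at all.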



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16957:
---
Attachment: HIVE-16957.08.patch

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.07.patch, 
> HIVE-16957.08.patch, HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, they 
> create an 'analyze table t compute statistics for columns' statement, use 
> ColumnStatsSemanticAnalyzer to parse it, and connect the resulting plan to the 
> existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or CREATE 
> MATERIALIZED VIEW is that the table object does not exist yet, hence we 
> cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732554#comment-16732554
 ] 

Hive QA commented on HIVE-16957:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953542/HIVE-16957.07.patch

{color:green}SUCCESS:{color} +1 due to 30 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15759 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid2] 
(batchId=161)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15460/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15460/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15460/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953542 - PreCommit-HIVE-Build

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.07.patch, 
> HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, they 
> create an 'analyze table t compute statistics for columns' statement, use 
> ColumnStatsSemanticAnalyzer to parse it, and connect the resulting plan to the 
> existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or CREATE 
> MATERIALIZED VIEW is that the table object does not exist yet, hence we 
> cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20960:
--
Status: Patch Available  (was: Open)

> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20960.01.patch
>
>
> Now that we have HIVE-20823, we can tell from a directory's name whether it 
> was produced by the compactor, and {{CompactorMR.createCompactorMarker()}} can be removed.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20960) remove CompactorMR.createCompactorMarker()

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20960:
--
Attachment: HIVE-20960.01.patch

> remove CompactorMR.createCompactorMarker()
> --
>
> Key: HIVE-20960
> URL: https://issues.apache.org/jira/browse/HIVE-20960
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20960.01.patch
>
>
> Now that we have HIVE-20823, we can tell from a directory's name whether it 
> was produced by the compactor, and {{CompactorMR.createCompactorMarker()}} can be removed.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732537#comment-16732537
 ] 

Hive QA commented on HIVE-16957:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
56s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
42s{color} | {color:blue} ql in master has 2312 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 6 new + 563 unchanged - 7 
fixed = 569 total (was 570) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
6s{color} | {color:red} ql generated 1 new + 2310 unchanged - 2 fixed = 2311 
total (was 2312) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
56s{color} | {color:red} ql generated 2 new + 98 unchanged - 2 fixed = 100 
total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  
org.apache.hadoop.hive.ql.parse.ColumnStatsSemanticAnalyzer.genPartitionClause(Table,
 Map) makes inefficient use of keySet iterator instead of entrySet iterator  At 
ColumnStatsSemanticAnalyzer.java:of keySet iterator instead of entrySet 
iterator  At ColumnStatsSemanticAnalyzer.java:[line 160] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15460/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15460/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15460/yetus/new-findbugs-ql.html
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15460/yetus/diff-javadoc-javadoc-ql.txt
 |
| modules | C: ql itests/hcatalog-unit itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15460/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, 

[jira] [Updated] (HIVE-16976) DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN

2019-01-02 Thread Deepak Jaiswal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-16976:
--
Attachment: HIVE-16976.4.patch

> DPP: SyntheticJoinPredicate transitivity for < > and BETWEEN
> 
>
> Key: HIVE-16976
> URL: https://issues.apache.org/jira/browse/HIVE-16976
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1, 3.0.0
>Reporter: Gopal V
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-16976.1.patch, HIVE-16976.2.patch, 
> HIVE-16976.3.patch, HIVE-16976.4.patch
>
>
> Tez DPP does not kick in for scenarios where a user wants to run a comparison 
> clause instead of a JOIN/IN clause.
> {code}
> explain select count(1) from store_sales where ss_sold_date_sk > (select 
> max(d_Date_sk) from date_dim where d_year = 2017);
> Warning: Map Join MAPJOIN[21][bigTable=?] in task 'Map 1' is a cross product
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Map 1 <- Reducer 4 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2 vectorized, llap
>   File Output Operator [FS_36]
> Group By Operator [GBY_35] (rows=1 width=8)
>   Output:["_col0"],aggregations:["count(VALUE._col0)"]
> <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>   PARTITION_ONLY_SHUFFLE [RS_34]
> Group By Operator [GBY_33] (rows=1 width=8)
>   Output:["_col0"],aggregations:["count(1)"]
>   Select Operator [SEL_32] (rows=9600142089 width=16)
> Filter Operator [FIL_31] (rows=9600142089 width=16)
>   predicate:(_col0 > _col1)
>   Map Join Operator [MAPJOIN_30] (rows=28800426268 width=16)
> Conds:(Inner),Output:["_col0","_col1"]
>   <-Reducer 4 [BROADCAST_EDGE] vectorized, llap
> BROADCAST [RS_28]
>   Group By Operator [GBY_27] (rows=1 width=8)
> Output:["_col0"],aggregations:["max(VALUE._col0)"]
>   <-Map 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap
> PARTITION_ONLY_SHUFFLE [RS_26]
>   Group By Operator [GBY_25] (rows=1 width=8)
> Output:["_col0"],aggregations:["max(d_date_sk)"]
> Select Operator [SEL_24] (rows=652 width=12)
>   Output:["d_date_sk"]
>   Filter Operator [FIL_23] (rows=652 width=12)
> predicate:(d_year = 2017)
> TableScan [TS_2] (rows=73049 width=12)
>   
> tpcds_bin_partitioned_newschema_orc_1@date_dim,date_dim,Tbl:COMPLETE,Col:COMPLETE,Output:["d_date_sk","d_year"]
>   <-Select Operator [SEL_29] (rows=28800426268 width=8)
>   Output:["_col0"]
>   TableScan [TS_0] (rows=28800426268 width=172)
> 
> tpcds_bin_partitioned_newschema_orc_1@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE
> {code}
> The SyntheticJoinPredicate is only injected for equi joins, not for < or > 
> scalar subqueries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Karthik Manamcheri (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732524#comment-16732524
 ] 

Karthik Manamcheri commented on HIVE-21044:
---

Finally, all tests passed! [~pvary] can you merge this change? Thank you.

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch, HIVE-21044.4.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. 
> Currently we support JMX, JSON and Console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.
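Once such an option exists, enabling it would presumably look like any other value of this setting. A hedged sketch of a hive-site.xml fragment; the exact token accepted for the new reporter is defined by the patch:

```xml
<property>
  <name>hive.service.metrics.reporter</name>
  <!-- Comma-separated list; SLF4J would join the existing JMX/JSON/CONSOLE options -->
  <value>SLF4J,JMX</value>
</property>
```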



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732517#comment-16732517
 ] 

Hive QA commented on HIVE-21044:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953537/HIVE-21044.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15760 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15459/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15459/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15459/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953537 - PreCommit-HIVE-Build

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch, HIVE-21044.4.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. Currently 
> we support JMX, JSON, and Console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Updated] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2019-01-02 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17020:
---
Status: Patch Available  (was: Reopened)

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch, HIVE-17020.4.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
>  /  \
> SEL[3]  SEL[4]
>   |       |
> RS[5]   FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.





[jira] [Updated] (HIVE-17020) Aggressive RS dedup can incorrectly remove OP tree branch

2019-01-02 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-17020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17020:
---
Attachment: HIVE-17020.4.patch

> Aggressive RS dedup can incorrectly remove OP tree branch
> -
>
> Key: HIVE-17020
> URL: https://issues.apache.org/jira/browse/HIVE-17020
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-17020.1.patch, HIVE-17020.2.patch, 
> HIVE-17020.3.patch, HIVE-17020.4.patch
>
>
> Suppose we have an OP tree like this:
> {noformat}
>  ...
>   |
>  RS[1]
>   |
> SEL[2]
>  /  \
> SEL[3]  SEL[4]
>   |       |
> RS[5]   FS[6]
>   |
>  ... 
> {noformat}
> When doing aggressive RS dedup, we'll remove all the operators between RS5 
> and RS1, and thus the branch containing FS6 is lost.





[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732486#comment-16732486
 ] 

Hive QA commented on HIVE-21044:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
16s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
3s{color} | {color:blue} standalone-metastore/metastore-server in master has 
188 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15459/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common 
standalone-metastore/metastore-server U: standalone-metastore |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15459/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch, HIVE-21044.4.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. Currently 
> we support JMX, JSON, and Console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Updated] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16957:
---
Attachment: HIVE-16957.07.patch

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.07.patch, 
> HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer as other operations do. In particular, they 
> create an 'analyze table t compute statistics for columns' statement, use 
> ColumnStatsSemanticAnalyzer to parse it, and connect resulting plan to 
> existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or CREATE 
> MATERIALIZED VIEW is that the table object does not exist yet, hence we 
> cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}





[jira] [Commented] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732468#comment-16732468
 ] 

Hive QA commented on HIVE-20977:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953526/HIVE-20977.3.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 59 failed/errored test(s), 15761 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter 
(batchId=223)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter 
(batchId=224)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStoreZK.testPartitionFilter 
(batchId=226)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStoreZKBindHost.testPartitionFilter
 (batchId=230)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter
 (batchId=221)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter 
(batchId=219)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter 
(batchId=229)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.getPartitionsByNamesBogusCatalog[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.getPartitionsByNamesBogusCatalog[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDb[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDb[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTable[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTable[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestListPartitions.listPartitionNamesBogusCatalog[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.listPartitionNamesBogusCatalog[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullDbName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullDbName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullTblName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullTblName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDbName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDbName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDb[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDb[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTable[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTable[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTblName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTblName[Remote]
 (batchId=220)

[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri updated HIVE-21044:
--
Attachment: HIVE-21044.4.patch

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch, HIVE-21044.4.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. Currently 
> we support JMX, JSON, and Console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Updated] (HIVE-20435) Failed Dynamic Partition Insert into insert only table may lose transaction metadata

2019-01-02 Thread Eugene Koifman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20435:
--
Description: 
{{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't write 
to {{TXN_COMPONENTS}} if the write is a dynamic partition insert because it 
expects to write to this table from {{addDynamicPartitions()}}.

For insert-only, transactional tables, we create the target dir and start 
writing to it before {{addDynamicPartitions()}} is called. So if a txn is 
aborted, we may have a delta dir in the partition but no corresponding entry in 
{{TXN_COMPONENTS}}. This means {{TxnStore.cleanEmptyAbortedTxns()}} may clean 
up {{TXNS}} entry for the aborted transaction before Compactor removes this 
delta dir, at which point it looks like committed data.

Streaming API V2 with dynamic partition mode also has this problem.

Full CRUD tables are currently immune to this since they rely on the "move" operation in 
MoveTask but longer term they should follow the same model as insert-only 
tables.

  was:
{{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't write 
to {{TXN_COMPONENTS}} if the write is a dynamic partition insert because it 
expects to write to this table from {{addDynamicPartitions()}}.  

For insert-only, transactional tables, we create the target dir and start 
writing to it before {{addDynamicPartitions()}} is called.  So if a txn is 
aborted, we may have a delta dir in the partition but no corresponding entry in 
{{TXN_COMPONENTS}}.  This means {{TxnStore.cleanEmptyAbortedTxns()}} may clean 
up {{TXNS}} entry for the aborted transaction before Compactor removes this 
delta dir, at which point it looks like committed data.

Full CRUD tables are currently immune to this since they rely on the "move" operation in 
MoveTask but longer term they should follow the same model as insert-only 
tables.


> Failed Dynamic Partition Insert into insert only table may lose transaction 
> metadata
> -
>
> Key: HIVE-20435
> URL: https://issues.apache.org/jira/browse/HIVE-20435
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> {{TxnHandler.enqueueLockWithRetry()}} has an optimization where it doesn't 
> write to {{TXN_COMPONENTS}} if the write is a dynamic partition insert because 
> it expects to write to this table from {{addDynamicPartitions()}}.
> For insert-only, transactional tables, we create the target dir and start 
> writing to it before {{addDynamicPartitions()}} is called. So if a txn is 
> aborted, we may have a delta dir in the partition but no corresponding entry 
> in {{TXN_COMPONENTS}}. This means {{TxnStore.cleanEmptyAbortedTxns()}} may 
> clean up {{TXNS}} entry for the aborted transaction before Compactor removes 
> this delta dir, at which point it looks like committed data.
> Streaming API V2 with dynamic partition mode also has this problem.
> Full CRUD tables are currently immune to this since they rely on the "move" operation in 
> MoveTask but longer term they should follow the same model as insert-only 
> tables.





[jira] [Commented] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732432#comment-16732432
 ] 

Hive QA commented on HIVE-20977:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
2s{color} | {color:blue} standalone-metastore/metastore-server in master has 
188 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15458/dev-support/hive-personality.sh
 |
| git revision | master / dc215b1 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15458/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15458/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> 
>
> Key: HIVE-20977
> URL: https://issues.apache.org/jira/browse/HIVE-20977
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
> Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch, 
> HIVE-20977.3.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as 
> get_partitions), but only if there is an event listener attached. However, 
> even then the event is unnecessary if the listener is not interested in the 
> read-table event.
> For example, the TransactionalValidationListener's onEvent looks like this
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, 
> NoSuchObjectException,
> InvalidOperationException {
>   switch (context.getEventType()) {
> case CREATE_TABLE:
>   handle((PreCreateTableEvent) context);
>   break;
> case ALTER_TABLE:
>   handle((PreAlterTableEvent) context);
>   break;
> default:
>   //no validation required..
>   }
> }{code}
>  
> Note that for read-table events it is a no-op. The problem is that the 
> get_table call is evaluated when creating the PreReadTableEvent, only for 
> the result to be ignored!
> Look at the code below.. 

[jira] [Updated] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri updated HIVE-20977:
--
Attachment: HIVE-20977.3.patch

> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> 
>
> Key: HIVE-20977
> URL: https://issues.apache.org/jira/browse/HIVE-20977
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
> Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch, 
> HIVE-20977.3.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as 
> get_partitions), but only if there is an event listener attached. However, 
> even then the event is unnecessary if the listener is not interested in the 
> read-table event.
> For example, the TransactionalValidationListener's onEvent looks like this
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, 
> NoSuchObjectException,
> InvalidOperationException {
>   switch (context.getEventType()) {
> case CREATE_TABLE:
>   handle((PreCreateTableEvent) context);
>   break;
> case ALTER_TABLE:
>   handle((PreAlterTableEvent) context);
>   break;
> default:
>   //no validation required..
>   }
> }{code}
>  
> Note that for read-table events it is a no-op. The problem is that the 
> get_table call is evaluated when creating the PreReadTableEvent, only for 
> the result to be ignored!
> Look at the code below: {{getMS().getTable(..)}} is evaluated irrespective 
> of whether the listener uses it or not.
> {code:java}
> private void fireReadTablePreEvent(String catName, String dbName, String 
> tblName)
> throws MetaException, NoSuchObjectException {
>   if(preListeners.size() > 0) {
> // do this only if there is a pre event listener registered (avoid 
> unnecessary
> // metastore api call)
> Table t = getMS().getTable(catName, dbName, tblName);
> if (t == null) {
>   throw new NoSuchObjectException(TableName.getQualified(catName, dbName, 
> tblName)
>   + " table not found");
> }
> firePreEvent(new PreReadTableEvent(t, this));
>   }
> }
> {code}
> This can be improved by using a {{Supplier}} and lazily evaluating the table 
> only when needed (computed the first time it is called, memoized after that).
> *Motivation*
> Whenever a partition call occurs (get_partition, etc.), we fire the 
> PreReadTableEvent. This affects performance since it fetches the table even 
> if it is not being used. This change will improve performance on the 
> get_partition calls.
>  
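The lazy-evaluation idea above can be sketched with a memoizing {{Supplier}}. This is a minimal stand-in, not Hive's actual patch; the {{memoize}} helper and the fetch counter (which simulates the {{getMS().getTable(...)}} metastore call) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Minimal sketch of a memoizing Supplier: the underlying fetch runs at most
// once, and only if get() is actually called by some listener.
public class MemoizedSupplier {
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;
            @Override
            public synchronized T get() {
                if (!computed) {
                    value = delegate.get();
                    computed = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger fetches = new AtomicInteger();
        // the lambda stands in for the expensive getMS().getTable(...) call
        Supplier<String> table = memoize(() -> {
            fetches.incrementAndGet();
            return "tbl";
        });
        System.out.println(fetches.get()); // 0 - nothing fetched yet
        table.get();
        table.get();
        System.out.println(fetches.get()); // 1 - fetched exactly once
    }
}
```

In practice Guava's {{Suppliers.memoize}} provides the same behavior with thread-safe double-checked locking.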





[jira] [Updated] (HIVE-21081) DATE_FORMAT incorrectly returns results on the last week of the calendar year

2019-01-02 Thread Wilson Lu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilson Lu updated HIVE-21081:
-
Description: 
The Hive DATE_FORMAT does not perform the correct operation on the last week of 
the calendar year. The following statements incorrectly format the data:

select DATE_FORMAT('2017-12-31', 'MM')
select DATE_FORMAT('2018-12-30', 'MM')
 select DATE_FORMAT('2018-12-31', 'MM')
select DATE_FORMAT('2019-12-29', 'MM')
 select DATE_FORMAT('2019-12-30', 'MM')
 select DATE_FORMAT('2019-12-31', 'MM')

select DATE_FORMAT( '2019-12-29', 'MMdd') 

 

  was:
The hive DATE_FORMAT does not perform the correct operation on the last week of 
the calendar year. The following statements incorrectly format the data:

select DATE_FORMAT('2017-12-31', 'MM')

select DATE_FORMAT('2018-12-30', 'MM')
select DATE_FORMAT('2018-12-31', 'MM')

select DATE_FORMAT('2019-12-29', 'MM')
select DATE_FORMAT('2019-12-30', 'MM')
select DATE_FORMAT('2019-12-31', 'MM')

 

 


> DATE_FORMAT incorrectly returns results on the last week of the calendar year
> -
>
> Key: HIVE-21081
> URL: https://issues.apache.org/jira/browse/HIVE-21081
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.1, 2.3.2, 2.3.3
>Reporter: Wilson Lu
>Priority: Minor
>
> The Hive DATE_FORMAT does not perform the correct operation on the last week 
> of the calendar year. The following statements incorrectly format the data:
> select DATE_FORMAT('2017-12-31', 'MM')
> select DATE_FORMAT('2018-12-30', 'MM')
>  select DATE_FORMAT('2018-12-31', 'MM')
> select DATE_FORMAT('2019-12-29', 'MM')
>  select DATE_FORMAT('2019-12-30', 'MM')
>  select DATE_FORMAT('2019-12-31', 'MM')
> select DATE_FORMAT( '2019-12-29', 'MMdd') 
>  
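For context, the reported behavior matches Java's week-year formatting: in {{SimpleDateFormat}}, uppercase 'Y' is the week-based year while lowercase 'y' is the calendar year, and the two diverge in the last days of December. The exact patterns in the report appear garbled in transit, so the stdlib demonstration below only shows the likely underlying behavior, not the reporter's exact queries.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Demonstrates week-based year ('YYYY') vs calendar year ('yyyy'):
// 2017-12-31 is a Sunday and, under Locale.US week rules, falls in
// week 1 of 2018, so the two patterns disagree.
public class WeekYearDemo {
    public static void main(String[] args) throws ParseException {
        Date d = new SimpleDateFormat("yyyy-MM-dd", Locale.US).parse("2017-12-31");
        String weekYear = new SimpleDateFormat("YYYY", Locale.US).format(d);
        String calYear  = new SimpleDateFormat("yyyy", Locale.US).format(d);
        System.out.println(weekYear + " vs " + calYear); // 2018 vs 2017
    }
}
```

If the queries above were meant to format a calendar year, the lowercase 'yyyy' pattern is the correct choice.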





[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2019-01-02 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21040:
---
   Resolution: Fixed
Fix Version/s: 3.2.0
   4.0.0
   2.4.0
   Status: Resolved  (was: Patch Available)

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Fix For: 2.4.0, 4.0.0, 3.2.0
>
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, 
> HIVE-21040.03.patch, HIVE-21040.04.patch, HIVE-21040.05.branch-3.patch, 
> HIVE-21040.06.branch-2.patch
>
>
> Here is the code snippet which is run by {{msck}} to list directories
> {noformat}
> final Path currentPath = pd.p;
> final int currentDepth = pd.depth;
> FileStatus[] fileStatuses = fs.listStatus(currentPath, FileUtils.HIDDEN_FILES_PATH_FILTER);
> // found no files under a sub-directory under table base path; it is possible that the table
> // is empty and hence there are no partition sub-directories created under base path
> if (fileStatuses.length == 0 && currentDepth > 0 && currentDepth < partColNames.size()) {
>   // since maxDepth is not yet reached, we are missing partition
>   // columns in currentPath
>   logOrThrowExceptionWithMsg(
>       "MSCK is missing partition columns under " + currentPath.toString());
> } else {
>   // found files under currentPath; add them to the queue if it is a directory
>   for (FileStatus fileStatus : fileStatuses) {
>     if (!fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a file at a depth which is less than number of partition keys
>       logOrThrowExceptionWithMsg(
>           "MSCK finds a file rather than a directory when it searches for "
>               + fileStatus.getPath().toString());
>     } else if (fileStatus.isDirectory() && currentDepth < partColNames.size()) {
>       // found a sub-directory at a depth less than number of partition keys;
>       // validate if the partition directory name matches with the corresponding
>       // partition colName at currentDepth
>       Path nextPath = fileStatus.getPath();
>       String[] parts = nextPath.getName().split("=");
>       if (parts.length != 2) {
>         logOrThrowExceptionWithMsg("Invalid partition name " + nextPath);
>       } else if (!parts[0].equalsIgnoreCase(partColNames.get(currentDepth))) {
>         logOrThrowExceptionWithMsg(
>             "Unexpected partition key " + parts[0] + " found at " + nextPath);
>       } else {
>         // add sub-directory to the work queue if maxDepth is not yet reached
>         pendingPaths.add(new PathDepthInfo(nextPath, currentDepth + 1));
>       }
>     }
>   }
>   if (currentDepth == partColNames.size()) {
>     return currentPath;
>   }
> }
> {noformat}
> You can see that when {{currentDepth}} is at {{maxDepth}}, it still does an
> unnecessary listing of the files. We can improve this call by checking the
> currentDepth and bailing out early.
> This can improve the performance of the msck command significantly, especially
> when there are a lot of files in each partition on remote filesystems like S3
> or ADLS.
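The early bail-out described above can be sketched in a self-contained form. The walk below models the pending-paths loop using java.nio in place of Hadoop's FileSystem; the names {{MsckEarlyExit}}, {{findPartitionDirs}}, and {{numPartCols}} are illustrative, not Hive's actual API. The key change is checking the depth before issuing the listing call:

```java
import java.nio.file.*;
import java.util.*;

public class MsckEarlyExit {
    // Simplified model of msck's breadth-first directory walk: collects
    // leaf partition directories without listing files once the partition
    // depth (number of partition columns) is reached.
    static List<Path> findPartitionDirs(Path base, int numPartCols) throws Exception {
        List<Path> leaves = new ArrayList<>();
        Deque<Object[]> pending = new ArrayDeque<>();
        pending.add(new Object[]{base, 0});
        while (!pending.isEmpty()) {
            Object[] pd = pending.poll();
            Path current = (Path) pd[0];
            int depth = (Integer) pd[1];
            // Proposed fix: bail out BEFORE listing when the partition depth
            // is already reached -- no need to enumerate the data files.
            if (depth == numPartCols) {
                leaves.add(current);
                continue;
            }
            try (DirectoryStream<Path> ds = Files.newDirectoryStream(current)) {
                for (Path child : ds) {
                    if (Files.isDirectory(child)) {
                        pending.add(new Object[]{child, depth + 1});
                    }
                }
            }
        }
        return leaves;
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("msck");
        Files.createDirectories(base.resolve("year=2019/month=01"));
        Files.createFile(base.resolve("year=2019/month=01/data.txt"));
        // The walk stops at depth 2 and never lists data.txt.
        System.out.println(findPartitionDirs(base, 2).size()); // prints 1
    }
}
```

On a remote filesystem each skipped listing saves a round trip per partition, which is where the speedup the issue describes comes from.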



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2019-01-02 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21040:
---
Attachment: HIVE-21040.06.branch-2.patch

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, 
> HIVE-21040.03.patch, HIVE-21040.04.patch, HIVE-21040.05.branch-3.patch, 
> HIVE-21040.06.branch-2.patch
>
>





[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2019-01-02 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732396#comment-16732396
 ] 

Vihang Karajgaonkar commented on HIVE-21040:


Attached the branch-2 patch. Minor conflicts while cherry-picking the branch-3
patch were resolved.

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, 
> HIVE-21040.03.patch, HIVE-21040.04.patch, HIVE-21040.05.branch-3.patch, 
> HIVE-21040.06.branch-2.patch
>
>





[jira] [Commented] (HIVE-21033) Forgetting to close operation cuts off any more HiveServer2 output

2019-01-02 Thread Aihua Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732372#comment-16732372
 ] 

Aihua Xu commented on HIVE-21033:
-

[~szehon] The patch looks good to me. +1. 



> Forgetting to close operation cuts off any more HiveServer2 output
> --
>
> Key: HIVE-21033
> URL: https://issues.apache.org/jira/browse/HIVE-21033
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-21033.2.patch, HIVE-21033.3.patch, 
> HIVE-21033.4.patch, HIVE-21033.5.patch, HIVE-21033.patch
>
>
> We had a custom client that did not close its operations until the end of the
> session. It is a mistake in the client, but it reveals a vulnerability in
> HiveServer2.
> This happens if a session runs (1) a HiveCommandOperation and (2) a
> SQLOperation and does not close them right away, for example a session that
> runs the operations (set a=b; select * from foobar;).
> When the SQLOperation runs, it sets SessionState.out and err to System.out and
> System.err. Ref:
> [SQLOperation#setupSessionIO|https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java#L139]
> Then the client closes the session, or disconnects, which triggers
> closeSession() on the Thrift side. In this case, closeSession closes all the
> operations, starting with the HiveCommandOperation. That closes all of its
> streams, which are System.out and System.err as set by the SQLOperation
> earlier. Ref:
> [HiveCommandOperation#tearDownSessionIO|https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java#L101]
> After this, no more HiveServer2 output appears, as System.out and System.err
> are closed.
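The failure mode is easy to reproduce in isolation. The sketch below is a stand-in demonstration, not Hive code: a ByteArrayOutputStream-backed PrintStream plays the role of System.out (so the demo does not break the real stdout), and the class and method names are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CloseSharedStreamDemo {
    // Models the bug: one component aliases a shared output stream
    // (SessionState.out pointing at System.out) and another component
    // (tearDownSessionIO) closes the alias, killing the shared stream.
    static boolean writeAfterCloseFails() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream fakeStdout = new PrintStream(sink); // stands in for System.out
        PrintStream sessionOut = fakeStdout;            // SessionState.out alias
        sessionOut.close();                             // teardown closes the alias
        fakeStdout.println("this output is lost");      // later server output
        // PrintStream swallows IO errors; checkError() exposes the dead stream.
        return fakeStdout.checkError();
    }

    public static void main(String[] args) {
        System.out.println(writeAfterCloseFails()); // prints true
    }
}
```

Because PrintStream never throws on write, the server keeps "printing" into a closed stream silently, which matches the symptom of output simply disappearing.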





[jira] [Commented] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732361#comment-16732361
 ] 

Hive QA commented on HIVE-20977:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953518/HIVE-20977.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 55 failed/errored test(s), 15761 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testPartitionFilter 
(batchId=223)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testPartitionFilter 
(batchId=224)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStoreZK.testPartitionFilter 
(batchId=226)
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStoreZKBindHost.testPartitionFilter
 (batchId=230)
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testPartitionFilter
 (batchId=221)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testPartitionFilter 
(batchId=219)
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testPartitionFilter 
(batchId=229)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.getPartitionsByNamesBogusCatalog[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.getPartitionsByNamesBogusCatalog[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionWithAuthInfoNullTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDb[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoDb[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTable[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTable[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNoTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullDbName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullDbName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullTblName[Embedded]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestGetPartitions.testGetPartitionsByNamesNullTblName[Remote]
 (batchId=222)
org.apache.hadoop.hive.metastore.client.TestListPartitions.listPartitionNamesBogusCatalog[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.listPartitionNamesBogusCatalog[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullDbName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullDbName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullTblName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesByValuesNullTblName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDbName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDbName[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDb[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoDb[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTable[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTable[Remote]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTblName[Embedded]
 (batchId=220)
org.apache.hadoop.hive.metastore.client.TestListPartitions.testListPartitionNamesNoTblName[Remote]
 (batchId=220)

[jira] [Updated] (HIVE-21065) Upgrade Hive to use ORC 1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21065:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Patch committed to master. Thanks [~ikryvenko] and [~ekoifman]

> Upgrade Hive to use ORC 1.5.4
> -
>
> Key: HIVE-21065
> URL: https://issues.apache.org/jira/browse/HIVE-21065
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Reporter: Igor Kryvenko
>Assignee: Igor Kryvenko
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21065.01.patch
>
>






[jira] [Resolved] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta resolved HIVE-21080.
-
Resolution: Duplicate

> Update Hive to use ORC-1.5.4
> 
>
> Key: HIVE-21080
> URL: https://issues.apache.org/jira/browse/HIVE-21080
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-21080.1.patch
>
>
> Now that ORC 1.5.4 is released, we should update Hive's version of ORC so 
> that HIVE-20699 can use it





[jira] [Updated] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21080:

Attachment: HIVE-21080.1.patch

> Update Hive to use ORC-1.5.4
> 
>
> Key: HIVE-21080
> URL: https://issues.apache.org/jira/browse/HIVE-21080
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-21080.1.patch
>
>
> Now that ORC 1.5.4 is released, we should update Hive's version of ORC so 
> that HIVE-20699 can use it





[jira] [Commented] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732349#comment-16732349
 ] 

Vaibhav Gumashta commented on HIVE-21080:
-

cc [~ekoifman]

> Update Hive to use ORC-1.5.4
> 
>
> Key: HIVE-21080
> URL: https://issues.apache.org/jira/browse/HIVE-21080
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
> Attachments: HIVE-21080.1.patch
>
>
> Now that ORC 1.5.4 is released, we should update Hive's version of ORC so 
> that HIVE-20699 can use it





[jira] [Commented] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2019-01-02 Thread Vihang Karajgaonkar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732345#comment-16732345
 ] 

Vihang Karajgaonkar commented on HIVE-21040:


Attached the patch for branch-3. To make the patch compile, I had to remove the
newly added test class and add its tests to {{TestHiveMetaStoreChecker}}
instead. The rest of the patch remains the same.

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, 
> HIVE-21040.03.patch, HIVE-21040.04.patch, HIVE-21040.05.branch-3.patch
>
>





[jira] [Updated] (HIVE-21040) msck does unnecessary file listing at last level of directory tree

2019-01-02 Thread Vihang Karajgaonkar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-21040:
---
Attachment: HIVE-21040.05.branch-3.patch

> msck does unnecessary file listing at last level of directory tree
> --
>
> Key: HIVE-21040
> URL: https://issues.apache.org/jira/browse/HIVE-21040
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-21040.01.patch, HIVE-21040.02.patch, 
> HIVE-21040.03.patch, HIVE-21040.04.patch, HIVE-21040.05.branch-3.patch
>
>





[jira] [Updated] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21080:

Description: Now that ORC 1.5.4 is released, we should update Hive's 
version of ORC so that HIVE-20699 can use it  (was: Now that ORC-1.5.4 is 
released, we should update Hive's version of ORC so that HIVE-20699 can use it)

> Update Hive to use ORC-1.5.4
> 
>
> Key: HIVE-21080
> URL: https://issues.apache.org/jira/browse/HIVE-21080
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> Now that ORC 1.5.4 is released, we should update Hive's version of ORC so 
> that HIVE-20699 can use it





[jira] [Assigned] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-21080:
---

Assignee: Vaibhav Gumashta

> Update Hive to use ORC-1.5.4
> 
>
> Key: HIVE-21080
> URL: https://issues.apache.org/jira/browse/HIVE-21080
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> Now that ORC-1.5.4 is released, we should update Hive's version of ORC so 
> that HIVE-20699 can use it





[jira] [Commented] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732303#comment-16732303
 ] 

Hive QA commented on HIVE-20977:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
3s{color} | {color:blue} standalone-metastore/metastore-server in master has 
188 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
12s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15455/dev-support/hive-personality.sh
 |
| git revision | master / 926c1e8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15455/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15455/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> 
>
> Key: HIVE-20977
> URL: https://issues.apache.org/jira/browse/HIVE-20977
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
> Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as
> get_partitions), but only if there is an event listener attached. Even then it
> is unnecessary if the listener is not interested in the read-table event.
> For example, the TransactionalValidationListener's onEvent looks like this:
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, NoSuchObjectException, InvalidOperationException {
>   switch (context.getEventType()) {
>     case CREATE_TABLE:
>       handle((PreCreateTableEvent) context);
>       break;
>     case ALTER_TABLE:
>       handle((PreAlterTableEvent) context);
>       break;
>     default:
>       //no validation required..
>   }
> }{code}
>  
> Note that for read-table events it is a no-op. The problem is that the
> get_table is evaluated when creating the PreReadTableEvent, only to be
> ignored!
> Look at the code below.. {{getMS().getTable(..)}} is evaluated 
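One way to express the lazy evaluation the issue proposes is a memoizing {{Supplier}}: the table lookup runs only if a listener actually asks for it, and at most once. This is a sketch under assumptions — {{fetchTable}} stands in for the expensive {{getMS().getTable(..)}} call, and the class and method names are illustrative, not the actual patch.

```java
import java.util.function.Supplier;

public class LazyTableDemo {
    static int fetchCount = 0;

    // Stand-in for the expensive metastore call getMS().getTable(..).
    static String fetchTable() {
        fetchCount++;
        return "tbl";
    }

    // Wraps a supplier so the underlying call is deferred until first
    // use and its result is memoized for subsequent uses.
    static <T> Supplier<T> lazy(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean done;
            public T get() {
                if (!done) {
                    value = delegate.get();
                    done = true;
                }
                return value;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<String> table = lazy(LazyTableDemo::fetchTable);
        // A listener that ignores read-table events never calls get(),
        // so the table is never fetched.
        System.out.println(fetchCount); // prints 0
        table.get();
        table.get();
        System.out.println(fetchCount); // prints 1 -- fetched once, memoized
    }
}
```

With this shape, an event like PreReadTableEvent could carry the supplier instead of an eagerly fetched table, so listeners that ignore the event cost nothing.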

[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16732277#comment-16732277
 ] 

Hive QA commented on HIVE-21044:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953517/HIVE-21044.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15760 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning]
 (batchId=190)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15454/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15454/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15454/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953517 - PreCommit-HIVE-Build

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. 
> Currently we support JMX, JSON, and console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.
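A log-based reporter boils down to periodically snapshotting the registered metrics and writing one line per metric to a logger. Below is a minimal, stdlib-only sketch of that idea; the names {{LogReporter}}, {{inc}}, and {{report}} are hypothetical, not Hive or Dropwizard API. The actual patch would instead wire up Dropwizard's {{Slf4jReporter.forRegistry(...)}} builder against the metastore's {{MetricRegistry}}.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

// Hypothetical sketch of what a log-based metrics reporter does:
// snapshot named counters and hand one formatted line per metric to a sink.
// In Hive the sink would be an SLF4J logger; here it is just a Consumer.
final class LogReporter {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Increment the named counter, creating it on first use.
    void inc(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    // Emit one line per counter to the sink (e.g. logger::info).
    void report(Consumer<String> sink) {
        counters.forEach((name, v) ->
                sink.accept("counter=" + name + " value=" + v.sum()));
    }
}
```

A scheduled executor (or Dropwizard's own reporter thread) would call {{report}} at a fixed interval, which is what `reporter.start(period, unit)` does in the real Dropwizard API.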



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732241#comment-16732241
 ] 

Hive QA commented on HIVE-21044:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
12s{color} | {color:blue} standalone-metastore/metastore-common in master has 
29 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  
1s{color} | {color:blue} standalone-metastore/metastore-server in master has 
188 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m 58s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15454/dev-support/hive-personality.sh
 |
| git revision | master / 926c1e8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-common 
standalone-metastore/metastore-server U: standalone-metastore |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15454/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. 
> Currently we support JMX, JSON, and console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Updated] (HIVE-20977) Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-02 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri updated HIVE-20977:
--
Attachment: HIVE-20977.2.patch

> Lazy evaluate the table object in PreReadTableEvent to improve get_partition 
> performance
> 
>
> Key: HIVE-20977
> URL: https://issues.apache.org/jira/browse/HIVE-20977
> Project: Hive
>  Issue Type: Improvement
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
> Attachments: HIVE-20977.1.patch, HIVE-20977.2.patch
>
>
> The PreReadTableEvent is generated for non-table operations (such as 
> get_partitions), but only if there is an event listener attached. However, 
> even then the event is unnecessary if the listener is not interested in 
> read table events.
> For example, the TransactionalValidationListener's onEvent looks like this
> {code:java}
> @Override
> public void onEvent(PreEventContext context) throws MetaException, 
> NoSuchObjectException,
> InvalidOperationException {
>   switch (context.getEventType()) {
> case CREATE_TABLE:
>   handle((PreCreateTableEvent) context);
>   break;
> case ALTER_TABLE:
>   handle((PreAlterTableEvent) context);
>   break;
> default:
>   //no validation required..
>   }
> }{code}
>  
> Note that for read table events it is a no-op. The problem is that the 
> get_table call is evaluated when creating the PreReadTableEvent, only to be 
> ignored!
> Look at the code below. {{getMS().getTable(..)}} is evaluated irrespective 
> of whether the listener uses it or not.
> {code:java}
> private void fireReadTablePreEvent(String catName, String dbName, String 
> tblName)
> throws MetaException, NoSuchObjectException {
>   if(preListeners.size() > 0) {
> // do this only if there is a pre event listener registered (avoid 
> unnecessary
> // metastore api call)
> Table t = getMS().getTable(catName, dbName, tblName);
> if (t == null) {
>   throw new NoSuchObjectException(TableName.getQualified(catName, dbName, 
> tblName)
>   + " table not found");
> }
> firePreEvent(new PreReadTableEvent(t, this));
>   }
> }
> {code}
> This can be improved by using a {{Supplier}} and lazily evaluating the table 
> when needed (evaluated the first time it is called, memoized after that).
> *Motivation*
> Whenever a partition call occurs (get_partition, etc.), we fire the 
> PreReadTableEvent. This affects performance since it fetches the table even 
> if it is not being used. This change will improve performance on the 
> get_partition calls.
>  
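The lazy evaluation described above can be sketched with a memoizing {{Supplier}} wrapper. This is illustrative only — the class and field names below are hypothetical, not the actual patch — but it shows the key property: the expensive {{getTable}} call runs at most once, and only if some listener actually calls {{get()}}.

```java
import java.util.function.Supplier;

// Hypothetical sketch: defer an expensive lookup until first use and cache
// the result. A PreReadTableEvent could hold a Memoized<Table> instead of a
// Table, so listeners that ignore the event never trigger the metastore call.
final class Memoized<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private T value;
    private boolean evaluated;

    Memoized(Supplier<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized T get() {
        if (!evaluated) {           // first call: run the real lookup
            value = delegate.get();
            evaluated = true;       // later calls return the cached value
        }
        return value;
    }
}
```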



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Karthik Manamcheri (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732220#comment-16732220
 ] 

Karthik Manamcheri commented on HIVE-21044:
---

[~pvary] Nothing but comments changed in the second patch! I am reuploading the 
same as a third patch to re-trigger the tests.

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. 
> Currently we support JMX, JSON, and console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Updated] (HIVE-21044) Add SLF4J reporter to the metastore metrics system

2019-01-02 Thread Karthik Manamcheri (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri updated HIVE-21044:
--
Attachment: HIVE-21044.3.patch

> Add SLF4J reporter to the metastore metrics system
> --
>
> Key: HIVE-21044
> URL: https://issues.apache.org/jira/browse/HIVE-21044
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Karthik Manamcheri
>Assignee: Karthik Manamcheri
>Priority: Minor
>  Labels: metrics
> Attachments: HIVE-21044.1.patch, HIVE-21044.2.patch, 
> HIVE-21044.3.patch
>
>
> Let's add an SLF4J reporter as an option in the metrics reporting system. 
> Currently we support JMX, JSON, and console reporting.
> We will add a new option to {{hive.service.metrics.reporter}} called SLF4J. 
> We can use the 
> {{[Slf4jReporter|https://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/Slf4jReporter.html]}}
>  class.





[jira] [Commented] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732195#comment-16732195
 ] 

Ashutosh Chauhan commented on HIVE-16957:
-

+1

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, they 
> create an 'analyze table t compute statistics for columns' statement, use 
> ColumnStatsSemanticAnalyzer to parse it, and connect the resulting plan to 
> the existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or 
> CREATE MATERIALIZED VIEW is that the table object does not exist yet, hence 
> we cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}





[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2019-01-02 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Status: Open  (was: Patch Available)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.7.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate when 
> columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.
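As an illustration of the recursive idea — not Hive's actual implementation; the estimator below and its per-type sizes are hypothetical — a raw-data-size estimate for a complex value can sum the sizes of its nested elements instead of counting each row as 1:

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: recursively estimate the raw data size of a value,
// descending into lists and maps instead of returning a flat constant.
final class RawSizeEstimator {
    static long estimate(Object value) {
        if (value == null) return 0;
        if (value instanceof Integer || value instanceof Float) return 4;
        if (value instanceof Long || value instanceof Double) return 8;
        if (value instanceof String) return ((String) value).length();
        if (value instanceof List) {            // array-like column value
            long sum = 0;
            for (Object e : (List<?>) value) sum += estimate(e);
            return sum;
        }
        if (value instanceof Map) {             // map column value
            long sum = 0;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet())
                sum += estimate(e.getKey()) + estimate(e.getValue());
            return sum;
        }
        return 1;  // unknown types: fall back to the old flat estimate
    }
}
```

Under this sketch an `array<int>` of three elements contributes 12 bytes rather than 1, which is the kind of correction that steers the optimizer away from an unsafe MapJoin.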





[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2019-01-02 Thread George Pachitariu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Attachment: HIVE-20523.8.patch
Status: Patch Available  (was: Open)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.7.patch, HIVE-20523.8.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimate when 
> columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of ShuffleJoin, 
> which can fail with OOM errors.
> In this patch, I compute the column data sizes more accurately, taking 
> complex structures into account. I followed the Writer implementation for 
> the ORC format.





[jira] [Updated] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21078:
--
Attachment: HIVE-21078.01.patch
Status: Patch Available  (was: Open)

Attaching first version to trigger ptests, checkstyle and findbugs tests.

> Replicate column and table level statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
> Attachments: HIVE-21078.01.patch
>
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables.  The same for partitioned tables will be worked upon in 
> a separate sub-task.





[jira] [Updated] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21078:
--
Summary: Replicate column and table level statistics for unpartitioned Hive 
tables  (was: Replicate table level column statistics for unpartitioned Hive 
tables)

> Replicate column and table level statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables. The same for partitioned tables will be worked upon in 
> a separate sub-task.





[jira] [Updated] (HIVE-21078) Replicate table level column statistics for unpartitioned Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21078:
--
Description: This task is for replicating column and table level statistics 
for unpartitioned tables.  will be worked upon in a separate sub-task.  (was: 
This task is for replicating table level statistics. Partition level statistics 
will be worked upon in a separate sub-task.)

> Replicate table level column statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables. The same for partitioned tables will be worked upon in 
> a separate sub-task.





[jira] [Updated] (HIVE-21078) Replicate column and table level statistics for unpartitioned Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21078:
--
Description: This task is for replicating column and table level statistics 
for unpartitioned tables.  The same for partitioned tables will be worked upon 
in a separate sub-task.  (was: This task is for replicating column and table 
level statistics for unpartitioned tables.  will be worked upon in a separate 
sub-task.)

> Replicate column and table level statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating column and table level statistics for 
> unpartitioned tables.  The same for partitioned tables will be worked upon in 
> a separate sub-task.





[jira] [Updated] (HIVE-21078) Replicate table level column statistics for unpartitioned Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21078:
--
Summary: Replicate table level column statistics for unpartitioned Hive 
tables  (was: Replicate table level column statistics for Hive tables)

> Replicate table level column statistics for unpartitioned Hive tables
> -
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating table level statistics. Partition level 
> statistics will be worked upon in a separate sub-task.





[jira] [Commented] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731982#comment-16731982
 ] 

Hive QA commented on HIVE-21050:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12953492/HIVE-21050.5.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15809 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/15450/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/15450/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-15450/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12953492 - PreCommit-HIVE-Build

> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Commented] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731979#comment-16731979
 ] 

Hive QA commented on HIVE-21050:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
22s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2311 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} ql: The patch generated 2 new + 145 unchanged - 4 
fixed = 147 total (was 149) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  
8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 55m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  
checkstyle  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-15450/dev-support/hive-personality.sh
 |
| git revision | master / 926c1e8 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15450/yetus/diff-checkstyle-ql.txt
 |
| modules | C: ql . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-15450/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21050:
-
Attachment: HIVE-21050.5.patch
Status: Patch Available  (was: Open)

> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21050:
-
Status: Open  (was: Patch Available)

> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21050:
-
Attachment: HIVE-21050.5.patch
Status: Patch Available  (was: Open)

> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Updated] (HIVE-21050) Upgrade Parquet to 1.11.0 and use LogicalTypes

2019-01-02 Thread Karen Coppage (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-21050:
-
Status: Open  (was: Patch Available)

> Upgrade Parquet to 1.11.0 and use LogicalTypes
> --
>
> Key: HIVE-21050
> URL: https://issues.apache.org/jira/browse/HIVE-21050
> Project: Hive
>  Issue Type: Improvement
>  Components: File Formats
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: Parquet, parquet
> Attachments: HIVE-21050.1.patch, HIVE-21050.1.patch, 
> HIVE-21050.1.patch, HIVE-21050.2.patch, HIVE-21050.3.patch, 
> HIVE-21050.4.patch, HIVE-21050.4.patch, HIVE-21050.4.patch, 
> HIVE-21050.5.patch, HIVE-21050.5.patch
>
>
> [WIP until Parquet community releases version 1.11.0]
> The new Parquet version (1.11.0) uses 
> [LogicalTypes|https://github.com/apache/parquet-format/blob/master/LogicalTypes.md]
>  instead of OriginalTypes.
> These are backwards-compatible with OriginalTypes.
> Thanks to [~kuczoram] for her work on this patch.





[jira] [Commented] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731850#comment-16731850
 ] 

Jesus Camacho Rodriguez commented on HIVE-16957:


[~ashutoshc], I have rebased the latest patch that addressed your comments in 
RB.

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, those 
> operations create an 'analyze table t compute statistics for columns' 
> statement, use ColumnStatsSemanticAnalyzer to parse it, and connect the 
> resulting plan to the existing INSERT/INSERT OVERWRITE statement. The 
> challenge for CTAS or CREATE MATERIALIZED VIEW is that the table object does 
> not exist yet, so we cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}
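The rewrite described above amounts to generating a query over a table values clause from the target schema. The helper below is purely illustrative (Hive's actual implementation is Java inside ColumnStatsSemanticAnalyzer) and assumes the column names and types are already known from the CTAS select list:

```python
def build_stats_statement(columns):
    """Build a column-stats collection query over a table values clause.

    `columns` is a list of (name, type) pairs taken from the CTAS target
    schema, which is known at compile time even though the table itself
    does not exist yet.
    """
    aggs = ", ".join(f"compute_stats({name})" for name, _ in columns)
    nulls = ", ".join(f"cast(null as {typ})" for _, typ in columns)
    names = ", ".join(name for name, _ in columns)
    return (f"select {aggs}\n"
            f"from table(values({nulls})) as t({names});")

# Reproduces the example statement from the description:
print(build_stats_statement(
    [("col1", "int"), ("col2", "int"), ("col3", "string")]))
```

The table values clause supplies a one-row relation with the right column names and types, so the generated statement parses and plans exactly like an 'analyze table ... for columns' query against a real table.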





[jira] [Updated] (HIVE-16957) Support CTAS for auto gather column stats

2019-01-02 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-16957:
---
Attachment: HIVE-16957.07.patch

> Support CTAS for auto gather column stats
> -
>
> Key: HIVE-16957
> URL: https://issues.apache.org/jira/browse/HIVE-16957
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-16957.01.patch, HIVE-16957.02.patch, 
> HIVE-16957.03.patch, HIVE-16957.04.patch, HIVE-16957.05.patch, 
> HIVE-16957.06.patch, HIVE-16957.07.patch, HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in 
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, those 
> operations create an 'analyze table t compute statistics for columns' 
> statement, use ColumnStatsSemanticAnalyzer to parse it, and connect the 
> resulting plan to the existing INSERT/INSERT OVERWRITE statement. The 
> challenge for CTAS or CREATE MATERIALIZED VIEW is that the table object does 
> not exist yet, so we cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a 
> statement for column stats collection that uses a table values clause instead 
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as 
> string))) as t(col1, col2, col3);
> {code}





[jira] [Assigned] (HIVE-21079) Replicate column statistics for partitions of partitioned Hive table.

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat reassigned HIVE-21079:
-


> Replicate column statistics for partitions of partitioned Hive table.
> -
>
> Key: HIVE-21079
> URL: https://issues.apache.org/jira/browse/HIVE-21079
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating statistics for partitions of a partitioned Hive 
> table.





[jira] [Assigned] (HIVE-21078) Replicate table level column statistics for Hive tables

2019-01-02 Thread Ashutosh Bapat (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat reassigned HIVE-21078:
-


> Replicate table level column statistics for Hive tables
> ---
>
> Key: HIVE-21078
> URL: https://issues.apache.org/jira/browse/HIVE-21078
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> This task is for replicating table level statistics. Partition level 
> statistics will be worked upon in a separate sub-task.





[jira] [Commented] (HIVE-20911) External Table Replication for Hive

2019-01-02 Thread Sankar Hariappan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731845#comment-16731845
 ] 

Sankar Hariappan commented on HIVE-20911:
-

+1, pending tests for 08.patch

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, 
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, 
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch, 
> HIVE-20911.08.patch
>
>
> External tables are currently not replicated as part of Hive replication. As 
> part of this JIRA we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. The base path will be prefixed 
> to the path of the same external table on the source cluster. It can be 
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to the directories of an external table can happen without 
> Hive knowing about them, we cannot capture the relevant events whenever new 
> data is added or removed; we will have to copy the data from the source path 
> to the target path for external tables every time we run incremental 
> replication.
> ** This will require incremental *repl dump* to now create an additional 
> file *\_external\_tables\_info* with data in the following form:
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If different partitions of a table point to different locations, there will 
> be multiple entries in the file for the same table name, each pointing to a 
> different partition location. Partitions created without the _set location_ 
> command reside within the table's data location and hence do not get 
> separate entries in the file.
> ** *repl load* will read *\_external\_tables\_info* to identify which 
> locations are to be copied from source to target, and create corresponding 
> tasks for them.
> * New external tables will be created metadata-only, with no data copied, as 
> part of the regular tasks during incremental/bootstrap load.
> * Bootstrap dump will also create *\_external\_tables\_info*, which will be 
> used to copy data from source to target as part of bootstrap load.
> * Bootstrap load will create a DAG that can use parallelism in the execution 
> phase; the HDFS copy tasks are created once the bootstrap phase is complete.
> * Since incremental load results in a DAG with only sequential execution 
> (events applied in sequence), we create the HDFS copy tasks alongside the 
> incremental DAG to effectively use the parallelism capability in execution 
> mode. This requires a few basic calculations to approximately meet the 
> configured value of "hive.repl.approx.max.load.tasks".
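The *\_external\_tables\_info* line format described above can be sketched in a few lines. This is an illustrative round-trip of the `tableName,base64Encoded(tableDataLocation)` entries, not the actual Hive implementation; the table name and path used below are made up:

```python
import base64

def encode_entry(table_name: str, data_location: str) -> str:
    """One line of _external_tables_info: tableName,base64(location)."""
    encoded = base64.b64encode(data_location.encode("utf-8")).decode("ascii")
    return f"{table_name},{encoded}"

def decode_entry(line: str):
    """Recover (tableName, tableDataLocation) from one file line."""
    name, encoded = line.split(",", 1)
    return name, base64.b64decode(encoded).decode("utf-8")

# Round-trip with a hypothetical table and HDFS location:
entry = encode_entry("store_sales", "hdfs://src/warehouse/store_sales")
assert decode_entry(entry) == ("store_sales", "hdfs://src/warehouse/store_sales")
```

Base64-encoding the location keeps arbitrary path characters from colliding with the comma delimiter, which is presumably why a table with several differently-located partitions can safely emit one line per location.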





[jira] [Updated] (HIVE-20911) External Table Replication for Hive

2019-01-02 Thread anishek (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-20911:
---
Attachment: HIVE-20911.08.patch

> External Table Replication for Hive
> ---
>
> Key: HIVE-20911
> URL: https://issues.apache.org/jira/browse/HIVE-20911
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: anishek
>Assignee: anishek
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20911.01.patch, HIVE-20911.02.patch, 
> HIVE-20911.03.patch, HIVE-20911.04.patch, HIVE-20911.05.patch, 
> HIVE-20911.06.patch, HIVE-20911.07.patch, HIVE-20911.07.patch, 
> HIVE-20911.08.patch
>
>
> External tables are currently not replicated as part of Hive replication. As 
> part of this JIRA we want to enable that.
> Approach:
> * The target cluster will have a top-level base directory config that will be 
> used to copy all data relevant to external tables. This will be provided via 
> the *with* clause in the *repl load* command. The base path will be prefixed 
> to the path of the same external table on the source cluster. It can be 
> provided using the following configuration:
> {code}
> hive.repl.replica.external.table.base.dir=/
> {code}
> * Since changes to the directories of an external table can happen without 
> Hive knowing about them, we cannot capture the relevant events whenever new 
> data is added or removed; we will have to copy the data from the source path 
> to the target path for external tables every time we run incremental 
> replication.
> ** This will require incremental *repl dump* to now create an additional 
> file *\_external\_tables\_info* with data in the following form:
> {code}
> tableName,base64Encoded(tableDataLocation)
> {code}
> If different partitions of a table point to different locations, there will 
> be multiple entries in the file for the same table name, each pointing to a 
> different partition location. Partitions created without the _set location_ 
> command reside within the table's data location and hence do not get 
> separate entries in the file.
> ** *repl load* will read *\_external\_tables\_info* to identify which 
> locations are to be copied from source to target, and create corresponding 
> tasks for them.
> * New external tables will be created metadata-only, with no data copied, as 
> part of the regular tasks during incremental/bootstrap load.
> * Bootstrap dump will also create *\_external\_tables\_info*, which will be 
> used to copy data from source to target as part of bootstrap load.
> * Bootstrap load will create a DAG that can use parallelism in the execution 
> phase; the HDFS copy tasks are created once the bootstrap phase is complete.
> * Since incremental load results in a DAG with only sequential execution 
> (events applied in sequence), we create the HDFS copy tasks alongside the 
> incremental DAG to effectively use the parallelism capability in execution 
> mode. This requires a few basic calculations to approximately meet the 
> configured value of "hive.repl.approx.max.load.tasks".


