[jira] [Updated] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-08 Thread Steve Carlin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Carlin updated HIVE-22274:

Attachment: HIVE-22274.3.patch

> Upgrade Calcite version to 1.21.0
> ---------------------------------
>
> Key: HIVE-22274
> URL: https://issues.apache.org/jira/browse/HIVE-22274
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
> Attachments: HIVE-22274.1.patch, HIVE-22274.2.patch, 
> HIVE-22274.3.patch, HIVE-22274.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947354#comment-16947354
 ] 

Gopal Vijayaraghavan commented on HIVE-22305:
----------------------------------------------

LGTM - +1

> Add the kudu-handler to the packaging module
> --------------------------------------------
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
> Attachments: HIVE-22305.0.patch
>
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947352#comment-16947352
 ] 

Gopal Vijayaraghavan commented on HIVE-14302:
----------------------------------------------

Before committing, remove the {{// All but decimal.}} comment at the top of 
the diff.

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.
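As a minimal standalone illustration of the underlying representation issue (a 
sketch using plain java.math.BigDecimal in place of HiveDecimal; this is not 
Hive code), the scale mismatch and the effect of normalizing it look like this:
{code:java}
import java.math.BigDecimal;

public class DecimalKeyDemo {
  public static void main(String[] args) {
    BigDecimal a = new BigDecimal("1.0");   // scale 1, as in decimal(10,1)
    BigDecimal b = new BigDecimal("1.00");  // scale 2, as in decimal(10,2)

    System.out.println(a.compareTo(b) == 0); // true: numerically equal
    System.out.println(a.equals(b));         // false: representations differ
    System.out.println(a.unscaledValue());   // 10
    System.out.println(b.unscaledValue());   // 100

    // After normalizing both sides to the common scale, the representations
    // match, so a hashtable keyed on the serialized value can compare as-is.
    System.out.println(a.setScale(2).equals(b)); // true
  }
}
{code}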



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947351#comment-16947351
 ] 

Gopal Vijayaraghavan commented on HIVE-14302:
----------------------------------------------

LGTM - +1

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947330#comment-16947330
 ] 

Hive QA commented on HIVE-22305:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982522/HIVE-22305.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17516 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18914/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18914/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18914/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982522 - PreCommit-HIVE-Build

> Add the kudu-handler to the packaging module
> --------------------------------------------
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
> Attachments: HIVE-22305.0.patch
>
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947314#comment-16947314
 ] 

Hive QA commented on HIVE-22305:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18914/dev-support/hive-personality.sh
 |
| git revision | master / a254859 |
| Default Java | 1.8.0_111 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18914/yetus/patch-asflicense-problems.txt
 |
| modules | C: packaging U: packaging |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18914/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add the kudu-handler to the packaging module
> --------------------------------------------
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
> Attachments: HIVE-22305.0.patch
>
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947303#comment-16947303
 ] 

Hive QA commented on HIVE-14302:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982521/HIVE-14302.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17517 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18913/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18913/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18913/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982521 - PreCommit-HIVE-Build

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated HIVE-22305:
-------------------------------
Status: Patch Available  (was: Open)

> Add the kudu-handler to the packaging module
> --------------------------------------------
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
> Attachments: HIVE-22305.0.patch
>
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated HIVE-22305:
-------------------------------
Attachment: HIVE-22305.0.patch

> Add the kudu-handler to the packaging module
> --------------------------------------------
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
> Attachments: HIVE-22305.0.patch
>
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947280#comment-16947280
 ] 

Hive QA commented on HIVE-14302:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
7s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18913/dev-support/hive-personality.sh
 |
| git revision | master / a254859 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18913/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18913/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 

[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Status: In Progress  (was: Patch Available)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Attachment: HIVE-14302.3.patch
Status: Patch Available  (was: In Progress)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.3.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947256#comment-16947256
 ] 

Hive QA commented on HIVE-14302:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982519/HIVE-14302.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 17517 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vector_decimal_mapjoin]
 (batchId=139)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18912/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18912/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18912/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982519 - PreCommit-HIVE-Build

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325379
 ]

ASF GitHub Bot logged work on HIVE-22239:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 23:05
Start Date: 08/Oct/19 23:05
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332770578
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##########
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   I answered the same comment from @miklosgergely above; please see my 
response. I used a new heuristic, as I do not think the existing one was very 
good...
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325379)
Time Spent: 2h 50m  (was: 2h 40m)

> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.
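As a rough sketch of the proposed heuristic (illustrative code, not the patch 
itself; the method name is made up), the selectivity of a range predicate over 
a column with known min/max is the overlap of the two intervals:
{code:java}
public class RangeSelectivityDemo {
  /** Fraction of rows kept, assuming values are uniform over [colMin, colMax]. */
  static double selectivity(double colMin, double colMax,
                            double predMin, double predMax) {
    if (colMax <= colMin) {
      return 1.0; // degenerate column range: keep everything
    }
    double overlap = Math.min(colMax, predMax) - Math.max(colMin, predMin);
    if (overlap <= 0) {
      return 0.0; // predicate falls entirely outside the column's range
    }
    return overlap / (colMax - colMin);
  }

  public static void main(String[] args) {
    // Column values span [0, 100] and the predicate is "col < 25".
    // Old heuristic: 1/3 of rows kept. Uniform-range estimate: 0.25.
    System.out.println(selectivity(0, 100, 0, 25)); // 0.25
  }
}
{code}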



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325378
 ]

ASF GitHub Bot logged work on HIVE-22239:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 23:02
Start Date: 08/Oct/19 23:02
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332736102
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##########
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   I do not think this is critical, but I explored a bit and this seems to be a 
poor choice of heuristic, since in most cases it will lead to underestimating 
the data size (most users will not have dates starting from 1970).
   I will use `01-01-1999` as the lower limit and `12-31-2024` as the upper 
limit (mentioned by Gopal). Let me know if you see any cons to this approach.
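For reference, a small java.time sketch (dates assumed to be UTC; the code is 
illustrative, not from the patch) of the epoch-second constants those bounds 
would translate to in place of the 0..2177452799 range in the diff above:
{code:java}
import java.time.LocalDate;
import java.time.ZoneOffset;

public class TimestampRangeBounds {
  public static void main(String[] args) {
    // 01-01-1999 00:00:00 UTC
    long lower = LocalDate.of(1999, 1, 1)
        .atStartOfDay(ZoneOffset.UTC).toEpochSecond();
    // 12-31-2024 23:59:59 UTC (one second before 01-01-2025)
    long upper = LocalDate.of(2025, 1, 1)
        .atStartOfDay(ZoneOffset.UTC).toEpochSecond() - 1;
    System.out.println(lower + " .. " + upper); // 915148800 .. 1735689599
  }
}
{code}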
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325378)
Time Spent: 2h 40m  (was: 2.5h)

> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947231#comment-16947231
 ] 

Hive QA commented on HIVE-14302:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
18s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 19s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18912/dev-support/hive-personality.sh
 |
| git revision | master / a254859 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18912/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18912/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   

[jira] [Updated] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread Mustafa Iman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mustafa Iman updated HIVE-14302:

Attachment: HIVE-14302.2.patch
Status: Patch Available  (was: In Progress)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.2.patch, HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-14302) Tez: Optimized Hashtable can support DECIMAL keys of same precision

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14302?focusedWorklogId=325351&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325351
 ]

ASF GitHub Bot logged work on HIVE-14302:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 21:45
Start Date: 08/Oct/19 21:45
Worklog Time Spent: 10m 
  Work Description: mustafaiman commented on pull request #803: HIVE-14302
URL: https://github.com/apache/hive/pull/803#discussion_r332747892
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinKey.java
 ##########
 @@ -85,6 +84,7 @@ public static MapJoinKey read(Output output, 
MapJoinObjectSerDeContext context,
 SUPPORTED_PRIMITIVES.add(PrimitiveCategory.BINARY);
 SUPPORTED_PRIMITIVES.add(PrimitiveCategory.VARCHAR);
 SUPPORTED_PRIMITIVES.add(PrimitiveCategory.CHAR);
+SUPPORTED_PRIMITIVES.add(PrimitiveCategory.DECIMAL);
 
 Review comment:
   The check is not necessary. During planning, a common type is used for the 
join keys; this holds whether `hive.cbo.enable` is `true` or `false`. See 
`FunctionRegistry.getCommonClassForComparison`.
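To see how the common type is derived for decimal comparison, here is a sketch 
of the usual precision/scale arithmetic (not Hive's exact implementation) that 
reproduces the decimal(11,2) cast from the EXPLAIN plan quoted in this issue:
{code:java}
public class CommonDecimalTypeDemo {
  /** Common (precision, scale) wide enough to hold both input types exactly. */
  static int[] commonDecimal(int p1, int s1, int p2, int s2) {
    int intDigits = Math.max(p1 - s1, p2 - s2); // digits before the point
    int scale = Math.max(s1, s2);               // digits after the point
    return new int[] { intDigits + scale, scale };
  }

  public static void main(String[] args) {
    int[] t = commonDecimal(10, 2, 10, 1); // decimal(10,2) vs decimal(10,1)
    System.out.println("decimal(" + t[0] + "," + t[1] + ")"); // decimal(11,2)
  }
}
{code}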
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325351)
Time Spent: 40m  (was: 0.5h)

> Tez: Optimized Hashtable can support DECIMAL keys of same precision
> --------------------------------------------------------------------
>
> Key: HIVE-14302
> URL: https://issues.apache.org/jira/browse/HIVE-14302
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.2.0
>Reporter: Gopal Vijayaraghavan
>Assignee: Mustafa Iman
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-14302.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Decimal support in the optimized hashtable was originally left out because 
> Decimal(10,1) == Decimal(10,2) when they hold "1.0" and "1.00" respectively.
> However, joins no longer have any issues with decimal precision, because 
> both sides are cast to a common type.
> {code}
> create temporary table x (a decimal(10,2), b decimal(10,1)) stored as orc;
> insert into x values (1.0, 1.0);
> > explain logical select count(1) from x, x x1 where x.a = x1.b;
> OK  
> LOGICAL PLAN:
> $hdt$_0:$hdt$_0:x
>   TableScan (TS_0)
> alias: x
> filterExpr: (a is not null and true) (type: boolean)
> Filter Operator (FIL_18)
>   predicate: (a is not null and true) (type: boolean)
>   Select Operator (SEL_2)
> expressions: a (type: decimal(10,2))
> outputColumnNames: _col0
> Reduce Output Operator (RS_6)
>   key expressions: _col0 (type: decimal(11,2))
>   sort order: +
>   Map-reduce partition columns: _col0 (type: decimal(11,2))
>   Join Operator (JOIN_8)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 _col0 (type: decimal(11,2))
>   1 _col0 (type: decimal(11,2))
> Group By Operator (GBY_11)
>   aggregations: count(1)
>   mode: hash
>   outputColumnNames: _col0
> {code}
> Note the cast up to Decimal(11,2) in the plan, which normalizes both sides 
> of the join so that HiveDecimal values can be compared as-is.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325350&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325350
 ]

ASF GitHub Bot logged work on HIVE-22239:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 21:44
Start Date: 08/Oct/19 21:44
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332747726
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 ##########
 @@ -316,7 +321,7 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
 
 protected long evaluateExpression(Statistics stats, ExprNodeDesc pred,
 AnnotateStatsProcCtx aspCtx, List neededCols,
-Operator op, long currNumRows) throws SemanticException {
+Operator op, long currNumRows, boolean uniformWithinRange) throws 
SemanticException {
 
 Review comment:
   I have used the `AnnotateStatsProcCtx` to hold that value, thanks.
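The design choice here is to carry the flag on the context object that is 
already threaded through the stats rules instead of adding a parameter to 
every call; roughly (an illustrative stand-in, not the actual Hive class):
{code:java}
public class AnnotateStatsCtxDemo {
  // Stand-in for AnnotateStatsProcCtx holding the uniformWithinRange flag.
  static class Ctx {
    private boolean uniformWithinRange;
    boolean isUniformWithinRange() { return uniformWithinRange; }
    void setUniformWithinRange(boolean b) { uniformWithinRange = b; }
  }

  public static void main(String[] args) {
    Ctx ctx = new Ctx();
    ctx.setUniformWithinRange(true); // set once where the config is read
    System.out.println(ctx.isUniformWithinRange()); // consumed by the rules
  }
}
{code}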
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325350)
Time Spent: 2.5h  (was: 2h 20m)

> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325332
 ]

ASF GitHub Bot logged work on HIVE-22239:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 21:16
Start Date: 08/Oct/19 21:16
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332736102
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##########
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   I do not think this is critical, but I explored a bit and this seems to be a 
poor choice of heuristic, since in most cases it will lead to underestimating 
the data size (most users will not have dates starting from 1970).
   I will use `01-01-2015` (the ORC epoch) as the lower limit and `12-31-2024` 
as the upper limit. Let me know if you see any cons to this approach.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325332)
Time Spent: 2h 20m  (was: 2h 10m)

> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=325331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325331
 ]

ASF GitHub Bot logged work on HIVE-22239:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 08/Oct/19 21:12
Start Date: 08/Oct/19 21:12
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332736102
 
 

 ##########
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##########
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   I do not think this is critical, but I explored a bit and this seems to be a 
poor choice of heuristic, since in most cases it will lead to underestimating 
the data size (most users will not have dates starting from 1970).
   I will use `01-01-2015` (the ORC epoch) as the lower limit and `12-31-2025` 
as the upper limit. Let me know if you see any cons to this approach.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

Worklog Id: (was: 325331)
Time Spent: 2h 10m  (was: 2h)

> Scale data size using column value ranges
> -----------------------------------------
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947155#comment-16947155
 ] 

Hive QA commented on HIVE-21246:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982512/HIVE-21246.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17516 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18911/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18911/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18911/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982512 - PreCommit-HIVE-Build

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> ----------------------------------------------
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now the goal 
> is to make it easier to remove later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is discarded and replaced with 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a strange 
> pattern; callers should just pass in the required SerDe from the start 
> instead of sending the wrong SerDe plus a flag to overwrite it.
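A minimal sketch of the pattern being criticized and the proposed fix 
(illustrative class and method names that only mimic Hive's; these are not the 
actual PlanUtils signatures):
{code:java}
public class SerDeFlagDemo {
  interface SerDe {}
  static class LazySimpleSerDe implements SerDe {}
  static class DelimitedJSONSerDe implements SerDe {}

  // Current shape: the caller's SerDe is silently overwritten by the flag.
  static SerDe resolveSerDe(SerDe requested, boolean useDelimitedJSON) {
    return useDelimitedJSON ? new DelimitedJSONSerDe() : requested;
  }

  // Proposed shape: the caller simply passes the SerDe it actually wants.
  static SerDe resolveSerDe(SerDe requested) {
    return requested;
  }

  public static void main(String[] args) {
    // With the flag set, the LazySimpleSerDe below is discarded -- surprising.
    SerDe s = resolveSerDe(new LazySimpleSerDe(), true);
    System.out.println(s.getClass().getSimpleName()); // DelimitedJSONSerDe
  }
}
{code}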



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947126#comment-16947126
 ] 

Hive QA commented on HIVE-21246:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
11s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18911/dev-support/hive-personality.sh
 |
| git revision | master / a254859 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18911/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18911/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass the required SerDe in from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21884) Scheduled query support

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21884?focusedWorklogId=325275=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325275
 ]

ASF GitHub Bot logged work on HIVE-21884:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 18:49
Start Date: 08/Oct/19 18:49
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #794: HIVE-21884
URL: https://github.com/apache/hive/pull/794#discussion_r332675192
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 ##
 @@ -18,6 +18,36 @@ UPDATE "APP"."TAB_COL_STATS" SET ENGINE = 'hive' WHERE 
ENGINE IS NULL;
 ALTER TABLE "APP"."PART_COL_STATS" ADD ENGINE VARCHAR(128);
 UPDATE "APP"."PART_COL_STATS" SET ENGINE = 'hive' WHERE ENGINE IS NULL;
 
+CREATE TABLE "APP"."SCHEDULED_QUERIES" (
 
 Review comment:
   Cool, thanks
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325275)
Time Spent: 3.5h  (was: 3h 20m)

> Scheduled query support
> ---
>
> Key: HIVE-21884
> URL: https://issues.apache.org/jira/browse/HIVE-21884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21844.04.patch, HIVE-21844.05.patch, 
> HIVE-21844.06.patch, HIVE-21844.07.patch, HIVE-21844.08.patch, 
> HIVE-21844.09.patch, HIVE-21844.15.patch, HIVE-21844.19.patch, 
> HIVE-21884.01.patch, HIVE-21884.02.patch, HIVE-21884.03.patch, 
> HIVE-21884.09.patch, HIVE-21884.10.patch, HIVE-21884.10.patch, 
> HIVE-21884.11.patch, HIVE-21884.12.patch, HIVE-21884.13.patch, 
> HIVE-21884.14.patch, HIVE-21884.14.patch, HIVE-21884.14.patch, 
> HIVE-21884.16.patch, HIVE-21884.17.patch, HIVE-21884.18.patch, 
> HIVE-21884.20.patch, Scheduled queries2.pdf
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> design document:
> https://docs.google.com/document/d/1mJSFdJi_1cbxJTXC9QvGw2rQ3zzJkNfxOO6b5esmyCE/edit#
> in case the google doc is not reachable:  [^Scheduled queries2.pdf] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21884) Scheduled query support

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21884?focusedWorklogId=325274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325274
 ]

ASF GitHub Bot logged work on HIVE-21884:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 18:49
Start Date: 08/Oct/19 18:49
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #794: HIVE-21884
URL: https://github.com/apache/hive/pull/794#discussion_r332675192
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 ##
 @@ -18,6 +18,36 @@ UPDATE "APP"."TAB_COL_STATS" SET ENGINE = 'hive' WHERE 
ENGINE IS NULL;
 ALTER TABLE "APP"."PART_COL_STATS" ADD ENGINE VARCHAR(128);
 UPDATE "APP"."PART_COL_STATS" SET ENGINE = 'hive' WHERE ENGINE IS NULL;
 
+CREATE TABLE "APP"."SCHEDULED_QUERIES" (
 
 Review comment:
   Cool, thanks (y)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325274)
Time Spent: 3h 20m  (was: 3h 10m)

> Scheduled query support
> ---
>
> Key: HIVE-21884
> URL: https://issues.apache.org/jira/browse/HIVE-21884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21844.04.patch, HIVE-21844.05.patch, 
> HIVE-21844.06.patch, HIVE-21844.07.patch, HIVE-21844.08.patch, 
> HIVE-21844.09.patch, HIVE-21844.15.patch, HIVE-21844.19.patch, 
> HIVE-21884.01.patch, HIVE-21884.02.patch, HIVE-21884.03.patch, 
> HIVE-21884.09.patch, HIVE-21884.10.patch, HIVE-21884.10.patch, 
> HIVE-21884.11.patch, HIVE-21884.12.patch, HIVE-21884.13.patch, 
> HIVE-21884.14.patch, HIVE-21884.14.patch, HIVE-21884.14.patch, 
> HIVE-21884.16.patch, HIVE-21884.17.patch, HIVE-21884.18.patch, 
> HIVE-21884.20.patch, Scheduled queries2.pdf
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> design document:
> https://docs.google.com/document/d/1mJSFdJi_1cbxJTXC9QvGw2rQ3zzJkNfxOO6b5esmyCE/edit#
> in case the google doc is not reachable:  [^Scheduled queries2.pdf] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22305) Add the kudu-handler to the packaging module

2019-10-08 Thread Grant Henke (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke reassigned HIVE-22305:
--


> Add the kudu-handler to the packaging module
> 
>
> Key: HIVE-22305
> URL: https://issues.apache.org/jira/browse/HIVE-22305
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Grant Henke
>Assignee: Grant Henke
>Priority: Major
>
> The hive-kudu-handler needs to be added to the packaging module to ensure the 
> jars are packaged into the tar distribution.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947082#comment-16947082
 ] 

David Mollitor commented on HIVE-21246:
---

[~abstractdog] Wondering if you could take a look at this one too for me.  I 
just put up a new patch.

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass the required SerDe in from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21246:
--
Status: Open  (was: Patch Available)

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass the required SerDe in from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21246:
--
Status: Patch Available  (was: Open)

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass the required SerDe in from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21246) Un-bury DelimitedJSONSerDe from PlanUtils.java

2019-10-08 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-21246:
--
Attachment: HIVE-21246.2.patch

> Un-bury DelimitedJSONSerDe from PlanUtils.java
> --
>
> Key: HIVE-21246
> URL: https://issues.apache.org/jira/browse/HIVE-21246
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21246.1.patch, HIVE-21246.1.patch, 
> HIVE-21246.2.patch
>
>
> Ultimately, I'd like to get rid of 
> {{org.apache.hadoop.hive.serde2.DelimitedJSONSerDe}}, but for now, trying to 
> make it easier to get rid of later.  It's currently buried in 
> {{PlanUtils.java}}.
> A SerDe and a boolean flag get passed into these methods.  If the flag is 
> set to true, the specified SerDe is overwritten and assigned to 
> {{DelimitedJSONSerDe}}.  This is not documented anywhere and it's a weird 
> thing to do; just pass the required SerDe in from the start instead of 
> sending the wrong SerDe and a flag to overwrite it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-12971) Hive Support for Kudu

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-12971?focusedWorklogId=325205=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325205
 ]

ASF GitHub Bot logged work on HIVE-12971:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 17:45
Start Date: 08/Oct/19 17:45
Worklog Time Spent: 10m 
  Work Description: granthenke commented on pull request #804: HIVE-12971: 
Add the kudu-handler to the packaging module
URL: https://github.com/apache/hive/pull/804
 
 
   This patch adds the hive-kudu-handler to the packaging module to
   ensure the jars are packaged into the tar distribution.
   
   Change-Id: Iebdb46f7d89bfa5c7a36035b39c3d58b01e58fca
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325205)
Time Spent: 4h 20m  (was: 4h 10m)

> Hive Support for Kudu
> -
>
> Key: HIVE-12971
> URL: https://issues.apache.org/jira/browse/HIVE-12971
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Lenni Kuff
>Assignee: Grant Henke
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-12971.0.patch, HIVE-12971.1.patch, 
> HIVE-12971.2.patch, HIVE-12971.3.patch, HIVE-12971.4.patch, HIVE-12971.5.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> JIRA for tracking work related to Hive/Kudu integration.
> It would be useful to allow Kudu data to be accessible via Hive. This would 
> involve creating a Kudu SerDe/StorageHandler and implementing support for 
> QUERY and DML commands like SELECT, INSERT, UPDATE, and DELETE. Kudu 
> Input/OutputFormats classes already exist. The work can be staged to support 
> this functionality incrementally.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-08 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947072#comment-16947072
 ] 

David Mollitor commented on HIVE-22217:
---

[~abstractdog] Patch attached for branch-3.  I'm not sure how to kick off 
tests. Thanks!

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.1.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.
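
A rough sketch of the requested logging (the surrounding reload hook and
variable names are assumptions, not the actual HiveServer2 code):

{code:java}
import java.util.Collection;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AuxJarReloadLogging {
  private static final Logger LOG = LoggerFactory.getLogger(AuxJarReloadLogging.class);

  // Log every auxiliary JAR path as it is picked up, so an operator can
  // confirm from the server log exactly which files a reload loaded.
  static void logReloadedJars(Collection<String> jarPaths) {
    LOG.info("Reloading {} auxiliary JAR(s)", jarPaths.size());
    for (String jar : jarPaths) {
      LOG.info("Loading auxiliary JAR: {}", jar);
    }
  }
}
{code}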



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22217) Better Logging for Hive JAR Reload

2019-10-08 Thread David Mollitor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Mollitor updated HIVE-22217:
--
Attachment: HIVE-22217.branch3.1.patch

> Better Logging for Hive JAR Reload
> --
>
> Key: HIVE-22217
> URL: https://issues.apache.org/jira/browse/HIVE-22217
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.2.0, 2.3.6
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Fix For: 4.0.0
>
> Attachments: HIVE-22217.1.patch, HIVE-22217.branch3.1.patch
>
>
> Troubleshooting Hive Reloadable Auxiliary JARs has always been difficult.
> Add logging to at least confirm which JAR files are being loaded.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16947032#comment-16947032
 ] 

Hive QA commented on HIVE-22097:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982497/HIVE-22097.1.branch-3.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 139 failed/errored test(s), 14415 tests 
executed
*Failed tests:*
{noformat}
TestAddPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestAddPartitionsFromPartSpec - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestAggregateStatsCache - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestAlterPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestAppendPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=274)
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestCatalogNonDefaultClient - did not produce a TEST-*.xml file (likely timed 
out) (batchId=226)
TestCatalogNonDefaultSvr - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestCatalogOldClient - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestCatalogs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestChainFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestCheckConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestCloseableThreadLocal - did not produce a TEST-*.xml file (likely timed out) 
(batchId=333)
TestCustomQueryFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=237)
TestDatabaseName - did not produce a TEST-*.xml file (likely timed out) 
(batchId=195)
TestDatabases - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestDefaultConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestDropPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=274)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=229)
TestExchangePartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestFMSketchSerialization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=238)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestForeignKey - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestFunctions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestGetPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestGetPartitionsUsingProjectionAndFilterSpecs - did not produce a TEST-*.xml 
file (likely timed out) (batchId=228)
TestGetTableMeta - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestGroupFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestHLLNoBias - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHLLSerialization - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHdfsUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=235)
TestHiveAlterHandler - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=237)
TestHiveMetaStorePartitionSpecs - did not produce a TEST-*.xml file (likely 
timed out) (batchId=228)
TestHiveMetaStoreSchemaMethods - did not produce a TEST-*.xml file (likely 
timed out) (batchId=235)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=232)
TestHiveMetaToolCommandLine - did not produce a TEST-*.xml file (likely timed 
out) (batchId=230)
TestHiveMetastoreCli - did not produce a TEST-*.xml file (likely timed out) 
(batchId=226)
TestHmsServerAuthorization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=235)
TestHyperLogLog - did not produce a TEST-*.xml file (likely timed out) 
(batchId=238)
TestHyperLogLogDense - did not produce a 

[jira] [Commented] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946966#comment-16946966
 ] 

Hive QA commented on HIVE-22097:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 19s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-18910/patches/PreCommit-HIVE-Build-18910.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18910/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> 

[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Status: Patch Available  (was: Open)

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.1.1, 3.0.0
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 
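
For context, a JDK-version-tolerant variant of the reflective lookup would look
roughly like the sketch below. The field names reflect the known JDK 8 vs JDK 9+
{{ArrayList$SubList}} layouts; the fallback strategy itself is an assumption, not
necessarily the committed fix:

{code:java}
import java.lang.reflect.Field;
import java.util.ArrayList;

public class SubListOffsetLookup {
  static Field subListOffsetField() throws NoSuchFieldException {
    // Concrete class backing ArrayList.subList() (java.util.ArrayList$SubList).
    Class<?> subListClass = new ArrayList<Integer>().subList(0, 0).getClass();
    try {
      return subListClass.getDeclaredField("parentOffset"); // JDK 8 layout
    } catch (NoSuchFieldException e) {
      return subListClass.getDeclaredField("offset");       // JDK 9+ layout
    }
  }
}
{code}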



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Attachment: (was: HIVE-22097.1.branch-3.1.patch)

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Attachment: HIVE-22097.1.branch-3.1.patch

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Status: Open  (was: Patch Available)

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.1.1, 3.0.0
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22297) qtests: add regex based replacer

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946950#comment-16946950
 ] 

Hive QA commented on HIVE-22297:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982487/HIVE-22297.02.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17515 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18909/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18909/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18909/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982487 - PreCommit-HIVE-Build

> qtests: add regex based replacer
> 
>
> Key: HIVE-22297
> URL: https://issues.apache.org/jira/browse/HIVE-22297
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22297.01.patch, HIVE-22297.02.patch
>
>
> Right now we have some hard-wired replacers for specific patterns.
> The idea would be to generalize this and make it possible to configure 
> replacements from the qfile.
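
A minimal sketch of what a qfile-configurable replacer could look like (the
directive syntax {{--! qt:replace:/pattern/replacement/}} is an assumption for
illustration, as is the class name):

{code:java}
import java.util.regex.Pattern;

public class QTestRegexReplacer {
  private final Pattern pattern;
  private final String replacement;

  // Parses e.g.: --! qt:replace:/Time taken: [0-9.]+/Time taken: ###/
  // Naive split: assumes the pattern itself contains no '/'.
  QTestRegexReplacer(String directive) {
    String body = directive.substring(directive.indexOf('/') + 1,
        directive.lastIndexOf('/'));
    int sep = body.indexOf('/');
    this.pattern = Pattern.compile(body.substring(0, sep));
    this.replacement = body.substring(sep + 1);
  }

  // Applied to each line of query output before it is compared to the .q.out.
  String apply(String outputLine) {
    return pattern.matcher(outputLine).replaceAll(replacement);
  }
}
{code}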



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22291) HMS Translation: Limit translation to hive default catalog only

2019-10-08 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-22291:
-
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been committed to master. Closing the jira. Thank you for the review 
[~samuelan]

> HMS Translation: Limit translation to hive default catalog only
> ---
>
> Key: HIVE-22291
> URL: https://issues.apache.org/jira/browse/HIVE-22291
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22291.patch
>
>
> HMS Translation should only be limited to a single catalog.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21114) Create read-only transactions

2019-10-08 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-21114:
--
Attachment: HIVE-21114.1.patch

> Create read-only transactions
> -
>
> Key: HIVE-21114
> URL: https://issues.apache.org/jira/browse/HIVE-21114
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-21114.1.patch
>
>
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.  
> Then we can optimize {{TxnHandler.commitTxn()}} so that it doesn't do any 
> checks in write_set etc.
> {{TxnHandler.commitTxn()}} already starts with {{lockTransactionRecord(stmt, 
> txnid, TXN_OPEN)}} so it can read the txn type in the same SQL stmt.
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT.  By the time 
> {{Driver.openTransaction();}} is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn.  This should be a different jira.
> cc [~ikryvenko]
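
A compilable sketch of the two decisions described above (the enum naming and
method shapes are assumptions about where the logic would land, not the patch):

{code:java}
import java.util.Set;

public class ReadOnlyTxnSketch {
  enum TxnType { DEFAULT, READ_ONLY }  // naming assumed, in the spirit of HIVE-21036

  // Auto-commit path: a single statement is read-only iff it writes nothing,
  // which the compiled plan already knows by the time the txn is opened.
  static TxnType classify(Set<String> writeEntities) {
    return writeEntities.isEmpty() ? TxnType.READ_ONLY : TxnType.DEFAULT;
  }

  // commitTxn() path: after lockTransactionRecord() has read the txn type,
  // a read-only txn can skip the write_set conflict checks entirely.
  static boolean needsWriteSetCheck(TxnType type) {
    return type != TxnType.READ_ONLY;
  }
}
{code}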



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21114) Create read-only transactions

2019-10-08 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-21114:
--
Attachment: (was: HIVE-21114.1.patch)

> Create read-only transactions
> -
>
> Key: HIVE-21114
> URL: https://issues.apache.org/jira/browse/HIVE-21114
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
>
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.  
> Then we can optimize {{TxnHandler.commitTxn()}} so that it doesn't do any 
> checks in write_set etc.
> {{TxnHandler.commitTxn()}} already starts with {{lockTransactionRecord(stmt, 
> txnid, TXN_OPEN)}} so it can read the txn type in the same SQL stmt.
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT.  By the time 
> {{Driver.openTransaction();}} is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn.  This should be a different jira.
> cc [~ikryvenko]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21114) Create read-only transactions

2019-10-08 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-21114:
--
Attachment: HIVE-21114.1.patch

> Create read-only transactions
> -
>
> Key: HIVE-21114
> URL: https://issues.apache.org/jira/browse/HIVE-21114
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Denys Kuzmenko
>Priority: Major
> Attachments: HIVE-21114.1.patch
>
>
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.  
> Then we can optimize {{TxnHandler.commitTxn()}} so that it doesn't do any 
> checks in write_set etc.
> {{TxnHandler.commitTxn()}} already starts with {{lockTransactionRecord(stmt, 
> txnid, TXN_OPEN)}} so it can read the txn type in the same SQL stmt.
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT.  By the time 
> {{Driver.openTransaction();}} is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn.  This should be a different jira.
> cc [~ikryvenko]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22297) qtests: add regex based replacer

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946907#comment-16946907
 ] 

Hive QA commented on HIVE-22297:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
12s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
48s{color} | {color:blue} itests/util in master has 53 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} itests/util: The patch generated 1 new + 32 unchanged 
- 0 fixed = 33 total (was 32) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18909/dev-support/hive-personality.sh
 |
| git revision | master / d907dfe |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18909/yetus/diff-checkstyle-itests_util.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18909/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql itests/util U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18909/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> qtests: add regex based replacer
> 
>
> Key: HIVE-22297
> URL: https://issues.apache.org/jira/browse/HIVE-22297
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22297.01.patch, HIVE-22297.02.patch
>
>
> Right now we have some hard-wired replacers for specific patterns.
> The idea would be to generalize this and make it possible to configure 
> replacements from the qfile.
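
For illustration, a minimal sketch of what such a configurable replacer could 
look like. The class and the {{--! qt:replace:/.../.../}} directive syntax 
below are hypothetical, not the committed implementation:

{noformat}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of a qfile-configurable, regex-based output replacer.
// Rules are applied in insertion order to each output line.
public class RegexReplacerSketch {
  private final Map<Pattern, String> rules = new LinkedHashMap<>();

  // e.g. parsed from a qfile directive such as:
  //   --! qt:replace:/transaction_id=\d+/transaction_id=###/
  public void addRule(String regex, String replacement) {
    rules.put(Pattern.compile(regex), replacement);
  }

  public String apply(String outputLine) {
    String masked = outputLine;
    for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
      masked = rule.getKey().matcher(masked).replaceAll(rule.getValue());
    }
    return masked;
  }
}
{noformat}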



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22304) Upgrade ORC version to 1.6.0

2019-10-08 Thread David Lavati (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Lavati reassigned HIVE-22304:
---


> Upgrade ORC version to 1.6.0
> 
>
> Key: HIVE-22304
> URL: https://issues.apache.org/jira/browse/HIVE-22304
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22303) TestObjectStore starts some deadline timers which are never stopped

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-22303:
---


> TestObjectStore starts some deadline timers which are never stopped
> ---
>
> Key: HIVE-22303
> URL: https://issues.apache.org/jira/browse/HIVE-22303
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Because these timers are never stopped, they may linger as a thread-local 
> and eventually time out, since the disarm logic is missing:
> https://github.com/apache/hive/blob/d907dfe68ed84714d62a22e5191efa616eab2b24/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestObjectStore.java#L373
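
For context, a minimal sketch of the missing disarm pairing, assuming the 
metastore {{Deadline}} API ({{registerIfNot}}/{{startTimer}}/{{stopTimer}}); 
treat the exact signatures as an assumption to be checked against the class:

{noformat}
import org.apache.hadoop.hive.metastore.Deadline;

// Sketch (assumed API): every startTimer() is paired with a stopTimer() in a
// finally block, so the thread-local deadline cannot leak into later tests.
public class DeadlineDisarmSketch {
  static void timedMetastoreCall() throws Exception {
    Deadline.registerIfNot(100000);             // arm a deadline for this thread (ms)
    Deadline.startTimer("getPartitionsByExpr"); // start timing the named call
    try {
      // ... exercise ObjectStore here ...
    } finally {
      Deadline.stopTimer();                     // disarm even when the call throws
    }
  }
}
{noformat}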



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946843#comment-16946843
 ] 

Hive QA commented on HIVE-22097:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982480/HIVE-22097.1.branch-3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18908/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18908/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18908/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12982480/HIVE-22097.1.branch-3.patch
 was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982480 - PreCommit-HIVE-Build

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: 

[jira] [Commented] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946839#comment-16946839
 ] 

Hive QA commented on HIVE-22097:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982480/HIVE-22097.1.branch-3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 148 failed/errored test(s), 14438 tests 
executed
*Failed tests:*
{noformat}
TestAddPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestAddPartitionsFromPartSpec - did not produce a TEST-*.xml file (likely timed 
out) (batchId=230)
TestAdminUser - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestAggregateStatsCache - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestAlterPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestAppendPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestBeeLineDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=276)
TestCachedStore - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogCaching - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogNonDefaultClient - did not produce a TEST-*.xml file (likely timed 
out) (batchId=228)
TestCatalogNonDefaultSvr - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestCatalogOldClient - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestCatalogs - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestChainFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestCheckConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestCloseableThreadLocal - did not produce a TEST-*.xml file (likely timed out) 
(batchId=335)
TestCustomQueryFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestDataSourceProviderFactory - did not produce a TEST-*.xml file (likely timed 
out) (batchId=239)
TestDatabases - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDeadline - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestDefaultConstraint - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestDropPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=276)
TestEmbeddedHiveMetaStore - did not produce a TEST-*.xml file (likely timed 
out) (batchId=231)
TestExchangePartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestFMSketchSerialization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=240)
TestFilterHooks - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestForeignKey - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestFunctions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestGetPartitions - did not produce a TEST-*.xml file (likely timed out) 
(batchId=230)
TestGetPartitionsUsingProjectionAndFilterSpecs - did not produce a TEST-*.xml 
file (likely timed out) (batchId=230)
TestGetTableMeta - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestGroupFilter - did not produce a TEST-*.xml file (likely timed out) 
(batchId=232)
TestHLLNoBias - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHLLSerialization - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHdfsUtils - did not produce a TEST-*.xml file (likely timed out) 
(batchId=237)
TestHiveAlterHandler - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestHiveMetaStoreGetMetaConf - did not produce a TEST-*.xml file (likely timed 
out) (batchId=239)
TestHiveMetaStorePartitionSpecs - did not produce a TEST-*.xml file (likely 
timed out) (batchId=230)
TestHiveMetaStoreSchemaMethods - did not produce a TEST-*.xml file (likely 
timed out) (batchId=237)
TestHiveMetaStoreTimeout - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHiveMetaStoreTxns - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHiveMetaStoreWithEnvironmentContext - did not produce a TEST-*.xml file 
(likely timed out) (batchId=234)
TestHiveMetaToolCommandLine - did not produce a TEST-*.xml file (likely timed 
out) (batchId=232)
TestHiveMetastoreCli - did not produce a TEST-*.xml file (likely timed out) 
(batchId=228)
TestHmsServerAuthorization - did not produce a TEST-*.xml file (likely timed 
out) (batchId=237)
TestHyperLogLog - did not produce a TEST-*.xml file (likely timed out) 
(batchId=240)
TestHyperLogLogDense - did not produce a TEST-*.xml file (likely timed out) 
(batchId=239)
TestHyperLogLogMerge - did not produce a 

[jira] [Work logged] (HIVE-22278) Upgrade log4j to 2.12.1

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22278?focusedWorklogId=325054&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325054
 ]

ASF GitHub Bot logged work on HIVE-22278:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 13:07
Start Date: 08/Oct/19 13:07
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #798: HIVE-22278: 
Upgrade log4j to 2.12.1
URL: https://github.com/apache/hive/pull/798
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325054)
Time Spent: 20m  (was: 10m)

> Upgrade log4j to 2.12.1
> ---
>
> Key: HIVE-22278
> URL: https://issues.apache.org/jira/browse/HIVE-22278
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22278.02.patch, HIVE-22278.02.patch, 
> HIVE-22278.02.patch, HIVE-22278.02.patch, HIVE-22278.02.patch, 
> HIVE-22278.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive's currently using log4j 2.10.0 and according to HIVE-21273, a number of 
> issues are present in it, which can be resolved by upgrading to 2.12.1:
> Curly braces in parameters are treated as placeholders (illustrated in the 
> sketch after this list)
>  affectsVersions:2.8.2;2.9.0;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2032?filter=allopenissues]
>  Remove Log4J API dependency on Management APIs
>  affectsVersions:2.9.1;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2126?filter=allopenissues]
>  Log4j2 throws NoClassDefFoundError in Java 9
>  affectsVersions:2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2129?filter=allopenissues]
>  ThreadContext map is cleared => entries are only available for one log event
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2158?filter=allopenissues]
>  Objects held in SortedArrayStringMap cannot be filtered during serialization
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2163?filter=allopenissues]
>  NullPointerException at 
> org.apache.logging.log4j.util.Activator.loadProvider(Activator.java:81) in 
> log4j 2.10.0
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2182?filter=allopenissues]
>  MarkerFilter onMismatch invalid attribute in .properties
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2202?filter=allopenissues]
>  Configuration builder classes should look for "onMismatch"; not "onMisMatch".
>  
> affectsVersions:2.4;2.4.1;2.5;2.6;2.6.1;2.6.2;2.7;2.8;2.8.1;2.8.2;2.9.0;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2219?filter=allopenissues]
>  Empty Automatic-Module-Name Header
>  affectsVersions:2.10.0;2.11.0;3.0.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2254?filter=allopenissues]
>  ConcurrentModificationException from 
> org.apache.logging.log4j.status.StatusLogger.<clinit>(StatusLogger.java:71)
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2276?filter=allopenissues]
>  Allow SystemPropertiesPropertySource to run with a SecurityManager that 
> rejects system property access
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2279?filter=allopenissues]
>  ParserConfigurationException when using Log4j with 
> oracle.xml.jaxp.JXDocumentBuilderFactory
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2283?filter=allopenissues]
>  Log4j 2.10+ not working with SLF4J 1.8 in OSGI environment
>  affectsVersions:2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2305?filter=allopenissues]
>  fix the CacheEntry map in ThrowableProxy#toExtendedStackTrace to be put and 
> gotten with same key
>  affectsVersions:2.6.2;2.7;2.8;2.8.1;2.8.2;2.9.0;2.9.1;2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2389?filter=allopenissues]
>  NullPointerException when closing never used RollingRandomAccessFileAppender
>  affectsVersions:2.10.0;2.11.1
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2418?filter=allopenissues]
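
To make the first item concrete, here is a small demo of the placeholder bug 
(LOG4J2-2032) whose fix this upgrade picks up. As far as I can tell it 
manifested with reusable (garbage-free) messages, so treat the exact trigger 
as an assumption:

{noformat}
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Demo of LOG4J2-2032: on affected versions (2.8.2-2.10.0), a parameter value
// that itself contains "{}" could be re-interpreted as a placeholder; after
// the upgrade it is printed literally.
public class CurlyBraceDemo {
  private static final Logger LOG = LogManager.getLogger(CurlyBraceDemo.class);

  public static void main(String[] args) {
    LOG.info("value = {}", "a{}b"); // expected output: "value = a{}b"
  }
}
{noformat}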



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22270) Upgrade commons-io to 2.6

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22270?focusedWorklogId=325052&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325052
 ]

ASF GitHub Bot logged work on HIVE-22270:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 13:06
Start Date: 08/Oct/19 13:06
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #796: HIVE-22270 
Upgrade commons-io to 2.6
URL: https://github.com/apache/hive/pull/796
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325052)
Time Spent: 20m  (was: 10m)

> Upgrade commons-io to 2.6
> -
>
> Key: HIVE-22270
> URL: https://issues.apache.org/jira/browse/HIVE-22270
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22270.01.patch, HIVE-22270.01.patch, 
> HIVE-22270.01.patch, HIVE-22270.01.patch, HIVE-22270.patch, HIVE-22270.patch, 
> HIVE-22270.patch, HIVE-22270.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hive's currently using commons-io 2.4 and according to HIVE-21273, a number 
> of issues are present in it, which can be resolved by upgrading to 2.6:
> IOUtils copyLarge() and skip() methods are performance hogs
>  affectsVersions:2.3;2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-355?filter=allopenissues]
>  CharSequenceInputStream#reset() behaves incorrectly in case when buffer size 
> is not dividable by data size
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-356?filter=allopenissues]
>  [Tailer] InterruptedException while the thread is sleeping is silently ignored
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-357?filter=allopenissues]
>  IOUtils.contentEquals* methods return false if input1 == input2; should 
> return true (illustrated in the sketch after this list)
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-362?filter=allopenissues]
>  Apache Commons - standard links for documents are failing
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-369?filter=allopenissues]
>  FileUtils.sizeOfDirectoryAsBigInteger can overflow
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-390?filter=allopenissues]
>  Regression in FileUtils.readFileToString from 2.0.1
>  affectsVersions:2.1;2.2;2.3;2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-453?filter=allopenissues]
>  Correct exception message in FileUtils.getFile(File; String...)
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-479?filter=allopenissues]
>  org.apache.commons.io.FileUtils#waitFor waits too long
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-481?filter=allopenissues]
>  FilenameUtils should handle embedded null bytes
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-484?filter=allopenissues]
>  Exceptions are suppressed incorrectly when copying files.
>  affectsVersions:2.4;2.5
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-502?filter=allopenissues]
>  
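
To make the {{contentEquals}} item concrete, a small demo of IO-362 (a 
sketch, not Hive code):

{noformat}
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import org.apache.commons.io.IOUtils;

// Demo of IO-362: in commons-io 2.4, passing the same stream reference twice
// made contentEquals() compare the stream against itself byte-by-byte and
// report false; 2.6 short-circuits on reference equality and returns true.
public class ContentEqualsDemo {
  public static void main(String[] args) throws Exception {
    InputStream in = new ByteArrayInputStream("hello".getBytes("UTF-8"));
    System.out.println(IOUtils.contentEquals(in, in)); // true on 2.6
  }
}
{noformat}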



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-7145) Remove dependence on apache commons-lang

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7145?focusedWorklogId=325053&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325053
 ]

ASF GitHub Bot logged work on HIVE-7145:


Author: ASF GitHub Bot
Created on: 08/Oct/19 13:06
Start Date: 08/Oct/19 13:06
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #795: HIVE-7145 
Remove dependence on apache commons-lang
URL: https://github.com/apache/hive/pull/795
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 325053)
Time Spent: 20m  (was: 10m)

> Remove dependence on apache commons-lang
> 
>
> Key: HIVE-7145
> URL: https://issues.apache.org/jira/browse/HIVE-7145
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-7145.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We currently depend on both Apache commons-lang and commons-lang3. They are 
> the same project, just at version 2.x vs 3.x. I propose that we move all of 
> the references in Hive to commons-lang3 and remove the v2 usage.
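
For most call sites the migration is a package rename; a minimal before/after 
sketch (a few APIs moved or were renamed in 3.x, so each use still needs 
checking):

{noformat}
// Before (commons-lang 2.x):
//   import org.apache.commons.lang.StringUtils;
// After (commons-lang3): same calls, different package for most utilities.
import org.apache.commons.lang3.StringUtils;

public class LangMigrationSketch {
  public static void main(String[] args) {
    System.out.println(StringUtils.isBlank("  "));      // true
    System.out.println(StringUtils.capitalize("hive")); // Hive
  }
}
{noformat}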



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22278) Upgrade log4j to 2.12.1

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22278:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~dlavati]!

> Upgrade log4j to 2.12.1
> ---
>
> Key: HIVE-22278
> URL: https://issues.apache.org/jira/browse/HIVE-22278
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22278.02.patch, HIVE-22278.02.patch, 
> HIVE-22278.02.patch, HIVE-22278.02.patch, HIVE-22278.02.patch, 
> HIVE-22278.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's currently using log4j 2.10.0 and according to HIVE-21273, a number of 
> issues are present in it, which can be resolved by upgrading to 2.12.1:
> Curly braces in parameters are treated as placeholders
>  affectsVersions:2.8.2;2.9.0;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2032?filter=allopenissues]
>  Remove Log4J API dependency on Management APIs
>  affectsVersions:2.9.1;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2126?filter=allopenissues]
>  Log4j2 throws NoClassDefFoundError in Java 9
>  affectsVersions:2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2129?filter=allopenissues]
>  ThreadContext map is cleared => entries are only available for one log event
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2158?filter=allopenissues]
>  Objects held in SortedArrayStringMap cannot be filtered during serialization
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2163?filter=allopenissues]
>  NullPointerException at 
> org.apache.logging.log4j.util.Activator.loadProvider(Activator.java:81) in 
> log4j 2.10.0
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2182?filter=allopenissues]
>  MarkerFilter onMismatch invalid attribute in .properties
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2202?filter=allopenissues]
>  Configuration builder classes should look for "onMismatch"; not "onMisMatch".
>  
> affectsVersions:2.4;2.4.1;2.5;2.6;2.6.1;2.6.2;2.7;2.8;2.8.1;2.8.2;2.9.0;2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2219?filter=allopenissues]
>  Empty Automatic-Module-Name Header
>  affectsVersions:2.10.0;2.11.0;3.0.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2254?filter=allopenissues]
>  ConcurrentModificationException from 
> org.apache.logging.log4j.status.StatusLogger.<clinit>(StatusLogger.java:71)
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2276?filter=allopenissues]
>  Allow SystemPropertiesPropertySource to run with a SecurityManager that 
> rejects system property access
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2279?filter=allopenissues]
>  ParserConfigurationException when using Log4j with 
> oracle.xml.jaxp.JXDocumentBuilderFactory
>  affectsVersions:2.10.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2283?filter=allopenissues]
>  Log4j 2.10+ not working with SLF4J 1.8 in OSGI environment
>  affectsVersions:2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2305?filter=allopenissues]
>  fix the CacheEntry map in ThrowableProxy#toExtendedStackTrace to be put and 
> gotten with same key
>  affectsVersions:2.6.2;2.7;2.8;2.8.1;2.8.2;2.9.0;2.9.1;2.10.0;2.11.0
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2389?filter=allopenissues]
>  NullPointerException when closing never used RollingRandomAccessFileAppender
>  affectsVersions:2.10.0;2.11.1
>  
> [https://issues.apache.org/jira/projects/LOG4J2/issues/LOG4J2-2418?filter=allopenissues]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22270) Upgrade commons-io to 2.6

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22270:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~dlavati]!

> Upgrade commons-io to 2.6
> -
>
> Key: HIVE-22270
> URL: https://issues.apache.org/jira/browse/HIVE-22270
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Lavati
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22270.01.patch, HIVE-22270.01.patch, 
> HIVE-22270.01.patch, HIVE-22270.01.patch, HIVE-22270.patch, HIVE-22270.patch, 
> HIVE-22270.patch, HIVE-22270.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive's currently using commons-io 2.4 and according to HIVE-21273, a number 
> of issues are present in it, which can be resolved by upgrading to 2.6:
> IOUtils copyLarge() and skip() methods are performance hogs
>  affectsVersions:2.3;2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-355?filter=allopenissues]
>  CharSequenceInputStream#reset() behaves incorrectly in case when buffer size 
> is not dividable by data size
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-356?filter=allopenissues]
>  [Tailer] InterruptedException while the thread is sleeping is silently ignored
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-357?filter=allopenissues]
>  IOUtils.contentEquals* methods return false if input1 == input2; should 
> return true
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-362?filter=allopenissues]
>  Apache Commons - standard links for documents are failing
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-369?filter=allopenissues]
>  FileUtils.sizeOfDirectoryAsBigInteger can overflow
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-390?filter=allopenissues]
>  Regression in FileUtils.readFileToString from 2.0.1
>  affectsVersions:2.1;2.2;2.3;2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-453?filter=allopenissues]
>  Correct exception message in FileUtils.getFile(File; String...)
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-479?filter=allopenissues]
>  org.apache.commons.io.FileUtils#waitFor waits too long
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-481?filter=allopenissues]
>  FilenameUtils should handle embedded null bytes
>  affectsVersions:2.4
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-484?filter=allopenissues]
>  Exceptions are suppressed incorrectly when copying files.
>  affectsVersions:2.4;2.5
>  
> [https://issues.apache.org/jira/projects/IO/issues/IO-502?filter=allopenissues]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-7145) Remove dependence on apache commons-lang

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-7145:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

pushed to master. Thank you [~dlavati]!

> Remove dependence on apache commons-lang
> 
>
> Key: HIVE-7145
> URL: https://issues.apache.org/jira/browse/HIVE-7145
> Project: Hive
>  Issue Type: Bug
>Reporter: Owen O'Malley
>Assignee: David Lavati
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-7145.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We currently depend on both Apache commons-lang and commons-lang3. They are 
> the same project, just at version 2.x vs 3.x. I propose that we move all of 
> the references in Hive to commons-lang3 and remove the v2 usage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22297) qtests: add regex based replacer

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22297:

Attachment: HIVE-22297.02.patch

> qtests: add regex based replacer
> 
>
> Key: HIVE-22297
> URL: https://issues.apache.org/jira/browse/HIVE-22297
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22297.01.patch, HIVE-22297.02.patch
>
>
> Right now we have some hard-wired replacers for specific patterns.
> The idea would be to generalize this and make it possible to configure 
> replacements from the qfile.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946769#comment-16946769
 ] 

Hive QA commented on HIVE-22097:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 16s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-18907/patches/PreCommit-HIVE-Build-18907.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18907/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> 

[jira] [Commented] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946763#comment-16946763
 ] 

Hive QA commented on HIVE-22284:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982473/HIVE-22284.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 17517 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_dp] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_struct_type_vectorization]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=159)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18906/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18906/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18906/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982473 - PreCommit-HIVE-Build

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to which Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds the DB and table name 
> plus a partition spec, concatenated by . and / . This information is derived 
> from the Path of the file being cached, so it sometimes produces a wrong 
> tag, especially for external tables.
> There is also a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the partition's table): the overall maxCount and maxSize 
> do not add up to the sum of those in the partitions. This happens when 
> buffers get removed from the cache.
>  
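
To illustrate why path-derived tags go wrong, here is a hypothetical sketch 
of the kind of derivation described above (not the actual LLAP code); it 
assumes the managed-table layout {{.../db.db/table/partition dirs/file}}, 
which external tables need not follow:

{noformat}
// Hypothetical sketch: derive a "db.table/partitionspec" tag from a file path.
public class CacheTagSketch {
  static String tagFor(String path) {
    String[] p = path.split("/");
    StringBuilder tag = new StringBuilder();
    for (int i = 0; i < p.length - 1; i++) {        // last element is the file name
      if (p[i].endsWith(".db")) {
        tag.append(p[i], 0, p[i].length() - 3)      // database
           .append('.').append(p[i + 1]);           // table
        for (int j = i + 2; j < p.length - 1; j++) {
          tag.append('/').append(p[j]);             // partition spec directories
        }
        break;
      }
    }
    return tag.toString();
  }

  public static void main(String[] args) {
    // Managed table: prints "sales.orders/ds=2019-10-08"
    System.out.println(tagFor("/warehouse/sales.db/orders/ds=2019-10-08/000000_0"));
    // External table: no ".db" marker, so the tag comes out empty (i.e. wrong)
    System.out.println(tagFor("/data/external/orders/000000_0"));
  }
}
{noformat}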



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Attachment: HIVE-22097.1.branch-3.patch

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, 
> HIVE-22097.1.branch-3.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 
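
For context, a minimal probe of the layout change (a sketch, not the 
committed fix): the serializer's {{getDeclaredField("parentOffset")}} lookup 
can fall back to the {{offset}} field that both the JDK 8 and JDK 9+ 
{{ArrayList$SubList}} layouts carry. On JDK 9+ the reflective access may 
additionally need {{--add-opens java.base/java.util=ALL-UNNAMED}}.

{noformat}
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: version-tolerant field lookup for ArrayList$SubList.
public class SubListFieldProbe {
  public static void main(String[] args) throws Exception {
    List<String> sub = new ArrayList<>(Arrays.asList("a", "b", "c")).subList(1, 3);
    Field offsetField;
    try {
      // JDK 8 layout
      offsetField = sub.getClass().getDeclaredField("parentOffset");
    } catch (NoSuchFieldException e) {
      // JDK 9+ layout ("parentOffset" removed; "offset" remains)
      offsetField = sub.getClass().getDeclaredField("offset");
    }
    offsetField.setAccessible(true); // may need --add-opens on JDK 9+
    System.out.println(offsetField.getName() + " = " + offsetField.get(sub));
  }
}
{noformat}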



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Attachment: (was: HIVE-22097.1.branch-3.0.patch)

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.1.patch, HIVE-22097.1.patch, 
> JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Status: Patch Available  (was: Reopened)

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.1.1, 3.0.0
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.0.patch, 
> HIVE-22097.1.branch-3.1.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar reopened HIVE-22097:
--

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.0.patch, 
> HIVE-22097.1.branch-3.1.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 
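
A minimal sketch of the direction a fix can take (field names taken from the
JDK sources; illustrative only, not the actual patch): resolve whichever
offset field the running JDK declares instead of hard-coding the JDK 8 name,
bearing in mind that the two fields have slightly different semantics.

{noformat}
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubListOffsetProbe {
  // JDK 8 declares java.util.ArrayList$SubList.parentOffset; JDK 9+ rewrote
  // SubList and only declares "offset". Fall back between the two names.
  static Field resolveOffsetField(Class<?> subListClass) throws NoSuchFieldException {
    try {
      return subListClass.getDeclaredField("parentOffset"); // JDK 8
    } catch (NoSuchFieldException e) {
      return subListClass.getDeclaredField("offset");       // JDK 9+
    }
  }

  public static void main(String[] args) throws Exception {
    List<Integer> sub = new ArrayList<>(Arrays.asList(1, 2, 3)).subList(1, 3);
    System.out.println(resolveOffsetField(sub.getClass()).getName());
  }
}
{noformat}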



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22097) Incompatible java.util.ArrayList for java 11

2019-10-08 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22097:
-
Attachment: HIVE-22097.1.branch-3.0.patch
HIVE-22097.1.branch-3.1.patch

> Incompatible java.util.ArrayList for java 11
> 
>
> Key: HIVE-22097
> URL: https://issues.apache.org/jira/browse/HIVE-22097
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Yuming Wang
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22097.1.branch-3.0.patch, 
> HIVE-22097.1.branch-3.1.patch, HIVE-22097.1.patch, JDK1.8.png, JDK11.png
>
>
> {noformat}
> export JAVA_HOME=/usr/lib/jdk-11.0.3
> export PATH=${JAVA_HOME}/bin:${PATH}
> hive> create table t(id int);
> Time taken: 0.035 seconds
> hive> insert into t values(1);
> Query ID = root_20190811155400_7c0e0494-eecb-4c54-a9fd-942ab52a0794
> Total jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> java.lang.RuntimeException: java.lang.NoSuchFieldException: parentOffset
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:390)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$1.create(SerializationUtilities.java:235)
>   at 
> org.apache.hive.com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities.borrowKryo(SerializationUtilities.java:280)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:595)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:587)
>   at 
> org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:579)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:357)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:159)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2317)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1969)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1636)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1396)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1390)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:838)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:777)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:696)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: java.lang.NoSuchFieldException: parentOffset
>   at java.base/java.lang.Class.getDeclaredField(Class.java:2412)
>   at 
> org.apache.hadoop.hive.ql.exec.SerializationUtilities$ArrayListSubListSerializer.<init>(SerializationUtilities.java:384)
>   ... 29 more
> Job Submission failed with exception 
> 'java.lang.RuntimeException(java.lang.NoSuchFieldException: parentOffset)'
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask. java.lang.NoSuchFieldException: 
> parentOffset
> {noformat}
> The reason is Java removed {{parentOffset}}:
>  !JDK1.8.png! 
>  !JDK11.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946714#comment-16946714
 ] 

Hive QA commented on HIVE-22284:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
22s{color} | {color:blue} storage-api in master has 48 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
33s{color} | {color:blue} llap-common in master has 90 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
2s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
42s{color} | {color:blue} llap-server in master has 90 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} storage-api: The patch generated 2 new + 4 unchanged - 
0 fixed = 6 total (was 4) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
40s{color} | {color:red} ql: The patch generated 2 new + 165 unchanged - 2 
fixed = 167 total (was 167) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} llap-server: The patch generated 1 new + 252 unchanged 
- 13 fixed = 253 total (was 265) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-18906/dev-support/hive-personality.sh
 |
| git revision | master / 050f918 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18906/yetus/diff-checkstyle-storage-api.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18906/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18906/yetus/diff-checkstyle-llap-server.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18906/yetus/patch-asflicense-problems.txt
 |
| modules | C: storage-api llap-common ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-18906/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Improve LLAP 

[jira] [Commented] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946686#comment-16946686
 ] 

Hive QA commented on HIVE-22235:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982463/HIVE-22235.08.patch

{color:green}SUCCESS:{color} +1 due to 75 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17515 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18905/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18905/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18905/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982463 - PreCommit-HIVE-Build

> CommandProcessorResponse should not be an exception
> ---
>
> Key: HIVE-22235
> URL: https://issues.apache.org/jira/browse/HIVE-22235
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22235.01.patch, HIVE-22235.02.patch, 
> HIVE-22235.03.patch, HIVE-22235.04.patch, HIVE-22235.05.patch, 
> HIVE-22235.06.patch, HIVE-22235.07.patch, HIVE-22235.08.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The CommandProcessorResponse class extends Exception. This may be convenient, 
> but it is wrong, as a response is not an exception.
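
A sketch of the direction argued for here, with the response as a plain value
object and failures modeled by a separate exception type (class and field
names are illustrative, not the final Hive API):

{noformat}
// Sketch only: a response is data, so it should not extend Throwable.
public class CommandProcessorResponse {
  private final int responseCode;
  private final String errorMessage;

  public CommandProcessorResponse(int responseCode, String errorMessage) {
    this.responseCode = responseCode;
    this.errorMessage = errorMessage;
  }

  public int getResponseCode() { return responseCode; }
  public String getErrorMessage() { return errorMessage; }
}

// Failures get their own type instead of being smuggled through the response.
class CommandProcessorException extends Exception {
  public CommandProcessorException(String message) { super(message); }
}
{noformat}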



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946682#comment-16946682
 ] 

Hive QA commented on HIVE-22235:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
9s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
38s{color} | {color:blue} service in master has 49 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} cli in master has 9 extant Findbugs warnings. {color} 
|
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} hcatalog/core in master has 36 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
27s{color} | {color:blue} hcatalog/hcatalog-pig-adapter in master has 2 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
28s{color} | {color:blue} hcatalog/server-extensions in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} hcatalog/webhcat/java-client in master has 3 extant 
Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
29s{color} | {color:blue} hcatalog/streaming in master has 11 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
28s{color} | {color:blue} streaming in master has 2 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
51s{color} | {color:blue} itests/util in master has 53 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m  
2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
52s{color} | {color:red} ql: The patch generated 73 new + 1432 unchanged - 467 
fixed = 1505 total (was 1899) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} The patch service passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} cli: The patch generated 0 new + 39 unchanged - 2 
fixed = 39 total (was 41) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} hcatalog/core: The patch generated 0 new + 75 
unchanged - 16 fixed = 75 total (was 91) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} hcatalog/hcatalog-pig-adapter: The patch generated 0 
new + 183 unchanged - 11 fixed = 183 total (was 194) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch server-extensions passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch java-client 

[jira] [Comment Edited] (HIVE-22300) Deduplicate the authentication and LDAP code in HMS and HS2

2019-10-08 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946679#comment-16946679
 ] 

Peter Vary edited comment on HIVE-22300 at 10/8/19 10:11 AM:
-

[~ashutosh.bapat]: I have seen several places where there was code duplication. 
I believe that the goal was to be able to separate the HMS into its own 
top-level Apache project.

Thanks,
Peter


was (Author: pvary):
[~ashutosh.bapat]: Do not forget that we might want to separate the HMS to its 
own top level apache project

> Deduplicate the authentication and LDAP code in HMS and HS2
> ---
>
> Key: HIVE-22300
> URL: https://issues.apache.org/jira/browse/HIVE-22300
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Standalone Metastore
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> HIVE-22267 has duplicated code from hive-service/auth directory under 
> standalone-metastore directory. Deduplicate this code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22300) Deduplicate the authentication and LDAP code in HMS and HS2

2019-10-08 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946679#comment-16946679
 ] 

Peter Vary commented on HIVE-22300:
---

[~ashutosh.bapat]: Do not forget that we might want to separate the HMS into 
its own top-level Apache project

> Deduplicate the authentication and LDAP code in HMS and HS2
> ---
>
> Key: HIVE-22300
> URL: https://issues.apache.org/jira/browse/HIVE-22300
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Standalone Metastore
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Major
>
> HIVE-22267 has duplicated code from hive-service/auth directory under 
> standalone-metastore directory. Deduplicate this code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946643#comment-16946643
 ] 

Hive QA commented on HIVE-22274:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982445/HIVE-22274.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 39 failed/errored test(s), 17515 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_limit] 
(batchId=13)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_SortUnionTransposeRule]
 (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[concat_op] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[inputwherefalse] 
(batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[limit0] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_limit]
 (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[plan_json] (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join3] 
(batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join4] 
(batchId=95)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] 
(batchId=40)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
 (batchId=198)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_extractTime]
 (batchId=198)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_floorTime]
 (batchId=198)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[external_jdbc_table_perf]
 (batchId=185)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_only_empty_query]
 (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_join_transpose]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown3]
 (batchId=169)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[limit_pushdown]
 (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[offset_limit_ppd_optimizer]
 (batchId=180)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_ANY]
 (batchId=177)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_limit]
 (batchId=171)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[limit_pushdown] 
(batchId=145)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_limit]
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=121)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query10] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query14] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query16] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query35] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query69] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query94] 
(batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query10]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query16]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query35]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query69]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query72]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query94]
 (batchId=299)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query72]
 (batchId=299)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18904/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18904/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18904/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 39 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982445 - PreCommit-HIVE-Build

> Upgrade Calcite version 

[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324949&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324949
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332368069
 
 

 ##
 File path: ql/src/test/results/clientpositive/llap/semijoin_reddedup.q.out
 ##
 @@ -258,9 +258,12 @@ STAGE PLANS:
 Tez
#### A masked pattern was here ####
   Edges:
+Map 1 <- Reducer 10 (BROADCAST_EDGE)
+Map 11 <- Reducer 10 (BROADCAST_EDGE)
+Reducer 10 <- Reducer 9 (CUSTOM_SIMPLE_EDGE)
 Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
 Reducer 3 <- Reducer 2 (SIMPLE_EDGE), Reducer 9 (SIMPLE_EDGE)
-Reducer 4 <- Map 10 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
+Reducer 4 <- Map 11 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
 
 Review comment:
   I'm not 100% sure what the goal of this test is, but I think we should 
probably disable the uniform distribution for it so that it retains its 
original goal (see the one-line setting sketched below).
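
   If so, the smallest change is probably a single line at the top of 
semijoin_reddedup.q (the flag name is taken from the HiveConf hunk reviewed 
elsewhere in this thread): `set hive.stats.filter.range.uniform=false;`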
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324949)
Time Spent: 1.5h  (was: 1h 20m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.
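
As a worked example with invented numbers: for a column with min=0 and max=100
over 1000 rows, a predicate like "col < 25" is estimated under the uniform
assumption to keep 1000 * (25 - 0) / (100 - 0) = 250 rows, whereas the 1/3
heuristic keeps ~333 rows no matter which constant appears in the predicate.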



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324951&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324951
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332377711
 
 

 ##
 File path: 
standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift
 ##
 @@ -562,14 +562,27 @@ struct DateColumnStatsData {
 5: optional binary bitVectors
 }
 
+struct Timestamp {
+1: required i64 secondsSinceEpoch
 
 Review comment:
   I'm afraid there is a downside: we are throwing away precision, and because 
of that we may get into trouble later:
   
   If we truncate to seconds, we may not be able to extend the timestamp 
logic to the stats optimizer, as we are not working with the real values.
   
   Consider the following:
   ```sql
   select '2019-11-11 11:11:11.400' < '2019-11-11 11:11:11.300'
   ```
   If we round to seconds, consider that the left side comes from a table as 
the column's max value; the stats optimizer could then deduce "true" for the 
above.
   
   Would it complicate things much to use a non-rounded timestamp and retain 
milliseconds/microseconds as well?
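
   One possible shape, sketched only: keep `secondsSinceEpoch` as-is and add 
an optional sub-second field such as `2: optional i32 nanos` to the quoted 
struct, so existing writers stay compatible while precision-sensitive readers 
keep the full value.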
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324951)
Time Spent: 1h 50m  (was: 1h 40m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324952
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332352613
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 ##
 @@ -276,9 +279,11 @@ public Object process(Node nd, Stack<Node> stack, 
NodeProcessorCtx procCtx,
 ExprNodeDesc pred = fop.getConf().getPredicate();
 
 // evaluate filter expression and update statistics
+final boolean uniformWithinRange = HiveConf.getBoolVar(
+aspCtx.getConf(), 
HiveConf.ConfVars.HIVE_STATS_RANGE_SELECTIVITY_UNIFORM_DISTRIBUTION);
 aspCtx.clearAffectedColumns();
 long newNumRows = evaluateExpression(parentStats, pred, aspCtx,
-neededCols, fop, parentStats.getNumRows());
+neededCols, fop, parentStats.getNumRows(), uniformWithinRange);
 
 Review comment:
   I don't think we should pass a boolean here, since aspCtx is already available there
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324952)
Time Spent: 2h  (was: 1h 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324943
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332357556
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -856,8 +856,15 @@ public static ColStatistics 
getColStatistics(ColumnStatisticsObj cso, String tab
 } else if (colTypeLowerCase.equals(serdeConstants.BINARY_TYPE_NAME)) {
   cs.setAvgColLen(csd.getBinaryStats().getAvgColLen());
   cs.setNumNulls(csd.getBinaryStats().getNumNulls());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  cs.setNumNulls(csd.getTimestampStats().getNumNulls());
+  Long lowVal = (csd.getTimestampStats().getLowValue() != null) ? 
csd.getTimestampStats().getLowValue()
+  .getSecondsSinceEpoch() : null;
+  Long highVal = (csd.getTimestampStats().getHighValue() != null) ? 
csd.getTimestampStats().getHighValue()
+  .getSecondsSinceEpoch() : null;
+  cs.setRange(lowVal, highVal);
 
 Review comment:
   I don't feel this is fortunate... we do know the low/high value, but we 
flatten it to some number... Instead of this we would need properly typed 
ranges - for decimal we already throw away all our knowledge if it runs 
beyond long limits.
   
   The current changes follow the existing traditions - if we decide to change 
that, it should be done in a separate ticket.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324943)
Time Spent: 1h  (was: 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324941&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324941
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332365302
 
 

 ##
 File path: 
ql/src/test/results/clientpositive/llap/retry_failure_stat_changes.q.out
 ##
 @@ -139,25 +139,25 @@ Stage-0
   PARTITION_ONLY_SHUFFLE [RS_12]
 Group By Operator [GBY_11] (rows=1/1 width=8)
   Output:["_col0"],aggregations:["sum(_col0)"]
-  Select Operator [SEL_9] (rows=1/3 width=8)
+  Select Operator [SEL_9] (rows=4/3 width=8)
 Output:["_col0"]
-Merge Join Operator [MERGEJOIN_30] (rows=1/3 width=8)
+Merge Join Operator [MERGEJOIN_30] (rows=4/3 width=8)
   Conds:RS_6._col0=RS_7._col0(Inner),Output:["_col0","_col1"]
 <-Map 1 [SIMPLE_EDGE] llap
   SHUFFLE [RS_6]
 PartitionCols:_col0
-Select Operator [SEL_2] (rows=1/5 width=4)
+Select Operator [SEL_2] (rows=7/5 width=4)
   Output:["_col0"]
-  Filter Operator [FIL_18] (rows=1/5 width=4)
+  Filter Operator [FIL_18] (rows=7/5 width=4)
 
 Review comment:
   there was a "by-design" underestimation in this testcase...now we have good 
estimate :)
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324941)
Time Spent: 40m  (was: 0.5h)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324953&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324953
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332355126
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 ##
 @@ -967,13 +979,23 @@ private long evaluateComparator(Statistics stats, 
AnnotateStatsProcCtx aspCtx, E
   if (minValue > value) {
 return 0;
   }
+  if (uniformWithinRange) {
+// Assuming uniform distribution, we can use the range to 
calculate
+// new estimate for the number of rows
+return Math.round(((double) (value - minValue) / (maxValue - 
minValue)) * numRows);
 
 Review comment:
   I think we will probably hit a divide by zero here when max=min; I don't 
see any preceding conditionals covering that (though there may be one...). A 
minimal guard is sketched below.
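
   A minimal guard sketch (the names mirror the quoted hunk; it assumes a 
strict less-than comparator, so the real fix may need to distinguish the 
operators):
   ```java
   // Sketch: selectivity of "column < value" under the uniform assumption,
   // guarding the degenerate max == min range against division by zero.
   static long estimateLessThan(long value, long minValue, long maxValue, long numRows) {
     if (value <= minValue) {
       return 0;          // nothing in [minValue, maxValue] is below value
     }
     if (value > maxValue || maxValue == minValue) {
       return numRows;    // the whole (possibly single-point) range is below value
     }
     return Math.round(((double) (value - minValue) / (maxValue - minValue)) * numRows);
   }
   ```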
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324953)
Time Spent: 2h  (was: 1h 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324944&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324944
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332382671
 
 

 ##
 File path: ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out
 ##
 @@ -759,50 +759,56 @@ STAGE PLANS:
   Statistics: Num rows: 500 Data size: 2000 Basic stats: 
COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (key > 2) (type: boolean)
-Statistics: Num rows: 166 Data size: 664 Basic stats: 
COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 498 Data size: 1992 Basic stats: 
COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: key (type: int)
   outputColumnNames: _col0
-  Statistics: Num rows: 166 Data size: 664 Basic stats: 
COMPLETE Column stats: COMPLETE
-  Map Join Operator
-condition map:
- Inner Join 0 to 1
-keys:
-  0 _col0 (type: int)
-  1 _col0 (type: int)
-outputColumnNames: _col0, _col1
-input vertices:
-  1 Map 2
-Statistics: Num rows: 272 Data size: 2176 Basic stats: 
COMPLETE Column stats: COMPLETE
-File Output Operator
-  compressed: false
-  Statistics: Num rows: 272 Data size: 2176 Basic 
stats: COMPLETE Column stats: COMPLETE
-  table:
-  input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
-  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
-  serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+  Statistics: Num rows: 498 Data size: 1992 Basic stats: 
COMPLETE Column stats: COMPLETE
+  Reduce Output Operator
+key expressions: _col0 (type: int)
+sort order: +
+Map-reduce partition columns: _col0 (type: int)
+Statistics: Num rows: 498 Data size: 1992 Basic stats: 
COMPLETE Column stats: COMPLETE
 
 Review comment:
   disable the enhancement; it seems to be interfering with the test
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324944)
Time Spent: 1h  (was: 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324950&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324950
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332383859
 
 

 ##
 File path: ql/src/test/results/clientpositive/llap/subquery_select.q.out
 ##
 @@ -3918,14 +3918,14 @@ STAGE PLANS:
   Statistics: Num rows: 26 Data size: 208 Basic stats: 
COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: p_partkey BETWEEN 1 AND 2 (type: 
boolean)
-Statistics: Num rows: 8 Data size: 64 Basic stats: 
COMPLETE Column stats: COMPLETE
+Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
Column stats: COMPLETE
 Select Operator
   expressions: p_size (type: int)
   outputColumnNames: p_size
-  Statistics: Num rows: 8 Data size: 64 Basic stats: 
COMPLETE Column stats: COMPLETE
+  Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE
   Group By Operator
 aggregations: max(p_size)
-minReductionHashAggr: 0.875
+minReductionHashAggr: 0.0
 
 Review comment:
   the computed range is most likely empty...
   these changes suggest to me that something is not entirely right... is this 
expected?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324950)
Time Spent: 1h 40m  (was: 1.5h)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324945
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332355776
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -856,8 +856,15 @@ public static ColStatistics 
getColStatistics(ColumnStatisticsObj cso, String tab
 } else if (colTypeLowerCase.equals(serdeConstants.BINARY_TYPE_NAME)) {
   cs.setAvgColLen(csd.getBinaryStats().getAvgColLen());
   cs.setNumNulls(csd.getBinaryStats().getNumNulls());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
 
 Review comment:
   I am starting to feel like having these 2 changes separately from each 
other might have been good
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324945)
Time Spent: 1h  (was: 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324946&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324946
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332353993
 
 

 ##
 File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
 ##
 @@ -316,7 +321,7 @@ public Object process(Node nd, Stack<Node> stack, 
NodeProcessorCtx procCtx,
 
 protected long evaluateExpression(Statistics stats, ExprNodeDesc pred,
AnnotateStatsProcCtx aspCtx, List<String> neededCols,
-Operator op, long currNumRows) throws SemanticException {
+Operator op, long currNumRows, boolean uniformWithinRange) throws 
SemanticException {
 
 Review comment:
   this boolean is not used in this function; but aspCtx can be used to obtain 
it - I would suggest either:
   * leave the function signature as is and get the boolean from the conf when 
it's needed
   * this class is constructed and used to optimize a single query - so we may 
pass the conf on construction and create a final field from this info (a 
sketch of this option follows below)
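
   A compact sketch of the second option (the class shape is simplified here; 
the flag name comes from the HiveConf hunk in this same thread):
   ```java
   import org.apache.hadoop.hive.conf.HiveConf;

   public class StatsRulesProcFactory {
     // Read the flag once per query, at construction time, instead of
     // threading an extra boolean parameter through every evaluate* call.
     private final boolean uniformWithinRange;

     public StatsRulesProcFactory(HiveConf conf) {
       this.uniformWithinRange = HiveConf.getBoolVar(conf,
           HiveConf.ConfVars.HIVE_STATS_RANGE_SELECTIVITY_UNIFORM_DISTRIBUTION);
     }
   }
   ```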
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324946)
Time Spent: 1h  (was: 50m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324942
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332352109
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -2537,6 +2537,11 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 "When estimating output rows for a join involving multiple columns, 
the default behavior assumes" +
 "the columns are independent. Setting this flag to true will cause the 
estimator to assume" +
 "the columns are correlated."),
+
HIVE_STATS_RANGE_SELECTIVITY_UNIFORM_DISTRIBUTION("hive.stats.filter.range.uniform",
 true,
+"When estimating output rows from a condition, if a range predicate is 
applied over a column and the" +
+"minimum and maximum values for that column are available, assume 
uniform distribution of values" +
+"accross that range and scales number of rows proportionally. If this 
is set to false, default" +
+"selectivity value is used."),
 
 Review comment:
   small thing: could you add spaces at the ends of those lines; right now the 
concatenated description contains words like "theminimum" and "valuesaccross" 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324942)
Time Spent: 50m  (was: 40m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324940=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324940
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332362215
 
 

 ##
 File path: ql/src/test/results/clientpositive/alter_table_update_status.q.out
 ##
 @@ -453,8 +453,8 @@ POSTHOOK: type: DESCTABLE
 POSTHOOK: Input: default@datatype_stats_n0
 col_name   ts  
 
 data_type  timestamp   
 
-min1325379723  
 
-max1325379723  
 
+min2012-01-01 01:02:03 
 
 
 Review comment:
   human readable timestamp value 
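
(Editor's note: the old raw value and the new human-readable rendering refer to
the same instant; a quick check:)

{code:java}
import java.time.Instant;

public class QoutCheck {
  public static void main(String[] args) {
    // 1325379723 seconds since the epoch is 2012-01-01 01:02:03 UTC.
    System.out.println(Instant.ofEpochSecond(1325379723L)); // 2012-01-01T01:02:03Z
  }
}
{code}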
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324940)
Time Spent: 0.5h  (was: 20m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324948=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324948
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332358389
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   not 100% sure about this; but I think the min value should not be 0, or we 
may optimize away pre-1970 filter conditions
   and why is it exactly 2177452799 :) I might be missing some context...
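
(Editor's note: the constant in the patch decodes to the last second of 2038 in
UTC; a quick check:)

{code:java}
import java.time.Instant;

public class EpochCheck {
  public static void main(String[] args) {
    // 2177452799 seconds since the epoch is 2038-12-31 23:59:59 UTC.
    System.out.println(Instant.ofEpochSecond(2177452799L)); // 2038-12-31T23:59:59Z
  }
}
{code}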
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324948)
Time Spent: 1h 20m  (was: 1h 10m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324947
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:47
Start Date: 08/Oct/19 08:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332373527
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/TimestampColumnStatsAggregator.java
 ##
 @@ -0,0 +1,358 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.columnstats.aggr;
+
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import org.apache.hadoop.hive.common.ndv.NumDistinctValueEstimator;
+import org.apache.hadoop.hive.common.ndv.NumDistinctValueEstimatorFactory;
+import org.apache.hadoop.hive.metastore.api.ColumnStatisticsData;
+import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj;
+import org.apache.hadoop.hive.metastore.api.Timestamp;
+import org.apache.hadoop.hive.metastore.api.TimestampColumnStatsData;
+import org.apache.hadoop.hive.metastore.api.MetaException;
+import 
org.apache.hadoop.hive.metastore.columnstats.cache.TimestampColumnStatsDataInspector;
+import 
org.apache.hadoop.hive.metastore.utils.MetaStoreServerUtils.ColStatsObjWithSourceInfo;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static 
org.apache.hadoop.hive.metastore.columnstats.ColumnsStatsUtils.timestampInspectorFromStats;
+
+public class TimestampColumnStatsAggregator extends ColumnStatsAggregator 
implements
+IExtrapolatePartStatus {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TimestampColumnStatsAggregator.class);
+
+  @Override
+  public ColumnStatisticsObj aggregate(List 
colStatsWithSourceInfo,
+List partNames, boolean areAllPartsFound) throws 
MetaException {
+ColumnStatisticsObj statsObj = null;
+String colType = null;
+String colName = null;
+// check if all the ColumnStatisticsObjs contain stats and all the ndv are
+// bitvectors
+boolean doAllPartitionContainStats = partNames.size() == 
colStatsWithSourceInfo.size();
+NumDistinctValueEstimator ndvEstimator = null;
+for (ColStatsObjWithSourceInfo csp : colStatsWithSourceInfo) {
+  ColumnStatisticsObj cso = csp.getColStatsObj();
+  if (statsObj == null) {
+colName = cso.getColName();
+colType = cso.getColType();
+statsObj = ColumnStatsAggregatorFactory.newColumnStaticsObj(colName, 
colType,
+cso.getStatsData().getSetField());
+LOG.trace("doAllPartitionContainStats for column: {} is: {}", colName, 
doAllPartitionContainStats);
+  }
+  TimestampColumnStatsDataInspector timestampColumnStats = 
timestampInspectorFromStats(cso);
+
+  if (timestampColumnStats.getNdvEstimator() == null) {
+ndvEstimator = null;
+break;
+  } else {
+// check if all of the bit vectors can merge
+NumDistinctValueEstimator estimator = 
timestampColumnStats.getNdvEstimator();
+if (ndvEstimator == null) {
+  ndvEstimator = estimator;
+} else {
+  if (ndvEstimator.canMerge(estimator)) {
+continue;
+  } else {
+ndvEstimator = null;
+break;
+  }
+}
+  }
+}
+if (ndvEstimator != null) {
+  ndvEstimator = NumDistinctValueEstimatorFactory
+  .getEmptyNumDistinctValueEstimator(ndvEstimator);
+}
+LOG.debug("all of the bit vectors can merge for " + colName + " is " + 
(ndvEstimator != null));
+ColumnStatisticsData columnStatisticsData = new ColumnStatisticsData();
+if (doAllPartitionContainStats || colStatsWithSourceInfo.size() < 2) {
+  TimestampColumnStatsDataInspector aggregateData = null;
+  long lowerBound = 0;
+  long 

[jira] [Commented] (HIVE-22274) Upgrade Calcite version to 1.21.0

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946631#comment-16946631
 ] 

Hive QA commented on HIVE-22274:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
55s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
20s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
46s{color} | {color:red} ql: The patch generated 33 new + 369 unchanged - 11 
fixed = 402 total (was 380) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  
1s{color} | {color:red} root: The patch generated 33 new + 369 unchanged - 11 
fixed = 402 total (was 380) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
20s{color} | {color:red} ql generated 6 new + 1549 unchanged - 2 fixed = 1555 
total (was 1551) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 65m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to joinInfo in 
org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveJoinFactoryImpl.createJoin(RelNode,
 RelNode, RexNode, Set, JoinRelType, boolean)  At 
HiveRelFactories.java:org.apache.hadoop.hive.ql.optimizer.calcite.HiveRelFactories$HiveJoinFactoryImpl.createJoin(RelNode,
 RelNode, RexNode, Set, JoinRelType, boolean)  At HiveRelFactories.java:[line 
161] |
|  |  Dead store to rightKeys in 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At 
HiveRelDecorrelator.java:org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At HiveRelDecorrelator.java:[line 1465] |
|  |  Dead store to leftKeys in 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At 
HiveRelDecorrelator.java:org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveRelDecorrelator.decorrelateRel(LogicalCorrelate)
  At HiveRelDecorrelator.java:[line 1464] |
|  |  instanceof will always return true for all non-null values in new 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdPredicates$JoinConditionBasedPredicateInference(Join,
 RexNode, RexNode), since all org.apache.calcite.rel.core.Join are instances of 
org.apache.calcite.rel.core.Join  At HiveRelMdPredicates.java:for all non-null 
values in new 
org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdPredicates$JoinConditionBasedPredicateInference(Join,
 RexNode, RexNode), since all 

[jira] [Work logged] (HIVE-21884) Scheduled query support

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21884?focusedWorklogId=324930=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324930
 ]

ASF GitHub Bot logged work on HIVE-21884:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:29
Start Date: 08/Oct/19 08:29
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #794: HIVE-21884
URL: https://github.com/apache/hive/pull/794#discussion_r332390665
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
 ##
 @@ -18,6 +18,36 @@ UPDATE "APP"."TAB_COL_STATS" SET ENGINE = 'hive' WHERE 
ENGINE IS NULL;
 ALTER TABLE "APP"."PART_COL_STATS" ADD ENGINE VARCHAR(128);
 UPDATE "APP"."PART_COL_STATS" SET ENGINE = 'hive' WHERE ENGINE IS NULL;
 
+CREATE TABLE "APP"."SCHEDULED_QUERIES" (
 
 Review comment:
   yes; I did when I was writing them - actually I've also spotted a bug which 
already had a ticket opened a week before, but the bug was actually there for 
around a year or so
   I've opened HIVE-22302
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324930)
Time Spent: 3h 10m  (was: 3h)

> Scheduled query support
> ---
>
> Key: HIVE-21884
> URL: https://issues.apache.org/jira/browse/HIVE-21884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21844.04.patch, HIVE-21844.05.patch, 
> HIVE-21844.06.patch, HIVE-21844.07.patch, HIVE-21844.08.patch, 
> HIVE-21844.09.patch, HIVE-21844.15.patch, HIVE-21844.19.patch, 
> HIVE-21884.01.patch, HIVE-21884.02.patch, HIVE-21884.03.patch, 
> HIVE-21884.09.patch, HIVE-21884.10.patch, HIVE-21884.10.patch, 
> HIVE-21884.11.patch, HIVE-21884.12.patch, HIVE-21884.13.patch, 
> HIVE-21884.14.patch, HIVE-21884.14.patch, HIVE-21884.14.patch, 
> HIVE-21884.16.patch, HIVE-21884.17.patch, HIVE-21884.18.patch, 
> HIVE-21884.20.patch, Scheduled queries2.pdf
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> design document:
> https://docs.google.com/document/d/1mJSFdJi_1cbxJTXC9QvGw2rQ3zzJkNfxOO6b5esmyCE/edit#
> in case the google doc is not reachable:  [^Scheduled queries2.pdf] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22302) Add some smoke tests

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-22302:
---


> Add some smoke tests
> 
>
> Key: HIVE-22302
> URL: https://issues.apache.org/jira/browse/HIVE-22302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> I'm not sure how much this is achievable... but we sometimes leave metastore 
> upgrade bugs etc. in by mistake...
> it would be great to have something which:
> * compiles and deploys hive
> * runs some trivial cases
> * ...and runs them against multiple kinds of metastore dbs
> I think Travis can be convinced to do something like this...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22302) Add some smoke tests

2019-10-08 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-22302:
---

Assignee: (was: Zoltan Haindrich)

> Add some smoke tests
> 
>
> Key: HIVE-22302
> URL: https://issues.apache.org/jira/browse/HIVE-22302
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Priority: Major
>
> I'm not sure how much this is achievable... but we sometimes leave metastore 
> upgrade bugs etc. in by mistake...
> it would be great to have something which:
> * compiles and deploys hive
> * runs some trivial cases
> * ...and runs them against multiple kinds of metastore dbs
> I think Travis can be convinced to do something like this...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22239?focusedWorklogId=324914=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324914
 ]

ASF GitHub Bot logged work on HIVE-22239:
-

Author: ASF GitHub Bot
Created on: 08/Oct/19 08:03
Start Date: 08/Oct/19 08:03
Worklog Time Spent: 10m 
  Work Description: miklosgergely commented on pull request #787: HIVE-22239
URL: https://github.com/apache/hive/pull/787#discussion_r332380055
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -935,8 +942,11 @@ else 
if(colTypeLowerCase.equals(serdeConstants.SMALLINT_TYPE_NAME)){
 cs.setNumTrues(Math.max(1, numRows/2));
 cs.setNumFalses(Math.max(1, numRows/2));
 cs.setAvgColLen(JavaDataModel.get().primitive1());
-} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME) ||
-colTypeLowerCase.equals(serdeConstants.TIMESTAMPLOCALTZ_TYPE_NAME)) {
+} else if (colTypeLowerCase.equals(serdeConstants.TIMESTAMP_TYPE_NAME)) {
+  cs.setAvgColLen(JavaDataModel.get().lengthOfTimestamp());
+  // epoch, seconds since epoch
+  cs.setRange(0, 2177452799L);
 
 Review comment:
   Is this 2038-12-31 11:59:59 PM GMT? Why is it hard-coded to that? If there 
is a reason for it, then I suggest introducing a more meaningful constant 
declaration, like this:
   
   // Explanation why the timestamp range upper limit should be this date
   private static final long TIMESTAMP_RANGE_UPPER_LIMIT = new 
SimpleDateFormat("yyyy-MM-dd HH:mm:ss zzz").parse("2038-12-31 23:59:59 
GMT").getTime() / 1000;
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 324914)
Time Spent: 20m  (was: 10m)

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946595#comment-16946595
 ] 

Ádám Szita edited comment on HIVE-22284 at 10/8/19 8:01 AM:


Thanks for the comments [~gopalv], [~pvary]. I've refactored CacheTag into 3 
versions as recommended, and also interned table names and partition descs.

Naturally, using such CacheTag objects poses a bigger overhead than the current 
version that uses Strings, but in my opinion this isn't a substantial 
difference (8 bytes per reference [assuming LLAP daemon > -Xmx32G] + ~12 bytes 
of object overhead per tag), especially if we're interning the recurring values.
And on the other hand: we get correct stats that match Hive constructs exactly.


was (Author: szita):
Thanks for the comments [~gopalv], [~pvary]. I've refactored CacheTag into 3 
versions as recommended, and also interned table names and partition descs as 
well.

Naturally using such CacheTag objects poses bigger overhead than the current 
version that uses Strings, but in my opinion this isn't a substantial 
difference (+8 bytes per reference [assuming LLAP daemon > -Xmx32G ]+~12 bytes 
object overhead / tag ) especially if we're interning the reoccurring values.
And on the other hand: we get correct stats that match Hive constructs exactly.

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to what Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds DB and table name, and 
> a partition spec concatenated by . and / . The information here is derived 
> from the Path of the file that is getting cached. Needless to say sometimes 
> this produces a wrong tag especially for external tables.
> Also there's a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the table of the partition) because the overall maxCount 
> and maxSize do not add up to the sum of those in the partitions. This happens 
> when buffers get removed from the cache.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946595#comment-16946595
 ] 

Ádám Szita commented on HIVE-22284:
---

Thanks for the comments [~gopalv], [~pvary]. I've refactored CacheTag into 3 
versions as recommended, and also interned table names and partition descs as 
well.

Naturally using such CacheTag objects poses bigger overhead than the current 
version that uses Strings, but in my opinion this isn't a substantial 
difference (+8 bytes per reference [assuming LLAP daemon > -Xmx32G ]+~12 bytes 
object overhead / tag ) especially if we're interning the reoccurring values.
And on the other hand: we get correct stats that match Hive constructs exactly.
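
A minimal sketch of the interning idea described above (editor's illustration;
the class and field names are assumptions, not the actual patch):

{code:java}
// Sketch only: a tag whose reoccurring components are interned so that many
// buffers tagged with the same table or partition share one String instance.
final class CacheTagSketch {
  private final String dbAndTableName;  // e.g. "default.mytable"
  private final String partitionDesc;   // e.g. "ds=2019-10-08", null if unpartitioned

  CacheTagSketch(String dbAndTableName, String partitionDesc) {
    // String.intern() returns the canonical pooled instance, so the per-tag
    // cost is mostly the reference plus the object header, as noted above.
    this.dbAndTableName = dbAndTableName.intern();
    this.partitionDesc = partitionDesc == null ? null : partitionDesc.intern();
  }
}
{code}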

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to what Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds DB and table name, and 
> a partition spec concatenated by . and / . The information here is derived 
> from the Path of the file that is getting cached. Needless to say sometimes 
> this produces a wrong tag especially for external tables.
> Also there's a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the table of the partition) because the overall maxCount 
> and maxSize do not add up to the sum of those in the partitions. This happens 
> when buffers get removed from the cache.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-22284:
--
Attachment: HIVE-22284.4.patch

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to what Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds DB and table name, and 
> a partition spec concatenated by . and / . The information here is derived 
> from the Path of the file that is getting cached. Needless to say sometimes 
> this produces a wrong tag especially for external tables.
> Also there's a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the table of the partition) because the overall maxCount 
> and maxSize do not add up to the sum of those in the partitions. This happens 
> when buffers get removed from the cache.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-22284:
--
Status: Patch Available  (was: In Progress)

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to what Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds DB and table name, and 
> a partition spec concatenated by . and / . The information here is derived 
> from the Path of the file that is getting cached. Needless to say sometimes 
> this produces a wrong tag especially for external tables.
> Also there's a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the table of the partition) because the overall maxCount 
> and maxSize do not add up to the sum of those in the partitions. This happens 
> when buffers get removed from the cache.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22284) Improve LLAP CacheContentsTracker to collect and display correct statistics

2019-10-08 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-22284:
--
Status: In Progress  (was: Patch Available)

> Improve LLAP CacheContentsTracker to collect and display correct statistics
> ---
>
> Key: HIVE-22284
> URL: https://issues.apache.org/jira/browse/HIVE-22284
> Project: Hive
>  Issue Type: Improvement
>  Components: llap
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22284.0.patch, HIVE-22284.1.patch, 
> HIVE-22284.2.patch, HIVE-22284.3.patch, HIVE-22284.4.patch
>
>
> When keeping track of which buffers correspond to what Hive objects, 
> CacheContentsTracker relies on cache tags.
> Currently a tag is a simple String that ideally holds DB and table name, and 
> a partition spec concatenated by . and / . The information here is derived 
> from the Path of the file that is getting cached. Needless to say sometimes 
> this produces a wrong tag especially for external tables.
> Also there's a bug when calculating aggregated stats for a 'parent' tag 
> (corresponding to the table of the partition) because the overall maxCount 
> and maxSize do not add up to the sum of those in the partitions. This happens 
> when buffers get removed from the cache.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22198) Execute union-all with child Joins in parallel

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946582#comment-16946582
 ] 

Hive QA commented on HIVE-22198:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12980846/HIVE-22198.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18903/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18903/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18903/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2019-10-08 07:30:51.472
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-18903/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2019-10-08 07:30:51.475
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 050f918 HIVE-22252: Fix caught NullPointerExceptions generated 
during EXPLAIN (John Sherman, reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 050f918 HIVE-22252: Fix caught NullPointerExceptions generated 
during EXPLAIN (John Sherman, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2019-10-08 07:30:53.008
+ rm -rf ../yetus_PreCommit-HIVE-Build-18903
+ mkdir ../yetus_PreCommit-HIVE-Build-18903
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-18903
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-18903/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: a/ql/src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in 
index
error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/Driver.java:2881
error: repository lacks the necessary blob to fall back on 3-way merge.
error: ql/src/java/org/apache/hadoop/hive/ql/Driver.java: patch does not apply
error: src/java/org/apache/hadoop/hive/ql/Driver.java: does not exist in index
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-18903
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12980846 - PreCommit-HIVE-Build

> Execute union-all with child Joins in parallel
> --
>
> Key: HIVE-22198
> URL: https://issues.apache.org/jira/browse/HIVE-22198
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: LuGuangMing
>Assignee: LuGuangMing
>Priority: Major
> Attachments: HIVE-22198.patch, image-2019-09-20-11-38-37-433.png, 
> image-2019-09-20-11-39-30-347.png, image-2019-10-08-09-51-06-144.png, 
> test-parallel.sql
>
>
> With parallel set to true, skewjoin set to false, and auto convert join set to 
> false, running a union all reports no error message, but some result data is 
> missing; for details check the attachment [^test-parallel.sql]
> create table tab1(tid int, com string) row format delimited fields terminated 
> by '\t' stored as textfile;
>  create table tab2(tid int, com string) row format delimited fields 
> terminated by '\t' stored as textfile;
>  create table tab3(tid int, com string) row format delimited fields 
> terminated by '\t' stored as textfile;
>  create table tab4(tid int, com string) row format delimited fields 
> terminated by '\t' stored as textfile;
> insert into tab1 

[jira] [Commented] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946581#comment-16946581
 ] 

Hive QA commented on HIVE-22239:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982441/HIVE-22239.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 17516 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18902/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18902/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18902/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982441 - PreCommit-HIVE-Build

> Scale data size using column value ranges
> -
>
> Key: HIVE-22239
> URL: https://issues.apache.org/jira/browse/HIVE-22239
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22239.01.patch, HIVE-22239.02.patch, 
> HIVE-22239.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, min/max values for columns are only used to determine whether a 
> certain range filter falls out of range and thus filters all rows or none at 
> all. If it does not, we just use a heuristic that the condition will filter 
> 1/3 of the input rows. Instead of using that heuristic, we can use another 
> one that assumes that data will be uniformly distributed across that range, 
> and calculate the selectivity for the condition accordingly.
> This patch also includes the propagation of min/max column values from 
> statistics to the optimizer for timestamp type.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22239) Scale data size using column value ranges

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946566#comment-16946566
 ] 

Hive QA commented on HIVE-22239:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
41s{color} | {color:blue} standalone-metastore/metastore-common in master has 
37 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
34s{color} | {color:blue} common in master has 65 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
18s{color} | {color:blue} standalone-metastore/metastore-server in master has 
170 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
22s{color} | {color:blue} ql in master has 1551 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
19s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 26 new + 109 unchanged - 11 fixed = 135 total (was 120) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 15 new + 259 unchanged - 2 
fixed = 274 total (was 261) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} metastore-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
20s{color} | {color:red} standalone-metastore/metastore-server generated 2 new 
+ 170 unchanged - 0 fixed = 172 total (was 170) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} ql generated 0 new + 1550 unchanged - 1 fixed = 1550 
total (was 1551) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:standalone-metastore/metastore-server |
|  |  Integral division result cast to double or float in 
org.apache.hadoop.hive.metastore.columnstats.aggr.TimestampColumnStatsAggregator.aggregate(List,
 List, boolean)  At TimestampColumnStatsAggregator.java:double or float in 
org.apache.hadoop.hive.metastore.columnstats.aggr.TimestampColumnStatsAggregator.aggregate(List,
 List, boolean)  At TimestampColumnStatsAggregator.java:[line 102] |
|  |  
org.apache.hadoop.hive.metastore.columnstats.cache.TimestampColumnStatsDataInspector
 doesn't override 
org.apache.hadoop.hive.metastore.api.TimestampColumnStatsData.equals(Object)  
At TimestampColumnStatsDataInspector.java:At 
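
(Editor's note: the "Integral division result cast to double or float" item
above flags a classic pattern; a generic illustration of the problem and the
usual fix, unrelated to the actual patch line:)

{code:java}
public class IntDivisionDemo {
  public static void main(String[] args) {
    long matched = 1, total = 3;
    double wrong = matched / total;           // integer division happens first -> 0.0
    double right = (double) matched / total;  // promote before dividing -> 0.333...
    System.out.println(wrong + " vs " + right);
  }
}
{code}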

[jira] [Updated] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-08 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22235:
--
Attachment: (was: HIVE-22235.08.patch)

> CommandProcessorResponse should not be an exception
> ---
>
> Key: HIVE-22235
> URL: https://issues.apache.org/jira/browse/HIVE-22235
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22235.01.patch, HIVE-22235.02.patch, 
> HIVE-22235.03.patch, HIVE-22235.04.patch, HIVE-22235.05.patch, 
> HIVE-22235.06.patch, HIVE-22235.07.patch, HIVE-22235.08.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The CommandProcessorResponse class extends Exception. This may be convenient, 
> but it is wrong, as a response is not an exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-10-08 Thread Miklos Gergely (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Gergely updated HIVE-22235:
--
Attachment: HIVE-22235.08.patch

> CommandProcessorResponse should not be an exception
> ---
>
> Key: HIVE-22235
> URL: https://issues.apache.org/jira/browse/HIVE-22235
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22235.01.patch, HIVE-22235.02.patch, 
> HIVE-22235.03.patch, HIVE-22235.04.patch, HIVE-22235.05.patch, 
> HIVE-22235.06.patch, HIVE-22235.07.patch, HIVE-22235.08.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The CommandProcessorResponse class extends Exception. This may be convenient, 
> but it is wrong, as a response is not an exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22255) Hive doesn't trigger Major Compaction automatically if table contains only base files

2019-10-08 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16946514#comment-16946514
 ] 

Hive QA commented on HIVE-22255:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12982440/HIVE-22255.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 17515 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.noCompactTableDeltaPctNotHighEnough
 (batchId=321)
org.apache.hive.service.TestHS2ImpersonationWithRemoteMS.testImpersonation 
(batchId=282)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/18901/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18901/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18901/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12982440 - PreCommit-HIVE-Build

> Hive doesn't trigger Major Compaction automatically if table contains only base 
> files 
> 
>
> Key: HIVE-22255
> URL: https://issues.apache.org/jira/browse/HIVE-22255
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 3.1.2
> Environment: Hive-3.1.1
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-22255.patch
>
>
> A user may run into this issue if the table consists only of base files with 
> no deltas; in that case the following condition yields false and automatic 
> major compaction is skipped.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
>  
> Steps to Reproduce:
>  # create Acid table 
> {code:java}
> //  create table myacid(id int);
> {code}
>  # Run multiple insert table 
> {code:java}
> // insert overwrite table myacid values(1);insert overwrite table myacid 
> values(2),(3),(4){code}
>  # DFS ls output
> {code:java}
> // dfs -ls -R /warehouse/tablespace/managed/hive/myacid;
> ++
> |                     DFS Output                     |
> ++
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        610 2019-09-27 16:42 
> /warehouse/tablespace/managed/hive/myacid/base_001/bucket_0 |
> | drwxrwx---+  - hive hadoop          0 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002 |
> | -rw-rw+  3 hive hadoop          1 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/_orc_acid_version |
> | -rw-rw+  3 hive hadoop        633 2019-09-27 16:43 
> /warehouse/tablespace/managed/hive/myacid/base_002/bucket_0 |
> ++{code}
>  
> you will see that major compaction will not be triggered until you run ALTER 
> TABLE myacid COMPACT 'major' manually.
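
For context, the Initiator check linked above is, roughly, a delta-to-base size
ratio test; a schematic sketch (editor's illustration, with assumed names, not
the exact Hive code) of why an all-base table never qualifies:

{code:java}
public class CompactionCheckSketch {
  // Schematic only: parameter names and threshold semantics are assumptions.
  static boolean shouldInitiateMajor(long baseSize, long deltaSize, float deltaPctThreshold) {
    // With only base files deltaSize == 0, so the ratio is 0, the check
    // fails, and automatic major compaction is skipped.
    return baseSize != 0 && (float) deltaSize / baseSize > deltaPctThreshold;
  }

  public static void main(String[] args) {
    System.out.println(shouldInitiateMajor(1243, 0, 0.1f)); // false: no deltas
  }
}
{code}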



--
This message was sent by Atlassian Jira
(v8.3.4#803005)