[jira] [Work logged] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-03-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21294?focusedWorklogId=209428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-209428
 ]

ASF GitHub Bot logged work on HIVE-21294:
-

Author: ASF GitHub Bot
Created on: 07/Mar/19 07:46
Start Date: 07/Mar/19 07:46
Worklog Time Spent: 10m 
  Work Description: pudidic commented on pull request #547: HIVE-21294: 
Vectorization: 1-reducer Shuffle can skip the object hash…
URL: https://github.com/apache/hive/pull/547
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 209428)
Time Spent: 0.5h  (was: 20m)

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.
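
As a rough illustration of why the hash can be skipped: with a single reducer, every row maps to partition 0 regardless of its key hash. The sketch below uses hypothetical names (SingleReducerPartitioner, partitionFor) and plain java.util hashing; it is not the VectorReduceSinkObjectHashOperator code or the committed patch.

```java
import java.util.Arrays;

// Hypothetical sketch: the class and method names are illustrative only.
final class SingleReducerPartitioner {
    private final int numReducers;

    SingleReducerPartitioner(int numReducers) {
        if (numReducers < 1) {
            throw new IllegalArgumentException("numReducers must be >= 1");
        }
        this.numReducers = numReducers;
    }

    /** Returns the reducer index in [0, numReducers) for the given key columns. */
    int partitionFor(Object[] keyColumns) {
        if (numReducers == 1) {
            // Only one possible destination, so the key hash is never needed.
            return 0;
        }
        // General case: hash the key columns and map the hash onto a reducer.
        int hash = Arrays.hashCode(keyColumns);
        return Math.floorMod(hash, numReducers);
    }

    public static void main(String[] args) {
        Object[] key = {"a", 42};
        System.out.println(new SingleReducerPartitioner(1).partitionFor(key));
        System.out.println(new SingleReducerPartitioner(4).partitionFor(key));
    }
}
```

The early return is the whole optimization in this sketch: the per-row hashing work disappears when numReducers is 1, and the general path is unchanged otherwise.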



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-03-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21294?focusedWorklogId=209429&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-209429
 ]

ASF GitHub Bot logged work on HIVE-21294:
-

Author: ASF GitHub Bot
Created on: 07/Mar/19 07:46
Start Date: 07/Mar/19 07:46
Worklog Time Spent: 10m 
  Work Description: pudidic commented on issue #547: HIVE-21294: 
Vectorization: 1-reducer Shuffle can skip the object hash…
URL: https://github.com/apache/hive/pull/547#issuecomment-470420932
 
 
   Merged to master branch.
 



Issue Time Tracking
---

Worklog Id: (was: 209429)
Time Spent: 40m  (was: 0.5h)

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.





[jira] [Commented] (HIVE-16716) Clean up javadoc from errors in module ql

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786476#comment-16786476
 ] 

Hive QA commented on HIVE-16716:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  9s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  1s{color} | {color:red} ql: The patch generated 5 new + 2511 unchanged - 7 fixed = 2516 total (was 2518) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  1m  3s{color} | {color:red} ql generated 42 new + 58 unchanged - 42 fixed = 100 total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 32s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16378/dev-support/hive-personality.sh |
| git revision | master / 84f766e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16378/yetus/diff-checkstyle-ql.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16378/yetus/whitespace-eol.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-16378/yetus/diff-javadoc-javadoc-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16378/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Clean up javadoc from errors in module ql
> -
>
> Key: HIVE-16716
> URL: https://issues.apache.org/jira/browse/HIVE-16716
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Janos Gub
>Assignee: Robert Kucsora
>Priority: Major
> Attachments: HIVE-16716-v2.patch, HIVE-16716.2.patch, 
> HIVE-16716.3.patch, HIVE-16716.4.patch, HIVE-16716.5.patch, 
> HIVE-16716.6.patch, HIVE-16716.7.patch, HIVE-16716.8.patch, HIVE-16716.patch
>
>






[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-03-06 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786466#comment-16786466
 ] 

Zoltan Haindrich commented on HIVE-16924:
-

masking_1 failed in the last 3 qa runs; it might be related

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch, 
> HIVE-16924.15.patch, HIVE-16924.16.patch, HIVE-16924.17.patch, 
> HIVE-16924.18.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get:
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'





[jira] [Updated] (HIVE-21400) Vectorization: LazyBinarySerializeWrite allocates Field() within the loop

2019-03-06 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21400:
---
Attachment: HIVE-21400.1.patch

> Vectorization: LazyBinarySerializeWrite allocates Field() within the loop
> -
>
> Key: HIVE-21400
> URL: https://issues.apache.org/jira/browse/HIVE-21400
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-21400.1.patch
>
>
> GC thrash from an unexpected source in ReduceSinkOperator.
> {code}
> org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.resetWithoutOutput(LazyBinarySerializeWrite.java:136)
> at 
> org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.reset(LazyBinarySerializeWrite.java:132)
> at 
> org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkUniformHashOperator.process(VectorReduceSinkUniformHashOperator.java:180)
> {code}
> GC space is getting thrashed by the 
> {code}
> root = new Field(STRUCT);
> {code}
> for every row.
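
To make the allocation pattern concrete, here is a minimal sketch contrasting a reset() that allocates a new root per row with one that reuses a single root object. Field, Category, AllocatingWriter and ReusingWriter are hypothetical stand-ins written for this illustration; they are not the real LazyBinarySerializeWrite internals, and this is not necessarily what HIVE-21400.1.patch does.

```java
// Hypothetical stand-ins for illustration; not Hive's serde2 classes.
final class FieldReuseSketch {
    enum Category { STRUCT }

    /** Counts Field constructor calls so the two strategies can be compared. */
    static int allocations = 0;

    static final class Field {
        final Category category;
        int fieldIndex;

        Field(Category category) {
            this.category = category;
            allocations++;
        }

        /** Clears per-row state so the same object can serve the next row. */
        void clear() {
            fieldIndex = 0;
        }
    }

    /** Allocates a fresh root on every reset, i.e. once per row. */
    static final class AllocatingWriter {
        private Field root;

        void reset() {
            root = new Field(Category.STRUCT);
        }
    }

    /** Allocates the root once and only clears it on reset. */
    static final class ReusingWriter {
        private final Field root = new Field(Category.STRUCT);

        void reset() {
            root.clear();
        }
    }

    static int allocationsWithAllocatingWriter(int rows) {
        allocations = 0;
        AllocatingWriter writer = new AllocatingWriter();
        for (int i = 0; i < rows; i++) {
            writer.reset();
        }
        return allocations;
    }

    static int allocationsWithReusingWriter(int rows) {
        allocations = 0;
        ReusingWriter writer = new ReusingWriter();
        for (int i = 0; i < rows; i++) {
            writer.reset();
        }
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println(allocationsWithAllocatingWriter(1000));
        System.out.println(allocationsWithReusingWriter(1000));
    }
}
```

In this sketch the allocating variant creates one Field per row while the reusing variant creates one per writer, which is the difference that shows up as GC pressure in a per-row code path.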





[jira] [Updated] (HIVE-21400) Vectorization: LazyBinarySerializeWrite allocates Field() within the loop

2019-03-06 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21400:
---
Status: Patch Available  (was: Open)

> Vectorization: LazyBinarySerializeWrite allocates Field() within the loop
> -
>
> Key: HIVE-21400
> URL: https://issues.apache.org/jira/browse/HIVE-21400
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
> Attachments: HIVE-21400.1.patch
>
>
> GC thrash from an unexpected source in ReduceSinkOperator.
> {code}
> org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.resetWithoutOutput(LazyBinarySerializeWrite.java:136)
> at 
> org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.reset(LazyBinarySerializeWrite.java:132)
> at 
> org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkUniformHashOperator.process(VectorReduceSinkUniformHashOperator.java:180)
> {code}
> GC space is getting thrashed by the 
> {code}
> root = new Field(STRUCT);
> {code}
> for every row.





[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786455#comment-16786455
 ] 

Hive QA commented on HIVE-16924:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961395/HIVE-16924.18.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_1] (batchId=92)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[test_teradatabinaryfile] (batchId=2)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16377/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16377/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16377/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961395 - PreCommit-HIVE-Build

> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, HIVE-16924.13.patch, HIVE-16924.14.patch, 
> HIVE-16924.15.patch, HIVE-16924.16.patch, HIVE-16924.17.patch, 
> HIVE-16924.18.patch
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get:
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'





[jira] [Updated] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-03-06 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21377:
--
Attachment: HIVE-21377.01.patch
Status: Patch Available  (was: In Progress)

> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 3.0.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21377.01.patch, HIVE-21377.patch
>
>
> When we use Oracle as the HMS DB, we see messages like the following in the 
> HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles the values returned by Postgres, MySQL and 
> Derby. Oracle, however, returns 0 or 1 for a Boolean, so 
> MetastoreDirectSqlUtils.java [1] needs to be modified.
> Could the following snippet be added to that code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // added: Oracle returns 0 or 1
>       try {
>         // Go through Number.intValue(): BooleanUtils has no BigDecimal overload.
>         return BooleanUtils.toBooleanObject(
>             Integer.valueOf(((Number) value).intValue()), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP: fall through to the MetaException below
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  
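
For readers who want to try the numeric handling without the Hive or Commons Lang jars, the following is a standalone sketch of the same idea. SqlBooleanSketch is a hypothetical class written for this illustration, and it throws IllegalArgumentException where the Hive method throws MetaException; it is not the code from either attached patch.

```java
import java.math.BigDecimal;

// Hypothetical stdlib-only sketch; the Hive method uses BooleanUtils and MetaException.
final class SqlBooleanSketch {
    static Boolean extractSqlBoolean(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        if (value instanceof Number) {
            try {
                // Compare numerically so Integer 1, Long 1 and BigDecimal("1.0") all match.
                BigDecimal number = new BigDecimal(value.toString());
                if (number.compareTo(BigDecimal.ONE) == 0) {
                    return Boolean.TRUE;
                }
                if (number.compareTo(BigDecimal.ZERO) == 0) {
                    return Boolean.FALSE;
                }
            } catch (NumberFormatException nfe) {
                // Values such as NaN have no decimal form; fall through to the error.
            }
        }
        if (value instanceof String) {
            if ("Y".equals(value)) {
                return Boolean.TRUE;
            }
            if ("N".equals(value)) {
                return Boolean.FALSE;
            }
        }
        throw new IllegalArgumentException(
            "Cannot extract boolean from column value " + value);
    }

    public static void main(String[] args) {
        System.out.println(extractSqlBoolean(new BigDecimal("1")));
        System.out.println(extractSqlBoolean(BigDecimal.ZERO));
        System.out.println(extractSqlBoolean("Y"));
    }
}
```

The sketch assumes the Oracle driver hands back the column as some java.lang.Number (typically a BigDecimal); any number other than 0 or 1 still ends in the "Cannot extract boolean" error.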





[jira] [Commented] (HIVE-16924) Support distinct in presence of Group By

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786452#comment-16786452
 ] 

Hive QA commented on HIVE-16924:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 50s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  4s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m  3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 47s{color} | {color:red} ql: The patch generated 8 new + 639 unchanged - 13 fixed = 647 total (was 652) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  6s{color} | {color:red} root: The patch generated 8 new + 647 unchanged - 13 fixed = 655 total (was 660) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 27s{color} | {color:green} ql generated 0 new + 2249 unchanged - 2 fixed = 2249 total (was 2251) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 22s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16377/dev-support/hive-personality.sh |
| git revision | master / 84f766e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16377/yetus/diff-checkstyle-ql.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16377/yetus/diff-checkstyle-root.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-16377/yetus/whitespace-eol.txt |
| modules | C: ql . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16377/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support distinct in presence of Group By 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch, HIVE-16924.07.patch, HIVE-16924.08.patch, 
> HIVE-16924.09.patch, HIVE-16924.10.patch, HIVE-16924.11.patch, 
> HIVE-16924.12.patch, 

[jira] [Commented] (HIVE-16725) Support recursive CTEs

2019-03-06 Thread Patrick Cuba (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16786434#comment-16786434
 ] 

Patrick Cuba commented on HIVE-16725:
-

Any due date for this?
This would be super handy for a query I’m working on

> Support recursive CTEs
> --
>
> Key: HIVE-16725
> URL: https://issues.apache.org/jira/browse/HIVE-16725
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Carter Shanklin
>Priority: Major
>
> Hive introduced non-recursive CTEs in HIVE-1180.
> Recursive CTEs are commonly used to navigate hierarchies stored in relational 
> tables where a parent ID column "foreign key" refers to another "primary key" 
> field within the same table. In this context recursive CTEs are used to 
> traverse hierarchies, determine parents / children, measure depths, build 
> paths and so on.
> Recursive CTEs are constructed similarly to basic CTEs but include 2 queries 
> at a minimum: first a root query which is combined via UNION / UNION ALL to 
> additional queries that can refer to the CTE's table name.
> Support should include:
> * Basic recursive CTE support, i.e. allow the CTE's table name to be referred 
> to in the table subquery after a UNION or UNION ALL.
> * Recursive CTEs should be supported as basic queries, in views, or in 
> subqueries.
> * Loop detection is highly desirable. If a loop is detected the query should 
> fail at runtime. Hive is commonly used in shared clusters where it is 
> difficult to track down rogue queries.
> * To ease portability, the recursive keyword should not be required. It 
> could be made optional.
> * To ease portability, "with column list", i.e. with t(col1, col2) as ( ... ), 
> should be supported.
> Example (Postgres compatible):
> {code}
> create table hierarchy (id integer, parent integer);
> insert into hierarchy values (1, null), (2, 1), (3, 2);
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent is null
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
>  id | parent
> ----+--------
>   1 |
>   2 |      1
>   3 |      2
> (3 rows)
> update hierarchy set parent = 3 where id = 1;
> with recursive t(id, parent) as (
>   select id, parent from hierarchy where parent = 1
>   union all select hierarchy.id, hierarchy.parent from hierarchy, t where 
> t.id = hierarchy.parent
> ) select * from t;
> [ Query runs forever ]
> {code}
> Implementation Notes:
> The SQL standard requires use of the "recursive" keyword for recursive CTEs. 
> However, major commercial databases including Oracle, SQL Server and DB2 do 
> not require, or in some cases, don't even allow the "recursive" keyword. 
> Postgres requires the "recursive" keyword.
> If Oracle detects a loop it fails with this message: ORA-32044: cycle 
> detected while executing recursive WITH query
> If Postgres encounters a loop in a recursive CTE, the query runs forever and 
> must be killed.
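
The traversal and the requested loop detection can be sketched outside SQL as follows. This is only an illustration of the semantics on the hierarchy table from the example, using a hypothetical RecursiveCteSketch class that tracks each row's ancestor ids and fails when a row reappears among its own ancestors; it says nothing about how Hive would plan or execute such a query.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch of recursive-CTE semantics with loop detection.
final class RecursiveCteSketch {
    static final class Row {
        final int id;
        final Integer parent; // null models SQL NULL

        Row(int id, Integer parent) {
            this.id = id;
            this.parent = parent;
        }
    }

    /**
     * Root query: rows whose parent equals rootParent (null selects "parent is null").
     * Recursive step: rows whose parent is the id of a row already produced.
     * Throws IllegalStateException if a row turns out to be its own ancestor.
     */
    static List<Row> traverse(List<Row> hierarchy, Integer rootParent) {
        List<Row> result = new ArrayList<>();
        Deque<Row> pending = new ArrayDeque<>();
        Deque<Set<Integer>> pendingAncestors = new ArrayDeque<>();
        for (Row row : hierarchy) {
            if (Objects.equals(row.parent, rootParent)) {
                pending.add(row);
                pendingAncestors.add(new HashSet<>());
            }
        }
        while (!pending.isEmpty()) {
            Row current = pending.poll();
            // Copy the ancestor ids and add the current id to form this row's path.
            Set<Integer> path = new HashSet<>(pendingAncestors.poll());
            if (!path.add(current.id)) {
                throw new IllegalStateException("cycle detected at id " + current.id);
            }
            result.add(current);
            for (Row child : hierarchy) {
                if (child.parent != null && child.parent.intValue() == current.id) {
                    pending.add(child);
                    pendingAncestors.add(path);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> hierarchy =
            Arrays.asList(new Row(1, null), new Row(2, 1), new Row(3, 2));
        for (Row row : traverse(hierarchy, null)) {
            System.out.println(row.id + " | " + row.parent);
        }
    }
}
```

Tracking ancestors per row, rather than a single global visited set, means a row reached along two different non-cyclic paths is still returned twice, as UNION ALL would, while a true loop fails fast instead of running forever.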





[jira] [Work started] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-03-06 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-21377 started by Rajkumar Singh.
-
> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21377.patch
>
>
> When we use Oracle as the HMS DB, we see messages like the following in the 
> HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles the values returned by Postgres, MySQL and 
> Derby. Oracle, however, returns 0 or 1 for a Boolean, so 
> MetastoreDirectSqlUtils.java [1] needs to be modified.
> Could the following snippet be added to that code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // added: Oracle returns 0 or 1
>       try {
>         // Go through Number.intValue(): BooleanUtils has no BigDecimal overload.
>         return BooleanUtils.toBooleanObject(
>             Integer.valueOf(((Number) value).intValue()), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP: fall through to the MetaException below
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  





[jira] [Updated] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-03-06 Thread Rajkumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-21377:
--
Status: Open  (was: Patch Available)

> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.1.0, 3.0.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21377.patch
>
>
> When we use Oracle as the HMS DB, we see messages like the following in the 
> HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles the values returned by Postgres, MySQL and 
> Derby. Oracle, however, returns 0 or 1 for a Boolean, so 
> MetastoreDirectSqlUtils.java [1] needs to be modified.
> Could the following snippet be added to that code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // added: Oracle returns 0 or 1
>       try {
>         // Go through Number.intValue(): BooleanUtils has no BigDecimal overload.
>         return BooleanUtils.toBooleanObject(
>             Integer.valueOf(((Number) value).intValue()), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP: fall through to the MetaException below
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  





[jira] [Updated] (HIVE-21286) Hive should support clean-up of previously bootstrapped tables when retry from different dump.

2019-03-06 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21286:

Status: Patch Available  (was: Open)

04.patch rebased with master.

> Hive should support clean-up of previously bootstrapped tables when retry 
> from different dump.
> --
>
> Key: HIVE-21286
> URL: https://issues.apache.org/jira/browse/HIVE-21286
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21286.01.patch, HIVE-21286.02.patch, 
> HIVE-21286.03.patch, HIVE-21286.04.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> If external tables are enabled for replication on an existing repl policy, 
> then bootstrapping of the external tables is combined with the incremental dump.
> If the incremental bootstrap load fails with a non-retryable error, the user 
> has to manually drop all the external tables before trying with another 
> bootstrap dump. For a full bootstrap, to retry with a different dump, we 
> suggested that the user drop the DB, but in this case they need to manually drop all 
> the external tables, which is not user friendly. So we need to handle it on 
> the Hive side as follows.
> REPL LOAD takes an additional config (passed by the user in the WITH clause) that says: 
> drop all the tables which were bootstrapped from the previous dump. 
> hive.repl.clean.tables.from.bootstrap=
> Hive will use this config only if the current dump is a combined bootstrap in 
> an incremental dump.
> Caution: the user should not pass this config if the previous 
> REPL LOAD (with bootstrap) was successful or if any successful incremental 
> dump+load happened after "previous_bootstrap_dump_dir".
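To make the WITH-clause usage concrete, here is a small sketch that assembles such a REPL LOAD statement as a string. It assumes the config value is the previous bootstrap dump directory, which the last sentence of the description suggests but the truncated config line does not confirm. The class name, method name and all arguments are hypothetical.

```java
final class ReplLoadSketch {
  /**
   * Builds a REPL LOAD statement that passes the clean-up config in the WITH clause.
   * previousBootstrapDumpDir is an assumed value for the config, not taken from the issue.
   */
  static String replLoadWithCleanup(String db, String dumpDir, String previousBootstrapDumpDir) {
    return "REPL LOAD " + db + " FROM '" + dumpDir + "' WITH ("
        + "'hive.repl.clean.tables.from.bootstrap'='" + previousBootstrapDumpDir + "')";
  }
}
```

Per the caution above, a caller would add this config only when retrying after a failed combined bootstrap load, never after a successful one.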





[jira] [Updated] (HIVE-21286) Hive should support clean-up of previously bootstrapped tables when retry from different dump.

2019-03-06 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21286:

Attachment: HIVE-21286.04.patch

> Hive should support clean-up of previously bootstrapped tables when retry 
> from different dump.
> --
>
> Key: HIVE-21286
> URL: https://issues.apache.org/jira/browse/HIVE-21286
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21286.01.patch, HIVE-21286.02.patch, 
> HIVE-21286.03.patch, HIVE-21286.04.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (HIVE-21400) Vectorization: LazyBinarySerializeWrite allocates Field() within the loop

2019-03-06 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V reassigned HIVE-21400:
--

Assignee: Gopal V

> Vectorization: LazyBinarySerializeWrite allocates Field() within the loop
> -
>
> Key: HIVE-21400
> URL: https://issues.apache.org/jira/browse/HIVE-21400
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Major
>
> GC thrash from an unexpected source in ReduceSinkOperator.
> {code}
> org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.resetWithoutOutput(LazyBinarySerializeWrite.java:136)
> at org.apache.hadoop.hive.serde2.lazybinary.fast.LazyBinarySerializeWrite.reset(LazyBinarySerializeWrite.java:132)
> at org.apache.hadoop.hive.ql.exec.vector.reducesink.VectorReduceSinkUniformHashOperator.process(VectorReduceSinkUniformHashOperator.java:180)
> {code}
> GC space is getting thrashed by the 
> {code}
> root = new Field(STRUCT);
> {code}
> for every row.
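The usual remedy for this kind of per-row allocation is to allocate the object once and clear it on each reset. The sketch below illustrates that pattern only. `ReusableRootSketch` and its `Field` class are hypothetical stand-ins, not the LazyBinarySerializeWrite code or the eventual patch.

```java
final class ReusableRootSketch {
  /** Hypothetical stand-in for the serializer's per-row bookkeeping object. */
  static final class Field {
    int fieldCount;
    int fieldIndex;

    void clear() {
      fieldCount = 0;
      fieldIndex = 0;
    }
  }

  // Allocated once per writer instead of once per row.
  private final Field root = new Field();

  /** Called for every row: resets the shared instance rather than creating a new one. */
  Field reset(int fieldCount) {
    root.clear();
    root.fieldCount = fieldCount;
    return root;
  }
}
```

Reuse like this is only safe if nothing keeps a reference to the root object across rows, which would need to be checked against the real serializer.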





[jira] [Comment Edited] (HIVE-21374) Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to invoking unexpected methods

2019-03-06 Thread Hello CoCooo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786415#comment-16786415
 ] 

Hello CoCooo edited comment on HIVE-21374 at 3/7/19 6:00 AM:
-

I have checked the revision history.

I'm glad to see this issue has been fixed by this commit: *2b51e56* 
[https://github.com/apache/hive/commit/2b51e562173a8d6e2bcf304a611daafe0da8ebc0#diff-600376dffeb79835ede4a0b285078036].

Thank you very much.
  


was (Author: hellococooo):
I have checked the revision history.

I'm glad to see this issue has been fixed by this commit: *2b51e56* 
[https://github.com/apache/hive/commit/2b51e562173a8d6e2bcf304a611daafe0da8ebc0#diff-600376dffeb79835ede4a0b285078036]]
 . 

Thank you very much.
  

> Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to 
> invoking unexpected methods
> --
>
> Key: HIVE-21374
> URL: https://issues.apache.org/jira/browse/HIVE-21374
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: Hello CoCooo
>Priority: Major
> Fix For: 2.3.4
>
> Attachments: validate4.4.1.png, validate4.4.png
>
>
> Hi! In *hive-rel-release-2.3.4\service-rpc,*  there are multiple versions of 
> *org.apache.httpcomponents:httpcore:jar*. As shown in the following 
> dependency tree, according to Maven's dependency management strategy, only 
> *org.apache.httpcomponents:httpcore:jar:4.4* can be loaded, and 
> *org.apache.httpcomponents:httpcore:jar:4.4.1* will be shadowed.
> Your project references the method 
> {color:#d04437}**
>  {color}via the following invocation path, which is included in the shadowed 
> version *org.apache.httpcomponents:httpcore:jar:4.4.1*. However, this method 
> is missing from the version that is actually loaded, 
> *org.apache.httpcomponents:httpcore:jar:4.4*. Surprisingly, this does not cause 
> a NoSuchMethodError at runtime.
> {color:#59afe1}*Invocation path:*{color}
> {code:java}
> // code placeholder
>  requestInvoke(org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\thrift\libthrift\0.9.3\libthrift-0.9.3.jar
>  
> C:\Users\Flipped\.m2\repository\junit\junit\4.11\junit-4.11.jar
>  get(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntry(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntry(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  access$000(org.apache.http.pool.AbstractConnPool,java.lang.Object,java.lang.Object,long,java.util.concurrent.TimeUnit,org.apache.http.pool.PoolEntryFuture)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntryBlocking(java.lang.Object,java.lang.Object,long,java.util.concurrent.TimeUnit,org.apache.http.pool.PoolEntryFuture)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  validate(org.apache.http.pool.PoolEntry)>{code}
> On further analysis, I found that the caller 
> *org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(AbstractNonblockingServer$FrameBuffer)*
>  would invoke the method 
> *{color:#d04437}AbstractConnPool.validate(PoolEntry){color}* defined in the 
> *superclass of org.apache.http.impl.conn.CPool (**CPool* *extends 
> AbstractConnPool)*, which has the same signature as the expected callee, due to 
> the dynamic binding mechanism.
> Although the method actually invoked, which belongs to 
> *{color:#d04437}AbstractConnPool{color}*, has the same method name, 
> parameter types and return type as the expected method defined in its 
> subclass {color:#d04437}*CPool*{color}, it has different control flow 
> and different behavior. This may be a bug.
>  
> +_*{color:#f691b2}Solution:{color}*_+
>  Use the newer version *org.apache.httpcomponents:httpcore:jar:4.4.1* in the 
> parent pom file to keep the versions consistent.
>  
> *Dependency tree*
> [INFO] org.apache.hive:hive-service-rpc:jar:2.3.4
> [INFO] +- commons-codec:commons-codec:jar:1.4:compile
> [INFO] +- commons-cli:commons-cli:jar:1.2:compile
> [INFO] +- tomcat:jasper-compiler:jar:5.5.23:compile
> [INFO] |  +- javax.servlet:jsp-api:jar:2.0:compile
> [INFO] |  |  - (javax.servlet:servlet-api:jar:2.4:compile - omitted for 
> duplicate)
> [INFO] |  - ant:ant:jar:1.6.5:compile
> [INFO] +- tomcat:jasper-runtime:jar:5.5.23:compile
> [INFO] |  +- javax.servlet:servlet-api:jar:2.4:compile
> [INFO] |  - commons-el:commons-el:jar:1.0:compile
> [INFO] | - 
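The reason no NoSuchMethodError appears can be illustrated with a small model. When a subclass does not declare a method, the JVM resolves the call to the inherited superclass method, so the call links and runs but may behave differently. In the sketch below, `CPoolOld` and `CPoolNew` are hypothetical stand-ins for the class as it would look in the two jar versions, and the method bodies are invented purely to make the difference visible. They are not the real httpcore implementations.

```java
final class ShadowedOverrideSketch {
  /** Models the superclass, which always provides validate(). */
  static class AbstractConnPool {
    boolean validate(String entry) {
      return true; // invented base behaviour: accept everything
    }
  }

  /** Models CPool without the override, as in the jar that is actually loaded. */
  static class CPoolOld extends AbstractConnPool {
  }

  /** Models CPool with the override, as in the shadowed jar. */
  static class CPoolNew extends AbstractConnPool {
    @Override
    boolean validate(String entry) {
      return !entry.isEmpty(); // invented stricter behaviour
    }
  }

  /** The same call site works against either version because of dynamic binding. */
  static boolean check(AbstractConnPool pool, String entry) {
    return pool.validate(entry);
  }
}
```

With these stand-ins, `check(new CPoolOld(), "")` runs the inherited method while `check(new CPoolNew(), "")` runs the override. That is the silent behavioural difference the report describes.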

[jira] [Comment Edited] (HIVE-21374) Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to invoking unexpected methods

2019-03-06 Thread Hello CoCooo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786415#comment-16786415
 ] 

Hello CoCooo edited comment on HIVE-21374 at 3/7/19 6:00 AM:
-

I have checked the revision history.

I'm glad to see this issue has been fixed by this commit: *2b51e56* 
[https://github.com/apache/hive/commit/2b51e562173a8d6e2bcf304a611daafe0da8ebc0#diff-600376dffeb79835ede4a0b285078036].

Thank you very much.
  


was (Author: hellococooo):
I have check the revision history.

I'm glad to see this issue has been fixed by this commit: *2b51e56* 
[https://github.com/apache/hive/commit/2b51e562173a8d6e2bcf304a611daafe0da8ebc0#diff-600376dffeb79835ede4a0b285078036]]
 . 

Thank you very much.
 

> Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to 
> invoking unexpected methods
> --
>
> Key: HIVE-21374
> URL: https://issues.apache.org/jira/browse/HIVE-21374
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: Hello CoCooo
>Priority: Major
> Fix For: 2.3.4
>
> Attachments: validate4.4.1.png, validate4.4.png
>
>

[jira] [Commented] (HIVE-21374) Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to invoking unexpected methods

2019-03-06 Thread Hello CoCooo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786415#comment-16786415
 ] 

Hello CoCooo commented on HIVE-21374:
-

I have checked the revision history.

I'm glad to see this issue has been fixed by this commit: *2b51e56* 
[https://github.com/apache/hive/commit/2b51e562173a8d6e2bcf304a611daafe0da8ebc0#diff-600376dffeb79835ede4a0b285078036].

Thank you very much.
 

> Dependency conflicts on org.apache.httpcomponents:httpcore:jar, leading to 
> invoking unexpected methods
> --
>
> Key: HIVE-21374
> URL: https://issues.apache.org/jira/browse/HIVE-21374
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: Hello CoCooo
>Priority: Major
> Fix For: 2.3.4
>
> Attachments: validate4.4.1.png, validate4.4.png
>
>
> Hi! In *hive-rel-release-2.3.4\service-rpc,*  there are multiple versions of 
> *org.apache.httpcomponents:httpcore:jar*. As shown in the following 
> dependency tree, according to Maven's dependency management strategy, only 
> *org.apache.httpcomponents:httpcore:jar:4.4* can be loaded, and 
> *org.apache.httpcomponents:httpcore:jar:4.4.1* will be shadowed.
> Your project references the method 
> {color:#d04437}**
>  {color}via the following invocation path, which is included in the shadowed 
> version *org.apache.httpcomponents:httpcore:jar:4.4.1*. However, this method 
> is missing from the version that is actually loaded, 
> *org.apache.httpcomponents:httpcore:jar:4.4*. Surprisingly, this does not cause 
> a NoSuchMethodError at runtime.
> {color:#59afe1}*Invocation path:*{color}
> {code:java}
> // code placeholder
>  requestInvoke(org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\thrift\libthrift\0.9.3\libthrift-0.9.3.jar
>  
> C:\Users\Flipped\.m2\repository\junit\junit\4.11\junit-4.11.jar
>  get(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntry(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntry(long,java.util.concurrent.TimeUnit)> 
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  access$000(org.apache.http.pool.AbstractConnPool,java.lang.Object,java.lang.Object,long,java.util.concurrent.TimeUnit,org.apache.http.pool.PoolEntryFuture)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  getPoolEntryBlocking(java.lang.Object,java.lang.Object,long,java.util.concurrent.TimeUnit,org.apache.http.pool.PoolEntryFuture)>
>  
> C:\Users\Flipped\.m2\repository\org\apache\httpcomponents\httpcore\4.4.1\httpcore-4.4.1.jar
>  validate(org.apache.http.pool.PoolEntry)>{code}
> On further analysis, I found that the caller 
> *org.apache.thrift.server.TThreadedSelectorServer.requestInvoke(AbstractNonblockingServer$FrameBuffer)*
>  would invoke the method 
> *{color:#d04437}AbstractConnPool.validate(PoolEntry){color}* defined in the 
> *superclass of org.apache.http.impl.conn.CPool (**CPool* *extends 
> AbstractConnPool)*, which has the same signature as the expected callee, due to 
> the dynamic binding mechanism.
> Although the method actually invoked, which belongs to 
> *{color:#d04437}AbstractConnPool{color}*, has the same method name, 
> parameter types and return type as the expected method defined in its 
> subclass {color:#d04437}*CPool*{color}, it has different control flow 
> and different behavior. This may be a bug.
>  
> +_*{color:#f691b2}Solution:{color}*_+
>  Use the newer version *org.apache.httpcomponents:httpcore:jar:4.4.1* in the 
> parent pom file to keep the versions consistent.
>  
> *Dependency tree*
> [INFO] org.apache.hive:hive-service-rpc:jar:2.3.4
> [INFO] +- commons-codec:commons-codec:jar:1.4:compile
> [INFO] +- commons-cli:commons-cli:jar:1.2:compile
> [INFO] +- tomcat:jasper-compiler:jar:5.5.23:compile
> [INFO] |  +- javax.servlet:jsp-api:jar:2.0:compile
> [INFO] |  |  - (javax.servlet:servlet-api:jar:2.4:compile - omitted for 
> duplicate)
> [INFO] |  - ant:ant:jar:1.6.5:compile
> [INFO] +- tomcat:jasper-runtime:jar:5.5.23:compile
> [INFO] |  +- javax.servlet:servlet-api:jar:2.4:compile
> [INFO] |  - commons-el:commons-el:jar:1.0:compile
> [INFO] | - commons-logging:commons-logging:jar:1.0.3:compile
> [INFO] +- org.apache.thrift:libfb303:jar:0.9.3:compile
> [INFO] |  - (org.apache.thrift:libthrift:jar:0.9.3:compile - omitted for 
> duplicate)
> [INFO] +- org.apache.thrift:libthrift:jar:0.9.3:compile
> [INFO] |  +- (org.slf4j:slf4j-api:jar:1.7.10:compile - version managed from 
> 

[jira] [Commented] (HIVE-21280) Null pointer exception on running compaction against a MM table.

2019-03-06 Thread Vaibhav Gumashta (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786409#comment-16786409
 ] 

Vaibhav Gumashta commented on HIVE-21280:
-

Thanks for the patch [~aditya-shah]; I will review it tomorrow.

> Null pointer exception on running compaction against a MM table.
> 
>
> Key: HIVE-21280
> URL: https://issues.apache.org/jira/browse/HIVE-21280
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Aditya Shah
>Assignee: Aditya Shah
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21280.patch
>
>
> On running compaction on an MM table, I got a null pointer exception while getting 
> the HDFS session path. It seems that the session state was not 
> started for these queries. Even after making it start, compaction still fails when 
> running a TezTask for the insert overwrite on a temp table with the contents of the 
> original table. The cause is that the Tez session state cannot be 
> initialized: an IllegalArgumentException is thrown while setting up the 
> caller context in the Tez task, because the caller id, which uses the query id, 
> is an empty string. 
> I think the session state needs to be started, and each of the queries run 
> for compaction (I also have doubts about the stats updater thread's queries) should 
> have a query id. Some details are as follows:
> Steps to reproduce:
> 1) Using beeline with HS2 and HMS
> 2) create an MM table
> 3) Insert a few values in the table
> 4) alter table mm_table compact 'major'; 
> Stack trace on HMS:
> {code:java}
> compactor.Worker: Caught exception while trying to compact id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0. Marking failed to avoid repeated failures, java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false')
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH SERDEPROPERTIES (
> 'serialization.format'='1')STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base' TBLPROPERTIES ('transactional'='false')
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
> ... 2 more
> Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null
> at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
> at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:309)
> at org.apache.hadoop.hive.ql.Context.<init>(Context.java:295)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
> at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522)
> ... 3 more
> {code}
> cc: [~ekoifman] [~vgumashta] [~sershe]
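The two failure conditions described above (a null session path and an empty query id) can be modelled with two small guards. This is only an illustrative sketch of the checks the reporter proposes. `CompactionQueryGuardSketch`, its methods and the `compactor_` prefix are hypothetical and do not correspond to Hive APIs or to the attached patch.

```java
import java.util.Objects;
import java.util.UUID;

final class CompactionQueryGuardSketch {
  /** Returns the given query id, or a freshly generated one if it is null or empty. */
  static String ensureQueryId(String queryId) {
    if (queryId == null || queryId.isEmpty()) {
      // Hypothetical naming scheme; any unique non-empty id would do.
      return "compactor_" + UUID.randomUUID();
    }
    return queryId;
  }

  /** Fails fast with the same message as in the stack trace when the session was not started. */
  static String requireSessionPath(String hdfsSessionPath) {
    return Objects.requireNonNull(hdfsSessionPath,
        "Non-local session path expected to be non-null");
  }
}
```

In the reporter's terms, the first guard corresponds to giving every compaction query a query id, and the second shows why the session state has to be started before the driver compiles the query.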





[jira] [Updated] (HIVE-21286) Hive should support clean-up of previously bootstrapped tables when retry from different dump.

2019-03-06 Thread Sankar Hariappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-21286:

Status: Open  (was: Patch Available)

> Hive should support clean-up of previously bootstrapped tables when retry 
> from different dump.
> --
>
> Key: HIVE-21286
> URL: https://issues.apache.org/jira/browse/HIVE-21286
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21286.01.patch, HIVE-21286.02.patch, 
> HIVE-21286.03.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-21286) Hive should support clean-up of previously bootstrapped tables when retry from different dump.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786403#comment-16786403
 ] 

Hive QA commented on HIVE-21286:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961386/HIVE-21286.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15805 tests executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=155)
	[intersect_all.q,unionDistinct_1.q,table_nonprintable.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,cte_2.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,results_cache_diff_fs.q,cttl.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[test_teradatabinaryfile] (batchId=2)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16376/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16376/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16376/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961386 - PreCommit-HIVE-Build

> Hive should support clean-up of previously bootstrapped tables when retry 
> from different dump.
> --
>
> Key: HIVE-21286
> URL: https://issues.apache.org/jira/browse/HIVE-21286
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21286.01.patch, HIVE-21286.02.patch, 
> HIVE-21286.03.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-21286) Hive should support clean-up of previously bootstrapped tables when retry from different dump.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786380#comment-16786380
 ] 

Hive QA commented on HIVE-21286:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  9s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 46s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 17s{color} | {color:red} itests/hive-unit: The patch generated 10 new + 18 unchanged - 0 fixed = 28 total (was 18) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 31s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 33s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16376/dev-support/hive-personality.sh |
| git revision | master / 84f766e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16376/yetus/diff-checkstyle-itests_hive-unit.txt |
| modules | C: ql itests/hive-unit U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16376/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Hive should support clean-up of previously bootstrapped tables when retry 
> from different dump.
> --
>
> Key: HIVE-21286
> URL: https://issues.apache.org/jira/browse/HIVE-21286
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: DR, Replication, pull-request-available
> Attachments: HIVE-21286.01.patch, HIVE-21286.02.patch, 
> HIVE-21286.03.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> If external tables are enabled for replication on an existing repl policy, 
> then bootstrapping of external tables is combined with incremental dump.
> If incremental bootstrap load fails with non-retryable error for which user 
> will have to manually drop all the external tables before trying with another 
> bootstrap 

[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786359#comment-16786359
 ] 

Gopal V commented on HIVE-21399:


That might explain why it isn't working for me right now - the config is very 
optimistic and the stats say be pessimistic.

My config is 0.99 and I'm testing to see if a massive nDV product reduces it 
and skips map aggregation faster.

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch, HIVE-21399.02.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786357#comment-16786357
 ] 

Jesus Camacho Rodriguez commented on HIVE-21399:


[~gopalv], new value is calculated as: {{max(config_value, 1 - ndv/num_rows)}}; 
setting the value to 1 would effectively force that value into the group by 
operators. Is that what you meant?
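
As a rough illustration of the formula above, here is a minimal sketch. It is not the code from the attached patch: the class `MinReductionSketch`, the method `adjustedMinReduction`, and the fallback for missing row counts are assumptions made for this example. `ndv` is taken to be the distinct-value estimate for the grouping keys and `numRows` the estimated input row count.

```java
final class MinReductionSketch {
  /** Computes max(configValue, 1 - ndv/numRows), following the comment above. */
  static float adjustedMinReduction(float configValue, long ndv, long numRows) {
    if (numRows <= 0) {
      // Assumption: without a usable row count, keep the configured value.
      return configValue;
    }
    float fromStats = 1f - (float) ndv / numRows;
    return Math.max(configValue, fromStats);
  }
}
```

Under this reading, a configured value of 1.0 always wins the `max`, which is consistent with the remark that setting the value to 1 forces it into the group by operators.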

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch, HIVE-21399.02.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21264) Improvements Around CharTypeInfo

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786354#comment-16786354
 ] 

Gopal V commented on HIVE-21264:


[~belugabehr]: I don't think this is correct because varchar(10) != char(10).

{code}
getClass() != other.getClass()
{code}

That was there to prevent it from going and assuming the types are identical.
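
To illustrate the point, here is a minimal, self-contained sketch. The classes below are simplified stand-ins invented for this example, not Hive's actual type-info classes; they only show how the `getClass()` comparison keeps two types with the same length from comparing equal.

```java
import java.util.Objects;

abstract class BaseLenType {
  private final int length;

  BaseLenType(int length) {
    this.length = length;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    // Runtime-class check: a char(10) never equals a varchar(10).
    if (other == null || getClass() != other.getClass()) {
      return false;
    }
    return length == ((BaseLenType) other).length;
  }

  @Override
  public int hashCode() {
    // Including the class makes it unlikely that char(10) and varchar(10) share a hash;
    // objects that are equal still produce the same hash.
    return Objects.hash(getClass(), length);
  }
}

// Hypothetical stand-ins for the char and varchar type infos.
final class CharType extends BaseLenType {
  CharType(int n) { super(n); }
}

final class VarcharType extends BaseLenType {
  VarcharType(int n) { super(n); }
}
```

Dropping the class check and comparing only the length would make `new CharType(10).equals(new VarcharType(10))` return true, which is the situation the comment warns about.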

> Improvements Around CharTypeInfo
> 
>
> Key: HIVE-21264
> URL: https://issues.apache.org/jira/browse/HIVE-21264
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21264.1.patch, HIVE-21264.2.patch
>
>
> The {{CharTypeInfo}} stores the type name of the data type (char/varchar) and 
> the length (1-255).  {{CharTypeInfo}} objects are often getting cached once 
> they are created.
> The {{hashcode()}} and {{equals()}} of its sub-classes varchar and char are 
> inconsistent.
> * Make hashcode and equals consistent (and fast)
> * Simplify the {{getQualifiedName}} implementation and reduce the scope to 
> protected
> * Other related nits





[jira] [Commented] (HIVE-21371) Make NonSyncByteArrayOutputStream Overflow Conscious

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786347#comment-16786347
 ] 

Gopal V commented on HIVE-21371:


LGTM - +1

> Make NonSyncByteArrayOutputStream Overflow Conscious 
> -
>
> Key: HIVE-21371
> URL: https://issues.apache.org/jira/browse/HIVE-21371
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0, 3.2.0
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Minor
> Attachments: HIVE-21371.1.patch, HIVE-21371.2.patch, 
> HIVE-21371.2.patch, HIVE-21371.2.patch
>
>
> {code:java|title=NonSyncByteArrayOutputStream}
>   private int enLargeBuffer(int increment) {
>     int temp = count + increment;
>     int newLen = temp;
>     if (temp > buf.length) {
>       if ((buf.length << 1) > temp) {
>         newLen = buf.length << 1;
>       }
>       byte newbuf[] = new byte[newLen];
>       System.arraycopy(buf, 0, newbuf, 0, count);
>       buf = newbuf;
>     }
>     return newLen;
>   }
> {code}
> This will fail if the array is 1GB or larger because it will double the size 
> every time without consideration for the 2GB limit on arrays.
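
A minimal sketch of an overflow-conscious growth policy for the method quoted above, not the attached patch. The class name `BufferGrowth`, the helper `newCapacity`, and the `MAX_ARRAY_SIZE` cap are assumptions made for illustration. Java arrays are indexed by `int`, so the arithmetic is done in `long` to keep both `count + increment` and the doubling from wrapping around. Unlike the quoted method, this helper returns the capacity to allocate rather than the new length.

```java
final class BufferGrowth {
  // Hypothetical cap, kept slightly below Integer.MAX_VALUE as a safety margin.
  static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

  /**
   * Returns a capacity large enough for count + increment bytes,
   * doubling the current capacity where that fits in the array limit.
   */
  static int newCapacity(int currentCapacity, int count, int increment) {
    long required = (long) count + increment;   // long arithmetic cannot wrap here
    if (required > MAX_ARRAY_SIZE) {
      throw new OutOfMemoryError("Required buffer size exceeds array limit: " + required);
    }
    if (required <= currentCapacity) {
      return currentCapacity;                   // already large enough
    }
    long doubled = (long) currentCapacity << 1; // may exceed the int range, so stay in long
    long candidate = Math.max(doubled, required);
    return (int) Math.min(candidate, MAX_ARRAY_SIZE);
  }
}
```

With a 1GB buffer, doubling is clamped to the cap instead of overflowing to a negative length; a request that cannot fit at all fails with an explicit error.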





[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786342#comment-16786342
 ] 

Gopal V commented on HIVE-21399:


[~jcamachorodriguez]: also turn off this optimization if someone does want to 
force this via "hive.map.aggr.hash.min.reduction=1.0f;"

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch, HIVE-21399.02.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21377) Using Oracle as HMS DB with DirectSQL

2019-03-06 Thread Bo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786341#comment-16786341
 ] 

Bo  commented on HIVE-21377:


[~pvary] How about this one?
{code:java}
// check if oracle db returned 0 or 1 for boolean value
if (value instanceof Number) {
  try {
    return BooleanUtils.toBooleanObject(Integer.valueOf(((Number) value).intValue()), 1, 0, null);
  } catch (IllegalArgumentException iae) {
    // NOOP
  }
}
{code}

> Using Oracle as HMS DB with DirectSQL
> -
>
> Key: HIVE-21377
> URL: https://issues.apache.org/jira/browse/HIVE-21377
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Bo 
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-21377.patch
>
>
> When we use Oracle as the HMS DB, we see entries like the following in the 
> HMS log:
> {code:java}
> 2019-02-02 T08:23:57,102 WARN [Thread-12]: metastore.ObjectStore 
> (ObjectStore.java:handleDirectSqlError(3741)) - Falling back to ORM path due 
> to direct SQL failure (this is not an error): Cannot extract boolean from 
> column value 0 at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.extractSqlBoolean(MetaStoreDirectSql.java:1031)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsFromPartitionIds(MetaStoreDirectSql.java:728)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$300(MetaStoreDirectSql.java:109)
>  at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql$1.run(MetaStoreDirectSql.java:471)
>  at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:462)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:3392)
> {code}
> In Hive, extractSqlBoolean handles Postgres, MySQL and Derby.
> But Oracle returns 0 or 1 for Boolean values, so we need to modify 
> MetastoreDirectSqlUtils.java - [1]
> Could we add this snippet to the code?
> {code:java}
>   static Boolean extractSqlBoolean(Object value) throws MetaException {
>     if (value == null) {
>       return null;
>     }
>     if (value instanceof Boolean) {
>       return (Boolean) value;
>     }
>     if (value instanceof Number) { // add: Oracle returns 0 or 1
>       try {
>         return BooleanUtils.toBooleanObject(Integer.valueOf(((Number) value).intValue()), 1, 0, null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     if (value instanceof String) {
>       try {
>         return BooleanUtils.toBooleanObject((String) value, "Y", "N", null);
>       } catch (IllegalArgumentException iae) {
>         // NOOP
>       }
>     }
>     throw new MetaException("Cannot extract boolean from column value " + value);
>   }
> {code}
>  [1] -
> https://github.com/apache/hive/blob/f51f108b761f0c88647f48f30447dae12b308f31/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L501-L527
>  





[jira] [Commented] (HIVE-19968) UDF exception is not throw out

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786323#comment-16786323
 ] 

Hive QA commented on HIVE-19968:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961377/HIVE-19968.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16375/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16375/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16375/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961377 - PreCommit-HIVE-Build

> UDF exception is not throw out
> --
>
> Key: HIVE-19968
> URL: https://issues.apache.org/jira/browse/HIVE-19968
> Project: Hive
>  Issue Type: Bug
>Reporter: sandflee
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, 
> HIVE-19968.03.patch, HIVE-19968.04.patch, HIVE-19968.05.patch, 
> HIVE-19968.06.patch, hive-udf.png
>
>
> UDF init failed and threw an exception, but Hive catches it and does nothing, 
> so the app succeeds but no data is generated.
> {code}
> // GenericUDFReflect.java#evaluate()
> try {
>   o = null;
>   o = ReflectionUtils.newInstance(c, null);
> } catch (Exception e) {
>   // ignored
> }
> {code}
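
A minimal sketch of the alternative, under stated assumptions: rather than ignoring the failure, the exception is wrapped and rethrown so the task fails visibly. The class `InstantiationSketch`, its method, and the choice of `IllegalStateException` are hypothetical and are not taken from the attached patches; plain Java reflection stands in for Hadoop's `ReflectionUtils` to keep the example self-contained.

```java
final class InstantiationSketch {
  /** Creates an instance of the given class, surfacing any failure to the caller. */
  static Object newInstance(Class<?> c) {
    try {
      return c.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      // Do not swallow the error: propagate it with context so the query fails.
      throw new IllegalStateException("Unable to instantiate " + c.getName(), e);
    }
  }
}
```

The point of the design is that a failed instantiation becomes a visible task failure instead of a successful run that silently produces no data.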





[jira] [Updated] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21399:
---
Attachment: HIVE-21399.02.patch

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch, HIVE-21399.02.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-19968) UDF exception is not throw out

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786303#comment-16786303
 ] 

Hive QA commented on HIVE-19968:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 17s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 18s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 42s{color} | {color:red} ql: The patch generated 3 new + 2 unchanged - 0 fixed = 5 total (was 2) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 27m 11s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16375/dev-support/hive-personality.sh |
| git revision | master / 84f766e |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16375/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16375/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> UDF exception is not throw out
> --
>
> Key: HIVE-19968
> URL: https://issues.apache.org/jira/browse/HIVE-19968
> Project: Hive
>  Issue Type: Bug
>Reporter: sandflee
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-19968.01.patch, HIVE-19968.02.patch, 
> HIVE-19968.03.patch, HIVE-19968.04.patch, HIVE-19968.05.patch, 
> HIVE-19968.06.patch, hive-udf.png
>
>
> UDF init failed and threw an exception, but Hive catches it and does nothing, 
> so the app succeeds but no data is generated.
> {code}
> // GenericUDFReflect.java#evaluate()
> try {
>   o = null;
>   o = ReflectionUtils.newInstance(c, null);
> } catch (Exception e) {
>   // ignored
> }
> {code}





[jira] [Work logged] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?focusedWorklogId=209305=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-209305
 ]

ASF GitHub Bot logged work on HIVE-21338:
-

Author: ASF GitHub Bot
Created on: 07/Mar/19 01:52
Start Date: 07/Mar/19 01:52
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #557: HIVE-21338 
Remove order by and limit for aggregates
URL: https://github.com/apache/hive/pull/557#discussion_r263210480
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
 ##
 @@ -1925,6 +1926,11 @@ public RelNode apply(RelOptCluster cluster, 
RelOptSchema relOptSchema, SchemaPlu
 perfLogger.PerfLogEnd(this.getClass().getName(), PerfLogger.OPTIMIZER, 
"Calcite: Window fixing rule");
   }
 
+  perfLogger.PerfLogBegin(this.getClass().getName(), PerfLogger.OPTIMIZER);
 
 Review comment:
   I believe now that the rule relies on the logic from the metadata provider 
that is more powerful, we could move this before step 6. in 
```applyPreJoinOrderingTransforms```. This means that ```RelFieldTrimmer``` 
will be executed after this rule and thus may eliminate any column(s) that were 
used by SortLimit but are not needed anymore (for instance, the example above).
 



Issue Time Tracking
---

Worklog Id: (was: 209305)
Time Spent: 1.5h  (was: 1h 20m)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch, HIVE-21338.5.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> 

[jira] [Commented] (HIVE-21048) Remove needless org.mortbay.jetty from hadoop exclusions

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786291#comment-16786291
 ] 

Hive QA commented on HIVE-21048:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 10m 53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m 11s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 19m 11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16374/dev-support/hive-personality.sh |
| git revision | master / 6326bee |
| Default Java | 1.8.0_111 |
| modules | C: storage-api common llap-tez ql service jdbc hcatalog hcatalog/core hcatalog/hcatalog-pig-adapter hcatalog/webhcat/svr . itests/qtest-druid U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16374/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Remove needless org.mortbay.jetty from hadoop exclusions
> 
>
> Key: HIVE-21048
> URL: https://issues.apache.org/jira/browse/HIVE-21048
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21048.01.patch, HIVE-21048.02.patch, 
> HIVE-21048.03.patch, HIVE-21048.04.patch, HIVE-21048.05.patch, 
> HIVE-21048.06.patch, HIVE-21048.07.patch, HIVE-21048.08.patch, 
> HIVE-21048.08.patch, HIVE-21048.09.patch, dep.out
>
>
> During HIVE-20638 I found that org.mortbay.jetty exclusions from e.g. hadoop 
> don't take effect, as the actual groupId of jetty is org.eclipse.jetty for 
> most of the current projects. Please find the attachment (example for the hive 
> commons project).
> https://en.wikipedia.org/wiki/Jetty_(web_server)#History





[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786285#comment-16786285
 ] 

Jesus Camacho Rodriguez commented on HIVE-21399:


[~vgarg], yes, that is the check on {{Mode.HASH}}.

[~gopalv], sure, will do.

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786282#comment-16786282
 ] 

Gopal V commented on HIVE-21399:


Testing the patch - the feature is right now invisible.

Can you wrap

{code}
public float getMinReductionHashAggr
{code}

for explain formatted/extended?

I realize that means a lot of qfile updates, but if this is planning wrong, the 
explain should catch it.

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786280#comment-16786280
 ] 

Vineet Garg commented on HIVE-21399:


[~jcamachorodriguez] Shouldn't {{SetHashGroupByMinReduction}} be run only for 
map side aggregation? Maybe I missed something but I don't see the check.

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch
>
>
> Currently, the value is set statically from config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-21048) Remove needless org.mortbay.jetty from hadoop exclusions

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786277#comment-16786277
 ] 

Hive QA commented on HIVE-21048:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961376/HIVE-21048.09.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 15819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_avro] (batchId=275)
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_basic] (batchId=275)
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_csv] (batchId=275)
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_delimited] (batchId=275)
org.apache.hadoop.hive.metastore.TestObjectStore.catalogs (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDatabaseOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDeprecatedConfigIsOverwritten (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropParitionsCleanup (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSQLDropPartitionsCacheCrossSession (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testDirectSqlErrorMetrics (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testEmptyTrustStoreProps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMasterKeyOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testMaxEventResponse (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testPartitionOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testQueryCloseOnError (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testRoleOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testTableOps (batchId=230)
org.apache.hadoop.hive.metastore.TestObjectStore.testUseSSLProperty (batchId=230)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16374/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16374/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16374/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961376 - PreCommit-HIVE-Build

> Remove needless org.mortbay.jetty from hadoop exclusions
> 
>
> Key: HIVE-21048
> URL: https://issues.apache.org/jira/browse/HIVE-21048
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21048.01.patch, HIVE-21048.02.patch, 
> HIVE-21048.03.patch, HIVE-21048.04.patch, HIVE-21048.05.patch, 
> HIVE-21048.06.patch, HIVE-21048.07.patch, HIVE-21048.08.patch, 
> HIVE-21048.08.patch, HIVE-21048.09.patch, dep.out
>
>
> During HIVE-20638 I found that org.mortbay.jetty exclusions from e.g. hadoop 
> don't take effect, as the actual groupId of jetty is org.eclipse.jetty for 
> most of the current projects. Please find the attachment (an example for the 
> hive commons project).
> https://en.wikipedia.org/wiki/Jetty_(web_server)#History



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786269#comment-16786269
 ] 

Gopal V commented on HIVE-21294:


Pushed to master, thanks [~teddy.choi]

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.
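
The reasoning behind the description above can be illustrated with a minimal sketch. This is not the code from the patch: the class name `SingleReducerShuffle` and the method `partitionFor` are hypothetical, and the real operator works on vectorized row batches rather than on single keys. The point is only that with one reducer every hash value maps to partition 0, so the hash never needs to be computed.

```java
import java.util.Arrays;

public class SingleReducerShuffle {

    /**
     * Hypothetical helper: picks the reducer partition for a row key.
     * Not taken from Hive; it only illustrates the short-circuit.
     */
    public static int partitionFor(Object[] key, int numReducers) {
        if (numReducers == 1) {
            // Any hash value modulo 1 is 0, so hashing can be skipped.
            return 0;
        }
        int hash = Arrays.hashCode(key);
        // Clear the sign bit so the modulo result is never negative.
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        Object[] key = {"store_sales", 42L};
        System.out.println(partitionFor(key, 1)); // prints 0
        System.out.println(partitionFor(key, 4)); // some value in [0, 4)
    }
}
```

The early return avoids walking the key columns at all, which is where the saving would come from when the key is wide.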



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-03-06 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21294:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Attachment: (was: HIVE-21338.5.patch)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch, HIVE-21338.5.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}
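
Why the rewrite is safe can be shown with a small sketch. It is an illustration only, not Hive planner code: the class `SingleRowOrderLimit` and the method `orderAndLimit` are hypothetical, and the row value is made up. A global aggregate such as count(*) without GROUP BY yields one row, and sorting and limiting a one-row result (with a limit of at least 1) returns the same row.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SingleRowOrderLimit {

    /** Hypothetical helper: applies ORDER BY (ascending) and LIMIT to a list of counts. */
    public static List<Long> orderAndLimit(List<Long> rows, int limit) {
        return rows.stream().sorted().limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-in for the single row produced by count(*); the value is invented.
        List<Long> aggregateResult = Arrays.asList(12345L);
        // ORDER BY cs LIMIT 100 leaves a one-row result unchanged.
        System.out.println(orderAndLimit(aggregateResult, 100).equals(aggregateResult)); // prints true
    }
}
```

In the plan above, this corresponds to Reducer 3, which exists only to apply the sort and the limit.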



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Status: Open  (was: Patch Available)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch, HIVE-21338.5.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Status: Patch Available  (was: Open)

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch, HIVE-21338.5.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21338:
---
Attachment: HIVE-21338.5.patch

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch, HIVE-21338.5.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> serde: 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
> Fetch Operator
>   limit: 100
>   Processor Tree:
> ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (HIVE-21294) Vectorization: 1-reducer Shuffle can skip the object hash functions

2019-03-06 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21294?focusedWorklogId=209260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-209260
 ]

ASF GitHub Bot logged work on HIVE-21294:
-

Author: ASF GitHub Bot
Created on: 07/Mar/19 00:59
Start Date: 07/Mar/19 00:59
Worklog Time Spent: 10m 
  Work Description: t3rmin4t0r commented on issue #547: HIVE-21294: 
Vectorization: 1-reducer Shuffle can skip the object hash…
URL: https://github.com/apache/hive/pull/547#issuecomment-470339499
 
 
   LGTM - +1
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 209260)
Time Spent: 20m  (was: 10m)

> Vectorization: 1-reducer Shuffle can skip the object hash functions
> ---
>
> Key: HIVE-21294
> URL: https://issues.apache.org/jira/browse/HIVE-21294
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21294.2.patch, HIVE-21294.3.patch, 
> HIVE-21294.4.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> VectorReduceSinkObjectHashOperator can skip the object hashing entirely if 
> the reducer count = 1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21391) LLAP: Pool of column vector buffers can cause memory pressure

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786244#comment-16786244
 ] 

Hive QA commented on HIVE-21391:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961360/HIVE-21391.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16373/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16373/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16373/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961360 - PreCommit-HIVE-Build

> LLAP: Pool of column vector buffers can cause memory pressure
> -
>
> Key: HIVE-21391
> URL: https://issues.apache.org/jira/browse/HIVE-21391
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21391.1.patch
>
>
> When there are too many columns (on the order of hundreds) with decimal or 
> string types, the pool of column vector buffers created here 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/EncodedDataConsumer.java#L59]
>  can cause memory pressure. 
> Example:
> 128 (poolSize) * 300 (numCols) * 1024 (batchSize) * 80 (decimalSize) ~= 3GB
> The pool size keeps increasing when there is a slow consumer but fast LLAP IO 
> (SSDs), leading to GC pressure when all LLAP IO threads read splits from the 
> same table. 
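
The arithmetic in the example above can be checked with a short sketch. The class `ColumnVectorPoolEstimate` and the method `estimateBytes` are hypothetical names used only for illustration; the four factors are the ones quoted in the description, and the result is a rough upper bound that ignores object headers and other per-vector overhead.

```java
import java.util.Locale;

public class ColumnVectorPoolEstimate {

    /** Hypothetical helper: multiplies the four factors from the issue description. */
    public static long estimateBytes(long poolSize, long numCols, long batchSize, long bytesPerValue) {
        return poolSize * numCols * batchSize * bytesPerValue;
    }

    public static void main(String[] args) {
        long bytes = estimateBytes(128, 300, 1024, 80);
        System.out.println(bytes); // prints 3145728000
        // Convert to GiB (2^30 bytes) for comparison with the "~3GB" figure.
        System.out.printf(Locale.ROOT, "%.2f GiB%n", bytes / (1024.0 * 1024 * 1024)); // prints 2.93 GiB
    }
}
```

The product is 3,145,728,000 bytes, about 2.93 GiB, which matches the "~= 3GB" figure in the description.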



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21399:
---
Attachment: HIVE-21399.01.patch

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch
>
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21399:
---
Attachment: (was: HIVE-21399.patch)

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.01.patch
>
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21391) LLAP: Pool of column vector buffers can cause memory pressure

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786227#comment-16786227
 ] 

Hive QA commented on HIVE-21391:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 46s{color} | {color:blue} llap-server in master has 79 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} llap-server: The patch generated 0 new + 74 unchanged - 5 fixed = 74 total (was 79) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 39s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16373/dev-support/hive-personality.sh |
| git revision | master / 6326bee |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: llap-server U: llap-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16373/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> LLAP: Pool of column vector buffers can cause memory pressure
> -
>
> Key: HIVE-21391
> URL: https://issues.apache.org/jira/browse/HIVE-21391
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21391.1.patch
>
>
> When there are too many columns (on the order of hundreds) with decimal or 
> string types, the pool of column vector buffers created here 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/EncodedDataConsumer.java#L59]
>  can cause memory pressure. 
> Example:
> 128 (poolSize) * 300 (numCols) * 1024 (batchSize) * 80 (decimalSize) ~= 3GB
> The pool size keeps increasing when there is a slow consumer but fast LLAP IO 
> (SSDs), leading to GC pressure when all LLAP IO threads read splits from the 
> same table. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21397) BloomFilter for hive Managed [ACID] table does not work as expected

2019-03-06 Thread Vaibhav Gumashta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-21397:

Component/s: Transactions

> BloomFilter for hive Managed [ACID] table does not work as expected
> ---
>
> Key: HIVE-21397
> URL: https://issues.apache.org/jira/browse/HIVE-21397
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, Transactions
>Affects Versions: 3.1.1
>Reporter: vaibhav
>Priority: Blocker
>
> Steps to reproduce this issue: 
> - 
> 1. Create a Hive managed table as below: 
> - 
> {code:java}
> CREATE TABLE `bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  LOCATION 
>    
> 'hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest'
>  
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02', 
>    'transactional'='true', 
>    'transactional_properties'='default', 
>    'transient_lastDdlTime'='1551206683') {code}
> - 
> 2. Insert a few rows. 
> - 
> - 
> 3. Check whether bloom filters are active: [ It does not show bloom filters for 
> Hive managed tables ] 
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  [length: 791] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  {code}
> - 
> On the other hand, for Hive external tables it works: 
> - 
> {code:java}
> CREATE external TABLE `ext_bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02') {code}
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  [length: 755] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  
> Stream: column 1 section BLOOM_FILTER_UTF8 start: 41 length 110 
> Stream: column 2 section BLOOM_FILTER_UTF8 start: 178 length 114 
> Stream: column 4 section BLOOM_FILTER_UTF8 start: 340 length 109 {code}





[jira] [Commented] (HIVE-20546) Upgrade to Apache Druid 0.13.0-incubating

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786213#comment-16786213
 ] 

Hive QA commented on HIVE-20546:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 34s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m  5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 16s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} druid-handler in master has 3 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 23s{color} | {color:blue} itests/qtest-druid in master has 7 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  8m 39s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 16m 18s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 55s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  checkstyle  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16372/dev-support/hive-personality.sh |
| git revision | master / 6326bee |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql druid-handler . itests itests/qtest-druid U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16372/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade to Apache Druid 0.13.0-incubating
> -
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20546.1.patch, HIVE-20546.2.patch, 
> HIVE-20546.3.patch, HIVE-20546.4.patch, HIVE-20546.5.patch, 
> HIVE-20546.6.patch, HIVE-20546.7.patch, HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it 
> will hopefully be the first Apache release of Druid. 





[jira] [Commented] (HIVE-20546) Upgrade to Apache Druid 0.13.0-incubating

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786197#comment-16786197
 ] 

Hive QA commented on HIVE-20546:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961298/HIVE-20546.7.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16372/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16372/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16372/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961298 - PreCommit-HIVE-Build

> Upgrade to Apache Druid 0.13.0-incubating
> -
>
> Key: HIVE-20546
> URL: https://issues.apache.org/jira/browse/HIVE-20546
> Project: Hive
>  Issue Type: Task
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-20546.1.patch, HIVE-20546.2.patch, 
> HIVE-20546.3.patch, HIVE-20546.4.patch, HIVE-20546.5.patch, 
> HIVE-20546.6.patch, HIVE-20546.7.patch, HIVE-20546.patch
>
>
> This task is to upgrade to Druid 0.13.0 when it is released. Note that it 
> will hopefully be the first Apache release of Druid. 





[jira] [Updated] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21399:
---
Attachment: HIVE-21399.patch

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21399.patch
>
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Work started] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-21399 started by Jesus Camacho Rodriguez.
--
> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Updated] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-21399:
---
Status: Patch Available  (was: In Progress)

> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Assigned] (HIVE-21399) Adjust hive.map.aggr.hash.min.reduction statically depending on group by statistics

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-21399:
--


> Adjust hive.map.aggr.hash.min.reduction statically depending on group by 
> statistics
> ---
>
> Key: HIVE-21399
> URL: https://issues.apache.org/jira/browse/HIVE-21399
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Currently, the value is set statically from a config variable. If stats are 
> available, we could try to adjust this value at optimization time to favor 
> turning off hash aggregation earlier.





[jira] [Commented] (HIVE-20889) Support timestamp-micros in AvroSerDe

2019-03-06 Thread Ashish Shrowty (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786189#comment-16786189
 ] 

Ashish Shrowty commented on HIVE-20889:
---

I second this. Microsecond resolution should be supported by the Avro deserializer.

> Support timestamp-micros in AvroSerDe
> -
>
> Key: HIVE-20889
> URL: https://issues.apache.org/jira/browse/HIVE-20889
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 1.1.0
>Reporter: vinisha
>Priority: Major
>
> This change only supports timestamp-millis. Avro 1.8.2 also supports 
> timestamp-micros. 
> [https://avro.apache.org/docs/1.8.2/spec.html#Timestamp+%28microsecond+precision%29]
> timestamp-micros should also be supported in the Hive AvroSerDe because Hive 
> timestamps support nanosecond-level precision.
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps]
> One possibility is to support Avro timestamp-millis and Avro timestamp-micros 
> in serialization. The Avro deserializer can map Hive timestamps to 
> timestamp-micros. 
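
To make the precision argument concrete, here is a minimal sketch of handling a timestamp-micros value, which the Avro specification defines as a long counting microseconds since the Unix epoch. The function names are hypothetical and this is not AvroSerDe code; it only shows that a microsecond value splits cleanly into seconds and nanoseconds, and what is lost if the same value is forced into millisecond precision.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_seconds_nanos(micros):
    """Split microseconds since the epoch into (seconds, nanos).

    divmod floors, so nanos stays non-negative for pre-1970 values.
    """
    seconds, rem_micros = divmod(micros, 1_000_000)
    return seconds, rem_micros * 1_000


def micros_to_datetime(micros):
    """Interpret a timestamp-micros long as a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


def truncate_to_millis(micros):
    """What survives if the value is stored as timestamp-millis instead."""
    return (micros // 1_000) * 1_000


value = 1_551_900_000_123_456  # arbitrary example value
print(micros_to_seconds_nanos(value))
print(micros_to_datetime(value).isoformat())
print(value - truncate_to_millis(value), "microseconds dropped at millis precision")
```

Because a nanosecond-precision timestamp can hold any microsecond value exactly, reading timestamp-micros loses nothing, whereas writing a timestamp with sub-millisecond digits as timestamp-millis does.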





[jira] [Commented] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786165#comment-16786165
 ] 

Gopal V commented on HIVE-21375:


Yeah, it only fixed NiFi for StreamingV2

> Closing TransactionBatch closes FileSystem for other batches
> 
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.
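
The side effect described above can be illustrated with a toy model of a shared handle cache. Everything in this sketch is hypothetical (the class names, the cache key, the user and URI strings); it does not call Hadoop and only mimics the behaviour the reporter describes: handles are shared per user, closing all handles for that user breaks every other holder, and disabling the cache avoids the sharing at the cost of creating a new handle each time.

```python
class ToyFileSystem:
    """Stand-in for a filesystem handle (not the Hadoop class)."""

    def __init__(self, uri, user):
        self.uri, self.user, self.closed = uri, user, False

    def write(self, data):
        if self.closed:
            raise IOError("Filesystem closed")
        return len(data)

    def close(self):
        self.closed = True


class ToyFsCache:
    """Hands out one shared handle per (uri, user) unless caching is disabled."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self._cache = {}

    def get(self, uri, user):
        if not self.enabled:
            # Models the workaround: every caller gets a private handle.
            return ToyFileSystem(uri, user)
        key = (uri, user)
        if key not in self._cache:
            self._cache[key] = ToyFileSystem(uri, user)
        return self._cache[key]

    def close_all_for_user(self, user):
        # Models the "close everything for this user" call made on endpoint close.
        for key in [k for k in self._cache if k[1] == user]:
            self._cache.pop(key).close()


def second_batch_survives(cache):
    """Open two batches, close the first endpoint, then write with the second."""
    batch_a = cache.get("hdfs://namenode:8020", "streaming-user")
    batch_b = cache.get("hdfs://namenode:8020", "streaming-user")
    batch_a.close()
    cache.close_all_for_user("streaming-user")
    try:
        batch_b.write(b"row")
        return True
    except IOError:
        return False


print("cache enabled: ", second_batch_survives(ToyFsCache(enabled=True)))
print("cache disabled:", second_batch_survives(ToyFsCache(enabled=False)))
```

In this model the second batch fails when the cache is enabled, because both batches hold the same handle, and survives when the cache is disabled.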





[jira] [Commented] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches

2019-03-06 Thread Shawn Weeks (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786164#comment-16786164
 ] 

Shawn Weeks commented on HIVE-21375:


HIVE-19772 doesn't appear to modify the V1 Streaming API. I may be reading the 
patch completely wrong though.

> Closing TransactionBatch closes FileSystem for other batches
> 
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.





[jira] [Commented] (HIVE-21338) Remove order by and limit for aggregates

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786160#comment-16786160
 ] 

Hive QA commented on HIVE-21338:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961291/HIVE-21338.4.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16371/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16371/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16371/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL 
https://issues.apache.org/jira/secure/attachment/12961291/HIVE-21338.4.patch 
was found in seen patch url's cache and a test was probably run already on it. 
Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961291 - PreCommit-HIVE-Build

> Remove order by and limit for aggregates
> 
>
> Key: HIVE-21338
> URL: https://issues.apache.org/jira/browse/HIVE-21338
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21338.1.patch, HIVE-21338.2.patch, 
> HIVE-21338.3.patch, HIVE-21338.4.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If a query is guaranteed to produce at most one row, LIMIT and ORDER BY can 
> be removed. This saves an unnecessary vertex for LIMIT/ORDER BY.
> {code:sql}
> explain select count(*) cs from store_sales where ss_ext_sales_price > 100.00 
> order by cs limit 100
> {code}
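
The reasoning behind the rewrite can be sketched on a toy query representation. This is not the Hive planner: `ToyQuery` and both functions are hypothetical names used only for illustration. The assumption is that an aggregate query with no GROUP BY keys returns exactly one row, so sorting it is a no-op and a LIMIT only matters when it is 0.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyQuery:
    """Hypothetical, heavily simplified query description."""
    aggregates: List[str]
    group_by: List[str] = field(default_factory=list)
    order_by: List[str] = field(default_factory=list)
    limit: Optional[int] = None


def produces_at_most_one_row(q):
    # Aggregates with no GROUP BY keys collapse the whole input into one row.
    return bool(q.aggregates) and not q.group_by


def simplify(q):
    """Drop ORDER BY and LIMIT when they cannot change the result."""
    if not produces_at_most_one_row(q):
        return q
    # LIMIT 0 still changes the result (no rows), so it is kept.
    keep_limit = q.limit if q.limit == 0 else None
    return ToyQuery(q.aggregates, q.group_by, [], keep_limit)


example = ToyQuery(aggregates=["count(*)"], order_by=["cs"], limit=100)
print(simplify(example))
```

Applied to the query above, the sketch drops both clauses; a query with GROUP BY keys is returned unchanged.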
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Tez
>   DagId: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Edges:
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
>   DagName: vgarg_20190227131959_2914830f-eab6-425d-b9f0-b8cb56f8a1e9:4
>   Vertices:
> Map 1
> Map Operator Tree:
> TableScan
>   alias: store_sales
>   filterExpr: (ss_ext_sales_price > 100) (type: boolean)
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Filter Operator
> predicate: (ss_ext_sales_price > 100) (type: boolean)
> Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
> Select Operator
>   Statistics: Num rows: 1 Data size: 112 Basic stats: 
> COMPLETE Column stats: NONE
>   Group By Operator
> aggregations: count()
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> Reduce Output Operator
>   sort order:
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   value expressions: _col0 (type: bigint)
> Execution mode: vectorized
> Reducer 2
> Execution mode: vectorized
> Reduce Operator Tree:
>   Group By Operator
> aggregations: count(VALUE._col0)
> mode: mergepartial
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Reduce Output Operator
>   key expressions: _col0 (type: bigint)
>   sort order: +
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   TopN Hash Memory Usage: 0.1
> Reducer 3
> Execution mode: vectorized
> Reduce Operator Tree:
>   Select Operator
> expressions: KEY.reducesinkkey0 (type: bigint)
> outputColumnNames: _col0
> Statistics: Num rows: 1 Data size: 120 Basic stats: COMPLETE 
> Column stats: NONE
> Limit
>   Number of rows: 100
>   Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
>   File Output Operator
> compressed: false
> Statistics: Num rows: 1 Data size: 120 Basic stats: 
> COMPLETE Column stats: NONE
> table:
> input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
> 

[jira] [Commented] (HIVE-21325) Hive external table replication failed with Permission denied issue.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786159#comment-16786159
 ] 

Hive QA commented on HIVE-21325:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961279/HIVE-21325.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 15819 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapReplLoadRetryAfterFailureForTablesAndConstraints
 (batchId=252)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapReplLoadRetryAfterFailureForTablesAndConstraints
 (batchId=247)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[1]
 (batchId=209)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[2]
 (batchId=209)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16370/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16370/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16370/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961279 - PreCommit-HIVE-Build

> Hive external table replication failed with Permission denied issue.
> 
>
> Key: HIVE-21325
> URL: https://issues.apache.org/jira/browse/HIVE-21325
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21325.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> During external table replication, the file copy is done in parallel with the 
> metadata replication. If the file copy task creates the directory with doAs 
> set to true, it will create the directory with permission set to the user 
> running the repl command. In that case, the metadata task may fail while 
> creating the table, as the hive user might not have access to the created 
> directory.
> The fix should be
>  # While creating the directory, if SQL-based authentication is enabled, then 
> disable storage-based authentication for the hive user.
>  # Currently the created directory has the login user's access; it should 
> retain the source cluster's owner, group and permission.
>  # For external table replication, don't create the directory during create 
> table and add partition.
>  





[jira] [Commented] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786157#comment-16786157
 ] 

Gopal V commented on HIVE-21375:


[~Absolutesantaja]: I think this got fixed in HIVE-19772

> Closing TransactionBatch closes FileSystem for other batches
> 
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.





[jira] [Updated] (HIVE-21182) Skip setting up hive scratch dir during planning

2019-03-06 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-21182:
---
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

> Skip setting up hive scratch dir during planning
> 
>
> Key: HIVE-21182
> URL: https://issues.apache.org/jira/browse/HIVE-21182
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21182.1.patch, HIVE-21182.2.patch, 
> HIVE-21182.3.patch
>
>
> During the metadata gathering phase, Hive creates a staging/scratch dir which 
> is further used by the FS op (the FS op sets up a staging dir within this dir 
> for tasks to write to).
> Since the FS op does mkdirs to set up the staging dir, we can skip creating 
> the scratch dir during the metadata gathering phase. The FS op will take care 
> of setting up all the dirs.
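
The argument rests on mkdirs creating any missing parent directories. A minimal local-filesystem sketch of that behaviour follows; the directory names and the helper are hypothetical, and `os.makedirs` stands in for the filesystem's mkdirs call.

```python
import os
import tempfile


def setup_staging_dir(scratch_dir, staging_name):
    """Create the staging dir, along with any missing parents, in one call."""
    staging = os.path.join(scratch_dir, staging_name)
    os.makedirs(staging, exist_ok=True)
    return staging


with tempfile.TemporaryDirectory() as base:
    # Hypothetical layout: the scratch dir is deliberately NOT created up front.
    scratch = os.path.join(base, "scratch", "session-1")
    print("scratch exists before:", os.path.exists(scratch))
    staging = setup_staging_dir(scratch, ".hive-staging_1")
    print("scratch exists after: ", os.path.isdir(scratch))
    print("staging exists after: ", os.path.isdir(staging))
```

Since the later call creates the parent as a side effect, creating the scratch dir earlier is redundant work in this model.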





[jira] [Comment Edited] (HIVE-21380) No documented command to list queries in a Hive Workload resource pool

2019-03-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786144#comment-16786144
 ] 

Prasanth Jayachandran edited comment on HIVE-21380 at 3/6/19 9:25 PM:
--

There is an MX bean that is exposed via the metrics endpoint that prints the 
sessions running in the pool. It should be easy to enhance that to add 
session.getQueryId() here 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L753]
 


was (Author: prasanth_j):
The is mx bean that is exposed via metrics endpoint that prints the sessions 
running in the pool. It should be easy to enhance that to add 
session.getQueryId() here 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L753]
 

> No documented command to list queries in a Hive Workload resource pool
> --
>
> Key: HIVE-21380
> URL: https://issues.apache.org/jira/browse/HIVE-21380
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: Kerberos, Ranger
>Reporter: Michael DeGuzis
>Priority: Major
>
> We have searched all over 
> https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/hive-workload/content/hive_workload_management_entity_data_in_sys.html
>  and even the Apache hive code base. How do you monitor what queries are 
> running in what resource pool???





[jira] [Commented] (HIVE-21380) No documented command to list queries in a Hive Workload resource pool

2019-03-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786144#comment-16786144
 ] 

Prasanth Jayachandran commented on HIVE-21380:
--

There is an MX bean that is exposed via the metrics endpoint that prints the sessions 
running in the pool. It should be easy to enhance that to add 
session.getQueryId() here 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L753]
 

> No documented command to list queries in a Hive Workload resource pool
> --
>
> Key: HIVE-21380
> URL: https://issues.apache.org/jira/browse/HIVE-21380
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: Kerberos, Ranger
>Reporter: Michael DeGuzis
>Priority: Major
>
> We have searched all over 
> https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/hive-workload/content/hive_workload_management_entity_data_in_sys.html
>  and even the Apache hive code base. How do you monitor what queries are 
> running in what resource pool???





[jira] [Commented] (HIVE-21325) Hive external table replication failed with Permission denied issue.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786137#comment-16786137
 ] 

Hive QA commented on HIVE-21325:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 44s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 32s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  5s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 17s{color} | {color:blue} standalone-metastore/metastore-server in master has 179 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 21s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 11s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16370/dev-support/hive-personality.sh
 |
| git revision | master / 881080f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: standalone-metastore/metastore-server ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16370/yetus.txt |
| Powered by | Apache Yetushttp://yetus.apache.org |


This message was automatically generated.



> Hive external table replication failed with Permission denied issue.
> 
>
> Key: HIVE-21325
> URL: https://issues.apache.org/jira/browse/HIVE-21325
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-21325.01.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> During external table replication, the file copy is done in parallel with the 
> metadata replication. If the file copy task creates the directory with doAs 
> set to true, it will create the directory with permission set to the user 
> running the repl command. In that case the metadata task may fail while 
> creating the table, as the hive user might not have access to the created 
> directory.
> The fix should be:
>  # While creating the directory, if SQL-based authentication is enabled, then 
> disable storage-based authentication for the hive user.
>  # Currently the created 

[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-03-06 Thread Gopal V (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786131#comment-16786131
 ] 

Gopal V commented on HIVE-20656:


{code}
Client Execution succeeded but contained differences (error code = 1) after executing vector_groupby_reduce.q 
889c889
< 1 85411   816 58.285714285714285  -5080.1699  -362.86928571428564 621.35  44.382142857142857143
---
> 1 85411   816 58.285714285714285  -5080.17  -362.8692857142857  621.35  44.382142857142857143
{code}

-5080.1699 vs -5080.17

Float jumps from {{-5080.1694}} -> {{-5080.17}} in 1 bit.
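
The one-bit claim can be checked with a small standalone program. This is only a sketch, not part of the Hive test suite: the class name FloatUlpDemo and its helper methods are made up for illustration, and it assumes the value is held as an IEEE 754 single-precision float.

```java
final class FloatUlpDemo {
    /** Spacing between adjacent float values at the magnitude of x. */
    static float ulpAt(float x) {
        return Math.ulp(x);
    }

    /**
     * True when a and b are neighbouring floats, i.e. their bit patterns
     * differ by exactly 1. Only meaningful for finite values of the same sign.
     */
    static boolean adjacent(float a, float b) {
        return Math.abs(Float.floatToIntBits(a) - Float.floatToIntBits(b)) == 1;
    }

    public static void main(String[] args) {
        float after = -5080.17f;    // value printed by the new run
        float before = -5080.1694f; // value quoted in the comment above
        System.out.println("adjacent floats: " + adjacent(before, after));
        System.out.println("ulp near 5080:   " + ulpAt(after));
    }
}
```

Near 5080 the spacing between floats is 2^-11 (about 0.00049), so a change in the last mantissa bit is already visible in the fourth decimal place.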

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch, HIVE-20656.2.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses that fail to reclaim memory.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
>     "Portion of total memory to be used by map-side group aggregation hash table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", (float) 0.9,
>     "The max memory to be used by map-side group aggregation hash table.\n" +
>     "If the memory usage is higher than this number, force to flush data"),
> {code}
>  
> We can be a little more conservative with these configs to avoid getting into 
> GC pauses. 





[jira] [Commented] (HIVE-21391) LLAP: Pool of column vector buffers can cause memory pressure

2019-03-06 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786124#comment-16786124
 ] 

Prasanth Jayachandran commented on HIVE-21391:
--

[~gopalv] can you please review this patch? The fixed object pool of CVBs, 
which held strong references, now uses weak references. 

> LLAP: Pool of column vector buffers can cause memory pressure
> -
>
> Key: HIVE-21391
> URL: https://issues.apache.org/jira/browse/HIVE-21391
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21391.1.patch
>
>
> When there are many columns (on the order of hundreds) with decimal or string 
> types, the pool of column vector buffers created here 
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/decode/EncodedDataConsumer.java#L59]
>  can cause memory pressure. 
> Example:
> 128 (poolSize) * 300 (numCols) * 1024 (batchSize) * 80 (decimalSize) ~= 3GB
> The pool size keeps increasing when there is a slow consumer but fast LLAP IO 
> (SSDs), leading to GC pressure when all LLAP IO threads read splits from the 
> same table. 
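
The estimate in the description can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: CvbPoolEstimate and poolBytes are hypothetical names, and the four factors are the example figures from the description rather than values read from an LLAP configuration.

```java
final class CvbPoolEstimate {
    /**
     * Worst-case bytes retained by the pool, assuming every pool slot holds
     * a full batch for every column at the given per-value size.
     */
    static long poolBytes(int poolSize, int numCols, int batchSize, int bytesPerValue) {
        // Widen to long first: the product overflows a 32-bit int.
        return (long) poolSize * numCols * batchSize * bytesPerValue;
    }

    public static void main(String[] args) {
        long bytes = poolBytes(128, 300, 1024, 80);
        double gib = bytes / (double) (1L << 30);
        System.out.printf("%d bytes = %.2f GiB%n", bytes, gib);
    }
}
```

With the example figures this comes to 3,145,728,000 bytes, about 2.93 GiB, in line with the ~3GB quoted above.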





[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786109#comment-16786109
 ] 

Hive QA commented on HIVE-20656:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961276/HIVE-20656.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_groupby_reduce] (batchId=61)
org.apache.hive.jdbc.TestActivePassiveHA.testActivePassiveHA (batchId=261)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16368/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16368/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16368/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961276 - PreCommit-HIVE-Build

> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch, HIVE-20656.2.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses that fail to reclaim memory.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
>     "Portion of total memory to be used by map-side group aggregation hash table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", (float) 0.9,
>     "The max memory to be used by map-side group aggregation hash table.\n" +
>     "If the memory usage is higher than this number, force to flush data"),
> {code}
>  
> We can be a little more conservative with these configs to avoid getting into 
> GC pauses. 





[jira] [Commented] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786111#comment-16786111
 ] 

Hive QA commented on HIVE-20848:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961275/HIVE-20848.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16369/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16369/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16369/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Tests exited with: Exception: Patch URL https://issues.apache.org/jira/secure/attachment/12961275/HIVE-20848.01.patch was found in seen patch url's cache and a test was probably run already on it. Aborting...
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961275 - PreCommit-HIVE-Build

> After setting UpdateInputAccessTimeHook query fail with Table Not Found.
> 
>
> Key: HIVE-20848
> URL: https://issues.apache.org/jira/browse/HIVE-20848
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20848.01.patch, HIVE-20848.patch
>
>
> {code}
>  select from_unixtime(1540495168); 
>  set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
>  select from_unixtime(1540495168); 
> {code}
> The second select fails with the following exception:
> {code}
> ERROR ql.Driver: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found _dummy_table)
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found _dummy_table
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
>     at org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
>     at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
>     at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
>     at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
>     at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}





[jira] [Commented] (HIVE-20656) Sensible defaults: Map aggregation memory configs are too aggressive

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786084#comment-16786084
 ] 

Hive QA commented on HIVE-20656:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 37s{color} | {color:blue} common in master has 63 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 19s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m  1s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16368/dev-support/hive-personality.sh |
| git revision | master / 881080f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16368/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Sensible defaults: Map aggregation memory configs are too aggressive
> 
>
> Key: HIVE-20656
> URL: https://issues.apache.org/jira/browse/HIVE-20656
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-20656.1.patch, HIVE-20656.2.patch
>
>
> The defaults for the following configs seem to be too aggressive. In Java 
> this can easily lead to several full GC pauses that fail to reclaim memory.
> {code:java}
> HIVEMAPAGGRHASHMEMORY("hive.map.aggr.hash.percentmemory", (float) 0.99,
>     "Portion of total memory to be used by map-side group aggregation hash table"),
> HIVEMAPAGGRMEMORYTHRESHOLD("hive.map.aggr.hash.force.flush.memory.threshold", (float) 0.9,
>     "The max memory to be used by map-side group aggregation hash table.\n" +
>     "If the memory usage is higher than this number, force to flush data"),
> {code}
>  
> We can be a little more conservative with these configs to avoid getting into 
> GC pauses. 




[jira] [Commented] (HIVE-21382) Group by keys reduction optimization - keys are not reduced in query23

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786050#comment-16786050
 ] 

Hive QA commented on HIVE-21382:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961272/HIVE-21382.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 15819 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query39] (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[cbo_query64] (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query39] (batchId=275)
org.apache.hadoop.hive.cli.TestTezPerfConstraintsCliDriver.testCliDriver[query64] (batchId=275)
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable[1] (batchId=209)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/16367/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16367/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16367/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961272 - PreCommit-HIVE-Build

> Group by keys reduction optimization - keys are not reduced in query23
> --
>
> Key: HIVE-21382
> URL: https://issues.apache.org/jira/browse/HIVE-21382
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21382.1.patch
>
>
> {code:sql}
> explain cbo with frequent_ss_items as 
>  (select substr(i_item_desc,1,30) itemdesc,i_item_sk item_sk,d_date solddate,count(*) cnt
>   from store_sales
>       ,date_dim 
>       ,item
>   where ss_sold_date_sk = d_date_sk
>     and ss_item_sk = i_item_sk 
>     and d_year in (1999,1999+1,1999+2,1999+3)
>   group by substr(i_item_desc,1,30),i_item_sk,d_date
>   having count(*) >4)
> select  sum(sales)
>  from ((select cs_quantity*cs_list_price sales
>        from catalog_sales
>            ,date_dim 
>        where d_year = 1999 
>          and d_moy = 1 
>          and cs_sold_date_sk = d_date_sk 
>          and cs_item_sk in (select item_sk from frequent_ss_items))) subq 
> limit 100;
> {code}
> {code:sql}
> HiveSortLimit(fetch=[100])
>   HiveProject($f0=[$0])
>     HiveAggregate(group=[{}], agg#0=[sum($0)])
>       HiveProject(sales=[*(CAST($2):DECIMAL(10, 0), $3)])
>         HiveSemiJoin(condition=[=($1, $5)], joinType=[inner])
>           HiveJoin(condition=[=($0, $4)], joinType=[inner], algorithm=[none], cost=[{2.0 rows, 0.0 cpu, 0.0 io}])
>             HiveProject(cs_sold_date_sk=[$0], cs_item_sk=[$15], cs_quantity=[$18], cs_list_price=[$20])
>               HiveFilter(condition=[IS NOT NULL($0)])
>                 HiveTableScan(table=[[perf_constraints, catalog_sales]], table:alias=[catalog_sales])
>             HiveProject(d_date_sk=[$0])
>               HiveFilter(condition=[AND(=($6, 1999), =($8, 1))])
>                 HiveTableScan(table=[[perf_constraints, date_dim]], table:alias=[date_dim])
>           HiveProject(i_item_sk=[$1])
>             HiveFilter(condition=[>($3, 4)])
>               HiveProject(substr=[$2], i_item_sk=[$1], d_date=[$0], $f3=[$3])
>                 HiveAggregate(group=[{3, 4, 5}], agg#0=[count()])
>                   HiveJoin(condition=[=($1, $4)], joinType=[inner], algorithm=[none], cost=[{2.0 rows, 0.0 cpu, 0.0 io}])
>                     HiveJoin(condition=[=($0, $2)], joinType=[inner], algorithm=[none], cost=[{2.0 rows, 0.0 cpu, 0.0 io}])
>                       HiveProject(ss_sold_date_sk=[$0], ss_item_sk=[$2])
>                         HiveFilter(condition=[IS NOT NULL($0)])
>                           HiveTableScan(table=[[perf_constraints, store_sales]], table:alias=[store_sales])
>                       HiveProject(d_date_sk=[$0], d_date=[$2])
>                         HiveFilter(condition=[IN($6, 1999, 2000, 2001, 2002)])
>                           HiveTableScan(table=[[perf_constraints, date_dim]], table:alias=[date_dim])
>                     HiveProject(i_item_sk=[$0], substr=[substr($4, 1, 30)])
>                       HiveTableScan(table=[[perf_constraints, item]], table:alias=[item])
> {code}
> Right side of HiveSemiJoin has an aggregate which 

[jira] [Commented] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-03-06 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786048#comment-16786048
 ] 

Yongzhi Chen commented on HIVE-21337:
-

+1 make sure all the tests pass. 

> HMS Metadata migration from Postgres/Derby to other DBs fail
> 
>
> Key: HIVE-21337
> URL: https://issues.apache.org/jira/browse/HIVE-21337
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-21337.2.patch, HIVE-21337.patch
>
>
> A customer was recently migrating the HMS metastore from Postgres to Oracle. 
> During import of the [exported] data from the Postgres metastore, failures 
> were seen because COLUMNS_V2.COMMENT is 4000 bytes long there, whereas Oracle 
> and the other schemas define it as 256 bytes.
> This inconsistency in the schema makes the migration cumbersome and manual. 
> This jira makes the column length consistent across all databases.





[jira] [Updated] (HIVE-21389) Hive distribution miss javax.ws.rs-api.jar after HIVE-21247

2019-03-06 Thread Daniel Dai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-21389:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Patch pushed to master. Thanks Thejas for review!

> Hive distribution miss javax.ws.rs-api.jar after HIVE-21247
> ---
>
> Key: HIVE-21389
> URL: https://issues.apache.org/jira/browse/HIVE-21389
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21389.1.patch
>
>






[jira] [Commented] (HIVE-21385) Allow disabling pushdown of non-splittable computation to JDBC sources

2019-03-06 Thread Daniel Dai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786031#comment-16786031
 ] 

Daniel Dai commented on HIVE-21385:
---

How about making this a list config parameter? By default it can contain 
join/union/aggregation/sort, and the user can remove anything they don't want 
pushed down.
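
As a rough illustration of the list-valued parameter suggested here, the sketch below parses a comma-separated value into the set of operators allowed to be pushed down. Everything in it is an assumption made for illustration: the class PushdownConfigSketch, the Op enum and the default string are hypothetical and do not correspond to an existing Hive configuration property.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

final class PushdownConfigSketch {
    /** Hypothetical operator kinds, named after the list in the comment above. */
    enum Op { JOIN, UNION, AGGREGATION, SORT }

    /** Hypothetical default value: every listed operator may be pushed down. */
    static final String DEFAULT = "join,union,aggregation,sort";

    /** Parses a comma-separated list; an unknown name fails with IllegalArgumentException. */
    static Set<Op> parse(String value) {
        EnumSet<Op> ops = EnumSet.noneOf(Op.class);
        for (String token : value.split(",")) {
            String name = token.trim();
            if (!name.isEmpty()) {
                ops.add(Op.valueOf(name.toUpperCase(Locale.ROOT)));
            }
        }
        return ops;
    }

    /** An operator is pushed down only if the user kept it in the list. */
    static boolean isPushdownAllowed(Set<Op> enabled, Op op) {
        return enabled.contains(op);
    }

    public static void main(String[] args) {
        // A user who removed "union" and "aggregation" from the default list.
        Set<Op> enabled = parse("join,sort");
        System.out.println(isPushdownAllowed(enabled, Op.JOIN));
        System.out.println(isPushdownAllowed(enabled, Op.UNION));
    }
}
```

An empty value disables pushdown of all four operator kinds, which keeps a single setting sufficient for both the "all" and "none" cases.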

> Allow disabling pushdown of non-splittable computation to JDBC sources
> --
>
> Key: HIVE-21385
> URL: https://issues.apache.org/jira/browse/HIVE-21385
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21385.01.patch, HIVE-21385.patch
>
>
> Until pushdown is a cost-based decision, we should be able to enable / disable 
> pushdown of operators that prevent reading results from the JDBC connection 
> in parallel.





[jira] [Updated] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-03-06 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-21293:

Attachment: (was: HIVE-21293.02.patch)

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch, HIVE-21293.02.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Updated] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-03-06 Thread Laszlo Bodor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laszlo Bodor updated HIVE-21293:

Attachment: HIVE-21293.02.patch

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch, HIVE-21293.02.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Commented] (HIVE-21382) Group by keys reduction optimization - keys are not reduced in query23

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785996#comment-16785996
 ] 

Hive QA commented on HIVE-21382:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 33s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 41s{color} | {color:red} ql: The patch generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 37s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-16367/dev-support/hive-personality.sh |
| git revision | master / 0413fec |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-16367/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-16367/yetus.txt |
| Powered by | Apache Yetus  http://yetus.apache.org |


This message was automatically generated.



> Group by keys reduction optimization - keys are not reduced in query23
> --
>
> Key: HIVE-21382
> URL: https://issues.apache.org/jira/browse/HIVE-21382
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
> Attachments: HIVE-21382.1.patch
>
>
> {code:sql}
> explain cbo with frequent_ss_items as 
>  (select substr(i_item_desc,1,30) itemdesc,i_item_sk item_sk,d_date solddate,count(*) cnt
>   from store_sales
>       ,date_dim 
>       ,item
>   where ss_sold_date_sk = d_date_sk
>     and ss_item_sk = i_item_sk 
>     and d_year in (1999,1999+1,1999+2,1999+3)
>   group by substr(i_item_desc,1,30),i_item_sk,d_date
>   having count(*) >4)
> select  sum(sales)
>  from ((select cs_quantity*cs_list_price sales
>        from catalog_sales
>            ,date_dim 
>        where d_year = 1999 
>          and d_moy = 1 
>          and cs_sold_date_sk = d_date_sk 
>          and cs_item_sk in (select item_sk from frequent_ss_items))) subq 
> limit 100;
> {code}
> {code:sql}
> HiveSortLimit(fetch=[100])
>   HiveProject($f0=[$0])
>     HiveAggregate(group=[{}], agg#0=[sum($0)])
>       HiveProject(sales=[*(CAST($2):DECIMAL(10, 0), $3)])
>         HiveSemiJoin(condition=[=($1, $5)], joinType=[inner])
>           HiveJoin(condition=[=($0, $4)], joinType=[inner], algorithm=[none], cost=[{2.0 rows, 0.0 cpu, 0.0 io}])
> 

[jira] [Commented] (HIVE-21376) Incompatible change in Hive bucket computation

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785993#comment-16785993
 ] 

Jesus Camacho Rodriguez commented on HIVE-21376:


[~ashutoshc], could you take a look? I will create a follow-up to add tests 
that verify that type hash codes for bucketing do not change between releases 
so we do not hit this issue again in future.

> Incompatible change in Hive bucket computation
> --
>
> Key: HIVE-21376
> URL: https://issues.apache.org/jira/browse/HIVE-21376
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: David Phillips
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21376.01.patch, HIVE-21376.patch
>
>
> HIVE-20007 seems to have inadvertently changed the bucket hash code 
> computation via {{ObjectInspectorUtils.getBucketHashCodeOld()}} for the 
> {{DATE}} and {{TIMESTAMP}} data types.
> {{DATE}} was previously computed using {{DateWritable}}, which uses 
> {{daysSinceEpoch}} as the hash code. It is now computed using 
> {{DateWritableV2}}, which uses the hash code of {{java.time.LocalDate}} 
> (which is not days since epoch).
> {{TIMESTAMP}} was previously computed using {{TimestampWritable}} and now uses 
> {{TimestampWritableV2}}. They ostensibly use the same hash code computation, 
> but there are two important differences:
>  # {{TimestampWritable}} rounds the number of milliseconds into the seconds 
> portion of the computation, but {{TimestampWritableV2}} does not.
>  # {{TimestampWritable}} gets the epoch time from {{java.sql.Timestamp}}, 
> which returns it relative to the JVM time zone, not UTC. 
> {{TimestampWritableV2}} uses a {{LocalDateTime}} relative to UTC.
> I was unable to get Hive 3.1 running in order to verify if this actually 
> causes data to be read or written incorrectly (there may be code above this 
> library method which makes things work correctly). However, if my 
> understanding is correct, this means Hive 3.1 is both forwards and backwards 
> incompatible with bucketed tables using either of these data types. It also 
> indicates that Hive needs tests to verify that the hash code does not change 
> between releases.
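
The {{DATE}} difference described above can be reproduced with the JDK alone. The following is a minimal sketch, not Hive code: the class and method names are hypothetical, the two hash functions only mimic what the issue attributes to {{DateWritable}} (days since epoch) and {{DateWritableV2}} ({{java.time.LocalDate}} hash code), and the bucket formula (non-negative hash modulo bucket count) is an assumption made for illustration.

```java
import java.time.LocalDate;

public class BucketHashSketch {
    // Hash in the style the issue attributes to DateWritable: days since epoch.
    static int daysSinceEpochHash(LocalDate date) {
        return (int) date.toEpochDay();
    }

    // Hash in the style the issue attributes to DateWritableV2: LocalDate.hashCode().
    static int localDateHash(LocalDate date) {
        return date.hashCode();
    }

    // Assumed bucket assignment: non-negative hash modulo the bucket count.
    static int bucketFor(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2019, 3, 6);
        int oldHash = daysSinceEpochHash(d);
        int newHash = localDateHash(d);
        // The same date lands in different buckets under the two hash functions.
        System.out.println("days-since-epoch hash: " + oldHash + " -> bucket " + bucketFor(oldHash, 16));
        System.out.println("LocalDate.hashCode():  " + newHash + " -> bucket " + bucketFor(newHash, 16));
    }
}
```

If the two hash functions disagree for a value, a reader that locates rows by bucket number would look in a bucket the writer never used, which is the incompatibility the reporter suspects.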





[jira] [Updated] (HIVE-21397) BloomFilter for hive Managed [ACID] table does not work as expected

2019-03-06 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-21397:
---
Summary: BloomFilter for hive Managed [ACID] table does not work as 
expected  (was: BoomFilter for hive Managed [ACID] table does not work as 
expected)

> BloomFilter for hive Managed [ACID] table does not work as expected
> ---
>
> Key: HIVE-21397
> URL: https://issues.apache.org/jira/browse/HIVE-21397
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Affects Versions: 3.1.1
>Reporter: vaibhav
>Priority: Blocker
>
> Steps to Reproduce this issue : 
> - 
> 1. Create a Hive managed table as below: 
> - 
> {code:java}
> CREATE TABLE `bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  LOCATION 
>    
> 'hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest'
>  
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02', 
>    'transactional'='true', 
>    'transactional_properties'='default', 
>    'transient_lastDdlTime'='1551206683') {code}
> - 
> 2. Insert a few rows. 
> - 
> - 
> 3. Check whether bloom filters are active: [ It does not show bloom filters for 
> hive managed tables ] 
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  [length: 791] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/managed/hive/bloomTest/delta_001_001_/bucket_0
>  {code}
> - 
> On the other hand, for Hive external tables it works: 
> - 
> {code:java}
> CREATE external TABLE `ext_bloomTest`( 
>    `msisdn` string, 
>    `imsi` varchar(20), 
>    `imei` bigint, 
>    `cell_id` bigint) 
>  ROW FORMAT SERDE 
>    'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
>  STORED AS INPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
>  OUTPUTFORMAT 
>    'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' 
>  TBLPROPERTIES ( 
>    'bucketing_version'='2', 
>    'orc.bloom.filter.columns'='msisdn,cell_id,imsi', 
>    'orc.bloom.filter.fpp'='0.02') {code}
> - 
> {code:java}
> [hive@c1162-node2 root]$ hive --orcfiledump 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  | grep -i bloom 
> SLF4J: Class path contains multiple SLF4J bindings. 
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>  
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation. 
> SLF4J: Actual binding is of type 
> [org.apache.logging.slf4j.Log4jLoggerFactory] 
> Processing data file 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  [length: 755] 
> Structure for 
> hdfs://c1162-node2.squadron-labs.com:8020/warehouse/tablespace/external/hive/ext_bloomTest/00_0
>  
> Stream: column 1 section BLOOM_FILTER_UTF8 start: 41 length 110 
> Stream: column 2 section BLOOM_FILTER_UTF8 start: 178 length 114 
> Stream: column 4 section BLOOM_FILTER_UTF8 start: 340 length 109 {code}





[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785986#comment-16785986
 ] 

Jesus Camacho Rodriguez commented on HIVE-21293:


Thanks for checking [~abstractdog]. My bad, I did not realize TRUE and FALSE 
are already reserved. I believe the latest patch, which removes it from the 
boolean values, is fine then. Since we will not be able to use it as a constant 
value, you can create a follow-up JIRA and maybe we can pick it up in the future.

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch, HIVE-21293.02.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.





[jira] [Commented] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785966#comment-16785966
 ] 

Hive QA commented on HIVE-21339:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961270/HIVE-21339.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15768 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.beeline.TestBeeLineWithArgs.org.apache.hive.beeline.TestBeeLineWithArgs
 (batchId=257)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16366/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16366/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16366/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961270 - PreCommit-HIVE-Build

> LLAP: Cache hit also initializes an FS object 
> --
>
> Key: HIVE-21339
> URL: https://issues.apache.org/jira/browse/HIVE-21339
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 4.0.0
>Reporter: Gopal V
>Assignee: Prasanth Jayachandran
>Priority: Major
> Attachments: HIVE-21339.1.patch, HIVE-21339.2.patch, 
> HIVE-21339.3.patch, llap-cache-fs-get.png, llap-query7-cached.svg
>
>
> https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java#L214
> {code}
> // 1. Get file metadata from cache, or create the reader and read it.
> // Don't cache the filesystem object for now; Tez closes it and FS cache
> // will fix all that
> fs = split.getPath().getFileSystem(jobConf);
> fileKey = determineFileId(fs, split,
>     HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_CACHE_ALLOW_SYNTHETIC_FILEID),
>     HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_CACHE_DEFAULT_FS_FILE_ID),
>     !HiveConf.getBoolVar(daemonConf, ConfVars.LLAP_IO_USE_FILEID_PATH)
> );
> {code}
>  !llap-cache-fs-get.png! 
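
The snippet above obtains the file system handle eagerly, before the cache is consulted. One generic way to avoid paying for an expensive handle on a cache hit is a memoizing supplier that defers the lookup until first use. The sketch below illustrates that idea only; it is not the patch attached to this issue, all names in it are hypothetical, and it leaves open whether the file key can be determined without the handle.

```java
import java.util.function.Supplier;

public class LazyHandleSketch {
    // Memoizing supplier: runs the initializer at most once, on first use.
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> initializer;
        private T value;
        private boolean initialized;

        Lazy(Supplier<T> initializer) {
            this.initializer = initializer;
        }

        @Override
        public synchronized T get() {
            if (!initialized) {
                value = initializer.get();
                initialized = true;
            }
            return value;
        }
    }

    // Hypothetical read path: only a cache miss touches the expensive handle.
    static String read(boolean cacheHit, Supplier<String> fileSystem) {
        if (cacheHit) {
            return "served from cache";
        }
        return "read via " + fileSystem.get();
    }

    public static void main(String[] args) {
        Lazy<String> fs = new Lazy<>(() -> {
            System.out.println("initializing file system handle");
            return "fs-handle";
        });
        System.out.println(read(true, fs));   // no initialization happens here
        System.out.println(read(false, fs));  // first miss initializes the handle
        System.out.println(read(false, fs));  // later misses reuse it
    }
}
```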





[jira] [Commented] (HIVE-21339) LLAP: Cache hit also initializes an FS object

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785897#comment-16785897
 ] 

Hive QA commented on HIVE-21339:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
33s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
51s{color} | {color:blue} llap-server in master has 79 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 2 new + 111 unchanged - 0 
fixed = 113 total (was 111) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} llap-server: The patch generated 1 new + 31 unchanged 
- 0 fixed = 32 total (was 31) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m  5s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16366/dev-support/hive-personality.sh
 |
| git revision | master / 0413fec |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16366/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16366/yetus/diff-checkstyle-llap-server.txt
 |
| modules | C: ql llap-server U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16366/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-03-06 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785874#comment-16785874
 ] 

Yongzhi Chen commented on HIVE-21336:
-

I think letting it fail is better, because a direct 3.0 install will fail in 
the same environment, so the behavior is consistent within the same version.

> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.2.patch, HIVE-21336.3.patch, HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQL Developer, and got the same error. This 
> could be the result of a combination of DB-level settings like the db_block_size, 
> limiting the maximum key length, as per below doc: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also {{NLS_LENGTH_SEMANTICS}} is by default BYTE, but users can set this at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of the COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This by setting 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}
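
The reproducer is consistent with simple arithmetic on the declared column lengths. The sketch below is a rough worst-case estimate under stated assumptions: the column lengths and the 6398-byte limit come from the text above, while the factor of 4 bytes per character (as in a UTF-8 style database character set) is an assumption, and any per-key overhead Oracle adds is ignored. The class and method names are hypothetical.

```java
public class IndexKeyLengthSketch {
    // Declared lengths of DB_NAME, TABLE_NAME, COLUMN_NAME, PARTITION_NAME from the DDL above.
    static final int[] COLUMN_LENGTHS = {128, 256, 767, 767};

    // Limit reported by ORA-01450 in the reproducer.
    static final int MAX_KEY_BYTES = 6398;

    // Worst-case index key size in bytes for a given bytes-per-character factor.
    static int worstCaseKeyBytes(int bytesPerChar) {
        int total = 0;
        for (int length : COLUMN_LENGTHS) {
            total += length * bytesPerChar;
        }
        return total;
    }

    public static void main(String[] args) {
        int byteSemantics = worstCaseKeyBytes(1); // NLS_LENGTH_SEMANTICS=BYTE
        int charSemantics = worstCaseKeyBytes(4); // NLS_LENGTH_SEMANTICS=CHAR, assumed 4 bytes per char
        System.out.println("BYTE: " + byteSemantics + " bytes, fits: " + (byteSemantics <= MAX_KEY_BYTES));
        System.out.println("CHAR: " + charSemantics + " bytes, fits: " + (charSemantics <= MAX_KEY_BYTES));
    }
}
```

Under these assumptions the key needs at most 1918 bytes with BYTE semantics, but up to 7672 bytes with CHAR semantics, which exceeds the 6398-byte limit.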





[jira] [Commented] (HIVE-21385) Allow disabling pushdown of non-splittable computation to JDBC sources

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785857#comment-16785857
 ] 

Jesus Camacho Rodriguez commented on HIVE-21385:


Maybe in addition to having these two global variables, it could make sense to 
be able to enable/disable them on a per-table basis by setting them in the 
table properties. What do you think? I can create a follow-up for that.

> Allow disabling pushdown of non-splittable computation to JDBC sources
> --
>
> Key: HIVE-21385
> URL: https://issues.apache.org/jira/browse/HIVE-21385
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21385.01.patch, HIVE-21385.patch
>
>
> Until pushdown is a cost-based decision, we will be able to enable / disable 
> pushdown of operators that prevent reading results from the JDBC connection 
> in parallel.





[jira] [Commented] (HIVE-21385) Allow disabling pushdown of non-splittable computation to JDBC sources

2019-03-06 Thread Jesus Camacho Rodriguez (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785855#comment-16785855
 ] 

Jesus Camacho Rodriguez commented on HIVE-21385:


I would say it depends. For instance, Aggregate may not increase data size, but 
it may not reduce it much, and we would prevent reading in parallel using JDBC. 
The advantage of keeping those operators in the Hive plan is that we can read 
in parallel and compute aggregation/sort/union using Hive parallel operators 
too.
For the time being, I believe disabling pushdown completely (already there) and 
disabling operators that may prevent reading in parallel gives the user enough 
flexibility.

> Allow disabling pushdown of non-splittable computation to JDBC sources
> --
>
> Key: HIVE-21385
> URL: https://issues.apache.org/jira/browse/HIVE-21385
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO, StorageHandler
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-21385.01.patch, HIVE-21385.patch
>
>
> Until pushdown is a cost-based decision, we will be able to enable / disable 
> pushdown of operators that prevent reading results from the JDBC connection 
> in parallel.





[jira] [Commented] (HIVE-20616) Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785853#comment-16785853
 ] 

Hive QA commented on HIVE-20616:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12940711/HIVE-20616.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 15819 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[test_teradatabinaryfile] 
(batchId=2)
org.apache.hive.service.cli.session.TestSessionManagerMetrics.testAbandonedSessionMetrics
 (batchId=234)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16365/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16365/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16365/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12940711 - PreCommit-HIVE-Build

> Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars
> 
>
> Key: HIVE-20616
> URL: https://issues.apache.org/jira/browse/HIVE-20616
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20616.patch
>
>
> With MySQL as the metastore DB, PARTITION_PARAMS.PARAM_VALUE is defined as 
> varchar(4000)
> {code}
> describe PARTITION_PARAMS; 
> +-------------+---------------+------+-----+---------+-------+ 
> | Field       | Type          | Null | Key | Default | Extra | 
> +-------------+---------------+------+-----+---------+-------+ 
> | PART_ID     | bigint(20)    | NO   | PRI | NULL    |       | 
> | PARAM_KEY   | varchar(256)  | NO   | PRI | NULL    |       | 
> | PARAM_VALUE | varchar(4000) | YES  |     | NULL    |       | 
> +-------------+---------------+------+-----+---------+-------+ 
> {code}
> which leads to a MoveTask failure if PART_VALUE exceeds 4000 chars.
> {code}
> org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO 
> `PARTITION_PARAMS` (`PARAM_VALUE`,`PART_ID`,`PARAM_KEY`) VALUES (?,?,?)
>  at 
> org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074)
>  at 
> org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224)
>  at 
> org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158)
>  at 
> org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522)
>  at 
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
>  at 
> org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
>  at 
> org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
>  at 
> org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
>  at 
> org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
>  at 
> org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724)
>  at 
> org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
>  at 
> org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:2442)
>  at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>  at com.sun.proxy.$Proxy32.addPartition(Unknown Source)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_core(HiveMetaStore.java:3976)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_with_environment_context(HiveMetaStore.java:4032)
>  at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  

[jira] [Commented] (HIVE-20616) Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785775#comment-16785775
 ] 

Hive QA commented on HIVE-20616:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 3s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16365/dev-support/hive-personality.sh
 |
| git revision | master / 0413fec |
| modules | C: metastore U: metastore |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16365/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (HIVE-21398) Columns which has estimated statistics should not be considered as unique keys

2019-03-06 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21398:

Status: Patch Available  (was: Open)

> Columns which has estimated statistics should not be considered as unique keys
> --
>
> Key: HIVE-21398
> URL: https://issues.apache.org/jira/browse/HIVE-21398
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21398.01.patch
>
>
> Right now for a column to qualify as a unique column it has to meet the 
> criteria: 
> {code}
> NDV >= numRows
> {code}
> When numRows is 1 this tends to be true; but numRows is also 1 in cases when 
> we are operating in the blind - we don't know how many rows there are - more 
> generally: with estimated column statistics.
> As a side effect of qualifying all columns as unique, after a few joins all 
> column combinations become unique, so for a join between 3 tables which 
> have (i,j,k) columns, it will allocate {{i*j*k}} "unique 
> column triplets".
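
The criterion described above can be illustrated with a minimal sketch. It is not the code from HIVE-21398.01.patch; the `ColumnStats` class, the `estimated` flag, and the function names are hypothetical and exist only to show why `NDV >= numRows` misfires when numRows is a placeholder value of 1, and how a guard on estimated statistics would avoid it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnStats:
    ndv: int         # number of distinct values
    num_rows: int    # row count reported for the table
    estimated: bool  # hypothetical flag: True when the stats are guesses


def is_unique_naive(stats: ColumnStats) -> bool:
    # The criterion quoted in the issue: NDV >= numRows.
    return stats.ndv >= stats.num_rows


def is_unique_guarded(stats: ColumnStats) -> bool:
    # Assumed fix direction: never trust estimated stats for uniqueness.
    return (not stats.estimated) and stats.ndv >= stats.num_rows


def unique_triplets(i: int, j: int, k: int) -> int:
    # Joining 3 tables with i, j and k "unique" columns yields i*j*k triplets.
    return i * j * k


blind = ColumnStats(ndv=1, num_rows=1, estimated=True)
print(is_unique_naive(blind))       # the naive check accepts the column
print(is_unique_guarded(blind))     # the guarded check rejects it
print(unique_triplets(10, 10, 10))  # growth of combinations after joins
```

With only 10 columns per table, the naive check already produces 1000 candidate triplets, which is the blow-up the issue describes.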



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21398) Columns which has estimated statistics should not be considered as unique keys

2019-03-06 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-21398:

Attachment: HIVE-21398.01.patch

> Columns which has estimated statistics should not be considered as unique keys
> --
>
> Key: HIVE-21398
> URL: https://issues.apache.org/jira/browse/HIVE-21398
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-21398.01.patch
>
>
> Right now for a column to qualify as a unique column it has to meet the 
> criteria: 
> {code}
> NDV >= numRows
> {code}
> When numRows is 1 this tends to be true; but numRows is also 1 in cases when 
> we are operating in the blind - we don't know how many rows there are - more 
> generally: with estimated column statistics.
> As a side effect of qualifying all columns as unique, after a few joins all 
> column combinations become unique, so for a join between 3 tables which 
> have (i,j,k) columns, it will allocate {{i*j*k}} "unique 
> column triplets".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-16716) Clean up javadoc from errors in module ql

2019-03-06 Thread Robert Kucsora (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kucsora updated HIVE-16716:
--
Attachment: HIVE-16716.8.patch

> Clean up javadoc from errors in module ql
> -
>
> Key: HIVE-16716
> URL: https://issues.apache.org/jira/browse/HIVE-16716
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Janos Gub
>Assignee: Robert Kucsora
>Priority: Major
> Attachments: HIVE-16716-v2.patch, HIVE-16716.2.patch, 
> HIVE-16716.3.patch, HIVE-16716.4.patch, HIVE-16716.5.patch, 
> HIVE-16716.6.patch, HIVE-16716.7.patch, HIVE-16716.8.patch, HIVE-16716.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785732#comment-16785732
 ] 

Hive QA commented on HIVE-20848:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961275/HIVE-20848.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 15819 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16364/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16364/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16364/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961275 - PreCommit-HIVE-Build

> After setting UpdateInputAccessTimeHook query fail with Table Not Found.
> 
>
> Key: HIVE-20848
> URL: https://issues.apache.org/jira/browse/HIVE-20848
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20848.01.patch, HIVE-20848.patch
>
>
> {code}
>  select from_unixtime(1540495168); 
>  set 
> hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
>  select from_unixtime(1540495168); 
> {code}
> The second select fails with the following exception:
> {code}
> ERROR ql.Driver: FAILED: Hive Internal Error: 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
> _dummy_table)
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
> at 
> org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21398) Columns which has estimated statistics should not be considered as unique keys

2019-03-06 Thread Zoltan Haindrich (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-21398:
---


> Columns which has estimated statistics should not be considered as unique keys
> --
>
> Key: HIVE-21398
> URL: https://issues.apache.org/jira/browse/HIVE-21398
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Right now for a column to qualify as a unique column it has to meet the 
> criteria: 
> {code}
> NDV >= numRows
> {code}
> When numRows is 1 this tends to be true; but numRows is also 1 in cases when 
> we are operating in the blind - we don't know how many rows there are - more 
> generally: with estimated column statistics.
> As a side effect of qualifying all columns as unique, after a few joins all 
> column combinations become unique, so for a join between 3 tables which 
> have (i,j,k) columns, it will allocate {{i*j*k}} "unique 
> column triplets".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21314) Hive Replication not retaining the owner in the replicated table

2019-03-06 Thread mahesh kumar behera (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785700#comment-16785700
 ] 

mahesh kumar behera commented on HIVE-21314:


03.patch committed to master, Thanks [~sankarh] for review.

> Hive Replication not retaining the owner in the replicated table
> 
>
> Key: HIVE-21314
> URL: https://issues.apache.org/jira/browse/HIVE-21314
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-21314.01.patch, HIVE-21314.02.patch, 
> HIVE-21314.03.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hive Replication does not retain the owner in the replicated table. The owner 
> of the target table is set to the user executing the load command. The 
> user information should be read from the dump metadata and used 
> when creating the table at the target cluster.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-20848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785696#comment-16785696
 ] 

Hive QA commented on HIVE-20848:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
49s{color} | {color:blue} ql in master has 2251 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16364/dev-support/hive-personality.sh
 |
| git revision | master / 0413fec |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16364/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> After setting UpdateInputAccessTimeHook query fail with Table Not Found.
> 
>
> Key: HIVE-20848
> URL: https://issues.apache.org/jira/browse/HIVE-20848
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
>Reporter: Rajkumar Singh
>Assignee: Rajkumar Singh
>Priority: Major
> Attachments: HIVE-20848.01.patch, HIVE-20848.patch
>
>
> {code}
>  select from_unixtime(1540495168); 
>  set 
> hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
>  select from_unixtime(1540495168); 
> {code}
> The second select fails with the following exception:
> {code}
> ERROR ql.Driver: FAILED: Hive Internal Error: 
> org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
> _dummy_table)
> org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
> _dummy_table
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
> at 
> org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> 

[jira] [Commented] (HIVE-21336) HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char

2019-03-06 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785677#comment-16785677
 ] 

Naveen Gangam commented on HIVE-21336:
--

so in the 
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-2.3.0-to-3.0.0.oracle.sql
 script, 
{code}
-- Rebuild the index for Part col stats.  No such index for table stats, which
-- seems weird
DROP INDEX PCS_STATS_IDX;
CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS (CAT_NAME, 
DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME);
{code}

So when upgrading from say 2.3, with CHAR settings, the upgrade script will 
fail to create this index because of the length of the columns at this point. I 
do think it is necessary if we are upgrading. Would you agree? Thanks


> HMS Index PCS_STATS_IDX too long for Oracle when NLS_LENGTH_SEMANTICS=char
> --
>
> Key: HIVE-21336
> URL: https://issues.apache.org/jira/browse/HIVE-21336
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-21336.2.patch, HIVE-21336.3.patch, HIVE-21336.patch
>
>
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> Error: ORA-01450: maximum key length (6398) exceeded (state=72000,code=1450) 
> Customer tried the same DDL in SQL Developer, and got the same error. This 
> could be the result of a combination of DB-level settings like the db_block_size, 
> limiting the maximum key length, as per below doc: 
> http://www.dba-oracle.com/t_ora_01450_maximum_key_length_exceeded.htm 
> Also {{NLS_LENGTH_SEMANTICS}} is by default BYTE, but users can set this at 
> the session level to CHAR, thus reducing the max size of the index length. We 
> have increased the size of the COLUMN_NAME from 128 to 767 (used to be at 
> 1000) and TABLE_NAME from 128 to 256. This by setting 
> {code} 
> CREATE TABLE PART_COL_STATS ( 
> CS_ID NUMBER NOT NULL, 
> DB_NAME VARCHAR2(128) NOT NULL, 
> TABLE_NAME VARCHAR2(256) NOT NULL, 
> PARTITION_NAME VARCHAR2(767) NOT NULL, 
> COLUMN_NAME VARCHAR2(767) NOT NULL,  
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> {code} 
> Reproducer: 
> {code} 
> SQL*Plus: Release 11.2.0.2.0 Production on Wed Feb 27 11:02:16 2019 Copyright 
> (c) 1982, 2011, Oracle. All rights reserved. 
> Connected to: Oracle Database 11g Express Edition Release 11.2.0.2.0 - 64bit 
> Production 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> BYTE 
> SQL> alter session set NLS_LENGTH_SEMANTICS=CHAR; Session altered. 
> SQL> commit; Commit complete. 
> SQL> select * from v$nls_parameters where parameter = 'NLS_LENGTH_SEMANTICS'; 
> PARAMETER 
>  
> VALUE 
>  
> NLS_LENGTH_SEMANTICS 
> CHAR 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME) 
> * ERROR at line 1: ORA-01450: maximum key length (6398) exceeded 
> SQL> alter session set NLS_LENGTH_SEMANTICS=BYTE; 
> Session altered. 
> SQL> commit; 
> Commit complete. 
> SQL> drop table PART_COL_STATS; 
> Table dropped. 
> SQL> commit; 
> Commit complete. 
> SQL> CREATE TABLE PART_COL_STATS (CS_ID NUMBER NOT NULL, DB_NAME 
> VARCHAR2(128) NOT NULL, TABLE_NAME VARCHAR2(256) NOT NULL, PARTITION_NAME 
> VARCHAR2(767) NOT NULL, COLUMN_NAME VARCHAR2(767) NOT NULL); 
> Table created. 
> SQL> CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS 
> (DB_NAME,TABLE_NAME,COLUMN_NAME,PARTITION_NAME); 
> Index created. 
> SQL> commit; 
> Commit complete. 
> SQL> 
> {code}
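
The reproducer above can be checked with some back-of-the-envelope arithmetic. The sketch below sums the declared lengths of the four indexed columns and compares the total with the 6398-byte limit from the ORA-01450 message. It assumes a database character set that needs up to 4 bytes per character under CHAR semantics, and it ignores any per-key overhead Oracle adds, so the numbers are indicative only.

```python
# Declared lengths taken from the CREATE TABLE statement in the issue.
INDEXED_COLUMNS = {
    "DB_NAME": 128,
    "TABLE_NAME": 256,
    "COLUMN_NAME": 767,
    "PARTITION_NAME": 767,
}

MAX_KEY_LENGTH = 6398  # limit reported by ORA-01450 in the reproducer


def key_bytes(columns: dict, bytes_per_char: int) -> int:
    # Worst-case key size: every column filled to its declared length.
    return sum(length * bytes_per_char for length in columns.values())


def index_fits(columns: dict, bytes_per_char: int) -> bool:
    return key_bytes(columns, bytes_per_char) <= MAX_KEY_LENGTH


# BYTE semantics: lengths are already in bytes.
print(key_bytes(INDEXED_COLUMNS, 1), index_fits(INDEXED_COLUMNS, 1))
# CHAR semantics, assuming up to 4 bytes per character.
print(key_bytes(INDEXED_COLUMNS, 4), index_fits(INDEXED_COLUMNS, 4))
```

Under these assumptions the BYTE case totals 1918 bytes and fits, while the CHAR case totals 7672 bytes and exceeds the limit, which matches the behaviour seen in the SQL*Plus session.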



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785657#comment-16785657
 ] 

Hive QA commented on HIVE-21337:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12961254/HIVE-21337.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 15819 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[test_teradatabinaryfile] 
(batchId=2)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/16363/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/16363/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-16363/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12961254 - PreCommit-HIVE-Build

> HMS Metadata migration from Postgres/Derby to other DBs fail
> 
>
> Key: HIVE-21337
> URL: https://issues.apache.org/jira/browse/HIVE-21337
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-21337.2.patch, HIVE-21337.patch
>
>
> Customer recently was migrating from Postgres to Oracle for HMS metastore. 
> During import of the [exported] data from HMS metastore from postgres, 
> failures are seen as the COLUMNS_V2.COMMENT is 4000 bytes long whereas oracle 
> and other schemas define it to be 256 bytes.
> This inconsistency in the schema makes the migration cumbersome and manual. 
> This jira makes this column consistent in length across all databases.
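
As a rough illustration of the inconsistency described above, the following sketch compares the declared length of COLUMNS_V2.COMMENT per backend and lists exported values that would not fit in the target. Only the 4000 and 256 figures come from the issue; the dictionary layout and helper names are hypothetical, and a real migration check would read these lengths from the schema scripts.

```python
# Declared COLUMNS_V2.COMMENT lengths (bytes) as stated in the issue.
COMMENT_LENGTH = {"postgres": 4000, "oracle": 256}


def is_inconsistent(lengths: dict) -> bool:
    # The schemas disagree if more than one distinct length is declared.
    return len(set(lengths.values())) > 1


def values_too_long(values: list, target_length: int) -> list:
    # Exported comments whose UTF-8 size exceeds the target column length.
    return [v for v in values if len(v.encode("utf-8")) > target_length]


exported = ["short comment", "x" * 300]
print(is_inconsistent(COMMENT_LENGTH))
print(len(values_too_long(exported, COMMENT_LENGTH["oracle"])))
```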



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-03-06 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785628#comment-16785628
 ] 

Laszlo Bodor edited comment on HIVE-21293 at 3/6/19 1:29 PM:
-

[~jcamachorodriguez]: thanks for the suggestion, I tried it with:
{code}
booleanValue
:
KW_TRUE^ | KW_FALSE^ | KW_UNKNOWN -> TOK_NULL
;
{code}
but I got the original warning. Did I use it in the wrong way? As far as I 
understand, this rewrite rule will still let the grammar fall into booleanValue 
rule on "unknown" keyword. 
One thing that's not clear to me is: what benefit we have by having "unknown" 
as a boolean constant, but not a reserved keyword (although it indeed seems 
more straightforward).


was (Author: abstractdog):
[~jcamachorodriguez]: thanks for the suggestion, I tried it with:
{code}
booleanValue
:
KW_TRUE^ | KW_FALSE^ | KW_UNKNOWN -> TOK_NULL
;
{code}
but I got the original issue. Did I use it in the wrong way? As far as I 
understand, this rewrite rule will still let the grammar fall into booleanValue 
rule on "unknown" keyword. 
One thing that's not clear to me is: what benefit we have by having "unknown" 
as a boolean constant, but not a reserved keyword (although it indeed seems 
more straightforward).

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch, HIVE-21293.02.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21293) Fix ambiguity in grammar warnings at compilation time (II)

2019-03-06 Thread Laszlo Bodor (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785628#comment-16785628
 ] 

Laszlo Bodor commented on HIVE-21293:
-

[~jcamachorodriguez]: thanks for the suggestion, I tried it with:
{code}
booleanValue
:
KW_TRUE^ | KW_FALSE^ | KW_UNKNOWN -> TOK_NULL
;
{code}
but I got the original issue. Did I use it in the wrong way? As far as I 
understand, this rewrite rule will still let the grammar fall into booleanValue 
rule on "unknown" keyword. 
One thing that's not clear to me is: what benefit we have by having "unknown" 
as a boolean constant, but not a reserved keyword (although it indeed seems 
more straightforward).

> Fix ambiguity in grammar warnings at compilation time (II)
> --
>
> Key: HIVE-21293
> URL: https://issues.apache.org/jira/browse/HIVE-21293
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-21293.01.patch, HIVE-21293.02.patch
>
>
> These are the warnings at compilation time:
> {code}
> warning(200): IdentifiersParser.g:424:5:
> Decision can match input such as "KW_UNKNOWN" using multiple alternatives: 1, 
> 10
> As a result, alternative(s) 10 were disabled for that input
> {code}
> This means that multiple parser rules can match certain query text, possibly 
> leading to unexpected errors at parsing time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21337) HMS Metadata migration from Postgres/Derby to other DBs fail

2019-03-06 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785614#comment-16785614
 ] 

Hive QA commented on HIVE-21337:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 40s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-16363/dev-support/hive-personality.sh
 |
| git revision | master / 0413fec |
| Default Java | 1.8.0_111 |
| modules | C: standalone-metastore/metastore-server U: 
standalone-metastore/metastore-server |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-16363/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HMS Metadata migration from Postgres/Derby to other DBs fail
> 
>
> Key: HIVE-21337
> URL: https://issues.apache.org/jira/browse/HIVE-21337
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 3.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Minor
> Attachments: HIVE-21337.2.patch, HIVE-21337.patch
>
>
> Customer recently was migrating from Postgres to Oracle for HMS metastore. 
> During import of the [exported] data from HMS metastore from postgres, 
> failures are seen as the COLUMNS_V2.COMMENT is 4000 bytes long whereas oracle 
> and other schemas define it to be 256 bytes.
> This inconsistency in the schema makes the migration cumbersome and manual. 
> This jira makes this column consistent in length across all databases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

