[jira] [Work logged] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?focusedWorklogId=393210=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393210
 ]

ASF GitHub Bot logged work on HIVE-22824:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 07:14
Start Date: 26/Feb/20 07:14
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #897: HIVE-22824: 
JoinProjectTranspose rule should skip Projects containing…
URL: https://github.com/apache/hive/pull/897#discussion_r384308354
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
 ##
 @@ -487,7 +483,7 @@ Operator genOPTree(PlannerContext plannerCtx) throws 
SemanticException {
 ASTNode newAST = getOptimizedAST(newPlan);
 
 // 1.1. Fix up the query for insert/ctas/materialized views
-newAST = fixUpAfterCbo(this.getAST(), newAST, cboCtx);
 
 Review comment:
   OK; after all, we want CBO on more often than off.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 393210)
Time Spent: 40m  (was: 0.5h)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21410) find out the actual port number when hive.server2.thrift.port=0

2020-02-25 Thread zuotingbing (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zuotingbing updated HIVE-21410:
---
Description: 
Currently, if we set *hive.server2.thrift.port=0*, it is hard to find out the 
actual port number to use when connecting with beeline. Logging the actual port 
number would help us get the correct connect URI.

 

Before the fix:

!2019-03-08_163705.png!

After the fix:

!2019-03-08_163747.png!

  was:
Before the fix:

!2019-03-08_163705.png!

After the fix:

!2019-03-08_163747.png!


> find out the actual port number when hive.server2.thrift.port=0
> ---
>
> Key: HIVE-21410
> URL: https://issues.apache.org/jira/browse/HIVE-21410
> Project: Hive
>  Issue Type: Improvement
>Reporter: zuotingbing
>Assignee: zuotingbing
>Priority: Minor
> Attachments: 2019-03-08_163705.png, 2019-03-08_163747.png, 
> HIVE-21410.patch
>
>
> Currently, if we set *hive.server2.thrift.port=0*, it is hard to find out the 
> actual port number to use when connecting with beeline. Logging the actual port 
> number would help us get the correct connect URI.
>  
> Before the fix:
> !2019-03-08_163705.png!
> After the fix:
> !2019-03-08_163747.png!
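
The behaviour described above can be illustrated with plain java.net: binding to port 0 lets the operating system pick a free port, and the only way to learn which one is to ask the socket afterwards. This is a minimal sketch, not HiveServer2 code; the class and method names are hypothetical, and how the Thrift server in HiveServer2 exposes its bound port is not shown in this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

final class EphemeralPortDemo {
  /** Binds to port 0 and returns the port the operating system actually assigned. */
  static int bindAndReportPort() {
    try (ServerSocket socket = new ServerSocket(0)) {
      int actualPort = socket.getLocalPort();
      // Logging this value is what lets a client build a usable connect URI.
      System.out.println("Listening on port " + actualPort);
      return actualPort;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    bindAndReportPort();
  }
}
```

The printed port differs from run to run, which is exactly why the configured value (0) is of no use to a client.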





[jira] [Assigned] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-25 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-22929:
-

Assignee: Krisztian Kasa

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}
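
One possible way to avoid the throwaway regex, sketched under the assumption that the unescaping can be moved into a helper: compile the pattern once and skip it entirely when the identifier contains no backtick. The class and method names are hypothetical and this is not the change made for HIVE-22929.

```java
import java.util.regex.Pattern;

final class QuotedIdentifiers {
  // Compiled once and reused, instead of being recompiled by String.replaceAll() on every call.
  private static final Pattern DOUBLED_BACKTICK = Pattern.compile("``", Pattern.LITERAL);

  /** Strips the enclosing backticks and collapses each doubled backtick to a single one. */
  static String unquote(String tokenText) {
    String inner = tokenText.substring(1, tokenText.length() - 1);
    if (inner.indexOf('`') < 0) {
      return inner; // common case: nothing to unescape, so no regex work at all
    }
    return DOUBLED_BACKTICK.matcher(inner).replaceAll("`");
  }
}
```

The early return matters more than the cached pattern for typical queries, since most quoted identifiers contain no escaped backtick.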





[jira] [Commented] (HIVE-22840) Race condition in formatters of TimestampColumnVector and DateColumnVector

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045154#comment-17045154
 ] 

Hive QA commented on HIVE-22840:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994602/HIVE-22840.03.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18075 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=92)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20831/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20831/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20831/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994602 - PreCommit-HIVE-Build

> Race condition in formatters of TimestampColumnVector and DateColumnVector 
> ---
>
> Key: HIVE-22840
> URL: https://issues.apache.org/jira/browse/HIVE-22840
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: László Bodor
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22840.03.patch, HIVE-22840.1.patch, 
> HIVE-22840.2.patch, HIVE-22840.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-22405 added support for the proleptic calendar. It uses Java's 
> SimpleDateFormat/Calendar APIs, which are not thread-safe and cause races in 
> some scenarios. 
> As a result of those race conditions, we see some exceptions like
> {code:java}
> 1) java.lang.NumberFormatException: For input string: "" 
> OR 
> java.lang.NumberFormatException: For input string: ".821582E.821582E44"
> OR
> 2) Caused by: java.lang.ArrayIndexOutOfBoundsException: -5325980
>   at 
> sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453)
>   at 
> java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2397)
> {code}
> This issue is to address those thread-safety issues/race conditions.
> cc [~jcamachorodriguez] [~abstractdog] [~omalley]
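
A common remedy for this class of bug is to give every thread its own formatter so that the mutable Calendar inside SimpleDateFormat is never shared. The sketch below assumes that approach; the class name, pattern and UTC zone are illustrative, the proleptic calendar setup is left out, and the issue text does not say which fix the HIVE-22840 patch actually uses.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

final class ThreadSafeFormatting {
  // One SimpleDateFormat per thread: the instance is never shared, so
  // concurrent calls cannot corrupt its internal Calendar state.
  private static final ThreadLocal<SimpleDateFormat> FORMATTER =
      ThreadLocal.withInitial(() -> {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
      });

  /** Formats epoch milliseconds using the calling thread's own formatter. */
  static String format(long epochMillis) {
    return FORMATTER.get().format(new Date(epochMillis));
  }
}
```

An immutable java.time.format.DateTimeFormatter would be another option, at the cost of changing the calendar handling.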





[jira] [Commented] (HIVE-22840) Race condition in formatters of TimestampColumnVector and DateColumnVector

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045137#comment-17045137
 ] 

Hive QA commented on HIVE-22840:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 23s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 24s{color} | {color:blue} storage-api in master has 58 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 35s{color} | {color:blue} common in master has 63 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 43s{color} | {color:blue} serde in master has 197 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 50s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} storage-api: The patch generated 4 new + 17 unchanged - 3 fixed = 21 total (was 20) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} common: The patch generated 0 new + 0 unchanged - 2 fixed = 0 total (was 2) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} The patch serde passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 43s{color} | {color:green} The patch ql passed checkstyle {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch 2 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 33s{color} | {color:green} storage-api generated 0 new + 48 unchanged - 10 fixed = 48 total (was 58) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 45s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 52s{color} | {color:green} serde in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 27s{color} | {color:green} ql in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 16s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 36m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20831/dev-support/hive-personality.sh |
| git revision | master / 0280984 |
| Default Java | 1.8.0_111 |
| findbugs | 

[jira] [Updated] (HIVE-22840) Race condition in formatters of TimestampColumnVector and DateColumnVector

2020-02-25 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22840:
-
Attachment: HIVE-22840.03.patch

> Race condition in formatters of TimestampColumnVector and DateColumnVector 
> ---
>
> Key: HIVE-22840
> URL: https://issues.apache.org/jira/browse/HIVE-22840
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: László Bodor
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22840.03.patch, HIVE-22840.1.patch, 
> HIVE-22840.2.patch, HIVE-22840.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-22405 added support for the proleptic calendar. It uses Java's 
> SimpleDateFormat/Calendar APIs, which are not thread-safe and cause races in 
> some scenarios. 
> As a result of those race conditions, we see some exceptions like
> {code:java}
> 1) java.lang.NumberFormatException: For input string: "" 
> OR 
> java.lang.NumberFormatException: For input string: ".821582E.821582E44"
> OR
> 2) Caused by: java.lang.ArrayIndexOutOfBoundsException: -5325980
>   at 
> sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453)
>   at 
> java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2397)
> {code}
> This issue is to address those thread-safety issues/race conditions.
> cc [~jcamachorodriguez] [~abstractdog] [~omalley]





[jira] [Commented] (HIVE-22930) Performance: ASTNode::getName() allocates within the walker loops

2020-02-25 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045106#comment-17045106
 ] 

Gopal Vijayaraghavan commented on HIVE-22930:
-

No idea why that is even used here - the numbers here are not constant across 
builds.

> Performance: ASTNode::getName() allocates within the walker loops
> -
>
> Key: HIVE-22930
> URL: https://issues.apache.org/jira/browse/HIVE-22930
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Priority: Major
> Attachments: ASTNode-name.png
>
>
> {code}
>   /*
>* (non-Javadoc)
>*
>* @see org.apache.hadoop.hive.ql.lib.Node#getName()
>*/
>   @Override
>   public String getName() {
> return String.valueOf(super.getToken().getType());
>   }
> {code}
>  !ASTNode-name.png! 
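
The allocation comes from String.valueOf(int) building a new String on every call. One way to avoid it, sketched here as an assumption rather than as the fix chosen for HIVE-22930, is to precompute the names for the range of token type ids once. The class name and the cache size are hypothetical; a real parser would derive the bound from its token table.

```java
final class TokenTypeNames {
  // Hypothetical upper bound on token type ids.
  private static final int CACHE_SIZE = 2048;
  private static final String[] NAMES = new String[CACHE_SIZE];

  static {
    for (int i = 0; i < CACHE_SIZE; i++) {
      NAMES[i] = String.valueOf(i);
    }
  }

  /** Returns the decimal name of a token type without allocating for ids inside the cache. */
  static String nameOf(int tokenType) {
    if (tokenType >= 0 && tokenType < CACHE_SIZE) {
      return NAMES[tokenType];
    }
    return String.valueOf(tokenType); // rare fallback, e.g. negative ids
  }
}
```

Inside a walker loop, nameOf(type) then returns the same String instance for the same type on every visit.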





[jira] [Commented] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045064#comment-17045064
 ] 

Jesus Camacho Rodriguez commented on HIVE-22824:


+1 (pending tests)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.





[jira] [Commented] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name

2020-02-25 Thread Thomas Poepping (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045061#comment-17045061
 ] 

Thomas Poepping commented on HIVE-22928:


Also, it's been a _really_ long time since I've contributed. Are we still doing 
patch files? GitHub pull requests?

> Allow hive.exec.stagingdir to be a fully qualified directory name
> -
>
> Key: HIVE-22928
> URL: https://issues.apache.org/jira/browse/HIVE-22928
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Hive
>Affects Versions: 3.1.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>Priority: Minor
>
> Currently, {{hive.exec.stagingdir}} can only be set as a relative directory 
> name that, for operations like {{insert}} or {{insert overwrite}}, will be 
> placed either under the table directory or the partition directory. 
> For cases where an HDFS cluster is small but the data being inserted is very 
> large (greater than the capacity of the HDFS cluster, as mentioned in a 
> comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their 
> staging directory to be an explicit blobstore path (or any filesystem path), 
> rather than relying on Hive to intelligently build the blobstore path based 
> on an interpretation of the job. We may lose locality guarantees, but because 
> renames are just as expensive on blobstores no matter what the prefix is, 
> this isn't considered a terribly large loss (assuming only blobstore 
> customers use this functionality).
> Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually 
> suffice in this case, as the stagingdir is not the same.
> This commit enables Hive customers to set an absolute location for all 
> staging directories. For instances where the configured stagingdir scheme is 
> not the same as the scheme for the table location, the default stagingdir 
> configuration is used. This avoids a cross-filesystem rename, which is 
> impossible anyway.
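
The fallback rule in the last paragraph can be sketched with java.net.URI. This is only an illustration of the scheme comparison: the class and method names are hypothetical, it compares the scheme alone (not the bucket or namenode), and it is not taken from the actual patch.

```java
import java.net.URI;

final class StagingDirChooser {
  /**
   * Picks the staging directory for a table. A relative name is returned unchanged,
   * an absolute one is used only when its scheme matches the table location's scheme,
   * and otherwise the default is used so that no cross-filesystem rename is needed.
   */
  static String choose(String configuredStagingDir, String tableLocation, String defaultStagingDir) {
    String stagingScheme = URI.create(configuredStagingDir).getScheme();
    if (stagingScheme == null) {
      return configuredStagingDir; // relative name, resolved under the table or partition directory
    }
    String tableScheme = URI.create(tableLocation).getScheme();
    return stagingScheme.equalsIgnoreCase(tableScheme) ? configuredStagingDir : defaultStagingDir;
  }
}
```

For example, an s3a staging path combined with an hdfs table location falls back to the default, while the same staging path with an s3a table location is kept.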





[jira] [Commented] (HIVE-22827) Update Flatbuffer version

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045053#comment-17045053
 ] 

Hive QA commented on HIVE-22827:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994580/HIVE-22827.99.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[cbo_query31] 
(batchId=305)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testComplexQuery (batchId=290)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testDataTypes (batchId=290)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testEscapedStrings (batchId=290)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testLlapInputFormatEndToEnd 
(batchId=290)
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testNonAsciiStrings (batchId=290)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testComplexQuery 
(batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testDataTypes (batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testEscapedStrings 
(batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testLlapInputFormatEndToEnd
 (batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testNonAsciiStrings 
(batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testTypesNestedInListWithLimitAndFilters
 (batchId=292)
org.apache.hive.jdbc.TestJdbcWithMiniLlapVectorArrow.testTypesNestedInMapWithLimitAndFilters
 (batchId=292)
org.apache.hive.jdbc.TestNewGetSplitsFormat.testComplexQuery (batchId=290)
org.apache.hive.jdbc.TestNewGetSplitsFormat.testDataTypes (batchId=290)
org.apache.hive.jdbc.TestNewGetSplitsFormat.testEscapedStrings (batchId=290)
org.apache.hive.jdbc.TestNewGetSplitsFormat.testLlapInputFormatEndToEnd 
(batchId=290)
org.apache.hive.jdbc.TestNewGetSplitsFormat.testNonAsciiStrings (batchId=290)
org.apache.hive.jdbc.TestNewGetSplitsFormatReturnPath.testComplexQuery 
(batchId=292)
org.apache.hive.jdbc.TestNewGetSplitsFormatReturnPath.testDataTypes 
(batchId=292)
org.apache.hive.jdbc.TestNewGetSplitsFormatReturnPath.testEscapedStrings 
(batchId=292)
org.apache.hive.jdbc.TestNewGetSplitsFormatReturnPath.testLlapInputFormatEndToEnd
 (batchId=292)
org.apache.hive.jdbc.TestNewGetSplitsFormatReturnPath.testNonAsciiStrings 
(batchId=292)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20829/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20829/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20829/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 23 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994580 - PreCommit-HIVE-Build

> Update Flatbuffer version
> -
>
> Key: HIVE-22827
> URL: https://issues.apache.org/jira/browse/HIVE-22827
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22827.99.patch, HIVE-22827.patch
>
>
> Hive currently uses Flatbuffer 1.2.0. Other Apache projects use a more 
> up-to-date version, e.g. 1.6.0.1. Upgrade to that version.





[jira] [Assigned] (HIVE-22928) Allow hive.exec.stagingdir to be a fully qualified directory name

2020-02-25 Thread Thomas Poepping (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Poepping reassigned HIVE-22928:
--


> Allow hive.exec.stagingdir to be a fully qualified directory name
> -
>
> Key: HIVE-22928
> URL: https://issues.apache.org/jira/browse/HIVE-22928
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Hive
>Affects Versions: 3.1.2
>Reporter: Thomas Poepping
>Assignee: Thomas Poepping
>Priority: Minor
>
> Currently, {{hive.exec.stagingdir}} can only be set as a relative directory 
> name that, for operations like {{insert}} or {{insert overwrite}}, will be 
> placed either under the table directory or the partition directory. 
> For cases where an HDFS cluster is small but the data being inserted is very 
> large (greater than the capacity of the HDFS cluster, as mentioned in a 
> comment by [~ashutoshc] on [HIVE-14270]), the client may want to set their 
> staging directory to be an explicit blobstore path (or any filesystem path), 
> rather than relying on Hive to intelligently build the blobstore path based 
> on an interpretation of the job. We may lose locality guarantees, but because 
> renames are just as expensive on blobstores no matter what the prefix is, 
> this isn't considered a terribly large loss (assuming only blobstore 
> customers use this functionality).
> Note that {{hive.blobstore.use.blobstore.as.scratchdir}} doesn't actually 
> suffice in this case, as the stagingdir is not the same.
> This commit enables Hive customers to set an absolute location for all 
> staging directories. For instances where the configured stagingdir scheme is 
> not the same as the scheme for the table location, the default stagingdir 
> configuration is used. This avoids a cross-filesystem rename, which is 
> impossible anyway.





[jira] [Commented] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread Vineet Garg (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045052#comment-17045052
 ] 

Vineet Garg commented on HIVE-22824:


[~jcamachorodriguez] Added the test and TODO in latest change. Also opened 
CALCITE-3824

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.





[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393070=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393070
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384197923
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -2519,6 +2519,9 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 "higher compute cost. (NDV means the number of distinct 
values.). It only affects the FM-Sketch \n" +
 "(not the HLL algorithm which is the default), where it 
computes the number of necessary\n" +
 " bitvectors to achieve the accuracy."),
+HIVE_STATS_USE_UDF_ESTIMATORS("hive.stats.use.statestimators", true,
+"Statestimators are able to provide more accurate column statistic 
infos for UDF results."),
 
 Review comment:
   Statestimators -> Estimators
 



Issue Time Tracking
---

Worklog Id: (was: 393070)
Time Spent: 0.5h  (was: 20m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> things about that column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimation(s) for frequently used 
> UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.




[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393063=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393063
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384197505
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreSchemaInfo.java
 ##
 @@ -132,9 +133,11 @@ public String generateInitFileName(String toVersion) 
throws HiveMetaException {
 String initScriptName = INIT_FILE_PREFIX + toVersion + "." +
 dbType + SQL_FILE_EXTENSION;
 // check if the file exists
-if (!(new File(getMetaStoreScriptDir() + File.separatorChar +
-  initScriptName).exists())) {
-  throw new HiveMetaException("Unknown version specified for 
initialization: " + toVersion);
+File file = new File(getMetaStoreScriptDir() + File.separatorChar +
+  initScriptName);
+if (!(file.exists())) {
 
 Review comment:
   nit: the enclosing () around file.exists() is not needed
 



Issue Time Tracking
---

Worklog Id: (was: 393063)
Time Spent: 20m  (was: 10m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> things about that column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimation(s) for frequently used 
> UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.




[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393069=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393069
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384197755
 
 

 ##
 File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 ##
 @@ -2519,6 +2519,9 @@ private static void 
populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 "higher compute cost. (NDV means the number of distinct 
values.). It only affects the FM-Sketch \n" +
 "(not the HLL algorithm which is the default), where it 
computes the number of necessary\n" +
 " bitvectors to achieve the accuracy."),
+HIVE_STATS_USE_UDF_ESTIMATORS("hive.stats.use.statestimators", true,
 
 Review comment:
   'hive.stats.use.statestimators' -> 'hive.stats.estimators.enable'
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 393069)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.
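
To make the {{substr(c,1,1)}} case concrete, here is a minimal sketch of the length bound the description hints at. The class and method names are hypothetical and are not taken from the patch; the only assumption is that an average column length is known for the input.

```java
public class SubstrLengthBound {

  // Upper bound on the average output length of substr(c, start, len):
  // each output string is no longer than len and no longer than its input,
  // so the average output length is at most min(inputAvgLen, len).
  static double boundedAvgLen(double inputAvgLen, int len) {
    return Math.min(inputAvgLen, Math.max(0, len));
  }

  public static void main(String[] args) {
    // A wide input column collapses to at most one character per row.
    System.out.println(boundedAvgLen(42.5, 1));
    // A column that is already shorter than the cap keeps its own average.
    System.out.println(boundedAvgLen(0.3, 1));
  }
}
```

The bound is deliberately wide: it never underestimates the output size, which matches the "wide estimation" wording in the description.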



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393072
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384199270
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -73,6 +74,9 @@
 import org.apache.hadoop.hive.ql.plan.Statistics;
 import org.apache.hadoop.hive.ql.plan.Statistics.State;
 import org.apache.hadoop.hive.ql.stats.BasicStats.Factory;
+import org.apache.hadoop.hive.ql.stats.estimator.IStatEstimator;
 
 Review comment:
   Currently in Hive we do not seem to prefix interfaces with I (maybe we 
should start doing it and I know other projects do, but I wanted to mention it 
since it does not follow current convention).
 



Issue Time Tracking
---

Worklog Id: (was: 393072)
Time Spent: 0.5h  (was: 20m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393068
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384203016
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/udf/UDFSubstr.java
 ##
 @@ -131,4 +137,52 @@ public BytesWritable evaluate(BytesWritable bw, IntWritable pos, IntWritable len
   public BytesWritable evaluate(BytesWritable bw, IntWritable pos){
 return evaluate(bw, pos, maxValue);
   }
+
+  @Override
+  public Optional getStatEstimator() {
+return Optional.of(new SubStrStatEstimator());
+  }
+
+  private static class SubStrStatEstimator implements IStatEstimator {
+
+@Override
+public Optional estimate(List csList) {
+  ColStatistics cs = csList.get(0).clone();
+
+  // this might bad in a skewed case; consider:
+  // 1 row with 1000 long string
+  // 99 rows with 0 length
+  // orig avg is 10
+  // new avg is 5 (if substr(5)) ; but in reality it will stay ~10
+  Optional start = getRangeWidth(csList.get(1).getRange());
+  Range startRange = csList.get(1).getRange();
+  if (startRange != null && startRange.minValue != null) {
+double newAvgColLen = cs.getAvgColLen() - startRange.minValue.doubleValue();
+if (newAvgColLen > 0) {
+  cs.setAvgColLen(newAvgColLen);
+}
+
 
 Review comment:
   you can remove \n
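
For reference, the skewed case described in the code comment above can be worked through numerically. This is only a sketch: the class and method names are hypothetical, the estimate mirrors the subtraction shown in the quoted diff, and the "actual" figure assumes a 1-based start position for substr.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstrSkewExample {

  // Mirrors the quoted patch: subtract the start position from the average
  // length and keep the result only if it stays positive.
  static double estimatedAvgLen(double avgLen, int start) {
    double candidate = avgLen - start;
    return candidate > 0 ? candidate : avgLen;
  }

  // Average length after substr(s, start), assuming a 1-based start:
  // the first (start - 1) characters of each row are dropped.
  static double actualAvgLen(List<Integer> lengths, int start) {
    long total = 0;
    for (int len : lengths) {
      total += Math.max(0, len - (start - 1));
    }
    return (double) total / lengths.size();
  }

  public static void main(String[] args) {
    // 1 row with a string of length 1000, 99 rows with empty strings.
    List<Integer> lengths = new ArrayList<>();
    lengths.add(1000);
    for (int i = 0; i < 99; i++) {
      lengths.add(0);
    }
    double origAvg = actualAvgLen(lengths, 1); // start = 1 drops nothing
    System.out.println("original avg:  " + origAvg);
    System.out.println("estimated avg: " + estimatedAvgLen(origAvg, 5));
    System.out.println("actual avg:    " + actualAvgLen(lengths, 5));
  }
}
```

With this data the estimate halves the average while the real average barely moves, which is the underestimation the comment warns about.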
 



Issue Time Tracking
---

Worklog Id: (was: 393068)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393065&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393065
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384200546
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -1560,6 +1554,32 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
 }
   }
 
+  if (conf.getBoolVar(ConfVars.HIVE_STATS_USE_UDF_ESTIMATORS)) {
+Optional sep = engfd.getGenericUDF().adapt(IStatEstimatorProvider.class);
+if (sep.isPresent()) {
+  Optional se = sep.get().getStatEstimator();
 
 Review comment:
   Should we assume that if a UDF is an estimator provider, it should provide 
an estimator, and thus, throw an exception here if we cannot get the estimator?
 



Issue Time Tracking
---

Worklog Id: (was: 393065)
Time Spent: 20m  (was: 10m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393067
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384201501
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -1590,6 +1610,43 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
 return colStats;
   }
 
+  private static ColStatistics buildColStatForConstant(HiveConf conf, long numRows, ExprNodeConstantDesc encd) {
+
+long numNulls = 0;
+long countDistincts = 0;
+if (encd.getValue() == null) {
+  // null projection
+  numNulls = numRows;
+} else {
+  countDistincts = 1;
+}
+String colType = encd.getTypeString();
+colType = colType.toLowerCase();
+ObjectInspector oi = encd.getWritableObjectInspector();
+double avgColSize = getAvgColLenOf(conf, oi, colType);
+ColStatistics colStats = new ColStatistics(encd.getName(), colType);
+colStats.setAvgColLen(avgColSize);
+colStats.setCountDistint(countDistincts);
+colStats.setNumNulls(numNulls);
+
+Optional value = getLongConstValue(encd);
 
 Review comment:
   What about other types? Can we either add them in this patch, or add a 
comment and create a follow-up?
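
As a rough illustration of what covering other types could look like, here is a sketch that derives a degenerate range (min equal to max) for any numeric constant. The `Stats` class is a hypothetical stand-in for the fields under discussion, not Hive's `ColStatistics`, and this is not what the patch does.

```java
import java.util.Optional;

public class ConstantStatsSketch {

  // Hypothetical stand-in holding only the fields discussed in the review.
  static final class Stats {
    long numNulls;
    long countDistinct;
    Optional<Double> min = Optional.empty();
    Optional<Double> max = Optional.empty();
  }

  // A constant projection is either all nulls or exactly one distinct value.
  // For numeric constants the range collapses to a single point.
  static Stats forConstant(Object value, long numRows) {
    Stats s = new Stats();
    if (value == null) {
      s.numNulls = numRows;
    } else {
      s.countDistinct = 1;
      if (value instanceof Number) {
        double v = ((Number) value).doubleValue();
        s.min = Optional.of(v);
        s.max = Optional.of(v);
      }
    }
    return s;
  }

  public static void main(String[] args) {
    Stats longConst = forConstant(7L, 100);
    System.out.println(longConst.countDistinct + " " + longConst.min + " " + longConst.max);
    Stats nullConst = forConstant(null, 100);
    System.out.println(nullConst.numNulls);
  }
}
```

Non-numeric constants such as strings get no range in this sketch; only the null and distinct counts are filled in.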
 



Issue Time Tracking
---

Worklog Id: (was: 393067)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393066
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384200980
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java
 ##
 @@ -1560,6 +1554,32 @@ public static ColStatistics getColStatisticsFromExpression(HiveConf conf, Statis
 }
   }
 
+  if (conf.getBoolVar(ConfVars.HIVE_STATS_USE_UDF_ESTIMATORS)) {
+Optional sep = engfd.getGenericUDF().adapt(IStatEstimatorProvider.class);
+if (sep.isPresent()) {
+  Optional se = sep.get().getStatEstimator();
+  if (se.isPresent()) {
+List csList = new ArrayList();
+for (ExprNodeDesc child : engfd.getChildren()) {
+  ColStatistics cs = getColStatisticsFromExpression(conf, parentStats, child);
+  if (cs == null) {
+break;
+  }
+  csList.add(cs);
+}
+if(csList.size() == engfd.getChildren().size()) {
+  Optional res = se.get().estimate(csList);
+  if (res.isPresent()) {
 
 Review comment:
   When could this happen? Maybe when the column stats contain something 
unexpected? Could we add a comment clarifying it?
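
To illustrate the two places where the flow in the diff above can come back empty, here is a simplified sketch. The `Estimator` interface and the use of plain `Double` values in place of column statistics are assumptions made for brevity; they are not the patch's types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class EstimatorFlowSketch {

  // Hypothetical estimator shape: it may decline by returning Optional.empty().
  interface Estimator {
    Optional<Double> estimate(List<Double> childStats);
  }

  // childStats uses null for a child whose statistics could not be computed.
  static Optional<Double> estimateOrEmpty(Estimator estimator, List<Double> childStats) {
    List<Double> collected = new ArrayList<>();
    for (Double cs : childStats) {
      if (cs == null) {
        // Case 1: a child has no stats, so the estimator is never called.
        return Optional.empty();
      }
      collected.add(cs);
    }
    // Case 2: all inputs are present, but the estimator itself may decline.
    return estimator.estimate(collected);
  }

  public static void main(String[] args) {
    // Example estimator that declines when it receives no arguments.
    Estimator first = stats ->
        stats.isEmpty() ? Optional.<Double>empty() : Optional.of(stats.get(0));
    System.out.println(estimateOrEmpty(first, Arrays.asList(3.0, 1.0)));
    System.out.println(estimateOrEmpty(first, Arrays.asList(3.0, null)));
  }
}
```

In both empty cases the caller would presumably fall back to the existing default estimation, which is the behavior a clarifying comment could spell out.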
 



Issue Time Tracking
---

Worklog Id: (was: 393066)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393071
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384202325
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/estimator/StatEstimators.java
 ##
 @@ -0,0 +1,51 @@
+package org.apache.hadoop.hive.ql.stats.estimator;
+
+import java.util.Optional;
+
+import org.apache.hadoop.hive.ql.plan.ColStatistics;
+
+public class StatEstimators {
+
+  public static class WorstStatCombiner {
 
 Review comment:
   Should this implement an interface? (I do not have a strong opinion, just a 
comment)
   
   Nit: can the name be changed, e.g. to DefaultCombiner? Then you can add in 
the comments that it considers the max upper bounds for the properties, etc.
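
For context, a combiner that keeps the max upper bound of each property could look roughly like the sketch below. The `Stats` and `Combiner` classes are hypothetical and only cover three properties; the real class in the patch may track more and may be structured differently.

```java
import java.util.Optional;

public class WorstCaseCombinerSketch {

  // Hypothetical, immutable stand-in for per-column statistics.
  static final class Stats {
    final double avgColLen;
    final long countDistinct;
    final long numNulls;

    Stats(double avgColLen, long countDistinct, long numNulls) {
      this.avgColLen = avgColLen;
      this.countDistinct = countDistinct;
      this.numNulls = numNulls;
    }
  }

  // Keeps, per property, the largest value seen so far, so the combined
  // result is an upper bound for every input that was added.
  static final class Combiner {
    private Stats result;

    void add(Stats s) {
      if (result == null) {
        result = s;
        return;
      }
      result = new Stats(
          Math.max(result.avgColLen, s.avgColLen),
          Math.max(result.countDistinct, s.countDistinct),
          Math.max(result.numNulls, s.numNulls));
    }

    // Empty when nothing was added.
    Optional<Stats> getResult() {
      return Optional.ofNullable(result);
    }
  }

  public static void main(String[] args) {
    Combiner combiner = new Combiner();
    combiner.add(new Stats(4.0, 10, 0));
    combiner.add(new Stats(12.5, 3, 7));
    Stats worst = combiner.getResult().get();
    System.out.println(worst.avgColLen + " " + worst.countDistinct + " " + worst.numNulls);
  }
}
```

A combiner of this kind fits functions whose output is one of their arguments, where any branch could supply the result.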
 



Issue Time Tracking
---

Worklog Id: (was: 393071)
Time Spent: 0.5h  (was: 20m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=393064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-393064
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 26/Feb/20 00:09
Start Date: 26/Feb/20 00:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915#discussion_r384201744
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/stats/estimator/StatEstimators.java
 ##
 @@ -0,0 +1,51 @@
+package org.apache.hadoop.hive.ql.stats.estimator;
+
+import java.util.Optional;
+
+import org.apache.hadoop.hive.ql.plan.ColStatistics;
+
+public class StatEstimators {
 
 Review comment:
   Could we add comments?
 



Issue Time Tracking
---

Worklog Id: (was: 393064)
Time Spent: 20m  (was: 10m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> properties of the column. However, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimates for frequently used UDFs.
> For example, consider {{substr(c,1,1)}}: no matter what the input is, the 
> output is a string of length at most 1.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22827) Update Flatbuffer version

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045027#comment-17045027
 ] 

Hive QA commented on HIVE-22827:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 12m 17s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20829/dev-support/hive-personality.sh |
| git revision | master / 0280984 |
| Default Java | 1.8.0_111 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-20829/yetus/patch-asflicense-problems.txt |
| modules | C: serde U: serde |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20829/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Update Flatbuffer version
> -
>
> Key: HIVE-22827
> URL: https://issues.apache.org/jira/browse/HIVE-22827
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22827.99.patch, HIVE-22827.patch
>
>
> Hive currently uses Flatbuffer 1.2.0. Other Apache projects use a more 
> up-to-date version, e.g. 1.6.0.1. Upgrade to that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045018#comment-17045018
 ] 

Ramesh Kumar Thangarajan commented on HIVE-22920:
-

[~ashutoshc], can you please help review this? I was able to get a green run.

> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22920.1.patch, HIVE-22920.2.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17045016#comment-17045016
 ] 

Hive QA commented on HIVE-22920:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994573/HIVE-22920.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18074 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20828/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20828/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20828/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994573 - PreCommit-HIVE-Build

> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22920.1.patch, HIVE-22920.2.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22829) Decimal64: NVL in vectorization miss NPE with CBO on

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22829:

Attachment: HIVE-22829.3.patch
Status: Patch Available  (was: Open)

> Decimal64: NVL in vectorization miss NPE with CBO on
> 
>
> Key: HIVE-22829
> URL: https://issues.apache.org/jira/browse/HIVE-22829
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal Vijayaraghavan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22829.3.patch
>
>
> {code}
> select  
> sum(NVL(ss_sales_price, 1.0BD))
> from store_sales where ss_sold_date_sk %  = 1;
> {code}
> {code}
> | notVectorizedReason: exception: 
> java.lang.NullPointerException stack trace: 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4754),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4687),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeSelectOperator(Vectorizer.java:4669),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperator(Vectorizer.java:5269),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChild(Vectorizer.java:977),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChildren(Vectorizer.java:864),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperatorTree(Vectorizer.java:834),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.access$2500(Vectorizer.java:245),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2103),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2055),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(Vectorizer.java:2030),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:1185),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:1017),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180), 
> ... |
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22829) Decimal64: NVL in vectorization miss NPE with CBO on

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22829:

Attachment: (was: HIVE-22829.3.patch)

> Decimal64: NVL in vectorization miss NPE with CBO on
> 
>
> Key: HIVE-22829
> URL: https://issues.apache.org/jira/browse/HIVE-22829
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal Vijayaraghavan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22829.3.patch
>
>
> {code}
> select  
> sum(NVL(ss_sales_price, 1.0BD))
> from store_sales where ss_sold_date_sk %  = 1;
> {code}
> {code}
> | notVectorizedReason: exception: 
> java.lang.NullPointerException stack trace: 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4754),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4687),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeSelectOperator(Vectorizer.java:4669),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperator(Vectorizer.java:5269),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChild(Vectorizer.java:977),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChildren(Vectorizer.java:864),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperatorTree(Vectorizer.java:834),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.access$2500(Vectorizer.java:245),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2103),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2055),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(Vectorizer.java:2030),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:1185),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:1017),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180), 
> ... |
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22829) Decimal64: NVL in vectorization miss NPE with CBO on

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22829:

Status: Open  (was: Patch Available)

> Decimal64: NVL in vectorization miss NPE with CBO on
> 
>
> Key: HIVE-22829
> URL: https://issues.apache.org/jira/browse/HIVE-22829
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
> Reporter: Gopal Vijayaraghavan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22829.3.patch
>
>
> {code}
> select  
> sum(NVL(ss_sales_price, 1.0BD))
> from store_sales where ss_sold_date_sk %  = 1;
> {code}
> {code}
> | notVectorizedReason: exception: 
> java.lang.NullPointerException stack trace: 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4754),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.fixDecimalDataTypePhysicalVariations(Vectorizer.java:4687),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.vectorizeSelectOperator(Vectorizer.java:4669),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperator(Vectorizer.java:5269),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChild(Vectorizer.java:977),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.doProcessChildren(Vectorizer.java:864),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.validateAndVectorizeOperatorTree(Vectorizer.java:834),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.access$2500(Vectorizer.java:245),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2103),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapOperators(Vectorizer.java:2055),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.validateAndVectorizeMapWork(Vectorizer.java:2030),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:1185),
>  
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:1017),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111),
>  
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180), 
> ... |
> {code}





[jira] [Commented] (HIVE-22840) Race condition in formatters of TimestampColumnVector and DateColumnVector

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


[jira] [Commented] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044990#comment-17044990
 ] 

Hive QA commented on HIVE-22920:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
50s{color} | {color:blue} standalone-metastore/metastore-common in master has 
35 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
36s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
48s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 28s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20828/dev-support/hive-personality.sh
 |
| git revision | master / 2a35bbc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20828/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-common common ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20828/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: HIVE-22920.1.patch, HIVE-22920.2.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list





[jira] [Updated] (HIVE-22891) Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22891:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks [~srahman] for your contribution and [~szita] for 
reviewing!

> Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode
> --
>
> Key: HIVE-22891
> URL: https://issues.apache.org/jira/browse/HIVE-22891
> Project: Hive
>  Issue Type: Task
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22891.01.patch, HIVE-22891.02.patch, 
> HIVE-22891.03.patch
>
>
> {code:java}
> try {
>   // TODO: refactor this out
>   if (pathToPartInfo == null) {
>     MapWork mrwork;
>     if (HiveConf.getVar(conf,
>         HiveConf.ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")) {
>       mrwork = (MapWork) Utilities.getMergeWork(jobConf);
>       if (mrwork == null) {
>         mrwork = Utilities.getMapWork(jobConf);
>       }
>     } else {
>       mrwork = Utilities.getMapWork(jobConf);
>     }
>     pathToPartInfo = mrwork.getPathToPartitionInfo();
>   }
>   PartitionDesc part = extractSinglePartSpec(hsplit);
>   inputFormat = HiveInputFormat.wrapForLlap(inputFormat, jobConf, part);
> } catch (HiveException e) {
>   throw new IOException(e);
> }
> {code}
> The above piece of code in CombineHiveRecordReader.java was introduced in 
> HIVE-15147. It overwrites inputFormat based on the PartitionDesc, which is 
> not required in non-LLAP execution mode because 
> HiveInputFormat.wrapForLlap() simply returns the previously defined 
> inputFormat in that case. The call to extractSinglePartSpec() has serious 
> performance implications: when there are a large number of small files, each 
> call to extractSinglePartSpec() takes approximately 2 to 3 seconds. As a 
> result, the same query runs much faster on Hive 1.x / Hive 2 than on the 
> latest Hive.
> {code:java}
> 2020-02-11 07:15:04,701 INFO [main] 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl: Reading ORC rows from 
> 2020-02-11 07:15:06,468 WARN [main] 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader: Multiple partitions 
> found; not going to pass a part spec to LLAP IO: {{logdate=2020-02-03, 
> hour=01, event=win}} and {{logdate=2020-02-03, hour=02, event=act}}
> 2020-02-11 07:15:06,468 INFO [main] 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader: succeeded in getting 
> org.apache.hadoop.mapred.FileSplit{code}
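
The skip described above can be illustrated with a minimal, self-contained sketch. It is not the attached patch: the class LlapGuardSketch and its methods are hypothetical stand-ins, with extractPartSpec() playing the role of the slow extractSinglePartSpec() call and wrap() mimicking the behaviour the description attributes to HiveInputFormat.wrapForLlap() (returning its input unchanged outside LLAP). The only point is that the expensive lookup can be guarded by the execution mode.

```java
// Hypothetical stand-ins for illustration only; none of these are Hive classes.
class LlapGuardSketch {
  // Counts how often the expensive lookup runs, so the effect of the guard is visible.
  static int expensiveCalls = 0;

  // Stand-in for extractSinglePartSpec(hsplit), assumed to be slow.
  static String extractPartSpec() {
    expensiveCalls++;
    return "logdate=2020-02-03/hour=01";
  }

  // Stand-in for wrapForLlap(): per the description, a no-op outside LLAP mode.
  static String wrap(String inputFormat, boolean llapMode, String part) {
    return llapMode ? "llap(" + inputFormat + ", " + part + ")" : inputFormat;
  }

  // Only compute the partition spec when the wrapper would actually use it.
  static String resolveInputFormat(String inputFormat, boolean llapMode) {
    if (!llapMode) {
      return inputFormat; // skip the expensive extraction entirely
    }
    return wrap(inputFormat, true, extractPartSpec());
  }

  public static void main(String[] args) {
    System.out.println(resolveInputFormat("orc", false));
    System.out.println(resolveInputFormat("orc", true));
    System.out.println("expensive calls: " + expensiveCalls);
  }
}
```

In this sketch the non-LLAP path never touches the partition lookup, which is the saving the description is after when many small files are combined.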





[jira] [Updated] (HIVE-22891) Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22891:
---
Summary: Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP 
Execution Mode  (was: Skip PartitonDesc Extraction In CombineHiveRecord For 
Non-LLAP Execution Mode)

> Skip PartitionDesc Extraction In CombineHiveRecord For Non-LLAP Execution Mode
> --
>
> Key: HIVE-22891
> URL: https://issues.apache.org/jira/browse/HIVE-22891
> Project: Hive
>  Issue Type: Task
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22891.01.patch, HIVE-22891.02.patch, 
> HIVE-22891.03.patch
>
>
> {code:java}
> try {
>   // TODO: refactor this out
>   if (pathToPartInfo == null) {
>     MapWork mrwork;
>     if (HiveConf.getVar(conf,
>         HiveConf.ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")) {
>       mrwork = (MapWork) Utilities.getMergeWork(jobConf);
>       if (mrwork == null) {
>         mrwork = Utilities.getMapWork(jobConf);
>       }
>     } else {
>       mrwork = Utilities.getMapWork(jobConf);
>     }
>     pathToPartInfo = mrwork.getPathToPartitionInfo();
>   }
>   PartitionDesc part = extractSinglePartSpec(hsplit);
>   inputFormat = HiveInputFormat.wrapForLlap(inputFormat, jobConf, part);
> } catch (HiveException e) {
>   throw new IOException(e);
> }
> {code}
> The above piece of code in CombineHiveRecordReader.java was introduced in 
> HIVE-15147. It overwrites inputFormat based on the PartitionDesc, which is 
> not required in non-LLAP execution mode because 
> HiveInputFormat.wrapForLlap() simply returns the previously defined 
> inputFormat in that case. The call to extractSinglePartSpec() has serious 
> performance implications: when there are a large number of small files, each 
> call to extractSinglePartSpec() takes approximately 2 to 3 seconds. As a 
> result, the same query runs much faster on Hive 1.x / Hive 2 than on the 
> latest Hive.
> {code:java}
> 2020-02-11 07:15:04,701 INFO [main] 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl: Reading ORC rows from 
> 2020-02-11 07:15:06,468 WARN [main] 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader: Multiple partitions 
> found; not going to pass a part spec to LLAP IO: {{logdate=2020-02-03, 
> hour=01, event=win}} and {{logdate=2020-02-03, hour=02, event=act}}
> 2020-02-11 07:15:06,468 INFO [main] 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader: succeeded in getting 
> org.apache.hadoop.mapred.FileSplit{code}





[jira] [Commented] (HIVE-22927) LLAP should filter guaranteed tasks before killing in node heartbeat

2020-02-25 Thread Prasanth Jayachandran (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044971#comment-17044971
 ] 

Prasanth Jayachandran commented on HIVE-22927:
--

Instead of killing only the errored-out task attempts, the original patch 
killed all attempts from a pinging node. This patch catches only the errored 
attempts and issues a kill on them. 

+1, lgtm. Thanks for fixing!

> LLAP should filter guaranteed tasks before killing in node heartbeat 
> -
>
> Key: HIVE-22927
> URL: https://issues.apache.org/jira/browse/HIVE-22927
> Project: Hive
>  Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Minor
> Attachments: HIVE-22927.1.patch
>
>






[jira] [Assigned] (HIVE-22927) LLAP should filter guaranteed tasks before killing in node heartbeat

2020-02-25 Thread Prasanth Jayachandran (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-22927:


Assignee: Rajesh Balamohan

> LLAP should filter guaranteed tasks before killing in node heartbeat 
> -
>
> Key: HIVE-22927
> URL: https://issues.apache.org/jira/browse/HIVE-22927
> Project: Hive
>  Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Priority: Minor
> Attachments: HIVE-22927.1.patch
>
>






[jira] [Commented] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044948#comment-17044948
 ] 

Hive QA commented on HIVE-22832:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994558/HIVE-22832.6.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_subquery] 
(batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_transactional_full_acid]
 (batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_acid_no_masking] 
(batchId=27)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[mm_all] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_no_buckets]
 (batchId=185)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_missing_cols]
 (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[change_allowincompatible_vectorization_false_date]
 (batchId=187)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dp_counter_mm]
 (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_only_empty_query]
 (batchId=188)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_overwrite]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part]
 (batchId=173)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_llap_io]
 (batchId=191)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_llap_io]
 (batchId=166)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
 (batchId=189)
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeCase (batchId=361)
org.apache.hadoop.hive.ql.TestTxnCommands.testMergeDeleteUpdate (batchId=361)
org.apache.hadoop.hive.ql.TestTxnCommands.testQuotedIdentifier (batchId=361)
org.apache.hadoop.hive.ql.TestTxnCommands.testQuotedIdentifier2 (batchId=361)
org.apache.hadoop.hive.ql.TestTxnCommands2.testMerge (batchId=342)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testMerge
 (batchId=356)
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testMergeCase
 (batchId=342)
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testMergeDeleteUpdate
 (batchId=342)
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testQuotedIdentifier
 (batchId=342)
org.apache.hadoop.hive.ql.TestTxnCommandsWithSplitUpdateAndVectorization.testQuotedIdentifier2
 (batchId=342)
org.apache.hadoop.hive.ql.TestTxnLoadData.testValidations (batchId=317)
org.apache.hadoop.hive.ql.parse.TestStatsReplicationScenariosACID.testForParallelBootstrapLoad
 (batchId=260)
org.apache.hadoop.hive.ql.parse.TestStatsReplicationScenariosACID.testMetadataOnlyDump
 (batchId=260)
org.apache.hadoop.hive.ql.parse.TestStatsReplicationScenariosACID.testNonParallelBootstrapLoad
 (batchId=260)
org.apache.hadoop.hive.ql.parse.TestStatsReplicationScenariosACID.testRetryFailure
 (batchId=260)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20827/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20827/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20827/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994558 - PreCommit-HIVE-Build

> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch, HIVE-22832.6.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 

[jira] [Commented] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044898#comment-17044898
 ] 

Hive QA commented on HIVE-22832:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
1s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 5 new + 104 unchanged - 2 
fixed = 109 total (was 106) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20827/dev-support/hive-personality.sh
 |
| git revision | master / 2a35bbc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20827/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20827/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20827/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
> Reporter: Marton Bod
> Assignee: Marton Bod
> Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch, HIVE-22832.6.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 
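
As a rough illustration of the idea in the description, the sketch below fans a per-directory clean-up out to a fixed thread pool and waits for every task to finish. It is an assumption-laden example, not the attached patch: ParallelCleanupSketch, cleanAll and the cleanOne callback are hypothetical names, and the real cleanDirectInsertDirectories signature is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not a Hive class.
class ParallelCleanupSketch {
  // Runs cleanOne for every directory on a fixed-size pool and waits for all tasks.
  static void cleanAll(List<String> dirs, Consumer<String> cleanOne, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (String dir : dirs) {
        pending.add(pool.submit(() -> cleanOne.accept(dir)));
      }
      for (Future<?> f : pending) {
        f.get(); // blocks until the task is done; surfaces any task failure
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while cleaning directories", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("directory clean-up failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> dirs = Arrays.asList("part=1", "part=2", "part=3");
    cleanAll(dirs, d -> System.out.println("cleaned " + d), 2);
  }
}
```

Waiting on each Future keeps the caller's contract the same as the sequential loop: the method returns only after every directory has been processed, and a failure in any task is reported rather than silently dropped.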





[jira] [Commented] (HIVE-22527) Hive on Tez : Job of merging samll files will be submitted into another queue (default queue)

2020-02-25 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044841#comment-17044841
 ] 

Naveen Gangam commented on HIVE-22527:
--

[~zhangbutao] Could you please rebase the patch and attach a new patch for 
master so we could get this thru? Thanks

> Hive on Tez : Job of merging samll files will be submitted into another queue 
> (default queue)
> -
>
> Key: HIVE-22527
> URL: https://issues.apache.org/jira/browse/HIVE-22527
> Project: Hive
>  Issue Type: Bug
> Affects Versions: 3.1.0, 3.1.1
> Reporter: zhangbutao
> Assignee: zhangbutao
> Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-22527-branch-3.1.0.patch, explain with merge 
> files.png, file merge job.png, hive logs.png
>
>
> Hive on Tez. We enable the small file merge configuration with set 
> *hive.merge.tezfiles=true*, so another job is launched to merge files after 
> the SQL job. However, the merge file job is submitted to another YARN queue, 
> not the queue of the current beeline client session. It seems that the merge 
> file job starts a new Tez session with a new conf that differs from the 
> current session conf, which sends the merge file job to the default queue.
>  
> Attachment *hive logs.png* shows that the current session queue is 
> *root.bdoc.production* ( String queueName = session.getQueueName();) while 
> the incoming queue name is *null* ( String confQueueName = 
> conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we log in to the same 
> beeline client with *set tez.queue.name=root.bdoc.production*, and all 
> jobs, including the file merge job, should be submitted to the same queue.
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
>  
> Attachment *explain with merge files.png* shows that stage-4 is an 
> individual merge file job that is submitted to another YARN queue (the 
> default queue), not the queue root.bdoc.production.
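
One way to picture the mismatch reported above is a configuration for the merge job that carries no queue name at all. The sketch below is purely illustrative and is not how Hive or the attached patch handles it: QueueInheritSketch and withSessionQueue are hypothetical, a plain Map stands in for the job configuration, and the only real name used is the tez.queue.name key quoted in the report. It shows a fallback in which an unset queue inherits the session's queue instead of ending up in the default one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; a Map stands in for the job conf.
class QueueInheritSketch {
  static final String QUEUE_KEY = "tez.queue.name";

  // Returns a copy of jobConf that falls back to the session's queue when none is set.
  static Map<String, String> withSessionQueue(Map<String, String> jobConf, String sessionQueue) {
    Map<String, String> merged = new HashMap<>(jobConf);
    if (merged.get(QUEUE_KEY) == null && sessionQueue != null) {
      merged.put(QUEUE_KEY, sessionQueue);
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> mergeJobConf = new HashMap<>(); // queue not set, as in the report
    Map<String, String> fixed = withSessionQueue(mergeJobConf, "root.bdoc.production");
    System.out.println(fixed.get(QUEUE_KEY));
  }
}
```

An explicitly configured queue is left untouched in this sketch, so only the "incoming queue name is null" case from the report changes behaviour.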



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044835#comment-17044835
 ] 

Hive QA commented on HIVE-22925:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994561/HIVE-22925.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1447 failed/errored test(s), 18051 tests 
executed
*Failed tests:*
{noformat}
TestJdbcWithMiniLlapArrow - did not produce a TEST-*.xml file (likely timed 
out) (batchId=290)
org.apache.hadoop.hive.cli.TestKuduCliDriver.testCliDriver[kudu_complex_queries]
 (batchId=296)
org.apache.hadoop.hive.cli.TestKuduCliDriver.testCliDriver[kudu_queries] 
(batchId=296)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_materialized_view_rewrite_ssb]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timeseries]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_timestamptz2]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druid_topn] 
(batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_dynamic_partition]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_extractTime]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_floorTime]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_mv] 
(batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_semijoin_reduction_all_types]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_alter]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_ts]
 (batchId=204)
org.apache.hadoop.hive.cli.TestMiniDruidKafkaCliDriver.testCliDriver[druidkafkamini_basic]
 (batchId=305)
org.apache.hadoop.hive.cli.TestMiniHiveKafkaCliDriver.testCliDriver[kafka_storage_handler]
 (batchId=305)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
 (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[alter_table_location2]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[alter_table_location3]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[bucket5] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[bucket6] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cte_2] 
(batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cte_4] 
(batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[cttl] 
(batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_semijoin_user_level]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynpart_cast] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[empty_dir_in_table]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[except_distinct] 
(batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[external_table_purge]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[external_table_with_space_in_location_path]
 (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[file_with_header_footer]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[file_with_header_footer_aggregation]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[global_limit] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[import_exported_table]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[insert_into1] 
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[insert_into2] 
(batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_all] 
(batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_distinct]
 (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[intersect_merge] 
(batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_nullscan] 
(batchId=159)

[jira] [Commented] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044789#comment-17044789
 ] 

Hive QA commented on HIVE-22925:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
2s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
56s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} common: The patch generated 3 new + 371 unchanged - 0 
fixed = 374 total (was 371) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 20 new + 44 unchanged - 1 
fixed = 64 total (was 45) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
58s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 
total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20826/dev-support/hive-personality.sh
 |
| git revision | master / 2a35bbc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20826/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20826/yetus/diff-checkstyle-ql.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20826/yetus/diff-javadoc-javadoc-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20826/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20826/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient way and adds 
> extra CPU overhead. For 

[jira] [Updated] (HIVE-22827) Update Flatbuffer version

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22827:
---
Status: Patch Available  (was: Reopened)

> Update Flatbuffer version
> -
>
> Key: HIVE-22827
> URL: https://issues.apache.org/jira/browse/HIVE-22827
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22827.99.patch, HIVE-22827.patch
>
>
> Hive currently uses Flatbuffer 1.2.0. Other Apache projects use a more 
> up-to-date version, e.g. 1.6.0.1. Upgrade to that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22827) Update Flatbuffer version

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-22827:
---
Attachment: HIVE-22827.99.patch

> Update Flatbuffer version
> -
>
> Key: HIVE-22827
> URL: https://issues.apache.org/jira/browse/HIVE-22827
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22827.99.patch, HIVE-22827.patch
>
>
> Hive currently uses Flatbuffer 1.2.0. Other Apache projects use a more 
> up-to-date version, e.g. 1.6.0.1. Upgrade to that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HIVE-22827) Update Flatbuffer version

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reopened HIVE-22827:


> Update Flatbuffer version
> -
>
> Key: HIVE-22827
> URL: https://issues.apache.org/jira/browse/HIVE-22827
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22827.patch
>
>
> Hive currently uses Flatbuffer 1.2.0. Other Apache projects use a more 
> up-to-date version, e.g. 1.6.0.1. Upgrade to that version.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21487) COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044764#comment-17044764
 ] 

Hive QA commented on HIVE-21487:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994557/HIVE-21487.06.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1 (batchId=281)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20825/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20825/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20825/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994557 - PreCommit-HIVE-Build

> COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes
> 
>
> Key: HIVE-21487
> URL: https://issues.apache.org/jira/browse/HIVE-21487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-21487.03.patch, HIVE-21487.04.patch, 
> HIVE-21487.05.patch, HIVE-21487.06.patch, HIVE-21847.01.patch, 
> HIVE-21847.02.patch
>
>
> Looking at a MySQL install where HMS is pointed on Hive 3.1, I see a constant 
> stream of queries of the form:
> {code}
> select CC_STATE from COMPLETED_COMPACTIONS where CC_DATABASE = 
> 'tpcds_orc_exact_1000' and CC_TABLE = 'catalog_returns' and CC_PARTITION = 
> 'cr_returned_date_sk=2452851' and CC_STATE != 'a' order by CC_ID desc;
> {code}
> but the COMPLETED_COMPACTIONS table has no index. In this case it's resulting 
> in a full table scan over 115k rows, which takes around 100ms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread Jesus Camacho Rodriguez (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044759#comment-17044759
 ] 

Jesus Camacho Rodriguez commented on HIVE-22824:


A couple of comments:
- Can we add a test case for the issue? Currently the following seems to fail:
{code}
CREATE TABLE table1 (a INT, b INT);
INSERT INTO table1 VALUES (1, 2), (1, 2), (1, 2), (1, 2);

EXPLAIN CBO
SELECT sub1.r FROM
(
SELECT
RANK() OVER (ORDER BY t1.b desc) as r
FROM table1 t1
JOIN table1 t2 ON t1.a = t2.b
) sub1
LEFT OUTER JOIN table1 t3
ON sub1.r = t3.a;
{code}
- Can we create a Calcite issue and contribute the fix? iiuc, this issue can 
lead to incorrect rewriting, and thus, results. Further, please create a Hive 
issue / leave a note to remove the logic in Hive's {{JoinProjectTransposeRule}} 
once we upgrade?

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating plan with windowing expression 
> within join condition which hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-21487) COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044747#comment-17044747
 ] 

Hive QA commented on HIVE-21487:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
1s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
35s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
22s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
14s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 55s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20825/dev-support/hive-personality.sh
 |
| git revision | master / 2a35bbc |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20825/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-server . U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20825/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes
> 
>
> Key: HIVE-21487
> URL: https://issues.apache.org/jira/browse/HIVE-21487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-21487.03.patch, HIVE-21487.04.patch, 
> HIVE-21487.05.patch, HIVE-21487.06.patch, HIVE-21847.01.patch, 
> HIVE-21847.02.patch
>
>
> Looking at a MySQL install where HMS is pointed on Hive 3.1, I see a constant 
> stream of queries of the form:
> {code}
> select CC_STATE from COMPLETED_COMPACTIONS where CC_DATABASE = 
> 'tpcds_orc_exact_1000' and CC_TABLE = 'catalog_returns' and CC_PARTITION = 
> 'cr_returned_date_sk=2452851' and CC_STATE != 'a' order by CC_ID desc;
> {code}
> but the COMPLETED_COMPACTIONS table has no index. In this case it's resulting 
> in a full table scan over 115k rows, which takes around 100ms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-22910) CBO fails when subquery with rank left joined

2020-02-25 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-22910.
---
Resolution: Duplicate

HIVE-22824

> CBO fails when subquery with rank left joined
> -
>
> Key: HIVE-22910
> URL: https://issues.apache.org/jira/browse/HIVE-22910
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> *Repro*
> {code}
> CREATE TABLE table1(a int, b int);
> ANALYZE TABLE table1 COMPUTE STATISTICS FOR COLUMNS;
> EXPLAIN CBO
> SELECT sub1.r FROM
> (
> SELECT
> RANK() OVER (ORDER BY t1.b desc) as r
> FROM table1 t1
> JOIN table1 t2 ON t1.a = t2.b
> ) sub1
> LEFT OUTER JOIN table1 t3
> ON sub1.r = t3.a;
> {code}
> {code}
> See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, 
> or check ./ql/target/surefire-reports or 
> ./itests/qtest/target/surefire-reports/ for specific test cases logs.
>  org.apache.hadoop.hive.ql.parse.SemanticException: Line 0:-1 Invalid column 
> reference 'b': (possible column names are: $hdt$_0.a, $hdt$_0.b, $hdt$_1.b)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:13089)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13031)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12999)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinKeys(SemanticAnalyzer.java:9248)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinOperator(SemanticAnalyzer.java:9409)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genJoinPlan(SemanticAnalyzer.java:9624)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11781)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11661)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:534)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12547)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:361)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
>   at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:219)
>   at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:103)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:183)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:594)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:540)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:534)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:249)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:415)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:346)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:709)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:679)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:169)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
>   at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> 

[jira] [Updated] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22920:

Attachment: HIVE-22920.2.patch
Status: Patch Available  (was: Open)

> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22920.1.patch, HIVE-22920.2.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22920:

Attachment: (was: HIVE-22920.2.patch)

> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22920.1.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22920) Add row format OpenCSVSerde to the metastore column managed list

2020-02-25 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22920:

Status: Open  (was: Patch Available)

> Add row format OpenCSVSerde to the metastore column managed list
> 
>
> Key: HIVE-22920
> URL: https://issues.apache.org/jira/browse/HIVE-22920
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Attachments: HIVE-22920.1.patch
>
>
> Add row format OpenCSVSerde to the metastore column managed list



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?focusedWorklogId=392747=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392747
 ]

ASF GitHub Bot logged work on HIVE-22824:
-

Author: ASF GitHub Bot
Created on: 25/Feb/20 18:08
Start Date: 25/Feb/20 18:08
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on pull request #897: 
HIVE-22824: JoinProjectTranspose rule should skip Projects containing…
URL: https://github.com/apache/hive/pull/897#discussion_r384034467
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
 ##
 @@ -487,7 +483,7 @@ Operator genOPTree(PlannerContext plannerCtx) throws 
SemanticException {
 ASTNode newAST = getOptimizedAST(newPlan);
 
 // 1.1. Fix up the query for insert/ctas/materialized views
-newAST = fixUpAfterCbo(this.getAST(), newAST, cboCtx);
 
 Review comment:
   @kgyrtkirk The original issue for which HIVE-22578 was opened is being fixed 
by HIVE-22824 (this pull request's change). CBO path was failing because 
JoinProjectTranspose rule was removing project containing windowing (creating 
wrong AST).
   Falling back to the non-CBO path should happen only for queries for which CBO isn't 
supported (and that will happen before fixUpAfterCbo). So I believe it is okay 
to change the AST at this point.
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392747)
Time Spent: 0.5h  (was: 20m)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating plan with windowing expression 
> within join condition which hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044703#comment-17044703
 ] 

Hive QA commented on HIVE-22819:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994555/HIVE-22819.5.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 18044 tests 
executed
*Failed tests:*
{noformat}
TestMiniLlapCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=161)

[unionDistinct_1.q,table_nonprintable.q,file_with_header_footer_aggregation.q,orc_llap_counters1.q,mm_cttas.q,whroot_external1.q,global_limit.q,rcfile_createas1.q,dynamic_partition_pruning_2.q,intersect_merge.q,parquet_struct_type_vectorization.q,results_cache_diff_fs.q,parallel_colstats.q,load_hdfs_file_with_space_in_the_name.q,orc_merge3.q]
org.apache.hadoop.hive.metastore.TestMetastoreHousekeepingLeaderEmptyConfig.testHouseKeepingThreadExistence
 (batchId=251)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20824/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20824/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20824/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994555 - PreCommit-HIVE-Build

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 
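The "list directly and handle the missing-directory case" idea from the description above can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's java.nio.file API on a local file system as a stand-in, not the Hadoop FileSystem API that Hive actually calls, and the class and method names (DirectListingDemo, listFiles) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DirectListingDemo {
    // Hypothetical helper: issue one listing call instead of exists() + isDir() + list().
    // A missing path or a non-directory is treated as "no files" rather than probed up front.
    static List<Path> listFiles(Path dir) {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        } catch (NoSuchFileException | NotDirectoryException e) {
            // The directory does not exist (or is not a directory): nothing to list.
            return Collections.emptyList();
        } catch (IOException e) {
            // Any other I/O failure is surfaced to the caller unchanged in meaning.
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        // The path below is an arbitrary example that is assumed not to exist.
        System.out.println(listFiles(Paths.get("/no/such/dir/for/this/demo")).size());
    }
}
```

On an object store the saving would come from replacing three remote round trips with one; the local-file-system version above only shows the control flow, not the performance effect.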



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22453) Describe table unnecessarily fetches partitions

2020-02-25 Thread Vineet Garg (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044699#comment-17044699
 ] 

Vineet Garg commented on HIVE-22453:


[~touchida] For some reason I am unable to apply the patch on upstream master. 
Can you rebase and re-upload?

> Describe table unnecessarily fetches partitions
> ---
>
> Key: HIVE-22453
> URL: https://issues.apache.org/jira/browse/HIVE-22453
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 2.3.6
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Attachments: HIVE-22453.2.patch, HIVE-22453.2.patch, 
> HIVE-22453.3.patch, HIVE-22453.patch
>
>
> The simple describe table command without EXTENDED and FORMATTED (i.e., 
> DESCRIBE table_name) fetches all partitions when no partition is specified, 
> although it does not display partition statistics in nature.
> The command should not fetch partitions since it can take a long time for a 
> large number of partitions.
> For instance, in our environment, the command takes around 8 seconds for a 
> table with 8760 (24 * 365) partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22771) Partition location incorrectly formed in FileOutputCommitterContainer

2020-02-25 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-22771:
-
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been pushed to master. Thank you for the contribution [~shivam-mohan]

> Partition location incorrectly formed in FileOutputCommitterContainer
> -
>
> Key: HIVE-22771
> URL: https://issues.apache.org/jira/browse/HIVE-22771
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Affects Versions: 1.2.1
>Reporter: Shivam
>Assignee: Shivam
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22771.2.patch, HIVE-22771.3.patch, 
> HIVE-22771.4.patch, HIVE-22771.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Class _HCatOutputFormat_ in package _org.apache.hive.hcatalog.mapreduce_ uses 
> function _setOutput_ to generate _idHash_ using below statement:
> *+In file org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java+*
>  *line 116: idHash = String.valueOf(Math.random());*
> The output of idHash can be similar to values like this : 7.145347157239135E-4
>  
> And, in class _FileOutputCommitterContainer_ in package 
> _org.apache.hive.hcatalog.mapreduce;_
> Uses below statement to compute final partition path:
> +*In org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java*+
> *line 366: String finalLocn = jobLocation.replaceAll(Path.SEPARATOR + 
> SCRATCH_DIR_NAME + "\\d\\.?\\d+","");*
> *line 367: partPath = new Path(finalLocn);*
>  
> Regex used here is incorrect, since it will only remove integers after the 
> *SCRATCH_DIR_NAME,* and hence will append  'E-4' (for the above example) in 
> the final partition location. 
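The failure mode described above can be reproduced in isolation. The sketch below makes several assumptions: the pattern is read as digit, optional dot, digits (`\\d\\.?\\d+`), the value "_SCRATCH" is used as a stand-in for SCRATCH_DIR_NAME, "/" stands in for Path.SEPARATOR, and the widened pattern in stripWide is just one possible repair for illustration, not necessarily the one committed for this issue. The class and method names are hypothetical.

```java
public class ScratchDirRegexDemo {
    // Assumed stand-in for HCatalog's scratch directory prefix.
    static final String SCRATCH_DIR_NAME = "_SCRATCH";

    // Pattern as read from the report: one digit, an optional dot, then digits.
    // It stops at the first non-digit, so an exponent suffix such as "E-4" is left behind.
    static String stripNarrow(String location) {
        return location.replaceAll("/" + SCRATCH_DIR_NAME + "\\d\\.?\\d+", "");
    }

    // Illustrative widening: also consume an optional exponent like "E-4".
    static String stripWide(String location) {
        return location.replaceAll("/" + SCRATCH_DIR_NAME + "\\d\\.?\\d+(E-?\\d+)?", "");
    }

    public static void main(String[] args) {
        // idHash value taken from the issue description; the surrounding path is made up.
        String loc = "/warehouse/tbl/_SCRATCH7.145347157239135E-4/part=1";
        System.out.println(stripNarrow(loc)); // prints /warehouse/tblE-4/part=1
        System.out.println(stripWide(loc));   // prints /warehouse/tbl/part=1
    }
}
```

The scientific notation arises because Double.toString switches to exponent form for magnitudes below 10^-3, which Math.random() can return.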



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22889) Trim trailing and leading quotes for HCatCli query processing

2020-02-25 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-22889:
-
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been committed to master. Thank you for the patch [~rameshkumar]

> Trim trailing and leading quotes for HCatCli query processing
> -
>
> Key: HIVE-22889
> URL: https://issues.apache.org/jira/browse/HIVE-22889
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22889.1.patch
>
>
> Trim trailing and leading quotes for HCatCli query processing





[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044652#comment-17044652
 ] 

Hive QA commented on HIVE-22819:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 10m 5s | master passed |
| +1 | compile | 1m 7s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| 0 | findbugs | 4m 6s | ql in master has 1530 extant Findbugs warnings. |
| +1 | javadoc | 0m 57s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 7s | the patch passed |
| +1 | javac | 1m 7s | the patch passed |
| +1 | checkstyle | 0m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 11s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 16s | The patch generated 1 ASF License warnings. |
| | | 26m 21s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20824/dev-support/hive-personality.sh |
| git revision | master / 1046517 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-20824/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20824/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant).
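
The single-call idea described above can be sketched as follows. This is not the Hive code: it uses java.nio.file as a self-contained stand-in for the Hadoop FileSystem API, and the class and method names are hypothetical. The point is only the shape of the logic, where the listing is attempted directly and a missing or non-directory path is handled through the exception rather than through separate exists() and isDir() probes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SingleCallListingSketch {
  // Lists a directory with one call; a missing path or a non-directory yields an empty list.
  static List<Path> listFiles(Path dir) {
    List<Path> files = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
      for (Path entry : stream) {
        files.add(entry);
      }
    } catch (NoSuchFileException | NotDirectoryException e) {
      // Replaces the up-front exists()/isDir() checks.
      return Collections.emptyList();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return files;
  }
}
```

On an object store each avoided probe is a remote request, which is where the saving suggested in the issue would come from.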





[jira] [Commented] (HIVE-20948) Eliminate file rename in compactor

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-20948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044626#comment-17044626
 ] 

Hive QA commented on HIVE-20948:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994554/HIVE-20948.07.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18059 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/20823/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20823/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20823/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994554 - PreCommit-HIVE-Build

> Eliminate file rename in compactor
> --
>
> Key: HIVE-20948
> URL: https://issues.apache.org/jira/browse/HIVE-20948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-20948.01.patch, HIVE-20948.02.patch, 
> HIVE-20948.03.patch, HIVE-20948.04.patch, HIVE-20948.05.patch, 
> HIVE-20948.06.patch, HIVE-20948.07.patch
>
>
> Once HIVE-20823 is committed, we should investigate if it's possible to have 
> compactor write directly to base_x_cZ or delta_x_y_cZ.  
> For query based compaction: can we control location of temp table dir?  We 
> support external temp tables so this may work but we'd need to have non-acid 
> insert create files with {{bucket_x}} names.
>  
> For MR/Tez/LLAP based (should this be done at all?), need to figure out how 
> retries of tasks will work.  Just like we currently generate an MR job to 
> compact, we should be able to generate a Tez job.
>  





[jira] [Commented] (HIVE-22918) Investigate empty bucket file creation for ACID tables

2020-02-25 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044598#comment-17044598
 ] 

Marton Bod commented on HIVE-22918:
---

Based on discussions with [~lpinter], in theory, the lack of empty bucket files 
should pose no problem for compaction either.

> Investigate empty bucket file creation for ACID tables
> --
>
> Key: HIVE-22918
> URL: https://issues.apache.org/jira/browse/HIVE-22918
> Project: Hive
>  Issue Type: Task
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marton Bod
>Priority: Major
>
> When creating an insert-only bucketed table with 5 buckets and inserting 
> only one row into this table, Hive creates empty files for the other 4 buckets. 
> This logic is in the code for ACID tables as well, but when checking the 
> table's final directory after the insert, I found that only 1 file got 
> created. When I debugged this issue, I found that the empty files are created 
> in the staging directory outside the delta directory, therefore they won't 
> get copied by the move task to the final directory. This behavior seems 
> broken, but I am not sure if we really need the empty files in this case.
> This Jira is about investigating whether or not we need these empty files for 
> ACID tables and, if we do, fixing the code to have them for ACID tables as well.
> Repro steps: 
> {noformat}
> create table test_mm(key int, id int) clustered by (key) into 5 buckets 
> stored as orc tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert into test_mm values (1,1);
> {noformat}
> The following files are present in the 'test_mm/delta_001_001_' 
> folder:
> {noformat}
> 244 Feb 21 12:08 00_0
>   0 Feb 21 12:08 01_0
>   0 Feb 21 12:08 02_0
>   0 Feb 21 12:08 03_0
>   0 Feb 21 12:08 04_0
> {noformat}
> {noformat}
> create table test_acid(key int, id int) clustered by (key) into 5 buckets 
> stored as orc tblproperties("transactional"="true");
> insert into test_acid values (1,1);
> {noformat}
> The following files are present in the 'test_acid/delta_001_001_' 
> folder:
> {noformat}
>   1 Feb 21 12:13 _orc_acid_version
> 656 Feb 21 12:13 bucket_0
> {noformat}
> However, when stopping in the MoveTask with the debugger, it can be seen that 
> the staging directory contains the empty files, so they are generated. 
> Note, though, that 00_0 is not a file; it is a directory which contains the 
> delta directory and the data file. When moving the data file to the final 
> location, the move task will only move the files from the delta directory, so 
> the empty files won't be moved.
> {noformat}
> ll 
> test_acid/.hive-staging_hive_2020-02-21_12-16-58_615_787573577176141305-1/-ext-1
>  
> 96 Feb 21 12:17 00_0
> 0 Feb 21 12:17 01_0
> 0 Feb 21 12:17 02_0
> 0 Feb 21 12:17 03_0
> 0 Feb 21 12:17 04_0
> {noformat}
> {noformat}
> ll 
> test_acid/.hive-staging_hive_2020-02-21_12-16-58_615_787573577176141305-1/-ext-1/00_0/delta_001_001_
>  
>   1 Feb 21 12:17 _orc_acid_version
> 656 Feb 21 12:17 bucket_0
> {noformat}





[jira] [Commented] (HIVE-22918) Investigate empty bucket file creation for ACID tables

2020-02-25 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044586#comment-17044586
 ] 

Marton Bod commented on HIVE-22918:
---

Based on my investigation, the empty bucket files are only created if MR is 
used as the execution engine. When using Tez, the empty files are not created - 
not for MM tables and not in the staging directory either during ACID 
(non-direct) inserts.

Additionally, when testing locally using MR as the engine, the empty bucket 
files created for MM tables did not seem to play any role - upon deleting them 
manually, the data could still be read back and compaction worked as well. It 
seems that their creation is most likely a side-effect/bug of how MR works 
under the hood. 

In conclusion, my suggestion would be to:
 # keep the ACID logic as is, i.e. do not add the empty file creation logic to 
ACID, since it seems to be an MR-only phenomenon (which is a deprecated engine 
anyway)
 # investigate whether the empty file creation could be removed from MR as well 
- for users of MR, creating empty files can be expensive for tables 
with many buckets. E.g. when using a table with 1024 buckets, inserting a 
single record will create 1023 empty files every time, slowing down the query 
execution considerably
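
The 1024-bucket figure above can be checked with a small sketch. The bucket assignment here (key modulo bucket count) is a simplifying assumption and not Hive's actual bucketing hash; the class and method names are hypothetical. It only counts how many buckets receive no rows, which is the number of empty files an engine would write if it creates one file per bucket regardless of content.

```java
import java.util.HashSet;
import java.util.Set;

final class EmptyBucketCountSketch {
  // Assumed, simplified bucket assignment: key modulo bucket count (not Hive's real hash).
  static int bucketFor(int key, int numBuckets) {
    return Math.floorMod(key, numBuckets);
  }

  // Number of buckets that receive no rows for the given keys.
  static int emptyBuckets(int[] keys, int numBuckets) {
    Set<Integer> used = new HashSet<>();
    for (int key : keys) {
      used.add(bucketFor(key, numBuckets));
    }
    return numBuckets - used.size();
  }

  public static void main(String[] args) {
    System.out.println(emptyBuckets(new int[] {1}, 1024)); // 1023
    System.out.println(emptyBuckets(new int[] {1}, 5));    // 4
  }
}
```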

> Investigate empty bucket file creation for ACID tables
> --
>
> Key: HIVE-22918
> URL: https://issues.apache.org/jira/browse/HIVE-22918
> Project: Hive
>  Issue Type: Task
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marton Bod
>Priority: Major
>
> When creating an insert-only bucketed table with 5 buckets and inserting 
> only one row into this table, Hive creates empty files for the other 4 buckets. 
> This logic is in the code for ACID tables as well, but when checking the 
> table's final directory after the insert, I found that only 1 file got 
> created. When I debugged this issue, I found that the empty files are created 
> in the staging directory outside the delta directory, therefore they won't 
> get copied by the move task to the final directory. This behavior seems 
> broken, but I am not sure if we really need the empty files in this case.
> This Jira is about investigating whether or not we need these empty files for 
> ACID tables and, if we do, fixing the code to have them for ACID tables as well.
> Repro steps: 
> {noformat}
> create table test_mm(key int, id int) clustered by (key) into 5 buckets 
> stored as orc tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert into test_mm values (1,1);
> {noformat}
> The following files are present in the 'test_mm/delta_001_001_' 
> folder:
> {noformat}
> 244 Feb 21 12:08 00_0
>   0 Feb 21 12:08 01_0
>   0 Feb 21 12:08 02_0
>   0 Feb 21 12:08 03_0
>   0 Feb 21 12:08 04_0
> {noformat}
> {noformat}
> create table test_acid(key int, id int) clustered by (key) into 5 buckets 
> stored as orc tblproperties("transactional"="true");
> insert into test_acid values (1,1);
> {noformat}
> The following files are present in the 'test_acid/delta_001_001_' 
> folder:
> {noformat}
>   1 Feb 21 12:13 _orc_acid_version
> 656 Feb 21 12:13 bucket_0
> {noformat}
> However, when stopping in the MoveTask with the debugger, it can be seen that 
> the staging directory contains the empty files, so they are generated. 
> Note, though, that 00_0 is not a file; it is a directory which contains the 
> delta directory and the data file. When moving the data file to the final 
> location, the move task will only move the files from the delta directory, so 
> the empty files won't be moved.
> {noformat}
> ll 
> test_acid/.hive-staging_hive_2020-02-21_12-16-58_615_787573577176141305-1/-ext-1
>  
> 96 Feb 21 12:17 00_0
> 0 Feb 21 12:17 01_0
> 0 Feb 21 12:17 02_0
> 0 Feb 21 12:17 03_0
> 0 Feb 21 12:17 04_0
> {noformat}
> {noformat}
> ll 
> test_acid/.hive-staging_hive_2020-02-21_12-16-58_615_787573577176141305-1/-ext-1/00_0/delta_001_001_
>  
>   1 Feb 21 12:17 _orc_acid_version
> 656 Feb 21 12:17 bucket_0
> {noformat}





[jira] [Commented] (HIVE-20948) Eliminate file rename in compactor

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-20948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044581#comment-17044581
 ] 

Hive QA commented on HIVE-20948:


| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 44s | master passed |
| +1 | compile | 1m 7s | master passed |
| +1 | checkstyle | 0m 46s | master passed |
| 0 | findbugs | 4m 4s | ql in master has 1530 extant Findbugs warnings. |
| +1 | javadoc | 0m 56s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 5s | the patch passed |
| +1 | javac | 1m 5s | the patch passed |
| +1 | checkstyle | 0m 45s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 21s | the patch passed |
| +1 | javadoc | 0m 57s | the patch passed |
|| || || || Other Tests ||
| -1 | asflicense | 0m 16s | The patch generated 1 ASF License warnings. |
| | | 26m 2s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20823/dev-support/hive-personality.sh |
| git revision | master / 1046517 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-20823/yetus/patch-asflicense-problems.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20823/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Eliminate file rename in compactor
> --
>
> Key: HIVE-20948
> URL: https://issues.apache.org/jira/browse/HIVE-20948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-20948.01.patch, HIVE-20948.02.patch, 
> HIVE-20948.03.patch, HIVE-20948.04.patch, HIVE-20948.05.patch, 
> HIVE-20948.06.patch, HIVE-20948.07.patch
>
>
> Once HIVE-20823 is committed, we should investigate if it's possible to have 
> compactor write directly to base_x_cZ or delta_x_y_cZ.  
> For query based compaction: can we control location of temp table dir?  We 
> support external temp tables so this may work but we'd need to have non-acid 
> insert create files with {{bucket_x}} names.
>  
> For MR/Tez/LLAP based (should this be done at all?), need to figure out how 
> retries of tasks will work.  Just like we currently generate an MR job to 
> compact, we should be able to generate a Tez job.
>  





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Description: 
In certain cases the TopNKey filter might work inefficiently and add 
extra CPU overhead. For example, if the rows arrive in descending order 
but the filter wants the top N smallest elements, the filter will forward 
everything.

Inefficiency should be detected at runtime so that the filter can be disabled if 
the ratio forwarded_rows/total_rows is too high.

  was:
In certain cases the TopNKey filter might work inefficiently and add 
extra CPU overhead. For example, if the rows arrive in ascending order 
but the filter wants the top N smallest elements, the filter will forward 
everything.

Inefficiency should be detected at runtime so that the filter can be disabled if 
the ratio forwarded_rows/total_rows is too high.


> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in descending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.
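
A minimal sketch of the efficiency check described above, under stated assumptions: the class, its fields, the check interval and the ratio threshold are all hypothetical and are not taken from the attached patch. The filter keeps the N smallest keys seen so far in a max-heap and forwards a row only if its key could still belong to the top N. Every checkAfterRows rows it compares forwarded rows with total rows and switches itself off when the ratio exceeds the threshold.

```java
import java.util.Collections;
import java.util.PriorityQueue;

final class TopNKeyFilterSketch {
  private final int topN;
  private final long checkAfterRows;     // hypothetical check interval
  private final double maxForwardRatio;  // hypothetical threshold
  // Max-heap holding the N smallest keys seen so far; its head is the current boundary.
  private final PriorityQueue<Integer> smallest =
      new PriorityQueue<>(Collections.reverseOrder());
  private long totalRows;
  private long forwardedRows;
  private boolean disabled;

  TopNKeyFilterSketch(int topN, long checkAfterRows, double maxForwardRatio) {
    this.topN = topN;
    this.checkAfterRows = checkAfterRows;
    this.maxForwardRatio = maxForwardRatio;
  }

  boolean isDisabled() {
    return disabled;
  }

  // Returns true if the row with this key should be passed downstream.
  boolean canForward(int key) {
    if (disabled) {
      return true; // filter switched off: pass everything through without comparing
    }
    totalRows++;
    boolean forward;
    if (smallest.size() < topN) {
      smallest.add(key);
      forward = true;
    } else {
      int boundary = smallest.peek();
      if (key < boundary) {
        smallest.poll();
        smallest.add(key);
      }
      forward = key <= boundary;
    }
    if (forward) {
      forwardedRows++;
    }
    if (totalRows % checkAfterRows == 0
        && (double) forwardedRows / totalRows > maxForwardRatio) {
      disabled = true;
    }
    return forward;
  }
}
```

With keys arriving in descending order every row beats the current boundary, so the ratio stays at 1.0 and the filter disables itself at the first check. With ascending input only the first N rows are forwarded and the filter stays active.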





[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-25 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044556#comment-17044556
 ] 

Peter Vary commented on HIVE-22819:
---

+1

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant).





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: HIVE-22925.1.patch

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Open  (was: Patch Available)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Patch Available  (was: Open)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: (was: HIVE-22925.1.patch)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Patch Available  (was: Open)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: HIVE-22925.1.patch

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: (was: HIVE-22925.1.patch)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Open  (was: Patch Available)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Patch Available  (was: Open)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.





[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-25 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: HIVE-22925.1.patch

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch
>
>
> In certain cases the TopNKey filter might work inefficiently and add 
> extra CPU overhead. For example, if the rows arrive in ascending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio forwarded_rows/total_rows is too high.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-25 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod updated HIVE-22832:
--
Attachment: HIVE-22832.6.patch

> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch, HIVE-22832.6.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 
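
As a rough illustration of the proposal, the sequential loop could be replaced by a fixed-size thread pool. This is only a sketch under assumptions: ParallelDirectoryCleaner, cleanAll and the thread-count parameter are hypothetical names, and the cleaner callback stands in for whatever cleanDirectInsertDirectories does for one directory.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical sketch; not the code from the attached patches. */
final class ParallelDirectoryCleaner {

    /** Runs cleaner once per directory on a bounded pool and waits for all of them. */
    static <T> void cleanAll(List<T> directories, Consumer<T> cleaner, int threads) {
        if (directories.isEmpty()) {
            return;
        }
        int poolSize = Math.max(1, Math.min(threads, directories.size()));
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (T dir : directories) {
                tasks.add(() -> {
                    cleaner.accept(dir);
                    return null;
                });
            }
            // invokeAll blocks until every task has finished.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // surfaces the first failure, if any
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while cleaning directories", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("directory cleanup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Bounding the pool by the list size avoids creating idle threads when only a few partitions were written.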



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21487) COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes

2020-02-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-21487:
-
Attachment: (was: HIVE-21487.06.patch)

> COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes
> 
>
> Key: HIVE-21487
> URL: https://issues.apache.org/jira/browse/HIVE-21487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-21487.03.patch, HIVE-21487.04.patch, 
> HIVE-21487.05.patch, HIVE-21487.06.patch, HIVE-21847.01.patch, 
> HIVE-21847.02.patch
>
>
> Looking at a MySQL install where HMS is pointed on Hive 3.1, I see a constant 
> stream of queries of the form:
> {code}
> select CC_STATE from COMPLETED_COMPACTIONS where CC_DATABASE = 
> 'tpcds_orc_exact_1000' and CC_TABLE = 'catalog_returns' and CC_PARTITION = 
> 'cr_returned_date_sk=2452851' and CC_STATE != 'a' order by CC_ID desc;
> {code}
> but the COMPLETED_COMPACTIONS table has no index. In this case it's resulting 
> in a full table scan over 115k rows, which takes around 100ms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21487) COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes

2020-02-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-21487:
-
Attachment: HIVE-21487.06.patch

> COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes
> 
>
> Key: HIVE-21487
> URL: https://issues.apache.org/jira/browse/HIVE-21487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-21487.03.patch, HIVE-21487.04.patch, 
> HIVE-21487.05.patch, HIVE-21487.06.patch, HIVE-21847.01.patch, 
> HIVE-21847.02.patch
>
>
> Looking at a MySQL install where HMS is pointed on Hive 3.1, I see a constant 
> stream of queries of the form:
> {code}
> select CC_STATE from COMPLETED_COMPACTIONS where CC_DATABASE = 
> 'tpcds_orc_exact_1000' and CC_TABLE = 'catalog_returns' and CC_PARTITION = 
> 'cr_returned_date_sk=2452851' and CC_STATE != 'a' order by CC_ID desc;
> {code}
> but the COMPLETED_COMPACTIONS table has no index. In this case it's resulting 
> in a full table scan over 115k rows, which takes around 100ms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-25 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod updated HIVE-22819:
--
Attachment: HIVE-22819.5.patch

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 
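
The "list directly and handle the exception" idea can be sketched as follows. The real change presumably works against Hadoop's FileSystem API; to stay self-contained this example uses java.nio.file as a stand-in, and DirectListing and listOrEmpty are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.NotDirectoryException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Stand-in sketch using java.nio.file, not the Hadoop FileSystem API. */
final class DirectListing {

    /**
     * Issues a single listing call instead of exists() + isDirectory() + list.
     * A missing path or a non-directory is reported as an empty result.
     */
    static List<Path> listOrEmpty(Path dir) {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry);
            }
        } catch (NoSuchFileException | NotDirectoryException e) {
            // The cases the two removed pre-checks used to guard against.
            return Collections.emptyList();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

The trade-off is that the uncommon case (no directory) now costs an exception, while the common case saves two round trips.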



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-20948) Eliminate file rename in compactor

2020-02-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-20948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-20948:
-
Attachment: HIVE-20948.07.patch

> Eliminate file rename in compactor
> --
>
> Key: HIVE-20948
> URL: https://issues.apache.org/jira/browse/HIVE-20948
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-20948.01.patch, HIVE-20948.02.patch, 
> HIVE-20948.03.patch, HIVE-20948.04.patch, HIVE-20948.05.patch, 
> HIVE-20948.06.patch, HIVE-20948.07.patch
>
>
> Once HIVE-20823 is committed, we should investigate if it's possible to have 
> compactor write directly to base_x_cZ or delta_x_y_cZ.  
> For query based compaction: can we control location of temp table dir?  We 
> support external temp tables so this may work but we'd need to have non-acid 
> insert create files with {{bucket_x}} names.
>  
> For MR/Tez/LLAP based (should this be done at all?), need to figure out how 
> retries of tasks will work.  Just like we currently generate an MR job to 
> compact, we should be able to generate a Tez job.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22904) Compaction cleaner cannot find COMPACTION_QUEUE table using postgres db

2020-02-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-22904:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Compaction cleaner cannot find COMPACTION_QUEUE table using postgres db
> ---
>
> Key: HIVE-22904
> URL: https://issues.apache.org/jira/browse/HIVE-22904
> Project: Hive
>  Issue Type: Bug
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-22904.01.patch, HIVE-22904.02.patch, 
> HIVE-22904.03.patch, HIVE-22904.04.patch, HIVE-22904.05.patch
>
>
> In CompactionTxnHandler 
> {code:java}
> delete from COMPACTION_QUEUE where cq_id = ?
> {code}
> fails with 
> {code:java}
> org.postgresql.util.PSQLException: ERROR: relation "compaction_queue" does 
> not exist
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22904) Compaction cleaner cannot find COMPACTION_QUEUE table using postgres db

2020-02-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044495#comment-17044495
 ] 

László Pintér commented on HIVE-22904:
--

Pushed to master. Thanks [~zchovan] and [~pvary] for the review.

> Compaction cleaner cannot find COMPACTION_QUEUE table using postgres db
> ---
>
> Key: HIVE-22904
> URL: https://issues.apache.org/jira/browse/HIVE-22904
> Project: Hive
>  Issue Type: Bug
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
> Attachments: HIVE-22904.01.patch, HIVE-22904.02.patch, 
> HIVE-22904.03.patch, HIVE-22904.04.patch, HIVE-22904.05.patch
>
>
> In CompactionTxnHandler 
> {code:java}
> delete from COMPACTION_QUEUE where cq_id = ?
> {code}
> fails with 
> {code:java}
> org.postgresql.util.PSQLException: ERROR: relation "compaction_queue" does 
> not exist
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-25 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044471#comment-17044471
 ] 

Steve Loughran commented on HIVE-22819:
---

LGTM -this saves two round trips to HDFS, S3 or ABFS.

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch
>
>
> Hive::listFilesCreatedByQuery does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22853) Beeline should use HS2 server defaults for fetchSize

2020-02-25 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044468#comment-17044468
 ] 

David Mollitor commented on HIVE-22853:
---

Same thing in HiveConnection.java, the fetchSize should default to 0.

> Beeline should use HS2 server defaults for fetchSize
> 
>
> Key: HIVE-22853
> URL: https://issues.apache.org/jira/browse/HIVE-22853
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-22853.2.patch, HIVE-22853.3.patch, HIVE-22853.patch
>
>
> Currently beeline uses a hard coded default of 1000 rows for fetchSize. This 
> default value is different from what the server has set. While the beeline 
> user can reset the value via set command, it's cumbersome to change the 
> workloads.
> Rather it should default to the server-side value and set should be used to 
> override within the session.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22853) Beeline should use HS2 server defaults for fetchSize

2020-02-25 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044466#comment-17044466
 ] 

David Mollitor commented on HIVE-22853:
---

{quote}
if we follow the same spec as Oracle, we should treat less than 0 as invalid, 
and default to 0 and treat zero as no-limit.
{quote}

I do not think this is correct.  From the docs: _If the value specified is zero, 
then the hint is ignored._

I think this means simply that the driver can use whatever default value it 
wants since the client application has not provided any kind of hint.

So, actually the 'default' value should be 0 in BeeLine.java and should be 
treated as 1000 (current default) in the Driver itself... this behavior is 
correctly implemented in the driver:

https://github.com/apache/hive/blob/037eacea46371015a7f9894c5a9ccfb9708d5c56/jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java#L811
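
To make the interpretation above concrete, here is a minimal sketch of "zero means no hint, so the driver picks its own default". FetchSizeHint and resolve are hypothetical names and this is not a copy of the HiveStatement code linked above; the default is passed in as a parameter rather than hard-coded.

```java
/** Hypothetical illustration of the JDBC fetch-size hint semantics discussed above. */
final class FetchSizeHint {

    /**
     * @param requested     value the client passed (0 means "no hint given")
     * @param driverDefault value the driver falls back to when there is no hint
     * @return the fetch size that would actually be used
     */
    static int resolve(int requested, int driverDefault) {
        if (requested < 0) {
            // JDBC's Statement.setFetchSize rejects negative values.
            throw new IllegalArgumentException("fetch size must be >= 0");
        }
        return requested == 0 ? driverDefault : requested;
    }
}
```

With this shape, a client such as Beeline could pass 0 unless the user explicitly sets a fetch size, and the fallback value is decided in one place.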

> Beeline should use HS2 server defaults for fetchSize
> 
>
> Key: HIVE-22853
> URL: https://issues.apache.org/jira/browse/HIVE-22853
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 4.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
> Attachments: HIVE-22853.2.patch, HIVE-22853.3.patch, HIVE-22853.patch
>
>
> Currently beeline uses a hard coded default of 1000 rows for fetchSize. This 
> default value is different from what the server has set. While the beeline 
> user can reset the value via set command, it's cumbersome to change the 
> workloads.
> Rather it should default to the server-side value and set should be used to 
> override within the session.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22927) LLAP should filter guaranteed tasks before killing in node heartbeat

2020-02-25 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-22927:

Summary: LLAP should filter guaranteed tasks before killing in node 
heartbeat   (was: LLAP should filter guaranteed tasks for killing in node 
heartbeat )

> LLAP should filter guaranteed tasks before killing in node heartbeat 
> -
>
> Key: HIVE-22927
> URL: https://issues.apache.org/jira/browse/HIVE-22927
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22927.1.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044420#comment-17044420
 ] 

Hive QA commented on HIVE-22832:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994541/HIVE-22832.5.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20822/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20822/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20822/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2020-02-25 13:08:16.695
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-20822/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2020-02-25 13:08:16.698
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   0767c5d..bbdf4c3  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 0767c5d HIVE-22825 : Reduce directory lookup cost for acid 
tables (Rajesh Balamohan via Ashutosh Chauhan)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)
+ git reset --hard origin/master
HEAD is now at bbdf4c3 HIVE-22863: Commit compaction txn if it is opened but 
compaction is skipped (Karen Coppage via Laszlo Pinter)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2020-02-25 13:08:18.435
+ rm -rf ../yetus_PreCommit-HIVE-Build-20822
+ mkdir ../yetus_PreCommit-HIVE-Build-20822
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-20822
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-20822/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Trying to apply the patch with -p0
fatal: corrupt patch at line 87
Trying to apply the patch with -p1
fatal: corrupt patch at line 87
Trying to apply the patch with -p2
fatal: corrupt patch at line 87
The patch does not appear to apply with p0, p1, or p2
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-20822
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994541 - PreCommit-HIVE-Build

> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044418#comment-17044418
 ] 

Hive QA commented on HIVE-22872:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994526/HIVE-22872.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 37 failed/errored test(s), 18057 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schq_analyze]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schq_ingest]
 (batchId=184)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schq_materialized]
 (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] 
(batchId=176)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb_schq] 
(batchId=181)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query70] 
(batchId=305)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testCleanup[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testCleanup[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testCreate[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testCreate[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testDuplicateCreate[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testDuplicateCreate[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testExclusivePoll[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testExclusivePoll[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testNormalDeleteWithExec[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testNormalDeleteWithExec[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testNormalDelete[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testNormalDelete[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testOutdatedCleanup[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testOutdatedCleanup[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testPoll[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testPoll[Remote]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testUpdate[Embedded]
 (batchId=231)
org.apache.hadoop.hive.metastore.client.TestMetastoreScheduledQueries.testUpdate[Remote]
 (batchId=231)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryService.testScheduledQueryExecution
 (batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.test10Minutes 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.test10Seconds 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.test4Hours 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.test4Hours2 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testAlter 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testCreateFromNonDefaultDatabase
 (batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testDay 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testDay2 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testExecuteImmediate
 (batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testMinutes 
(batchId=357)
org.apache.hadoop.hive.ql.schq.TestScheduledQueryStatements.testSimpleCreate 
(batchId=357)
org.apache.hadoop.hive.schq.TestScheduledQueryIntegration.testScheduledQueryExecutionImpersonation
 (batchId=285)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20821/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20821/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20821/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 37 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994526 - PreCommit-HIVE-Build

> Support 

[jira] [Commented] (HIVE-22863) Commit compaction txn if it is opened but compaction is skipped

2020-02-25 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044416#comment-17044416
 ] 

László Pintér commented on HIVE-22863:
--

Pushed to master. Thanks for the patch [~klcopp]

> Commit compaction txn if it is opened but compaction is skipped
> ---
>
> Key: HIVE-22863
> URL: https://issues.apache.org/jira/browse/HIVE-22863
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22863.01.patch, HIVE-22863.02.patch, 
> HIVE-22863.03.patch, HIVE-22863.04.patch, HIVE-22863.05.patch
>
>
> Currently if a table does not have enough directories to compact, compaction 
> is skipped and the compaction is either (a) marked ready for cleaning or (b) 
> marked compacted. However, the txn the compaction runs in is never committed, 
> it remains open, so TXNS and TXN_COMPONENTS will never be cleared of 
> information about the attempted compaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22863) Commit compaction txn if it is opened but compaction is skipped

2020-02-25 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-22863:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Commit compaction txn if it is opened but compaction is skipped
> ---
>
> Key: HIVE-22863
> URL: https://issues.apache.org/jira/browse/HIVE-22863
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
> Attachments: HIVE-22863.01.patch, HIVE-22863.02.patch, 
> HIVE-22863.03.patch, HIVE-22863.04.patch, HIVE-22863.05.patch
>
>
> Currently if a table does not have enough directories to compact, compaction 
> is skipped and the compaction is either (a) marked ready for cleaning or (b) 
> marked compacted. However, the txn the compaction runs in is never committed, 
> it remains open, so TXNS and TXN_COMPONENTS will never be cleared of 
> information about the attempted compaction.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-25 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod updated HIVE-22832:
--
Attachment: HIVE-22832.5.patch

> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044398#comment-17044398
 ] 

Hive QA commented on HIVE-22872:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
3s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m 
53s{color} | {color:blue} standalone-metastore/metastore-common in master has 
35 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
37s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
18s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
1s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
22s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 1 new + 295 unchanged - 0 fixed = 296 total (was 295) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 2 new + 8 unchanged - 1 fixed 
= 10 total (was 9) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch 42 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 10m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
17s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20821/dev-support/hive-personality.sh
 |
| git revision | master / 0767c5d |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20821/yetus/diff-checkstyle-standalone-metastore_metastore-server.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20821/yetus/diff-checkstyle-ql.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20821/yetus/whitespace-tabs.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20821/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-common common metastore 
standalone-metastore/metastore-server ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20821/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Support multiple executors for scheduled queries
> 

[jira] [Commented] (HIVE-22126) hive-exec packaging should shade guava

2020-02-25 Thread Eugene Chung (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044382#comment-17044382
 ] 

Eugene Chung commented on HIVE-22126:
-

[~dlavati] Shading guava for Hive also requires shading the calcite modules, 
which changes the FQCN of the calcite-avatica JDBC driver, e.g. 
 * org.apache.calcite.jdbc.Driver -> 
org.apache.hive.org.apache.calcite.jdbc.Driver

I stopped there because I was not sure whether it is okay to change it.

If changing the driver name is only an internal or test concern, I think it's 
okay.

I have some free time these days, so I am going to investigate this again.

> hive-exec packaging should shade guava
> --
>
> Key: HIVE-22126
> URL: https://issues.apache.org/jira/browse/HIVE-22126
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Eugene Chung
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22126.01.patch, HIVE-22126.02.patch, 
> HIVE-22126.03.patch
>
>
> The ql/pom.xml includes the complete guava library in hive-exec.jar: 
> https://github.com/apache/hive/blob/master/ql/pom.xml#L990 This causes 
> problems for downstream clients of hive which have hive-exec.jar in their 
> classpath, since they are pinned to the same guava version as that of hive. 
> We should shade guava classes so that other components which depend on 
> hive-exec can independently use a different version of guava as needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?focusedWorklogId=392505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392505
 ]

ASF GitHub Bot logged work on HIVE-22824:
-

Author: ASF GitHub Bot
Created on: 25/Feb/20 11:47
Start Date: 25/Feb/20 11:47
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #897: HIVE-22824: 
JoinProjectTranspose rule should skip Projects containing…
URL: https://github.com/apache/hive/pull/897#discussion_r383830139
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
 ##
 @@ -487,7 +483,7 @@ Operator genOPTree(PlannerContext plannerCtx) throws 
SemanticException {
 ASTNode newAST = getOptimizedAST(newPlan);
 
 // 1.1. Fix up the query for insert/ctas/materialized views
-newAST = fixUpAfterCbo(this.getAST(), newAST, cboCtx);
 
 Review comment:
   I don't see how this change avoids reintroducing the issue fixed in 
HIVE-22578
   
   because "fixUpAfterCbo" calls a function named replaceASTChild, which 
modifies the actual AST and may make it impossible to fall back to the 
non-cbo path
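
The concern in the review comment above can be illustrated with a small, self-contained sketch. It is not Hive code: AstNodeSketch, replaceChild and deepCopy are hypothetical names invented for this illustration, and the sketch makes no claim about how the pull request or CalcitePlanner resolves the issue. It only shows that an in-place child replacement leaves the original tree usable for a fallback when the rewrite works on a copy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for an AST node (not Hive's ASTNode). */
final class AstNodeSketch {
    final String text;
    final List<AstNodeSketch> children = new ArrayList<>();

    AstNodeSketch(String text) {
        this.text = text;
    }

    AstNodeSketch add(AstNodeSketch child) {
        children.add(child);
        return this;
    }

    /** Recursively copies this node and all of its descendants. */
    AstNodeSketch deepCopy() {
        AstNodeSketch copy = new AstNodeSketch(text);
        for (AstNodeSketch child : children) {
            copy.add(child.deepCopy());
        }
        return copy;
    }

    /** In-place replacement: mutates the tree this node belongs to. */
    void replaceChild(int index, AstNodeSketch replacement) {
        children.set(index, replacement);
    }

    /** Renders the tree in a compact parenthesized form for inspection. */
    String render() {
        if (children.isEmpty()) {
            return text;
        }
        StringBuilder sb = new StringBuilder("(").append(text);
        for (AstNodeSketch child : children) {
            sb.append(' ').append(child.render());
        }
        return sb.append(')').toString();
    }
}
```

If the rewrite step is applied to original.deepCopy() instead of to original itself, a later failure can still hand the untouched original to the fallback path.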
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392505)
Time Spent: 20m  (was: 10m)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating plan with windowing expression 
> within join condition which hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22781) Add ability to immediately execute a scheduled query

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22781:
--
Labels: pull-request-available  (was: )

> Add ability to immediately execute a scheduled query
> 
>
> Key: HIVE-22781
> URL: https://issues.apache.org/jira/browse/HIVE-22781
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22781.01.patch, HIVE-22781.02.patch, 
> HIVE-22781.03.patch, HIVE-22781.04.patch, HIVE-22781.04.patch, 
> HIVE-22781.04.patch, HIVE-22781.05.patch, HIVE-22781.05.patch
>
>
> there are some differences when the system invokes the scheduled query / the 
> user executes it in a shell - forcing the schedule to run might be useful in 
> developing/debugging schedules
> something like:
> {code}
> alter scheduled query a execute
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22781) Add ability to immediately execute a scheduled query

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22781?focusedWorklogId=392486=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392486
 ]

ASF GitHub Bot logged work on HIVE-22781:
-

Author: ASF GitHub Bot
Created on: 25/Feb/20 11:19
Start Date: 25/Feb/20 11:19
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #896: HIVE-22781 
schq execute
URL: https://github.com/apache/hive/pull/896
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392486)
Remaining Estimate: 0h
Time Spent: 10m

> Add ability to immediately execute a scheduled query
> 
>
> Key: HIVE-22781
> URL: https://issues.apache.org/jira/browse/HIVE-22781
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22781.01.patch, HIVE-22781.02.patch, 
> HIVE-22781.03.patch, HIVE-22781.04.patch, HIVE-22781.04.patch, 
> HIVE-22781.04.patch, HIVE-22781.05.patch, HIVE-22781.05.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> there are some differences when the system invokes the scheduled query / the 
> user executes it in a shell - forcing the schedule to run might be useful in 
> developing/debugging schedules
> something like:
> {code}
> alter scheduled query a execute
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22881) Revise non-recommended Calcite api calls

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22881?focusedWorklogId=392485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392485
 ]

ASF GitHub Bot logged work on HIVE-22881:
-

Author: ASF GitHub Bot
Created on: 25/Feb/20 11:18
Start Date: 25/Feb/20 11:18
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #919: HIVE-22881 
rexutil usage
URL: https://github.com/apache/hive/pull/919
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392485)
Time Spent: 40m  (was: 0.5h)

> Revise non-recommended Calcite api calls
> 
>
> Key: HIVE-22881
> URL: https://issues.apache.org/jira/browse/HIVE-22881
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22881.01.patch, HIVE-22881.02.patch, 
> HIVE-22881.03.patch, HIVE-22881.03.patch, HIVE-22881.03.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> RexUtil.simplify* methods



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22872?focusedWorklogId=392484=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-392484
 ]

ASF GitHub Bot logged work on HIVE-22872:
-

Author: ASF GitHub Bot
Created on: 25/Feb/20 11:17
Start Date: 25/Feb/20 11:17
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #924: HIVE-22872 
schq executors
URL: https://github.com/apache/hive/pull/924
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 392484)
Remaining Estimate: 0h
Time Spent: 10m

> Support multiple executors for scheduled queries
> 
>
> Key: HIVE-22872
> URL: https://issues.apache.org/jira/browse/HIVE-22872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22872.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-22872:
--
Labels: pull-request-available  (was: )

> Support multiple executors for scheduled queries
> 
>
> Key: HIVE-22872
> URL: https://issues.apache.org/jira/browse/HIVE-22872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22872.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22872:

Attachment: HIVE-22872.01.patch

> Support multiple executors for scheduled queries
> 
>
> Key: HIVE-22872
> URL: https://issues.apache.org/jira/browse/HIVE-22872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22872.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-25 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22872:

Status: Patch Available  (was: Open)

> Support multiple executors for scheduled queries
> 
>
> Key: HIVE-22872
> URL: https://issues.apache.org/jira/browse/HIVE-22872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
> Attachments: HIVE-22872.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-25 Thread Oleksiy Sayankin (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041814#comment-17041814
 ] 

Oleksiy Sayankin edited comment on HIVE-22919 at 2/25/20 10:39 AM:
---

*FIXED*


*ROOT-CAUSE*

The root cause of the issue is not related to Storage Based Authorization; it 
is about correctly updating the Hive variable {{hive.metastore.warehouse.dir}}. 
As one can see from the exception:

{code}
FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
{code}

Hive tries to create the new database under {{hdfs:/tmp/m2}} despite the 
{{SET}} statement that was executed before:
{code}
SET hive.metastore.warehouse.dir=/tmp/m3;
{code}

This happens because {{StorageBasedAuthorizationProvider}} holds an instance of 
{{Warehouse}} that captured the value of {{hive.metastore.warehouse.dir}}. When 
a user updates {{hive.metastore.warehouse.dir}} in the {{Configuration}} 
instance, the {{Warehouse}} object is not forced to refresh its value of 
{{hive.metastore.warehouse.dir}} and hence keeps the old one.

*SOLUTION*

Add an {{isWarehouseChanged()}} method to check whether 
{{hive.metastore.warehouse.dir}} has changed and, if so, recreate the 
{{Warehouse}} in {{StorageBasedAuthorizationProvider}}.

*EFFECTS*

{{StorageBasedAuthorizationProvider}} initialization.
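
The check-and-recreate idea from the SOLUTION section can be sketched roughly as follows. This is a minimal illustration and not the actual patch: WarehouseSketch and CachedWarehouseProvider are hypothetical stand-ins for Hive's Warehouse and StorageBasedAuthorizationProvider, a plain Map stands in for the configuration object, and only the method name isWarehouseChanged() is taken from the comment above.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for Warehouse: reads the warehouse root once, at construction. */
final class WarehouseSketch {
    private final String root;

    WarehouseSketch(Map<String, String> conf) {
        this.root = conf.get(CachedWarehouseProvider.WAREHOUSE_KEY);
    }

    String getRoot() {
        return root;
    }

    String databasePath(String dbName) {
        return root + "/" + dbName + ".db";
    }
}

/** Hypothetical stand-in for a provider that caches a warehouse instance. */
final class CachedWarehouseProvider {
    static final String WAREHOUSE_KEY = "hive.metastore.warehouse.dir";

    private final Map<String, String> conf; // assumed to be the live, mutable session configuration
    private WarehouseSketch warehouse;

    CachedWarehouseProvider(Map<String, String> conf) {
        this.conf = conf;
        this.warehouse = new WarehouseSketch(conf);
    }

    /** True when the configured directory no longer matches the cached warehouse. */
    boolean isWarehouseChanged() {
        return !Objects.equals(conf.get(WAREHOUSE_KEY), warehouse.getRoot());
    }

    /** Recreates the cached warehouse if the setting changed, then resolves the path. */
    String resolveDatabasePath(String dbName) {
        if (isWarehouseChanged()) {
            warehouse = new WarehouseSketch(conf);
        }
        return warehouse.databasePath(dbName);
    }
}
```

Without the isWarehouseChanged() check, a provider created while the setting was /tmp/m2 would keep resolving m3 to /tmp/m2/m3.db after the setting is changed to /tmp/m3, which matches the path in the exception above.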


was (Author: osayankin):
*FIXED*


*ROOT-CAUSE*

The root cause of the issue is not related to Storage Based Authorization; it 
is about correctly updating the Hive variable {{hive.metastore.warehouse.dir}}. 
As one can see from the exception:

{code}
FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
{code}

Hive tries to create the new database under {{hdfs:/tmp/m2}} despite the 
{{SET}} statement that was executed before:
{code}
SET hive.metastore.warehouse.dir=/tmp/m3;
{code}

This happens because {{StorageBasedAuthorizationProvider}} holds an instance of 
{{Warehouse}} that captured the value of {{hive.metastore.warehouse.dir}}. When 
a user updates {{hive.metastore.warehouse.dir}} in the {{HiveConf}} instance, 
the {{Warehouse}} object is not forced to refresh its value of 
{{hive.metastore.warehouse.dir}} and hence keeps the old one.

*SOLUTION*

Add an {{isWarehouseChanged()}} method to check whether 
{{hive.metastore.warehouse.dir}} has changed and, if so, recreate the 
{{Warehouse}} in {{StorageBasedAuthorizationProvider}}.

*EFFECTS*

{{StorageBasedAuthorizationProvider}} initialization.

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.metastore.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.metastore.authenticator.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
> </property>
> <property>
>   <name>hive.metastore.pre.event.listeners</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
> </property>
> {code}
> 2. Create a few directories and change their owners and permissions:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx------   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx------   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx------   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the 
> *"testuser1"* user, with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set 

[jira] [Updated] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-25 Thread Oleksiy Sayankin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-22919:

Status: Patch Available  (was: In Progress)

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.metastore.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.metastore.authenticator.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
> </property>
> <property>
>   <name>hive.metastore.pre.event.listeners</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
> </property>
> {code}
> 2. Create a few directories and change their owners and permissions:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx------   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx------   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx------   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the 
> *"testuser1"* user, with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user does not have an access:
> SET hive.metastore.warehouse.dir=/tmp/m2;
> -- 3. Try to create a database:
> CREATE DATABASE m2;
> -- 4. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user has an access:
> SET hive.metastore.warehouse.dir=/tmp/m3;
> -- 5. Try to create a database:
> CREATE DATABASE m3;
> {code}
> *ACTUAL RESULT:*
> Query 5 fails with the exception below. It does not handle the updated 
> "hive.metastore.warehouse.dir" property:
> {code:java}
> hive> -- 5. Try to create a database:
> hive> CREATE DATABASE m3;
> FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
> testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
> hive>
> {code}
> *EXPECTED RESULT:*
> Query 5 creates a database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

