date:20180926

[jira] [Updated] (HIVE-20644) Avoid exposing sensitive information through a Hive Runtime exception

2018-09-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-20644:
--
Summary: Avoid exposing sensitive information through a Hive Runtime 
exception  (was: Avoid exposing sensitive information through an error message)

> Avoid exposing sensitive information through a Hive Runtime exception
> 
>
> Key: HIVE-20644
> URL: https://issues.apache.org/jira/browse/HIVE-20644
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Ashutosh Bapat
>Assignee: Ashutosh Bapat
>Priority: Minor
>
> The HiveException raised from the following methods exposes the data row 
> that caused the runtime exception:
>  # ReduceRecordSource::GroupIterator::next() - around line 372
>  # MapOperator::process() - around line 567
>  # ExecReducer::reduce() - around line 243
> In all these cases, a string representation of the row is constructed on the 
> fly and included in the error message.
> VectorMapOperator::process() - around line 973 raises the same exception, but 
> it does not expose the row because the row contents are not included in the 
> error message.
> While trying to reproduce the above error, I also found that the arguments to a 
> UDF are exposed in log messages from FunctionRegistry::invoke() around line 
> 1114. This too can leak sensitive information through an error message.
> In this way, sensitive information that may not otherwise be available to a 
> user is leaked through the exception message. This is a kind of security 
> breach, or a violation of access control.
> The contents of the row or the arguments to a function may be useful for 
> debugging, so they are worth adding to the logs. The proposal is to log a 
> separate message at log level DEBUG or INFO containing the string 
> representation of the row. Users can configure their logging so that 
> DEBUG/INFO messages do not go to the client but are still available in the 
> Hive server logs for debugging. The actual exception message will not contain 
> any sensitive data such as row data or argument data.
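
The proposal above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Hive patch: the class RedactedRowErrors, the exception type RowProcessingException (standing in for HiveException) and the three-field row check are hypothetical, and java.util.logging at level FINE stands in for a DEBUG-level server log.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for an operator's row-processing path; not Hive code. */
final class RedactedRowErrors {
  private static final Logger LOG = Logger.getLogger(RedactedRowErrors.class.getName());

  /** Hypothetical exception type playing the role of HiveException. */
  static final class RowProcessingException extends RuntimeException {
    RowProcessingException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  /** Counts the fields of a comma-separated row; stands in for per-row work that may fail. */
  static int process(String row) {
    try {
      String[] fields = row.split(",");
      if (fields.length != 3) {
        // The cause must not carry row contents either, only structural facts.
        throw new IllegalArgumentException("expected 3 fields, got " + fields.length);
      }
      return fields.length;
    } catch (RuntimeException e) {
      // Row contents go to the server-side log only, at a debug-like level.
      LOG.log(Level.FINE, "Error while processing row {0}", row);
      // The message that may reach the client mentions no row data.
      throw new RowProcessingException("Error while processing row", e);
    }
  }
}
```

The point of the split is that the two audiences can be configured separately: the log handler decides who sees the row, while the exception text stays safe to return to any client.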



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-20639) Add ability to Write Data from Hive Table/Query to Kafka Topic

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629788#comment-16629788
 ] 

Hive QA commented on HIVE-20639:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
23s{color} | {color:blue} kafka-handler in master has 1 extant Findbugs 
warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
44s{color} | {color:blue} ql in master has 2325 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
5s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
10s{color} | {color:red} kafka-handler: The patch generated 96 new + 12 
unchanged - 11 fixed = 108 total (was 23) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 38 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 10 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
0s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
31s{color} | {color:red} kafka-handler generated 6 new + 0 unchanged - 1 fixed 
= 6 total (was 1) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:kafka-handler |
|  |  org.apache.hadoop.hive.kafka.KafkaInputSplit defines clone() but doesn't implement Cloneable  At KafkaInputSplit.java:[line 184] |
|  |  Should org.apache.hadoop.hive.kafka.KafkaSerDe$AvroBytesConverter be a _static_ inner class?  At KafkaSerDe.java:[lines 308-345] |
|  |  Should org.apache.hadoop.hive.kafka.KafkaSerDe$BytesWritableConverter be a _static_ inner class?  At KafkaSerDe.java:[lines 349-355] |
|  |  Found reliance on default encoding in org.apache.hadoop.hive.kafka.KafkaSerDe$TextBytesConverter.getBytes(Text): String.getBytes()  At KafkaSerDe.java:[line 362] |
|  |  Should org.apache.hadoop.hive.kafka.KafkaSerDe$TextBytesConverter be a _static_ inner class?  At KafkaSerDe.java:[lines 359-365] |
|  |  Boxing/unboxing to parse a primitive org.apache.hadoop.hive.kafka.KafkaStorageHandler.commitInsertTable(Table, boolean)  At KafkaStorageHandler.java:[line 237] |
\\
\\
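
For readers unfamiliar with two of the FindBugs findings listed above (reliance on the default encoding, and inner classes that could be static), the usual remedies look roughly like the sketch below. The class names ExplicitCharsetExample and Utf8BytesConverter are hypothetical and this is not the KafkaSerDe code; UTF-8 is an assumed choice of charset.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical example class; not part of the kafka-handler module. */
final class ExplicitCharsetExample {

  /** Declared static, so it holds no hidden reference to an enclosing instance. */
  static final class Utf8BytesConverter {
    byte[] getBytes(String value) {
      // Name the charset explicitly instead of relying on the JVM default encoding.
      return value.getBytes(StandardCharsets.UTF_8);
    }
  }
}
```

With an explicit charset the produced bytes no longer depend on the platform the JVM happens to run on.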
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  findbugs  
checkstyle  |
| uname | Linux

[jira] [Assigned] (HIVE-20644) Avoid exposing sensitive information through an error message

2018-09-26 Thread Ashutosh Bapat (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat reassigned HIVE-20644:
-



[jira] [Commented] (HIVE-19584) Dictionary encoding for string types

2018-09-26 Thread Teddy Choi (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629776#comment-16629776
 ] 

Teddy Choi commented on HIVE-19584:
---

Rebased the patch on the latest master branch.

> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.2.patch, 
> HIVE-19584.3.patch, HIVE-19584.4.patch, HIVE-19584.5.patch, 
> HIVE-19584.6.patch, HIVE-19584.7.patch, HIVE-19584.8.patch, HIVE-19584.9.patch
>
>
> Apache Arrow supports dictionary encoding for some data types. So implement 
> dictionary encoding for string types in Arrow SerDe.
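
As background, dictionary encoding replaces each distinct string with a small integer index into a table of unique values. The sketch below shows the idea in plain Java; it deliberately does not use the Arrow API, and the class StringDictionaryEncoder and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, library-free illustration of dictionary encoding for strings. */
final class StringDictionaryEncoder {
  private final Map<String, Integer> idsByValue = new HashMap<>();
  private final List<String> dictionary = new ArrayList<>();

  /** Returns one dictionary index per input value, assigning new indices in first-seen order. */
  int[] encode(List<String> column) {
    int[] ids = new int[column.size()];
    for (int i = 0; i < ids.length; i++) {
      String value = column.get(i);
      Integer id = idsByValue.get(value);
      if (id == null) {
        id = dictionary.size();
        idsByValue.put(value, id);
        dictionary.add(value);
      }
      ids[i] = id;
    }
    return ids;
  }

  /** Looks up the original string for a dictionary index. */
  String decode(int id) {
    return dictionary.get(id);
  }

  int dictionarySize() {
    return dictionary.size();
  }
}
```

The saving comes from columns with few distinct values: each repeated string is stored once in the dictionary and referenced by index everywhere else.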




[jira] [Commented] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629773#comment-16629773
 ] 

Hive QA commented on HIVE-20636:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941412/HIVE-20636.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testMerge3Way01 
(batchId=315)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14072/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14072/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14072/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941412 - PreCommit-HIVE-Build

> Improve number of null values estimation after outer join
> -
>
> Key: HIVE-20636
> URL: https://issues.apache.org/jira/browse/HIVE-20636
> Project: Hive
>  Issue Type: Bug
>  Components: Statistics
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20636.01.patch, HIVE-20636.patch
>
>





[jira] [Updated] (HIVE-19584) Dictionary encoding for string types

2018-09-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-19584:
--
Attachment: (was: HIVE-19584.8.patch)


[jira] [Updated] (HIVE-19584) Dictionary encoding for string types

2018-09-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-19584:
--
Attachment: HIVE-19584.9.patch


[jira] [Updated] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20540:
--
Fix Version/s: 4.0.0

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Follow-up to HIVE-20510 covering the remaining issues:
>  
> 1. Avoid using reflection.
> 2. In VectorizationContext, set up the VectorExpression in the correct place. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this.
> cc [~gopalv]




[jira] [Updated] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20540:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master.


[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Deepak Jaiswal (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629752#comment-16629752
 ] 

Deepak Jaiswal commented on HIVE-20540:
---

Thanks for the review. You can look at it in 
VectorReduceSinkObjectHashOperator.java.


[jira] [Updated] (HIVE-19584) Dictionary encoding for string types

2018-09-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-19584:
--
Attachment: HIVE-19584.8.patch

> Dictionary encoding for string types
> 
>
> Key: HIVE-19584
> URL: https://issues.apache.org/jira/browse/HIVE-19584
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-19584.1.patch, HIVE-19584.2.patch, 
> HIVE-19584.3.patch, HIVE-19584.4.patch, HIVE-19584.5.patch, 
> HIVE-19584.6.patch, HIVE-19584.7.patch, HIVE-19584.8.patch, HIVE-19584.8.patch
>
>
> Apache Arrow supports dictionary encoding for some data types. So implement 
> dictionary encoding for string types in Arrow SerDe.




[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread ASF GitHub Bot (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629738#comment-16629738
 ] 

ASF GitHub Bot commented on HIVE-20632:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/438


> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create an ACID table t1 and insert a few rows.
>  # Create a materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query "select get_splits( select a from t1 where a > 5); – 
> This fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at

[jira] [Commented] (HIVE-20636) Improve number of null values estimation after outer join

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629731#comment-16629731
 ] 

Hive QA commented on HIVE-20636:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 16s{color} 
| {color:red} 
/data/hiveptest/logs/PreCommit-HIVE-Build-14072/patches/PreCommit-HIVE-Build-14072.patch
 does not apply to master. Rebase required? Wrong Branch? See 
http://cwiki.apache.org/confluence/display/Hive/HowToContribute for help. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-14072/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HIVE-20623) Shared work: Extend sharing of map-join cache entries in LLAP

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629724#comment-16629724
 ] 

Hive QA commented on HIVE-20623:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941409/HIVE-20623.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.service.auth.TestCustomAuthentication.org.apache.hive.service.auth.TestCustomAuthentication
 (batchId=248)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14071/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14071/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14071/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941409 - PreCommit-HIVE-Build

> Shared work: Extend sharing of map-join cache entries in LLAP
> -
>
> Key: HIVE-20623
> URL: https://issues.apache.org/jira/browse/HIVE-20623
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Logical Optimizer
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20623.01.patch, HIVE-20623.02.patch, 
> HIVE-20623.02.patch, HIVE-20623.patch, hash-shared-work.json.txt, 
> hash-shared-work.svg
>
>
> For a query like this
> {code}
> with all_sales as (
> select ss_customer_sk as customer_sk, ss_ext_list_price-ss_ext_discount_amt 
> as ext_price from store_sales
> UNION ALL
> select ws_bill_customer_sk as customer_sk, 
> ws_ext_list_price-ws_ext_discount_amt as ext_price from web_sales
> UNION ALL
> select cs_bill_customer_sk as customer_sk, cs_ext_sales_price - 
> cs_ext_discount_amt as ext_price from catalog_sales)
> select sum(ext_price) total_price, c_customer_id from all_sales, customer 
> where customer_sk = c_customer_sk
> group by c_customer_id
> order by total_price desc 
> limit 100;
> {code}
> The hashtables used for all 3 joins are identical, yet each one is loaded 
> separately (3 times) in the same LLAP instance because they are named
> {code}
> cacheKey = "HASH_MAP_" + this.getOperatorId() + "_container";
> {code}
> in the cache.
> If those are identical in nature (i.e., vectorization, hashtable type, etc.), 
> then the duplication is just wasted CPU, memory and network - using the same 
> cache name for hashtables which will be identical in layout would be extremely 
> useful.
> In cases where the join is pushed through a UNION, those are identical.
> This optimization can only be done without concern for accidental delays when 
> the same upstream task is generating all of these hashtables, which is what 
> is achieved by the shared scan optimizer already.
> If the shared work is not present, this has potential downsides: if two 
> customer broadcasts were sourced from "Map 1" and "Map 2", the Map 1 builder 
> will block the other task from reading from Map 2, even though Map 2 might 
> have started later but finished ahead of Map 1.
> So this specific optimization can always be considered for cases where the 
> shared work unifies the operator tree and the parents of all the RS entries 
> involved are the same (& the RS layout is the same).
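
The effect of the cache key on the number of loads can be sketched as below. This is a simplified model under stated assumptions, not the LLAP object cache: the class SharedHashTableCache, its methods, the operator IDs and the shared name "customer" are all hypothetical; only the key format is taken from the snippet quoted in the description.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical model of a per-process cache of loaded hashtables. */
final class SharedHashTableCache {
  private final Map<String, Object> cache = new ConcurrentHashMap<>();
  private final AtomicInteger loads = new AtomicInteger();

  /** Builds a key in the format quoted in the issue; the name part is what varies. */
  static String cacheKey(String name) {
    return "HASH_MAP_" + name + "_container";
  }

  /** Returns the cached table for the key, running the loader only on a miss. */
  Object retrieve(String key, Supplier<Object> loader) {
    return cache.computeIfAbsent(key, k -> {
      loads.incrementAndGet();
      return loader.get();
    });
  }

  /** Number of times a loader actually ran. */
  int loadCount() {
    return loads.get();
  }
}
```

Keying by three distinct operator IDs runs the loader three times, whereas three lookups under one shared name run it once and hand back the same object. The caveat from the description still applies: a shared key is only safe when the tables really are identical in layout and come from the same upstream task.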




[jira] [Commented] (HIVE-20623) Shared work: Extend sharing of map-join cache entries in LLAP

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629703#comment-16629703
 ] 

Hive QA commented on HIVE-20623:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 53s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 20s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14071/dev-support/hive-personality.sh |
| git revision | master / b49f02c |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14071/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Shared work: Extend sharing of map-join cache entries in LLAP
> -
>
> Key: HIVE-20623
> URL: https://issues.apache.org/jira/browse/HIVE-20623
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Logical Optimizer
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20623.01.patch, HIVE-20623.02.patch, 
> HIVE-20623.02.patch, HIVE-20623.patch, hash-shared-work.json.txt, 
> hash-shared-work.svg
>
>
> For a query like this
> {code}
> with all_sales as (
> select ss_customer_sk as customer_sk, ss_ext_list_price-ss_ext_discount_amt 
> as ext_price from store_sales
> UNION ALL
> select ws_bill_customer_sk as customer_sk, 
> ws_ext_list_price-ws_ext_discount_amt as ext_price from web_sales
> UNION ALL
> select cs_bill_customer_sk as customer_sk, cs_ext_sales_price - 
> cs_ext_discount_amt as ext_price from catalog_sales)
> select sum(ext_price) total_price, c_customer_id from all_sales, customer 
> where customer_sk = c_customer_sk
> group by c_customer_id
> order by total_price desc 
> limit 100;
> {code}
> The hashtable used for all 3

[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629688#comment-16629688
 ] 

Hive QA commented on HIVE-20535:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941384/HIVE-20535.17.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14998 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow.testKillQuery (batchId=252)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14070/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14070/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14070/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941384 - PreCommit-HIVE-Build

> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.17.patch, HIVE-20535.2.patch, HIVE-20535.3.patch, 
> HIVE-20535.4.patch, HIVE-20535.5.patch, HIVE-20535.6.patch, 
> HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> When removing the compile lock, it is quite risky to remove it entirely.
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
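
One way to picture the proposal above is a counting semaphore that admits at most N compilations at a time instead of a single global lock. The following is only a minimal sketch under that assumption, not the approach taken in the attached patches: the class name BoundedCompileLock, the poolSize parameter and the compile method are hypothetical and do not refer to Hive code or to any Hive configuration property.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/**
 * Hypothetical sketch (not Hive code): bounds how many query
 * compilations may run concurrently.
 */
final class BoundedCompileLock {
    private final Semaphore permits;

    /** poolSize is the administrator-chosen cap on parallel compilations. */
    BoundedCompileLock(int poolSize) {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be >= 1");
        }
        // Fair mode: waiting compilations are admitted in arrival order.
        this.permits = new Semaphore(poolSize, true);
    }

    /**
     * Blocks until a slot is free, runs the compile step, and always
     * returns the slot, even if the compile step throws.
     */
    <T> T compile(Supplier<T> compileStep) {
        permits.acquireUninterruptibly();
        try {
            return compileStep.get();
        } finally {
            permits.release();
        }
    }

    /** Number of compilations that could start right now without waiting. */
    int availableSlots() {
        return permits.availablePermits();
    }
}
```

With poolSize set to 1 this sketch behaves like a single global compile lock; larger values trade more parallel compilation for more load, which is the knob the description asks for.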



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20629) Hive incremental replication fails with events missing error if database is kept idle for more than an hour

2018-09-26 Thread mahesh kumar behera (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-20629:
---
Status: Patch Available  (was: Open)

> Hive incremental replication fails with events missing error if database is 
> kept idle for more than an hour 
> 
>
> Key: HIVE-20629
> URL: https://issues.apache.org/jira/browse/HIVE-20629
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20629.01.patch, HIVE-20629.02.patch, 
> HIVE-20629.03.patch
>
>
> Start a source cluster with 2 databases. Replicate the databases to the target 
> cluster after performing some operations. Keep taking incremental dumps of both 
> databases and replicating them to the target cluster. Keep one of the databases 
> idle for more than 24 hrs. After 24 hrs, the incremental dump of the idle 
> database fails with an events missing error.




[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run the get_splits query "select get_splits( select a from t1 where a > 5);". 
> This fails with an AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Fix Version/s: 4.0.0


[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629673#comment-16629673
 ] 

Sankar Hariappan commented on HIVE-20632:
-

02.patch is committed to master!

Thanks [~maheshk114] and [~jcamachorodriguez]!


[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Affects Version/s: (was: 3.2.0)


[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Target Version/s: 4.0.0  (was: 4.0.0, 3.2.0)


[jira] [Commented] (HIVE-20535) Add new configuration to set the size of the global compile lock

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629668#comment-16629668
 ] 

Hive QA commented on HIVE-20535:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 51s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 31s{color} | {color:blue} common in master has 65 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 48s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 40s{color} | {color:red} ql: The patch generated 2 new + 142 unchanged - 6 fixed = 144 total (was 148) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 57s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14070/dev-support/hive-personality.sh |
| git revision | master / b6c0cd4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14070/yetus/diff-checkstyle-ql.txt |
| modules | C: common ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14070/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add new configuration to set the size of the global compile lock
> 
>
> Key: HIVE-20535
> URL: https://issues.apache.org/jira/browse/HIVE-20535
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Reporter: denys kuzmenko
>Assignee: denys kuzmenko
>Priority: Major
> Attachments: HIVE-20535.1.patch, HIVE-20535.10.patch, 
> HIVE-20535.11.patch, HIVE-20535.12.patch, HIVE-20535.13.patch, 
> HIVE-20535.14.patch, HIVE-20535.15.patch, HIVE-20535.16.patch, 
> HIVE-20535.17.patch, HIVE-20535.2.patch, HIVE-20535.3.patch, 
> HIVE-20535.4.patch, HIVE-20535.5.patch, HIVE-20535.6.patch, 
> HIVE-20535.8.patch, HIVE-20535.9.patch
>
>
> When removing the compile lock, it is quite risky to remove it entirely.
> It would be good to provide a pool size for concurrent compilation, so 
> the administrator can limit the load.
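One way to picture the proposal is a counting semaphore that caps how many queries compile at the same time. The sketch below is only an illustration under that assumption: `CompilePool`, `poolSize`, and `compile` are hypothetical names, not Hive classes or configuration keys, and the attached patches may take a different approach.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical sketch: caps concurrent compilations with a counting semaphore. */
class CompilePool {
    private final Semaphore permits;

    /** poolSize is an assumed setting; a value of 1 behaves like one global lock. */
    CompilePool(int poolSize) {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be >= 1");
        }
        // Fair semaphore: waiting compilations get permits in arrival order.
        this.permits = new Semaphore(poolSize, true);
    }

    /** Waits for a free permit, runs the compilation, and always returns the permit. */
    <T> T compile(Supplier<T> compilation) {
        permits.acquireUninterruptibly();
        try {
            return compilation.get();
        } finally {
            permits.release();
        }
    }

    /** Number of compilations that could start right now without waiting. */
    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        CompilePool pool = new CompilePool(2);
        String plan = pool.compile(() -> "compiled: SELECT 1");
        System.out.println(plan);
    }
}
```

With a pool size of 1 this degenerates to the existing single-lock behaviour, which is why a size setting can be seen as a less risky step than removing the lock outright.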



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629641#comment-16629641
 ] 

Hive QA commented on HIVE-17917:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941365/HIVE-17917.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14998 tests 
executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsBinary.testOpenConnectionMetrics
 (batchId=256)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14069/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14069/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14069/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941365 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Attachments: HIVE-17917.2.patch, HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() 
> computation is currently (after HIVE-17458) done once per split.  It could 
> instead be done once per file (since the result is the same for each split of 
> the same file) and passed along in OrcSplit.
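The idea of computing a value once per file and handing it to every split can be sketched as follows. This is a simplified illustration, not the Hive implementation: `SplitPlanner`, `Split`, `rowOffset`, and `planSplits` are hypothetical stand-ins, and the real OrcSplit carries different fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: compute a per-file value once and pass it along in each split. */
class SplitPlanner {

    /** Stand-in for a split that carries a value computed at planning time. */
    static final class Split {
        final String file;
        final int index;
        final long rowOffset; // computed once per file, then copied into every split

        Split(String file, int index, long rowOffset) {
            this.file = file;
            this.index = index;
            this.rowOffset = rowOffset;
        }
    }

    /** Calls the (assumed expensive) offsetForFile function once, not once per split. */
    static List<Split> planSplits(String file, int splitCount,
                                  ToLongFunction<String> offsetForFile) {
        long rowOffset = offsetForFile.applyAsLong(file);
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < splitCount; i++) {
            splits.add(new Split(file, i, rowOffset));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Split> splits = planSplits("bucket_00000", 3, f -> 42L);
        System.out.println(splits.size() + " splits share offset " + splits.get(0).rowOffset);
    }
}
```

A reader that receives such a split can use the carried value directly instead of recomputing it.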




[jira] [Commented] (HIVE-17917) VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629627#comment-16629627
 ] 

Hive QA commented on HIVE-17917:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 20s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 27s{color} | {color:red} root in master failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 48s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 23s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 23s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 19s{color} | {color:green} The patch . passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} ql: The patch generated 0 new + 430 unchanged - 2 fixed = 430 total (was 432) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 57s{color} | {color:green} ql generated 0 new + 2325 unchanged - 1 fixed = 2325 total (was 2326) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m  7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 14s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14069/dev-support/hive-personality.sh |
| git revision | master / b6c0cd4 |
| Default Java | 1.8.0_111 |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14069/yetus/branch-compile-root.txt |
| findbugs | v3.0.0 |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14069/yetus/patch-compile-root.txt |
| javac | http://104.198.109.242/logs//PreCommit-HIVE-Build-14069/yetus/patch-compile-root.txt |
| modules | C: . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14069/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket optimization
> ---
>
> Key: HIVE-17917
> URL: https://issues.apache.org/jira/browse/HIVE-17917
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: Eugene Koifman
>Assignee: Saurabh Seth
>Priority: Minor
> Attachments: HIVE-17917.2.patch, HIVE-17917.patch
>
>
> The VectorizedOrcAcidRowBatchReader.computeOffsetAndBucket() 
> computation is currently (after HIVE-17458) done once per split.  It could 
> instead be done once per file (since the result is the same for each split of 
> the same file) and passed along in OrcSplit.




[jira] [Updated] (HIVE-20150) TopNKey pushdown

2018-09-26 Thread Teddy Choi (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-20150:
--
Attachment: HIVE-20150.11.patch

> TopNKey pushdown
> 
>
> Key: HIVE-20150
> URL: https://issues.apache.org/jira/browse/HIVE-20150
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Teddy Choi
>Assignee: Teddy Choi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20150.1.patch, HIVE-20150.10.patch, 
> HIVE-20150.11.patch, HIVE-20150.2.patch, HIVE-20150.4.patch, 
> HIVE-20150.5.patch, HIVE-20150.6.patch, HIVE-20150.7.patch, 
> HIVE-20150.8.patch, HIVE-20150.9.patch
>
>
> TopNKey operator is implemented in HIVE-17896, but it needs more work in 
> pushdown implementation. So this issue covers TopNKey pushdown implementation 
> with proper tests.




[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629582#comment-16629582
 ] 

Prasanth Jayachandran commented on HIVE-20540:
--

OK. I was under the impression that this bucketNum is set for this rowNum. I am 
not exactly sure how or where the setRowNum and setBucketNum methods are invoked. 

 

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Followup to HIVE-20510 with remaining issues,
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, use correct place to setup the VectorExpression. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this.
> cc [~gopalv]
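Item 3 above describes guarding a stored value with a flag so that it cannot be replaced before it has been consumed. A minimal sketch of that pattern, assuming a single writer and a single reader on the same thread, might look like this. `PendingValue`, `offer`, and `take` are hypothetical names and do not reflect the actual BucketNumExpression code.

```java
/** Hypothetical sketch: a flag prevents overwriting a value that was not processed yet. */
class PendingValue {
    private int value;
    private boolean pending; // true while the stored value still waits to be processed

    /** Stores newValue only if the previous value was already processed. */
    boolean offer(int newValue) {
        if (pending) {
            return false; // refuse to overwrite an unprocessed value
        }
        value = newValue;
        pending = true;
        return true;
    }

    /** Hands out the stored value and clears the flag so the next value can be set. */
    int take() {
        if (!pending) {
            throw new IllegalStateException("no value waiting to be processed");
        }
        pending = false;
        return value;
    }

    public static void main(String[] args) {
        PendingValue bucket = new PendingValue();
        System.out.println(bucket.offer(3)); // accepted
        System.out.println(bucket.offer(7)); // rejected, 3 is still pending
        System.out.println(bucket.take());   // hands out 3
    }
}
```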




[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629583#comment-16629583
 ] 

Prasanth Jayachandran commented on HIVE-20540:
--

+1

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Followup to HIVE-20510 with remaining issues,
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, use correct place to setup the VectorExpression. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this.
> cc [~gopalv]




[jira] [Commented] (HIVE-20638) Upgrade version of Jetty to 9.3.25.v20180904

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629575#comment-16629575
 ] 

Hive QA commented on HIVE-20638:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941363/HIVE-20638.01.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 14994 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.hbase.TestHBaseSerDe.testHBaseSerDeWithAvroSchemaUrl 
(batchId=197)
org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS
 (batchId=322)
org.apache.hive.hcatalog.cli.TestPermsGrp.testCustomPerms (batchId=209)
org.apache.hive.hcatalog.templeton.TestWebHCatE2e.org.apache.hive.hcatalog.templeton.TestWebHCatE2e
 (batchId=205)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14068/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14068/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14068/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941363 - PreCommit-HIVE-Build

> Upgrade version of Jetty to 9.3.25.v20180904
> 
>
> Key: HIVE-20638
> URL: https://issues.apache.org/jira/browse/HIVE-20638
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20638.01.patch
>
>
> Current version is 9.3.20.v20170531




[jira] [Commented] (HIVE-18876) Remove Superfluous Logging in Driver

2018-09-26 Thread Naveen Gangam (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-18876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629557#comment-16629557
 ] 

Naveen Gangam commented on HIVE-18876:
--

Looks good to me. +1 from me.

> Remove Superfluous Logging in Driver
> 
>
> Key: HIVE-18876
> URL: https://issues.apache.org/jira/browse/HIVE-18876
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Trivial
>  Labels: noob
> Attachments: HIVE-18876.1.patch, HIVE-18876.2.patch
>
>
> [https://github.com/apache/hive/blob/a4198f584aa0792a16d1e1eeb2ef3147403b8acb/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L2188-L2190]
> {code:java}
> if (console != null) {
>   console.printInfo("OK");
> }
> {code}
>  # Console can never be 'null'
>  # OK is not an informative logging message, and in the HiveServer2 logs, it 
> is often interwoven into the logging and is pretty much useless on its own, 
> without having to trace back through the logs to see what happened before it. 
>  This is also printed out, even if an error occurred
> Please remove this block of code.




[jira] [Commented] (HIVE-20638) Upgrade version of Jetty to 9.3.25.v20180904

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629553#comment-16629553
 ] 

Hive QA commented on HIVE-20638:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 48s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 24s{color} | {color:red} root in master failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 22s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 27s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 20s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 20s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 21s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  xml  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14068/dev-support/hive-personality.sh |
| git revision | master / b6c0cd4 |
| Default Java | 1.8.0_111 |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14068/yetus/branch-compile-root.txt |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14068/yetus/patch-compile-root.txt |
| javac | http://104.198.109.242/logs//PreCommit-HIVE-Build-14068/yetus/patch-compile-root.txt |
| modules | C: . U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14068/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Upgrade version of Jetty to 9.3.25.v20180904
> 
>
> Key: HIVE-20638
> URL: https://issues.apache.org/jira/browse/HIVE-20638
> Project: Hive
>  Issue Type: Bug
>Reporter: Laszlo Bodor
>Assignee: Laszlo Bodor
>Priority: Major
> Attachments: HIVE-20638.01.patch
>
>
> Current version is 9.3.20.v20170531




[jira] [Updated] (HIVE-19378) "hive.lock.numretries" Is Misleading

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19378:
-
Status: Open  (was: Patch Available)

> "hive.lock.numretries" Is Misleading
> 
>
> Key: HIVE-19378
> URL: https://issues.apache.org/jira/browse/HIVE-19378
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0, 2.4.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19378.1.patch
>
>
> Configuration 'hive.lock.numretries' is confusing.  It's not actually a 
> 'retry' count; it's the total number of attempts:
>  
> {code:java|title=ZooKeeperHiveLockManager.java}
> do {
>   lastException = null;
>   tryNum++;
>   try {
>     if (tryNum > 1) {
>       Thread.sleep(sleepTime);
>       prepareRetry();
>     }
>     ret = lockPrimitive(key, mode, keepAlive, parentCreated, conflictingLocks);
>     ...
> } while (tryNum < numRetriesForLock);
> {code}
> So, from this code you can see that on the first loop, {{tryNum}} is set to 
> 1, in which case, if the configuration num*retries* is set to 1, there will 
> be one attempt total.  With a *retry* value of 1, I would assume one initial 
> attempt and one additional retry.  Please change to:
> {code}
> while (tryNum <= numRetriesForLock);
> {code}
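The difference between the two loop conditions can be checked with a small stand-alone sketch. It is not the ZooKeeperHiveLockManager code: `LockRetryDemo` and `attempts` are hypothetical, and every lock attempt is assumed to fail so that the loop runs to its limit.

```java
/** Hypothetical sketch: counts loop iterations for the two do/while conditions. */
class LockRetryDemo {

    /**
     * Returns how many attempts are made when every attempt fails.
     * inclusive=false models "tryNum < numRetriesForLock",
     * inclusive=true models "tryNum <= numRetriesForLock".
     */
    static int attempts(int numRetriesForLock, boolean inclusive) {
        int tryNum = 0;
        boolean again;
        do {
            tryNum++;
            // A real implementation would try to take the lock here; this sketch always fails.
            again = inclusive ? tryNum <= numRetriesForLock : tryNum < numRetriesForLock;
        } while (again);
        return tryNum;
    }

    public static void main(String[] args) {
        System.out.println("'<'  with 1 retry: " + attempts(1, false) + " attempt(s)");
        System.out.println("'<=' with 1 retry: " + attempts(1, true) + " attempt(s)");
    }
}
```

With a setting of 1, the `<` form makes one attempt in total, while the `<=` form makes the initial attempt plus one retry, which matches the usual reading of a retry count.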




[jira] [Updated] (HIVE-16370) Avro data type null not supported on partitioned tables

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-16370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-16370:
-
Status: Open  (was: Patch Available)

> Avro data type null not supported on partitioned tables
> ---
>
> Key: HIVE-16370
> URL: https://issues.apache.org/jira/browse/HIVE-16370
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1, 1.1.0
>Reporter: rui miranda
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-16370.01-branch-1.patch, HIVE-16370.2.patch
>
>
> I was attempting to create Hive tables over some partitioned Avro files. It 
> seems the void data type (Avro null) is not supported on partitioned tables 
> (I could not replicate the bug on an un-partitioned table).
> ---
> I managed to replicate the bug on two different Hive versions:
> Hive 1.1.0-cdh5.10.0
> Hive 2.1.1-amzn-0
> 
> How to replicate (Avro tools are required to create the Avro files):
> $ wget 
> http://mirror.serversupportforum.de/apache/avro/avro-1.8.1/java/avro-tools-1.8.1.jar
> $ mkdir /tmp/avro
> $ mkdir /tmp/avro/null
> $ echo "{ \
>   \"type\" : \"record\", \
>   \"name\" : \"null_failure\", \
>   \"namespace\" : \"org.apache.avro.null_failure\", \
>   \"doc\":\"the purpose of this schema is to replicate the hive avro null 
> failure\", \
>   \"fields\" : [{\"name\":\"one\", \"type\":\"null\",\"default\":null}] \
> } " > /tmp/avro/null/schema.avsc
> $ echo "{\"one\":null}" > /tmp/avro/null/data.json
> $ java -jar avro-tools-1.8.1.jar fromjson --schema-file 
> /tmp/avro/null/schema.avsc /tmp/avro/null/data.json > /tmp/avro/null/data.avro
> $ hdfs dfs -mkdir /tmp/avro
> $ hdfs dfs -mkdir /tmp/avro/null
> $ hdfs dfs -mkdir /tmp/avro/null/schema
> $ hdfs dfs -mkdir /tmp/avro/null/data
> $ hdfs dfs -mkdir /tmp/avro/null/data/foo=bar
> $ hdfs dfs -copyFromLocal /tmp/avro/null/schema.avsc 
> /tmp/avro/null/schema/schema.avsc
> $ hdfs dfs -copyFromLocal /tmp/avro/null/data.avro 
> /tmp/avro/null/data/foo=bar/data.avro
> $ hive 
> hive> CREATE EXTERNAL TABLE avro_null
> PARTITIONED BY (foo string)
>   ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>   STORED as INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>   OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION
> '/tmp/avro/null/data/'
>   TBLPROPERTIES (
> 'avro.schema.url'='/tmp/avro/null/schema/schema.avsc')
> ;
> OK
> Time taken: 3.127 seconds
> hive> msck repair table avro_null;
> OK
> Partitions not in metastore:  avro_null:foo=bar
> Repair: Added partition to metastore avro_null:foo=bar
> Time taken: 0.712 seconds, Fetched: 2 row(s)
> hive> select * from avro_null;
> FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
> Failed with exception Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported 
> yet.java.lang.RuntimeException: Hive internal error inside 
> isAssignableFromSettablePrimitiveOI void not supported yet.
> hive> select foo, count(1)  from avro_null group by foo;
> OK
> bar   1
> Time taken: 29.806 seconds, Fetched: 1 row(s)




[jira] [Assigned] (HIVE-10296) Cast exception observed when hive runs a multi join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before apply

2018-09-26 Thread Karthik Manamcheri (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Manamcheri reassigned HIVE-10296:
-

Assignee: Karthik Manamcheri

> Cast exception observed when hive runs a multi join query on metastore 
> (postgres), since postgres pushes the filter into the join, and ignores the 
> condition before applying cast
> -
>
> Key: HIVE-10296
> URL: https://issues.apache.org/jira/browse/HIVE-10296
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Yash Datta
>Assignee: Karthik Manamcheri
>Priority: Major
>
> Try to drop a partition from hive:
> ALTER TABLE f___edr_bin_source___900_sub_id DROP IF EXISTS PARTITION ( 
> exporttimestamp=1427824800, timestamp=1427824800)
> This triggers a query on the metastore like this :
>  "select "PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on 
> "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join 
> "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 inner join 
> "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when 
> "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER0"."PART_KEY_VAL" 
> as decimal(21,0)) else null end) = ?) and ((case when "TBLS"."TBL_NAME" = ? 
> and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL" as decimal(21,0)) 
> else null end) = ?)) )"
> In some cases, when the internal tables in postgres (metastore) have some 
> amount of data, the query plan pushes the condition down into the join.
> Because of DERBY-6358, a CASE WHEN clause is wrapped around the cast, but in 
> this case the cast is evaluated before the condition. So when we 
> have different tables partitioned on string and integer columns, a cast 
> exception is observed!
> 15/04/06 08:41:20 ERROR metastore.ObjectStore: Direct SQL failed, falling 
> back to ORM 
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> "PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on 
> "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join 
> "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 inner join 
> "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when 
> "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER0"."PART_KEY_VAL" 
> as decimal(21,0)) else null end) = ?) and ((case when "TBLS"."TBL_NAME" = ? 
> and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL" as decimal(21,0)) 
> else null end) = ?)) )". 
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>  
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:300)
>  
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:211)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1915)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1909)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1909)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1882)
>  
> org.postgresql.util.PSQLException: ERROR: invalid input syntax for type 
> numeric: "__DEFAULT_BINSRC__" 
> 15/04/06 08:41:20 INFO metastore.ObjectStore: JDO filter pushdown cannot be 
> used: Filtering is supported only on partition keys of type string 
> 15/04/06 08:41:20 ERROR metastore.ObjectStore: 
> javax.jdo.JDOException: Exception thrown when executing query 
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
>  
> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) 
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesNoTxn(ObjectStore.java:1700)
>  
> at 
>

[jira] [Commented] (HIVE-10296) Cast exception observed when hive runs a multi join query on metastore (postgres), since postgres pushes the filter into the join, and ignores the condition before appl

2018-09-26 Thread Karthik Manamcheri (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629547#comment-16629547
 ] 

Karthik Manamcheri commented on HIVE-10296:
---

[~saucam] [~sershe] I am also facing this and would like to fix it. Is the 
suggested fix to add it to the list?

To answer your question about DEFAULT_BINSRC, I am sure it's the first 
non-numeric value in PARTITION_KEY_VALS.
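The failure mode described in the issue, a cast that runs before the condition meant to guard it, can be imitated outside the database. The sketch below is an analogy in plain Java, not metastore code: `CastOrderDemo` and its two methods are hypothetical, and `BigDecimal` parsing stands in for the SQL `cast(... as decimal(21,0))`.

```java
import java.math.BigDecimal;

/** Hypothetical sketch: evaluation order decides whether a non-numeric key breaks the filter. */
class CastOrderDemo {

    /** Casts first, then checks the table: fails on non-numeric keys of unrelated tables. */
    static boolean castThenFilter(String table, String key, String wantedTable, BigDecimal wanted) {
        BigDecimal parsed = new BigDecimal(key); // NumberFormatException for "__DEFAULT_BINSRC__"
        return table.equals(wantedTable) && parsed.compareTo(wanted) == 0;
    }

    /** Checks the table first, so the cast only runs on rows of the wanted table. */
    static boolean filterThenCast(String table, String key, String wantedTable, BigDecimal wanted) {
        return table.equals(wantedTable) && new BigDecimal(key).compareTo(wanted) == 0;
    }

    public static void main(String[] args) {
        BigDecimal wanted = new BigDecimal("1427824800");
        // Guarded order: the string-partitioned row is skipped without being cast.
        System.out.println(filterThenCast("other_table", "__DEFAULT_BINSRC__", "edr", wanted));
        try {
            castThenFilter("other_table", "__DEFAULT_BINSRC__", "edr", wanted);
        } catch (NumberFormatException e) {
            System.out.println("cast ran before the condition and failed");
        }
    }
}
```

In SQL the planner, not the query text, decides this order, which is why the CASE WHEN guard in the generated query does not help once the condition is pushed into the join.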

> Cast exception observed when hive runs a multi join query on metastore 
> (postgres), since postgres pushes the filter into the join, and ignores the 
> condition before applying cast
> -
>
> Key: HIVE-10296
> URL: https://issues.apache.org/jira/browse/HIVE-10296
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Yash Datta
>Priority: Major
>
> Try to drop a partition from hive:
> ALTER TABLE f___edr_bin_source___900_sub_id DROP IF EXISTS PARTITION ( 
> exporttimestamp=1427824800, timestamp=1427824800)
> This triggers a query on the metastore like this :
>  "select "PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on 
> "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join 
> "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 inner join 
> "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when 
> "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER0"."PART_KEY_VAL" 
> as decimal(21,0)) else null end) = ?) and ((case when "TBLS"."TBL_NAME" = ? 
> and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL" as decimal(21,0)) 
> else null end) = ?)) )"
> In some cases, when the internal tables in postgres (metastore) have some 
> amount of data, the query plan pushes the condition down into the join.
> Because of DERBY-6358, a CASE WHEN clause is wrapped around the cast, but in 
> this case the cast is evaluated before the condition. So when we 
> have different tables partitioned on string and integer columns, a cast 
> exception is observed!
> 15/04/06 08:41:20 ERROR metastore.ObjectStore: Direct SQL failed, falling 
> back to ORM 
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> "PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on 
> "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join 
> "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join 
> "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 inner join 
> "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when 
> "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER0"."PART_KEY_VAL" 
> as decimal(21,0)) else null end) = ?) and ((case when "TBLS"."TBL_NAME" = ? 
> and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL" as decimal(21,0)) 
> else null end) = ?)) )". 
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)
>  
> at 
> org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:300)
>  
> at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:211)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1915)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1909)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1909)
>  
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1882)
>  
> org.postgresql.util.PSQLException: ERROR: invalid input syntax for type 
> numeric: "__DEFAULT_BINSRC__" 
> 15/04/06 08:41:20 INFO metastore.ObjectStore: JDO filter pushdown cannot be 
> used: Filtering is supported only on partition keys of type string 
> 15/04/06 08:41:20 ERROR metastore.ObjectStore: 
> javax.jdo.JDOException: Exception thrown when executing query 
> at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
>  
> at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) 
> at 
>

[jira] [Updated] (HIVE-17231) ColumnizedDeleteEventRegistry.DeleteReaderValue optimization

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-17231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17231:
--
Description: 
 For unbucketed tables DeleteReaderValue will currently return all delete 
events.  Once we trust that
 the N in bucketN for "base" split is reliable, all delete events not 
matching N can be skipped.

This is useful to protect against extreme cases where someone runs an 
update/delete on a partition that matches 10 billion rows thus generates very 
many delete events.

Since HIVE-19890 all acid data files must have bucketid/writerid in the file 
name match bucketid/writerid in ROW__ID in the data.

{{OrcRawRecordMerger.getDeltaFiles()}} should only return files representing 
the right {{bucket}}

  was:
 For unbucketed tables DeleteReaderValue will currently return all delete 
events.  Once we trust that
 the N in bucketN for "base" split is reliable, all delete events not 
matching N can be skipped.

This is useful to protect against extreme cases where someone runs an 
update/delete on a partition that matches 10 billion rows thus generates very 
many delete events.

Since HIVE-19890 all acid data files must have bucketid/writerid in the file 
name match bucketid/writerid in ROW__ID in the data.


> ColumnizedDeleteEventRegistry.DeleteReaderValue optimization
> 
>
> Key: HIVE-17231
> URL: https://issues.apache.org/jira/browse/HIVE-17231
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Priority: Major
>
>  For unbucketed tables DeleteReaderValue will currently return all delete 
> events.  Once we trust that
>  the N in bucketN for "base" split is reliable, all delete events not 
> matching N can be skipped.
> This is useful to protect against extreme cases where someone runs an 
> update/delete on a partition that matches 10 billion rows thus generates very 
> many delete events.
> Since HIVE-19890 all acid data files must have bucketid/writerid in the file 
> name match bucketid/writerid in ROW__ID in the data.
> {{OrcRawRecordMerger.getDeltaFiles()}} should only return files representing 
> the right {{bucket}}
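
The skip described above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not Hive's implementation: `DeleteEvent` and `keepForBucket` are hypothetical names, and the bucket id is modeled as a plain int rather than whatever encoding Hive uses inside ROW__ID.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteEventBucketFilter {

    /** Hypothetical stand-in for one delete event; not a Hive class. */
    public static final class DeleteEvent {
        final int bucketId;
        final long rowId;

        public DeleteEvent(int bucketId, long rowId) {
            this.bucketId = bucketId;
            this.rowId = rowId;
        }
    }

    /**
     * Keeps only the delete events whose bucket matches the bucket N of the
     * split being read. Events from any other bucket cannot apply to rows in
     * this split (assuming the N in bucketN is reliable), so they are dropped.
     */
    public static List<DeleteEvent> keepForBucket(List<DeleteEvent> events, int splitBucket) {
        List<DeleteEvent> kept = new ArrayList<>();
        for (DeleteEvent e : events) {
            if (e.bucketId == splitBucket) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<DeleteEvent> all = new ArrayList<>();
        all.add(new DeleteEvent(0, 10L));
        all.add(new DeleteEvent(1, 11L));
        all.add(new DeleteEvent(1, 12L));
        System.out.println(keepForBucket(all, 1).size());
    }
}
```

The benefit only materializes if the filter is applied before the events are loaded into memory, which is why the description suggests doing it at the level of the files returned for a bucket.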



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-20593) Load Data for partitioned ACID tables fails with bucketId out of range: -1

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20593:
--
Fix Version/s: 4.0.0

> Load Data for partitioned ACID tables fails with bucketId out of range: -1
> --
>
> Key: HIVE-20593
> URL: https://issues.apache.org/jira/browse/HIVE-20593
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20593.1.patch, HIVE-20593.2.patch, 
> HIVE-20593.3.patch
>
>
> Load data for ACID tables is failing to load ORC files when it is converted 
> to IAS job.
>  
> The tempTblObj is inherited from target table. However, the only table 
> property which needs to be inherited is bucketing version. Properties like 
> transactional etc should be ignored.
>  




[jira] [Updated] (HIVE-20510) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20510:
--
Fix Version/s: 4.0.0

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer
> 
>
> Key: HIVE-20510
> URL: https://issues.apache.org/jira/browse/HIVE-20510
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20510.1.patch, HIVE-20510.2.patch, 
> HIVE-20510.3.patch, HIVE-20510.4.patch, HIVE-20510.5.patch
>
>
> sorted dynamic partition optimizer does not work on bucketed tables when 
> vectorization is enabled.
>  
> cc [~mmccline]




[jira] [Commented] (HIVE-20510) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer

2018-09-26 Thread Deepak Jaiswal (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629544#comment-16629544
 ] 

Deepak Jaiswal commented on HIVE-20510:
---

Just updated. Thanks for reminding.

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer
> 
>
> Key: HIVE-20510
> URL: https://issues.apache.org/jira/browse/HIVE-20510
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20510.1.patch, HIVE-20510.2.patch, 
> HIVE-20510.3.patch, HIVE-20510.4.patch, HIVE-20510.5.patch
>
>
> sorted dynamic partition optimizer does not work on bucketed tables when 
> vectorization is enabled.
>  
> cc [~mmccline]




[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Deepak Jaiswal (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629543#comment-16629543
 ] 

Deepak Jaiswal commented on HIVE-20540:
---

They both are set separately. Hence individual flags.

 

setRowNum(), setBucketNum() setBucketNum() may result in wrong results.

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Followup to HIVE-20510 with remaining issues,
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, use correct place to setup the VectorExpression. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this.
> cc [~gopalv]
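
Item 3 above (do not overwrite a value before it is processed, guarded by a flag) can be illustrated with a small Java sketch. This is a generic pattern only: `GuardedValue`, `trySet` and `consume` are hypothetical names, and nothing here is taken from BucketNumExpression itself.

```java
public class GuardedValue {
    private int value;
    // True while a value has been set but not yet consumed.
    private boolean pending;

    /**
     * Stores a new value only if the previous one was already consumed.
     * Returns false (and leaves the stored value untouched) otherwise.
     */
    public boolean trySet(int newValue) {
        if (pending) {
            return false;
        }
        value = newValue;
        pending = true;
        return true;
    }

    /** Returns the stored value and clears the flag so the next set succeeds. */
    public int consume() {
        if (!pending) {
            throw new IllegalStateException("no value pending");
        }
        pending = false;
        return value;
    }

    public static void main(String[] args) {
        GuardedValue bucketNum = new GuardedValue();
        System.out.println(bucketNum.trySet(3)); // first set is accepted
        System.out.println(bucketNum.trySet(4)); // rejected: 3 not yet consumed
        System.out.println(bucketNum.consume());
    }
}
```

If two values are set independently, as the comment above says of the row number and bucket number, each would need its own flag of this kind.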




[jira] [Assigned] (HIVE-20642) Add Tests for HIVE-12812

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan reassigned HIVE-20642:



> Add Tests for HIVE-12812
> 
>
> Key: HIVE-20642
> URL: https://issues.apache.org/jira/browse/HIVE-20642
> Project: Hive
>  Issue Type: Test
>Affects Versions: 4.0.0
>Reporter: Alice Fan
>Assignee: Alice Fan
>Priority: Minor
>





[jira] [Commented] (HIVE-20319) group by and union all always generate empty query result

2018-09-26 Thread Alice Fan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629540#comment-16629540
 ] 

Alice Fan commented on HIVE-20319:
--

Hi [~tiana528],
This issue will be resolved by applying the patch for 
[HIVE-12812|https://issues.apache.org/jira/browse/HIVE-12812]. The union remove 
optimizer writes the result directly to nested subdirectories, which requires 
mapred.input.dir.recursive to be enabled. As a workaround, run 'set 
mapred.input.dir.recursive=true' at the session level; you should then see the 
correct result. I will add your example as a test case for HIVE-12812. Thanks.

> group by and union all always generate empty query result
> -
>
> Key: HIVE-20319
> URL: https://issues.apache.org/jira/browse/HIVE-20319
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.3.2
> Environment: Run on MR, hadoop 2.7.3
>Reporter: Wang Yan
>Priority: Blocker
>
> The following query always generates empty results which is wrong.
> {code:sql}
> create table if not exists test_table(column1 string, column2 int);
> insert into test_table values('a',1),('b',2);
> set hive.optimize.union.remove=true;
> select column1 from test_table group by column1
> union all
> select column1 from test_table group by column1;
> {code}
> Actual result : empty
> Expected result: 
> {code:java}
> a
> b
> a
> b
> {code}
> Note that correct result is generated when set 
> hive.optimize.union.remove=false.
> It seems like the fix in HIVE-12788 is insufficient.




[jira] [Commented] (HIVE-20637) Allow any udfs with 0 arguments or with constant arguments as part of default clause

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629519#comment-16629519
 ] 

Hive QA commented on HIVE-20637:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941353/HIVE-20637.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14996 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14067/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14067/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14067/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941353 - PreCommit-HIVE-Build

> Allow any udfs with 0 arguments or with constant arguments as part of default 
> clause
> 
>
> Key: HIVE-20637
> URL: https://issues.apache.org/jira/browse/HIVE-20637
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Affects Versions: 3.0.1
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
> Fix For: 3.0.1
>
> Attachments: HIVE-20637.01.patch
>
>





[jira] [Commented] (HIVE-19891) inserting into external tables with custom partition directories may cause data loss

2018-09-26 Thread Peter Vary (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629516#comment-16629516
 ] 

Peter Vary commented on HIVE-19891:
---

Upps... :) Sorry [~sershe]... Some kind of keyboard shortcut?

> inserting into external tables with custom partition directories may cause 
> data loss
> 
>
> Key: HIVE-19891
> URL: https://issues.apache.org/jira/browse/HIVE-19891
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19891.01.patch, HIVE-19891.02.patch, 
> HIVE-19891.03.patch, HIVE-19891.04.patch, HIVE-19891.05.patch, 
> HIVE-19891.06.patch, HIVE-19891.07.patch, HIVE-19891.patch
>
>
> tbl1 is just used as a prop to create data, could be an existing directory 
> for an external table.
> Due to weird behavior of LoadTableDesc (some ancient code for overriding old 
> partition path), custom partition path is overwritten after the query and the 
> data in it ceases being a part of the table (can be seen in desc formatted 
> output with masking commented out in QTestUtil)
> This affects branch-1 too, so it's pretty old.
> {noformat}drop table tbl1;
> CREATE TABLE tbl1 (index int, value int ) PARTITIONED BY ( created_date 
> string );
> insert into tbl1 partition(created_date='2018-02-01') VALUES (2, 2);
> CREATE external TABLE tbl2 (index int, value int ) PARTITIONED BY ( 
> created_date string );
> ALTER TABLE tbl2 ADD PARTITION(created_date='2018-02-01');
> ALTER TABLE tbl2 PARTITION(created_date='2018-02-01') SET LOCATION 
> 'file:/Users/sergey/git/hivegit/itests/qtest/target/warehouse/tbl1/created_date=2018-02-01';
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> insert into tbl2 partition(created_date='2018-02-01') VALUES (1, 1);
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> {noformat}




[jira] [Commented] (HIVE-19891) inserting into external tables with custom partition directories may cause data loss

2018-09-26 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629515#comment-16629515
 ] 

Sergey Shelukhin commented on HIVE-19891:
-

Hmm?

> inserting into external tables with custom partition directories may cause 
> data loss
> 
>
> Key: HIVE-19891
> URL: https://issues.apache.org/jira/browse/HIVE-19891
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Peter Vary
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19891.01.patch, HIVE-19891.02.patch, 
> HIVE-19891.03.patch, HIVE-19891.04.patch, HIVE-19891.05.patch, 
> HIVE-19891.06.patch, HIVE-19891.07.patch, HIVE-19891.patch
>
>
> tbl1 is just used as a prop to create data, could be an existing directory 
> for an external table.
> Due to weird behavior of LoadTableDesc (some ancient code for overriding old 
> partition path), custom partition path is overwritten after the query and the 
> data in it ceases being a part of the table (can be seen in desc formatted 
> output with masking commented out in QTestUtil)
> This affects branch-1 too, so it's pretty old.
> {noformat}drop table tbl1;
> CREATE TABLE tbl1 (index int, value int ) PARTITIONED BY ( created_date 
> string );
> insert into tbl1 partition(created_date='2018-02-01') VALUES (2, 2);
> CREATE external TABLE tbl2 (index int, value int ) PARTITIONED BY ( 
> created_date string );
> ALTER TABLE tbl2 ADD PARTITION(created_date='2018-02-01');
> ALTER TABLE tbl2 PARTITION(created_date='2018-02-01') SET LOCATION 
> 'file:/Users/sergey/git/hivegit/itests/qtest/target/warehouse/tbl1/created_date=2018-02-01';
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> insert into tbl2 partition(created_date='2018-02-01') VALUES (1, 1);
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> {noformat}




[jira] [Assigned] (HIVE-19891) inserting into external tables with custom partition directories may cause data loss

2018-09-26 Thread Sergey Shelukhin (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-19891:
---

Assignee: Sergey Shelukhin  (was: Peter Vary)

> inserting into external tables with custom partition directories may cause 
> data loss
> 
>
> Key: HIVE-19891
> URL: https://issues.apache.org/jira/browse/HIVE-19891
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19891.01.patch, HIVE-19891.02.patch, 
> HIVE-19891.03.patch, HIVE-19891.04.patch, HIVE-19891.05.patch, 
> HIVE-19891.06.patch, HIVE-19891.07.patch, HIVE-19891.patch
>
>
> tbl1 is just used as a prop to create data, could be an existing directory 
> for an external table.
> Due to weird behavior of LoadTableDesc (some ancient code for overriding old 
> partition path), custom partition path is overwritten after the query and the 
> data in it ceases being a part of the table (can be seen in desc formatted 
> output with masking commented out in QTestUtil)
> This affects branch-1 too, so it's pretty old.
> {noformat}drop table tbl1;
> CREATE TABLE tbl1 (index int, value int ) PARTITIONED BY ( created_date 
> string );
> insert into tbl1 partition(created_date='2018-02-01') VALUES (2, 2);
> CREATE external TABLE tbl2 (index int, value int ) PARTITIONED BY ( 
> created_date string );
> ALTER TABLE tbl2 ADD PARTITION(created_date='2018-02-01');
> ALTER TABLE tbl2 PARTITION(created_date='2018-02-01') SET LOCATION 
> 'file:/Users/sergey/git/hivegit/itests/qtest/target/warehouse/tbl1/created_date=2018-02-01';
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> insert into tbl2 partition(created_date='2018-02-01') VALUES (1, 1);
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> {noformat}




[jira] [Updated] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16812:
--
Attachment: HIVE-16812.06.patch

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch, HIVE-16812.06.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of base we can find min/max ROW__ID and only load events from delta that 
> are in [min,max] range.  This will reduce the number of delete events we load 
> in memory (to no more than there are in the split).
> When we support sorting on PK, the same should apply but we'd need to make 
> sure to store PKs in ORC index
> See {{OrcRawRecordMerger.discoverKeyBounds()}}
> {{hive.acid.key.index}} in Orc footer has an index of ROW__IDs so we should 
> know min/max easily for any file written by {{OrcRecordUpdater}}
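
The [min,max] idea above can be sketched in Java. This is an illustration under assumptions, not the proposed patch: `RowId` is a hypothetical stand-in that models ROW__ID as a (writeId, bucket, rowId) triple compared in that order, and `inRange` is a hypothetical helper that relies on the delete events already being sorted.

```java
import java.util.ArrayList;
import java.util.List;

public class DeleteEventRangeFilter {

    /** Hypothetical model of a ROW__ID; not Hive's own record identifier class. */
    public static final class RowId implements Comparable<RowId> {
        final long writeId;
        final int bucket;
        final long rowId;

        public RowId(long writeId, int bucket, long rowId) {
            this.writeId = writeId;
            this.bucket = bucket;
            this.rowId = rowId;
        }

        @Override
        public int compareTo(RowId o) {
            int c = Long.compare(writeId, o.writeId);
            if (c != 0) {
                return c;
            }
            c = Integer.compare(bucket, o.bucket);
            if (c != 0) {
                return c;
            }
            return Long.compare(rowId, o.rowId);
        }
    }

    /**
     * Returns the delete events that fall inside [min, max], the key bounds
     * of one split. Because the input is sorted, scanning stops at the first
     * event beyond max instead of reading the rest.
     */
    public static List<RowId> inRange(List<RowId> sortedDeletes, RowId min, RowId max) {
        List<RowId> kept = new ArrayList<>();
        for (RowId d : sortedDeletes) {
            if (d.compareTo(min) < 0) {
                continue; // before the split: cannot match any row in it
            }
            if (d.compareTo(max) > 0) {
                break; // past the split: later events are larger still
            }
            kept.add(d);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<RowId> deletes = new ArrayList<>();
        deletes.add(new RowId(1, 0, 5));
        deletes.add(new RowId(1, 0, 50));
        deletes.add(new RowId(2, 0, 7));
        System.out.println(inRange(deletes, new RowId(1, 0, 10), new RowId(1, 0, 99)).size());
    }
}
```

The number of events kept this way is bounded by what falls inside the split's key range, which is the memory saving the description is after.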




[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-26 Thread Alice Fan (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629510#comment-16629510
 ] 

Alice Fan commented on HIVE-19302:
--

Tested load_data_using_job.q and 
TestHs2ConnectionMetricsBinary.testOpenConnectionMetrics locally. They both 
passed with the patch 7.

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, there was perhaps a major issue based on the volume and 
> severity of logging.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.
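
The request above (INFO level, no stack trace for a missing table) can be sketched with the JDK's `java.util.logging`. This is only an illustration of the logging pattern: `TableNotFoundException`, `levelFor` and `logQueryFailure` are hypothetical names, and Hive's own logging framework and exception types are not shown here.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class TableNotFoundLogging {
    private static final Logger LOG =
        Logger.getLogger(TableNotFoundLogging.class.getName());

    /** Hypothetical exception for a user error; not Hive's exception class. */
    public static final class TableNotFoundException extends Exception {
        public TableNotFoundException(String tableName) {
            super(tableName);
        }
    }

    /** User mistakes are logged at INFO; anything else stays at SEVERE. */
    public static Level levelFor(Exception e) {
        return (e instanceof TableNotFoundException) ? Level.INFO : Level.SEVERE;
    }

    public static void logQueryFailure(Exception e) {
        if (e instanceof TableNotFoundException) {
            // Pass only the message as a parameter, not the Throwable,
            // so no stack trace is attached to the log record.
            LOG.log(levelFor(e), "Table not found: {0}", e.getMessage());
        } else {
            // Unexpected failures keep the full stack trace.
            LOG.log(levelFor(e), "Query failed", e);
        }
    }

    public static void main(String[] args) {
        logQueryFailure(new TableNotFoundException("default.no_such_table"));
        logQueryFailure(new RuntimeException("unexpected"));
    }
}
```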




[jira] [Commented] (HIVE-20637) Allow any udfs with 0 arguments or with constant arguments as part of default clause

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629507#comment-16629507
 ] 

Hive QA commented on HIVE-20637:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 11s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 26s{color} | {color:red} root in master failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  3s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 52s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  7m 15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  3m 21s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  3m 21s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 21s{color} | {color:green} The patch . passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 44s{color} | {color:green} ql: The patch generated 0 new + 808 unchanged - 3 fixed = 808 total (was 811) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 53s{color} | {color:green} ql generated 0 new + 2325 unchanged - 1 fixed = 2325 total (was 2326) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  6m 19s{color} | {color:red} root generated 1 new + 387 unchanged - 1 fixed = 388 total (was 388) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 53s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14067/dev-support/hive-personality.sh |
| git revision | master / b6c0cd4 |
| Default Java | 1.8.0_111 |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus/branch-compile-root.txt |
| findbugs | v3.0.0 |
| compile | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus/patch-compile-root.txt |
| javac | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus/patch-compile-root.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus/diff-javadoc-javadoc-root.txt |
| javadoc | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus/diff-javadoc-javadoc-ql.txt |
| modules | C: . ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14067/yetus.txt |
| Powered by | Apache Yetus   http://yetus.apache.org |


This message was automatically generated.



> Allow any udfs with 0 arguments or with constant arguments as part of default 
> clause
> 
>
> Key: HIVE-20637
> URL: https://issues.apache.org/jira/browse/HIVE-20637
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Affects Versions: 3.0.1
>Reporter: Miklos

[jira] [Commented] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-26 Thread Naveen Gangam (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629497#comment-16629497
 ] 

Naveen Gangam commented on HIVE-19302:
--

Looks good to me. +1 pending test failure analysis

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, from the perspective of the Hive 
> administrator, there was perhaps a major issue based on the volume and 
> severity of logging.  Please change the logging to INFO level, and do not 
> present a stack trace, for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.




[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-26 Thread Naveen Gangam (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629495#comment-16629495
 ] 

Naveen Gangam commented on HIVE-12812:
--

Looks good to me as well. +1 for me

> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-12812.1.patch, HIVE-12812.2.patch, 
> HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When union remove optimization is enabled, a union query with an aggregate 
> function writes its subquery intermediate results to subdirs, which need 
> mapred.input.dir.recursive to be enabled in order to be fetched. This 
> property is not defined by default in Hive and is often overlooked by users, 
> which causes the query to fail and is hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever union remove 
> optimization is enabled.
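
The rule in the last sentence above can be sketched in Java with plain `java.util.Properties`. This is a hedged illustration, not the patch: `UnionRemoveConfig` and `withRecursiveInput` are hypothetical names, and Hive's own configuration class is not used. Only the two property names come from the issue text.

```java
import java.util.Properties;

public class UnionRemoveConfig {
    static final String UNION_REMOVE = "hive.optimize.union.remove";
    static final String RECURSIVE_INPUT = "mapred.input.dir.recursive";

    /**
     * Returns a copy of the given settings in which recursive input
     * directories are forced on whenever union remove optimization is
     * enabled. Settings are left unchanged when the optimization is off.
     */
    public static Properties withRecursiveInput(Properties conf) {
        Properties out = new Properties();
        out.putAll(conf);
        if (Boolean.parseBoolean(out.getProperty(UNION_REMOVE, "false"))) {
            out.setProperty(RECURSIVE_INPUT, "true");
        }
        return out;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(UNION_REMOVE, "true");
        System.out.println(withRecursiveInput(conf).getProperty(RECURSIVE_INPUT));
    }
}
```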




[jira] [Commented] (HIVE-20619) Include MultiDelimitSerDe in HIveServer2 By Default

2018-09-26 Thread Naveen Gangam (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629490#comment-16629490
 ] 

Naveen Gangam commented on HIVE-20619:
--

Looks good to me. So +1 for me.

> Include MultiDelimitSerDe in HIveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.1.patch, HIVE-20619.2.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1) but any 
> other query on that table which launched a MapReduce project would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.




[jira] [Commented] (HIVE-20562) Intermittent test failures from Druid tests

2018-09-26 Thread Janaki Lahorani (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629482#comment-16629482
 ] 

Janaki Lahorani commented on HIVE-20562:


Thanks [~bslim].  I will mark it as resolved.

> Intermittent test failures from Druid tests
> ---
>
> Key: HIVE-20562
> URL: https://issues.apache.org/jira/browse/HIVE-20562
> Project: Hive
> Issue Type: Bug
> Reporter: Janaki Lahorani
> Assignee: slim bouguerra
> Priority: Major
>
> Druid tests are failing intermittently in Hive Pre-commit jobs.
> The typical failures include:
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_dynamic_partition]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_alter]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
>  (batchId=193)
> The test log shows the following:
> Exception: org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (java.net.ConnectException : Error connecting to server localhost on port 
> 60,000 with message Connection refused.)
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (java.net.ConnectException : Error connecting to server localhost on port 
> 60,000 with message Connection refused.)
>   at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:1077)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.clearTablesCreatedDuringTests(QTestUtil.java:958)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.clearTestSideEffects(QTestUtil.java:1039)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver$5.invokeInternal(CoreCliDriver.java:135)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver$5.invokeInternal(CoreCliDriver.java:131)
>   at 
> org.apache.hadoop.hive.util.ElapsedTimeLoggingWrapper.invoke(ElapsedTimeLoggingWrapper.java:33)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.tearDown(CoreCliDriver.java:138)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:94)
> The following search shows many Hive Jiras with patches where Druid tests are 
> failing.
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20text%20~%20druidmini%20ORDER%20BY%20key%20DESC




[jira] [Resolved] (HIVE-20562) Intermittent test failures from Druid tests

2018-09-26 Thread Janaki Lahorani (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani resolved HIVE-20562.

Resolution: Duplicate

> Intermittent test failures from Druid tests
> ---
>
> Key: HIVE-20562
> URL: https://issues.apache.org/jira/browse/HIVE-20562
> Project: Hive
> Issue Type: Bug
> Reporter: Janaki Lahorani
> Assignee: slim bouguerra
> Priority: Major
>
> Druid tests are failing intermittently in Hive Pre-commit jobs.
> The typical failures include:
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_dynamic_partition]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_expressions]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test1]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_alter]
>  (batchId=193)
> org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver[druidmini_test_insert]
>  (batchId=193)
> The test log shows the following:
> Exception: org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (java.net.ConnectException : Error connecting to server localhost on port 
> 60,000 with message Connection refused.)
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.skife.jdbi.v2.exceptions.UnableToObtainConnectionException: 
> java.sql.SQLException: Cannot create PoolableConnectionFactory 
> (java.net.ConnectException : Error connecting to server localhost on port 
> 60,000 with message Connection refused.)
>   at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:1077)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.clearTablesCreatedDuringTests(QTestUtil.java:958)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.clearTestSideEffects(QTestUtil.java:1039)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver$5.invokeInternal(CoreCliDriver.java:135)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver$5.invokeInternal(CoreCliDriver.java:131)
>   at 
> org.apache.hadoop.hive.util.ElapsedTimeLoggingWrapper.invoke(ElapsedTimeLoggingWrapper.java:33)
>   at 
> org.apache.hadoop.hive.cli.control.CoreCliDriver.tearDown(CoreCliDriver.java:138)
>   at 
> org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:94)
> The following search shows many Hive Jiras with patches where Druid tests are 
> failing.
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20text%20~%20druidmini%20ORDER%20BY%20key%20DESC




[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629477#comment-16629477
 ] 

Hive QA commented on HIVE-20632:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941425/HIVE-20632.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14999 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14066/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14066/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14066/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941425 - PreCommit-HIVE-Build

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
> Issue Type: Bug
> Components: Materialized views, UDF
> Affects Versions: 4.0.0, 3.2.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run get_splits query "select get_splits( select a from t1 where a > 5);" – 
> This fails with AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
>

[jira] [Assigned] (HIVE-19891) inserting into external tables with custom partition directories may cause data loss

2018-09-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-19891:
-

Assignee: Peter Vary  (was: Sergey Shelukhin)

> inserting into external tables with custom partition directories may cause 
> data loss
> 
>
> Key: HIVE-19891
> URL: https://issues.apache.org/jira/browse/HIVE-19891
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Peter Vary
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-19891.01.patch, HIVE-19891.02.patch, 
> HIVE-19891.03.patch, HIVE-19891.04.patch, HIVE-19891.05.patch, 
> HIVE-19891.06.patch, HIVE-19891.07.patch, HIVE-19891.patch
>
>
> tbl1 is just used as a prop to create data, could be an existing directory 
> for an external table.
> Due to weird behavior of LoadTableDesc (some ancient code for overriding old 
> partition path), custom partition path is overwritten after the query and the 
> data in it ceases being a part of the table (can be seen in desc formatted 
> output with masking commented out in QTestUtil)
> This affects branch-1 too, so it's pretty old.
> {noformat}drop table tbl1;
> CREATE TABLE tbl1 (index int, value int ) PARTITIONED BY ( created_date 
> string );
> insert into tbl1 partition(created_date='2018-02-01') VALUES (2, 2);
> CREATE external TABLE tbl2 (index int, value int ) PARTITIONED BY ( 
> created_date string );
> ALTER TABLE tbl2 ADD PARTITION(created_date='2018-02-01');
> ALTER TABLE tbl2 PARTITION(created_date='2018-02-01') SET LOCATION 
> 'file:/Users/sergey/git/hivegit/itests/qtest/target/warehouse/tbl1/created_date=2018-02-01';
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> insert into tbl2 partition(created_date='2018-02-01') VALUES (1, 1);
> select * from tbl2;
> describe formatted tbl2 partition(created_date='2018-02-01');
> {noformat}




[jira] [Commented] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629446#comment-16629446
 ] 

Hive QA commented on HIVE-20632:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  8s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  4s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 39s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 49s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m  1s{color} | {color:blue} standalone-metastore/metastore-server in master has 182 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 35s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 17s{color} | {color:red} itests/hive-unit: The patch generated 1 new + 188 unchanged - 0 fixed = 189 total (was 188) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 40s{color} | {color:red} ql: The patch generated 1 new + 442 unchanged - 2 fixed = 443 total (was 444) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  3m 52s{color} | {color:red} ql generated 1 new + 2325 unchanged - 1 fixed = 2326 total (was 2326) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 26s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Exception is caught when Exception is not thrown in org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(String, List, List, boolean, HiveTxnManager)  At Hive.java:is not thrown in org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(String, List, List, boolean, HiveTxnManager)  At Hive.java:[line 1576] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14066/dev-support/hive-personality.sh |
| git revision | master / b6c0cd4 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14066/yetus/diff-checkstyle-itests_hive-unit.txt |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14066/yetus/diff-checkstyle-ql.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-14066/yetus/new-findbugs-ql.html |
| modules | C: itests/hive-unit ql standalone-metastore/metastore-server U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14066/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically

[jira] [Updated] (HIVE-20639) Add ability to Write Data from Hive Table/Query to Kafka Topic

2018-09-26 Thread slim bouguerra (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20639:
--
Attachment: HIVE-20639.2.patch

> Add ability to Write Data from Hive Table/Query to Kafka Topic
> --
>
> Key: HIVE-20639
> URL: https://issues.apache.org/jira/browse/HIVE-20639
> Project: Hive
> Issue Type: New Feature
> Components: kafka integration
> Reporter: slim bouguerra
> Assignee: slim bouguerra
> Priority: Major
> Attachments: HIVE-20639.2.patch, HIVE-20639.patch
>
>
> This patch adds multiple record writers to allow Hive users to write data 
> directly to a Kafka topic.
> The writer provides multiple write semantics modes.
> * A None, where all the records will be delivered with no guarantee or retries.
> * B At_least_once, where each record will be delivered with retries from the 
> Kafka Producer and Hive Write Task. 
> * C Exactly_once, where the writer will use the Kafka Transaction API to ensure 
> that each record is delivered once.
> In addition to the new feature, I have refactored the existing code to make it 
> more readable.
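
As a rough illustration of how the three modes listed above might map onto Kafka producer settings, here is a minimal sketch. The message does not say which settings the patch uses, so the mapping is an assumption; the class KafkaWriteSemanticsSketch, the enum Semantics and the method producerProps are hypothetical names, and the keys acks, retries, enable.idempotence and transactional.id are standard Kafka producer configuration keys rather than anything taken from the patch.

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical sketch; the real patch may configure the producer differently.
final class KafkaWriteSemanticsSketch {
    // Mirrors the three modes named in the issue description.
    enum Semantics { NONE, AT_LEAST_ONCE, EXACTLY_ONCE }

    // Builds producer properties for the requested mode (assumed mapping).
    static Properties producerProps(Semantics semantics, String transactionalId) {
        Properties p = new Properties();
        switch (semantics) {
            case NONE:
                // Fire and forget: no acknowledgement, no retries.
                p.setProperty("acks", "0");
                p.setProperty("retries", "0");
                break;
            case AT_LEAST_ONCE:
                // Wait for all in-sync replicas and retry on failure.
                p.setProperty("acks", "all");
                p.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
                break;
            case EXACTLY_ONCE:
                // Transactions require an idempotent producer and a transactional id.
                p.setProperty("acks", "all");
                p.setProperty("enable.idempotence", "true");
                p.setProperty("transactional.id",
                        Objects.requireNonNull(transactionalId, "transactionalId"));
                break;
        }
        return p;
    }

    public static void main(String[] args) {
        System.out.println(producerProps(Semantics.EXACTLY_ONCE, "hive-writer-1"));
    }
}
```

The sketch only builds a Properties object, so it runs without a Kafka dependency; a real writer would pass such properties to a producer and, for Exactly_once, wrap the sends in a transaction.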




[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2018-09-26 Thread George Pachitariu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Attachment: HIVE-20523.6.patch
Status: Patch Available  (was: Open)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Reporter: George Pachitariu
> Assignee: George Pachitariu
> Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, 
> HIVE-20523.6.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin, which can fail with 
> OOM errors, instead of ShuffleJoin.
> In this patch, I compute the column data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.
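
To make the idea concrete, the following is a minimal sketch of sizing a value recursively so that arrays and maps contribute the size of their contents instead of a constant 1. It is not the patch and not the ORC Writer: the class RawDataSizeSketch, the method estimate, and the per-type byte counts are assumptions chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the byte counts are illustrative, not ORC's or Parquet's.
final class RawDataSizeSketch {
    // Estimates the raw size of one value, descending into nested structures.
    static long estimate(Object value) {
        if (value == null) {
            return 0L;
        }
        if (value instanceof String) {
            return ((String) value).length();
        }
        if (value instanceof Integer || value instanceof Float) {
            return 4L;
        }
        if (value instanceof Long || value instanceof Double) {
            return 8L;
        }
        if (value instanceof List) {
            long total = 0L;
            for (Object element : (List<?>) value) {
                total += estimate(element);
            }
            return total;
        }
        if (value instanceof Map) {
            long total = 0L;
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                total += estimate(entry.getKey()) + estimate(entry.getValue());
            }
            return total;
        }
        // Fallback for types this sketch does not model.
        return 1L;
    }

    public static void main(String[] args) {
        List<Integer> column = Arrays.asList(1, 2, 3);
        // Three 4-byte integers, rather than a flat size of 1.
        System.out.println(estimate(column));
    }
}
```

With a constant size of 1 per row, a column holding large arrays looks as cheap as a boolean column; summing over the elements is what lets the planner see the difference.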




[jira] [Updated] (HIVE-20523) Improve table statistics for Parquet format

2018-09-26 Thread George Pachitariu (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Pachitariu updated HIVE-20523:
-
Status: Open  (was: Patch Available)

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Reporter: George Pachitariu
> Assignee: George Pachitariu
> Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin, which can fail with 
> OOM errors, instead of ShuffleJoin.
> In this patch, I compute the column data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.




[jira] [Commented] (HIVE-20510) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer

2018-09-26 Thread Sergey Shelukhin (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629412#comment-16629412
 ] 

Sergey Shelukhin commented on HIVE-20510:
-

What is the fix version for this? It's empty right now.

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer
> 
>
> Key: HIVE-20510
> URL: https://issues.apache.org/jira/browse/HIVE-20510
> Project: Hive
> Issue Type: Bug
> Reporter: Deepak Jaiswal
> Assignee: Deepak Jaiswal
> Priority: Major
> Attachments: HIVE-20510.1.patch, HIVE-20510.2.patch, 
> HIVE-20510.3.patch, HIVE-20510.4.patch, HIVE-20510.5.patch
>
>
> The sorted dynamic partition optimizer does not work on bucketed tables when 
> vectorization is enabled.
>  
> cc [~mmccline]




[jira] [Commented] (HIVE-20563) Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different

2018-09-26 Thread Matt McCline (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629402#comment-16629402
 ] 

Matt McCline commented on HIVE-20563:
-

Actually, the vectorized data type conversion was the problem.  No 
VectorUDFAdaptor fix is needed.

*HOWEVER*, the addition of implicit casts exposes a vector scratch column reuse 
bug.

> Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type 
> are different
> ---
>
> Key: HIVE-20563
> URL: https://issues.apache.org/jira/browse/HIVE-20563
> Project: Hive
> Issue Type: Bug
> Affects Versions: 4.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Matt McCline
> Priority: Major
> Attachments: HIVE-20563.01.patch, HIVE-20563.02.patch
>
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) 
> ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) 
> [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
> at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:973)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:154) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) 
> ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>  ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[?:1.8.0_181]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  ~[?:1.8.0_181]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  ~[?:1.8.0_181]
> at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating 
> cstring1
> at 
> org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:149)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
>

[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629401#comment-16629401
 ] 

Hive QA commented on HIVE-20523:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941348/HIVE-20523.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 14999 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nested_column_pruning] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_analyze] 
(batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_complex_types_vectorization]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_join] 
(batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_map_type_vectorization]
 (batchId=90)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_no_row_serde] 
(batchId=74)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_struct_type_vectorization]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_non_dictionary_encoding_vectorization]
 (batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_types_vectorization]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_numeric_overflows]
 (batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_parquet_types]
 (batchId=71)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time]
 (batchId=179)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=188)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_join] 
(batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_decimal_date]
 (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_part_project]
 (batchId=126)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_input_format_excludes]
 (batchId=131)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_parquet_projection]
 (batchId=130)
org.apache.hadoop.hive.metastore.TestHiveMetaStoreSchemaMethods.schemaQuery 
(batchId=228)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/14064/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14064/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14064/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 22 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941348 - PreCommit-HIVE-Build

> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
> Issue Type: Improvement
> Components: Physical Optimizer
> Reporter: George Pachitariu
> Assignee: George Pachitariu
> Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to them, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin, which can fail with 
> OOM errors, instead of ShuffleJoin.
> In this patch, I compute the column data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.
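
To illustrate the idea in the description above, here is a minimal sketch of a raw data size estimate that recurses into complex values instead of counting every column as 1. It is not the code from the patch: the class name, the method name, and the per-type byte sizes are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Hive. */
final class RawDataSizeSketch {
    private RawDataSizeSketch() {}

    /** Estimates the raw size in bytes of one value, recursing into lists and maps. */
    static long estimate(Object value) {
        if (value == null) {
            return 0L; // assumption: nulls contribute nothing
        }
        if (value instanceof Boolean || value instanceof Byte) {
            return 1L;
        }
        if (value instanceof Short) {
            return 2L;
        }
        if (value instanceof Integer || value instanceof Float) {
            return 4L;
        }
        if (value instanceof Long || value instanceof Double) {
            return 8L;
        }
        if (value instanceof CharSequence) {
            return value.toString().getBytes(StandardCharsets.UTF_8).length;
        }
        if (value instanceof List) {
            long total = 0L;
            for (Object element : (List<?>) value) {
                total += estimate(element); // arrays: sum of the element sizes
            }
            return total;
        }
        if (value instanceof Map) {
            long total = 0L;
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                total += estimate(entry.getKey()) + estimate(entry.getValue());
            }
            return total;
        }
        throw new IllegalArgumentException("Unsupported type: " + value.getClass());
    }
}
```

With these assumed sizes, an array of three ints is estimated at 12 bytes rather than 1, which is the kind of difference that can change container counts and join selection.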



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-19302) Logging Too Verbose For TableNotFound

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-19302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-19302:
-
Attachment: HIVE-19302.7.patch

> Logging Too Verbose For TableNotFound
> -
>
> Key: HIVE-19302
> URL: https://issues.apache.org/jira/browse/HIVE-19302
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.2.0, 3.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Minor
> Attachments: HIVE-19302.4.patch, HIVE-19302.5.patch, 
> HIVE-19302.6.patch, HIVE-19302.7.patch, table_not_found_cdh6.txt
>
>
> There is way too much logging when a user submits a query against a table 
> which does not exist.  In an ad-hoc setting, it is quite normal that a user 
> fat-fingers a table name.  Yet, judging by the volume and severity of the 
> logging, a Hive administrator might conclude that there was a major issue.  
> Please change the logging to INFO level, and do not present a stack trace, 
> for such a trivial error.
>  
> See the attached file for a sample of what logging a single "table not found" 
> query generates.
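
The request above can be sketched as follows: user errors such as a missing table are logged at INFO without a stack trace, while everything else keeps full detail. This sketch uses java.util.logging so that it is self-contained; Hive itself logs through a different framework, and the exception type and class name here are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical illustration; not the HiveServer2 logging code. */
final class QuietUserErrorLogging {
    private QuietUserErrorLogging() {}

    /** Hypothetical stand-in for a "table not found" failure. */
    static final class TableNotFoundException extends RuntimeException {
        TableNotFoundException(String table) {
            super("Table not found: " + table);
        }
    }

    /** Builds the log record that would be emitted for the given failure. */
    static LogRecord toLogRecord(Throwable error) {
        if (error instanceof TableNotFoundException) {
            // Routine user mistake: INFO level, message only, no stack trace.
            return new LogRecord(Level.INFO, error.getMessage());
        }
        // Anything else keeps full severity and the attached throwable.
        LogRecord record = new LogRecord(Level.SEVERE, String.valueOf(error.getMessage()));
        record.setThrown(error);
        return record;
    }
}
```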




[jira] [Commented] (HIVE-20511) REPL DUMP is leaking metastore connections

2018-09-26 Thread Damon Cortesi (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629354#comment-16629354
 ] 

Damon Cortesi commented on HIVE-20511:
--

Looks like the root cause of this is the same as HIVE-20600. Is there a way we 
can backport this patch to the 2.3 branch?

> REPL DUMP is leaking metastore connections
> --
>
> Key: HIVE-20511
> URL: https://issues.apache.org/jira/browse/HIVE-20511
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-20511.01.patch, HIVE-20511.02.patch, 
> HIVE-20511.03.patch, HIVE-20511.04.patch, HIVE-20511.05.patch
>
>
> With a remote metastore, REPL DUMP leaks connections. Each REPL DUMP task 
> leaks one connection due to the use of a stale Hive object. 
> {code}
> 18/09/04 16:01:46 INFO ReplState: REPL::EVENT_DUMP: 
> {"dbName":"*","eventId":"566","eventType":"EVENT_COMMIT_TXN","eventsDumpProgress":"1/0","dumpTime":1536076906}
> 18/09/04 16:01:46 INFO events.AbstractEventHandler: Processing#567 OPEN_TXN 
> message : 
> {"txnIds":null,"timestamp":1536076905,"fromTxnId":269,"toTxnId":269,"server":"thrift://metastore-service.warehouse-1536062326-s74h.svc.cluster.local:9083","servicePrincipal":""}
> 18/09/04 16:01:46 INFO ReplState: REPL::EVENT_DUMP: 
> {"dbName":"*","eventId":"567","eventType":"EVENT_OPEN_TXN","eventsDumpProgress":"2/0","dumpTime":1536076906}
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Trying to connect to 
> metastore with URI 
> thrift://metastore-service.warehouse-1536062326-s74h.svc.cluster.local:9083
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Opened a connection to 
> metastore, current connections: 471
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Connected to metastore.
> 18/09/04 16:01:46 INFO metastore.RetryingMetaStoreClient: 
> RetryingMetaStoreClient proxy=class 
> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive 
> (auth:SIMPLE) retries=24 delay=5 lifetime=0
> 18/09/04 16:01:46 INFO ReplState: REPL::END: 
> {"dbName":"*","dumpType":"INCREMENTAL","actualNumEvents":2,"dumpEndTime":1536076906,"dumpDir":"/user/hive/repl/e45bde27-74dc-45cd-9823-400a8fc1aea3","lastReplId":"567"}
> 18/09/04 16:01:46 INFO repl.ReplDumpTask: Done dumping events, preparing to 
> return /user/hive/repl/e45bde27-74dc-45cd-9823-400a8fc1aea3,567
> 18/09/04 16:01:46 INFO ql.Driver: Completed executing 
> command(queryId=hive_20180904160145_30f9570a-44e0-4f3b-b961-1906d3972fc4); 
> Time taken: 0.585 seconds
> OK
> 18/09/04 16:01:46 INFO ql.Driver: OK
> 18/09/04 16:01:46 INFO lockmgr.DbTxnManager: Stopped heartbeat for query: 
> hive_20180904160145_30f9570a-44e0-4f3b-b961-1906d3972fc4
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Trying to connect to 
> metastore with URI 
> thrift://metastore-service.warehouse-1536062326-s74h.svc.cluster.local:9083
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Opened a connection to 
> metastore, current connections: 472
> 18/09/04 16:01:46 INFO metastore.HiveMetaStoreClient: Connected to metastore.
> {code}
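
The log above shows the connection count growing from 471 to 472 across a single dump. As a general illustration of this class of leak (not the actual fix in the patch), the sketch below contrasts a task that opens a client and never closes it with one that releases it through try-with-resources. All names are hypothetical, and the counter only mimics the "current connections" figure in the log.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a per-task connection leak; not Hive code. */
final class ConnectionLeakSketch {
    private ConnectionLeakSketch() {}

    /** Mimics the "current connections" counter seen in the log. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Stand-in for a metastore client that holds one connection. */
    static final class FakeClient implements AutoCloseable {
        FakeClient() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    /** Opens a client and drops the reference: one connection leaks per call. */
    static void leakyTask() {
        new FakeClient();
    }

    /** Opens a client and always closes it, even if the work throws. */
    static void safeTask() {
        try (FakeClient client = new FakeClient()) {
            // the dump work would use the client here
        }
    }
}
```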




[jira] [Commented] (HIVE-20523) Improve table statistics for Parquet format

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629352#comment-16629352
 ] 

Hive QA commented on HIVE-20523:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 55s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 38s{color} | {color:red} ql: The patch generated 1 new + 12 unchanged - 2 fixed = 13 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 20s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14064/dev-support/hive-personality.sh |
| git revision | master / dca6ef0 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-14064/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14064/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Improve table statistics for Parquet format
> ---
>
> Key: HIVE-20523
> URL: https://issues.apache.org/jira/browse/HIVE-20523
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Reporter: George Pachitariu
>Assignee: George Pachitariu
>Priority: Minor
> Attachments: HIVE-20523.1.patch, HIVE-20523.2.patch, 
> HIVE-20523.3.patch, HIVE-20523.4.patch, HIVE-20523.5.patch, HIVE-20523.patch
>
>
> Right now, in the table basic statistics, the *raw data size* for a row with 
> any data type in the Parquet format is 1. This is an underestimated value 
> when columns are complex data structures, like arrays.
> Having tables with underestimated raw data size makes Hive assign fewer 
> containers (mappers/reducers) to it, making the overall query slower. 
> Heavy underestimation also makes Hive choose MapJoin instead of 
> ShuffleJoin, and the MapJoin can then fail with OOM errors.
> In this patch, I compute the column data size better, taking into account 
> complex structures. I followed the Writer implementation for the ORC format.




[jira] [Updated] (HIVE-20459) add ThriftHiveMetastore.get_open_txns(long txnid)

2018-09-26 Thread Igor Kryvenko (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Kryvenko updated HIVE-20459:
-
Attachment: HIVE-20459.04.patch

> add ThriftHiveMetastore.get_open_txns(long txnid)
> -
>
> Key: HIVE-20459
> URL: https://issues.apache.org/jira/browse/HIVE-20459
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Transactions
>Reporter: Eugene Koifman
>Assignee: Igor Kryvenko
>Priority: Minor
> Attachments: HIVE-20459.01.patch, HIVE-20459.02.patch, 
> HIVE-20459.03.patch, HIVE-20459.04.patch
>
>
> We currently have {{ThriftHiveMetastore.get_open_txns()}}, which maps to 
> {{TxnHandler.getOpenTxns()}}.  The usual usage is 
> {{TxnUtils.createValidReadTxnList(GetOpenTxnsResponse txns, long 
> currentTxn)}}, where the complete list of transactions is obtained from the Metastore 
> and then anything above currentTxn is thrown away.  
> It would be useful to add {{ThriftHiveMetastore.get_open_txns(long txnid)}} and 
> {{TxnHandler.getOpenTxns(long)}} to avoid retrieving things that will be thrown 
> away, especially when there are a lot of running transactions.
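
The proposal above amounts to applying the cutoff where the list is produced rather than after it has been shipped to the caller. A minimal sketch of that filtering step follows, assuming transaction IDs are plain longs; the class and method names are hypothetical and this is not the Metastore API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of server-side filtering of open transactions. */
final class OpenTxnFilterSketch {
    private OpenTxnFilterSketch() {}

    /** Returns only the open transaction IDs that are not above maxTxnId. */
    static List<Long> openTxnsUpTo(List<Long> openTxnIds, long maxTxnId) {
        List<Long> result = new ArrayList<>();
        for (long id : openTxnIds) {
            if (id <= maxTxnId) {
                result.add(id); // anything above maxTxnId would be discarded by the caller anyway
            }
        }
        return result;
    }
}
```

For example, with open transactions 3, 7, 9 and 12 and a current transaction of 9, only 3, 7 and 9 would be returned.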




[jira] [Updated] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-26 Thread Peter Vary (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20595:
--
   Resolution: Fixed
Fix Version/s: 4.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~lpinter]!

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch, 
> HIVE-20595.03.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.




[jira] [Commented] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629323#comment-16629323
 ] 

Hive QA commented on HIVE-20595:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941340/HIVE-20595.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 14998 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14063/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14063/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14063/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941340 - PreCommit-HIVE-Build

> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch, 
> HIVE-20595.03.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.




[jira] [Updated] (HIVE-18778) Needs to capture input/output entities in explain

2018-09-26 Thread Daniel Dai (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-18778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Dai updated HIVE-18778:
--
Attachment: HIVE-18778.11.branch-3.patch

> Needs to capture input/output entities in explain
> -
>
> Key: HIVE-18778
> URL: https://issues.apache.org/jira/browse/HIVE-18778
> Project: Hive
>  Issue Type: Bug
>Reporter: Daniel Dai
>Assignee: Daniel Dai
>Priority: Major
> Attachments: HIVE-18778-SparkPositive.patch, HIVE-18778.1.patch, 
> HIVE-18778.10.branch-3.patch, HIVE-18778.11.branch-3.patch, 
> HIVE-18778.2.patch, HIVE-18778.3.patch, HIVE-18778.4.patch, 
> HIVE-18778.5.patch, HIVE-18778.6.patch, HIVE-18778.7.patch, 
> HIVE-18778.8.patch, HIVE-18778.9.branch-3.patch, HIVE-18778.9.patch, 
> HIVE-18778_TestCliDriver.patch, HIVE-18788_SparkNegative.patch, 
> HIVE-18788_SparkPerf.patch
>
>
> With Sentry enabled, commands like {{explain drop table foo;}} fail with:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException No valid 
> privileges
>  Required privilege( Table) not available in input privileges
>  The required privileges: (state=42000,code=4)
> {code}
> Sentry fails to authorize because the ExplainSemanticAnalyzer uses an 
> instance of DDLSemanticAnalyzer to analyze the explain query.
> {code}
> BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, input);
> sem.analyze(input, ctx);
> sem.validate();
> {code}
> The input/output entities for this query are set in the above code. 
> However, they are never set on the instance of ExplainSemanticAnalyzer 
> itself and thus are not propagated into the HookContext in the calling Driver 
> code.
> {code}
> // This results in calling the above code, which uses DDLSemanticAnalyzer (DDLSA).
> sem.analyze(tree, ctx);
> // sem is an instance of ExplainSemanticAnalyzer (ESA); this call attempts to update
> // the HookContext with the input/output info from ESA, which is never set.
> hookCtx.update(sem);
> {code}
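
One way to close the gap described above is for the wrapping analyzer to copy the delegate's entities onto itself after delegating. The sketch below shows that pattern with hypothetical stand-in classes; it is an assumption about the shape of a fix, not the code in the attached patches, and the entity sets are simplified to strings.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-ins for the analyzers discussed above; not Hive classes. */
final class ExplainEntitiesSketch {
    private ExplainEntitiesSketch() {}

    /** Base analyzer holding the input/output entities read by hooks. */
    static class Analyzer {
        final Set<String> inputs = new LinkedHashSet<>();
        final Set<String> outputs = new LinkedHashSet<>();

        void analyze(String statement) {
            // the base class records nothing
        }
    }

    /** Records the dropped table (the last word of the statement) as an output. */
    static final class DropTableAnalyzer extends Analyzer {
        @Override
        void analyze(String statement) {
            outputs.add(statement.substring(statement.lastIndexOf(' ') + 1));
        }
    }

    /** Delegates to the inner analyzer, then copies its entities onto itself. */
    static final class ExplainAnalyzer extends Analyzer {
        private final Analyzer delegate;

        ExplainAnalyzer(Analyzer delegate) {
            this.delegate = delegate;
        }

        @Override
        void analyze(String statement) {
            delegate.analyze(statement.substring("explain ".length()));
            // Without these two lines, a hook reading this analyzer sees empty sets.
            inputs.addAll(delegate.inputs);
            outputs.addAll(delegate.outputs);
        }
    }
}
```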




[jira] [Commented] (HIVE-20540) Vectorization : Support loading bucketed tables using sorted dynamic partition optimizer - II

2018-09-26 Thread Prasanth Jayachandran (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629299#comment-16629299
 ] 

Prasanth Jayachandran commented on HIVE-20540:
--

Checking whether the row number is set is sufficient on its own, right? Do we also 
need to check bucketNumSet? If not, the latter flag can probably be removed. 

 

> Vectorization : Support loading bucketed tables using sorted dynamic 
> partition optimizer - II
> -
>
> Key: HIVE-20540
> URL: https://issues.apache.org/jira/browse/HIVE-20540
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
> Attachments: HIVE-20540.1.patch, HIVE-20540.2.patch, 
> HIVE-20540.3.patch
>
>
> Followup to HIVE-20510 with remaining issues,
>  
> 1. Avoid using Reflection.
> 2. In VectorizationContext, set up the VectorExpression in the correct place. 
> It may be missed in certain cases.
> 3. In BucketNumExpression, make sure that a value is not overwritten before 
> it is processed. Use a flag to achieve this.
> cc [~gopalv]
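
Item 3 above describes guarding a value with a flag so that it cannot be overwritten before it is consumed. A minimal sketch of that guard follows. The class and method names are hypothetical, the field name bucketNumSet is borrowed from the review comment, and this is not the BucketNumExpression code.

```java
/** Hypothetical illustration of a flag-guarded single-value slot. */
final class BucketNumSlotSketch {
    private int bucketNum;
    private boolean bucketNumSet;

    /** Stores the value only if the previous one was consumed; returns whether it was stored. */
    boolean offer(int value) {
        if (bucketNumSet) {
            return false; // the earlier value has not been processed yet
        }
        bucketNum = value;
        bucketNumSet = true;
        return true;
    }

    /** Returns the stored value and clears the flag so that the next value can be set. */
    int consume() {
        if (!bucketNumSet) {
            throw new IllegalStateException("no bucket number has been set");
        }
        bucketNumSet = false;
        return bucketNum;
    }
}
```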




[jira] [Commented] (HIVE-20545) Exclude large-sized parameters from serialization of Table and Partition thrift objects in HMS notifications

2018-09-26 Thread Alexander Kolbasov (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629293#comment-16629293
 ] 

Alexander Kolbasov commented on HIVE-20545:
---

I think the synopsis is somewhat misleading - the patch provides a mechanism 
to exclude matching parameters from notifications. The motivation for this is 
the ability to exclude potentially large parameters, so this should be put into 
the description.

> Exclude large-sized parameters from serialization of Table and Partition 
> thrift objects in HMS notifications
> 
>
> Key: HIVE-20545
> URL: https://issues.apache.org/jira/browse/HIVE-20545
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Bharathkrishna Guruvayoor Murali
>Assignee: Bharathkrishna Guruvayoor Murali
>Priority: Major
> Attachments: HIVE-20545.1.patch, HIVE-20545.2.patch, 
> HIVE-20545.3.branch-3.patch, HIVE-20545.3.patch, HIVE-20545.4.patch
>
>
> Clients can add large-sized parameters in Table/Partition objects. So we need 
> to enable adding regex patterns through HiveConf to match parameters to be 
> filtered from table and partition objects before serialization in HMS 
> notifications.
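
The mechanism described above can be sketched as a filter that drops every parameter matching one of the configured regex patterns before the object is serialized. The sketch assumes the patterns are matched against parameter keys; the class and method names are hypothetical and no real HiveConf property is implied.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical illustration of regex-based parameter exclusion. */
final class ParameterFilterSketch {
    private ParameterFilterSketch() {}

    /** Returns a copy of params without the entries whose key fully matches any pattern. */
    static Map<String, String> filter(Map<String, String> params, List<Pattern> excludes) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            boolean excluded = false;
            for (Pattern pattern : excludes) {
                if (pattern.matcher(entry.getKey()).matches()) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

The original map is left untouched, so the table or partition object keeps its parameters and only the serialized notification is trimmed.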




[jira] [Updated] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20641:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> load_data_using_job is failing
> --
>
> Key: HIVE-20641
> URL: https://issues.apache.org/jira/browse/HIVE-20641
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> load_data_using_job is failing due to result diff.




[jira] [Commented] (HIVE-20623) Shared work: Extend sharing of map-join cache entries in LLAP

2018-09-26 Thread Gopal V (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629287#comment-16629287
 ] 

Gopal V commented on HIVE-20623:


LGTM - +1 tests pending

Before committing, can you add a code comment on the line with 
"this.getClass().getName();" explaining why the class name is used?

> Shared work: Extend sharing of map-join cache entries in LLAP
> -
>
> Key: HIVE-20623
> URL: https://issues.apache.org/jira/browse/HIVE-20623
> Project: Hive
>  Issue Type: Improvement
>  Components: llap, Logical Optimizer
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Attachments: HIVE-20623.01.patch, HIVE-20623.02.patch, 
> HIVE-20623.02.patch, HIVE-20623.patch, hash-shared-work.json.txt, 
> hash-shared-work.svg
>
>
> For a query like this
> {code}
> with all_sales as (
> select ss_customer_sk as customer_sk, ss_ext_list_price-ss_ext_discount_amt 
> as ext_price from store_sales
> UNION ALL
> select ws_bill_customer_sk as customer_sk, 
> ws_ext_list_price-ws_ext_discount_amt as ext_price from web_sales
> UNION ALL
> select cs_bill_customer_sk as customer_sk, cs_ext_sales_price - 
> cs_ext_discount_amt as ext_price from catalog_sales)
> select sum(ext_price) total_price, c_customer_id from all_sales, customer 
> where customer_sk = c_customer_sk
> group by c_customer_id
> order by total_price desc 
> limit 100;
> {code}
> The hashtables used for all 3 joins are identical, yet they are loaded 3 times in 
> the same LLAP instance because they are named
> {code}
> cacheKey = "HASH_MAP_" + this.getOperatorId() + "_container";
> {code}
> in the cache.
> If those are identical in nature (i.e. vectorization, hashtable type, etc.), 
> then the duplication is just wasted CPU, memory and network - using the same cache 
> name for hashtables which will be identical in layout would be extremely 
> useful.
> In cases where the join is pushed through a UNION, those are identical.
> This optimization can only be done without concern for accidental delays when 
> the same upstream task is generating all of these hashtables, which is what 
> is achieved by the shared scan optimizer already.
> In case the shared work is not present, this has potential downsides - in 
> case two customer broadcasts were sourced from "Map 1" and "Map 2", the Map 1 
> builder will block the other task from reading from Map 2, even though Map 2 
> might have started after, but finished ahead of Map 1.
> So this specific optimization can always be considered for cases where the 
> shared work unifies the operator tree and the parents of all the RS entries 
> involved are the same (& the RS layout is the same).
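
The effect of the cache key choice described above can be sketched with a small cache that counts how often a hashtable is actually loaded. With a key per operator ID, three joins cause three loads; with a key shared by identical tables, they cause one. The class, the sharedKey helper, and its layoutId argument are hypothetical and do not reflect how the patch derives its key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of how the cache key decides the number of loads. */
final class SharedHashTableCacheSketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    /** Counts how many times a hashtable had to be built. */
    final AtomicInteger loads = new AtomicInteger();

    /** Returns the cached table for the key, building it only on the first request. */
    Object get(String cacheKey) {
        return cache.computeIfAbsent(cacheKey, key -> {
            loads.incrementAndGet();
            return new Object(); // stand-in for the loaded hashtable
        });
    }

    /** Key in the style quoted in the description: unique per operator. */
    static String perOperatorKey(String operatorId) {
        return "HASH_MAP_" + operatorId + "_container";
    }

    /** Hypothetical key shared by every operator whose hashtable has the same layout. */
    static String sharedKey(String layoutId) {
        return "HASH_MAP_" + layoutId + "_shared";
    }
}
```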




[jira] [Commented] (HIVE-16812) VectorizedOrcAcidRowBatchReader doesn't filter delete events

2018-09-26 Thread Eugene Koifman (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-16812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629286#comment-16629286
 ] 

Eugene Koifman commented on HIVE-16812:
---

Looks like at least the vector_acid4.q failure is related.

> VectorizedOrcAcidRowBatchReader doesn't filter delete events
> 
>
> Key: HIVE-16812
> URL: https://issues.apache.org/jira/browse/HIVE-16812
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-16812.02.patch, HIVE-16812.04.patch, 
> HIVE-16812.05.patch
>
>
> the c'tor of VectorizedOrcAcidRowBatchReader has
> {noformat}
> // Clone readerOptions for deleteEvents.
> Reader.Options deleteEventReaderOptions = readerOptions.clone();
> // Set the range on the deleteEventReaderOptions to 0 to INTEGER_MAX because
> // we always want to read all the delete delta files.
> deleteEventReaderOptions.range(0, Long.MAX_VALUE);
> {noformat}
> This is suboptimal since base and deltas are sorted by ROW__ID.  So for each 
> split of base we can find the min/max ROW__ID and only load events from delta that 
> are in the [min,max] range.  This will reduce the number of delete events we load 
> in memory (to no more than there are rows in the split).
> When we support sorting on PK, the same should apply but we'd need to make 
> sure to store PKs in the ORC index.
> See {{OrcRawRecordMerger.discoverKeyBounds()}}
> {{hive.acid.key.index}} in Orc footer has an index of ROW__IDs so we should 
> know min/max easily for any file written by {{OrcRecordUpdater}}
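
The range restriction proposed above can be sketched as follows. For simplicity the sketch treats a ROW__ID as a single long and assumes the delete events are already sorted, which lets the scan stop as soon as it passes the maximum. The class and method names are hypothetical and this is not the reader's actual code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of loading only the delete events relevant to one split. */
final class DeleteEventRangeSketch {
    private DeleteEventRangeSketch() {}

    /** Returns the delete events whose row ID lies in [minRowId, maxRowId]; the input must be sorted. */
    static List<Long> deleteEventsInRange(List<Long> sortedDeleteRowIds, long minRowId, long maxRowId) {
        List<Long> loaded = new ArrayList<>();
        for (long rowId : sortedDeleteRowIds) {
            if (rowId > maxRowId) {
                break; // sorted input: nothing later can fall inside the range
            }
            if (rowId >= minRowId) {
                loaded.add(rowId);
            }
        }
        return loaded;
    }
}
```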




[jira] [Commented] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Gopal V (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629285#comment-16629285
 ] 

Gopal V commented on HIVE-20640:


+1 tests pending

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20640.01.patch
>
>





[jira] [Commented] (HIVE-12812) Enable mapred.input.dir.recursive by default to support union with aggregate function

2018-09-26 Thread Yongzhi Chen (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-12812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629277#comment-16629277
 ] 

Yongzhi Chen commented on HIVE-12812:
-

The change LGTM  +1

> Enable mapred.input.dir.recursive by default to support union with aggregate 
> function
> -
>
> Key: HIVE-12812
> URL: https://issues.apache.org/jira/browse/HIVE-12812
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Chaoyu Tang
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-12812.1.patch, HIVE-12812.2.patch, 
> HIVE-12812.patch, HIVE-12812.patch, HIVE-12812.patch
>
>
> When union remove optimization is enabled, a union query with an aggregate 
> function writes its subquery intermediate results to subdirectories, which requires 
> mapred.input.dir.recursive to be enabled in order for them to be fetched. This 
> property is not defined by default in Hive and is often overlooked by users, which 
> causes the query to fail and is hard to debug.
> So we need to set mapred.input.dir.recursive to true whenever union remove 
> optimization is enabled.
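
The rule stated above can be sketched as a small configuration step. The sketch uses java.util.Properties as a stand-in for the job configuration and assumes that the union remove optimization is controlled by the hive.optimize.union.remove property; the class and method names are hypothetical.

```java
import java.util.Properties;

/** Hypothetical illustration of coupling the two settings. */
final class UnionRemoveConfigSketch {
    private UnionRemoveConfigSketch() {}

    static final String UNION_REMOVE = "hive.optimize.union.remove";
    static final String INPUT_DIR_RECURSIVE = "mapred.input.dir.recursive";

    /** Forces recursive input directories on whenever union remove is enabled. */
    static void applyDefaults(Properties conf) {
        if (Boolean.parseBoolean(conf.getProperty(UNION_REMOVE, "false"))) {
            conf.setProperty(INPUT_DIR_RECURSIVE, "true");
        }
    }
}
```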




[jira] [Commented] (HIVE-20595) Add findbugs-exclude.xml to metastore-server

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629273#comment-16629273
 ] 

Hive QA commented on HIVE-20595:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 27s{color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 13s{color} | {color:red} metastore-server in master failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 10m 39s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  findbugs  xml  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14063/dev-support/hive-personality.sh |
| git revision | master / dca6ef0 |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-14063/yetus/branch-findbugs-standalone-metastore_metastore-server.txt |
| modules | C: standalone-metastore/metastore-server U: standalone-metastore/metastore-server |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14063/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Add findbugs-exclude.xml to metastore-server
> 
>
> Key: HIVE-20595
> URL: https://issues.apache.org/jira/browse/HIVE-20595
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Laszlo Pinter
>Assignee: Laszlo Pinter
>Priority: Blocker
> Attachments: HIVE-20595.01.patch, HIVE-20595.02.patch, 
> HIVE-20595.03.patch
>
>
> The findbugs-exclude.xml is missing from 
> standalone-metastore/metastore-server/findbugs. This should be added, 
> otherwise the findbugs check will fail.




[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HIveServer2 By Default

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Attachment: HIVE-20619.2.patch
Status: Patch Available  (was: Open)

> Include MultiDelimitSerDe in HIveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.1.patch, HIVE-20619.2.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1) but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.




[jira] [Updated] (HIVE-20619) Include MultiDelimitSerDe in HIveServer2 By Default

2018-09-26 Thread Alice Fan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alice Fan updated HIVE-20619:
-
Status: Open  (was: Patch Available)

> Include MultiDelimitSerDe in HIveServer2 By Default
> ---
>
> Key: HIVE-20619
> URL: https://issues.apache.org/jira/browse/HIVE-20619
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, Serializers/Deserializers
>Affects Versions: 3.0.0, 4.0.0
>Reporter: BELUGA BEHR
>Assignee: Alice Fan
>Priority: Major
> Attachments: HIVE-20619.1.patch
>
>
> In [HIVE-20020], the hive-contrib JAR file was removed from the HiveServer2 
> classpath.  With this change, the {{MultiDelimitSerDe}} is no longer 
> included.  This is fine, because {{MultiDelimitSerDe}} was a pain in that 
> environment anyway.  It was available to HiveServer2, and therefore would 
> work with a limited set of queries (select * from table limit 1) but any 
> other query on that table which launched a MapReduce job would fail 
> because the hive-contrib JAR file was not sent out with the rest of the Hive 
> JARs for MapReduce jobs.
> Please bring {{MultiDelimitSerDe}} back into the fold so that it's available 
> to users out of the box without having to install the hive-contrib JAR into 
> the HiveServer2 auxiliary directory.




[jira] [Commented] (HIVE-20563) Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629249#comment-16629249
 ] 

Hive QA commented on HIVE-20563:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12941341/HIVE-20563.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 14997 tests executed
*Failed tests:*
{noformat}
TestMiniDruidCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=195)
	[druidmini_masking.q,druidmini_test1.q,druidkafkamini_basic.q,druidmini_joins.q,druid_timestamptz.q]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_case_when_1] (batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_case_when_2] (batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_when_case_null] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_case] (batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_data_using_job] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[load_dyn_part3] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_adaptor_usage_mode] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_case_when_1] (batchId=181)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_case_when_2] (batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_case_when_conversion] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf_adaptor_1] (batchId=182)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_when_case_null] (batchId=167)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_3] (batchId=179)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorization_short_regress] (batchId=171)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_case] (batchId=173)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[parquet_vectorization_3] (batchId=145)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_3] (batchId=146)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_short_regress] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorized_case] (batchId=136)
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testIfConditionalExprs (batchId=301)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14062/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14062/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14062/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12941341 - PreCommit-HIVE-Build

> Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type 
> are different
> ---
>
> Key: HIVE-20563
> URL: https://issues.apache.org/jira/browse/HIVE-20563
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-20563.01.patch, HIVE-20563.02.patch
>
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
>

[jira] [Updated] (HIVE-20604) Minor compaction disables ORC column stats

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20604:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to master
thanks Prasanth for the review

> Minor compaction disables ORC column stats
> --
>
> Key: HIVE-20604
> URL: https://issues.apache.org/jira/browse/HIVE-20604
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-20604.01.patch
>
>
> {noformat}
>   @Override
>   public org.apache.hadoop.hive.ql.exec.FileSinkOperator.RecordWriter
>       getRawRecordWriter(Path path, Options options) throws IOException {
>     final Path filename = AcidUtils.createFilename(path, options);
>     final OrcFile.WriterOptions opts =
>         OrcFile.writerOptions(options.getTableProperties(), options.getConfiguration());
>     if (!options.isWritingBase()) {
>       opts.bufferSize(OrcRecordUpdater.DELTA_BUFFER_SIZE)
>           .stripeSize(OrcRecordUpdater.DELTA_STRIPE_SIZE)
>           .blockPadding(false)
>           .compress(CompressionKind.NONE)
>           .rowIndexStride(0)
>           ;
>     }
> {noformat}
> {{rowIndexStride(0)}} makes {{StripeStatistics.getColumnStatistics()}} return 
> objects with meaningless values, such as min/max for 
> {{IntegerColumnStatistics}} set to MIN_LONG/MAX_LONG.
> This interferes with the ability to infer the min ROW_ID for a split and also 
> creates inefficient files.
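
To illustrate why placeholder statistics get in the way of inferring a minimum ROW_ID, here is a minimal, self-contained Java sketch. It does not use the ORC API: `StripeRangeSketch`, `IntRange`, and `usableMin` are hypothetical names invented for this example, and the rule for recognising an uncollected range (full long range, or min greater than max) is an assumption rather than documented ORC behaviour.

```java
import java.util.OptionalLong;

public class StripeRangeSketch {

    /** Hypothetical min/max pair for an integer column, as a reader might receive it. */
    static final class IntRange {
        final long min;
        final long max;

        IntRange(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Returns the minimum only when the range looks like it was really collected.
     * Assumption: an uncollected range shows up either as the full long range
     * or as an inverted range (min > max). Real ROW_IDs are not expected to
     * span the full long range, so treating that case as "unknown" is safe here.
     */
    static OptionalLong usableMin(IntRange range) {
        boolean fullRange = range.min == Long.MIN_VALUE && range.max == Long.MAX_VALUE;
        boolean inverted = range.min > range.max;
        return (fullRange || inverted) ? OptionalLong.empty() : OptionalLong.of(range.min);
    }

    public static void main(String[] args) {
        // Placeholder statistics: nothing can be inferred about the split.
        System.out.println(usableMin(new IntRange(Long.MIN_VALUE, Long.MAX_VALUE)).isPresent());
        // Collected statistics: the minimum is usable.
        System.out.println(usableMin(new IntRange(100L, 250L)).getAsLong());
    }
}
```

A reader that gets an empty result here has to fall back to scanning the data, which is the cost the issue describes.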




[jira] [Updated] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal updated HIVE-20641:
--
Status: Patch Available  (was: In Progress)

> load_data_using_job is failing
> --
>
> Key: HIVE-20641
> URL: https://issues.apache.org/jira/browse/HIVE-20641
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> load_data_using_job is failing due to result diff.




[jira] [Work started] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-20641 started by Deepak Jaiswal.
-
> load_data_using_job is failing
> --
>
> Key: HIVE-20641
> URL: https://issues.apache.org/jira/browse/HIVE-20641
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> load_data_using_job is failing due to result diff.




[jira] [Assigned] (HIVE-20641) load_data_using_job is failing

2018-09-26 Thread Deepak Jaiswal (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-20641:
-


> load_data_using_job is failing
> --
>
> Key: HIVE-20641
> URL: https://issues.apache.org/jira/browse/HIVE-20641
> Project: Hive
>  Issue Type: Bug
>Reporter: Deepak Jaiswal
>Assignee: Deepak Jaiswal
>Priority: Major
>
> load_data_using_job is failing due to result diff.




[jira] [Updated] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20640:
--
Status: Patch Available  (was: Open)

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20640.01.patch
>
>





[jira] [Updated] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20640:
--
Attachment: HIVE-20640.01.patch

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
> Attachments: HIVE-20640.01.patch
>
>





[jira] [Updated] (HIVE-20544) TOpenSessionReq logs password and username

2018-09-26 Thread Karen Coppage (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-20544:
-
Attachment: HIVE-20544.4.patch
Status: Patch Available  (was: Open)

Thanks so much for the strategy, [~pvary]. I got another review on RB from 
[~asherman] (thanks, Andrew!), hence a new patch anyway.

> TOpenSessionReq logs password and username
> --
>
> Key: HIVE-20544
> URL: https://issues.apache.org/jira/browse/HIVE-20544
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, patch, security
> Attachments: HIVE-20544.1.patch, HIVE-20544.2.patch, 
> HIVE-20544.3.patch, HIVE-20544.3.patch, HIVE-20544.4.patch, HIVE-20544.patch, 
> non-solution.patch, working-solution.patch
>
>
> In 
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq,
>  if client protocol is unset, validate() and toString() print both username 
> and password to logs.
> Logging a password is a security risk. We should hide the ***.
> =Edit= (no longer relevant, see comments)
> This issue is tricky since it is caused by a fully generated class. I've been 
> playing around and have found one working solution, but I'd truly appreciate 
> ideas for a more elegant solution or input.
> The problem:
>  TCLIService.thrift is the template for generating all classes in 
> service-rpc. Struct TOpenSessionReq is OpenSession()'s one parameter and is 
> defined thus:
> {noformat}
> struct TOpenSessionReq {
>   1: required TProtocolVersion client_protocol = TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
>   2: optional string username
>   3: optional string password
>   4: optional map<string, string> configuration
> }
> {noformat}
> In the generated class TOpenSessionReq.java, client_protocol is checked by a 
> validate() method, which is called quite a few times; if client_protocol is 
> not set, it throws a TProtocolException, passing along a toString(). This 
> toString() gets the names and values of all fields, including username and 
> password.
> Working solution:
>  * Create a separate struct containing only the username and password, and 
> pass it to OpenSession() as a second parameter. Since all fields in the new 
> struct are "optional", the generated validate() is empty – toString() is 
> never used. This involves changing core classes and breaks the "Each function 
> should take exactly one parameter" coding convention (detailed at 
> service-rpc/if/TCLIService.thrift:27).
>  See working-solution.patch.
> What doesn't work:
>  * Making client_protocol optional instead of required. Apparently this will 
> break everything.
>  * Overwriting toString() – TOpenSessionReq is a struct.
>  * Creating two Thrift structs, one struct for required (TRequiredReq) and 
> one for optional (TOptionalReq) fields, and nesting them in struct 
> TOpenSessionReq. This doesn't work because validate() in TOpenSessionReq can 
> call TOptionalReq.toString(), which prints the password to logs. This will 
> happen if TRequiredReq.client_protocol isn't set.
>  See non-solution.patch
>  * Asking Thrift devs to change their code. I wrote them an email but have no 
> expectations.




[jira] [Updated] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-20640:
--
Component/s: ORC

> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>





[jira] [Updated] (HIVE-20544) TOpenSessionReq logs password and username

2018-09-26 Thread Karen Coppage (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage updated HIVE-20544:
-
Status: Open  (was: Patch Available)

> TOpenSessionReq logs password and username
> --
>
> Key: HIVE-20544
> URL: https://issues.apache.org/jira/browse/HIVE-20544
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: beginner, patch, security
> Attachments: HIVE-20544.1.patch, HIVE-20544.2.patch, 
> HIVE-20544.3.patch, HIVE-20544.3.patch, HIVE-20544.patch, non-solution.patch, 
> working-solution.patch
>
>
> In 
> service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TOpenSessionReq,
>  if client protocol is unset, validate() and toString() print both username 
> and password to logs.
> Logging a password is a security risk. We should hide the ***.
> =Edit= (no longer relevant, see comments)
> This issue is tricky since it is caused by a fully generated class. I've been 
> playing around and have found one working solution, but I'd truly appreciate 
> ideas for a more elegant solution or input.
> The problem:
>  TCLIService.thrift is the template for generating all classes in 
> service-rpc. Struct TOpenSessionReq is OpenSession()'s one parameter and is 
> defined thus:
> {noformat}
> struct TOpenSessionReq {
>   1: required TProtocolVersion client_protocol = TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10
>   2: optional string username
>   3: optional string password
>   4: optional map<string, string> configuration
> }
> {noformat}
> In the generated class TOpenSessionReq.java, client_protocol is checked by a 
> validate() method, which is called quite a few times; if client_protocol is 
> not set, it throws a TProtocolException, passing along a toString(). This 
> toString() gets the names and values of all fields, including username and 
> password.
> Working solution:
>  * Create a separate struct containing only the username and password, and 
> pass it to OpenSession() as a second parameter. Since all fields in the new 
> struct are "optional", the generated validate() is empty – toString() is 
> never used. This involves changing core classes and breaks the "Each function 
> should take exactly one parameter" coding convention (detailed at 
> service-rpc/if/TCLIService.thrift:27).
>  See working-solution.patch.
> What doesn't work:
>  * Making client_protocol optional instead of required. Apparently this will 
> break everything.
>  * Overwriting toString() – TOpenSessionReq is a struct.
>  * Creating two Thrift structs, one struct for required (TRequiredReq) and 
> one for optional (TOptionalReq) fields, and nesting them in struct 
> TOpenSessionReq. This doesn't work because validate() in TOpenSessionReq can 
> call TOptionalReq.toString(), which prints the password to logs. This will 
> happen if TRequiredReq.client_protocol isn't set.
>  See non-solution.patch
>  * Asking Thrift devs to change their code. I wrote them an email but have no 
> expectations.




[jira] [Assigned] (HIVE-20640) Upgrade Hive to use ORC 1.5.3

2018-09-26 Thread Eugene Koifman (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-20640:
-


> Upgrade Hive to use ORC 1.5.3
> -
>
> Key: HIVE-20640
> URL: https://issues.apache.org/jira/browse/HIVE-20640
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Major
>





[jira] [Commented] (HIVE-20563) Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type are different

2018-09-26 Thread Hive QA (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-20563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16629200#comment-16629200
 ] 

Hive QA commented on HIVE-20563:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 42s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 49s{color} | {color:blue} ql in master has 2326 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 23m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-14062/dev-support/hive-personality.sh |
| git revision | master / bef6c9f |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: itests ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-14062/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: CASE WHEN expression fails when THEN/ELSE type and result type 
> are different
> ---
>
> Key: HIVE-20563
> URL: https://issues.apache.org/jira/browse/HIVE-20563
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Matt McCline
>Priority: Major
> Attachments: HIVE-20563.01.patch, HIVE-20563.02.patch
>
>
> With the following stacktrace:
> {code}
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492) ~[hadoop-mapreduce-client-common-3.1.0.jar:?]
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552) [hadoop-mapreduce-client-common-3.1.0.jar:?]
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:163) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) ~[hadoop-mapreduce-client-core-3.1.0.jar:?]
>

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Attachment: HIVE-20632.02.patch

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run get_splits query "select get_splits( select a from t1 where a > 5); – 
> This fails with AssertionError.
> {code:java}
> java.lang.AssertionError
>     at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
>     at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>     at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>     at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>     at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
>     at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
>     at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>     at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
>     at org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
>     at org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>     at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Attachment: (was: HIVE-20632.02.patch)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run get_splits query "select get_splits( select a from t1 where a > 5); – 
> This fails with AssertionError.
> {code:java}
> java.lang.AssertionError
>     at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
>     at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>     at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
>     at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>     at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
>     at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
>     at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
>     at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
>     at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
>     at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
>     at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
>     at org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
>     at org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Status: Patch Available  (was: Open)

> Query with get_splits UDF fails if materialized view is created on queried 
> table. 
> --
>
> Key: HIVE-20632
> URL: https://issues.apache.org/jira/browse/HIVE-20632
> Project: Hive
>  Issue Type: Bug
>  Components: Materialized views, UDF
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: UDF, materializedviews, pull-request-available
> Attachments: HIVE-20632.01.patch, HIVE-20632.02.patch
>
>
> Scenario:
>  # Create ACID table t1 and insert a few rows.
>  # Create materialized view mv as select a from t1 where a > 5;
>  # Run get_splits query "select get_splits( select a from t1 where a > 5);" – 
> This fails with AssertionError.
> {code:java}
> java.lang.AssertionError
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:673)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidMaterializedViews(Hive.java:1495)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getAllValidMaterializedViews(Hive.java:1478)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2160)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1777)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1669)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1043)
> at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
> at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1428)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:475)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12309)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:355)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1872)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1819)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.createPlanFragment(GenericUDTFGetSplits.java:266)
> at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDTFGetSplits.process(GenericUDTFGetSplits.java:204)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
> at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:519)
> at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:511)
> at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
> at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2764)
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.runStatementOnDriver(TestAcidOnTez.java:927)
> at 
> org.apache.hadoop.hive.ql.TestAcidOnTez.testGetSplitsLocksWithMaterializedView(TestAcidOnTez.java:803)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)

[jira] [Updated] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20632:

Status: Open  (was: Patch Available)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Attachment: HIVE-20627.01.patch

> Concurrent async queries intermittently fails with LockException and cause 
> memory leak.
> ---
>
> Key: HIVE-20627
> URL: https://issues.apache.org/jira/browse/HIVE-20627
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Transactions
>Affects Versions: 4.0.0, 3.2.0
>Reporter: Sankar Hariappan
>Assignee: Sankar Hariappan
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-20627.01.patch
>
>
> When multiple async queries are executed from the same session, the 
> resulting async query execution DAGs share the same Hive object, which is set 
> by the caller for all threads. When loading dynamic partitions, a MoveTask is 
> created that re-creates the Hive object and closes the shared one, which 
> causes metastore connection issues for other async execution threads that 
> still access it. This is also seen if ReplDumpTask and ReplLoadTask are part 
> of the DAG.
> *Call Stack:*
> {code:java}
> 2018-09-16T04:38:04,280 ERROR [load-dynamic-partitions-7]: metadata.Hive 
> (Hive.java:call(2436)) - Exception when loading partition with parameters 
> partPath=hdfs://mycluster/warehouse/tablespace/managed/hive/tbl_3bcvvdubni/.hive-staging_hive_2018-09-16_04-35-50_708_7776079613819042057-1147/-ext-1/age=55,
>  table=tbl_3bcvvdubni, partSpec={age=55}, loadFileType=KEEP_EXISTING, 
> listBucketingLevel=0, isAcid=true, hasFollowingStatsTask=true
> org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the 
> metastore
> at 
> org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidWriteIds(DbTxnManager.java:714)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableValidWriteIdListWithTxnList(AcidUtils.java:1791)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1756) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.ql.io.AcidUtils.getTableSnapshot(AcidUtils.java:1714) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1976) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2415) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.hadoop.hive.ql.metadata.Hive$5.call(Hive.java:2406) 
> [hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_171]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_171]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
> Caused by: org.apache.thrift.protocol.TProtocolException: Required field 
> 'validTxnList' is unset! 
> Struct:GetValidWriteIdsRequest(fullTableNames:[default.tbl_3bcvvdubni], 
> validTxnList:null)
> at 
> org.apache.hadoop.hive.metastore.api.GetValidWriteIdsRequest.validate(GetValidWriteIdsRequest.java:396)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.validate(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args$get_valid_write_ids_argsStandardScheme.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_valid_write_ids_args.write(ThriftHiveMetastore.java)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:71) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62) 
> ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_valid_write_ids(ThriftHiveMetastore.java:5443)
>  ~[hive-exec-3.1.0.3.0.1.0-184.jar:3.1.0.3.0.1.0-184]
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_valid_write_ids(ThriftHiveMetastore.java:5435)
>
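
The failure mode described in HIVE-20627 above (one task closing an object that other threads still use) can be illustrated with a small sketch. This is not Hive code: `Client` is a hypothetical stand-in for the shared Hive object, and the task ordering is forced so the outcome is deterministic, whereas the reported failure is intermittent. The per-task variant shows one general way to avoid the problem and is not necessarily what HIVE-20627.01.patch does.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedClientSketch {

    /** Hypothetical stand-in for a closeable client object; not a Hive class. */
    static final class Client implements AutoCloseable {
        private volatile boolean closed;

        String call() {
            if (closed) {
                throw new IllegalStateException("client closed");
            }
            return "ok";
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Two tasks share one client; the first closes it before the second uses it. */
    static String sharedClient() throws InterruptedException {
        Client shared = new Client();
        Runnable closer = shared::close;
        Callable<String> user = shared::call;
        return runInOrder(closer, user);
    }

    /** Each task owns its client, so closing one cannot affect the other. */
    static String clientPerTask() throws InterruptedException {
        Runnable closer = () -> {
            try (Client own = new Client()) {
                own.call();
            }
        };
        Callable<String> user = () -> {
            try (Client own = new Client()) {
                return own.call();
            }
        };
        return runInOrder(closer, user);
    }

    /** Runs closer to completion on a pool thread, then user; reports the outcome. */
    private static String runInOrder(Runnable closer, Callable<String> user)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            pool.submit(closer).get();
            return pool.submit(user).get();
        } catch (ExecutionException e) {
            // The task's own exception is wrapped by the Future.
            return "failed: " + e.getCause().getMessage();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("shared client:   " + sharedClient());
        System.out.println("client per task: " + clientPerTask());
    }
}
```

In this sketch `sharedClient()` reports a failure because the second task touches the client after the first has closed it, while `clientPerTask()` succeeds because no task can close another task's client.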

[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Attachment: (was: HIVE-20627.01.patch)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Status: Patch Available  (was: Open)


[jira] [Updated] (HIVE-20627) Concurrent async queries intermittently fails with LockException and cause memory leak.

2018-09-26 Thread Sankar Hariappan (JIRA)



 [ 
https://issues.apache.org/jira/browse/HIVE-20627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20627:

Status: Open  (was: Patch Available)

