date:20160811

[jira] [Updated] (HIVE-14528) After enabling Hive Parquet Vectorization, many queries in TPCx-BB(BigBench) failed with NullPointerException and IllegalArgumentException

2016-08-11 Thread KaiXu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

KaiXu updated HIVE-14528:
-
Description: 
We use TPCx-BB (BigBench) to evaluate the performance of Hive Parquet 
Vectorization on our local cluster (E5-2699 v3, 256G, 72 vcores, 1 master node + 
5 worker nodes). During our performance tests with Parquet Vectorization enabled, 
we found that many queries failed with one of two errors:
a. Error: java.lang.NullPointerException at VectorizedParquetInputFormat.java:188
   For queries: q02, q03, q04, q06, q08, q11, q14, q15, q18, q19, q21, q23
b. java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4 at HiveIOExceptionHandlerChain.java:121
   For queries: q07, q09, q13, q17, q24
a:
Error: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.close(VectorizedParquetInputFormat.java:188)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doClose(CombineHiveRecordReader.java:74)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:106)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.close(HadoopShimsSecure.java:172)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.close(MapTask.java:210)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1972)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

b:
Error: java.io.IOException: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:230)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:199)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: java.lang.IllegalArgumentException: 8 > 4
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:357)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:106)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:42)
at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:118)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:228)
... 11 more
Caused by: java.lang.IllegalArgumentException: 8 > 4
at java.util.Arrays.copyOfRange(Arrays.java:3519)
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.assignVector(VectorizedParquetInputFormat.java:313)
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.next(VectorizedParquetInputFormat.java:235)
at org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat$VectorizedParquetRecordReader.next(VectorizedParquetInputFormat.java:97)
at
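
The innermost cause in trace b, `IllegalArgumentException: 8 > 4` thrown from `java.util.Arrays.copyOfRange`, is the message the JDK produces when the `from` index (8) is greater than the `to` index (4). The sketch below reproduces that message in isolation. It also shows a null guard in `close()`, which would avoid an NPE like trace a only under the assumption that the reader closes a field that was never initialized; the report does not confirm that cause. `VectorizedReaderFailureDemo` and `GuardedReader` are hypothetical names, not Hive classes.

```java
import java.util.Arrays;

// Hypothetical demo class; it only illustrates the two failure modes
// reported above and is not taken from Hive's source.
final class VectorizedReaderFailureDemo {

    // Returns the message of the IllegalArgumentException thrown by
    // Arrays.copyOfRange, or null when the copy succeeds.
    static String copyOfRangeFailure(byte[] data, int from, int to) {
        try {
            Arrays.copyOfRange(data, from, to);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    // Hypothetical reader wrapper: the delegate may be null if
    // initialization failed before close() is called.
    static final class GuardedReader implements AutoCloseable {
        private final AutoCloseable delegate;

        GuardedReader(AutoCloseable delegate) {
            this.delegate = delegate;
        }

        @Override
        public void close() throws Exception {
            // Assumed fix pattern: skip closing a delegate that was never set.
            if (delegate != null) {
                delegate.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // from = 8, to = 4 on a 16-byte array: prints "8 > 4"
        System.out.println(copyOfRangeFailure(new byte[16], 8, 4));
        // Closing a reader with no delegate does not throw.
        new GuardedReader(null).close();
    }
}
```

The "8 > 4" pattern suggests an offset/length mismatch (for example an 8-byte read against a 4-byte buffer), but that reading is a guess; the report itself does not state the root cause.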

[jira] [Commented] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418326#comment-15418326
 ] 

Hive QA commented on HIVE-14433:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823390/HIVE-14433.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10419 tests 
executed
*Failed tests:*
{noformat}
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/859/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/859/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-859/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823390 - PreCommit-HIVE-MASTER-Build

> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1, 2.2.0, 2.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14433.01.patch, HIVE-14433.02.patch, 
> HIVE-14433.03.patch, HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for merge processor.
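
The proposal in the description (moving the "do not cache plan" decision into the cache itself via an `isPlan` argument) could look roughly like the sketch below. Everything here is hypothetical: `PlanAwareCacheSketch`, its `retrieve` signature, and the `runningInDaemon` flag (standing in for `LlapProxy.isDaemon()`) are illustrative and are not Hive's actual `ObjectCache` API or the attached patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the proposed refactoring; not Hive code.
final class PlanAwareCacheSketch {
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    // Stands in for LlapProxy.isDaemon(); passed in so the sketch is testable.
    private final boolean runningInDaemon;

    PlanAwareCacheSketch(boolean runningInDaemon) {
        this.runningInDaemon = runningInDaemon;
    }

    // Callers say whether the value is a plan; the cache decides whether
    // caching is allowed, so map, reduce and merge processors share one rule.
    @SuppressWarnings("unchecked")
    <T> T retrieve(String key, boolean isPlan, Supplier<T> loader) {
        if (isPlan && runningInDaemon) {
            // Never cache plans inside the daemon: always load a fresh copy.
            return loader.get();
        }
        return (T) cache.computeIfAbsent(key, k -> loader.get());
    }

    public static void main(String[] args) {
        PlanAwareCacheSketch daemonCache = new PlanAwareCacheSketch(true);
        int[] loads = {0};
        daemonCache.retrieve("plan", true, () -> { loads[0]++; return "plan"; });
        daemonCache.retrieve("plan", true, () -> { loads[0]++; return "plan"; });
        System.out.println("plan loads in daemon: " + loads[0]);
    }
}
```

With the rule inside the cache, a caller that forgets the daemon check (as the merge processor does today, per the description) would still get an uncached plan.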



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Open  (was: Patch Available)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14523.01.patch, HIVE-14523.02.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced in 
> HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Patch Available  (was: Open)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14523.01.patch, HIVE-14523.02.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced in 
> HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-11 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14527:
-
Reporter: Matt McCline  (was: Prasanth Jayachandran)

> Schema evolution tests are not running in TestCliDriver
> ---
>
> Key: HIVE-14527
> URL: https://issues.apache.org/jira/browse/HIVE-14527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Prasanth Jayachandran
>
> HIVE-14376 broke something, causing schema evolution tests to be excluded 
> from the TestCliDriver test suite. 




[jira] [Commented] (HIVE-14527) Schema evolution tests are not running in TestCliDriver

2016-08-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418268#comment-15418268
 ] 

Prasanth Jayachandran commented on HIVE-14527:
--

cc/ [~mmccline]

> Schema evolution tests are not running in TestCliDriver
> ---
>
> Key: HIVE-14527
> URL: https://issues.apache.org/jira/browse/HIVE-14527
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Matt McCline
>Assignee: Prasanth Jayachandran
>
> HIVE-14376 broke something, causing schema evolution tests to be excluded 
> from the TestCliDriver test suite. 




[jira] [Commented] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much on INFO level

2016-08-11 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418256#comment-15418256
 ] 

Josh Elser commented on HIVE-14526:
---

[~sershe], [~sushanth] had told me about this one and I thought we had 
addressed it in v0.1.1 of the plugin (I thought Hive is on 0.1.2 by now). 
HIVE-14394, I think, brought that in.

> HadoopMetrics2Reporter logs way, way too much on INFO level
> ---
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14526.patch
>
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}
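
For scale, the figures quoted above work out to roughly 99.8% of the non-empty log lines mentioning HadoopMetrics2Reporter, in a file of about 190 GiB. The arithmetic can be checked with a small sketch; the class and method names are hypothetical and only the three numbers come from the report.

```java
// Hypothetical helper that redoes the arithmetic on the quoted grep/ll output.
final class LogShareSketch {

    // Percentage of lines that matched, given the two `grep -c` counts.
    static double percentOfLines(long matchingLines, long totalLines) {
        return 100.0 * matchingLines / totalLines;
    }

    // Converts a byte count (as printed by `ll`) to GiB.
    static double bytesToGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long reporterLines = 547_524_076L;   // grep -c HadoopMetrics2Reporter
        long nonEmptyLines = 548_430_185L;   // grep -c . (counts non-empty lines)
        long fileBytes = 204_695_432_463L;   // size shown by ll

        System.out.printf("share of lines: %.2f%%%n",
                percentOfLines(reporterLines, nonEmptyLines));
        System.out.printf("file size: %.1f GiB%n", bytesToGiB(fileBytes));
    }
}
```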




[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Status: Patch Available  (was: Open)

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.17.patch, HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, which would enable predicate pushdown to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that allows different versions of ACID transactions to 
> coexist.
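
The core idea in the description (rewriting each update event as a delete event followed by a new insert event) can be sketched as below. This is a minimal illustration under stated assumptions, not the patch: `UpdateSplitSketch`, `Event`, and `Op` are hypothetical types, and reusing the same `rowId` for the new insert is a simplification, since the description does not say how row identifiers are assigned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of ACID delta events; not Hive's actual classes.
final class UpdateSplitSketch {

    enum Op { INSERT, UPDATE, DELETE }

    static final class Event {
        final Op op;
        final long rowId;
        final String value; // null for deletes

        Event(Op op, long rowId, String value) {
            this.op = op;
            this.rowId = rowId;
            this.value = value;
        }

        @Override
        public String toString() {
            return op + "(" + rowId + ", " + value + ")";
        }
    }

    // Replaces every UPDATE with a DELETE of the old row followed by an
    // INSERT of the new value; other events pass through unchanged.
    static List<Event> splitUpdates(List<Event> events) {
        List<Event> out = new ArrayList<>();
        for (Event e : events) {
            if (e.op == Op.UPDATE) {
                out.add(new Event(Op.DELETE, e.rowId, null));
                out.add(new Event(Op.INSERT, e.rowId, e.value));
            } else {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event(Op.INSERT, 1L, "a"),
                new Event(Op.UPDATE, 1L, "b"));
        System.out.println(splitUpdates(events));
    }
}
```

After the split, no event carries both an old and a new version of a row, which is the property the description relies on to make pushdown safe on all delta files.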




[jira] [Commented] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much on INFO level

2016-08-11 Thread Prasanth Jayachandran (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418254#comment-15418254
 ] 

Prasanth Jayachandran commented on HIVE-14526:
--

I think this should be enough. +1

> HadoopMetrics2Reporter logs way, way too much on INFO level
> ---
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14526.patch
>
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}




[jira] [Commented] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much on INFO level

2016-08-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418250#comment-15418250
 ] 

Ashutosh Chauhan commented on HIVE-14526:
-

[~prasanth_j] will know 

> HadoopMetrics2Reporter logs way, way too much on INFO level
> ---
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14526.patch
>
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}




[jira] [Commented] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418248#comment-15418248
 ] 

Hive QA commented on HIVE-14035:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823322/HIVE-14035.16.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10457 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/858/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/858/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-858/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823322 - PreCommit-HIVE-MASTER-Build

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, which would enable predicate pushdown to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that allows different versions of ACID transactions to 
> coexist.




[jira] [Updated] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14433:

Attachment: HIVE-14433.03.patch

again


> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.1, 2.2.0, 2.1.1
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14433.01.patch, HIVE-14433.02.patch, 
> HIVE-14433.03.patch, HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for merge processor.




[jira] [Updated] (HIVE-14521) codahale metrics exceptions

2016-08-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14521:

Description: 
On some random setup, I see bazillions of errors like this in HS2 log:
{noformat}
2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
java.io.IOException: Scope named api_Driver.run is not closed, cannot be opened.
at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
at org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
at org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
at org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
{noformat}

I suspect that either, just like the metastore deadline, this needs better 
error handling when the code that the metrics scope surrounds fails; or it is 
just not thread safe.
But I actually haven't looked at the code yet.

  was:
One some random setup, I see bazillions of errors like this in HS2 log, Gb-s of 
logs worth:
{noformat}
2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: 
log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
java.io.IOException: Scope named api_Driver.run is not closed, cannot be opened.
at 
org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
at 
org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
at 
org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
at 
org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
{noformat}

I suspect that either, just like the metastore deadline, this needs better 
error handling when whatever the metrics surround fails; or, it is just not 
thread safe.
But I actually haven't looked at the code yet.


> codahale metrics exceptions
> ---
>
> Key: HIVE-14521
> URL: https://issues.apache.org/jira/browse/HIVE-14521
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> On some random setup, I see bazillions of errors like this in HS2 log:
> {noformat}
> 2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: 
> log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
> java.io.IOException: Scope named api_Driver.run is not closed, cannot be 
> opened.
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
> at 
>

[jira] [Updated] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much

2016-08-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14526:

Status: Patch Available  (was: Open)

> HadoopMetrics2Reporter logs way, way too much
> -
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14526.patch
>
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}




[jira] [Updated] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much

2016-08-11 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14526:

Attachment: HIVE-14526.patch

Silencing the class logger. [~ashutoshc] can you take a look? Will these files 
be sufficient?

> HadoopMetrics2Reporter logs way, way too much
> -
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14526.patch
>
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}




[jira] [Commented] (HIVE-14526) HadoopMetrics2Reporter logs way, way too much

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418233#comment-15418233
 ] 

Sergey Shelukhin commented on HIVE-14526:
-

[~elserj] fyi

> HadoopMetrics2Reporter logs way, way too much
> -
>
> Key: HIVE-14526
> URL: https://issues.apache.org/jira/browse/HIVE-14526
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> {noformat}
> # grep -c HadoopMetrics2Reporter hiveserver2.log.2016-08-11
> 547524076
> # grep -c . hiveserver2.log.2016-08-11
> 548430185
> # ll hiveserver2.log.2016-08-11
> -rw-r--r-- 1 hive hadoop 204695432463 Aug 11 23:59 hiveserver2.log.2016-08-11
> {noformat}




[jira] [Issue Comment Deleted] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12917:
--
Comment: was deleted

(was: Status update: Ke Jia is working on the document, the inital version is 
under under review by few people, please feel free to send email to Ke if you 
want to review it. )

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>





[jira] [Commented] (HIVE-14521) codahale metrics exceptions

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418214#comment-15418214
 ] 

Sergey Shelukhin commented on HIVE-14521:
-

Actually, there aren't Gbs of these errors, there are just a few thousand... 
still, that's a lot. I'll file a separate bug for Gbs ;)

> codahale metrics exceptions
> ---
>
> Key: HIVE-14521
> URL: https://issues.apache.org/jira/browse/HIVE-14521
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> On some random setup, I see bazillions of errors like this in HS2 log, Gb-s 
> of logs worth:
> {noformat}
> 2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: 
> log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
> java.io.IOException: Scope named api_Driver.run is not closed, cannot be 
> opened.
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> {noformat}
> I suspect that either, just like the metastore deadline, this needs better 
> error handling when the code that the metrics scope surrounds fails; or it is 
> just not thread safe.
> But I actually haven't looked at the code yet.
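
The first suspicion in the description (a scope left open because the surrounded code failed) can be illustrated with a small sketch. `ScopeSketch` and `Scope` are hypothetical and only mimic the error message in the log; they are not the `CodahaleMetrics` implementation, and whether the real bug is missing cleanup or a thread-safety problem is not established in this thread.

```java
import java.io.IOException;

// Hypothetical model of a named metrics scope; not Hive's CodahaleMetrics.
final class ScopeSketch {

    static final class Scope {
        private final String name;
        private boolean open;

        Scope(String name) {
            this.name = name;
        }

        // Opening an already-open scope fails with the message seen in the log.
        synchronized void open() throws IOException {
            if (open) {
                throw new IOException(
                        "Scope named " + name + " is not closed, cannot be opened.");
            }
            open = true;
        }

        synchronized void close() {
            open = false;
        }

        synchronized boolean isOpen() {
            return open;
        }
    }

    // Closes the scope in a finally block, so a failure in the wrapped work
    // cannot leave the scope open for the next caller.
    static void runInScope(Scope scope, Runnable work) throws IOException {
        scope.open();
        try {
            work.run();
        } finally {
            scope.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Scope scope = new Scope("api_Driver.run");
        try {
            runInScope(scope, () -> { throw new IllegalStateException("compile failed"); });
        } catch (IllegalStateException expected) {
            // The work failed, but the scope was still closed.
        }
        runInScope(scope, () -> { });
        System.out.println("scope open after reuse: " + scope.isOpen());
    }
}
```

Without the `finally`, the second `runInScope` call would fail with the "is not closed, cannot be opened" message, which matches the symptom but does not rule out the thread-safety explanation.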




[jira] [Resolved] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Ferdinand Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu resolved HIVE-12546.
-
Resolution: Cannot Reproduce

Closing this since we can't reproduce it now. Feel free to reopen it if you can.

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.




[jira] [Comment Edited] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418212#comment-15418212
 ] 

Dapeng Sun edited comment on HIVE-12917 at 8/12/16 1:09 AM:


Status update: Ke Jia is working on the document. The initial version is 
under review by a few people; please feel free to send email to Ke if you want to 
review it. 


was (Author: dapengsun):
Status update: Ke Jia is working on the document, the first version is under 
under review by few people, please feel free to send email to Ke if you want to 
review it. 

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>





[jira] [Reopened] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun reopened HIVE-12917:
---
  Assignee: Ke Jia  (was: Dapeng Sun)

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>





[jira] [Updated] (HIVE-12917) Document for Hive authorization V2

2016-08-11 Thread Dapeng Sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HIVE-12917:
--
Status: Patch Available  (was: Reopened)

> Document for Hive authorization V2
> --
>
> Key: HIVE-12917
> URL: https://issues.apache.org/jira/browse/HIVE-12917
> Project: Hive
>  Issue Type: Bug
>Reporter: Dapeng Sun
>Assignee: Ke Jia
>





[jira] [Assigned] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junjie Chen reassigned HIVE-12546:
--

Assignee: Junjie Chen

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.




[jira] [Work started] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-12546 started by Junjie Chen.
--
> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.




[jira] [Work stopped] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Junjie Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-12546 stopped by Junjie Chen.
--
> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>Assignee: Junjie Chen
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.




[jira] [Commented] (HIVE-11693) CommonMergeJoinOperator throws exception with tez

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418137#comment-15418137
 ] 

Hive QA commented on HIVE-11693:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12766760/HIVE-11693.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10418 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/857/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/857/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-857/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12766760 - PreCommit-HIVE-MASTER-Build

> CommonMergeJoinOperator throws exception with tez
> -
>
> Key: HIVE-11693
> URL: https://issues.apache.org/jira/browse/HIVE-11693
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Selina Zhang
> Attachments: HIVE-11693.1.patch
>
>
> Got this when executing a simple query with latest hive build + tez latest 
> version.
> {noformat}
> Error: Failure while running task: 
> attempt_1439860407967_0291_2_03_45_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Hive Runtime Error while closing operators: 
> java.lang.RuntimeException: java.io.IOException: Please check if you are 
> invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: java.lang.RuntimeException: java.io.IOException: Please check if 
> you are invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:316)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.io.IOException: Please check if you are 
> invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:482)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:434)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:384)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
> at 
>

[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Patch Available  (was: Open)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14523.01.patch, HIVE-14523.02.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Attachment: HIVE-14523.02.patch

Updated patch with fixes from HIVE-14233.

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch, HIVE-14523.01.patch, 
> HIVE-14523.02.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Open  (was: Patch Available)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch, HIVE-14523.01.patch, 
> HIVE-14523.02.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14233) Improve vectorization for ACID by eliminating row-by-row stitching

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14233:
-
Attachment: HIVE-14233.07.patch

Fixed a functional bug that was throwing ArrayOutOfBoundsExceptions while 
trying to push SARGs for delete_deltas.

> Improve vectorization for ACID by eliminating row-by-row stitching
> --
>
> Key: HIVE-14233
> URL: https://issues.apache.org/jira/browse/HIVE-14233
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions, Vectorization
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14233.01.patch, HIVE-14233.02.patch, 
> HIVE-14233.03.patch, HIVE-14233.04.patch, HIVE-14233.05.patch, 
> HIVE-14233.06.patch, HIVE-14233.07.patch
>
>
> This JIRA proposes to improve vectorization for ACID by eliminating 
> row-by-row stitching when reading back ACID files. In the current 
> implementation, a vectorized row batch is created by populating the batch one 
> row at a time, before the vectorized batch is passed up along the operator 
> pipeline. This row-by-row stitching limitation existed because 
> the ACID insert/update/delete events from various delta files needed to be 
> merged together before the actual version of a given row could be determined. 
> HIVE-14035 has enabled us to break away from that limitation by splitting 
> ACID update events into a combination of delete+insert. In fact, it has now 
> enabled us to create splits on delta files.
> Building on top of HIVE-14035, this JIRA proposes to solve this earlier 
> bottleneck in the vectorized code path for ACID by now directly reading row 
> batches from the underlying ORC files and avoiding any stitching altogether. 
> Once a row batch is read from the split (which may be on a base/delta file), 
> the deleted rows will be found by cross-referencing them against a data 
> structure that will just keep track of deleted events (found in the 
> deleted_delta files). This will lead to a large performance gain when reading 
> ACID files in vectorized fashion, while enabling further optimizations in 
> future that can be done on top of that.
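
The cross-referencing step described above can be sketched roughly as follows. This is an illustrative Python model only, not the code in the patch: the row key layout `(transaction id, bucket, row id)` and the function names `build_delete_index` and `select_live_rows` are hypothetical, and the delete events are assumed to have already been read into memory.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical row identifier: (original transaction id, bucket, row id).
RowKey = Tuple[int, int, int]


def build_delete_index(delete_events: Iterable[RowKey]) -> Set[RowKey]:
    """Collect the keys of all deleted rows (stand-in for reading delete deltas)."""
    return set(delete_events)


def select_live_rows(batch_keys: List[RowKey], deleted: Set[RowKey]) -> List[int]:
    """Return the positions in the batch whose rows have not been deleted."""
    return [i for i, key in enumerate(batch_keys) if key not in deleted]


# A whole batch is read at once; deleted rows are filtered out by lookup,
# with no row-by-row merging of insert/update/delete events.
batch = [(1, 0, 0), (1, 0, 1), (1, 0, 2), (2, 0, 0)]
deleted = build_delete_index([(1, 0, 1), (2, 0, 0)])
selected = select_live_rows(batch, deleted)
```

The list of surviving positions plays a role similar to a selection vector over the batch; whether the patch represents it that way is not stated in the description.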




[jira] [Updated] (HIVE-14519) Multi insert query bug

2016-08-11 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-14519:

Attachment: HIVE-14519.1.patch

> Multi insert query bug
> --
>
> Key: HIVE-14519
> URL: https://issues.apache.org/jira/browse/HIVE-14519
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-14519.1.patch
>
>
> When running multi-insert queries, if one of the queries returns no 
> results, the other queries do not return the right result.
> For example:
> After the following query, there is no value in /tmp/emp/dir3/00_0
> {noformat}
> From (select * from src) a
> insert overwrite directory '/tmp/emp/dir1/'
> select key, value
> insert overwrite directory '/tmp/emp/dir2/'
> select 'header'
> where 1=2
> insert overwrite directory '/tmp/emp/dir3/'
> select key, value 
> where key = 100;
> {noformat}
> The where clause in the second insert should not affect the third insert. 




[jira] [Updated] (HIVE-14519) Multi insert query bug

2016-08-11 Thread Yongzhi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-14519:

Status: Patch Available  (was: Open)

Need code review.

> Multi insert query bug
> --
>
> Key: HIVE-14519
> URL: https://issues.apache.org/jira/browse/HIVE-14519
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Attachments: HIVE-14519.1.patch
>
>
> When running multi-insert queries, if one of the queries returns no 
> results, the other queries do not return the right result.
> For example:
> After the following query, there is no value in /tmp/emp/dir3/00_0
> {noformat}
> From (select * from src) a
> insert overwrite directory '/tmp/emp/dir1/'
> select key, value
> insert overwrite directory '/tmp/emp/dir2/'
> select 'header'
> where 1=2
> insert overwrite directory '/tmp/emp/dir3/'
> select key, value 
> where key = 100;
> {noformat}
> The where clause in the second insert should not affect the third insert. 




[jira] [Commented] (HIVE-14521) codahale metrics exceptions

2016-08-11 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418052#comment-15418052
 ] 

Szehon Ho commented on HIVE-14521:
--

Sorry, never mind: read access by TezJobMonitor should be OK as it doesn't create 
a scope, hm...

> codahale metrics exceptions
> ---
>
> Key: HIVE-14521
> URL: https://issues.apache.org/jira/browse/HIVE-14521
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> On some random setup, I see bazillions of errors like this in the HS2 log, GBs 
> of logs' worth:
> {noformat}
> 2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: 
> log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
> java.io.IOException: Scope named api_Driver.run is not closed, cannot be 
> opened.
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> {noformat}
> I suspect that either, just like the metastore deadline, this needs better 
> error handling when whatever the metrics surround fails; or, it is just not 
> thread safe.
> But I actually haven't looked at the code yet.




[jira] [Commented] (HIVE-14519) Multi insert query bug

2016-08-11 Thread Yongzhi Chen (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418053#comment-15418053
 ] 

Yongzhi Chen commented on HIVE-14519:
-

The issue is related to the constant propagation optimizer. After
set hive.optimize.constant.propagation=false;
the query works fine.

The constant optimizer turns where 1=2 into a filter that always returns false, 
which triggers WhereFalseProcessor in NullScanOptimizer.
The WhereFalseProcessor checks the filter's ancestors and, if there is a 
TableScanOperator, changes that TableScanOperator to
read MetadataOnly. Reading only metadata means no rows will be fetched, which 
works fine if the TableScanOperator only feeds
the filter that always returns false (where 1=2). But in the multi-insert case, this 
TableScanOperator is also an ancestor of two other operators:
filter (key = 100) and a FileSinkOperator. Returning no rows from the 
TableScanOperator breaks the other two inserts.

The fix is to not use the NullScanOptimizer when the false filter has peers. 
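
The "has peers" condition can be illustrated with a small model of the operator tree. This is only a sketch in Python under simplifying assumptions, not the logic in the attached patch: the `Op` class, `connect`, `chain` and `false_filter_has_peers` are hypothetical names, and each operator is assumed to have at most one parent.

```python
class Op:
    """A hypothetical, minimal stand-in for an operator in the plan."""

    def __init__(self, name):
        self.name = name
        self.parents = []
        self.children = []


def connect(parent, child):
    parent.children.append(child)
    child.parents.append(parent)


def chain(*ops):
    """Connect the given operators in order, parent to child."""
    for parent, child in zip(ops, ops[1:]):
        connect(parent, child)


def false_filter_has_peers(false_filter):
    """Walk up from an always-false filter and report whether any ancestor
    also feeds another branch (assumes a single parent per operator)."""
    current = false_filter
    while current.parents:
        parent = current.parents[0]
        if len(parent.children) > 1:
            return True
        current = parent
    return False


# Multi-insert shape from the report: one scan feeding three inserts.
ts, sel = Op("TS"), Op("SEL")
fil_false, fil_key = Op("FIL(1=2)"), Op("FIL(key=100)")
chain(ts, sel)
chain(sel, Op("FS dir1"))
chain(sel, fil_false, Op("FS dir2"))
chain(sel, fil_key, Op("FS dir3"))

# Single-insert shape: the scan only feeds the always-false filter.
lone_ts, lone_false = Op("TS"), Op("FIL(1=2)")
chain(lone_ts, lone_false, Op("FS"))

multi_insert_has_peers = false_filter_has_peers(fil_false)
single_insert_has_peers = false_filter_has_peers(lone_false)
```

In this model the metadata-only rewrite would be skipped for the multi-insert plan and still applied to the single-insert plan.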

> Multi insert query bug
> --
>
> Key: HIVE-14519
> URL: https://issues.apache.org/jira/browse/HIVE-14519
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
>
> When running multi-insert queries, if one of the queries returns no 
> results, the other queries do not return the right result.
> For example:
> After the following query, there is no value in /tmp/emp/dir3/00_0
> {noformat}
> From (select * from src) a
> insert overwrite directory '/tmp/emp/dir1/'
> select key, value
> insert overwrite directory '/tmp/emp/dir2/'
> select 'header'
> where 1=2
> insert overwrite directory '/tmp/emp/dir3/'
> select key, value 
> where key = 100;
> {noformat}
> The where clause in the second insert should not affect the third insert. 




[jira] [Commented] (HIVE-14479) Add some join tests for acid table

2016-08-11 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418044#comment-15418044
 ] 

Eugene Koifman commented on HIVE-14479:
---

It would be more flexible if the 2nd param in assertExplainHasString(String 
string, List queryPlan, String testFor) simply took the whole message.  
Then the "switch" statement is not needed and it works for all future cases.



> Add some join tests for acid table
> --
>
> Key: HIVE-14479
> URL: https://issues.apache.org/jira/browse/HIVE-14479
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-14479.1.patch, HIVE-14479.2.patch, 
> HIVE-14479.3.patch
>
>





[jira] [Commented] (HIVE-12634) Add command to kill an ACID transaction

2016-08-11 Thread Wei Zheng (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418040#comment-15418040
 ] 

Wei Zheng commented on HIVE-12634:
--

Wiki has been updated.
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AbortTransactions

> Add command to kill an ACID transaction
> ---
>
> Key: HIVE-12634
> URL: https://issues.apache.org/jira/browse/HIVE-12634
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>  Labels: TODOC1.3, TODOC2.1
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12634.1.patch, HIVE-12634.2.patch, 
> HIVE-12634.3.patch, HIVE-12634.4.patch, HIVE-12634.5.patch, 
> HIVE-12634.6.patch, HIVE-12634.7.patch, HIVE-12634.branch-1.patch
>
>
> Should add a CLI command to abort a (runaway) transaction.
> This should clean up all state related to this txn.
> The initiator of this (if still alive) will get an error trying to 
> heartbeat/commit, i.e. will become aware that the txn is dead.




[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418026#comment-15418026
 ] 

Naveen Gangam commented on HIVE-14513:
--

The same test failed in another build (2 builds after mine)
https://builds.apache.org/view/H-L/view/Hive/job/PreCommit-HIVE-MASTER-Build/856/
So this does appear to be a flaky test. So the fix looks good. +1 for me.

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> LDAP Authenticator can be configured to use a result set from an LDAP query to 
> authenticate. However, it is expected that this LDAP query would only return 
> a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that 
> return users. For example, say you would like to allow "all users from group1 
> and group2" to be authenticated. The LDAP query has to return a union of all 
> members of group1 and group2.
> For example, one common configuration is that groups contain a list of their 
> users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of 
> "member" attributes. (LDAP client tools are able to do so by filtering out the 
> attributes on these entries.)
> So it will be useful to have such support to be able to specify queries that 
> return groups.
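
As a rough illustration of what handling a result set of groups involves, the sketch below collects the "member" values from group entries and de-duplicates them. It is written in Python with plain dictionaries standing in for LDAP entries; it does not use any LDAP library and is not the code in the attached patch. The function name `users_from_groups` and the membership of group2 are made up for the example.

```python
from typing import Dict, List

# Hypothetical in-memory form of an LDAP entry: attribute name -> values.
Entry = Dict[str, List[str]]


def users_from_groups(group_entries: List[Entry], member_attr: str = "member") -> List[str]:
    """Return the distinct user DNs listed in the given group entries,
    in first-seen order."""
    seen = set()
    users = []
    for entry in group_entries:
        for dn in entry.get(member_attr, []):
            if dn not in seen:
                seen.add(dn)
                users.append(dn)
    return users


groups = [
    {"cn": ["group1"], "member": ["uid=user1,ou=People,dc=example,dc=com"]},
    {
        "cn": ["group2"],
        "member": [
            "uid=user1,ou=People,dc=example,dc=com",
            "uid=user2,ou=People,dc=example,dc=com",
        ],
    },
]
allowed_users = users_from_groups(groups)
```

A user would then be accepted if their DN appears in `allowed_users`; the attribute holding membership may differ between directory schemas, which is why it is a parameter here.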




[jira] [Commented] (HIVE-14521) codahale metrics exceptions

2016-08-11 Thread Szehon Ho (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15418015#comment-15418015
 ] 

Szehon Ho commented on HIVE-14521:
--

So there is a method 'SessionState.getPerfLogger().cleanupPerfLogMetrics();' 
that I introduced in the driver return path for the success or failure case, which 
should take care of the error handling in theory.

As for multi-threading, the scopes are thread-local, but they are opened at 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L348]
 and closed at 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/Driver.java#L559]
 so I believe this should be on the same thread.

I did a search and noticed that the PerfLogger.COMPILE scope is also used in 
TezJobMonitor; that might cause a conflict as it's a different thread.  Is it 
possible to rename that one?
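
To make the two hypotheses concrete, here is a small Python model of named scopes kept in thread-local storage. It is a sketch for discussion only and does not reproduce Hive's CodahaleMetrics or PerfLogger; the `ScopeStore` class and its methods are hypothetical. It shows that, with thread-local scopes, a second thread can open the same scope name without conflict, while re-opening on the same thread without a close produces the error seen in the log.

```python
import threading


class ScopeStore:
    """Hypothetical model: each thread tracks its own set of open scopes."""

    def __init__(self):
        self._local = threading.local()

    def _open_scopes(self):
        if not hasattr(self._local, "open_scopes"):
            self._local.open_scopes = set()
        return self._local.open_scopes

    def start(self, name):
        scopes = self._open_scopes()
        if name in scopes:
            raise IOError(f"Scope named {name} is not closed, cannot be opened.")
        scopes.add(name)

    def end(self, name):
        self._open_scopes().discard(name)

    def cleanup(self):
        """Close whatever the current thread left open (error-path cleanup)."""
        self._open_scopes().clear()


store = ScopeStore()
store.start("api_Driver.run")

# Another thread opening the same name does not conflict.
other_thread_errors = []

def worker():
    try:
        store.start("api_Driver.run")
    except IOError as exc:
        other_thread_errors.append(exc)

t = threading.Thread(target=worker)
t.start()
t.join()

# The same thread re-opening without a close fails.
try:
    store.start("api_Driver.run")
    reopened = True
except IOError:
    reopened = False

# After cleanup on the failing path, the scope can be opened again.
store.cleanup()
store.start("api_Driver.run")
```

Under this model the repeated warning would point at a missed close or cleanup on one handler thread rather than at cross-thread interference, though that is only what the model suggests, not a diagnosis of the actual code.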

> codahale metrics exceptions
> ---
>
> Key: HIVE-14521
> URL: https://issues.apache.org/jira/browse/HIVE-14521
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> On some random setup, I see bazillions of errors like this in the HS2 log, GBs 
> of logs' worth:
> {noformat}
> 2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: 
> log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
> java.io.IOException: Scope named api_Driver.run is not closed, cannot be 
> opened.
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
> at 
> org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
> at 
> org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> {noformat}
> I suspect that either, just like the metastore deadline, this needs better 
> error handling when whatever the metrics surround fails; or, it is just not 
> thread safe.
> But I actually haven't looked at the code yet.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Attachment: HIVE-14523.01.patch

Realized that the previous file naming convention doesn't trigger the Ptest.

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch, HIVE-14523.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Open  (was: Patch Available)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch, HIVE-14523.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Patch Available  (was: Open)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
>  Issue Type: Test
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
>Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch, HIVE-14523.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Open  (was: Patch Available)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
> Issue Type: Test
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Saket Saurabh
> Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Status: Patch Available  (was: Open)

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
> Issue Type: Test
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Saket Saurabh
> Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Commented] (HIVE-14520) We should set a timeout for the blocking calls in TestMsgBusConnection

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417978#comment-15417978
 ] 

Hive QA commented on HIVE-14520:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823319/HIVE-14520.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10419 tests 
executed
*Failed tests:*
{noformat}
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform_ppr1
org.apache.hive.hcatalog.listener.TestMsgBusConnection.testConnection
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/856/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/856/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-856/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823319 - PreCommit-HIVE-MASTER-Build

> We should set a timeout for the blocking calls in TestMsgBusConnection
> --
>
> Key: HIVE-14520
> URL: https://issues.apache.org/jira/browse/HIVE-14520
> Project: Hive
> Issue Type: Bug
> Reporter: Hari Sankar Sivarama Subramaniyan
> Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14520.1.patch
>
>
> consumer.receive() is a blocking call, and if the test fails it will block 
> forever. We need to set a timeout at the bare minimum, so that a failure 
> makes the test fail promptly rather than hang until the run times out.
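The idea of bounding a blocking receive can be sketched as follows. This is a minimal illustration, not the HIVE-14520 patch: it assumes `consumer` is a JMS `MessageConsumer`, whose `receive(long timeout)` overload returns null when the timeout expires, and it uses a `BlockingQueue` as a stand-in so the example runs without a message broker. The class name, method name and timeout values are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class TimedReceiveSketch {

    /**
     * Waits at most timeoutMillis for the next message.
     * Returns null when nothing arrives in time, instead of blocking forever.
     */
    static String receive(BlockingQueue<String> queue, long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the message bus; a real test would use a JMS consumer.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Nothing has been published yet: an unbounded queue.take() would hang here.
        String missing = receive(queue, 100);
        System.out.println("received before publish: " + missing);

        queue.add("test message");
        System.out.println("received after publish: " + receive(queue, 100));
    }
}
```

With a bounded wait, a test can assert that the returned message is not null and fail immediately with a clear message when the event never arrives.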




[jira] [Updated] (HIVE-14479) Add some join tests for acid table

2016-08-11 Thread Wei Zheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14479:
-
Attachment: HIVE-14479.3.patch

Patch 3 removes the qfile, because the file size estimate in the explain output 
changes depending on where the test is run. The test has been moved into a JUnit test.

> Add some join tests for acid table
> --
>
> Key: HIVE-14479
> URL: https://issues.apache.org/jira/browse/HIVE-14479
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.2.0
> Reporter: Wei Zheng
> Assignee: Wei Zheng
> Attachments: HIVE-14479.1.patch, HIVE-14479.2.patch, 
> HIVE-14479.3.patch
>
>





[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-08-11 Thread Abdullah Yousufi (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417975#comment-15417975
 ] 

Abdullah Yousufi commented on HIVE-14373:
-

Sure, that sounds great! You could email me the patch if that works for you.

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
> Issue Type: Sub-task
> Reporter: Sergio Peña
> Assignee: Abdullah Yousufi
> Attachments: HIVE-14373.patch
>
>
> With Hive being improved to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure




[jira] [Updated] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-11 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14448:

Status: Patch Available  (was: Open)

Let's wait until a successful run before more code review.

> Queries with predicate fail when ETL split strategy is chosen for ACID tables
> -
>
> Key: HIVE-14448
> URL: https://issues.apache.org/jira/browse/HIVE-14448
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-14448.01.patch, HIVE-14448.02.patch, 
> HIVE-14448.patch
>
>
> When the ETL split strategy is applied to ACID tables with predicate pushdown 
> (SARG enabled), split generation fails for ACID. This bug is usually exposed 
> only when working with data at scale, because in most other cases only the 
> BI split strategy is chosen. My guess is that this is happening because the 
> correct readerSchema is not being picked up when we try to extract SARG 
> column names.
> The quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>   @Test
>   public void testETLSplitStrategyForACID() throws Exception {
>     // Force the ETL split strategy and enable predicate pushdown (SARG).
>     hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
>     hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
>     runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
>     runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
>     runWorker(hiveConf);
>     List<String> rs = runStatementOnDriver(
>         "select * from " + Table.ACIDTBL + " where a = 1");
>     int[][] resultData = new int[][] {{1,2}};
>     Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}
> The back-trace for this failed test is as follows:
> {code}
> exec.Task: Job Submission failed with exception 
> 'java.lang.RuntimeException(ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException)'
> java.lang.RuntimeException: ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1570)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1656)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:370)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:488)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:417)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1962)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1389)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1119)
>   at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runStatementOnDriver(TestTxnCommands2.java:1292)
>   at 
>

[jira] [Updated] (HIVE-14523) ACID performance improvement patches

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14523:
-
Attachment: HIVE-14035_14199_14233.01.patch

First version.

> ACID performance improvement patches
> 
>
> Key: HIVE-14523
> URL: https://issues.apache.org/jira/browse/HIVE-14523
> Project: Hive
> Issue Type: Test
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Saket Saurabh
> Priority: Trivial
> Attachments: HIVE-14035_14199_14233.01.patch
>
>
> This is a trivial non-functional JIRA that combines the features introduced 
> in HIVE-14035, HIVE-14199 and HIVE-14233 into a single patch.




[jira] [Commented] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-11 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417970#comment-15417970
 ] 

Eugene Koifman commented on HIVE-14448:
---

HIVE-14035 has TestTxnCommands2WithSplitUpdate.testOrcPPD() and testOrcNoPPD() 
both of which have a special case specifically because of this bug.  The 
special casing should be removed as part of HIVE-14448 - it will also provide 
additional testing.

> Queries with predicate fail when ETL split strategy is chosen for ACID tables
> -
>
> Key: HIVE-14448
> URL: https://issues.apache.org/jira/browse/HIVE-14448
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-14448.01.patch, HIVE-14448.02.patch, 
> HIVE-14448.patch
>
>
> When the ETL split strategy is applied to ACID tables with predicate pushdown 
> (SARG enabled), split generation fails for ACID. This bug is usually exposed 
> only when working with data at scale, because in most other cases only the 
> BI split strategy is chosen. My guess is that this is happening because the 
> correct readerSchema is not being picked up when we try to extract SARG 
> column names.
> The quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>   @Test
>   public void testETLSplitStrategyForACID() throws Exception {
>     // Force the ETL split strategy and enable predicate pushdown (SARG).
>     hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
>     hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
>     runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
>     runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
>     runWorker(hiveConf);
>     List<String> rs = runStatementOnDriver(
>         "select * from " + Table.ACIDTBL + " where a = 1");
>     int[][] resultData = new int[][] {{1,2}};
>     Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}
> The back-trace for this failed test is as follows:
> {code}
> exec.Task: Job Submission failed with exception 
> 'java.lang.RuntimeException(ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException)'
> java.lang.RuntimeException: ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1570)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1656)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:370)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:488)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:417)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1962)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1389)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1119)
>

[jira] [Updated] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-11 Thread Matt McCline (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-14448:

Attachment: HIVE-14448.02.patch

> Queries with predicate fail when ETL split strategy is chosen for ACID tables
> -
>
> Key: HIVE-14448
> URL: https://issues.apache.org/jira/browse/HIVE-14448
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 2.2.0
> Reporter: Saket Saurabh
> Assignee: Matt McCline
> Priority: Critical
> Attachments: HIVE-14448.01.patch, HIVE-14448.02.patch, 
> HIVE-14448.patch
>
>
> When the ETL split strategy is applied to ACID tables with predicate pushdown 
> (SARG enabled), split generation fails for ACID. This bug is usually exposed 
> only when working with data at scale, because in most other cases only the 
> BI split strategy is chosen. My guess is that this is happening because the 
> correct readerSchema is not being picked up when we try to extract SARG 
> column names.
> The quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>   @Test
>   public void testETLSplitStrategyForACID() throws Exception {
>     // Force the ETL split strategy and enable predicate pushdown (SARG).
>     hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
>     hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
>     runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
>     runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
>     runWorker(hiveConf);
>     List<String> rs = runStatementOnDriver(
>         "select * from " + Table.ACIDTBL + " where a = 1");
>     int[][] resultData = new int[][] {{1,2}};
>     Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}
> The back-trace for this failed test is as follows:
> {code}
> exec.Task: Job Submission failed with exception 
> 'java.lang.RuntimeException(ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException)'
> java.lang.RuntimeException: ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1570)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1656)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:370)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:488)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:417)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1962)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1389)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1119)
>   at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runStatementOnDriver(TestTxnCommands2.java:1292)
>   at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.testETLSplitStrategyForACID(TestTxnCommands2.java:280)
>   at

[jira] [Commented] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Eugene Koifman (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417964#comment-15417964
 ] 

Eugene Koifman commented on HIVE-14035:
---

+1 patch 16 pending bot run

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
> Issue Type: New Feature
> Components: Transactions
> Reporter: Saket Saurabh
> Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, which can enable predicate pushdown to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that allows different versions of ACID transactions to 
> coexist.
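The split-update idea described above can be illustrated with a small sketch. It is not Hive's implementation: `Op`, `Event` and `splitUpdates` are hypothetical names invented for this example, and the sketch keeps the same row id for the replacement insert purely for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitUpdateSketch {

    // Hypothetical event kinds; Hive's real ACID records are more involved.
    enum Op { INSERT, UPDATE, DELETE }

    static final class Event {
        final Op op;
        final long rowId;
        final String value; // null for deletes

        Event(Op op, long rowId, String value) {
            this.op = op;
            this.rowId = rowId;
            this.value = value;
        }

        @Override
        public String toString() {
            return op + "(" + rowId + (value == null ? "" : ", " + value) + ")";
        }
    }

    /** Rewrites every UPDATE as a DELETE of the old row followed by an INSERT of the new value. */
    static List<Event> splitUpdates(List<Event> events) {
        List<Event> out = new ArrayList<>();
        for (Event e : events) {
            if (e.op == Op.UPDATE) {
                out.add(new Event(Op.DELETE, e.rowId, null));
                out.add(new Event(Op.INSERT, e.rowId, e.value));
            } else {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event(Op.INSERT, 1L, "a"));
        events.add(new Event(Op.UPDATE, 1L, "b"));
        System.out.println(splitUpdates(events));
    }
}
```

After the rewrite the stream contains only inserts and deletes, so no event overwrites the payload of an earlier insert, which is the property the description relies on for pushing predicates down to delta files.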




[jira] [Comment Edited] (HIVE-14298) NPE could be thrown in HMS when an ExpressionTree could not be made from a filter

2016-08-11 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417959#comment-15417959
 ] 

Chaoyu Tang edited comment on HIVE-14298 at 8/11/16 9:22 PM:
-

It might need a bit of effort, since qtest uses an embedded HMS with the 
property hive.metastore.limit.partition.request disabled by default, and unit 
tests need to convert the query predicates to expressions and pass them to the 
relevant HMS APIs. Maybe we can raise a JIRA for it.


was (Author: ctang.ma):
It might need a bit of effort, since qtest uses an embedded HMS with the 
property hive.metastore.limit.partition.request disabled by default, and unit 
tests need to convert the query predicates to expressions to pass in. Maybe we 
can raise a JIRA for it.

> NPE could be thrown in HMS when an ExpressionTree could not be made from a 
> filter
> -
>
> Key: HIVE-14298
> URL: https://issues.apache.org/jira/browse/HIVE-14298
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14298.patch, HIVE-14298.patch, HIVE-14298.patch
>
>
> In many cases an ExpressionTree cannot be made from a filter (e.g. the 
> parser fails to parse the filter), and its value is null. This null is then 
> passed around and used by a couple of HMS methods, which can cause a 
> NullPointerException.
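One defensive pattern for this kind of problem is to make the "no tree" case explicit at the point where the filter is parsed, so callers cannot dereference a null by accident. The sketch below is only an illustration under stated assumptions: the `ExpressionTree` class here is a hypothetical stand-in, `tryParse` uses a toy parsing rule, and none of this is taken from the HIVE-14298 patch.

```java
import java.util.Optional;

public class FilterGuardSketch {

    // Hypothetical stand-in for the metastore's expression tree.
    static final class ExpressionTree {
        final String filter;

        ExpressionTree(String filter) {
            this.filter = filter;
        }
    }

    /** Toy parser: returns an empty Optional when no tree can be built from the filter. */
    static Optional<ExpressionTree> tryParse(String filter) {
        if (filter == null || filter.trim().isEmpty() || !filter.contains("=")) {
            return Optional.empty();
        }
        return Optional.of(new ExpressionTree(filter.trim()));
    }

    /** Callers must handle the missing-tree case instead of dereferencing null. */
    static String describe(String filter) {
        return tryParse(filter)
                .map(tree -> "tree(" + tree.filter + ")")
                .orElse("no tree; fall back to unfiltered handling");
    }

    public static void main(String[] args) {
        System.out.println(describe("ds = '2016-08-11'"));
        System.out.println(describe("???"));
    }
}
```

Whether the real code should fall back, return an empty result or raise a descriptive error depends on the calling HMS method; the point of the sketch is only that the decision is made explicitly.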




[jira] [Commented] (HIVE-14298) NPE could be thrown in HMS when an ExpressionTree could not be made from a filter

2016-08-11 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417959#comment-15417959
 ] 

Chaoyu Tang commented on HIVE-14298:


It might need a bit of effort, since qtest uses an embedded HMS with the 
property hive.metastore.limit.partition.request disabled by default, and unit 
tests need to convert the query predicates to expressions to pass in. Maybe we 
can raise a JIRA for it.

> NPE could be thrown in HMS when an ExpressionTree could not be made from a 
> filter
> -
>
> Key: HIVE-14298
> URL: https://issues.apache.org/jira/browse/HIVE-14298
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Chaoyu Tang
> Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14298.patch, HIVE-14298.patch, HIVE-14298.patch
>
>
> In many cases an ExpressionTree cannot be made from a filter (e.g. the 
> parser fails to parse the filter), and its value is null. This null is then 
> passed around and used by a couple of HMS methods, which can cause a 
> NullPointerException.




[jira] [Commented] (HIVE-14522) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Fix test failure for auto_join_filters

2016-08-11 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417946#comment-15417946
 ] 

Vineet Garg commented on HIVE-14522:


Right outer and full outer joins return wrong results as well.

> CBO: Calcite Operator To Hive Operator(Calcite Return Path): Fix test failure 
> for auto_join_filters
> ---
>
> Key: HIVE-14522
> URL: https://issues.apache.org/jira/browse/HIVE-14522
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Reporter: Vineet Garg
> Assignee: Vineet Garg
>
> {code}
> CREATE TABLE smb_input1(key int, value int) CLUSTERED BY (key) SORTED BY 
> (key) INTO 2 BUCKETS; 
> CREATE TABLE smb_input2(key int, value int) CLUSTERED BY (value) SORTED BY 
> (value) INTO 2 BUCKETS; 
> LOAD DATA LOCAL INPATH '../../data/files/in1.txt' into table smb_input1;
> LOAD DATA LOCAL INPATH '../../data/files/in2.txt' into table smb_input1;
> LOAD DATA LOCAL INPATH '../../data/files/in1.txt' into table smb_input2;
> LOAD DATA LOCAL INPATH '../../data/files/in2.txt' into table smb_input2;
> SET hive.optimize.bucketmapjoin = true;
> SET hive.optimize.bucketmapjoin.sortedmerge = true;
> SET hive.input.format = 
> org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> SET hive.outerjoin.supports.filters = false;
> {code}
> {code} SELECT sum(hash(a.key,a.value,b.key,b.value)) FROM myinput1 a LEFT 
> OUTER JOIN myinput1 b on a.key > 40 AND a.value > 50 AND a.key = a.value AND 
> b.key > 40 AND b.value > 50 AND b.key = b.value; {code}
> {code} Expected result: 3078400 Actual result: 4937935 {code}




[jira] [Commented] (HIVE-14396) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver count.q failure

2016-08-11 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417920#comment-15417920
 ] 

Vineet Garg commented on HIVE-14396:


Created: https://reviews.apache.org/r/51006/

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> count.q failure
> ---
>
> Key: HIVE-14396
> URL: https://issues.apache.org/jira/browse/HIVE-14396
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-14396.1.patch
>
>
> Currently there are three different failures.
> Set hive.cbo.returnpath.hiveop=true for all cases.
> 1) The first case is a wrong result for the following query:
> {code:title=failure 1 Wrong result}
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> This occurs due to a bug in HiveCalciteUtil.getExprNodes: while looking up the 
> corresponding expression for an aggregate function's argument, the wrong index 
> is used.
> 2) Out-of-bounds exception for the following:
> {code}
> set hive.map.aggr=false;
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> The above happens while converting the Calcite aggregation to Hive's group-by 
> operator.
> 3) Once the above exception is fixed, the same query with 
> hive.map.aggr=false gives wrong results. The problem in this case is that while 
> creating the expression for the aggregate function's argument we end up with the 
> wrong column info from the underlying reduce sink operator.




[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417915#comment-15417915
 ] 

Ashutosh Chauhan commented on HIVE-14511:
-

[~pattipaka] I agree with what [~pxiong] is saying above. Can you state explicitly 
what your table definition is and what the directory structure on the filesystem is?

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.
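The difference between the leaf folder and the last partition folder can be sketched as follows. This is an illustration under assumptions, not the HIVE-14511 patch: it assumes the number of partition columns is known from the table definition, and the class name, method name and example paths are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PartitionDepthSketch {

    /**
     * Given a leaf directory found under a table root, returns the directory
     * that sits exactly partitionColumns levels below the root, ignoring any
     * extra nested folders beneath the last partition folder.
     */
    static Path partitionDir(Path tableRoot, Path leafDir, int partitionColumns) {
        if (!leafDir.startsWith(tableRoot)) {
            throw new IllegalArgumentException(leafDir + " is not under " + tableRoot);
        }
        if (partitionColumns == 0) {
            return tableRoot; // unpartitioned table
        }
        if (leafDir.equals(tableRoot)) {
            throw new IllegalArgumentException("no partition folders under " + tableRoot);
        }
        Path relative = tableRoot.relativize(leafDir);
        if (relative.getNameCount() < partitionColumns) {
            throw new IllegalArgumentException(
                leafDir + " has fewer than " + partitionColumns + " partition levels");
        }
        // Keep only the first partitionColumns components below the root.
        return tableRoot.resolve(relative.subpath(0, partitionColumns));
    }

    public static void main(String[] args) {
        Path root = Paths.get("/warehouse/t");
        Path leaf = Paths.get("/warehouse/t/ds=2016-08-11/hr=10/extra_dir");
        System.out.println(partitionDir(root, leaf, 2));
    }
}
```

In the example the table has two partition columns, so the extra folder below `hr=10` is ignored and the partition location is the `hr=10` directory rather than the leaf.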



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14396) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver count.q failure

2016-08-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417905#comment-15417905
 ] 

Ashutosh Chauhan commented on HIVE-14396:
-

Can you create a RB for this?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> count.q failure
> ---
>
> Key: HIVE-14396
> URL: https://issues.apache.org/jira/browse/HIVE-14396
> Project: Hive
> Issue Type: Sub-task
> Components: CBO
> Reporter: Vineet Garg
> Assignee: Vineet Garg
> Attachments: HIVE-14396.1.patch
>
>
> Currently there are three different failures.
> Set hive.cbo.returnpath.hiveop=true for all cases.
> 1) The first case is a wrong result for the following query:
> {code:title=failure 1 Wrong result}
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> This occurs due to a bug in HiveCalciteUtil.getExprNodes. While looking for 
> the corresponding expression for an aggregate function's argument, the wrong 
> index is being used.
> 2) Out-of-bounds exception for the following
> {code}
> set hive.map.aggr=false;
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> The above happens while converting Calcite Aggregation to Hive's group by 
> operator.
> 3) Once the above case with the exception is fixed, the same query with 
> hive.map.aggr=false gives wrong results. The problem in this case is that while 
> creating the expression for the aggregate function's argument we end up with 
> the wrong column info from the underlying reduce sink operator.




[jira] [Updated] (HIVE-12924) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_ppr_multi_distinct.q failure

2016-08-11 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-12924:
---
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

Marking this as a duplicate since this is the same issue as HIVE-14396 (issue 3).

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_ppr_multi_distinct.q failure
> 
>
> Key: HIVE-12924
> URL: https://issues.apache.org/jira/browse/HIVE-12924
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Vineet Garg
> Attachments: HIVE-12924.1.patch, HIVE-12924.2.patch, 
> HIVE-12924.3.patch
>
>
> {code}
> EXPLAIN EXTENDED
> FROM srcpart src
> INSERT OVERWRITE TABLE dest1
> SELECT substr(src.key,1,1), count(DISTINCT substr(src.value,5)), 
> concat(substr(src.key,1,1),sum(substr(src.value,5))), sum(DISTINCT 
> substr(src.value, 5)), count(DISTINCT src.value)
> WHERE src.ds = '2008-04-08'
> GROUP BY substr(src.key,1,1)
> {code}
> Ended Job = job_local968043618_0742 with errors
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask




[jira] [Resolved] (HIVE-12803) CBO: Calcite Operator To Hive Operator (Calcite Return Path): MiniTezCliDriver count.q failure

2016-08-11 Thread Vineet Garg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-12803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg resolved HIVE-12803.

Resolution: Duplicate

The same issue is captured by HIVE-14396.

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): 
> MiniTezCliDriver count.q failure
> --
>
> Key: HIVE-12803
> URL: https://issues.apache.org/jira/browse/HIVE-12803
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Vineet Garg
>
> {code}
> select a, count(distinct b), count(distinct c), sum(d) from abcd group by a;
> {code}
> Set hive.cbo.returnpath.hiveop=true;
> {code}
> java.lang.IndexOutOfBoundsException: Index: 5, Size: 5
> at java.util.ArrayList.rangeCheck(ArrayList.java:635) ~[?:1.7.0_79]
> at java.util.ArrayList.get(ArrayList.java:411) ~[?:1.7.0_79]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveGBOpConvUtil.genReduceSideGB1NoMapGB(HiveGBOpConvUtil.java:1060)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveGBOpConvUtil.genNoMapSideGBNoSkew(HiveGBOpConvUtil.java:473)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveGBOpConvUtil.translateGB(HiveGBOpConvUtil.java:304)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveOpConverter.visit(HiveOpConverter.java:398)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveOpConverter.dispatch(HiveOpConverter.java:181)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.HiveOpConverter.convert(HiveOpConverter.java:154)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedHiveOPDag(CalcitePlanner.java:688)
>  ~[hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:266)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10094)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:231)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>  [hive-exec-2.1.0-SNAPSHOT.jar:2.1.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:471) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1149) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237) 
> [hive-exec-2.1.0-SNAPSHOT.jar:?]
> {code}




[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-08-11 Thread Illya Yalovyy (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417898#comment-15417898
 ] 

Illya Yalovyy commented on HIVE-14373:
--

Hey [~ayousufi],

I can provide a patch that contains our test framework (very similar to what 
you have already implemented). We are using it in production to test Hive on S3. 
Unfortunately, I cannot attach the file to the ticket, most likely because it is 
not assigned to me. I think it will be useful for you and can be used as a 
reference.

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Abdullah Yousufi
> Attachments: HIVE-14373.patch
>
>
> With Hive doing improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests won't be able to be executed by HiveQA because they will need 
> Amazon credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure




[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-11 Thread Pengcheng Xiong (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417859#comment-15417859
 ] 

Pengcheng Xiong commented on HIVE-14511:


[~pattipaka], as you said, the 
{code}
!directoryFound
{code}
is used to check whether the current folder is a leaf folder (one that does not 
contain any folder). It was inherited from the original code. We need to discuss whether your 
claim that "Any path you find at this level will qualify to be a partition." is 
valid for all existing applications. For example, let's assume that we only 
have 2 partition specifications and we have /p1=1/p2=1/p3=1/file.txt. Do you 
mean we should also add /p1=1/p2=1/p3=1 ? I think we should just ignore this, 
according to my discussion with [~ashutoshc]. Thanks.
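
The two readings discussed here can be contrasted with a minimal sketch. It is not the Hive MSCK implementation: the class and method names are hypothetical, and it works on plain file-path strings rather than a filesystem, so empty directories are not modelled. One method accepts any directory found at a depth equal to the number of partition columns; the other additionally requires that the directory contain no sub-folder, which is what the !directoryFound check is described as doing.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; these names do not exist in Hive.
final class MsckDepthSketch {

    /** Splits "/p1=1/p2=1/file.txt" into components, ignoring leading slashes. */
    private static String[] components(String filePath) {
        return filePath.replaceAll("^/+", "").split("/");
    }

    /** Joins the first 'depth' components back into a directory path. */
    private static String prefix(String[] parts, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            sb.append('/').append(parts[i]);
        }
        return sb.toString();
    }

    /** Policy A: every directory at depth numPartCols is a partition candidate. */
    static Set<String> anyDirAtDepth(List<String> filePaths, int numPartCols) {
        Set<String> result = new LinkedHashSet<>();
        for (String p : filePaths) {
            String[] parts = components(p);
            int dirCount = parts.length - 1; // the last component is the file name
            if (dirCount >= numPartCols) {
                result.add(prefix(parts, numPartCols));
            }
        }
        return result;
    }

    /** Policy B: as A, but drop directories that contain a sub-folder. */
    static Set<String> leafDirsAtDepth(List<String> filePaths, int numPartCols) {
        Set<String> result = anyDirAtDepth(filePaths, numPartCols);
        for (String p : filePaths) {
            String[] parts = components(p);
            if (parts.length - 1 > numPartCols) {
                result.remove(prefix(parts, numPartCols));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList(
                "/p1=1/p2=1/p3=1/file.txt",
                "/p1=2/p2=1/file.txt");
        System.out.println(anyDirAtDepth(paths, 2));   // [/p1=1/p2=1, /p1=2/p2=1]
        System.out.println(leafDirsAtDepth(paths, 2)); // [/p1=2/p2=1]
    }
}
```

With two partition columns, the path /p1=1/p2=1/p3=1/file.txt yields /p1=1/p2=1 under the first policy and nothing under the second, which is the case the comment above asks about.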

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.




[jira] [Commented] (HIVE-14298) NPE could be thrown in HMS when an ExpressionTree could not be made from a filter

2016-08-11 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417858#comment-15417858
 ] 

Chaoyu Tang commented on HIVE-14298:


[~thejas] The stack trace follows. You can easily reproduce it using a 
query like "select * from sample_pt where code in ('53-5022', '53-5023') and 
dummy like '%1';" with hive.metastore.limit.partition.request > 0 and directsql 
enabled:
{code}
2016-08-11T16:05:30,384 ERROR [main]: metastore.ObjectStore (:()) - 
java.lang.NullPointerException
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql$PartitionFilterGenerator.generateSqlFilter(MetaStoreDirectSql.java:987)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql$PartitionFilterGenerator.access$700(MetaStoreDirectSql.java:956)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.generateSqlFilterForPushdown(MetaStoreDirectSql.java:396)
at 
org.apache.hadoop.hive.metastore.ObjectStore$6.canUseDirectSql(ObjectStore.java:2937)
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.start(ObjectStore.java:2737)
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2703)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getNumPartitionsByExpr(ObjectStore.java:2928)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy27.getNumPartitionsByExpr(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_num_partitions_by_expr(HiveMetaStore.java:4853)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.checkLimitNumberOfPartitionsByExpr(HiveMetaStore.java:3239)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:4799)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
at com.sun.proxy.$Proxy28.get_partitions_by_expr(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByExpr(HiveMetaStoreClient.java:1251)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
at com.sun.proxy.$Proxy29.listPartitionsByExpr(Unknown Source)
at 
org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByExpr(Hive.java:2479)
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.getPartitionsFromServer(PartitionPruner.java:424)
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:223)
at 
org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.computePartitionList(RelOptHiveTable.java:256)
at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HivePartitionPruneRule.perform(HivePartitionPruneRule.java:55)
at 
org.apache.hadoop.hive.ql.optimizer.calcite.rules.HivePartitionPruneRule.onMatch(HivePartitionPruneRule.java:43)
at 
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:318)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:514)
at 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392)
at 
org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:255)
at 
org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:125)
at 
org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207)
at 
org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:1320)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:1274)
at

[jira] [Commented] (HIVE-14521) codahale metrics exceptions

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417849#comment-15417849
 ] 

Sergey Shelukhin commented on HIVE-14521:
-

[~szehon] fyi

> codahale metrics exceptions
> ---
>
> Key: HIVE-14521
> URL: https://issues.apache.org/jira/browse/HIVE-14521
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> On some random setup, I see bazillions of errors like this in the HS2 log, gigabytes 
> of logs' worth:
> {noformat}
> 2016-08-08 04:52:18,619 WARN  [HiveServer2-Handler-Pool: Thread-101]: log.PerfLogger (PerfLogger.java:beginMetrics(226)) - Error recording metrics
> java.io.IOException: Scope named api_Driver.run is not closed, cannot be opened.
> at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics$CodahaleMetricsScope.open(CodahaleMetrics.java:133)
> at org.apache.hadoop.hive.common.metrics.metrics2.CodahaleMetrics.startStoredScope(CodahaleMetrics.java:220)
> at org.apache.hadoop.hive.ql.log.PerfLogger.beginMetrics(PerfLogger.java:223)
> at org.apache.hadoop.hive.ql.log.PerfLogger.PerfLogBegin(PerfLogger.java:143)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:378)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:320)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1214)
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1208)
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:146)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:226)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:276)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:468)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:456)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> {noformat}
> I suspect that either, just like the metastore deadline, this needs better 
> error handling when whatever the metrics surround fails; or, it is just not 
> thread safe.
> But I actually haven't looked at the code yet.
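
One way the first suspicion could play out is sketched below. This is an assumption for illustration, not a diagnosis of CodahaleMetrics: the Scope class and both run methods are hypothetical stand-ins. If the work surrounded by a scope throws and the close call is not in a finally block, the scope stays open and the next open fails with the message seen in the log.

```java
import java.io.IOException;

// Hypothetical stand-in; it does not reproduce the CodahaleMetrics classes.
final class MetricsScopeSketch {

    static final class Scope {
        private final String name;
        private boolean open;

        Scope(String name) {
            this.name = name;
        }

        void open() throws IOException {
            if (open) {
                throw new IOException(
                        "Scope named " + name + " is not closed, cannot be opened.");
            }
            open = true;
        }

        void close() {
            open = false;
        }
    }

    /** close() is skipped when the work throws, so the scope leaks. */
    static void runLeaky(Scope scope, Runnable work) throws IOException {
        scope.open();
        work.run();
        scope.close();
    }

    /** close() always runs, even when the work throws. */
    static void runSafe(Scope scope, Runnable work) throws IOException {
        scope.open();
        try {
            work.run();
        } finally {
            scope.close();
        }
    }

    /** Returns true if reopening the scope fails after a failed unit of work. */
    static boolean secondOpenFails(boolean useFinally) {
        Scope scope = new Scope("api_Driver.run");
        Runnable failing = () -> {
            throw new IllegalStateException("simulated query failure");
        };
        try {
            if (useFinally) {
                runSafe(scope, failing);
            } else {
                runLeaky(scope, failing);
            }
        } catch (IllegalStateException | IOException expected) {
            // The simulated failure is expected; only the scope state matters here.
        }
        try {
            scope.open();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondOpenFails(false)); // true
        System.out.println(secondOpenFails(true));  // false
    }
}
```

The sketch says nothing about the thread-safety alternative; a scope shared between threads could hit the same exception without any failure in the surrounded work.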




[jira] [Commented] (HIVE-14504) tez_join_hash.q test is slow

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417810#comment-15417810
 ] 

Hive QA commented on HIVE-14504:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823317/HIVE-14504.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10419 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/855/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/855/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-855/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823317 - PreCommit-HIVE-MASTER-Build

> tez_join_hash.q test is slow
> 
>
> Key: HIVE-14504
> URL: https://issues.apache.org/jira/browse/HIVE-14504
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14504.1.patch, HIVE-14504.1.patch, 
> HIVE-14504.1.patch
>
>
> tez_join_hash.q also explicitly sets execution engine to mr which slows down 
> the entire test. Test takes around 7 mins. 




[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417758#comment-15417758
 ] 

Naveen Gangam commented on HIVE-14513:
--

There is just ONE test failure that had not failed in the prior build. Looking 
at the test, I do not see how this fix could cause that failure. There are a 
bunch of exceptions when running the test, so I suspect it is a result of 
those. Running it locally fails with different output than that shown here.


> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> LDAP Authenticator can be configured to use a result set from an LDAP query to 
> authenticate. However, it is expected that this LDAP query would only return 
> a set of users (i.e., full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that 
> return users. For example, say you would like to allow "all users from group1 
> and group2" to be authenticated. The LDAP query has to return a union of all 
> members of group1 and group2.
> For example, one common configuration is that groups contain a list of their 
> users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of the 
> "member" attributes. (LDAP client tools are able to do this by filtering out the 
> attributes on these entries.)
> So it would be useful to have support for specifying queries that 
> return groups.
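
The client-side step described above, turning group entries into the set of user DNs, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch attached to this issue: the class and method names are hypothetical, and the group entries are modelled as plain attribute maps rather than results from a live LDAP search.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; entries are attribute-name -> values maps.
final class GroupMembersSketch {

    /** Collects the union of all "member" values across the given group entries. */
    static Set<String> memberDns(List<Map<String, List<String>>> groupEntries) {
        Set<String> users = new LinkedHashSet<>();
        for (Map<String, List<String>> entry : groupEntries) {
            // Entries without a "member" attribute contribute nothing.
            users.addAll(entry.getOrDefault("member", Collections.<String>emptyList()));
        }
        return users;
    }

    public static void main(String[] args) {
        Map<String, List<String>> group1 = Collections.singletonMap(
                "member", Arrays.asList("uid=user1,ou=People,dc=example,dc=com"));
        Map<String, List<String>> group2 = Collections.singletonMap(
                "member", Arrays.asList(
                        "uid=user1,ou=People,dc=example,dc=com",
                        "uid=user2,ou=People,dc=example,dc=com"));
        // Duplicate members across groups appear once in the result.
        System.out.println(memberDns(Arrays.asList(group1, group2)));
    }
}
```

A user would then be authenticated if their DN is in the returned set; the second user DN in the example is made up for illustration.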




[jira] [Commented] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417754#comment-15417754
 ] 

Sergey Shelukhin commented on HIVE-14035:
-

+1 pending others' comments

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
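
The proposed split can be sketched in a few lines. This is a simplified illustration, not Hive's ACID event format: the Event class is hypothetical, and reusing the same row id for the re-inserted row is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of rewriting UPDATE as DELETE + INSERT.
final class SplitUpdateSketch {

    enum Kind { INSERT, UPDATE, DELETE }

    static final class Event {
        final Kind kind;
        final long rowId;
        final String value; // null for DELETE events

        Event(Kind kind, long rowId, String value) {
            this.kind = kind;
            this.rowId = rowId;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + rowId + (value == null ? "" : "," + value) + ")";
        }
    }

    /** Replaces every UPDATE with a DELETE of the old row followed by an INSERT. */
    static List<Event> splitUpdates(List<Event> events) {
        List<Event> out = new ArrayList<>();
        for (Event e : events) {
            if (e.kind == Kind.UPDATE) {
                out.add(new Event(Kind.DELETE, e.rowId, null));
                out.add(new Event(Kind.INSERT, e.rowId, e.value));
            } else {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event(Kind.INSERT, 1, "a"),
                new Event(Kind.UPDATE, 1, "b"));
        System.out.println(splitUpdates(events)); // [INSERT(1,a), DELETE(1), INSERT(1,b)]
    }
}
```

After the rewrite, the stream carries row values only in insert events, which is the property the description relies on when it says a predicate can be pushed down to all delta files.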




[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417748#comment-15417748
 ] 

Sergey Shelukhin commented on HIVE-14511:
-

The thing is that msck is intended for Hive table repair, not for creating 
partitions over user data in ETL. It should be moved into a separate proper 
feature if we want to add functionality to it and make it a supported feature. 
Otherwise we are just encouraging an unsupported hack...

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.




[jira] [Commented] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417743#comment-15417743
 ] 

Sergey Shelukhin commented on HIVE-14448:
-

Test failures are related...

> Queries with predicate fail when ETL split strategy is chosen for ACID tables
> -
>
> Key: HIVE-14448
> URL: https://issues.apache.org/jira/browse/HIVE-14448
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-14448.01.patch, HIVE-14448.patch
>
>
> When ETL split strategy is applied to ACID tables with predicate pushdown 
> (SARG enabled), split generation fails for ACID. This bug will usually be 
> exposed when working with data at scale, because in most other cases only the 
> correct readerSchema is not being picked up when we try to extract SARG 
> column names.
> Quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>   @Test
>   public void testETLSplitStrategyForACID() throws Exception {
>     // Force the ETL split strategy and enable predicate pushdown (SARG).
>     hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
>     hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
>     runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
>     runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
>     runWorker(hiveConf);
>     // The filtered read is where split generation fails.
>     List<String> rs = runStatementOnDriver("select * from " + Table.ACIDTBL + " where a = 1");
>     int[][] resultData = new int[][] {{1,2}};
>     Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}
> Back-trace for this failed test is as follows:
> {code}
> exec.Task: Job Submission failed with exception 
> 'java.lang.RuntimeException(ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException)'
> java.lang.RuntimeException: ORC split generation failed with exception: 
> java.lang.NegativeArraySizeException
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1570)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1656)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:370)
>   at 
> org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:488)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:417)
>   at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1962)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1653)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1389)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1119)
>   at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.runStatementOnDriver(TestTxnCommands2.java:1292)
>   at 
> org.apache.hadoop.hive.ql.TestTxnCommands2.testETLSplitStrategyForACID(TestTxnCommands2.java:280)
>   at

[jira] [Commented] (HIVE-14483) java.lang.ArrayIndexOutOfBoundsException org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417742#comment-15417742
 ] 

Sergey Shelukhin commented on HIVE-14483:
-

+1, these 2 tests are unstable, it appears

>  java.lang.ArrayIndexOutOfBoundsException 
> org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays
> --
>
> Key: HIVE-14483
> URL: https://issues.apache.org/jira/browse/HIVE-14483
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Sergey Zadoroshnyak
>Assignee: Sergey Zadoroshnyak
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-14483.01.patch
>
>
> Error message:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
> at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
> at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
> at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
> at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
> at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
> at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
> at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
> at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
> at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
> ... 22 more
> How to reproduce?
> Configure a StringTreeReader that contains a StringDirectTreeReader as its 
> TreeReader (DIRECT or DIRECT_V2 column encoding).
> Set batchSize = 1026;
> Invoke the method nextVector(ColumnVector previousVector, boolean[] isNull, final 
> int batchSize), 
> which executes BytesColumnVectorUtil.readOrcByteArrays(stream, lengths, 
> scratchlcv, result, batchSize);
> Here scratchlcv is a LongColumnVector whose long[] vector has length 1024.
> As a result, the method commonReadByteArrays(stream, lengths, scratchlcv, 
> result, (int) batchSize) throws an 
> ArrayIndexOutOfBoundsException.
> If we use StringDictionaryTreeReader, then there is no exception, as we have 
> a verification  scratchlcv.ensureSize((int) batchSize, false) before 
> reader.nextVector(scratchlcv, scratchlcv.vector, batchSize);
> These changes were made for Hive 2.1.0 by corresponding commit 
> https://github.com/apache/hive/commit/0ac424f0a17b341efe299da167791112e4a953e9#diff-a1cec556fb2db4b69a1a4127a6908177R1467
>  for task  https://issues.apache.org/jira/browse/HIVE-12159 by Owen O'Malley
> How to fix?
> Add only one line:
> scratchlcv.ensureSize((int) batchSize, false);
> in the method 
> org.apache.orc.impl.TreeReaderFactory#BytesColumnVectorUtil#commonReadByteArrays(InStream
>  stream, IntegerReader lengths,
> LongColumnVector scratchlcv,
> BytesColumnVector result, final int batchSize) before the invocation of 
> lengths.nextVector(scratchlcv, scratchlcv.vector, batchSize);
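
The proposed guard can be illustrated with a minimal sketch. It assumes a simplified stand-in for LongColumnVector: the ScratchVector class and the readLengths method below are hypothetical and are not the ORC API. Only the ensureSize(size, preserveData) call shape is taken from the report above.

```java
import java.util.Arrays;

public class EnsureSizeSketch {
    /** Simplified stand-in for LongColumnVector; not the real Hive/ORC class. */
    static class ScratchVector {
        long[] vector = new long[1024]; // length 1024, as in the report

        /** Grows the backing array when it is smaller than the requested size. */
        void ensureSize(int size, boolean preserveData) {
            if (vector.length < size) {
                vector = preserveData ? Arrays.copyOf(vector, size) : new long[size];
            }
        }
    }

    /** Hypothetical reader step: writes batchSize values into the scratch vector. */
    static void readLengths(ScratchVector scratch, int batchSize, boolean guard) {
        if (guard) {
            scratch.ensureSize(batchSize, false); // the one-line fix proposed above
        }
        for (int i = 0; i < batchSize; i++) {
            // Without the guard this throws ArrayIndexOutOfBoundsException
            // once i reaches the array length.
            scratch.vector[i] = i;
        }
    }

    public static void main(String[] args) {
        readLengths(new ScratchVector(), 1026, true);
        System.out.println("guarded read of 1026 values succeeded");
        try {
            readLengths(new ScratchVector(), 1026, false);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unguarded read failed: " + e.getMessage());
        }
    }
}
```

The sketch passes preserveData = false, as in the report, because the scratch vector is overwritten on every batch and its old contents do not need to be copied.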



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-11 Thread Subramanyam Pattipaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417738#comment-15417738
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

[~pxiong], since your change stops at a depth equal to the number of partition 
columns, your current code has a bug at

if (!directoryFound && maxDepth == 0) {

This again assumes that there is no directory at maxDepth. You are 
terminating your search here anyway, so any path you find at this level 
qualifies as a partition. I think you should remove the !directoryFound check, 
here and at the other locations.
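
To make the suggestion concrete, here is a rough sketch of a depth-limited search that accepts every path at maxDepth == 0, whether or not it contains further directories. It is not the code from the patch: it uses java.nio.file on the local filesystem instead of the Hadoop FileSystem API, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class PartitionDepthSketch {
    /**
     * Collects every directory found exactly maxDepth levels below base.
     * What lies below such a directory (files or more folders) is not checked.
     */
    static void collect(Path base, int maxDepth, List<Path> out) throws IOException {
        if (maxDepth == 0) {
            out.add(base); // the search stops here, so this path is a partition
            return;
        }
        try (DirectoryStream<Path> children = Files.newDirectoryStream(base)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    collect(child, maxDepth - 1, out);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("msck");
        // Two partition columns, with an extra folder below the last partition level.
        Path nested = root.resolve("a=1").resolve("b=1").resolve("extra");
        Files.createDirectories(nested);
        Files.createFile(nested.resolve("file1"));
        Files.createDirectories(root.resolve("a=1").resolve("b=2"));

        List<Path> partitions = new ArrayList<>();
        collect(root, 2, partitions);
        for (Path p : partitions) {
            System.out.println(root.relativize(p));
        }
    }
}
```

With two partition columns, both a=1/b=1 (which holds a subfolder) and a=1/b=2 (which is empty) are reported, because the check at depth 0 no longer depends on whether a directory was found there.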

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.




[jira] [Commented] (HIVE-14480) ORC ETLSplitStrategy should use thread pool when computing splits

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417735#comment-15417735
 ] 

Sergey Shelukhin commented on HIVE-14480:
-

+1

> ORC ETLSplitStrategy should use thread pool when computing splits
> -
>
> Key: HIVE-14480
> URL: https://issues.apache.org/jira/browse/HIVE-14480
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14480.1.patch, HIVE-14480.2.patch
>
>





[jira] [Commented] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases

2016-08-11 Thread Subramanyam Pattipaka (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417732#comment-15417732
 ] 

Subramanyam Pattipaka commented on HIVE-14511:
--

[~sershe], some users keep large data sets in a directory structure of the form 
data/partlevel=0/partlevel2=0/partlevel3=0/partleve4=0//partleveln=0/file1

Given this structure, with the configs mapred.input.dir.recursive and 
hive.mapred.supports.subdirectories set to true, the expectation is that we can 
create partitions at any level and query the data. 

Users may generate data with various tools in mind. Asking them to 
reorganize the data and create a copy for Hive may be a hurdle to trying out 
Hive, as the data could be very large and copying may not always be possible.

This fix will ensure that we add the appropriate partitions for the above case 
when a user tries to create partitions with any number of levels.

> Improve MSCK for partitioned table to deal with special cases
> -
>
> Key: HIVE-14511
> URL: https://issues.apache.org/jira/browse/HIVE-14511
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14511.01.patch
>
>
> Some users will have a folder rather than a file under the last partition 
> folder. However, msck is going to search for the leaf folder rather than the 
> last partition folder. We need to improve that.




[jira] [Comment Edited] (HIVE-14396) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver count.q failure

2016-08-11 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417726#comment-15417726
 ] 

Vineet Garg edited comment on HIVE-14396 at 8/11/16 6:40 PM:
-

None of the test failures are reproducible on my local machine


was (Author: vgarg):
None of the tests are reproducible on my local machine

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> count.q failure
> ---
>
> Key: HIVE-14396
> URL: https://issues.apache.org/jira/browse/HIVE-14396
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14396.1.patch
>
>
> Currently there are three different failures
> Set hive.cbo.returnpath.hiveop=true for all cases.
> 1) First case is wrong result for following query
> {code:title=failure 1 Wrong result}
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> This occurs due to a bug in HiveCalciteUtil.getExprNodes. While looking up 
> the expression corresponding to an aggregate function's argument, the wrong 
> index is used.
> 2) Out-of-bounds exception for the following
> {code}
> set hive.map.aggr=false;
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> The above happens while converting Calcite Aggregation to Hive's group-by 
> operator.
> 3) Once the above exception is fixed, the same query with 
> hive.map.aggr=false gives wrong results. The problem in this case is that, 
> while creating the expression for an aggregate function's argument, we end up 
> with the wrong column info from the underlying reduce sink operator. 




[jira] [Commented] (HIVE-14396) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver count.q failure

2016-08-11 Thread Vineet Garg (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417726#comment-15417726
 ] 

Vineet Garg commented on HIVE-14396:


None of the tests are reproducible on my local machine

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> count.q failure
> ---
>
> Key: HIVE-14396
> URL: https://issues.apache.org/jira/browse/HIVE-14396
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14396.1.patch
>
>
> Currently there are three different failures
> Set hive.cbo.returnpath.hiveop=true for all cases.
> 1) First case is wrong result for following query
> {code:title=failure 1 Wrong result}
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> This occurs due to a bug in HiveCalciteUtil.getExprNodes. While looking up 
> the expression corresponding to an aggregate function's argument, the wrong 
> index is used.
> 2) Out-of-bounds exception for the following
> {code}
> set hive.map.aggr=false;
> explain select count(1), count(*), count(a), count(b), count(c), count(d), 
> count(distinct a), count(distinct b), count(distinct c), count(distinct d), 
> count(distinct a,b), count(distinct b,c), count(distinct c,d), count(distinct 
> a,d), count(distinct a,c), count(distinct b,d), count(distinct a,b,c), 
> count(distinct b,c,d), count(distinct a,c,d), count(distinct a,b,d), 
> count(distinct a,b,c,d) from abcd;
> {code}
> The above happens while converting Calcite Aggregation to Hive's group-by 
> operator.
> 3) Once the above exception is fixed, the same query with 
> hive.map.aggr=false gives wrong results. The problem in this case is that, 
> while creating the expression for an aggregate function's argument, we end up 
> with the wrong column info from the underlying reduce sink operator. 




[jira] [Commented] (HIVE-12546) Hive beeline doesn't support arrow keys and tab

2016-08-11 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417724#comment-15417724
 ] 

Sergey Shelukhin commented on HIVE-12546:
-

Hmm.. sure. I may reopen if I get a repro.

> Hive beeline doesn't support arrow keys and tab
> ---
>
> Key: HIVE-12546
> URL: https://issues.apache.org/jira/browse/HIVE-12546
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Sergey Shelukhin
>
> On CLI, up/down arrows navigate history, tab auto-completes, and left/right 
> arrows move around the command text.
> Trying to use beeline, I see that these just print key codes or the tab into 
> the command text. 
> This should be fixed before removing CLI in favor of beeline.




[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Attachment: HIVE-14035.16.patch

Rebase with master and fix the two failing UTs due to a trivial case mismatch 
error.

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.




[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Status: Patch Available  (was: Open)

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.16.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.




[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Saket Saurabh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Status: Open  (was: Patch Available)

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.




[jira] [Commented] (HIVE-14520) We should set a timeout for the blocking calls in TestMsgBusConnection

2016-08-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417704#comment-15417704
 ] 

Ashutosh Chauhan commented on HIVE-14520:
-

+1

> We should set a timeout for the blocking calls in TestMsgBusConnection
> --
>
> Key: HIVE-14520
> URL: https://issues.apache.org/jira/browse/HIVE-14520
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14520.1.patch
>
>
> consumer.receive() is a blocking call and, if it fails, it will block 
> forever. We need to set a timeout, at the bare minimum, to force the test to 
> fail in case of failure rather than timing out.
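
The idea of a bounded receive can be sketched as follows. This is not the attached patch: a java.util.concurrent.BlockingQueue stands in for the JMS consumer so that the example runs without a message broker, and the receiveOrFail helper is a made-up name. In JMS itself, MessageConsumer.receive(long timeout) plays the same role and returns null when the timeout expires.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReceiveTimeoutSketch {
    /** Waits at most timeoutMs for a message and fails fast instead of hanging. */
    static String receiveOrFail(BlockingQueue<String> queue, long timeoutMs)
            throws InterruptedException {
        // poll returns null when no element arrives within the timeout
        String msg = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (msg == null) {
            throw new AssertionError("no message received within " + timeoutMs + " ms");
        }
        return msg;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("CREATE_TABLE");
        System.out.println(receiveOrFail(queue, 100));
        try {
            receiveOrFail(queue, 100); // the queue is empty now
        } catch (AssertionError e) {
            System.out.println("failed fast: " + e.getMessage());
        }
    }
}
```

The timeout value is arbitrary here; in a real test it would need to be long enough for the message bus to deliver under load.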




[jira] [Commented] (HIVE-11693) CommonMergeJoinOperator throws exception with tez

2016-08-11 Thread Mithun Radhakrishnan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417699#comment-15417699
 ] 

Mithun Radhakrishnan commented on HIVE-11693:
-

[~selinazh], could we please post our solution to this JIRA?

> CommonMergeJoinOperator throws exception with tez
> -
>
> Key: HIVE-11693
> URL: https://issues.apache.org/jira/browse/HIVE-11693
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Selina Zhang
> Attachments: HIVE-11693.1.patch
>
>
> Got this when executing a simple query with latest hive build + tez latest 
> version.
> {noformat}
> Error: Failure while running task: 
> attempt_1439860407967_0291_2_03_45_0:java.lang.RuntimeException: 
> java.lang.RuntimeException: Hive Runtime Error while closing operators: 
> java.lang.RuntimeException: java.io.IOException: Please check if you are 
> invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:349)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:71)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:60)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:60)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Hive Runtime Error while closing 
> operators: java.lang.RuntimeException: java.io.IOException: Please check if 
> you are invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:316)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
> ... 14 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.RuntimeException: java.io.IOException: Please check if you are 
> invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:412)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchNextGroup(CommonMergeJoinOperator.java:375)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.doFirstFetchIfNeeded(CommonMergeJoinOperator.java:482)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:434)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:384)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:292)
> ... 15 more
> Caused by: java.lang.RuntimeException: java.io.IOException: Please check if 
> you are invoking moveToNext() even after it returned false.
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:291)
> at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:400)
> ... 21 more
> Caused by: java.io.IOException: Please check if you are invoking moveToNext() 
> even after it returned false.
> at 
> org.apache.tez.runtime.library.common.ValuesIterator.hasCompletedProcessing(ValuesIterator.java:223)
> at 
> org.apache.tez.runtime.library.common.ValuesIterator.moveToNext(ValuesIterator.java:105)
> at 
> org.apache.tez.runtime.library.input.OrderedGroupedKVInput$OrderedGroupedKeyValuesReader.next(OrderedGroupedKVInput.java:308)
> at 
> org.apache.hadoop.hive.ql.exec.tez.KeyValuesFromKeyValues.next(KeyValuesFromKeyValues.java:46)
> at 
> org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:249)
> ... 22 more
> {noformat}
> Not sure if this is related to HIVE-11016. 




[jira] [Commented] (HIVE-14504) tez_join_hash.q test is slow

2016-08-11 Thread Siddharth Seth (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417668#comment-15417668
 ] 

Siddharth Seth commented on HIVE-14504:
---

I'm not sure if pre-commit is configured to ignore patches which only change 
tests?

> tez_join_hash.q test is slow
> 
>
> Key: HIVE-14504
> URL: https://issues.apache.org/jira/browse/HIVE-14504
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14504.1.patch, HIVE-14504.1.patch, 
> HIVE-14504.1.patch
>
>
> tez_join_hash.q also explicitly sets execution engine to mr which slows down 
> the entire test. Test takes around 7 mins. 




[jira] [Updated] (HIVE-14520) We should set a timeout for the blocking calls in TestMsgBusConnection

2016-08-11 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14520:
-
Status: Patch Available  (was: Open)

> We should set a timeout for the blocking calls in TestMsgBusConnection
> --
>
> Key: HIVE-14520
> URL: https://issues.apache.org/jira/browse/HIVE-14520
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14520.1.patch
>
>
> consumer.receive() is a blocking call and, if it fails, it will block 
> forever. We need to set a timeout, at the bare minimum, to force the test to 
> fail in case of failure rather than timing out.




[jira] [Updated] (HIVE-14520) We should set a timeout for the blocking calls in TestMsgBusConnection

2016-08-11 Thread Hari Sankar Sivarama Subramaniyan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-14520:
-
Attachment: HIVE-14520.1.patch

cc [~ashutoshc] for review.

> We should set a timeout for the blocking calls in TestMsgBusConnection
> --
>
> Key: HIVE-14520
> URL: https://issues.apache.org/jira/browse/HIVE-14520
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14520.1.patch
>
>
> consumer.receive() is a blocking call and, if it fails, it will block 
> forever. We need to set a timeout, at the bare minimum, to force the test to 
> fail in case of failure rather than timing out.




[jira] [Commented] (HIVE-14298) NPE could be thrown in HMS when an ExpressionTree could not be made from a filter

2016-08-11 Thread Thejas M Nair (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417635#comment-15417635
 ] 

Thejas M Nair commented on HIVE-14298:
--

[~ctang.ma]
I am having trouble understanding why the NPE is related to 
hive.metastore.limit.partition.request setting. 
Do you have the full stack trace for this ?


> NPE could be thrown in HMS when an ExpressionTree could not be made from a 
> filter
> -
>
> Key: HIVE-14298
> URL: https://issues.apache.org/jira/browse/HIVE-14298
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14298.patch, HIVE-14298.patch, HIVE-14298.patch
>
>
> In many cases an ExpressionTree cannot be made from a filter (e.g. the 
> parser fails to parse the filter) and its value is null. This null is then 
> passed around and used by a couple of HMS methods, which can cause a 
> NullPointerException.
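
The failure mode and a defensive check can be sketched as below. Everything in the sketch is hypothetical: FilterTree is a stand-in for the metastore's ExpressionTree, tryParse uses a toy parsing rule, and the fallback message is invented. It only shows the null check at the call site, not how the actual patch handles the case.

```java
public class NullTreeGuardSketch {
    /** Hypothetical stand-in for a parsed filter; not the metastore's ExpressionTree. */
    static class FilterTree {
        final String expr;

        FilterTree(String expr) {
            this.expr = expr;
        }
    }

    /** Returns null when the filter cannot be parsed (toy rule: it must contain '='). */
    static FilterTree tryParse(String filter) {
        if (filter == null || !filter.contains("=")) {
            return null;
        }
        return new FilterTree(filter.trim());
    }

    /** Guarded caller: checks for null before dereferencing the tree. */
    static String describe(String filter) {
        FilterTree tree = tryParse(filter);
        if (tree == null) {
            return "unparsed filter, falling back";
        }
        return "parsed: " + tree.expr;
    }

    public static void main(String[] args) {
        System.out.println(describe("ds = '2016-08-11'"));
        System.out.println(describe("not a filter"));
    }
}
```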




[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417631#comment-15417631
 ] 

Hive QA commented on HIVE-14513:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823145/HIVE-14513.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10408 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/854/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/854/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-854/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823145 - PreCommit-HIVE-MASTER-Build

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> The LDAP Authenticator can be configured to use the result set of an LDAP 
> query to authenticate. However, it is expected that this LDAP query returns 
> only a set of users (i.e. full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that 
> return users. For example, say you would like to allow "all users from group1 
> and group2" to be authenticated. The LDAP query has to return a union of all 
> members of group1 and group2.
> For example, one common configuration is that each group contains a list of 
> its users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of 
> the "member" attributes. (LDAP client tools are able to do this by filtering 
> out the attributes on these entries.)
> So it would be useful to have support for specifying queries that 
> return groups.
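
What "supporting a result set of groups" could mean is sketched below: given the group entries returned by the query, the authenticator would collect the values of their "member" attributes and treat those as the allowed user DNs. This is an assumption about the approach, not the attached patch. The entries are modelled as plain maps instead of JNDI search results so that the example runs without an LDAP server, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GroupMembersSketch {
    /** Collects the distinct "member" values from the group entries of a query result. */
    static Set<String> memberDns(List<Map<String, List<String>>> groupEntries) {
        Set<String> users = new LinkedHashSet<>();
        for (Map<String, List<String>> entry : groupEntries) {
            List<String> members = entry.get("member");
            users.addAll(members == null ? Collections.<String>emptyList() : members);
        }
        return users;
    }

    /** Builds one group entry with a cn and a list of member DNs. */
    static Map<String, List<String>> group(String cn, String... members) {
        Map<String, List<String>> entry = new LinkedHashMap<>();
        entry.put("cn", Collections.singletonList(cn));
        entry.put("member", Arrays.asList(members));
        return entry;
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> result = new ArrayList<>();
        result.add(group("group1", "uid=user1,ou=People,dc=example,dc=com"));
        result.add(group("group2",
                "uid=user1,ou=People,dc=example,dc=com",
                "uid=user2,ou=People,dc=example,dc=com"));
        for (String dn : memberDns(result)) {
            System.out.println(dn);
        }
    }
}
```

A set is used so that a user who belongs to both groups appears only once in the union.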




[jira] [Commented] (HIVE-14039) HiveServer2: Make the usage of server with JDBC thirft serde enabled, backward compatible for older clients

2016-08-11 Thread Ziyang Zhao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417629#comment-15417629
 ] 

Ziyang Zhao commented on HIVE-14039:


Failed test cases are not related.

> HiveServer2: Make the usage of server with JDBC thirft serde enabled, 
> backward compatible for older clients
> ---
>
> Key: HIVE-14039
> URL: https://issues.apache.org/jira/browse/HIVE-14039
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Affects Versions: 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Ziyang Zhao
> Attachments: HIVE-14039.2.patch
>
>





[jira] [Updated] (HIVE-14504) tez_join_hash.q test is slow

2016-08-11 Thread Prasanth Jayachandran (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14504:
-
Attachment: HIVE-14504.1.patch

For some reason this patch is being kicked out of precommit queue. Will give 
one more shot. 

> tez_join_hash.q test is slow
> 
>
> Key: HIVE-14504
> URL: https://issues.apache.org/jira/browse/HIVE-14504
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14504.1.patch, HIVE-14504.1.patch, 
> HIVE-14504.1.patch
>
>
> tez_join_hash.q also explicitly sets execution engine to mr which slows down 
> the entire test. Test takes around 7 mins. 




[jira] [Commented] (HIVE-14426) Extensive logging on info level in WebHCat

2016-08-11 Thread Peter Vary (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417584#comment-15417584
 ] 

Peter Vary commented on HIVE-14426:
---

[~leftylev] I tried to rephrase the comment:

{noformat}
Dumping every environment variable value and the entire configuration 
to the
log could be damaging, so this parameter makes it possible to 
explicitly turn it off
by setting its value to false.
{noformat}

What do you think?
Is this good, or are further changes needed?

Thanks,
Peter

> Extensive logging on info level in WebHCat
> --
>
> Key: HIVE-14426
> URL: https://issues.apache.org/jira/browse/HIVE-14426
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14426.2.patch, HIVE-14426.3.patch, HIVE-14426.patch
>
>
> There is an extensive logging in WebHCat at info level, and even some 
> sensitive information could be logged




[jira] [Commented] (HIVE-14342) Beeline output is garbled when executed from a remote shell

2016-08-11 Thread Mohit Sabharwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417553#comment-15417553
 ] 

Mohit Sabharwal commented on HIVE-14342:


Thanks, [~ngangam], LGTM +1

> Beeline output is garbled when executed from a remote shell
> ---
>
> Key: HIVE-14342
> URL: https://issues.apache.org/jira/browse/HIVE-14342
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14342.2.patch, HIVE-14342.patch, HIVE-14342.patch
>
>
> {code}
> use default;
> create table clitest (key int, name String, value String);
> insert into table clitest values 
> (1,"TRUE","1"),(2,"TRUE","1"),(3,"TRUE","1"),(4,"TRUE","1"),(5,"FALSE","0"),(6,"FALSE","0"),(7,"FALSE","0");
> {code}
> then run a select query
> {code} 
> # cat /tmp/select.sql 
> set hive.execution.engine=mr;
> select key,name,value 
> from clitest 
> where value="1" limit 1;
> {code}
> Then run beeline via a remote shell, for example
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/select.sql" 
> root@'s password: 
> 16/07/12 14:59:22 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> nullkey,name,value 
> 1,TRUE,1
> null   
> $
> {code}
> In older releases the output is as follows:
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/run.sql" 
> Are you sure you want to continue connecting (yes/no)? yes
> root@'s password: 
> 16/07/12 14:57:55 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> key,name,value
> 1,TRUE,1
> $
> {code}
> The output contains nulls instead of blank lines. This is due to the use of
> -Djline.terminal=jline.UnsupportedTerminal, which was introduced in HIVE-6758
> so that beeline can run as a background process. The nulls are an unfortunate
> side effect of that fix.
> Running beeline in the background also produces garbled output.
> {code}
> # beeline -u "jdbc:hive2://localhost:1" -n hive -p hive --silent=true 
> --outputformat=csv2 --showHeader=false -f /tmp/run.sql 2>&1 > 
> /tmp/beeline.txt &
> # cat /tmp/beeline.txt 
> null1,TRUE,1   
> #
> {code}
> So I think the use of jline.UnsupportedTerminal should be documented but not 
> used automatically by beeline under the covers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Assigned] (HIVE-14358) Add metrics for number of queries executed for each execution engine (mr, spark, tez)

2016-08-11 Thread Barna Zsombor Klara (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara reassigned HIVE-14358:
--

Assignee: Barna Zsombor Klara

> Add metrics for number of queries executed for each execution engine (mr, 
> spark, tez)
> -
>
> Key: HIVE-14358
> URL: https://issues.apache.org/jira/browse/HIVE-14358
> Project: Hive
>  Issue Type: Task
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Lenni Kuff
>Assignee: Barna Zsombor Klara
>
> HiveServer2 currently has a metric for the total number of queries run since
> the last restart, but it would be useful to also have metrics for the number
> of queries run for each execution engine. This would improve supportability by
> allowing users to get a high-level understanding of what workloads have been
> running on the server.
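
The proposal above amounts to keeping one counter per execution engine next to the existing total. A minimal sketch of that idea follows; it is a toy in-memory registry, and the `QueryMetrics` class and `record_query` method are hypothetical names for illustration, not HiveServer2's metrics API.

```python
from collections import Counter


class QueryMetrics:
    """Toy in-memory registry (hypothetical; not HiveServer2's metrics API)."""

    # Engines named in the issue title.
    ENGINES = ("mr", "spark", "tez")

    def __init__(self):
        self.total = 0               # queries run since "restart"
        self.per_engine = Counter()  # queries run, keyed by engine name

    def record_query(self, engine):
        """Count one executed query against the total and its engine."""
        engine = engine.lower()
        if engine not in self.ENGINES:
            raise ValueError(f"unknown execution engine: {engine}")
        self.total += 1
        self.per_engine[engine] += 1


if __name__ == "__main__":
    metrics = QueryMetrics()
    for engine in ["mr", "tez", "tez", "spark"]:
        metrics.record_query(engine)
    print(metrics.total, dict(metrics.per_engine))
```

Keeping the per-engine counts separate from the total means the existing metric keeps its meaning while the breakdown is added alongside it.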



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Issue Comment Deleted] (HIVE-13328) Cannot Quey Hive External Returning 0 values

2016-08-11 Thread Arun Kumar Srinivasan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-13328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Kumar Srinivasan updated HIVE-13328:
-
Comment: was deleted

(was: Checked the hive external partition table as text file and ORC. Check the 
steps below. 

HIVE PARTITION 
-

// Unix prompt

vi input_2015.txt

1,Name1,Dep1,1981
3,Name3,Dep3,1982

vi input_2016.txt

2,Name2,Dep2,1986
2,Name4,Dep4,1988

hdfs dfs -rm /user/yarn/hive/issue/input*

hdfs dfs -copyFromLocal input*.txt /user/yarn/hive/issue

hdfs dfs -ls -h /user/yarn/hive/issue/input*


// HIVE prompt

drop table test_partition;

CREATE EXTERNAL TABLE IF NOT EXISTS test_partition ( id int, name String, 
department String, dob String)
PARTITIONED BY (yoj INT, moj STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
location '/user/yarn/hive/issue';

LOAD DATA INPATH '/user/yarn/hive/issue/input_2015.txt' INTO TABLE 
test_partition PARTITION (yoj=2015, moj='01');

LOAD DATA INPATH '/user/yarn/hive/issue/input_2016.txt' INTO TABLE 
test_partition PARTITION (yoj=2016, moj='01');

SELECT * FROM test_partition;

// ORC TABLE

CREATE EXTERNAL TABLE IF NOT EXISTS test_partition_orc ( id int, name String, 
department String, dob String)
PARTITIONED BY (yoj INT, moj STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS ORC
location '/user/yarn/hive/issue';

LOAD DATA INPATH '/user/yarn/hive/issue/input_2015.txt' INTO TABLE 
test_partition_orc PARTITION (yoj=2015, moj='01');

LOAD DATA INPATH '/user/yarn/hive/issue/input_2016.txt' INTO TABLE 
test_partition_orc PARTITION (yoj=2016, moj='01');


SELECT * FROM test_partition_orc;
)

> Cannot Quey Hive External Returning 0 values
> 
>
> Key: HIVE-13328
> URL: https://issues.apache.org/jira/browse/HIVE-13328
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.13.0
> Environment: MAPRFS
>Reporter: bharath kumar
>Assignee: Arun Kumar Srinivasan
>Priority: Blocker
>
> Having issues involving the ORC format and an external table. Below is the
> sequence of steps followed.
>  
> created an external table
> using insert overwrite data is populated to external table with partition.
> insert overwrite table external_table partition (data_date='2016-22-03')
> select from (select * from db3.table1
> where data_date = '2016-22-03') i
> left join (select * from db3.table2 where data_date = '2016-22-03') th on 
> i.column1 = th.column1 and
> i.column2 = th.column2;
>  
> But when I query the external table, the data is not present. I tried the
> procedures below:
> ALTER TABLE NAME ADD PARTITION(DATA_DATE='2016-22-03');
> MSCK REPAIR TABLE TABLENAME;
> OK
> Partitions not in metastore:   
> What could be the issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-11 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417513#comment-15417513
 ] 

Sergio Peña commented on HIVE-14270:


[~leftylev] is there a wiki section about S3 or blobstore tables?

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to make things run faster.
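
The decision described above can be sketched as a small path-selection function. This is only an illustration under stated assumptions: the function name `choose_temp_dir`, the set of blobstore schemes, the scratch path, and the `.staging` suffix are all hypothetical and do not come from the Hive patch.

```python
from urllib.parse import urlparse

# Assumption: URI schemes treated as blob stores in this sketch.
BLOBSTORE_SCHEMES = {"s3", "s3a", "s3n"}


def choose_temp_dir(table_location, default_fs, scratch_path="/tmp/hive-scratch"):
    """Pick where intermediate files for an INSERT would be written.

    If the table lives on a blob store and the default filesystem is HDFS,
    return a scratch directory on HDFS; otherwise stage next to the table.
    """
    table_scheme = urlparse(table_location).scheme
    default_scheme = urlparse(default_fs).scheme
    if table_scheme in BLOBSTORE_SCHEMES and default_scheme == "hdfs":
        return default_fs.rstrip("/") + scratch_path
    # Hypothetical staging location beside the table data.
    return table_location.rstrip("/") + "/.staging"


if __name__ == "__main__":
    print(choose_temp_dir("s3a://bucket/warehouse/t1", "hdfs://nn:8020"))
    print(choose_temp_dir("hdfs://nn:8020/warehouse/t1", "hdfs://nn:8020"))
```

Only the final results then need to be moved to S3, which is where the speedup described in the issue would come from.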



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-11 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-14270:
---
Labels: TODOC2.2  (was: )

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to make things run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-11 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-14270:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Thanks [~leftylev] [~ashutoshc] for your review. I just committed this to 
master. 

[~leftylev] I will add the notes to the wiki about new blobstore variables.

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Fix For: 2.2.0
>
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to make things run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417459#comment-15417459
 ] 

Ashutosh Chauhan commented on HIVE-14270:
-

+1
Thanks for your persistence on this one, Sergio! Much appreciated.

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to make things run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Chaoyu Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417435#comment-15417435
 ] 

Chaoyu Tang commented on HIVE-14513:


+1 pending tests

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> LDAP Authenticator can be configured to use a result set from an LDAP query to
> authenticate. However, it is expected that this LDAP query would only return
> a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that
> return users. For example, say you would like to allow "all users from group1
> and group2" to be authenticated. The LDAP query has to return a union of all
> members of group1 and group2.
> For example, one common configuration is that each group contains a list of its
> users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of
> "member" attributes. (LDAP client tools are able to do this by filtering out
> the attributes on these entries.)
> So it would be useful to have support for specifying queries that
> return groups.
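
If a custom query returns group entries, the authenticator would have to resolve them to users by taking the union of their "member" values. A minimal sketch of that step, assuming the entries have already been fetched and are represented as plain dicts; `members_of` is a hypothetical helper, no LDAP library is used, and the group2 entry and the user2 DN are made-up sample data in the style of the entry above.

```python
def members_of(group_entries):
    """Return the union of "member" DNs across entries, in first-seen order."""
    seen = {}
    for entry in group_entries:
        for dn in entry.get("member", []):
            seen.setdefault(dn, None)  # dict keeps insertion order, drops repeats
    return list(seen)


if __name__ == "__main__":
    groups = [
        {
            "dn": "uid=group1,ou=Groups,dc=example,dc=com",
            "member": ["uid=user1,ou=People,dc=example,dc=com"],
        },
        {
            # Hypothetical second group and user, for illustration only.
            "dn": "uid=group2,ou=Groups,dc=example,dc=com",
            "member": [
                "uid=user1,ou=People,dc=example,dc=com",
                "uid=user2,ou=People,dc=example,dc=com",
            ],
        },
    ]
    for dn in members_of(groups):
        print(dn)
```

A user would then be allowed to authenticate if their DN appears in the returned list.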



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-11 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417406#comment-15417406
 ] 

Sergio Peña commented on HIVE-14270:


[~ashutoshc] All tests are passing now; the ones failing are not related.
TestMiniLlapCliDriver and TestMiniTezCliDriver are failing on other patches 
too, and TestJdbcWithMiniHS2 fails on master.

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to make things run faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-08-11 Thread Hive QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417389#comment-15417389
 ] 

Hive QA commented on HIVE-14035:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12823208/HIVE-14035.15.patch

{color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10445 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.ql.TestTxnCommands2.testACIDwithSchemaEvolution
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdate.testACIDwithSchemaEvolution
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/853/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/853/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-853/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12823208 - PreCommit-HIVE-MASTER-Build

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.09.patch, 
> HIVE-14035.10.patch, HIVE-14035.11.patch, HIVE-14035.12.patch, 
> HIVE-14035.13.patch, HIVE-14035.14.patch, HIVE-14035.15.patch, 
> HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not
> allow predicate pushdown if they contain any update/delete events. This is
> done to preserve correctness when following a multi-version approach during
> event collapsing, where an update event overwrites an existing insert event.
> This JIRA proposes to split an update event into a combination of a delete
> event followed by a new insert event, which can enable predicate pushdown to
> all delta files without breaking correctness. To support backward
> compatibility for this feature, this JIRA also proposes to add some sort of
> versioning to ACID that can allow different versions of ACID transactions to
> co-exist.
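
The correctness argument above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Hive's ACID reader: events are applied in order to a dict keyed by row id, "pushdown" is modeled as dropping insert/update events whose value fails the predicate before collapsing, and delete events are always applied. The names `Event`, `split_updates`, and `collapse` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    op: str                      # "insert", "update", or "delete"
    row_id: int
    value: Optional[int] = None  # deletes carry no value


def split_updates(events, next_row_id):
    """Rewrite each update as a delete of the old row plus an insert of a new row."""
    out = []
    for ev in events:
        if ev.op == "update":
            out.append(Event("delete", ev.row_id))
            out.append(Event("insert", next_row_id, ev.value))
            next_row_id += 1
        else:
            out.append(ev)
    return out


def collapse(events, predicate=None):
    """Apply events in order; with a predicate, skip non-matching inserts/updates."""
    rows = {}
    for ev in events:
        if ev.op == "delete":
            rows.pop(ev.row_id, None)
        elif predicate is None or predicate(ev.value):
            rows[ev.row_id] = ev.value
    return rows


if __name__ == "__main__":
    events = [Event("insert", 1, 10), Event("update", 1, 99)]
    wanted = lambda v: v < 50

    # Reference answer: collapse everything, then filter.
    reference = {k: v for k, v in collapse(events).items() if wanted(v)}
    print("reference:", reference)
    # Pushdown with update events: the update is filtered out, stale row survives.
    print("update events:", collapse(events, wanted))
    # Pushdown after splitting: the delete still removes the old row.
    print("split events:", collapse(split_updates(events, next_row_id=2), wanted))
```

In this model the stale row survives pushdown only when the overwrite is encoded as an update; once it is a delete plus an insert, filtering inserts by value can no longer hide the removal of the old row.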



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Naveen Gangam (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-14513:
-
Status: Patch Available  (was: Open)

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> LDAP Authenticator can be configured to use a result set from an LDAP query to
> authenticate. However, it is expected that this LDAP query would only return
> a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that
> return users. For example, say you would like to allow "all users from group1
> and group2" to be authenticated. The LDAP query has to return a union of all
> members of group1 and group2.
> For example, one common configuration is that each group contains a list of its
> users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of
> "member" attributes. (LDAP client tools are able to do this by filtering out
> the attributes on these entries.)
> So it would be useful to have support for specifying queries that
> return groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-14513) Enhance custom query feature in LDAP atn to support resultset of ldap groups

2016-08-11 Thread Naveen Gangam (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-14513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417364#comment-15417364
 ] 

Naveen Gangam commented on HIVE-14513:
--

Review posted to Review Board at https://reviews.apache.org/r/50970/

> Enhance custom query feature in LDAP atn to support resultset of ldap groups
> 
>
> Key: HIVE-14513
> URL: https://issues.apache.org/jira/browse/HIVE-14513
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14513.patch
>
>
> LDAP Authenticator can be configured to use a result set from an LDAP query to
> authenticate. However, it is expected that this LDAP query would only return
> a set of users (aka full DNs for the users in LDAP).
> However, it's not always straightforward to author queries that
> return users. For example, say you would like to allow "all users from group1
> and group2" to be authenticated. The LDAP query has to return a union of all
> members of group1 and group2.
> For example, one common configuration is that each group contains a list of its
> users
>   "dn: uid=group1,ou=Groups,dc=example,dc=com",
>   "distinguishedName: uid=group1,ou=Groups,dc=example,dc=com",
>   "objectClass: top",
>   "objectClass: groupOfNames",
>   "objectClass: ExtensibleObject",
>   "cn: group1",
>   "ou: Groups",
>   "sn: group1",
>   "member: uid=user1,ou=People,dc=example,dc=com",
> The query 
> {{(&(objectClass=groupOfNames)(|(cn=group1)(cn=group2)))}}
> will return the entries
> uid=group1,ou=Groups,dc=example,dc=com
> uid=group2,ou=Groups,dc=example,dc=com
> but there is no means to form a query that would return just the values of
> "member" attributes. (LDAP client tools are able to do this by filtering out
> the attributes on these entries.)
> So it would be useful to have support for specifying queries that
> return groups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
