[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180288#comment-16180288
 ] 

liyunzhang_intel commented on HIVE-17545:
-

[~lirui]: thanks for the explanation. If the cache is disabled, then even when 
equivalent works are combined, the computation for the same work is still executed.

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.
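The toggle described above can be sketched as a config flag that defaults to enabled. This is a minimal illustration, not the actual patch; the config key name here is a hypothetical stand-in.

```java
import java.util.Map;

public class CacheToggle {
    // Hypothetical config key; the real key name would come from the patch.
    static final String KEY = "hive.spark.rdd.cache.enabled";

    // Enabled unless explicitly set to false, preserving backwards compatibility.
    public static boolean shouldCache(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(KEY, "true"));
    }
}
```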



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel

2017-09-25 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180289#comment-16180289
 ] 

Tao Li commented on HIVE-17585:
---

[~sershe] Thanks for the comments. I thought about that and was a little 
concerned with the memory overhead to have a new Hive instance per thread 
(especially the embedded metastore scenario), so I chose to go with the 
singleton approach. But I agree that having the thread local will give us the 
best safety.

Regarding the latency, the major advantage of loading partitions in 
parallel is making the HDFS calls in parallel, so synchronizing on the 
metastore client should not be a big concern.
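The trade-off described above can be sketched as follows: the expensive filesystem work runs in parallel, while the short metastore call is serialized on a shared lock. This is an illustrative model with a stand-in list for the shared client, not Hive's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelPartitionLoad {
    // Stand-in for the shared, non-thread-safe metastore client state.
    private final List<String> metastoreState = new ArrayList<>();
    private final Object clientLock = new Object();

    private void loadPartition(String name) {
        // Simulated HDFS move: this part runs fully in parallel.
        String moved = "moved:" + name;
        // Metastore registration: a short call, so serializing it is cheap.
        synchronized (clientLock) {
            metastoreState.add(moved);
        }
    }

    public List<String> loadAll(List<String> parts) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String p : parts) {
            pool.submit(() -> loadPartition(p));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (clientLock) {
            return new ArrayList<>(metastoreState);
        }
    }
}
```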

> Improve thread safety when loading dynamic partitions in parallel
> -
>
> Key: HIVE-17585
> URL: https://issues.apache.org/jira/browse/HIVE-17585
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch
>
>






[jira] [Updated] (HIVE-17605) VectorizedOrcInputFormat initialization is expensive in populating partition values

2017-09-25 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-17605:

Attachment: VectorizedOrcInputFormat_init.png

> VectorizedOrcInputFormat initialization is expensive in populating partition 
> values
> ---
>
> Key: HIVE-17605
> URL: https://issues.apache.org/jira/browse/HIVE-17605
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Priority: Minor
> Attachments: VectorizedOrcInputFormat_init.png
>
>






[jira] [Updated] (HIVE-17604) Add druid properties to conf white list

2017-09-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17604:

Status: Patch Available  (was: Open)

> Add druid properties to conf white list
> ---
>
> Key: HIVE-17604
> URL: https://issues.apache.org/jira/browse/HIVE-17604
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Druid integration
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17604.patch
>
>
> Currently throws:
> Error: Error while processing statement: Cannot modify 
> hive.druid.select.distribute at runtime. It is not in list of params that are 
> allowed to be modified at runtime (state=42000,code=1)
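The whitelist check that produces the error above can be sketched like this. Hive's actual whitelist is driven by a configurable regex; the pattern below is an illustrative assumption that happens to admit the druid property in question.

```java
import java.util.regex.Pattern;

public class ConfWhitelist {
    // Illustrative pattern only; the real whitelist regex is configurable.
    static final Pattern MODIFIABLE = Pattern.compile("hive\\.(exec|druid)\\..*");

    public static boolean isModifiable(String key) {
        return MODIFIABLE.matcher(key).matches();
    }

    public static void set(String key, String value) {
        if (!isModifiable(key)) {
            throw new IllegalArgumentException("Cannot modify " + key
                + " at runtime. It is not in list of params that are allowed"
                + " to be modified at runtime");
        }
        // ... apply the setting ...
    }
}
```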





[jira] [Updated] (HIVE-17604) Add druid properties to conf white list

2017-09-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17604:

Attachment: HIVE-17604.patch

[~thejas] Can you please review?

> Add druid properties to conf white list
> ---
>
> Key: HIVE-17604
> URL: https://issues.apache.org/jira/browse/HIVE-17604
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Druid integration
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-17604.patch
>
>
> Currently throws:
> Error: Error while processing statement: Cannot modify 
> hive.druid.select.distribute at runtime. It is not in list of params that are 
> allowed to be modified at runtime (state=42000,code=1)





[jira] [Assigned] (HIVE-17604) Add druid properties to conf white list

2017-09-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan reassigned HIVE-17604:
---


> Add druid properties to conf white list
> ---
>
> Key: HIVE-17604
> URL: https://issues.apache.org/jira/browse/HIVE-17604
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration, Druid integration
>Affects Versions: 2.2.0, 2.3.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>
> Currently throws:
> Error: Error while processing statement: Cannot modify 
> hive.druid.select.distribute at runtime. It is not in list of params that are 
> allowed to be modified at runtime (state=42000,code=1)





[jira] [Updated] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-17568:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, 
> HIVE-17568.03.patch
>
>
> Joining 2 tables on at least 1 column whose types differ 
> (integer/double, for example).
> The calcite expressions require double/integer inputs, which become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}
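The invariant being violated above can be stated as a simple guard: a predicate inferred on one join branch may only be pushed to the other branch if the referenced column types line up; otherwise it must be cast or dropped, or Calcite fails with the type-mismatch assertion. A minimal sketch of that check:

```java
public class TransitivePredicateCheck {
    enum SqlType { INTEGER, DOUBLE, STRING }

    // Push only when the predicate's expected input type matches the
    // target branch's column type; a mismatch is what the AssertionError
    // in the stack trace reports (DOUBLE vs INTEGER).
    public static boolean canPush(SqlType predicateInputType, SqlType targetColumnType) {
        return predicateInputType == targetColumnType;
    }
}
```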





[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180251#comment-16180251
 ] 

Thejas M Nair commented on HIVE-17483:
--

+1 pending tests


> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, 
> HIVE-17483.8.patch, HIVE-17483.9.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require the admin role.
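The authorize-then-kill flow described above can be sketched with a registry of running queries. This is a simplified model (the class, its admin set, and the Future-based cancellation are all illustrative assumptions, not HS2's actual API):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Future;

public class KillQueryService {
    private final Map<String, Future<?>> running = new ConcurrentHashMap<>();
    private final Set<String> admins;

    public KillQueryService(Set<String> admins) { this.admins = admins; }

    public void register(String queryId, Future<?> task) {
        running.put(queryId, task);
    }

    // Authorize first, then cancel; mirrors the admin-role requirement.
    public boolean killQuery(String user, String queryId) {
        if (!admins.contains(user)) {
            throw new SecurityException(user + " is not allowed to kill queries");
        }
        Future<?> f = running.remove(queryId);
        return f != null && f.cancel(true);
    }
}
```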





[jira] [Commented] (HIVE-17603) LLAP: Print counters for llap_text.q for validating LLAP IO usage

2017-09-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180241#comment-16180241
 ] 

Prasanth Jayachandran commented on HIVE-17603:
--

cc/ [~sershe]


> LLAP: Print counters for llap_text.q for validating LLAP IO usage
> -
>
> Key: HIVE-17603
> URL: https://issues.apache.org/jira/browse/HIVE-17603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> The LLAP text cache test is not included in the minillap test suite; we could 
> also print LLAP IO counters as part of the q file output to validate LLAP IO 
> usage and catch regressions. 





[jira] [Commented] (HIVE-17400) Estimate stats in absence of stats for complex types

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180237#comment-16180237
 ] 

Hive QA commented on HIVE-17400:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888987/HIVE-17400.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11061 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] 
(batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lateral_view]
 (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lvj_mapjoin]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_complex]
 (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_complex]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_complex]
 (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex]
 (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join_result_complex]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
 (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_join]
 (batchId=154)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout (batchId=288)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6983/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6983/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6983/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888987 - PreCommit-HIVE-Build

> Estimate stats in absence of stats for complex types
> 
>
> Key: HIVE-17400
> URL: https://issues.apache.org/jira/browse/HIVE-17400
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17400.1.patch
>
>
> HIVE-16811 adds support for estimation of stats for primitive types if it 
> doesn't exist. This JIRA is to extend that support for complex data types.
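Estimation in the absence of stats amounts to assigning rough per-type sizes and recursing into complex types. The sketch below illustrates the idea for lists; every number in it is an illustrative assumption, not Hive's actual defaults.

```java
public class ComplexTypeStats {
    // Rough per-type size guesses (bytes) used only when real column stats
    // are absent; all numbers here are illustrative assumptions.
    public static long estimate(String type) {
        if (type.startsWith("array<") && type.endsWith(">")) {
            // Assume an average of 4 elements per list, recursing on the element type.
            return 4 * estimate(type.substring("array<".length(), type.length() - 1));
        }
        switch (type) {
            case "int":    return 4;
            case "bigint":
            case "double": return 8;
            case "string": return 100; // assumed average string length
            default:       return 8;   // fallback guess
        }
    }
}
```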





[jira] [Commented] (HIVE-17602) Explain plan not working

2017-09-25 Thread Vineet Garg (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180225#comment-16180225
 ] 

Vineet Garg commented on HIVE-17602:


cc [~jcamachorodriguez]

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] 
> exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>   at 
> org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}
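The "wrong number of arguments" in the stack trace is the IllegalArgumentException that `Method.invoke` throws when the supplied argument array doesn't match the method's parameter count. A small sketch of guarding the reflective call (the class and helper names here are illustrative, not ExplainTask's code):

```java
import java.lang.reflect.Method;

public class SafeInvoke {
    // Checking the arity up front turns the opaque reflection error into
    // a clearer one naming the method and the expected count.
    public static Object invokeChecked(Method m, Object target, Object... args) {
        if (m.getParameterCount() != args.length) {
            throw new IllegalStateException("expected " + m.getParameterCount()
                    + " arguments for " + m.getName() + " but got " + args.length);
        }
        try {
            return m.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    public static int plusOne(int x) { return x + 1; }

    public static int demoPlusOne(int x) {
        try {
            Method m = SafeInvoke.class.getMethod("plusOne", int.class);
            return (Integer) invokeChecked(m, null, x);
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }
}
```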





[jira] [Updated] (HIVE-17602) Explain plan not working

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17602:
---
Attachment: HIVE-17602.1.patch

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}





[jira] [Updated] (HIVE-17602) Explain plan not working

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17602:
---
Status: Patch Available  (was: Open)

> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17602.1.patch
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}





[jira] [Assigned] (HIVE-17602) Explain plan not working

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-17602:
--


> Explain plan not working
> 
>
> Key: HIVE-17602
> URL: https://issues.apache.org/jira/browse/HIVE-17602
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 3.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Critical
> Fix For: 3.0.0
>
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}





[jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180195#comment-16180195
 ] 

Sergey Shelukhin commented on HIVE-17502:
-

The config option sounds good to me, esp. if we can limit it to HS2 pool 
sessions that are not ever directly reused anyway. [~thejas] wdyt?
Also, do we have a list, or a notion, of why the SessionState/HiveSessionImpl 
objects couldn't be used in parallel?
ThreadLocal is not an obstacle in itself but rather an artifact of not having 
good dependency-injection-type logic for most of Hive compilation, similar to 
other globals.
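
The thread confinement that ThreadLocal-backed globals impose can be seen in a 
plain-Java sketch (hypothetical names, stdlib only — not Hive's actual 
classes): state parked in a ThreadLocal is invisible to any other thread, which 
is why such session objects can't simply be shared by parallel workers without 
restructuring.

```java
public class ThreadLocalDemo {
    // Global, thread-confined state, similar in spirit to a SessionState.get()
    // style accessor (hypothetical, not the Hive class).
    static final ThreadLocal<String> SESSION = new ThreadLocal<>();

    // Sets a session on the calling thread, then asks a new thread what it sees.
    static String workerView() throws InterruptedException {
        SESSION.set("session-1");
        final String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = SESSION.get());
        worker.start();
        worker.join();
        return seen[0]; // null: the ThreadLocal value is not shared across threads
    }

    public static void main(String[] args) throws InterruptedException {
        String workerSaw = workerView();
        System.out.println("caller sees: " + SESSION.get()); // session-1
        System.out.println("worker saw:  " + workerSaw);     // null
    }
}
```

Passing the session object explicitly (injection-style) instead of reading it 
from a ThreadLocal is what removes this restriction.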

> Reuse of default session should not throw an exception in LLAP w/ Tez
> -
>
> Key: HIVE-17502
> URL: https://issues.apache.org/jira/browse/HIVE-17502
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 2.1.1, 2.2.0
> Environment: HDP 2.6.1.0-129, Hue 4
>Reporter: Thai Bui
>Assignee: Thai Bui
>
> Hive2 w/ LLAP on Tez doesn't allow a currently used default session to be 
> skipped, mostly because of this line: 
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365.
> However, some clients, such as Hue 4, allow multiple sessions per user. Under 
> this configuration, a Thrift client will send a request to either reuse an 
> existing session or open a new one. The reuse request could include the 
> session id of a snippet currently executing in Hue, which causes HS2 to throw 
> an exception:
> {noformat}
> 2017-09-10T17:51:36,548 INFO  [Thread-89]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user: 
> hive, session user: hive
> 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task 
> (TezTask.java:execute(232)) - Failed to execute tez graph.
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session 
> sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive, 
> doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have 
> been returned to the pool
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544)
>  ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) 
> [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129]
> {noformat}
> Note that every query is issued as the single 'hive' user to share the LLAP 
> daemon pool, and a pre-determined number of AMs is initialized at setup time. 
> Thus, HS2 should allow new sessions from a Thrift client to be used outside 
> of the pool, or an existing session to be skipped and an unused session from 
> the pool to be returned. The logic that throws an exception in 
> `canWorkWithSameSession` doesn't make sense to me.
> I have a solution to fix this issue in my local branch at 
> https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70.
>  When applied, the log will become like so
> {noformat}
> 2017-09-10T09:15:33,578 INFO  [Thread-239]: tez.TezSessionPoolManager 
> (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default 
> session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap, 
> user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms 
> since it is being used.
> {noformat}
> A test case is provided in my branch to demonstrate how it works. If 
> possible, I would like this patch to be applied to versions 2.1, 2.2, and 
> master. Since we are using 2.1 LLAP in production with Hue 4, this patch is 
> critical to our success.
> Alternatively, if this patch is too broad in scope, I propose adding an 
> option to allow "skipping of currently used default sessions". With this new 
> option defaulting to "false", existing behavior won't change unless the 
> option is turned on.
> I will prepare an official patch if this change to master and/or the other 
> branches is acceptable. I'm not a contributor or committer; this will be my 
> first time contributing to Hive and the Apache Foundation. Any early review 
> is greatly appreciated, thanks!
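
The proposed behavior — skip a busy default session rather than throw — can be 
sketched outside Hive with hypothetical names (this is not the actual 
TezSessionPoolManager code, just an illustration of the skip-and-continue 
logic):

```java
import java.util.ArrayList;
import java.util.List;

public class SessionPoolSketch {
    static final class PoolSession {
        final String id;
        boolean inUse;
        PoolSession(String id) { this.id = id; }
    }

    // Instead of throwing when a default session is still in use (as the
    // current canWorkWithSameSession check effectively does), skip it and
    // return the next free session from the pool.
    static PoolSession getSession(List<PoolSession> pool) {
        for (PoolSession s : pool) {
            if (s.inUse) {
                System.out.println("Skipping default session " + s.id
                    + " since it is being used.");
                continue;
            }
            s.inUse = true;
            return s;
        }
        return null; // pool exhausted; the caller may open a new session
    }

    public static void main(String[] args) {
        List<PoolSession> pool = new ArrayList<>();
        pool.add(new PoolSession("s1"));
        pool.add(new PoolSession("s2"));
        pool.get(0).inUse = true;                           // s1 runs a Hue snippet
        System.out.println("got: " + getSession(pool).id);  // s2, no exception
    }
}
```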




[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.9.patch

Fixed failing unit tests

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, 
> HIVE-17483.8.patch, HIVE-17483.9.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.





[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-17586:
---
Attachment: HIVE-17586.1.patch

> Make HS2 BackgroundOperationPool not fixed
> --
>
> Key: HIVE-17586
> URL: https://issues.apache.org/jira/browse/HIVE-17586
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-17586.1.patch, HIVE-17586.patch
>
>
> Currently the threadpool for background asynchronous operations has a fixed 
> size controlled by {{hive.server2.async.exec.threads}}. However, the thread 
> factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, 
> which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the 
> threads are never actually killed, defeating the purpose of the garbage 
> cleanup noted in the thread class name. Moreover, since these threads never 
> go away, significant resources such as thread-local variables (classloaders, 
> HiveConfs, etc.) are held even when no operation is running. This can lead to 
> escalated HS2 memory usage.
> Ideally, the threadpool should not be fixed, allowing threads to die out so 
> resources can be reclaimed. The existing config 
> {{hive.server2.async.exec.threads}} would be treated as the max, and we can 
> add a min for the threadpool, {{hive.server2.async.exec.min.threads}}. The 
> default value for this config is -1, which keeps the existing behavior.
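
The shrinkable pool can be approximated with a stock ThreadPoolExecutor. The 
config names in the comments refer to the proposal above, and the handling of 
the minimum is a simplification: with an unbounded work queue the pool never 
grows past corePoolSize, so the only way to let idle threads die is 
allowCoreThreadTimeOut, which effectively lets the pool shrink to zero; a 
nonzero floor would need extra logic.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncPoolSketch {
    // "maxThreads" plays the role of hive.server2.async.exec.threads and
    // "minThreads" of the proposed hive.server2.async.exec.min.threads.
    static ThreadPoolExecutor newPool(int maxThreads, int minThreads, long keepAliveSec) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads, keepAliveSec, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        // minThreads = -1 (the proposed default) keeps today's fixed-pool
        // behavior; minThreads >= 0 lets idle threads time out so their
        // thread-local garbage (classloaders, HiveConfs) can be collected.
        if (minThreads >= 0) {
            pool.allowCoreThreadTimeOut(true);
        }
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(100, 0, 10);
        System.out.println("max threads: " + pool.getMaximumPoolSize());
        System.out.println("idle threads can die: " + pool.allowsCoreThreadTimeOut());
        pool.shutdown();
    }
}
```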





[jira] [Commented] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180154#comment-16180154
 ] 

Hive QA commented on HIVE-17111:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888976/HIVE-17111.1.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11063 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] 
(batchId=64)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6982/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6982/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6982/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888976 - PreCommit-HIVE-Build

> TestSparkCliDriver does not use LocalHiveSparkClient
> 
>
> Key: HIVE-17111
> URL: https://issues.apache.org/jira/browse/HIVE-17111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17111.1.patch
>
>
> The TestSparkCliDriver sets spark.master to local-cluster[2,2,1024], but 
> HoS still decides to use the RemoteHiveSparkClient rather than the 
> LocalHiveSparkClient.
> The issue is with the following check in HiveSparkClientFactory:
> {code}
> if (master.equals("local") || master.startsWith("local[")) {
>   // With local spark context, all user sessions share the same spark 
> context.
>   return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf));
> } else {
>   return new RemoteHiveSparkClient(hiveconf, sparkConf);
> }
> {code}
> The {{master.startsWith("local[")}} check sees that the value of 
> spark.master, {{local-cluster[2,2,1024]}}, doesn't start with {{local[}}, so 
> it decides to use the RemoteHiveSparkClient.
> We should fix this so that the LocalHiveSparkClient is used. It should speed 
> up some of the tests, and also makes qtests easier to debug since everything 
> will now be run in the same process.
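
A minimal sketch of the string check (plain Java, hypothetical method name) 
shows why {{local-cluster[2,2,1024]}} falls through to the remote client, and 
one possible shape of the fix — widening the check so local-cluster masters 
also use the in-process client:

```java
public class SparkMasterCheck {
    // Mirrors the shape of the HiveSparkClientFactory condition quoted above,
    // with the proposed extra clause for local-cluster masters.
    static boolean isLocalMaster(String master) {
        return master.equals("local")
            || master.startsWith("local[")
            || master.startsWith("local-cluster[");   // proposed addition
    }

    public static void main(String[] args) {
        System.out.println(isLocalMaster("local[4]"));                // true
        System.out.println(isLocalMaster("local-cluster[2,2,1024]")); // true with the fix
        System.out.println(isLocalMaster("yarn"));                    // false
    }
}
```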





[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180152#comment-16180152
 ] 

Rui Li commented on HIVE-17545:
---

[~kellyzly], if you apply two different transformations to an RDD, that RDD 
will be evaluated twice when we compute the child RDDs. To avoid this, you need 
to cache the RDD. So if we combine equivalent works w/o caching them, we can't 
get rid of the duplicated computations. The descriptions of HIVE-10550 and 
HIVE-10844 also mention how combining works depends on RDD caching.
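
The recomputation effect described here can be mimicked without Spark: an 
uncached parent is evaluated once per child transformation, while a cached 
(materialized) parent is evaluated once in total. A stdlib-only sketch with a 
counter standing in for an expensive parent RDD:

```java
public class RecomputeDemo {
    static int evaluations = 0;

    // Stand-in for an expensive parent RDD computation.
    static int parent() {
        evaluations++;
        return 21;
    }

    public static void main(String[] args) {
        evaluations = 0;
        // Two children over an uncached parent: the parent runs twice.
        int child1 = parent() * 2;
        int child2 = parent() + 1;
        System.out.println("uncached: parent ran " + evaluations + " times");

        evaluations = 0;
        // "Caching" the parent: evaluate once, reuse the materialized result.
        int cached = parent();
        child1 = cached * 2;
        child2 = cached + 1;
        System.out.println("cached:   parent ran " + evaluations + " time");
    }
}
```

This is why combining equivalent works without caching the shared parent still 
pays the computation cost once per child.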

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180135#comment-16180135
 ] 

liyunzhang_intel commented on HIVE-17545:
-

[~lirui]:  {quote}

if user turns on combining equivalent works and turns off RDD caching, then 
there won't be perf improvement right?
{quote}
If users turn on combining equivalent works, duplicated map/reduce works will 
be removed. The performance will not change whether RDD caching is enabled or 
not.
In HoS, caching will be enabled only when the parent spark work has more than 
[1 child|https://github.com/kellyzly/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java#L264].
If my understanding is not right, tell me.




> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-16426) Query cancel: improve the way to handle files

2017-09-25 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180123#comment-16180123
 ] 

Yongzhi Chen commented on HIVE-16426:
-

When a cancel happens, releaseDriverContext() will be called. It calls 
driverCxt.shutdown(), which shuts down every related running task by calling 
each task's shutdown method. How shutdown is implemented depends on the task; 
for example, for a MapReduce task, shutdown mainly kills the job.
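
The fan-out pattern described here can be sketched as follows (hypothetical 
names, not the actual DriverContext/Task classes): the context iterates its 
running tasks, and each task type supplies its own shutdown behavior.

```java
import java.util.Arrays;
import java.util.List;

public class ShutdownSketch {
    interface Task { void shutdown(); }

    // Sketch of the driverCxt.shutdown() pattern: the context fans the cancel
    // out, and each task decides what "shutdown" means for it.
    static void shutdownAll(List<Task> running) {
        for (Task t : running) {
            t.shutdown();
        }
    }

    public static void main(String[] args) {
        Task mr   = () -> System.out.println("MapReduceTask: killing the job");
        Task move = () -> System.out.println("MoveTask: stopping the file copy");
        shutdownAll(Arrays.asList(mr, move));
    }
}
```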

> Query cancel: improve the way to handle files
> -
>
> Key: HIVE-16426
> URL: https://issues.apache.org/jira/browse/HIVE-16426
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-16426.1.patch
>
>
> 1. Add data structure support to make it easy to check the query cancel 
> status.
> 2. Handle query cancel more gracefully. Remove possible file leaks caused by 
> query cancel, as shown in the following stack:
> {noformat}
> 2017-04-11 09:57:30,727 WARN  org.apache.hadoop.hive.ql.exec.Utilities: 
> [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories.
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1496)
> at org.apache.hadoop.ipc.Client.call(Client.java:1439)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy20.delete(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy21.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.clearWork(Utilities.java:277)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:463)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1202)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:303)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:316)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> 3. Add checkpoints to related file operations to improve response time for 
> query cancelling. 
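
Two of the ideas above — checkpoints that fail fast on cancellation, and 
cleanup that is not killed mid-RPC so files aren't leaked — can be sketched in 
plain Java (hypothetical names, with the thread interrupt flag standing in for 
the query cancel status):

```java
public class CancelHandlingSketch {
    // Checkpoint for long-running loops: fail fast once the query is cancelled.
    static void checkpoint() throws InterruptedException {
        if (Thread.currentThread().isInterrupted()) {
            throw new InterruptedException("query cancelled");
        }
    }

    // Cleanup that must not be killed mid-RPC: mask the interrupt while the
    // delete runs, then restore it, so tmp files are not leaked as in the
    // InterruptedIOException stack above.
    static boolean cleanupTmpDirs() {
        boolean wasInterrupted = Thread.interrupted();
        try {
            System.out.println("deleting tmp dirs");  // the real code would issue FS deletes
            return true;
        } finally {
            if (wasInterrupted) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        checkpoint();                         // not cancelled yet: no-op
        Thread.currentThread().interrupt();   // simulate a cancel
        boolean ok = cleanupTmpDirs();        // still completes
        System.out.println("cleanup ran: " + ok
            + ", interrupt preserved: " + Thread.interrupted());
    }
}
```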





[jira] [Commented] (HIVE-17373) Upgrade some dependency versions

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180122#comment-16180122
 ] 

Sergey Shelukhin commented on HIVE-17373:
-

An astute observation ;)
Should we revert the upgrade and re-do it with the test fixed? I don't think it 
makes sense to upgrade Accumulo if that breaks all the Accumulo tests.
Alternatively, we can remove the test if it's not needed.

> Upgrade some dependency versions
> 
>
> Key: HIVE-17373
> URL: https://issues.apache.org/jira/browse/HIVE-17373
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 3.0.0
>
> Attachments: HIVE-17373.1.patch, HIVE-17373.2.patch
>
>
> Upgrade some libraries including log4j to 2.8.2, accumulo to 1.8.1 and 
> commons-httpclient to 3.1. 





[jira] [Updated] (HIVE-17400) Estimate stats in absence of stats for complex types

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17400:
---
Attachment: HIVE-17400.1.patch

> Estimate stats in absence of stats for complex types
> 
>
> Key: HIVE-17400
> URL: https://issues.apache.org/jira/browse/HIVE-17400
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17400.1.patch
>
>
> HIVE-16811 adds support for estimating stats for primitive types when stats 
> don't exist. This JIRA is to extend that support to complex data types.





[jira] [Updated] (HIVE-17400) Estimate stats in absence of stats for complex types

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17400:
---
Status: Patch Available  (was: Open)

> Estimate stats in absence of stats for complex types
> 
>
> Key: HIVE-17400
> URL: https://issues.apache.org/jira/browse/HIVE-17400
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17400.1.patch
>
>
> HIVE-16811 adds support for estimating stats for primitive types when stats 
> don't exist. This JIRA is to extend that support to complex data types.





[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180111#comment-16180111
 ] 

Rui Li commented on HIVE-17545:
---

Hi [~stakiar], since we have a switch to turn off combining equivalent works, 
why do we need another config to turn off RDD caching? More importantly, if a 
user turns on combining equivalent works and turns off RDD caching, then there 
won't be any perf improvement, right?

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Resolved] (HIVE-17474) Poor Performance about subquery like DS/query70 on HoS

2017-09-25 Thread liyunzhang_intel (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyunzhang_intel resolved HIVE-17474.
-
Resolution: Not A Problem

> Poor Performance about subquery like DS/query70 on HoS
> --
>
> Key: HIVE-17474
> URL: https://issues.apache.org/jira/browse/HIVE-17474
> Project: Hive
>  Issue Type: Bug
>Reporter: liyunzhang_intel
> Attachments: explain.70.after.analyze, explain.70.before.analyze, 
> explain.70.vec
>
>
> in 
> [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql].
>  {code}
> select  
> sum(ss_net_profit) as total_sum
>,s_state
>,s_county
>,grouping__id as lochierarchy
>, rank() over(partition by grouping__id, case when grouping__id == 2 then 
> s_state end order by sum(ss_net_profit)) as rank_within_parent
> from
> store_sales ss join date_dim d1 on d1.d_date_sk = ss.ss_sold_date_sk
> join store s on s.s_store_sk  = ss.ss_store_sk
>  where
> d1.d_month_seq between 1193 and 1193+11
>  and s.s_state in
>  ( select s_state
>from  (select s_state as s_state, sum(ss_net_profit),
>  rank() over ( partition by s_state order by 
> sum(ss_net_profit) desc) as ranking
>   from   store_sales, store, date_dim
>   where  d_month_seq between 1193 and 1193+11
> and date_dim.d_date_sk = 
> store_sales.ss_sold_date_sk
> and store.s_store_sk  = store_sales.ss_store_sk
>   group by s_state
>  ) tmp1 
>where ranking <= 5
>  )
>  group by s_state,s_county with rollup
> order by
>lochierarchy desc
>   ,case when lochierarchy = 0 then s_state end
>   ,rank_within_parent
>  limit 100;
> {code}
>  Let's analyze the query:
> part 1: it computes the sub-query and gets the states whose ranking by 
> sum(ss_net_profit) is <= 5.
> part 2: the big table store_sales is joined with the small tables date_dim 
> and store to get the result.
> part 3: part 1 join part 2
> The problem is in part 3, which is a common join. The cardinality of part 1 
> and part 2 is low, as there are not many different values for state 
> (actually there are 30 distinct values in the table store). With a common 
> join, a large amount of data goes to only 30 reducers.





[jira] [Commented] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180102#comment-16180102
 ] 

Sergey Shelukhin commented on HIVE-17585:
-

Hmm... wouldn't a simpler solution be to run Hive.get() instead of 
synchronizing a set of methods called by loadPartition, given that the Hive 
object is not thread safe, so the original code uses it incorrectly by calling 
loadPartition from multiple threads?
If someone changes what loadPartition calls, this will break again as far as I 
can tell. And it's not good to change every method to use a synchronized MSC; 
that will just be a perf hit.
Unless I'm missing something.

> Improve thread safety when loading dynamic partitions in parallel
> -
>
> Key: HIVE-17585
> URL: https://issues.apache.org/jira/browse/HIVE-17585
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch
>
>






[jira] [Comment Edited] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180102#comment-16180102
 ] 

Sergey Shelukhin edited comment on HIVE-17585 at 9/26/17 2:22 AM:
--

Hmm... wouldn't a simpler solution be to run Hive.get() on the callable (make 
the callable static to make sure it cannot access "this"), given that the Hive 
object is not thread safe, so the original code uses it incorrectly by calling 
loadPartition from multiple threads?
Right now it synchronizes a set of methods called by loadPartition; if someone 
changes what loadPartition calls, this will break again as far as I can tell. 
And it's not good to change every method to use a synchronized MSC; that will 
just be a perf hit.
Unless I'm missing something.


was (Author: sershe):
Hmm... wouldn't a simpler solution be to run Hive.get() instead of 
synchronizing a set of methods called by loadPartition, given that Hive object 
is not thread safe and so the original code uses it incorrectly by calling 
loadPartition from multiple threads?
If someone changes what loadPartition calls, this will break again as far as I 
can tell. And it's not good to change every method to use synchronized MSC, 
that will just be a perf hit.
Unless I'm missing something.

> Improve thread safety when loading dynamic partitions in parallel
> -
>
> Key: HIVE-17585
> URL: https://issues.apache.org/jira/browse/HIVE-17585
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Fix For: 3.0.0
>
> Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch
>
>






[jira] [Commented] (HIVE-17594) Unit format error in Copy.java

2017-09-25 Thread Saijin Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180097#comment-16180097
 ] 

Saijin Huang commented on HIVE-17594:
-

[~dmtolpeko], the examples are listed above. Can you please take a quick look 
and commit?

> Unit format error in Copy.java
> --
>
> Key: HIVE-17594
> URL: https://issues.apache.org/jira/browse/HIVE-17594
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-17594.1.patch
>
>
> In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual 
> value "rows/elapsed/1000.0".





[jira] [Commented] (HIVE-17601) improve error handling in LlapServiceDriver

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180087#comment-16180087
 ] 

Hive QA commented on HIVE-17601:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888972/HIVE-17601.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11055 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] 
(batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6981/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6981/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6981/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888972 - PreCommit-HIVE-Build

> improve error handling in LlapServiceDriver
> ---
>
> Key: HIVE-17601
> URL: https://issues.apache.org/jira/browse/HIVE-17601
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17601.patch
>
>






[jira] [Commented] (HIVE-17587) Remove unnecessary filter from getPartitionsFromPartitionIds call

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180071#comment-16180071
 ] 

Sergey Shelukhin commented on HIVE-17587:
-

+1

> Remove unnecessary filter from getPartitionsFromPartitionIds call
> -
>
> Key: HIVE-17587
> URL: https://issues.apache.org/jira/browse/HIVE-17587
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-17587.1.patch
>
>






[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread liyunzhang_intel (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180063#comment-16180063
 ] 

liyunzhang_intel commented on HIVE-17545:
-

[~stakiar]: sounds good. But I don't know why the cache optimization was not 
configurable before. [~lirui]: as you are more familiar with the code, can you 
take some time to look?

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.
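Making an optimizer rule configurable usually amounts to a boolean conf check at the top of the rule. A minimal sketch — the property key below is illustrative only, not the conf name introduced by the HIVE-17545 patch:

```java
import java.util.Map;

// Hypothetical gate for an optimizer rule. The property key is an
// illustrative stand-in, not the key added by HIVE-17545.
public class CacheOptimizationGate {
    static final String KEY = "hive.spark.rdd.cache.optimization.enabled";

    public static boolean shouldRunCachingOptimization(Map<String, String> conf) {
        // Defaulting to "true" preserves the existing always-on behavior.
        return Boolean.parseBoolean(conf.getOrDefault(KEY, "true"));
    }

    public static void main(String[] args) {
        System.out.println(shouldRunCachingOptimization(Map.of()));             // true
        System.out.println(shouldRunCachingOptimization(Map.of(KEY, "false"))); // false
    }
}
```

The optimizer would consult this gate once per query compilation and simply skip the RDD-caching rewrite when it returns false.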



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-09-25 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180020#comment-16180020
 ] 

Vihang Karajgaonkar commented on HIVE-17371:


Thanks [~alangates]. [~thejas] or [~vgumashta] Can you please take a look at 
this change and confirm if this approach makes sense to you? Thanks!

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}
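The failure mode described above — a store class that still implements the old shims interface while {{asSubclass}} expects the new metastore one — can be reproduced with plain JDK reflection. A minimal sketch with stand-in interfaces (the names below are illustrative, not the real Hive types):

```java
// Minimal reproduction of the interface mismatch: Class.asSubclass throws
// ClassCastException when the loaded class implements a different (old)
// interface than the one the caller expects.
public class TokenStoreMismatch {
    // Stand-ins for the old shims interface and the new metastore interface.
    interface OldDelegationTokenStore {}
    interface NewDelegationTokenStore {}

    // Like DBTokenStore/ZKTokenStore today: implements only the old interface.
    public static class DbTokenStore implements OldDelegationTokenStore {}

    // A store that has been migrated to the new interface.
    public static class GoodStore implements NewDelegationTokenStore {}

    // Mirrors the getTokenStore() pattern: load by name, cast to the new type.
    static NewDelegationTokenStore load(String className) throws Exception {
        Class<? extends NewDelegationTokenStore> storeClass =
            Class.forName(className).asSubclass(NewDelegationTokenStore.class);
        return storeClass.getDeclaredConstructor().newInstance();
    }

    public static boolean failsToLoad(String className) {
        try {
            load(className);
            return false;
        } catch (ClassCastException e) {
            return true;  // the mismatch described in the comment above
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsToLoad(DbTokenStore.class.getName()));  // true
        System.out.println(failsToLoad(GoodStore.class.getName()));     // false
    }
}
```

So until the stores are moved to (or bridged with) the metastore-module interface, {{getTokenStore}} will fail for them at the {{asSubclass}} call rather than at instantiation.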



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-16426) Query cancel: improve the way to handle files

2017-09-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180002#comment-16180002
 ] 

Prasanth Jayachandran commented on HIVE-16426:
--

I don't understand how this patch handles an already running background task. 
When a query timeout is set, the timeout monitor will set the operation state to 
TIMEOUT. With this patch, only the client is given a SQLTimeoutException, 
but the task that is actually executing on the cluster is not 
interrupted/cleaned up. The same applies when the user cancels the query with 
ctrl + c. Isn't it?

> Query cancel: improve the way to handle files
> -
>
> Key: HIVE-16426
> URL: https://issues.apache.org/jira/browse/HIVE-16426
> Project: Hive
>  Issue Type: Improvement
>Reporter: Yongzhi Chen
>Assignee: Yongzhi Chen
> Fix For: 3.0.0
>
> Attachments: HIVE-16426.1.patch
>
>
> 1. Add data structure support to make it easy to check query cancel status.
> 2. Handle query cancel more gracefully. Remove possible file leaks caused by 
> query cancel as shown in following stack:
> {noformat}
> 2017-04-11 09:57:30,727 WARN  org.apache.hadoop.hive.ql.exec.Utilities: 
> [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories.
> java.io.InterruptedIOException: Call interrupted
> at org.apache.hadoop.ipc.Client.call(Client.java:1496)
> at org.apache.hadoop.ipc.Client.call(Client.java:1439)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
> at com.sun.proxy.$Proxy20.delete(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy21.delete(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671)
> at 
> org.apache.hadoop.hive.ql.exec.Utilities.clearWork(Utilities.java:277)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:463)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1202)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:303)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:316)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> 3. Add checkpoints to related file operations to improve response time for 
> query cancelling. 
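The "checkpoints" in item 3 amount to consulting a shared cancel flag between expensive file operations, instead of relying on a thread interrupt landing mid-RPC as in the stack trace above. A hypothetical sketch (the class and method names are illustrative, not Hive's actual cleanup code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical cancel-aware cleanup loop: the cancel state is checked at a
// checkpoint before each file operation, so a cancelled query stops cleanly
// between operations rather than dying inside an RPC with
// InterruptedIOException.
public class CancelAwareCleanup {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    public void cancel() { cancelled.set(true); }

    /** Returns how many paths were processed before the cancel checkpoint fired. */
    public int deleteTmpDirs(List<Path> tmpDirs) {
        int processed = 0;
        for (Path dir : tmpDirs) {
            if (cancelled.get()) {
                break;  // checkpoint: bail out between operations, not inside one
            }
            try {
                Files.deleteIfExists(dir);
            } catch (IOException e) {
                // Log and continue; a failed delete should not abort cleanup.
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        CancelAwareCleanup cleanup = new CancelAwareCleanup();
        List<Path> dirs = List.of(Paths.get("tmp-a"), Paths.get("tmp-b"));
        System.out.println(cleanup.deleteTmpDirs(dirs));  // 2: both checkpoints passed
        cleanup.cancel();
        System.out.println(cleanup.deleteTmpDirs(dirs));  // 0: checkpoint fires immediately
    }
}
```

The same pattern generalizes to the other cleanup sites: check the flag once per path (or per batch) so cancellation latency is bounded by a single file operation.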



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180001#comment-16180001
 ] 

Hive QA commented on HIVE-17600:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888960/HIVE-17600.1-branch-2.2.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 9937 tests 
executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) 
(batchId=244)
TestJdbcDriver2 - did not produce a TEST-*.xml file (likely timed out) 
(batchId=225)
TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=167)
[acid_globallimit.q,alter_merge_2_orc.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=173)

[infer_bucket_sort_reducers_power_two.q,list_bucket_dml_10.q,orc_merge9.q,orc_merge6.q,leftsemijoin_mr.q,bucket6.q,bucketmapjoin7.q,uber_reduce.q,empty_dir_in_table.q,vector_outer_join3.q,index_bitmap_auto.q,vector_outer_join2.q,vector_outer_join1.q,orc_merge1.q,orc_merge_diff_fs.q,load_hdfs_file_with_space_in_the_name.q,scriptfile1_win.q,quotedid_smb.q,truncate_column_buckets.q,orc_merge3.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=174)

[infer_bucket_sort_num_buckets.q,gen_udf_example_add10.q,insert_overwrite_directory2.q,orc_merge5.q,bucketmapjoin6.q,import_exported_table.q,vector_outer_join0.q,orc_merge4.q,temp_table_external.q,orc_merge_incompat1.q,root_dir_external_table.q,constprog_semijoin.q,auto_sortmerge_join_16.q,schemeAuthority.q,index_bitmap3.q,external_table_with_space_in_location_path.q,parallel_orderby.q,infer_bucket_sort_map_operators.q,bucketizedhiveinputformat.q,remote_script.q]
TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed 
out) (batchId=175)

[scriptfile1.q,vector_outer_join5.q,file_with_header_footer.q,bucket4.q,input16_cc.q,bucket5.q,infer_bucket_sort_merge.q,constprog_partitioner.q,orc_merge2.q,reduce_deduplicate.q,schemeAuthority2.q,load_fs2.q,orc_merge8.q,orc_merge_incompat2.q,infer_bucket_sort_bucketed_table.q,vector_outer_join4.q,disable_merge_for_bucketing.q,vector_inner_join.q,orc_merge7.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=118)

[bucketmapjoin4.q,bucket_map_join_spark4.q,union21.q,groupby2_noskew.q,timestamp_2.q,date_join1.q,mergejoins.q,smb_mapjoin_11.q,auto_sortmerge_join_3.q,mapjoin_test_outer.q,vectorization_9.q,merge2.q,groupby6_noskew.q,auto_join_without_localtask.q,multi_join_union.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=119)

[join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,stats16.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=120)

[parallel_join1.q,union27.q,union12.q,groupby7_map_multi_single_reducer.q,varchar_join1.q,join7.q,join_reorder4.q,skewjoinopt2.q,bucketsortoptimize_insert_2.q,smb_mapjoin_17.q,script_env_var1.q,groupby7_map.q,groupby3.q,bucketsortoptimize_insert_8.q,union20.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=121)

[ptf_general_queries.q,auto_join_reordering_values.q,sample2.q,join1.q,decimal_join.q,mapjoin_subquery2.q,join32_lessSize.q,mapjoin1.q,order2.q,skewjoinopt18.q,union_remove_18.q,join25.q,groupby9.q,bucketsortoptimize_insert_6.q,ctas.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=122)

[groupby_map_ppr.q,nullgroup4_multi_distinct.q,join_rc.q,union14.q,smb_mapjoin_12.q,vector_cast_constant.q,union_remove_4.q,auto_join11.q,load_dyn_part7.q,udaf_collect_set.q,vectorization_12.q,groupby_sort_skew_1.q,groupby_sort_skew_1_23.q,smb_mapjoin_25.q,skewjoinopt12.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=123)

[skewjoinopt15.q,auto_join18.q,list_bucket_dml_2.q,input1_limit.q,load_dyn_part3.q,union_remove_14.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,union10.q,bucket_map_join_tez2.q,groupby5_map_skew.q,join_reorder.q,sample1.q,bucketmapjoin8.q,union34.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=124)

[avro_joins.q,skewjoinopt16.q,auto_join14.q,vectorization_14.q,auto_join26.q,stats1.q,cbo_stats.q,auto_sortmerge_join_6.q,union22.q,union_remove_24.q,union_view.q,smb_mapjoin_22.q,stats15.q,ptf_matchpath.q,transform_ppr1.q]
TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=125)


[jira] [Updated] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient

2017-09-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17111:

Attachment: HIVE-17111.1.patch

Attaching a patch that creates a new CLI Driver called 
{{TestLocalSparkCliDriver}} which sets {{spark.master=local[*]}}. Adding a very 
simple q test, with a few basic queries. This provides some test coverage for 
{{LocalHiveSparkClient}}.

The main advantage is that this new CLI Driver runs the entire HoS query inside 
a single process. This makes debugging HoS much easier. Users can set 
breakpoints in portions of the HoS code that are only invoked at runtime. While 
this does provide some coverage for {{LocalHiveSparkClient}}, I think the main 
advantage is that it makes debugging HoS easier for developers, especially new 
developers who may not be as familiar with the HoS code and want to debug 
things via an IDE like IntelliJ.

The patch doesn't modify anything related to the other Spark CLI Drivers.

> TestSparkCliDriver does not use LocalHiveSparkClient
> 
>
> Key: HIVE-17111
> URL: https://issues.apache.org/jira/browse/HIVE-17111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17111.1.patch
>
>
> The TestSparkCliDriver sets spark.master to local-cluster[2,2,1024], but 
> HoS still decides to use the RemoteHiveSparkClient rather than the 
> LocalHiveSparkClient.
> The issue is with the following check in HiveSparkClientFactory:
> {code}
> if (master.equals("local") || master.startsWith("local[")) {
>   // With local spark context, all user sessions share the same spark 
> context.
>   return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf));
> } else {
>   return new RemoteHiveSparkClient(hiveconf, sparkConf);
> }
> {code}
> The {{master.startsWith("local[")}} check sees that the value of spark.master 
> doesn't start with {{local[}} (it starts with {{local-cluster[}}), and so 
> falls through to the RemoteHiveSparkClient.
> We should fix this so that the LocalHiveSparkClient is used. It should speed 
> up some of the tests, and also makes qtests easier to debug since everything 
> will now be run in the same process.
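The selection logic quoted above reduces to a simple string predicate, which makes the gap easy to see in isolation — a sketch that mirrors the quoted check (names here are illustrative, not the HiveSparkClientFactory code itself):

```java
// Mirror of the client-selection predicate quoted above: only "local" or
// "local[...]" masters pick the in-process LocalHiveSparkClient, so the
// "local-cluster[2,2,1024]" value used by TestSparkCliDriver falls through
// to RemoteHiveSparkClient.
public class SparkMasterCheck {
    public static boolean usesLocalClient(String master) {
        return master.equals("local") || master.startsWith("local[");
    }

    public static void main(String[] args) {
        System.out.println(usesLocalClient("local"));                    // true
        System.out.println(usesLocalClient("local[*]"));                 // true
        System.out.println(usesLocalClient("local-cluster[2,2,1024]"));  // false
    }
}
```

Note that {{local-cluster}} mode does launch separate executor processes, so a driver that wants a genuinely single-process run needs a master of the form {{local[...]}}.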



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient

2017-09-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17111:

Status: Patch Available  (was: Open)

> TestSparkCliDriver does not use LocalHiveSparkClient
> 
>
> Key: HIVE-17111
> URL: https://issues.apache.org/jira/browse/HIVE-17111
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17111.1.patch
>
>
> The TestSparkCliDriver sets spark.master to local-cluster[2,2,1024], but 
> HoS still decides to use the RemoteHiveSparkClient rather than the 
> LocalHiveSparkClient.
> The issue is with the following check in HiveSparkClientFactory:
> {code}
> if (master.equals("local") || master.startsWith("local[")) {
>   // With local spark context, all user sessions share the same spark 
> context.
>   return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf));
> } else {
>   return new RemoteHiveSparkClient(hiveconf, sparkConf);
> }
> {code}
> The {{master.startsWith("local[")}} check sees that the value of spark.master 
> doesn't start with {{local[}} (it starts with {{local-cluster[}}), and so 
> falls through to the RemoteHiveSparkClient.
> We should fix this so that the LocalHiveSparkClient is used. It should speed 
> up some of the tests, and also makes qtests easier to debug since everything 
> will now be run in the same process.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17386) support LLAP workload management in HS2 (low level only)

2017-09-25 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179966#comment-16179966
 ] 

Zhiyuan Yang commented on HIVE-17386:
-

+1 (non-binding). CC [~hagleitn]

> support LLAP workload management in HS2 (low level only)
> 
>
> Key: HIVE-17386
> URL: https://issues.apache.org/jira/browse/HIVE-17386
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17386.01.only.patch, HIVE-17386.01.patch, 
> HIVE-17386.01.patch, HIVE-17386.02.patch, HIVE-17386.03.patch, 
> HIVE-17386.04.patch, HIVE-17386.only.patch, HIVE-17386.patch
>
>
> This makes use of HIVE-17297 and creates building blocks for workload 
> management policies, etc.
> For now, there are no policies - a single yarn queue is designated for all 
> LLAP query AMs, and the capacity is distributed equally.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17566) Create schema required for workload management.

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179955#comment-16179955
 ] 

Sergey Shelukhin edited comment on HIVE-17566 at 9/25/17 11:36 PM:
---

Also can you generate the patch w/o generated code and post on RB?
I will review at some point, tomorrow probably


was (Author: sershe):
Also can you generate the patch w/o generated code and post on RB?

> Create schema required for workload management.
> ---
>
> Key: HIVE-17566
> URL: https://issues.apache.org/jira/browse/HIVE-17566
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17566.01.patch
>
>
> Schema + model changes required for workload management.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17566) Create schema required for workload management.

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179955#comment-16179955
 ] 

Sergey Shelukhin commented on HIVE-17566:
-

Also can you generate the patch w/o generated code and post on RB?

> Create schema required for workload management.
> ---
>
> Key: HIVE-17566
> URL: https://issues.apache.org/jira/browse/HIVE-17566
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17566.01.patch
>
>
> Schema + model changes required for workload management.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17566) Create schema required for workload management.

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179952#comment-16179952
 ] 

Sergey Shelukhin commented on HIVE-17566:
-

[~harishjp] TestSchemaTool failures might be related

> Create schema required for workload management.
> ---
>
> Key: HIVE-17566
> URL: https://issues.apache.org/jira/browse/HIVE-17566
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
> Attachments: HIVE-17566.01.patch
>
>
> Schema + model changes required for workload management.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15212) merge branch into master

2017-09-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179950#comment-16179950
 ] 

Sergey Shelukhin commented on HIVE-15212:
-

It looks like IOW works, but multi-IOW doesn't (see the mm_all test, where we 
do I/IOW, IOW/IOW, etc. into the same or different tables).


> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17601) improve error handling in LlapServiceDriver

2017-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17601:

Status: Patch Available  (was: Open)

> improve error handling in LlapServiceDriver
> ---
>
> Key: HIVE-17601
> URL: https://issues.apache.org/jira/browse/HIVE-17601
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17601.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17601) improve error handling in LlapServiceDriver

2017-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17601:

Attachment: HIVE-17601.patch

[~prasanth_j] can you please take a look? This also cleans up some TODOs

> improve error handling in LlapServiceDriver
> ---
>
> Key: HIVE-17601
> URL: https://issues.apache.org/jira/browse/HIVE-17601
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17601.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-17601) improve error handling in LlapServiceDriver

2017-09-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17601:
---


> improve error handling in LlapServiceDriver
> ---
>
> Key: HIVE-17601
> URL: https://issues.apache.org/jira/browse/HIVE-17601
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-15212) merge branch into master

2017-09-25 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179929#comment-16179929
 ] 

Wei Zheng commented on HIVE-15212:
--

[~ekoifman] You at'ed the wrong person ;)

Sorry for the late update. I left a TODO comment in 
HiveInputFormat.java:processForWriteIds()
{code}
// todo for IOW, we also need to count in base dir, if any
for (AcidUtils.ParsedDelta delta : dirInfo.getCurrentDirectories()) 
{
  Utilities.LOG14535.info("Adding input " + delta.getPath());
  finalPaths.add(delta.getPath());
}
{code}
Here we just need to count in the base dir, if any.

> merge branch into master
> 
>
> Key: HIVE-15212
> URL: https://issues.apache.org/jira/browse/HIVE-15212
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, 
> HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, 
> HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, 
> HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, 
> HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, 
> HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179927#comment-16179927
 ] 

Hive QA commented on HIVE-17586:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888949/HIVE-17586.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11055 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[empty_join] (batchId=76)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics
 (batchId=197)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6979/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6979/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6979/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888949 - PreCommit-HIVE-Build

> Make HS2 BackgroundOperationPool not fixed
> --
>
> Key: HIVE-17586
> URL: https://issues.apache.org/jira/browse/HIVE-17586
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-17586.patch
>
>
> Currently the threadpool for background asynchronous operations has a fixed 
> size controlled by {{hive.server2.async.exec.threads}}. However, the thread 
> factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, 
> which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the 
> threads are actually never killed, defeating the purpose of garbage cleanup as 
> noted in the thread class name. On the other hand, since these threads never 
> go away, significant resources such as threadlocal variables (classloaders, 
> hiveconfs, etc.) are held on to even when no operation is running. This 
> can lead to escalated HS2 memory usage.
> Ideally, the threadpool should not be fixed, allowing threads to die out so 
> resources can be reclaimed. The existing config 
> {{hive.server2.async.exec.threads}} is treated as the max, and we can add a 
> min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default 
> value for this config is -1, which keeps the existing behavior.
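The proposal maps onto the JDK executor roughly as follows. This is a sketch of the idea, not the patch: note that {{ThreadPoolExecutor}} cannot keep a nonzero idle floor alive when using an unbounded queue, so the sketch only distinguishes "fixed" (min = -1) from "elastic", and honoring a min > 0 would need extra bookkeeping:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of the proposed non-fixed pool. With minThreads = -1 it behaves
// like today's fixed pool (idle threads live forever); otherwise idle
// threads are allowed to time out so their threadlocals (classloaders,
// hiveconfs, ...) become collectable.
public class BackgroundPoolSketch {
    public static ThreadPoolExecutor create(int maxThreads, int minThreads,
                                            long keepAliveSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        if (minThreads >= 0) {
            pool.allowCoreThreadTimeOut(true);  // elastic: idle threads die
        }
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor elastic = create(100, 0, 60);
        System.out.println(elastic.allowsCoreThreadTimeOut());  // true
        elastic.shutdown();
    }
}
```

With the timeout enabled, an idle HS2 would shed its background threads after the keep-alive interval, and the per-thread state they pin would become eligible for collection.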



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module

2017-09-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179923#comment-16179923
 ] 

Alan Gates commented on HIVE-17371:
---

I'm fine with this approach.  But we should get buy-in from [~thejas] and 
[~vgumashta], as they spend the most time in HS2.

> Move tokenstores to metastore module
> 
>
> Key: HIVE-17371
> URL: https://issues.apache.org/jira/browse/HIVE-17371
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17371.01.patch
>
>
> The {{getTokenStore}} method will not work for the {{DBTokenStore}} and 
> {{ZKTokenStore}} since they implement 
> {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of  
> {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}}
> {code}
> private DelegationTokenStore getTokenStore(Configuration conf) throws 
> IOException {
> String tokenStoreClassName =
> MetastoreConf.getVar(conf, 
> MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, "");
> // The second half of this if is to catch cases where users are passing 
> in a HiveConf for
> // configuration.  It will have set the default value of
> // "hive.cluster.delegation.token.store .class" to
> // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its 
> construction.  But this is
> // the hive-shims version of the memory store.  We want to convert this 
> to our default value.
> if (StringUtils.isBlank(tokenStoreClassName) ||
> 
> "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) 
> {
>   return new MemoryTokenStore();
> }
> try {
>   Class storeClass =
>   
> Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class);
>   return ReflectionUtils.newInstance(storeClass, conf);
> } catch (ClassNotFoundException e) {
>   throw new IOException("Error initializing delegation token store: " + 
> tokenStoreClassName, e);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179879#comment-16179879
 ] 

Thejas M Nair commented on HIVE-17483:
--

Changes look good, but some of the UT failures look related (the jdbc and 
service package ones).
Can you please take a look?


> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.

2017-09-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17600:

Status: Patch Available  (was: Open)

> Make OrcFile's "enforceBufferSize" user-settable.
> -
>
> Key: HIVE-17600
> URL: https://issues.apache.org/jira/browse/HIVE-17600
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17600.1-branch-2.2.patch
>
>
> This is a duplicate of ORC-238, but it applies to {{branch-2.2}}.
> Compression buffer-sizes in OrcFile are computed at runtime, except when 
> enforceBufferSize is set. The only snag here is that this flag can't be set 
> by the user.
> When runtime-computed buffer-sizes are not optimal (for some reason), the 
> user has no way to work around it by setting a custom value.
> I have a patch that we use at Yahoo.
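The requested behavior reduces to a simple override rule. The sketch below uses made-up names, not ORC's actual code: use the runtime-computed buffer size unless the user supplied a positive value.

```java
public class BufferSizeSketch {
    // userSetting <= 0 means "not set"; otherwise it wins over the
    // runtime-computed estimate, which is what a user-settable
    // enforceBufferSize would enable.
    public static int effectiveBufferSize(int computedSize, int userSetting) {
        return userSetting > 0 ? userSetting : computedSize;
    }
}
```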





[jira] [Updated] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.

2017-09-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan updated HIVE-17600:

Attachment: HIVE-17600.1-branch-2.2.patch

> Make OrcFile's "enforceBufferSize" user-settable.
> -
>
> Key: HIVE-17600
> URL: https://issues.apache.org/jira/browse/HIVE-17600
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17600.1-branch-2.2.patch
>
>
> This is a duplicate of ORC-238, but it applies to {{branch-2.2}}.
> Compression buffer-sizes in OrcFile are computed at runtime, except when 
> enforceBufferSize is set. The only snag here is that this flag can't be set 
> by the user.
> When runtime-computed buffer-sizes are not optimal (for some reason), the 
> user has no way to work around it by setting a custom value.
> I have a patch that we use at Yahoo.





[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179866#comment-16179866
 ] 

Sahil Takiar commented on HIVE-17543:
-

[~hsubramaniyan], [~pvary], [~lirui] could you review?

Main things of note:

* I renamed the current {{TestPerfCliDriver}} to {{TestTezPerfCliDriver}} and 
created a new {{TestSparkPerfCliDriver}}
* I set {{hive.auto.convert.join}} to {{true}} for the {{TestSparkCliDriver}} 
since that's closer to what is run in production
* I had to make some changes to {{SparkCrossProductCheck}} and {{SparkWork}} to 
avoid some flakiness in the tests
* There are two TPC-DS queries that I couldn't get to work with HoS - query14 
and query64 - I'll file follow up tasks for fixing them
* I haven't gone through the explain plans of every single TPC-DS query for 
HoS, mainly because that will take a really long time, but I plan to do it as a 
follow up task; committing this now will give us better regression testing

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch, HIVE-17543.4.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179868#comment-16179868
 ] 

Sahil Takiar commented on HIVE-17543:
-

And {{TestTezPerfCliDriver.testCliDriver[query14]}} was already failing.

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch, HIVE-17543.4.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179834#comment-16179834
 ] 

Hive QA commented on HIVE-17543:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888937/HIVE-17543.4.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11156 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat]
 (batchId=170)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=202)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6978/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6978/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6978/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888937 - PreCommit-HIVE-Build

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch, HIVE-17543.4.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Assigned] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.

2017-09-25 Thread Mithun Radhakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mithun Radhakrishnan reassigned HIVE-17600:
---


> Make OrcFile's "enforceBufferSize" user-settable.
> -
>
> Key: HIVE-17600
> URL: https://issues.apache.org/jira/browse/HIVE-17600
> Project: Hive
>  Issue Type: Improvement
>  Components: ORC
>Affects Versions: 2.2.0, 3.0.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>
> This is a duplicate of ORC-238, but it applies to {{branch-2.2}}.
> Compression buffer-sizes in OrcFile are computed at runtime, except when 
> enforceBufferSize is set. The only snag here is that this flag can't be set 
> by the user.
> When runtime-computed buffer-sizes are not optimal (for some reason), the 
> user has no way to work around it by setting a custom value.
> I have a patch that we use at Yahoo.





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in the absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master, thanks for reviewing [~ashutoshc]

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in the absence of statistics 
> instead of assuming it to be zero.
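The proposed contract can be sketched as follows. The parameter name and class below are illustrative assumptions, not Hive's actual implementation, which reads basic stats from metastore table parameters:

```java
import java.util.Map;

public class BasicStatsSketch {
    /** Sentinel meaning "statistics were never collected". */
    public static final long STATS_UNAVAILABLE = -1L;

    // Distinguishes "no stats in the metastore" (returns -1) from a table
    // whose collected stats say it genuinely has zero rows (returns 0).
    public static long rowCount(Map<String, String> tableParams) {
        String numRows = tableParams.get("numRows");
        if (numRows == null) {
            return STATS_UNAVAILABLE;
        }
        return Long.parseLong(numRows);
    }
}
```

Callers can then treat a negative result as "fall back to estimation" rather than "empty table".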





[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17538:
---
Fix Version/s: 3.0.0

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in the absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-17586:
---
Attachment: HIVE-17586.patch

> Make HS2 BackgroundOperationPool not fixed
> --
>
> Key: HIVE-17586
> URL: https://issues.apache.org/jira/browse/HIVE-17586
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-17586.patch
>
>
> Currently the threadpool for background asynchronous operations has a fixed 
> size controlled by {{hive.server2.async.exec.threads}}. However, the thread 
> factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, 
> which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the 
> threads are never actually killed, defeating the purpose of the garbage 
> cleanup noted in the thread class name. On the other hand, since these 
> threads never go away, significant resources such as threadlocal variables 
> (classloaders, hiveconfs, etc.) are held even when no operation is running. 
> This can lead to escalated HS2 memory usage.
> Ideally, the threadpool should not be fixed, allowing threads to die out so 
> resources can be reclaimed. The existing config 
> {{hive.server2.async.exec.threads}} is treated as the max, and we can add a 
> min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default 
> value for this config is -1, which keeps the existing behavior.
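A minimal sketch of the elastic pool described above, under stated assumptions: the class name, the use of a `SynchronousQueue`, and the keep-alive default are illustrative, not HS2's actual wiring.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ElasticPoolSketch {
    // minThreads plays the role of the proposed
    // hive.server2.async.exec.min.threads; maxThreads is the existing
    // hive.server2.async.exec.threads, now treated as a cap.
    public static ThreadPoolExecutor create(int minThreads, int maxThreads,
                                            long keepAliveSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads, maxThreads, keepAliveSeconds, TimeUnit.SECONDS,
                // A SynchronousQueue hands tasks straight to a thread, so the
                // pool grows toward maxThreads under load instead of queueing.
                new SynchronousQueue<Runnable>());
        // Let even the "min" threads die when idle, so an idle HS2 releases
        // the threadlocal state (classloaders, hiveconfs) those threads pin.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }
}
```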





[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed

2017-09-25 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-17586:
---
Status: Patch Available  (was: Open)

> Make HS2 BackgroundOperationPool not fixed
> --
>
> Key: HIVE-17586
> URL: https://issues.apache.org/jira/browse/HIVE-17586
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-17586.patch
>
>
> Currently the threadpool for background asynchronous operations has a fixed 
> size controlled by {{hive.server2.async.exec.threads}}. However, the thread 
> factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, 
> which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the 
> threads are never actually killed, defeating the purpose of the garbage 
> cleanup noted in the thread class name. On the other hand, since these 
> threads never go away, significant resources such as threadlocal variables 
> (classloaders, hiveconfs, etc.) are held even when no operation is running. 
> This can lead to escalated HS2 memory usage.
> Ideally, the threadpool should not be fixed, allowing threads to die out so 
> resources can be reclaimed. The existing config 
> {{hive.server2.async.exec.threads}} is treated as the max, and we can add a 
> min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default 
> value for this config is -1, which keeps the existing behavior.





[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-25 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-17536:
---
Fix Version/s: 3.0.0

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Fix For: 3.0.0
>
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in the absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats

2017-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179745#comment-16179745
 ] 

Ashutosh Chauhan commented on HIVE-17536:
-

+1

> StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics 
> or zero stats
> ---
>
> Key: HIVE-17536
> URL: https://issues.apache.org/jira/browse/HIVE-17536
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, 
> HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch
>
>
> This method returns zero for both of the following cases:
> * Statistics are missing in metastore
> * Actual stats e.g. number of rows are zero
> It'll be good for this method to return e.g. -1 in the absence of statistics 
> instead of assuming it to be zero.





[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats

2017-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179744#comment-16179744
 ] 

Ashutosh Chauhan commented on HIVE-17538:
-

+1

> Enhance estimation of stats to estimate even if only one column is missing 
> stats
> 
>
> Key: HIVE-17538
> URL: https://issues.apache.org/jira/browse/HIVE-17538
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, 
> HIVE-17538.3.patch
>
>
> HIVE-16811 provided support for estimating statistics in the absence of stats. 
> But that estimation is done if and only if statistics are missing for all 
> columns. 





[jira] [Updated] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17543:

Attachment: HIVE-17543.4.patch

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch, HIVE-17543.4.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179648#comment-16179648
 ] 

Hive QA commented on HIVE-17543:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888913/HIVE-17543.3.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11156 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query58] 
(batchId=241)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] 
(batchId=239)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] 
(batchId=239)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=202)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6977/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6977/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6977/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888913 - PreCommit-HIVE-Build

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable

2017-09-25 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179533#comment-16179533
 ] 

Sahil Takiar commented on HIVE-17545:
-

I think the advantages here are that (1) it gives users an escape hatch if 
there are any bugs in the RDD caching logic (right now the logic may be simple, 
but in the future what gets cached may be configurable), and (2) there may be 
scenarios where users don't want to cache the data - maybe they don't have much 
disk space available and would rather recompute the RDD than store it.

> Make HoS RDD Cacheing Optimization Configurable
> ---
>
> Key: HIVE-17545
> URL: https://issues.apache.org/jira/browse/HIVE-17545
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer, Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We 
> should make it configurable in case users want to disable it. We can leave it 
> on by default to preserve backwards compatibility.





[jira] [Commented] (HIVE-17588) LlapRowRecordReader doing name-based field lookup for every column of every row

2017-09-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179514#comment-16179514
 ] 

Prasanth Jayachandran commented on HIVE-17588:
--

lgtm, +1

> LlapRowRecordReader doing name-based field lookup for every column of every 
> row
> ---
>
> Key: HIVE-17588
> URL: https://issues.apache.org/jira/browse/HIVE-17588
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17588.1.patch
>
>
> setRowFromStruct() is using 
> StructObjectInspector.getStructFieldRef(fieldName), which does a name-based 
> lookup - this can be changed to do an index-based lookup which should be 
> faster.
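The change under review can be illustrated without Hive's StructObjectInspector API; the class and method names below are made up for the sketch. The idea is to resolve each projected column name to a position once per reader, then do purely positional access per row:

```java
import java.util.ArrayList;
import java.util.List;

public class FieldLookupSketch {
    // Done once when the reader is created: map each projected column name
    // to its position in the row schema (the only name-based lookups).
    public static int[] resolveOnce(List<String> schema, List<String> wanted) {
        int[] indexes = new int[wanted.size()];
        for (int i = 0; i < wanted.size(); i++) {
            indexes[i] = schema.indexOf(wanted.get(i));
        }
        return indexes;
    }

    // Done per row: pure positional access, no string comparisons.
    public static <T> List<T> readRow(List<T> row, int[] indexes) {
        List<T> out = new ArrayList<>(indexes.length);
        for (int idx : indexes) {
            out.add(row.get(idx));
        }
        return out;
    }
}
```

With N rows and K columns, this trades N*K name lookups for K lookups up front.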





[jira] [Commented] (HIVE-17489) Separate client-facing and server-side Kerberos principals, to support HA

2017-09-25 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179487#comment-16179487
 ] 

Mithun Radhakrishnan commented on HIVE-17489:
-

Ok, I think I've fixed the failures related to this patch. Would [~thejas] mind 
taking a look at this one? :]

> Separate client-facing and server-side Kerberos principals, to support HA
> -
>
> Key: HIVE-17489
> URL: https://issues.apache.org/jira/browse/HIVE-17489
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Mithun Radhakrishnan
>Assignee: Thiruvel Thirumoolan
> Attachments: HIVE-17489.2-branch-2.patch, HIVE-17489.2.patch, 
> HIVE-17489.2.patch, HIVE-17489.3-branch-2.patch, HIVE-17489.3.patch, 
> HIVE-17489.4-branch-2.patch, HIVE-17489.4.patch
>
>
> On deployments of the Hive metastore where a farm of servers is fronted by a 
> VIP, the hostname of the VIP (e.g. {{mycluster-hcat.blue.myth.net}}) will 
> differ from the actual boxen in the farm (e.g. 
> {{mycluster-hcat-\[0..3\].blue.myth.net}}).
> Such a deployment messes up Kerberos auth, with principals like 
> {{hcat/mycluster-hcat.blue.myth@grid.myth.net}}. Host-based checks will 
> disallow servers behind the VIP from using the VIP's hostname in their 
> principals when accessing, say, HDFS.
> The solution would be to decouple the server-side principal (used to access 
> other services like HDFS as a client) from the client-facing principal (used 
> from Hive-client, BeeLine, etc.).





[jira] [Commented] (HIVE-17576) Improve progress-reporting in TezProcessor

2017-09-25 Thread Mithun Radhakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179481#comment-16179481
 ] 

Mithun Radhakrishnan commented on HIVE-17576:
-

The test failures are unrelated. [~owen.omalley], [~thejas], what might be the 
best version of this patch to go in? With or without reflection? (It is 
foreseeable that there might be deployments with outdated Tez versions that don't 
include the {{ProgressHelper}} API.)
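The reflection variant under discussion amounts to a capability check like the one below; the helper is a generic sketch, and the Tez class name you would probe is an assumption to verify against the Tez version you target:

```java
public class TezCapabilityCheck {
    // Returns true only if the named class is on the classpath; with an
    // outdated Tez, the ProgressHelper-based reporting path would be skipped.
    public static boolean classAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

For example, `classAvailable("org.apache.tez.common.ProgressHelper")` before wiring up progress reporting (package name assumed).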

> Improve progress-reporting in TezProcessor
> --
>
> Key: HIVE-17576
> URL: https://issues.apache.org/jira/browse/HIVE-17576
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0, 3.0.0, 2.4.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: HIVE-17576.1.patch, HIVE-17576.2-branch-2.patch, 
> HIVE-17576.2.patch
>
>
> Another one on behalf of [~selinazh] and [~cdrome]. Following the example in 
> [Apache Tez's 
> {{MapProcessor}}|https://github.com/apache/tez/blob/247719d7314232f680f028f4e1a19370ffb7b1bb/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/processor/map/MapProcessor.java#L88],
>  {{TezProcessor}} ought to use {{ProgressHelper}} to report progress for a 
> Tez task. As per [~kshukla]'s advice,
> {quote}
> Tez... provides {{getProgress()}} API for {{AbstractLogicalInput(s)}} which 
> will give the correct progress value for a given Input. The TezProcessor(s) 
> in Hive should use this to do something similar to what MapProcessor in Tez 
> does today, which is use/override ProgressHelper to get the input progress 
> and then set the progress on the processorContext.
> ...
> The default behavior of the ProgressHelper class sets the processor progress 
> to be the average of progress values from all inputs.
> {quote}
> This code is -whacked from- *inspired by* {{MapProcessor}}'s use of 
> {{ProgressHelper}}.
> (For my reference, YHIVE-978.)





[jira] [Updated] (HIVE-16455) ADD JAR command leaks JAR Files

2017-09-25 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-16455:

Resolution: Duplicate
Status: Resolved  (was: Patch Available)

This issue has been fixed by HIVE-11878

> ADD JAR command leaks JAR Files
> ---
>
> Key: HIVE-16455
> URL: https://issues.apache.org/jira/browse/HIVE-16455
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-16455.1.patch
>
>
> HiveServer2 is leaking file handles when using ADD JAR statement and the JAR 
> file added is not used in the query itself.
> {noformat}
> beeline> !connect jdbc:hive2://localhost:1 admin
> 0: jdbc:hive2://localhost:1> create table test_leak (a int);
> 0: jdbc:hive2://localhost:1> insert into test_leak Values (1);
> -- Exit beeline terminal; Find PID of HiveServer2
> [root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
> 0
> [root@host-10-17-80-111 ~]# beeline -u jdbc:hive2://localhost:1/default 
> -n admin
> And run the command "ADD JAR hdfs:///tmp/hive-contrib.jar; select * from 
> test_leak"
> [root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l
> 1
> java29588 hive  391u   REG  252,3125987  2099944 
> /tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted)
> java29588 hive  392u   REG  252,3125987  2099946 
> /tmp/eb3184ad-7f15-4a77-a10d-87717ae634d1_resources/hive-contrib.jar (deleted)
> java29588 hive  393r   REG  252,3125987  2099825 
> /tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted)
> java29588 hive  394r   REG  252,3125987  2099833 
> /tmp/5153dd4a-a606-4f53-b02c-d606e7e56985_resources/hive-contrib.jar (deleted)
> java29588 hive  395r   REG  252,3125987  2099827 
> /tmp/ff3cdb05-917f-43c0-830a-b293bf397a23_resources/hive-contrib.jar (deleted)
> java29588 hive  396r   REG  252,3125987  2099822 
> /tmp/60531b66-5985-421e-8eb5-eeac31fdf964_resources/hive-contrib.jar (deleted)
> java29588 hive  397r   REG  252,3125987  2099831 
> /tmp/78878921-455c-438c-9735-447566ed8381_resources/hive-contrib.jar (deleted)
> java29588 hive  399r   REG  252,3125987  2099835 
> /tmp/0e5d7990-30cc-4248-9058-587f7f1ff211_resources/hive-contrib.jar (deleted)
> {noformat}
> You can see that the session directory (and therefore anything in it) is set 
> to delete only on exit.





[jira] [Commented] (HIVE-17588) LlapRowRecordReader doing name-based field lookup for every column of every row

2017-09-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179456#comment-16179456
 ] 

Jason Dere commented on HIVE-17588:
---

[~prasanth_j], can you review?

> LlapRowRecordReader doing name-based field lookup for every column of every 
> row
> ---
>
> Key: HIVE-17588
> URL: https://issues.apache.org/jira/browse/HIVE-17588
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17588.1.patch
>
>
> setRowFromStruct() is using 
> StructObjectInspector.getStructFieldRef(fieldName), which does a name-based 
> lookup - this can be changed to do an index-based lookup which should be 
> faster.





[jira] [Commented] (HIVE-17157) Add InterfaceAudience and InterfaceStability annotations for ObjectInspector APIs

2017-09-25 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179453#comment-16179453
 ] 

Aihua Xu commented on HIVE-17157:
-

+1.

> Add InterfaceAudience and InterfaceStability annotations for ObjectInspector 
> APIs
> -
>
> Key: HIVE-17157
> URL: https://issues.apache.org/jira/browse/HIVE-17157
> Project: Hive
>  Issue Type: Sub-task
>  Components: Serializers/Deserializers
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17157.1.patch
>
>






[jira] [Updated] (HIVE-17543) Enable PerfCliDriver for HoS

2017-09-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17543:

Attachment: HIVE-17543.3.patch

> Enable PerfCliDriver for HoS
> 
>
> Key: HIVE-17543
> URL: https://issues.apache.org/jira/browse/HIVE-17543
> Project: Hive
>  Issue Type: Improvement
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, 
> HIVE-17543.3.patch
>
>
> The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually 
> run them, but it does generate explains for them. It also tricks HMS into 
> thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain 
> optimizations.
> Right now this only runs on Hive-on-Tez; we should enable it for HoS too.





[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde

2017-09-25 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179431#comment-16179431
 ] 

Alan Gates commented on HIVE-17580:
---

I'm +1 on using the SARGs for this, but I think we will have to continue to 
provide the current version of get_fields_with_environment_context for 
backwards compatibility.  So we should find something that works for it as 
well.  It makes sense to do that first and add a SARGs version of the call 
later.

> Remove dependency of get_fields_with_environment_context API to serde
> -
>
> Key: HIVE-17580
> URL: https://issues.apache.org/jira/browse/HIVE-17580
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>
> The {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} 
> class to access the fields metadata for the cases where it is stored along 
> with the data files (Avro tables). The problem is that the Deserializer class is 
> defined in the hive-serde module, and in order to make the metastore independent of 
> Hive we will have to remove this dependency (at least we should change it to a 
> runtime dependency instead of a compile-time one).
> The other option is to investigate whether we can use SearchArgument to provide this 
> functionality.





[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179319#comment-16179319
 ] 

Ashutosh Chauhan commented on HIVE-17568:
-

+1

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, 
> HIVE-17568.03.patch
>
>
> Joining 2 tables on at least 1 column which is not of the same type 
> (integer/double, for example).
> The calcite expressions require double/integer inputs which will become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}





[jira] [Commented] (HIVE-17519) Transpose column stats display

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179192#comment-16179192
 ] 

Hive QA commented on HIVE-17519:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1271/HIVE-17519.03.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11055 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join20] (batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] 
(batchId=73)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_indexes_syntax] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unicode_comments] 
(batchId=37)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver
 (batchId=242)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6976/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6976/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6976/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1271 - PreCommit-HIVE-Build

> Transpose column stats display
> --
>
> Key: HIVE-17519
> URL: https://issues.apache.org/jira/browse/HIVE-17519
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17519.01.patch, HIVE-17519.02.patch, 
> HIVE-17519.03.patch
>
>
> currently {{describe formatted table1 insert_num}} shows the column 
> information in a table-like format, which is very hard to read because 
> there are too many columns
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment bitVector   
>   
>  
> insert_numint 
>   
>   
> from deserializer   
> {code}
> I think it would be better to show the same information like this:
> {code}
> col_name  insert_num  
> data_type int 
> min   
> max   
> num_nulls 
> distinct_count
> avg_col_len   
> max_col_len   
> num_trues 
> num_falses
> comment   from deserializer   
> bitVector 
> {code}
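The proposed display change amounts to transposing one wide row into stacked label/value lines. A minimal sketch of that transformation, with example labels and values rather than Hive's actual formatter code:

```java
import java.util.Arrays;
import java.util.List;

public class TransposeSketch {

    // turn parallel label/value lists into one "label<TAB>value" line per statistic
    static String transpose(List<String> labels, List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < labels.size(); i++) {
            sb.append(labels.get(i)).append('\t').append(values.get(i)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("col_name", "data_type", "comment");
        List<String> values = Arrays.asList("insert_num", "int", "from deserializer");
        System.out.print(transpose(labels, values));
    }
}
```

Each statistic gets its own line, so narrow terminals no longer wrap the wide header row.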





[jira] [Commented] (HIVE-17544) Add Parsed Tree as input for Authorization

2017-09-25 Thread Na Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179186#comment-16179186
 ] 

Na Li commented on HIVE-17544:
--

[Object [type=DATABASE, name=default], Object [type=TABLE_OR_VIEW, 
name=default.t10]] is the output from {{create table t10(x int);}}

> Add Parsed Tree as input for Authorization
> --
>
> Key: HIVE-17544
> URL: https://issues.apache.org/jira/browse/HIVE-17544
> Project: Hive
>  Issue Type: Task
>  Components: Authorization
>Affects Versions: 2.1.1
>Reporter: Na Li
>Assignee: Aihua Xu
>Priority: Critical
>
> Right now, for authorization 2, the 
> HiveAuthorizationValidator.checkPrivileges(HiveOperationType var1, 
> List<HivePrivilegeObject> var2, List<HivePrivilegeObject> var3, 
> HiveAuthzContext var4) does not contain the parsed sql command string as 
> input. Therefore, Sentry has to parse the command again.
> The API should be changed to include the parsed result as input, so Sentry 
> does not need to parse the sql command string again.





[jira] [Commented] (HIVE-17594) Unit format error in Copy.java

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179084#comment-16179084
 ] 

Hive QA commented on HIVE-17594:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888784/HIVE-17594.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11061 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6975/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6975/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6975/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888784 - PreCommit-HIVE-Build

> Unit format error in Copy.java
> --
>
> Key: HIVE-17594
> URL: https://issues.apache.org/jira/browse/HIVE-17594
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-17594.1.patch
>
>
> In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual 
> value "rows/elapsed/1000.0".
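One reading of the reported inconsistency can be sketched as below. The variable names and the assumption that the elapsed time is measured in milliseconds are inferred from the issue description, not taken from Copy.java itself:

```java
public class RateSketch {

    // left-to-right division: (rows / elapsedMs) / 1000.0 -- not rows per second
    static double buggyRate(double rows, double elapsedMs) {
        return rows / elapsedMs / 1000.0;
    }

    // rows divided by elapsed time expressed in seconds -- actual rows/sec
    static double rowsPerSec(double rows, double elapsedMs) {
        return rows / (elapsedMs / 1000.0);
    }

    public static void main(String[] args) {
        // 100 rows in 2000 ms should report 50 rows/sec
        System.out.println(buggyRate(100, 2000));   // prints 5.0E-5
        System.out.println(rowsPerSec(100, 2000));  // prints 50.0
    }
}
```

Division in Java associates left to right, so without parentheses around `elapsedMs / 1000.0` the printed number does not match the "rows/sec" label.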





[jira] [Updated] (HIVE-17519) Transpose column stats display

2017-09-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17519:

Attachment: HIVE-17519.03.patch

#3) fix minor issues: do not format output for JDBC clients; beeline was 
showing padded outputs, and I'm not sure what format would be desired for beeline.

> Transpose column stats display
> --
>
> Key: HIVE-17519
> URL: https://issues.apache.org/jira/browse/HIVE-17519
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17519.01.patch, HIVE-17519.02.patch, 
> HIVE-17519.03.patch
>
>
> currently {{describe formatted table1 insert_num}} shows the column 
> information in a table-like format, which is very hard to read because 
> there are too many columns
> {code}
> # col_namedata_type   min 
> max num_nulls   distinct_count  
> avg_col_len max_col_len num_trues   
> num_falses  comment bitVector   
>   
>  
> insert_numint 
>   
>   
> from deserializer   
> {code}
> I think it would be better to show the same information like this:
> {code}
> col_name  insert_num  
> data_type int 
> min   
> max   
> num_nulls 
> distinct_count
> avg_col_len   
> max_col_len   
> num_trues 
> num_falses
> comment   from deserializer   
> bitVector 
> {code}





[jira] [Commented] (HIVE-17299) Cannot validate SerDe even if it is in Hadoop classpath

2017-09-25 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179019#comment-16179019
 ] 

Zoltan Haindrich commented on HIVE-17299:
-

[~bartimeux] yes... I think if you are using Spark-on-Hive then Spark will be in 
control of the classpaths, so they might have better insights about 
what could possibly go wrong in this case.

> Cannot validate SerDe even if it is in Hadoop classpath
> ---
>
> Key: HIVE-17299
> URL: https://issues.apache.org/jira/browse/HIVE-17299
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
> Environment: HADOOP_CLASSPATH : 
> /usr/hdp/2.3.4.0-3485/atlas/hook/hive/*:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/webhcat/java-client/hive-webhcat-java-client-1.2.1.2.3.4.0-3485.jar
> 2017-08-10 15:26:38,924 INFO  [main]: zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client 
> 

[jira] [Assigned] (HIVE-17598) HS2/HPL Integration: Output wrapper class

2017-09-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17598:
-

Assignee: Dmitry Tolpeko

> HS2/HPL Integration: Output wrapper class
> -
>
> Key: HIVE-17598
> URL: https://issues.apache.org/jira/browse/HIVE-17598
> Project: Hive
>  Issue Type: Sub-task
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> When running in CLI mode, HPL/SQL outputs the final results to stdout; when 
> running in embedded mode it now has to put them into a result set to be 
> further consumed by HiveServer2 clients. 





[jira] [Assigned] (HIVE-17597) HS2/HPL Integration: Avoid direct JDBC calls in HPL/SQL

2017-09-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17597:
-


> HS2/HPL Integration: Avoid direct JDBC calls in HPL/SQL
> ---
>
> Key: HIVE-17597
> URL: https://issues.apache.org/jira/browse/HIVE-17597
> Project: Hive
>  Issue Type: Sub-task
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> HPL/SQL currently uses JDBC to interact with Hive through HiveServer2. This 
> option will remain for standalone mode (CLI mode), but when HPL/SQL is used 
> within HiveServer2 it will use the internal Hive API for database access. This 
> task is to refactor the JDBC API calls used in HPL/SQL classes and move them to 
> wrapper classes.





[jira] [Commented] (HIVE-17299) Cannot validate SerDe even if it is in Hadoop classpath

2017-09-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179007#comment-16179007
 ] 

Loïc C. Chanel commented on HIVE-17299:
---

Well, as I'm using Hive under Spark, I'm not sure what I should do to 
reproduce the issue, but SerDe is a part of Hive, isn't it?
Still, if you think the issue isn't related to Hive as an execution engine for 
data requests, I can migrate it to the Spark Jira.

> Cannot validate SerDe even if it is in Hadoop classpath
> ---
>
> Key: HIVE-17299
> URL: https://issues.apache.org/jira/browse/HIVE-17299
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 1.2.1
> Environment: HADOOP_CLASSPATH : 
> /usr/hdp/2.3.4.0-3485/atlas/hook/hive/*:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/webhcat/java-client/hive-webhcat-java-client-1.2.1.2.3.4.0-3485.jar
> 2017-08-10 15:26:38,924 INFO  [main]: zookeeper.ZooKeeper 
> (Environment.java:logEnv(100)) - Client 
> 

[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179005#comment-16179005
 ] 

Zoltan Haindrich commented on HIVE-17568:
-

Failures are unrelated.
Relying on the SQL type prevents the regression. [~ashutoshc] could you take 
another look?

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, 
> HIVE-17568.03.patch
>
>
> Joining 2 tables on at least 1 column which is not of the same type 
> (integer/double, for example).
> The calcite expressions require double/integer inputs which will become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}





[jira] [Updated] (HIVE-17596) HiveServer2 and HPL/SQL Integration

2017-09-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-17596:
--
Attachment: HiveServer2 and HPLSQL Integration.pdf

> HiveServer2 and HPL/SQL Integration
> ---
>
> Key: HIVE-17596
> URL: https://issues.apache.org/jira/browse/HIVE-17596
> Project: Hive
>  Issue Type: New Feature
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
> Attachments: HiveServer2 and HPLSQL Integration.pdf
>
>
> The main task for HiveServer2 and HPL/SQL integration.





[jira] [Assigned] (HIVE-17596) HiveServer2 and HPL/SQL Integration

2017-09-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko reassigned HIVE-17596:
-


> HiveServer2 and HPL/SQL Integration
> ---
>
> Key: HIVE-17596
> URL: https://issues.apache.org/jira/browse/HIVE-17596
> Project: Hive
>  Issue Type: New Feature
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>
> The main task for HiveServer2 and HPL/SQL integration.





[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178990#comment-16178990
 ] 

Hive QA commented on HIVE-17568:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/1207/HIVE-17568.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11062 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=226)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6974/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6974/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6974/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 1207 - PreCommit-HIVE-Build

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, 
> HIVE-17568.03.patch
>
>
> Joining 2 tables on at least 1 column which is not of the same type 
> (integer/double, for example).
> The calcite expressions require double/integer inputs which will become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}





[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178858#comment-16178858
 ] 

Hive QA commented on HIVE-17483:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888793/HIVE-17483.8.patch

{color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11068 tests 
executed
*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) 
(batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats]
 (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
 (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] 
(batchId=235)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] 
(batchId=235)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=202)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testJoinThriftSerializeInTasks 
(batchId=228)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testMetadataQueriesWithSerializeThriftInTasks
 (batchId=228)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation (batchId=228)
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation2 (batchId=228)
org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle 
(batchId=223)
org.apache.hive.service.cli.session.TestQueryDisplay.testQueryDisplay 
(batchId=223)
org.apache.hive.service.cli.session.TestQueryDisplay.testWebUI (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6973/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6973/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6973/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888793 - PreCommit-HIVE-Build

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query <queryid>" command that can be run using 
> ODBC/JDBC against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.
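From a client's point of view, the proposed command would be an ordinary statement sent over JDBC. A minimal sketch, with the caveat that the exact "kill query" syntax is what this issue proposes, so the statement string and query id below are illustrative assumptions:

```java
public class KillQuerySketch {

    // build the statement an administrator would submit over a JDBC/ODBC
    // connection to the HiveServer2 instance running the target query
    static String buildKillCommand(String queryId) {
        if (queryId == null || queryId.isEmpty()) {
            throw new IllegalArgumentException("query id is required");
        }
        return "kill query '" + queryId + "'";
    }

    public static void main(String[] args) {
        // over JDBC this would be: statement.execute(buildKillCommand(queryId));
        System.out.println(buildKillCommand("hive_20170925100000_example"));
    }
}
```

The authorization check described above would then run on the server side before the kill is carried out.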





[jira] [Assigned] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load

2017-09-25 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek reassigned HIVE-17595:
--


> Correct DAG for updating the last.repl.id for a database during bootstrap load
> --
>
> Key: HIVE-17595
> URL: https://issues.apache.org/jira/browse/HIVE-17595
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
>
> We update the last.repl.id as a database property. This is done after all the 
> bootstrap tasks to load the relevant data are done, and it is the last task to be 
> run. However, we are currently not setting up the DAG correctly for this task: 
> it is being added as the root task for now, whereas it should be the last 
> task to be run in the DAG. This becomes more important after the inclusion of 
> HIVE-17426, since that will lead to parallel execution, and incorrect DAGs 
> will lead to incorrect results/state of the system. 





[jira] [Updated] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17568:

Attachment: HIVE-17568.03.patch

#3) also add a {{getSqlTypeName()}} comparison

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, 
> HIVE-17568.03.patch
>
>
> Joining 2 tables on at least 1 column which is not of the same type 
> (integer/double, for example).
> The calcite expressions require double/integer inputs which will become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch

2017-09-25 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178773#comment-16178773
 ] 

Zoltan Haindrich commented on HIVE-17568:
-

the problem arises from the fact that the two types differ in nullability:

{code}
import static org.junit.Assert.assertEquals;

import org.apache.calcite.jdbc.JavaTypeFactoryImpl;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.sql.type.SqlTypeName;
import org.junit.Ignore;
import org.junit.Test;

public class CalciteTypeCompare {
  JavaTypeFactoryImpl typeFactory = new JavaTypeFactoryImpl();
  RelDataType b0 = typeFactory.builder().add("t", 
SqlTypeName.BOOLEAN).nullable(true).build();
  RelDataType b1 = typeFactory.builder().add("t", 
SqlTypeName.BOOLEAN).nullable(false).build();
  RelDataType b1x = typeFactory.builder().add("x", 
SqlTypeName.BOOLEAN).nullable(false).build();

  @Test
  @Ignore("this test case will fail; because these types are different")
  public void compareTypesIgnoringNullability() {
assertEquals(b0, b1);
  }

  @Test
  public void typeSqlNameEquals() {
assertEquals(b0.getSqlTypeName(), b1.getSqlTypeName());
  }

  @Test
  public void typeSqlNameEqualsIgnoresFieldName() {
assertEquals(b0.getSqlTypeName(), b1x.getSqlTypeName());
  }
}
{code}

I complemented the patch with an additional check on {{sqlTypeName()}}.

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not 
> valid on the other branch
> 
>
> Key: HIVE-17568
> URL: https://issues.apache.org/jira/browse/HIVE-17568
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch
>
>
> Joining 2 tables on at least 1 column that does not have the same type 
> (integer/double, for example):
> the Calcite expressions require double/integer inputs, which become 
> invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other 
> branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and 
> t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at 
> org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at 
> org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at 
> org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-17594) Unit format error in Copy.java

2017-09-25 Thread Saijin Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178768#comment-16178768
 ] 

Saijin Huang edited comment on HIVE-17594 at 9/25/17 9:25 AM:
--

[~dmtolpeko], I reproduced the problem according to the test file 
copy_to_file.sql.
--
Reproduce:

Before the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec
"
The speed is not correct.

After the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec
"
The speed is correct.


was (Author: txhsj):
[~dmtolpeko], I reproduced the problem according to the test file 
copy_to_file.sql.
--
Reproduce:

Before the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec
"
The speed is not correct.

After the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src1.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec
"
The speed is correct.

> Unit format error in Copy.java
> --
>
> Key: HIVE-17594
> URL: https://issues.apache.org/jira/browse/HIVE-17594
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-17594.1.patch
>
>
> In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual 
> value, "rows/elapsed/1000.0".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17594) Unit format error in Copy.java

2017-09-25 Thread Saijin Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178768#comment-16178768
 ] 

Saijin Huang commented on HIVE-17594:
-

[~dmtolpeko], I reproduced the problem according to the test file 
copy_to_file.sql.
--
Reproduce:

Before the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec
"
The speed is not correct.

After the modification:
1. hive -e "create table src(id int);insert into src values(2)"
2. hplsql -e "copy src to src.txt;"
3. the result is "
Ln:1 Query executed: 1 columns, output file: src1.txt
Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec
"
The speed is correct.
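The formula mismatch is easy to sketch. The snippet below is a minimal, self-contained illustration of the wrong and corrected rate computations; the class and method names are hypothetical, not the actual Copy.java code:

```java
// Minimal sketch of the rows/sec computation (hypothetical names, not Copy.java).
public class RateSketch {
  // Wrong: rows / elapsedMs / 1000.0 divides by 1000 twice over,
  // effectively reporting rows per 1000 seconds.
  static double wrongRate(long rows, long elapsedMs) {
    return rows / (double) elapsedMs / 1000.0;
  }

  // Right: convert elapsed milliseconds to seconds first, then divide.
  static double correctRate(long rows, long elapsedMs) {
    return rows / (elapsedMs / 1000.0);
  }

  public static void main(String[] args) {
    // The numbers from the reproduction above: 1 row copied in 457 ms.
    System.out.printf("wrong:   %.2f rows/sec%n", wrongRate(1, 457));   // 0.00
    System.out.printf("correct: %.2f rows/sec%n", correctRate(1, 457)); // 2.19
  }
}
```

With 1 row in 55 ms, the wrong formula likewise rounds to 0 rows/sec, matching the output before the modification.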

> Unit format error in Copy.java
> --
>
> Key: HIVE-17594
> URL: https://issues.apache.org/jira/browse/HIVE-17594
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Affects Versions: 3.0.0
>Reporter: Saijin Huang
>Assignee: Saijin Huang
>Priority: Minor
> Attachments: HIVE-17594.1.patch
>
>
> In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual 
> value, "rows/elapsed/1000.0".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17452) HPL/SQL function variable block is not initialized

2017-09-25 Thread Dmitry Tolpeko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitry Tolpeko updated HIVE-17452:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> HPL/SQL function variable block is not initialized
> --
>
> Key: HIVE-17452
> URL: https://issues.apache.org/jira/browse/HIVE-17452
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17452.1.patch
>
>
> Variables inside the declaration block are not initialized:
> {code}
> CREATE FUNCTION test1()
>   RETURNS STRING
> AS
>   ret string DEFAULT 'Initial value';
> BEGIN
>   print(ret);
>   ret := 'VALUE IS SET';
>   print(ret);
> END;
> test1();
> {code}
> Output:
> {code}
> ret  
> VALUE IS SET
> {code}
> Should be:
> {code}
> Initial value 
> VALUE IS SET
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17452) HPL/SQL function variable block is not initialized

2017-09-25 Thread Dmitry Tolpeko (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178736#comment-16178736
 ] 

Dmitry Tolpeko commented on HIVE-17452:
---

Committed. 

> HPL/SQL function variable block is not initialized
> --
>
> Key: HIVE-17452
> URL: https://issues.apache.org/jira/browse/HIVE-17452
> Project: Hive
>  Issue Type: Bug
>  Components: hpl/sql
>Reporter: Dmitry Tolpeko
>Assignee: Dmitry Tolpeko
>Priority: Critical
> Attachments: HIVE-17452.1.patch
>
>
> Variables inside the declaration block are not initialized:
> {code}
> CREATE FUNCTION test1()
>   RETURNS STRING
> AS
>   ret string DEFAULT 'Initial value';
> BEGIN
>   print(ret);
>   ret := 'VALUE IS SET';
>   print(ret);
> END;
> test1();
> {code}
> Output:
> {code}
> ret  
> VALUE IS SET
> {code}
> Should be:
> {code}
> Initial value 
> VALUE IS SET
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17590) upgrade hadoop to 2.8.1

2017-09-25 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178733#comment-16178733
 ] 

Peter Vary commented on HIVE-17590:
---

+1

> upgrade hadoop to 2.8.1
> ---
>
> Key: HIVE-17590
> URL: https://issues.apache.org/jira/browse/HIVE-17590
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-17590.01.patch, HIVE-17590.01.patch
>
>
> It seems like Hadoop 2.8.0 has no source attachment:
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.8.0/
> however
> http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.8.1/
> has source.jar files



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.8.patch

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: (was: HIVE-17483.8.patch)

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id

2017-09-25 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-17483:
--
Attachment: HIVE-17483.8.patch

> HS2 kill command to kill queries using query id
> ---
>
> Key: HIVE-17483
> URL: https://issues.apache.org/jira/browse/HIVE-17483
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Teddy Choi
> Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, 
> HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, 
> HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch
>
>
> For administrators, it is important to be able to kill queries if required. 
> Currently, there is no clean way to do it.
> It would help to have a "kill query " command that can be run using 
> odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid 
> running in that instance.
> Authorization will have to be done to ensure that the user that is invoking 
> the API is allowed to perform this action.
> In case of SQL std authorization, this would require admin role.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.

2017-09-25 Thread Junjie Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178697#comment-16178697
 ] 

Junjie Chen commented on HIVE-17593:


Hive strips spaces for the char(length) type and then stores the value to 
Parquet. Other Parquet readers may read the stripped value, which is different 
from the original.

{code}
public void write(Object value) {
  String v = inspector.getPrimitiveJavaObject(value).getStrippedValue();
  recordConsumer.addBinary(Binary.fromString(v));
}
{code}

[~Ferd], do you think this is a valid case? Shouldn't it store the real value? 
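As a minimal, self-contained sketch of why the predicate side must strip too: the {{stripTrailingSpaces}} helper below is hypothetical, standing in for what {{getStrippedValue()}} effectively does on the writer side.

```java
// Sketch of the CHAR padding mismatch (hypothetical helper, not Hive code).
public class CharPredicateSketch {
  // What the writer effectively does to a CHAR value before storing it.
  static String stripTrailingSpaces(String padded) {
    int end = padded.length();
    while (end > 0 && padded.charAt(end - 1) == ' ') {
      end--;
    }
    return padded.substring(0, end);
  }

  public static void main(String[] args) {
    String charValue = "people    ";                // CHAR(10), space-padded
    String stored = stripTrailingSpaces(charValue); // "people" lands in Parquet

    // A predicate literal that keeps the padding no longer matches the stored value.
    System.out.println(stored.equals(charValue));   // false: rows would be missed
    // Stripping the predicate side the same way restores the match.
    System.out.println(stored.equals(stripTrailingSpaces(charValue))); // true
  }
}
```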

> DataWritableWriter strip spaces for CHAR type before writing, but predicate 
> generator doesn't do same thing.
> 
>
> Key: HIVE-17593
> URL: https://issues.apache.org/jira/browse/HIVE-17593
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Junjie Chen
>
> DataWritableWriter strips spaces for the CHAR type before writing, but when 
> generating a predicate it does NOT do the same stripping, which could cause 
> missing data!
> In the current version it doesn't cause missing data, since the predicate is 
> not pushed down to Parquet due to HIVE-17261.
> Please see ConvertAstToSearchArg.java: getTypes treats CHAR and STRING as the 
> same, which builds a predicate with trailing spaces.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


  1   2   >