[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180288#comment-16180288 ]

liyunzhang_intel commented on HIVE-17545:
-----------------------------------------

[~lirui]: thanks for the explanation. If the cache is disabled, then even when equivalent works are combined, the computation for the same work is still executed.

> Make HoS RDD Cacheing Optimization Configurable
> -----------------------------------------------
>
>           Key: HIVE-17545
>           URL: https://issues.apache.org/jira/browse/HIVE-17545
>       Project: Hive
>    Issue Type: Improvement
>    Components: Physical Optimizer, Spark
>      Reporter: Sahil Takiar
>      Assignee: Sahil Takiar
>   Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch
>
> The RDD caching optimization added in HIVE-10550 is enabled by default. We
> should make it configurable in case users want to disable it. We can leave it
> on by default to preserve backwards compatibility.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
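The point in the comment above can be reduced to a toy sketch (not Hive or Spark code; `evaluateWork` and `consumeTwice` are illustrative names): combining two equivalent works only saves computation if the combined work's result is cached, otherwise each downstream consumer recomputes it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of the discussion above: with caching disabled, a shared
// "work" is evaluated once per consumer; with caching enabled, once total.
public class CacheToggleDemo {
    static final AtomicInteger computations = new AtomicInteger();

    // Stands in for evaluating a combined work; the real cost is elsewhere.
    static int evaluateWork(String work) {
        computations.incrementAndGet();
        return work.length();
    }

    // Two consumers read the same work's result.
    static int consumeTwice(String work, boolean cacheEnabled) {
        Map<String, Integer> cache = new HashMap<>();
        int total = 0;
        for (int consumer = 0; consumer < 2; consumer++) {
            if (cacheEnabled) {
                total += cache.computeIfAbsent(work, CacheToggleDemo::evaluateWork);
            } else {
                total += evaluateWork(work);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        consumeTwice("shared-work", false);
        int withoutCache = computations.getAndSet(0);
        consumeTwice("shared-work", true);
        int withCache = computations.get();
        // The result is the same either way; only the work count differs.
        System.out.println(withoutCache + " evaluations vs " + withCache);
        assert withoutCache == 2 && withCache == 1;
    }
}
```

Making the flag configurable, as the issue proposes, lets users trade recomputation against the memory held by cached results.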
[jira] [Commented] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel
[ https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180289#comment-16180289 ]

Tao Li commented on HIVE-17585:
-------------------------------

[~sershe] Thanks for the comments. I thought about that, but was a little concerned about the memory overhead of having a new Hive instance per thread (especially in the embedded metastore scenario), so I chose to go with the singleton approach. But I agree that a thread local will give us the best safety. Regarding latency, the major advantage of loading partitions in parallel is making the HDFS calls in parallel, so synchronizing on the metastore client should not be a big concern.

> Improve thread safety when loading dynamic partitions in parallel
> -----------------------------------------------------------------
>
>           Key: HIVE-17585
>           URL: https://issues.apache.org/jira/browse/HIVE-17585
>       Project: Hive
>    Issue Type: Bug
>      Reporter: Tao Li
>      Assignee: Tao Li
>       Fix For: 3.0.0
>   Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch
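The thread-local alternative discussed above can be sketched as follows. `MetastoreClient` is a hypothetical stand-in, not Hive's actual class; the sketch only shows the mechanism: `ThreadLocal.withInitial` gives each loader thread its own instance (no synchronization on the client, but one instance per thread), whereas a singleton shares one instance and must synchronize.

```java
// Illustration only: MetastoreClient is a hypothetical stand-in for the
// per-session object discussed above, not an actual Hive class.
public class ThreadLocalClientDemo {
    static class MetastoreClient {
        final long ownerThreadId = Thread.currentThread().getId();
    }

    // One client per thread; each thread's first get() creates its own copy.
    static final ThreadLocal<MetastoreClient> CLIENT =
        ThreadLocal.withInitial(MetastoreClient::new);

    public static void main(String[] args) throws InterruptedException {
        MetastoreClient main = CLIENT.get();
        final MetastoreClient[] fromWorker = new MetastoreClient[1];
        Thread worker = new Thread(() -> fromWorker[0] = CLIENT.get());
        worker.start();
        worker.join();
        // Each thread got a distinct instance, so no cross-thread sharing.
        assert main != fromWorker[0];
        assert main.ownerThreadId != fromWorker[0].ownerThreadId;
        System.out.println("distinct per-thread clients: " + (main != fromWorker[0]));
    }
}
```

The memory-overhead concern in the comment corresponds to the one-instance-per-thread cost this pattern implies.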
[jira] [Updated] (HIVE-17605) VectorizedOrcInputFormat initialization is expensive in populating partition values
[ https://issues.apache.org/jira/browse/HIVE-17605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajesh Balamohan updated HIVE-17605:
------------------------------------
    Attachment: VectorizedOrcInputFormat_init.png

> VectorizedOrcInputFormat initialization is expensive in populating partition values
> -----------------------------------------------------------------------------------
>
>           Key: HIVE-17605
>           URL: https://issues.apache.org/jira/browse/HIVE-17605
>       Project: Hive
>    Issue Type: Bug
>      Reporter: Rajesh Balamohan
>      Priority: Minor
>   Attachments: VectorizedOrcInputFormat_init.png
[jira] [Updated] (HIVE-17604) Add druid properties to conf white list
[ https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-17604:
------------------------------------
    Status: Patch Available  (was: Open)

> Add druid properties to conf white list
> ---------------------------------------
>
>              Key: HIVE-17604
>              URL: https://issues.apache.org/jira/browse/HIVE-17604
>          Project: Hive
>       Issue Type: Improvement
>       Components: Configuration, Druid integration
> Affects Versions: 2.2.0, 2.3.0
>         Reporter: Ashutosh Chauhan
>         Assignee: Ashutosh Chauhan
>      Attachments: HIVE-17604.patch
>
> Currently throws:
> Error: Error while processing statement: Cannot modify hive.druid.select.distribute at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1)
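The error above comes from a runtime-modifiable parameter whitelist. A minimal sketch of that kind of check follows; the pattern list, class, and method names are illustrative assumptions, not Hive's actual implementation (Hive's real list lives in its configuration machinery).

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of a runtime whitelist check like the one behind the error above.
public class ConfWhitelistDemo {
    // Illustrative patterns; the second one models the proposed druid addition.
    static final List<Pattern> MODIFIABLE = Arrays.asList(
        Pattern.compile("hive\\.exec\\..*"),
        Pattern.compile("hive\\.druid\\..*")
    );

    static void set(String key, String value) {
        boolean allowed = MODIFIABLE.stream().anyMatch(p -> p.matcher(key).matches());
        if (!allowed) {
            throw new IllegalArgumentException("Cannot modify " + key
                + " at runtime. It is not in list of params that are allowed"
                + " to be modified at runtime");
        }
        // ... apply the setting ...
    }

    public static void main(String[] args) {
        // Succeeds once a matching pattern is whitelisted.
        set("hive.druid.select.distribute", "true");
        boolean rejected = false;
        try {
            set("hive.not.whitelisted", "x");
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        assert rejected;
        System.out.println("whitelist check ok");
    }
}
```

Adding a `hive.druid.*`-style entry to such a list is the shape of the fix the issue describes.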
[jira] [Updated] (HIVE-17604) Add druid properties to conf white list
[ https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-17604:
------------------------------------
    Attachment: HIVE-17604.patch

[~thejas] Can you please review?

> Add druid properties to conf white list
> ---------------------------------------
>
>           Key: HIVE-17604
>           URL: https://issues.apache.org/jira/browse/HIVE-17604
>   Attachments: HIVE-17604.patch
[jira] [Assigned] (HIVE-17604) Add druid properties to conf white list
[ https://issues.apache.org/jira/browse/HIVE-17604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-17604:
---------------------------------------

> Add druid properties to conf white list
> ---------------------------------------
>
>      Key: HIVE-17604
>      URL: https://issues.apache.org/jira/browse/HIVE-17604
> Assignee: Ashutosh Chauhan
[jira] [Updated] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-17568:
------------------------------------
    Resolution: Fixed
 Fix Version/s: 3.0.0
        Status: Resolved  (was: Patch Available)

Pushed to master.

> HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
> ----------------------------------------------------------------------------------------------------
>
>           Key: HIVE-17568
>           URL: https://issues.apache.org/jira/browse/HIVE-17568
>       Project: Hive
>    Issue Type: Bug
>    Components: Logical Optimizer
>      Reporter: Zoltan Haindrich
>      Assignee: Zoltan Haindrich
>       Fix For: 3.0.0
>   Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, HIVE-17568.03.patch
>
> Join two tables on at least one column that does not have the same type (integer/double, for example). The Calcite expressions require double/integer inputs, which become invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other branch.
> query:
> {code}
> create table t1 (v string, k int);
> insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30);
> create table t2 (v string, k double);
> insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30);
> select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and t1.k<15;
> {code}
> results in:
> {code}
> java.lang.AssertionError: type mismatch:
> type1:
> DOUBLE
> type2:
> INTEGER
>   at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31)
>   at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841)
>   at org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941)
>   at org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919)
>   at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112)
>   at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153)
>   at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102)
>   at org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884)
>   at org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882)
>   at org.apache.calcite.rex.RexCall.accept(RexCall.java:104)
>   at org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296)
>   at org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271)
>   at org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98)
>   at org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67)
> [...]
> {code}
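The failure mode above can be reduced to a toy model (not Hive or Calcite code; all names below are illustrative): a predicate built against an INTEGER input reference is transplanted onto the other join branch, where the same column ordinal is DOUBLE, and a strict Litmus-style type-equality check rejects it.

```java
// Toy reduction of the type mismatch above: a predicate recorded as
// referring to an INTEGER column is moved to a branch where the same
// ordinal is DOUBLE, and strict type equality fails.
public class TransitivePredicateTypeDemo {
    enum SqlType { STRING, INTEGER, DOUBLE }

    // A column reference inside a predicate, with the type it was built against.
    static final class InputRef {
        final int ordinal;
        final SqlType type;
        InputRef(int ordinal, SqlType type) { this.ordinal = ordinal; this.type = type; }
    }

    // Mirrors the strict equality the AssertionError in the trace enforces.
    static boolean validAgainst(InputRef ref, SqlType[] rowType) {
        return rowType[ref.ordinal] == ref.type;
    }

    public static void main(String[] args) {
        // t1(v string, k int); t2(v string, k double) as in the query above.
        InputRef kPredicate = new InputRef(1, SqlType.INTEGER); // models t1.k < 15
        SqlType[] t1Row = { SqlType.STRING, SqlType.INTEGER };
        SqlType[] t2Row = { SqlType.STRING, SqlType.DOUBLE };
        assert validAgainst(kPredicate, t1Row);
        assert !validAgainst(kPredicate, t2Row); // type mismatch: DOUBLE vs INTEGER
        System.out.println("predicate is invalid on the other join branch");
    }
}
```

This is why pushing the predicate transitively without a type-compatible rewrite (e.g. a cast) trips the assertion.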
[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180251#comment-16180251 ]

Thejas M Nair commented on HIVE-17483:
--------------------------------------

+1 pending tests

> HS2 kill command to kill queries using query id
> -----------------------------------------------
>
>           Key: HIVE-17483
>           URL: https://issues.apache.org/jira/browse/HIVE-17483
>       Project: Hive
>    Issue Type: Bug
>    Components: HiveServer2
>      Reporter: Thejas M Nair
>      Assignee: Teddy Choi
>   Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch, HIVE-17483.9.patch
>
> For administrators, it is important to be able to kill queries if required. Currently, there is no clean way to do it.
> It would help to have a "kill query <query id>" command that can be run using odbc/jdbc against a HiveServer2 instance, to kill a query with that query id running in that instance.
> Authorization will have to be done to ensure that the user invoking the API is allowed to perform this action. In the case of SQL standard authorization, this would require the admin role.
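The feature described above (authorization check, then lookup-and-cancel by query id) can be sketched as follows. The registry, `Operation` type, and query id format are illustrative assumptions, not HS2's actual classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of killing a running query by id, as the issue describes:
// authorize the caller, then cancel and deregister the matching operation.
public class KillQueryDemo {
    static final class Operation {
        volatile boolean cancelled = false;
        void cancel() { cancelled = true; }
    }

    // Registry of currently running queries, keyed by query id.
    static final Map<String, Operation> RUNNING = new ConcurrentHashMap<>();

    // Returns true if a query with that id was found and cancelled.
    static boolean killQuery(String queryId, boolean callerIsAdmin) {
        if (!callerIsAdmin) {                // authorization happens first
            throw new SecurityException("admin role required");
        }
        Operation op = RUNNING.remove(queryId);
        if (op == null) {
            return false;                    // no such query in this instance
        }
        op.cancel();
        return true;
    }

    public static void main(String[] args) {
        RUNNING.put("hive_20170926_0001", new Operation()); // hypothetical id
        assert killQuery("hive_20170926_0001", true);
        assert !killQuery("no-such-query", true);
        System.out.println("kill by query id ok");
    }
}
```

The admin-role requirement in the description maps to the `callerIsAdmin` gate before any lookup is attempted.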
[jira] [Commented] (HIVE-17603) LLAP: Print counters for llap_text.q for validating LLAP IO usage
[ https://issues.apache.org/jira/browse/HIVE-17603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180241#comment-16180241 ]

Prasanth Jayachandran commented on HIVE-17603:
----------------------------------------------

cc/ [~sershe]

> LLAP: Print counters for llap_text.q for validating LLAP IO usage
> -----------------------------------------------------------------
>
>              Key: HIVE-17603
>              URL: https://issues.apache.org/jira/browse/HIVE-17603
>          Project: Hive
>       Issue Type: Bug
> Affects Versions: 3.0.0
>         Reporter: Prasanth Jayachandran
>
> The LLAP text cache test is not included in the MiniLlap test suite. We could also print LLAP IO counters as part of the q-file output, to validate LLAP IO usage and catch regressions.
[jira] [Commented] (HIVE-17400) Estimate stats in absence of stats for complex types
[ https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180237#comment-16180237 ]

Hive QA commented on HIVE-17400:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12888987/HIVE-17400.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 11061 tests executed

*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lateral_view] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lvj_mapjoin] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_complex] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_complex] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_complex] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_join_result_complex] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_join] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235)
org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout (batchId=288)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6983/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6983/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6983/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12888987 - PreCommit-HIVE-Build

> Estimate stats in absence of stats for complex types
> ----------------------------------------------------
>
>           Key: HIVE-17400
>           URL: https://issues.apache.org/jira/browse/HIVE-17400
>       Project: Hive
>    Issue Type: Improvement
>    Components: Query Planning
>      Reporter: Vineet Garg
>      Assignee: Vineet Garg
>   Attachments: HIVE-17400.1.patch
>
> HIVE-16811 adds support for estimation of stats for primitive types if they don't exist. This JIRA is to extend that support to complex data types.
[jira] [Commented] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180225#comment-16180225 ]

Vineet Garg commented on HIVE-17602:
------------------------------------

cc [~jcamachorodriguez]

> Explain plan not working
> ------------------------
>
>              Key: HIVE-17602
>              URL: https://issues.apache.org/jira/browse/HIVE-17602
>          Project: Hive
>       Issue Type: Bug
>       Components: Query Planning
> Affects Versions: 3.0.0
>         Reporter: Vineet Garg
>         Assignee: Vineet Garg
>         Priority: Critical
>          Fix For: 3.0.0
>
> {code:sql}
> hive> CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 'default') STORED AS TEXTFILE;
> hive> explain select * from src where key > '4';
> Failed with exception wrong number of arguments
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ExplainTask
> {code}
> Error stack in hive.log:
> {noformat}
> 2017-09-25T21:18:59,591 ERROR [726b5e51-f470-4a79-be8c-95b82a6aa85d main] exec.Task: Failed with exception wrong number of arguments
> java.lang.IllegalArgumentException: wrong number of arguments
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:896)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:774)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:797)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputList(ExplainTask.java:635)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:968)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputMap(ExplainTask.java:569)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:954)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:668)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:1052)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.outputStagePlans(ExplainTask.java:1197)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:275)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.getJSONPlan(ExplainTask.java:220)
>   at org.apache.hadoop.hive.ql.exec.ExplainTask.execute(ExplainTask.java:368)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:204)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2190)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1832)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1549)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1304)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1294)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
>   at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
> {noformat}
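The root of the trace above is `java.lang.reflect.Method.invoke` rejecting a call whose argument count does not match the target method's parameter list. A minimal standalone reproduction (unrelated to ExplainTask's actual reflection targets; `describe` is a made-up method) is:

```java
import java.lang.reflect.Method;

// Minimal reproduction of the exception class in the trace above:
// Method.invoke throws IllegalArgumentException when the number of
// supplied arguments does not match the method's parameter count.
public class WrongArgCountDemo {
    public String describe(String name, int depth) {
        return name + "@" + depth;
    }

    public static void main(String[] args) throws Exception {
        Method m = WrongArgCountDemo.class.getMethod("describe", String.class, int.class);
        boolean threw = false;
        try {
            m.invoke(new WrongArgCountDemo(), "plan"); // one arg instead of two
        } catch (IllegalArgumentException e) {
            threw = true;
            System.out.println("caught IllegalArgumentException");
        }
        assert threw;
    }
}
```

This is why a change to a reflectively-invoked getter's signature, without updating the call site, surfaces only at runtime as "wrong number of arguments" rather than at compile time.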
[jira] [Updated] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17602:
-------------------------------
    Attachment: HIVE-17602.1.patch

> Explain plan not working
> ------------------------
>
>           Key: HIVE-17602
>           URL: https://issues.apache.org/jira/browse/HIVE-17602
>   Attachments: HIVE-17602.1.patch
[jira] [Updated] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg updated HIVE-17602:
-------------------------------
    Status: Patch Available  (was: Open)

> Explain plan not working
> ------------------------
>
>           Key: HIVE-17602
>           URL: https://issues.apache.org/jira/browse/HIVE-17602
>   Attachments: HIVE-17602.1.patch
[jira] [Assigned] (HIVE-17602) Explain plan not working
[ https://issues.apache.org/jira/browse/HIVE-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vineet Garg reassigned HIVE-17602:
----------------------------------

> Explain plan not working
> ------------------------
>
>      Key: HIVE-17602
>      URL: https://issues.apache.org/jira/browse/HIVE-17602
> Assignee: Vineet Garg
[jira] [Commented] (HIVE-17502) Reuse of default session should not throw an exception in LLAP w/ Tez
[ https://issues.apache.org/jira/browse/HIVE-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180195#comment-16180195 ] Sergey Shelukhin commented on HIVE-17502: - The config option sounds good to me, esp. if we can limit it to HS2 pool sessions that are not ever directly reused anyway. [~thejas] wdyt? Also, do we have a list or a notion of why sessionstate/hivesessionimpl object couldn't used in parallel? ThreadLocal is not an obstacle in itself but rather an artifact on not having a good dependency injection-type logic for most of Hive compile, similar to other globals. > Reuse of default session should not throw an exception in LLAP w/ Tez > - > > Key: HIVE-17502 > URL: https://issues.apache.org/jira/browse/HIVE-17502 > Project: Hive > Issue Type: Bug > Components: llap, Tez >Affects Versions: 2.1.1, 2.2.0 > Environment: HDP 2.6.1.0-129, Hue 4 >Reporter: Thai Bui >Assignee: Thai Bui > > Hive2 w/ LLAP on Tez doesn't allow a currently used, default session to be > skipped mostly because of this line > https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L365. > However, some clients such as Hue 4, allow multiple sessions to be used per > user. Under this configuration, a Thrift client will send a request to either > reuse or open a new session. The reuse request could include the session id > of a currently used snippet being executed in Hue, this causes HS2 to throw > an exception: > {noformat} > 2017-09-10T17:51:36,548 INFO [Thread-89]: tez.TezSessionPoolManager > (TezSessionPoolManager.java:canWorkWithSameSession(512)) - The current user: > hive, session user: hive > 2017-09-10T17:51:36,549 ERROR [Thread-89]: exec.Task > (TezTask.java:execute(232)) - Failed to execute tez graph. 
> org.apache.hadoop.hive.ql.metadata.HiveException: The pool session > sessionId=5b61a578-6336-41c5-860d-9838166f97fe, queueName=llap, user=hive, > doAs=false, isOpen=true, isDefault=true, expires in 591015330ms should have > been returned to the pool > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.canWorkWithSameSession(TezSessionPoolManager.java:534) > ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > at > org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:544) > ~[hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:147) > [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) > [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) > [hive-exec-2.1.0.2.6.1.0-129.jar:2.1.0.2.6.1.0-129] > {noformat} > Note that every query is issued as a single 'hive' user to share the LLAP > daemon pool, a set of pre-determined number of AMs is initialized at setup > time. Thus, HS2 should allow new sessions from a Thrift client to be used out > of the pool, or an existing session to be skipped and an unused session from > the pool to be returned. The logic to throw an exception in the > `canWorkWithSameSession` doesn't make sense to me. > I have a solution to fix this issue in my local branch at > https://github.com/thaibui/hive/commit/078a521b9d0906fe6c0323b63e567f6eee2f3a70. 
> When applied, the log will look like this: > {noformat} > 2017-09-10T09:15:33,578 INFO [Thread-239]: tez.TezSessionPoolManager > (TezSessionPoolManager.java:canWorkWithSameSession(533)) - Skipping default > session sessionId=6638b1da-0f8a-405e-85f0-9586f484e6de, queueName=llap, > user=hive, doAs=false, isOpen=true, isDefault=true, expires in 591868732ms > since it is being used. > {noformat} > A test case is provided in my branch to demonstrate how it works. If possible > I would like this patch to be applied to versions 2.1, 2.2 and master. Since > we are using 2.1 LLAP in production with Hue 4, this patch is critical to our > success. > Alternatively, if this patch is too broad in scope, I propose adding an > option to allow "skipping of currently used default sessions". With this new > option defaulting to "false", existing behavior won't change unless the option > is turned on. > I will prepare an official patch if this change to master and/or the other > branches is acceptable. I'm not a contributor or committer; this will be my > first time contributing to Hive and the Apache foundation. Any early review > is greatly appreciated, thanks! -- This message was sent by Atlassian JIRA
[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-17483: -- Attachment: HIVE-17483.9.patch Fixed failing unit tests > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, > HIVE-17483.8.patch, HIVE-17483.9.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query " command that can be run using > odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-17586: --- Attachment: HIVE-17586.1.patch > Make HS2 BackgroundOperationPool not fixed > -- > > Key: HIVE-17586 > URL: https://issues.apache.org/jira/browse/HIVE-17586 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-17586.1.patch, HIVE-17586.patch > > > Currently the threadpool for background asynchronous operations has a fixed > size controlled by {{hive.server2.async.exec.threads}}. However, the thread > factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, > which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the > threads are actually never killed, defeating the purpose of the garbage cleanup > noted in the thread class name. On the other hand, since these threads never > go away, significant resources such as threadlocal variables (classloaders, > hiveconfs, etc.) are held on to even if there is no operation running. This > can lead to escalated HS2 memory usage. > Ideally, the threadpool should not be fixed, allowing threads to die out so > resources can be reclaimed. The existing config > {{hive.server2.async.exec.threads}} is treated as the max, and we can add a > min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default value > for this config is -1, which keeps the existing behavior. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
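The "non-fixed pool" described above maps onto a standard ThreadPoolExecutor idiom: with an unbounded work queue the pool never grows past its core size, so one common way to let idle threads die is core == max plus allowCoreThreadTimeOut(true). A minimal sketch under that assumption (class name and keep-alive value are illustrative, not necessarily what the attached patch does):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch of a background pool whose idle threads can exit, reclaiming the
// per-thread resources (threadlocals, classloaders) the JIRA mentions.
// With an unbounded queue, a ThreadPoolExecutor never exceeds its core size,
// so we set core == max and allow the core threads themselves to time out.
class ElasticPool {
    static ThreadPoolExecutor create(int maxThreads, long keepAliveSec) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            maxThreads, maxThreads,
            keepAliveSec, TimeUnit.SECONDS,
            new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);  // idle "core" threads may die
        return pool;
    }
}
```

After keepAliveSec of idleness, every thread in such a pool can terminate, so an idle HS2 would hold no background threads at all.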
[jira] [Commented] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient
[ https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180154#comment-16180154 ] Hive QA commented on HIVE-17111: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888976/HIVE-17111.1.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11063 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[spark_local_queries] (batchId=64) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6982/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6982/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6982/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888976 - PreCommit-HIVE-Build > TestSparkCliDriver does not use LocalHiveSparkClient > > > Key: HIVE-17111 > URL: https://issues.apache.org/jira/browse/HIVE-17111 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17111.1.patch > > > The TestSparkCliDriver sets the spark.master to local-cluster[2,2,1024] but > HoS still decides to use the RemoteHiveSparkClient rather than the > LocalHiveSparkClient. > The issue is with the following check in HiveSparkClientFactory: > {code} > if (master.equals("local") || master.startsWith("local[")) { > // With local spark context, all user sessions share the same spark > context. > return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf)); > } else { > return new RemoteHiveSparkClient(hiveconf, sparkConf); > } > {code} > The {{master.startsWith("local[")}} check sees that the value of spark.master > doesn't start with {{local[}}, so it falls through to the > RemoteHiveSparkClient. > We should fix this so that the LocalHiveSparkClient is used. It should speed > up some of the tests, and it also makes qtests easier to debug since everything > will now be run in the same process. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
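The fix the description asks for amounts to broadening the quoted check so that a local-cluster master is also treated as local. A hedged sketch (the standalone class and helper name are illustrative; HiveSparkClientFactory inlines this check rather than calling a named method):

```java
// Sketch of a predicate that also routes "local-cluster[...]" masters to the
// LocalHiveSparkClient, per the JIRA. The class and method are hypothetical.
class SparkMasterCheck {
    static boolean isLocalMaster(String master) {
        // "local" and "local[N]" run in one JVM; the qtest setting
        // "local-cluster[2,2,1024]" also runs on the local machine and,
        // per the JIRA, should be handled by LocalHiveSparkClient too.
        return master.equals("local")
            || master.startsWith("local[")
            || master.startsWith("local-cluster[");
    }
}
```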
[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180152#comment-16180152 ] Rui Li commented on HIVE-17545: --- [~kellyzly], if you apply two different transformations to an RDD, that RDD will be evaluated twice when we compute the child RDDs. To avoid this, you need to cache the RDD. So if we combine equivalent works w/o caching them, then we can't get rid of duplicated computations. The descriptions of HIVE-10550 and HIVE-10844 also mention how combining works depends on RDD caching. > Make HoS RDD Cacheing Optimization Configurable > --- > > Key: HIVE-17545 > URL: https://issues.apache.org/jira/browse/HIVE-17545 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch > > > The RDD caching optimization added in HIVE-10550 is enabled by default. We > should make it configurable in case users want to disable it. We can leave it > on by default to preserve backwards compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
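Rui's point about recomputation can be illustrated without Spark: a lazily computed parent consumed by two children runs once per child unless its result is memoized, which is exactly what RDD caching buys. A plain-Java analogy (not Hive or Spark code; the memoizer here is a stand-in for rdd.cache()):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Analogy for RDD recomputation: a lazy "parent" consumed by two "children"
// is re-evaluated once per child unless its result is cached (memoized).
class CachingDemo {
    // Wrap a supplier so its first result is remembered, like rdd.cache().
    static Supplier<Integer> memoize(Supplier<Integer> s) {
        return new Supplier<Integer>() {
            Integer cached;
            public Integer get() {
                if (cached == null) cached = s.get();
                return cached;
            }
        };
    }

    public static void main(String[] args) {
        AtomicInteger evals = new AtomicInteger();
        Supplier<Integer> parent = () -> { evals.incrementAndGet(); return 42; };

        parent.get(); parent.get();        // two children, no cache
        System.out.println(evals.get());   // 2: parent computed twice

        evals.set(0);
        Supplier<Integer> cached = memoize(parent);
        cached.get(); cached.get();        // two children, with cache
        System.out.println(evals.get());   // 1: parent computed once
    }
}
```

This is why combining equivalent works without caching the shared RDD cannot eliminate the duplicated computation.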
[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180135#comment-16180135 ] liyunzhang_intel commented on HIVE-17545: - [~lirui]: {quote} if user turns on combining equivalent works and turns off RDD caching, then there won't be perf improvement right? {quote} If a user turns on combining equivalent works, duplicated map/reduce works will be removed. The performance will not change whether RDD caching is enabled or not. In HoS, the cache will be enabled only when the parent spark work has more than [1 child|https://github.com/kellyzly/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java#L264]. If my understanding is not right, please tell me. > Make HoS RDD Cacheing Optimization Configurable > --- > > Key: HIVE-17545 > URL: https://issues.apache.org/jira/browse/HIVE-17545 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch > > > The RDD caching optimization added in HIVE-10550 is enabled by default. We > should make it configurable in case users want to disable it. We can leave it > on by default to preserve backwards compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16426) Query cancel: improve the way to handle files
[ https://issues.apache.org/jira/browse/HIVE-16426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180123#comment-16180123 ] Yongzhi Chen commented on HIVE-16426: - When a cancel happens, releaseDriverContext() will be called, which calls driverCxt.shutdown(). This method shuts down every related running task by calling the task's shutdown method. How shutdown is implemented depends on each task; for example, for a MapReduce task, shutdown mainly kills the job. > Query cancel: improve the way to handle files > - > > Key: HIVE-16426 > URL: https://issues.apache.org/jira/browse/HIVE-16426 > Project: Hive > Issue Type: Improvement >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Fix For: 3.0.0 > > Attachments: HIVE-16426.1.patch > > > 1. Add data structure support to make it easy to check the query cancel status. > 2. Handle query cancel more gracefully. Remove possible file leaks caused by > query cancel, as shown in the following stack: > {noformat} > 2017-04-11 09:57:30,727 WARN org.apache.hadoop.hive.ql.exec.Utilities: > [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories. 
> java.io.InterruptedIOException: Call interrupted > at org.apache.hadoop.ipc.Client.call(Client.java:1496) > at org.apache.hadoop.ipc.Client.call(Client.java:1439) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy20.delete(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy21.delete(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) > at > org.apache.hadoop.hive.ql.exec.Utilities.clearWork(Utilities.java:277) > at > org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:463) > at > org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691) > at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1202) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:303) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:316) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > 3. Add checkpoints to related file operations to improve response time for > query cancelling. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
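The cancellation fan-out Yongzhi describes, DriverContext.shutdown() asking each running task to shut itself down in a task-specific way, can be sketched as follows (a simplified illustration; the names echo Hive's classes but this is not the real code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the cancel path: the context asks every running task to shut
// down, and each Task subclass decides what that means (an MR task would
// kill its job, for example).
abstract class CancellableTask {
    private volatile boolean shutdown = false;
    void shutdown() { shutdown = true; }  // subclasses override, e.g. kill a job
    boolean isShutdown() { return shutdown; }
}

class DriverContextSketch {
    final List<CancellableTask> running = new ArrayList<>();
    synchronized void shutdown() {
        for (CancellableTask t : running) {
            t.shutdown();  // fan out the cancel to every in-flight task
        }
    }
}
```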
[jira] [Commented] (HIVE-17373) Upgrade some dependency versions
[ https://issues.apache.org/jira/browse/HIVE-17373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180122#comment-16180122 ] Sergey Shelukhin commented on HIVE-17373: - An astute observation ;) Should we revert the upgrade and re-do it with the test fixed? I don't think it makes sense to upgrade accumulo if that breaks all accumulo tests. Alternatively, we can remove the test if it's not needed. > Upgrade some dependency versions > > > Key: HIVE-17373 > URL: https://issues.apache.org/jira/browse/HIVE-17373 > Project: Hive > Issue Type: Improvement > Components: Hive >Affects Versions: 3.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Fix For: 3.0.0 > > Attachments: HIVE-17373.1.patch, HIVE-17373.2.patch > > > Upgrade some libraries, including log4j to 2.8.2, accumulo to 1.8.1 and > commons-httpclient to 3.1. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17400) Estimate stats in absence of stats for complex types
[ https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17400: --- Attachment: HIVE-17400.1.patch > Estimate stats in absence of stats for complex types > > > Key: HIVE-17400 > URL: https://issues.apache.org/jira/browse/HIVE-17400 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17400.1.patch > > > HIVE-16811 adds support for estimation of stats for primitive types if it > doesn't exist. This JIRA is to extend that support for complex data types. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17400) Estimate stats in absence of stats for complex types
[ https://issues.apache.org/jira/browse/HIVE-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17400: --- Status: Patch Available (was: Open) > Estimate stats in absence of stats for complex types > > > Key: HIVE-17400 > URL: https://issues.apache.org/jira/browse/HIVE-17400 > Project: Hive > Issue Type: Improvement > Components: Query Planning >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17400.1.patch > > > HIVE-16811 adds support for estimation of stats for primitive types if it > doesn't exist. This JIRA is to extend that support for complex data types. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180111#comment-16180111 ] Rui Li commented on HIVE-17545: --- Hi [~stakiar], since we have a switch to turn off combining equivalent works, why do we need another config to turn off RDD caching? More importantly, if a user turns on combining equivalent works and turns off RDD caching, then there won't be a perf improvement, right? > Make HoS RDD Cacheing Optimization Configurable > --- > > Key: HIVE-17545 > URL: https://issues.apache.org/jira/browse/HIVE-17545 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch > > > The RDD caching optimization added in HIVE-10550 is enabled by default. We > should make it configurable in case users want to disable it. We can leave it > on by default to preserve backwards compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (HIVE-17474) Poor Performance about subquery like DS/query70 on HoS
[ https://issues.apache.org/jira/browse/HIVE-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel resolved HIVE-17474. - Resolution: Not A Problem > Poor Performance about subquery like DS/query70 on HoS > -- > > Key: HIVE-17474 > URL: https://issues.apache.org/jira/browse/HIVE-17474 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel > Attachments: explain.70.after.analyze, explain.70.before.analyze, > explain.70.vec > > > in > [DS/query70|https://github.com/kellyzly/hive-testbench/blob/hive14/sample-queries-tpcds/query70.sql]. > {code} > select > sum(ss_net_profit) as total_sum >,s_state >,s_county >,grouping__id as lochierarchy >, rank() over(partition by grouping__id, case when grouping__id == 2 then > s_state end order by sum(ss_net_profit)) as rank_within_parent > from > store_sales ss join date_dim d1 on d1.d_date_sk = ss.ss_sold_date_sk > join store s on s.s_store_sk = ss.ss_store_sk > where > d1.d_month_seq between 1193 and 1193+11 > and s.s_state in > ( select s_state >from (select s_state as s_state, sum(ss_net_profit), > rank() over ( partition by s_state order by > sum(ss_net_profit) desc) as ranking > from store_sales, store, date_dim > where d_month_seq between 1193 and 1193+11 > and date_dim.d_date_sk = > store_sales.ss_sold_date_sk > and store.s_store_sk = store_sales.ss_store_sk > group by s_state > ) tmp1 >where ranking <= 5 > ) > group by s_state,s_county with rollup > order by >lochierarchy desc > ,case when lochierarchy = 0 then s_state end > ,rank_within_parent > limit 100; > {code} > let's analyze the query, > part1: it calculates the sub-query and get the result of the state which > ss_net_profit is less than 5. > part2: big table store_sales join small tables date_dim, store and get the > result. > part3: part1 join part2 > the problem is on the part3, this is common join. 
The cardinality of part1 > and part2 is low, as the state column has few distinct values > (actually there are 30 distinct values in the store table). With a common > join, a large volume of data is funneled to only 30 reducers. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel
[ https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180102#comment-16180102 ] Sergey Shelukhin commented on HIVE-17585: - Hmm... wouldn't a simpler solution be to run Hive.get() instead of synchronizing a set of methods called by loadPartition, given that Hive object is not thread safe and so the original code uses it incorrectly by calling loadPartition from multiple threads? If someone changes what loadPartition calls, this will break again as far as I can tell. And it's not good to change every method to use synchronized MSC, that will just be a perf hit. Unless I'm missing something. > Improve thread safety when loading dynamic partitions in parallel > - > > Key: HIVE-17585 > URL: https://issues.apache.org/jira/browse/HIVE-17585 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Fix For: 3.0.0 > > Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17585) Improve thread safety when loading dynamic partitions in parallel
[ https://issues.apache.org/jira/browse/HIVE-17585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180102#comment-16180102 ] Sergey Shelukhin edited comment on HIVE-17585 at 9/26/17 2:22 AM: -- Hmm... wouldn't a simpler solution be to run Hive.get() on the callable (make the callable static to make sure it cannot access "this"), given that the Hive object is not thread safe and so the original code uses it incorrectly by calling loadPartition from multiple threads? Right now it's synchronizing a set of methods called by loadPartition; if someone changes what loadPartition calls, this will break again as far as I can tell. And it's not good to change every method to use a synchronized MSC; that will just be a perf hit. Unless I'm missing something. was (Author: sershe): Hmm... wouldn't a simpler solution be to run Hive.get() instead of synchronizing a set of methods called by loadPartition, given that Hive object is not thread safe and so the original code uses it incorrectly by calling loadPartition from multiple threads? If someone changes what loadPartition calls, this will break again as far as I can tell. And it's not good to change every method to use synchronized MSC, that will just be a perf hit. Unless I'm missing something. > Improve thread safety when loading dynamic partitions in parallel > - > > Key: HIVE-17585 > URL: https://issues.apache.org/jira/browse/HIVE-17585 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Fix For: 3.0.0 > > Attachments: HIVE-17585.1.patch, HIVE-17585.2.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
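The Hive.get()-per-thread approach Sergey suggests follows a general pattern: hand each worker thread its own instance of a non-thread-safe object through a ThreadLocal, instead of sharing one instance guarded by locks. A minimal sketch under that assumption (PerThreadClient is a hypothetical stand-in, not Hive's actual class):

```java
// Sketch of the per-thread-instance pattern: get() returns an instance held
// in a ThreadLocal, so each partition-loading thread works against its own
// (non-thread-safe) client rather than a shared, synchronized one.
class PerThreadClient {
    private static final ThreadLocal<PerThreadClient> LOCAL =
        ThreadLocal.withInitial(PerThreadClient::new);

    // Analogous to Hive.get(): same thread -> same instance,
    // different thread -> a different instance, created lazily.
    static PerThreadClient get() { return LOCAL.get(); }

    private PerThreadClient() {}
}
```

The memory trade-off Tao raises is visible in this shape: one instance per live thread instead of one shared singleton.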
[jira] [Commented] (HIVE-17594) Unit format error in Copy.java
[ https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180097#comment-16180097 ] Saijin Huang commented on HIVE-17594: - [~dmtolpeko], the examples are listed above. Can you please take a quick look and commit? > Unit format error in Copy.java > -- > > Key: HIVE-17594 > URL: https://issues.apache.org/jira/browse/HIVE-17594 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang >Priority: Minor > Attachments: HIVE-17594.1.patch > > > In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual > value, "rows/elapsed/1000.0". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17601) improve error handling in LlapServiceDriver
[ https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180087#comment-16180087 ] Hive QA commented on HIVE-17601: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888972/HIVE-17601.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11055 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=242) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6981/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6981/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6981/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888972 - PreCommit-HIVE-Build > improve error handling in LlapServiceDriver > --- > > Key: HIVE-17601 > URL: https://issues.apache.org/jira/browse/HIVE-17601 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17601.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17587) Remove unnecessary filter from getPartitionsFromPartitionIds call
[ https://issues.apache.org/jira/browse/HIVE-17587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180071#comment-16180071 ] Sergey Shelukhin commented on HIVE-17587: - +1 > Remove unnecessary filter from getPartitionsFromPartitionIds call > - > > Key: HIVE-17587 > URL: https://issues.apache.org/jira/browse/HIVE-17587 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-17587.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180063#comment-16180063 ] liyunzhang_intel commented on HIVE-17545: - [~stakiar]: Sounds good, but I don't know why the cache optimization was not made configurable before. [~lirui]: As you are more familiar with the code, can you take some time to look? > Make HoS RDD Cacheing Optimization Configurable > --- > > Key: HIVE-17545 > URL: https://issues.apache.org/jira/browse/HIVE-17545 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch > > > The RDD caching optimization added in HIVE-10550 is enabled by default. We > should make it configurable in case users want to disable it. We can leave it > on by default to preserve backwards compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module
[ https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180020#comment-16180020 ] Vihang Karajgaonkar commented on HIVE-17371: Thanks [~alangates]. [~thejas] or [~vgumashta] Can you please take a look at this change and confirm if this approach makes sense to you? Thanks! > Move tokenstores to metastore module > > > Key: HIVE-17371 > URL: https://issues.apache.org/jira/browse/HIVE-17371 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17371.01.patch > > > The {{getTokenStore}} method will not work for the {{DBTokenStore}} and > {{ZKTokenStore}} since they implement > {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of > {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}} > {code} > private DelegationTokenStore getTokenStore(Configuration conf) throws > IOException { > String tokenStoreClassName = > MetastoreConf.getVar(conf, > MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, ""); > // The second half of this if is to catch cases where users are passing > in a HiveConf for > // configuration. It will have set the default value of > // "hive.cluster.delegation.token.store .class" to > // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its > construction. But this is > // the hive-shims version of the memory store. We want to convert this > to our default value. 
> if (StringUtils.isBlank(tokenStoreClassName) || > > "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) > { > return new MemoryTokenStore(); > } > try { > Class storeClass = > > Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class); > return ReflectionUtils.newInstance(storeClass, conf); > } catch (ClassNotFoundException e) { > throw new IOException("Error initializing delegation token store: " + > tokenStoreClassName, e); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16426) Query cancel: improve the way to handle files
[ https://issues.apache.org/jira/browse/HIVE-16426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180002#comment-16180002 ] Prasanth Jayachandran commented on HIVE-16426: -- I don't understand how this patch handles an already running background task. When a query timeout is set, the timeout monitor will set the operation state to TIMEOUT. With this patch, only the client is provided with a SQLTimeoutException, but the task that is actually executing on the cluster is not interrupted/cleaned up. The same is the case when the user cancels the query with Ctrl+C, isn't it? > Query cancel: improve the way to handle files > - > > Key: HIVE-16426 > URL: https://issues.apache.org/jira/browse/HIVE-16426 > Project: Hive > Issue Type: Improvement >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Fix For: 3.0.0 > > Attachments: HIVE-16426.1.patch > > > 1. Add data structure support to make it easy to check the query cancel status. > 2. Handle query cancel more gracefully. Remove possible file leaks caused by > query cancel, as shown in the following stack: > {noformat} > 2017-04-11 09:57:30,727 WARN org.apache.hadoop.hive.ql.exec.Utilities: > [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories. 
> java.io.InterruptedIOException: Call interrupted > at org.apache.hadoop.ipc.Client.call(Client.java:1496) > at org.apache.hadoop.ipc.Client.call(Client.java:1439) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230) > at com.sun.proxy.$Proxy20.delete(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:535) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy21.delete(Unknown Source) > at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:2059) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:675) > at > org.apache.hadoop.hdfs.DistributedFileSystem$13.doCall(DistributedFileSystem.java:671) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:671) > at > org.apache.hadoop.hive.ql.exec.Utilities.clearWork(Utilities.java:277) > at > org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:463) > at > org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:142) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1978) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1691) > at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1423) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1207) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1202) > at > org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238) > at > org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88) > at > org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:303) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920) > at > org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:316) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > {noformat} > 3. Add checkpoints to related file operations to improve response time for > query cancelling. -- This message was sent by Atlassian JIRA
[jira] [Commented] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.
[ https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180001#comment-16180001 ] Hive QA commented on HIVE-17600: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888960/HIVE-17600.1-branch-2.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 9937 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=244) TestJdbcDriver2 - did not produce a TEST-*.xml file (likely timed out) (batchId=225) TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=167) [acid_globallimit.q,alter_merge_2_orc.q] TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=173) [infer_bucket_sort_reducers_power_two.q,list_bucket_dml_10.q,orc_merge9.q,orc_merge6.q,leftsemijoin_mr.q,bucket6.q,bucketmapjoin7.q,uber_reduce.q,empty_dir_in_table.q,vector_outer_join3.q,index_bitmap_auto.q,vector_outer_join2.q,vector_outer_join1.q,orc_merge1.q,orc_merge_diff_fs.q,load_hdfs_file_with_space_in_the_name.q,scriptfile1_win.q,quotedid_smb.q,truncate_column_buckets.q,orc_merge3.q] TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=174) [infer_bucket_sort_num_buckets.q,gen_udf_example_add10.q,insert_overwrite_directory2.q,orc_merge5.q,bucketmapjoin6.q,import_exported_table.q,vector_outer_join0.q,orc_merge4.q,temp_table_external.q,orc_merge_incompat1.q,root_dir_external_table.q,constprog_semijoin.q,auto_sortmerge_join_16.q,schemeAuthority.q,index_bitmap3.q,external_table_with_space_in_location_path.q,parallel_orderby.q,infer_bucket_sort_map_operators.q,bucketizedhiveinputformat.q,remote_script.q] TestMiniSparkOnYarnCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=175) 
[scriptfile1.q,vector_outer_join5.q,file_with_header_footer.q,bucket4.q,input16_cc.q,bucket5.q,infer_bucket_sort_merge.q,constprog_partitioner.q,orc_merge2.q,reduce_deduplicate.q,schemeAuthority2.q,load_fs2.q,orc_merge8.q,orc_merge_incompat2.q,infer_bucket_sort_bucketed_table.q,vector_outer_join4.q,disable_merge_for_bucketing.q,vector_inner_join.q,orc_merge7.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=118) [bucketmapjoin4.q,bucket_map_join_spark4.q,union21.q,groupby2_noskew.q,timestamp_2.q,date_join1.q,mergejoins.q,smb_mapjoin_11.q,auto_sortmerge_join_3.q,mapjoin_test_outer.q,vectorization_9.q,merge2.q,groupby6_noskew.q,auto_join_without_localtask.q,multi_join_union.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=119) [join_cond_pushdown_unqual4.q,union_remove_7.q,join13.q,join_vc.q,groupby_cube1.q,bucket_map_join_spark2.q,sample3.q,smb_mapjoin_19.q,stats16.q,union23.q,union.q,union31.q,cbo_udf_udaf.q,ptf_decimal.q,bucketmapjoin2.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=120) [parallel_join1.q,union27.q,union12.q,groupby7_map_multi_single_reducer.q,varchar_join1.q,join7.q,join_reorder4.q,skewjoinopt2.q,bucketsortoptimize_insert_2.q,smb_mapjoin_17.q,script_env_var1.q,groupby7_map.q,groupby3.q,bucketsortoptimize_insert_8.q,union20.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=121) [ptf_general_queries.q,auto_join_reordering_values.q,sample2.q,join1.q,decimal_join.q,mapjoin_subquery2.q,join32_lessSize.q,mapjoin1.q,order2.q,skewjoinopt18.q,union_remove_18.q,join25.q,groupby9.q,bucketsortoptimize_insert_6.q,ctas.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=122) 
[groupby_map_ppr.q,nullgroup4_multi_distinct.q,join_rc.q,union14.q,smb_mapjoin_12.q,vector_cast_constant.q,union_remove_4.q,auto_join11.q,load_dyn_part7.q,udaf_collect_set.q,vectorization_12.q,groupby_sort_skew_1.q,groupby_sort_skew_1_23.q,smb_mapjoin_25.q,skewjoinopt12.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=123) [skewjoinopt15.q,auto_join18.q,list_bucket_dml_2.q,input1_limit.q,load_dyn_part3.q,union_remove_14.q,auto_sortmerge_join_14.q,auto_sortmerge_join_15.q,union10.q,bucket_map_join_tez2.q,groupby5_map_skew.q,join_reorder.q,sample1.q,bucketmapjoin8.q,union34.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=124) [avro_joins.q,skewjoinopt16.q,auto_join14.q,vectorization_14.q,auto_join26.q,stats1.q,cbo_stats.q,auto_sortmerge_join_6.q,union22.q,union_remove_24.q,union_view.q,smb_mapjoin_22.q,stats15.q,ptf_matchpath.q,transform_ppr1.q] TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=125)
[jira] [Updated] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient
[ https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17111: Attachment: HIVE-17111.1.patch Attaching a patch that creates a new CLI Driver called {{TestLocalSparkCliDriver}} which sets {{spark.master=local[*]}}. Adding a very simple q test, with a few basic queries. This provides some test coverage for {{LocalHiveSparkClient}}. The main advantage is that this new CLI Driver runs the entire HoS query inside a single process. This makes debugging HoS much easier. Users can set breakpoints in portions of the HoS code that are only invoked at runtime. While this does provide some coverage for {{LocalHiveSparkClient}}, I think the main advantage is that it makes debugging HoS easier for developers, especially new developers who may not be as familiar with the HoS code and want to debug things via an IDE like IntelliJ. The patch doesn't modify anything related to the other Spark CLI Drivers. > TestSparkCliDriver does not use LocalHiveSparkClient > > > Key: HIVE-17111 > URL: https://issues.apache.org/jira/browse/HIVE-17111 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17111.1.patch > > > The TestSparkCliDriver sets the spark.master to local-cluster[2,2,1024] but > HoS still decides to use the RemoteHiveSparkClient rather than the > LocalHiveSparkClient. > The issue is with the following check in HiveSparkClientFactory: > {code} > if (master.equals("local") || master.startsWith("local[")) { > // With local spark context, all user sessions share the same spark > context. > return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf)); > } else { > return new RemoteHiveSparkClient(hiveconf, sparkConf); > } > {code} > The {{master.startsWith("local[")}} check reads the value of spark.master, > sees that {{local-cluster[2,2,1024]}} doesn't start with {{local[}}, and then > decides to use the RemoteHiveSparkClient. 
> We should fix this so that the LocalHiveSparkClient is used. It should speed > up some of the tests, and also makes qtests easier to debug since everything > will now be run in the same process. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
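The failing prefix check can be reproduced in isolation. The sketch below is a standalone illustration of the check quoted in the description; the method name is invented here and is not Hive's:

```java
// Standalone sketch of the master check from HiveSparkClientFactory.
// "isLocalMaster" is a name invented for this illustration.
public class MasterCheck {
    static boolean isLocalMaster(String master) {
        return master.equals("local") || master.startsWith("local[");
    }

    public static void main(String[] args) {
        // "local[*]" passes the check, so LocalHiveSparkClient would be used.
        System.out.println(isLocalMaster("local[*]"));
        // "local-cluster[2,2,1024]" fails it, so the test driver silently
        // falls back to RemoteHiveSparkClient even though everything still
        // runs on the local machine.
        System.out.println(isLocalMaster("local-cluster[2,2,1024]"));
    }
}
```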
[jira] [Updated] (HIVE-17111) TestSparkCliDriver does not use LocalHiveSparkClient
[ https://issues.apache.org/jira/browse/HIVE-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17111: Status: Patch Available (was: Open) > TestSparkCliDriver does not use LocalHiveSparkClient > > > Key: HIVE-17111 > URL: https://issues.apache.org/jira/browse/HIVE-17111 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17111.1.patch > > > The TestSparkCliDriver sets the spark.master to local-cluster[2,2,1024] but > HoS still decides to use the RemoteHiveSparkClient rather than the > LocalHiveSparkClient. > The issue is with the following check in HiveSparkClientFactory: > {code} > if (master.equals("local") || master.startsWith("local[")) { > // With local spark context, all user sessions share the same spark > context. > return LocalHiveSparkClient.getInstance(generateSparkConf(sparkConf)); > } else { > return new RemoteHiveSparkClient(hiveconf, sparkConf); > } > {code} > The {{master.startsWith("local[")}} check reads the value of spark.master, > sees that {{local-cluster[2,2,1024]}} doesn't start with {{local[}}, and then > decides to use the RemoteHiveSparkClient. > We should fix this so that the LocalHiveSparkClient is used. It should speed > up some of the tests, and also makes qtests easier to debug since everything > will now be run in the same process. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17386) support LLAP workload management in HS2 (low level only)
[ https://issues.apache.org/jira/browse/HIVE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179966#comment-16179966 ] Zhiyuan Yang commented on HIVE-17386: - +1 (non-binding). CC [~hagleitn] > support LLAP workload management in HS2 (low level only) > > > Key: HIVE-17386 > URL: https://issues.apache.org/jira/browse/HIVE-17386 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17386.01.only.patch, HIVE-17386.01.patch, > HIVE-17386.01.patch, HIVE-17386.02.patch, HIVE-17386.03.patch, > HIVE-17386.04.patch, HIVE-17386.only.patch, HIVE-17386.patch > > > This makes use of HIVE-17297 and creates building blocks for workload > management policies, etc. > For now, there are no policies - a single yarn queue is designated for all > LLAP query AMs, and the capacity is distributed equally. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17566) Create schema required for workload management.
[ https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179955#comment-16179955 ] Sergey Shelukhin edited comment on HIVE-17566 at 9/25/17 11:36 PM: --- Also can you generate the patch w/o generated code and post on RB? I will review at some point, tomorrow probably. was (Author: sershe): Also can you generate the patch w/o generated code and post on RB? > Create schema required for workload management. > --- > > Key: HIVE-17566 > URL: https://issues.apache.org/jira/browse/HIVE-17566 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17566.01.patch > > > Schema + model changes required for workload management. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17566) Create schema required for workload management.
[ https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179955#comment-16179955 ] Sergey Shelukhin commented on HIVE-17566: - Also can you generate the patch w/o generated code and post on RB? > Create schema required for workload management. > --- > > Key: HIVE-17566 > URL: https://issues.apache.org/jira/browse/HIVE-17566 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17566.01.patch > > > Schema + model changes required for workload management. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17566) Create schema required for workload management.
[ https://issues.apache.org/jira/browse/HIVE-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179952#comment-16179952 ] Sergey Shelukhin commented on HIVE-17566: - [~harishjp] TestSchemaTool failures might be related > Create schema required for workload management. > --- > > Key: HIVE-17566 > URL: https://issues.apache.org/jira/browse/HIVE-17566 > Project: Hive > Issue Type: Sub-task >Reporter: Harish Jaiprakash >Assignee: Harish Jaiprakash > Attachments: HIVE-17566.01.patch > > > Schema + model changes required for workload management. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15212) merge branch into master
[ https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179950#comment-16179950 ] Sergey Shelukhin commented on HIVE-15212: - It looks like IOW works, but multi-IOW doesn't (see the mm_all test - where we I/IOW, IOW/IOW, etc. into the same or different tables). > merge branch into master > > > Key: HIVE-15212 > URL: https://issues.apache.org/jira/browse/HIVE-15212 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, > HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, > HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, > HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, > HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, > HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17601) improve error handling in LlapServiceDriver
[ https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17601: Status: Patch Available (was: Open) > improve error handling in LlapServiceDriver > --- > > Key: HIVE-17601 > URL: https://issues.apache.org/jira/browse/HIVE-17601 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17601.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17601) improve error handling in LlapServiceDriver
[ https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-17601: Attachment: HIVE-17601.patch [~prasanth_j] can you please take a look? This also cleans up some todo > improve error handling in LlapServiceDriver > --- > > Key: HIVE-17601 > URL: https://issues.apache.org/jira/browse/HIVE-17601 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-17601.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17601) improve error handling in LlapServiceDriver
[ https://issues.apache.org/jira/browse/HIVE-17601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-17601: --- > improve error handling in LlapServiceDriver > --- > > Key: HIVE-17601 > URL: https://issues.apache.org/jira/browse/HIVE-17601 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15212) merge branch into master
[ https://issues.apache.org/jira/browse/HIVE-15212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179929#comment-16179929 ] Wei Zheng commented on HIVE-15212: -- [~ekoifman] You at'ed the wrong person ;) Sorry for the late update. I left a todo comment in HiveInputFormat.java:processForWriteIds() {code} // todo for IOW, we also need to count in base dir, if any for (AcidUtils.ParsedDelta delta : dirInfo.getCurrentDirectories()) { Utilities.LOG14535.info("Adding input " + delta.getPath()); finalPaths.add(delta.getPath()); } {code} Here we just need to count in base dir if any. > merge branch into master > > > Key: HIVE-15212 > URL: https://issues.apache.org/jira/browse/HIVE-15212 > Project: Hive > Issue Type: Sub-task >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-15212.01.patch, HIVE-15212.02.patch, > HIVE-15212.03.patch, HIVE-15212.04.patch, HIVE-15212.05.patch, > HIVE-15212.06.patch, HIVE-15212.07.patch, HIVE-15212.08.patch, > HIVE-15212.09.patch, HIVE-15212.10.patch, HIVE-15212.11.patch, > HIVE-15212.12.patch, HIVE-15212.12.patch, HIVE-15212.13.patch, > HIVE-15212.13.patch, HIVE-15212.14.patch, HIVE-15212.15.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179927#comment-16179927 ] Hive QA commented on HIVE-17586: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888949/HIVE-17586.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11055 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[empty_join] (batchId=76) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=242) org.apache.hive.service.cli.session.TestSessionManagerMetrics.testThreadPoolMetrics (batchId=197) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6979/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6979/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6979/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888949 - PreCommit-HIVE-Build > Make HS2 BackgroundOperationPool not fixed > -- > > Key: HIVE-17586 > URL: https://issues.apache.org/jira/browse/HIVE-17586 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-17586.patch > > > Currently the threadpool for background asynchronous operations has a fixed > size controlled by {{hive.server2.async.exec.threads}}. However, the thread > factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, > which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the > threads are actually never killed, defeating the purpose of garbage cleanup as > noted in the thread class name. On the other hand, since these threads never > go away, significant resources such as threadlocal variables (classloaders, > hiveconfs, etc.) are held onto even if there is no operation running. This > can lead to escalated HS2 memory usage. > Ideally, the threadpool should not be fixed, allowing threads to die out so > resources can be reclaimed. The existing config > {{hive.server2.async.exec.threads}} is treated as the max, and we can add a > min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default > value for this config is -1, which keeps the existing behavior. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
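The proposed min/max behavior can be sketched with a plain {{ThreadPoolExecutor}}. The wiring below is an assumption for illustration only, not the actual patch; the config values are passed in as plain ints:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: minThreads plays the role of hive.server2.async.exec.min.threads,
// maxThreads of hive.server2.async.exec.threads. A min of -1 keeps the
// current fixed-pool behavior; otherwise idle threads above the minimum
// die after the keep-alive, so their thread-local state can be reclaimed.
public class AsyncPoolSketch {
    static ThreadPoolExecutor create(int minThreads, int maxThreads) {
        if (minThreads < 0) {
            // Existing behavior: fixed-size pool whose threads never exit.
            return (ThreadPoolExecutor) Executors.newFixedThreadPool(maxThreads);
        }
        // SynchronousQueue makes the pool grow toward maxThreads under load
        // instead of queueing all work behind minThreads core workers.
        return new ThreadPoolExecutor(minThreads, maxThreads,
            60L, TimeUnit.SECONDS, new SynchronousQueue<Runnable>());
    }

    public static void main(String[] args) {
        // Default (-1): core size equals max, threads are permanent.
        System.out.println(create(-1, 100).getCorePoolSize());
        // With a min of 2: only 2 core threads; the rest time out when idle.
        System.out.println(create(2, 100).getCorePoolSize());
    }
}
```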
[jira] [Commented] (HIVE-17371) Move tokenstores to metastore module
[ https://issues.apache.org/jira/browse/HIVE-17371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179923#comment-16179923 ] Alan Gates commented on HIVE-17371: --- I'm fine with this approach. But we should get buy off from [~thejas] and [~vgumashta] as they spend the most time in HS2. > Move tokenstores to metastore module > > > Key: HIVE-17371 > URL: https://issues.apache.org/jira/browse/HIVE-17371 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17371.01.patch > > > The {{getTokenStore}} method will not work for the {{DBTokenStore}} and > {{ZKTokenStore}} since they implement > {{org.apache.hadoop.hive.thrift.DelegationTokenStore}} instead of > {{org.apache.hadoop.hive.metastore.security.DelegationTokenStore}} > {code} > private DelegationTokenStore getTokenStore(Configuration conf) throws > IOException { > String tokenStoreClassName = > MetastoreConf.getVar(conf, > MetastoreConf.ConfVars.DELEGATION_TOKEN_STORE_CLS, ""); > // The second half of this if is to catch cases where users are passing > in a HiveConf for > // configuration. It will have set the default value of > // "hive.cluster.delegation.token.store .class" to > // "org.apache.hadoop.hive.thrift.MemoryTokenStore" as part of its > construction. But this is > // the hive-shims version of the memory store. We want to convert this > to our default value. 
> if (StringUtils.isBlank(tokenStoreClassName) || > > "org.apache.hadoop.hive.thrift.MemoryTokenStore".equals(tokenStoreClassName)) > { > return new MemoryTokenStore(); > } > try { > Class storeClass = > > Class.forName(tokenStoreClassName).asSubclass(DelegationTokenStore.class); > return ReflectionUtils.newInstance(storeClass, conf); > } catch (ClassNotFoundException e) { > throw new IOException("Error initializing delegation token store: " + > tokenStoreClassName, e); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179879#comment-16179879 ] Thejas M Nair commented on HIVE-17483: -- Changes look good, but some of the UT failures look related (the jdbc and service package ones). Can you please take a look ? > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query " command that can be run using > odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.
[ https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17600: Status: Patch Available (was: Open) > Make OrcFile's "enforceBufferSize" user-settable. > - > > Key: HIVE-17600 > URL: https://issues.apache.org/jira/browse/HIVE-17600 > Project: Hive > Issue Type: Improvement > Components: ORC >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-17600.1-branch-2.2.patch > > > This is a duplicate of ORC-238, but it applies to {{branch-2.2}}. > Compression buffer-sizes in OrcFile are computed at runtime, except when > enforceBufferSize is set. The only snag here is that this flag can't be set > by the user. > When runtime-computed buffer-sizes are not optimal (for some reason), the > user has no way to work around it by setting a custom value. > I have a patch that we use at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.
[ https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated HIVE-17600: Attachment: HIVE-17600.1-branch-2.2.patch > Make OrcFile's "enforceBufferSize" user-settable. > - > > Key: HIVE-17600 > URL: https://issues.apache.org/jira/browse/HIVE-17600 > Project: Hive > Issue Type: Improvement > Components: ORC >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-17600.1-branch-2.2.patch > > > This is a duplicate of ORC-238, but it applies to {{branch-2.2}}. > Compression buffer-sizes in OrcFile are computed at runtime, except when > enforceBufferSize is set. The only snag here is that this flag can't be set > by the user. > When runtime-computed buffer-sizes are not optimal (for some reason), the > user has no way to work around it by setting a custom value. > I have a patch that we use at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179866#comment-16179866 ] Sahil Takiar commented on HIVE-17543: - [~hsubramaniyan], [~pvary], [~lirui] could you review? Main things of note: * I renamed the current {{TestPerfCliDriver}} to {{TestTezPerfCliDriver}} and created a new {{TestSparkPerfCliDriver}} * I set {{hive.auto.convert.join}} to {{true}} for the {{TestSparkCliDriver}} since that's closer to what is run in production * I had to make some changes to {{SparkCrossProductCheck}} and {{SparkWork}} to avoid some flakiness in the tests * There are two TPC-DS queries that I couldn't get to work with HoS - query14 and query64 - I'll file follow-up tasks for fixing them * I haven't gone through the explain plans of every single TPC-DS query for HoS, mainly because that will take a really long time, but I plan to do it as a follow-up task; committing this now will give us better regression testing > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch, HIVE-17543.4.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179868#comment-16179868 ] Sahil Takiar commented on HIVE-17543: - And {{TestTezPerfCliDriver.testCliDriver[query14]}} was already failing. > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch, HIVE-17543.4.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179834#comment-16179834 ] Hive QA commented on HIVE-17543: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888937/HIVE-17543.4.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11156 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[bucketizedhiveinputformat] (batchId=170) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=239) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=202) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6978/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6978/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6978/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888937 - PreCommit-HIVE-Build > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch, HIVE-17543.4.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17600) Make OrcFile's "enforceBufferSize" user-settable.
[ https://issues.apache.org/jira/browse/HIVE-17600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan reassigned HIVE-17600: --- > Make OrcFile's "enforceBufferSize" user-settable. > - > > Key: HIVE-17600 > URL: https://issues.apache.org/jira/browse/HIVE-17600 > Project: Hive > Issue Type: Improvement > Components: ORC >Affects Versions: 2.2.0, 3.0.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > > This is a duplicate of ORC-238, but it applies to {{branch-2.2}}. > Compression buffer-sizes in OrcFile are computed at runtime, except when > enforceBufferSize is set. The only snag here is that this flag can't be set > by the user. > When runtime-computed buffer-sizes are not optimal (for some reason), the > user has no way to work around it by setting a custom value. > I have a patch that we use at Yahoo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master, thanks for reviewing [~ashutoshc] > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Resolution: Fixed Status: Resolved (was: Patch Available) Pushed to master, thanks for reviewing [~ashutoshc] > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
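The distinction HIVE-17536 draws can be sketched as follows (a hypothetical helper, not the actual {{StatsUtil}} code; the {{"numRows"}} key and the -1 sentinel mirror what the issue proposes, everything else here is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class BasicStatsSketch {
    // Sentinel distinguishing "no statistics in the metastore" from a
    // genuine row count of zero.
    static final long STATS_UNAVAILABLE = -1L;

    // Hypothetical stand-in for getBasicStatForTable: a missing parameter
    // entry means stats were never collected, which must not be conflated
    // with a real numRows == 0.
    static long getRowCount(Map<String, String> tableParams) {
        if (tableParams == null || !tableParams.containsKey("numRows")) {
            return STATS_UNAVAILABLE;
        }
        return Long.parseLong(tableParams.get("numRows"));
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("numRows", "0");
        System.out.println(getRowCount(params)); // a real, empty table
        System.out.println(getRowCount(null));   // stats absent: sentinel
    }
}
```

Callers can then treat the sentinel as "unknown" and fall back to estimation, instead of planning as if the table were empty.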
[jira] [Updated] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17538: --- Fix Version/s: 3.0.0 > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-17586: --- Attachment: HIVE-17586.patch > Make HS2 BackgroundOperationPool not fixed > -- > > Key: HIVE-17586 > URL: https://issues.apache.org/jira/browse/HIVE-17586 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-17586.patch > > > Currently the threadpool for background asynchronous operations has a fixed > size controlled by {{hive.server2.async.exec.threads}}. However, the thread > factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, > which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the > threads are never actually killed, defeating the purpose of garbage cleanup as > noted in the thread class name. On the other hand, since these threads never > go away, significant resources such as threadlocal variables (classloaders, > hiveconfs, etc.) are held even when no operation is running. This > can lead to escalated HS2 memory usage. > Ideally, the threadpool should not be fixed, allowing threads to die out so > resources can be reclaimed. The existing config > {{hive.server2.async.exec.threads}} is treated as the max, and we can add a > min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default value > for this config is -1, which keeps the existing behavior. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17586) Make HS2 BackgroundOperationPool not fixed
[ https://issues.apache.org/jira/browse/HIVE-17586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-17586: --- Status: Patch Available (was: Open) > Make HS2 BackgroundOperationPool not fixed > -- > > Key: HIVE-17586 > URL: https://issues.apache.org/jira/browse/HIVE-17586 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-17586.patch > > > Currently the threadpool for background asynchronous operations has a fixed > size controlled by {{hive.server2.async.exec.threads}}. However, the thread > factory supplied for this threadpool is {{ThreadFactoryWithGarbageCleanup}}, > which creates ThreadWithGarbageCleanup. Since this is a fixed threadpool, the > threads are never actually killed, defeating the purpose of garbage cleanup as > noted in the thread class name. On the other hand, since these threads never > go away, significant resources such as threadlocal variables (classloaders, > hiveconfs, etc.) are held even when no operation is running. This > can lead to escalated HS2 memory usage. > Ideally, the threadpool should not be fixed, allowing threads to die out so > resources can be reclaimed. The existing config > {{hive.server2.async.exec.threads}} is treated as the max, and we can add a > min for the threadpool, {{hive.server2.async.exec.min.threads}}. The default value > for this config is -1, which keeps the existing behavior. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
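The pool behavior HIVE-17586 proposes can be sketched with a plain {{ThreadPoolExecutor}} (a minimal illustration, not HS2's actual code; the min/max arguments stand in for the proposed {{hive.server2.async.exec.min.threads}} and the existing {{hive.server2.async.exec.threads}} settings):

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BackgroundPoolSketch {
    // Unlike Executors.newFixedThreadPool(n), whose threads are never
    // retired (the leak described in the issue), this pool lets threads
    // above the minimum die after 60s idle, so their thread-locals
    // (classloaders, conf objects, ...) become collectable.
    static ThreadPoolExecutor newBackgroundPool(int minThreads, int maxThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                minThreads, maxThreads,
                60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
        // When the configured minimum is 0, even "core" threads may time out.
        pool.allowCoreThreadTimeOut(minThreads == 0);
        return pool;
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newBackgroundPool(1, 4);
        pool.execute(() -> System.out.println("background op"));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Note the trade-off this sketch glosses over: with a {{SynchronousQueue}}, submissions beyond maxThreads are rejected rather than queued, so a real implementation also needs a queueing or rejection policy.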
[jira] [Updated] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-17536: --- Fix Version/s: 3.0.0 > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17536) StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics or zero stats
[ https://issues.apache.org/jira/browse/HIVE-17536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179745#comment-16179745 ] Ashutosh Chauhan commented on HIVE-17536: - +1 > StatsUtil::getBasicStatForTable doesn't distinguish b/w absence of statistics > or zero stats > --- > > Key: HIVE-17536 > URL: https://issues.apache.org/jira/browse/HIVE-17536 > Project: Hive > Issue Type: Improvement > Components: Statistics >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17536.1.patch, HIVE-17536.2.patch, > HIVE-17536.3.patch, HIVE-17536.4.patch, HIVE-17536.5.patch, HIVE-17536.6.patch > > > This method returns zero for both of the following cases: > * Statistics are missing in metastore > * Actual stats e.g. number of rows are zero > It'll be good for this method to return e.g. -1 in absence of statistics > instead of assuming it to be zero. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17538) Enhance estimation of stats to estimate even if only one column is missing stats
[ https://issues.apache.org/jira/browse/HIVE-17538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179744#comment-16179744 ] Ashutosh Chauhan commented on HIVE-17538: - +1 > Enhance estimation of stats to estimate even if only one column is missing > stats > > > Key: HIVE-17538 > URL: https://issues.apache.org/jira/browse/HIVE-17538 > Project: Hive > Issue Type: Improvement >Reporter: Vineet Garg >Assignee: Vineet Garg > Attachments: HIVE-17538.1.patch, HIVE-17538.2.patch, > HIVE-17538.3.patch > > > HIVE-16811 provided support for estimating statistics in absence of stats. > But that estimation is done if and only if statistics are missing for all > columns. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17543: Attachment: HIVE-17543.4.patch > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch, HIVE-17543.4.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179648#comment-16179648 ] Hive QA commented on HIVE-17543: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888913/HIVE-17543.3.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11156 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query58] (batchId=241) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=239) org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=239) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=202) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6977/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6977/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6977/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888913 - PreCommit-HIVE-Build > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17545) Make HoS RDD Cacheing Optimization Configurable
[ https://issues.apache.org/jira/browse/HIVE-17545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179533#comment-16179533 ] Sahil Takiar commented on HIVE-17545: - I think the advantages here are (1) if there are any bugs in the RDD caching logic (right now it may be simple, but in the future the logic to cache things may be configurable), or (2) there may be scenarios where users don't want to cache the data - maybe they don't have much disk space available and would rather recompute the RDD vs storing it. > Make HoS RDD Cacheing Optimization Configurable > --- > > Key: HIVE-17545 > URL: https://issues.apache.org/jira/browse/HIVE-17545 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer, Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17545.1.patch, HIVE-17545.2.patch > > > The RDD cacheing optimization add in HIVE-10550 is enabled by default. We > should make it configurable in case users want to disable it. We can leave it > on by default to preserve backwards compatibility. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17588) LlapRowRecordReader doing name-based field lookup for every column of every row
[ https://issues.apache.org/jira/browse/HIVE-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179514#comment-16179514 ] Prasanth Jayachandran commented on HIVE-17588: -- lgtm, +1 > LlapRowRecordReader doing name-based field lookup for every column of every > row > --- > > Key: HIVE-17588 > URL: https://issues.apache.org/jira/browse/HIVE-17588 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17588.1.patch > > > setRowFromStruct() is using > StructObjectInspector.getStructFieldRef(fieldName), which does a name-based > lookup - this can be changed to do an index-based lookup which should be > faster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
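The speed-up under review in HIVE-17588 can be illustrated with a toy stand-in (hypothetical types, not the real ObjectInspector API): resolve each column's position once, outside the row loop, instead of doing a name-based lookup per column per row.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FieldLookupSketch {
    // Name-based: a hash lookup for every column of every row -- the
    // pattern setRowFromStruct() was using via getStructFieldRef(name).
    static long sumByName(List<Map<String, Long>> rows, List<String> cols) {
        long total = 0;
        for (Map<String, Long> row : rows) {
            for (String col : cols) {
                total += row.get(col); // hash lookup on the hot path
            }
        }
        return total;
    }

    // Index-based: field positions are resolved once up front, so the
    // per-row loop is plain positional access.
    static long sumByIndex(List<long[]> rows, int[] fieldIndexes) {
        long total = 0;
        for (long[] row : rows) {
            for (int i : fieldIndexes) {
                total += row[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> row = new LinkedHashMap<>();
        row.put("a", 1L);
        row.put("b", 2L);
        long byName = sumByName(Arrays.asList(row), Arrays.asList("a", "b"));
        long byIndex = sumByIndex(
                Arrays.asList(new long[]{1L, 2L}), new int[]{0, 1});
        System.out.println(byName == byIndex); // same answer, cheaper loop
    }
}
```

Both paths produce the same result; the index-based one simply hoists the name resolution out of the per-row work.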
[jira] [Commented] (HIVE-17489) Separate client-facing and server-side Kerberos principals, to support HA
[ https://issues.apache.org/jira/browse/HIVE-17489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179487#comment-16179487 ] Mithun Radhakrishnan commented on HIVE-17489: - Ok, I think I've fixed the failures related to this patch. Would [~thejas] mind taking a look at this one? :] > Separate client-facing and server-side Kerberos principals, to support HA > - > > Key: HIVE-17489 > URL: https://issues.apache.org/jira/browse/HIVE-17489 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Mithun Radhakrishnan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-17489.2-branch-2.patch, HIVE-17489.2.patch, > HIVE-17489.2.patch, HIVE-17489.3-branch-2.patch, HIVE-17489.3.patch, > HIVE-17489.4-branch-2.patch, HIVE-17489.4.patch > > > On deployments of the Hive metastore where a farm of servers is fronted by a > VIP, the hostname of the VIP (e.g. {{mycluster-hcat.blue.myth.net}}) will > differ from the actual boxen in the farm (e.g. > {{mycluster-hcat-\[0..3\].blue.myth.net}}). > Such a deployment messes up Kerberos auth, with principals like > {{hcat/mycluster-hcat.blue.myth@grid.myth.net}}. Host-based checks will > disallow servers behind the VIP from using the VIP's hostname in its > principal when accessing, say, HDFS. > The solution would be to decouple the server-side principal (used to access > other services like HDFS as a client) from the client-facing principal (used > from Hive-client, BeeLine, etc.). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17576) Improve progress-reporting in TezProcessor
[ https://issues.apache.org/jira/browse/HIVE-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179481#comment-16179481 ] Mithun Radhakrishnan commented on HIVE-17576: - The test failures are unrelated. [~owen.omalley], [~thejas], what might be the best version of this patch to go in? With or without reflection? (It is foreseeable that there might be deploys with outdated Tez versions that don't include the {{ProgressHelper}} API.) > Improve progress-reporting in TezProcessor > -- > > Key: HIVE-17576 > URL: https://issues.apache.org/jira/browse/HIVE-17576 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0, 3.0.0, 2.4.0 >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Attachments: HIVE-17576.1.patch, HIVE-17576.2-branch-2.patch, > HIVE-17576.2.patch > > > Another one on behalf of [~selinazh] and [~cdrome]. Following the example in > [Apache Tez's > {{MapProcessor}}|https://github.com/apache/tez/blob/247719d7314232f680f028f4e1a19370ffb7b1bb/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/processor/map/MapProcessor.java#L88], > {{TezProcessor}} ought to use {{ProgressHelper}} to report progress for a > Tez task. As per [~kshukla]'s advice, > {quote} > Tez... provides {{getProgress()}} API for {{AbstractLogicalInput(s)}} which > will give the correct progress value for a given Input. The TezProcessor(s) > in Hive should use this to do something similar to what MapProcessor in Tez > does today, which is use/override ProgressHelper to get the input progress > and then set the progress on the processorContext. > ... > The default behavior of the ProgressHelper class sets the processor progress > to be the average of progress values from all inputs. > {quote} > This code is -whacked from- *inspired by* {{MapProcessor}}'s use of > {{ProgressHelper}}. > (For my reference, YHIVE-978.) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
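The reflection option weighed in the HIVE-17576 comment can be sketched like this (illustrative only: the shim loads Tez's {{ProgressHelper}} when it is on the classpath and silently degrades on older Tez deployments; the class name follows the Tez example cited above, everything else is an assumption):

```java
import java.lang.reflect.Constructor;

public class ProgressHelperShim {
    // Attempt to instantiate Tez's ProgressHelper via reflection, so the
    // same Hive jar runs against Tez versions that predate the API.
    static Object tryNewProgressHelper(Object... ctorArgs) {
        try {
            Class<?> clazz = Class.forName(
                    "org.apache.tez.common.ProgressHelper");
            // Illustrative: a real shim would select the constructor
            // matching ctorArgs rather than take the first public one.
            Constructor<?> ctor = clazz.getConstructors()[0];
            return ctor.newInstance(ctorArgs);
        } catch (Exception | LinkageError e) {
            // Older Tez: fall back to not reporting fine-grained progress.
            return null;
        }
    }

    public static void main(String[] args) {
        Object helper = tryNewProgressHelper();
        System.out.println(helper == null
                ? "ProgressHelper unavailable; skipping progress reporting"
                : "ProgressHelper active");
    }
}
```

The non-reflective alternative is simpler and faster but ties the build to a Tez version that ships the API, which is exactly the trade-off the comment asks about.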
[jira] [Updated] (HIVE-16455) ADD JAR command leaks JAR Files
[ https://issues.apache.org/jira/browse/HIVE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-16455: Resolution: Duplicate Status: Resolved (was: Patch Available) This issue has been fixed by HIVE-11878 > ADD JAR command leaks JAR Files > --- > > Key: HIVE-16455 > URL: https://issues.apache.org/jira/browse/HIVE-16455 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-16455.1.patch > > > HiveServer2 is leaking file handles when using ADD JAR statement and the JAR > file added is not used in the query itself. > {noformat} > beeline> !connect jdbc:hive2://localhost:1 admin > 0: jdbc:hive2://localhost:1> create table test_leak (a int); > 0: jdbc:hive2://localhost:1> insert into test_leak Values (1); > -- Exit beeline terminal; Find PID of HiveServer2 > [root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l > 0 > [root@host-10-17-80-111 ~]# beeline -u jdbc:hive2://localhost:1/default > -n admin > And run the command "ADD JAR hdfs:///tmp/hive-contrib.jar; select * from > test_leak" > [root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)" | wc -l > 1 > java29588 hive 391u REG 252,3125987 2099944 > /tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted) > java29588 hive 392u REG 252,3125987 2099946 > /tmp/eb3184ad-7f15-4a77-a10d-87717ae634d1_resources/hive-contrib.jar (deleted) > java29588 hive 393r REG 252,3125987 2099825 > /tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted) > java29588 hive 394r REG 252,3125987 2099833 > /tmp/5153dd4a-a606-4f53-b02c-d606e7e56985_resources/hive-contrib.jar (deleted) > java29588 hive 395r REG 252,3125987 2099827 > /tmp/ff3cdb05-917f-43c0-830a-b293bf397a23_resources/hive-contrib.jar (deleted) > java29588 hive 396r REG 252,3125987 2099822 > /tmp/60531b66-5985-421e-8eb5-eeac31fdf964_resources/hive-contrib.jar (deleted) > java29588 hive 397r REG 252,3125987 
2099831 > /tmp/78878921-455c-438c-9735-447566ed8381_resources/hive-contrib.jar (deleted) > java29588 hive 399r REG 252,3125987 2099835 > /tmp/0e5d7990-30cc-4248-9058-587f7f1ff211_resources/hive-contrib.jar (deleted) > {noformat} > You can see that the session directory (and therefore anything in it) is set > to delete only on exit. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17588) LlapRowRecordReader doing name-based field lookup for every column of every row
[ https://issues.apache.org/jira/browse/HIVE-17588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179456#comment-16179456 ] Jason Dere commented on HIVE-17588: --- [~prasanth_j], can you review? > LlapRowRecordReader doing name-based field lookup for every column of every > row > --- > > Key: HIVE-17588 > URL: https://issues.apache.org/jira/browse/HIVE-17588 > Project: Hive > Issue Type: Bug >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17588.1.patch > > > setRowFromStruct() is using > StructObjectInspector.getStructFieldRef(fieldName), which does a name-based > lookup - this can be changed to do an index-based lookup which should be > faster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17157) Add InterfaceAudience and InterfaceStability annotations for ObjectInspector APIs
[ https://issues.apache.org/jira/browse/HIVE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179453#comment-16179453 ] Aihua Xu commented on HIVE-17157: - +1. > Add InterfaceAudience and InterfaceStability annotations for ObjectInspector > APIs > - > > Key: HIVE-17157 > URL: https://issues.apache.org/jira/browse/HIVE-17157 > Project: Hive > Issue Type: Sub-task > Components: Serializers/Deserializers >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17157.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17543) Enable PerfCliDriver for HoS
[ https://issues.apache.org/jira/browse/HIVE-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sahil Takiar updated HIVE-17543: Attachment: HIVE-17543.3.patch > Enable PerfCliDriver for HoS > > > Key: HIVE-17543 > URL: https://issues.apache.org/jira/browse/HIVE-17543 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Sahil Takiar >Assignee: Sahil Takiar > Attachments: HIVE-17543.1.patch, HIVE-17543.2.patch, > HIVE-17543.3.patch > > > The PerfCliDriver contains .q files for TPC-DS queries. It doesn't actually > run them, but it does generate explains for them. It also tricks HMS into > thinking it's a 30 TB TPC-DS dataset so that the explain plan triggers certain > optimizations. > Right now this only runs on Hive-on-Tez; we should enable it for HoS too. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17580) Remove dependency of get_fields_with_environment_context API to serde
[ https://issues.apache.org/jira/browse/HIVE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179431#comment-16179431 ] Alan Gates commented on HIVE-17580: --- I'm +1 on using the SARGs for this, but I think we will have to continue to provide the current version of get_fields_with_environment_context for backwards compatibility. So we should find something that works for it as well. It makes sense to do that first and add a SARGs version of the call later. > Remove dependency of get_fields_with_environment_context API to serde > - > > Key: HIVE-17580 > URL: https://issues.apache.org/jira/browse/HIVE-17580 > Project: Hive > Issue Type: Sub-task > Components: Metastore >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > The {{get_fields_with_environment_context}} metastore API uses the {{Deserializer}} > class to access field metadata for the cases where it is stored along > with the data files (Avro tables). The problem is that the Deserializer class is > defined in the hive-serde module, and in order to make the metastore independent of > Hive we will have to remove this dependency (at least we should change it to a > runtime dependency instead of a compile-time one). > The other option is to investigate whether we can use SearchArgument to provide this > functionality. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179319#comment-16179319 ] Ashutosh Chauhan commented on HIVE-17568: - +1 > HiveJoinPushTransitivePredicatesRule may exchange predicates which are not > valid on the other branch > > > Key: HIVE-17568 > URL: https://issues.apache.org/jira/browse/HIVE-17568 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, > HIVE-17568.03.patch > > > Joining 2 tables on at least 1 column which is not the same type ; > (integer/double for example). > The calcite expressions require double/integer inputs which will became > invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other > branch. > query: > {code} > create table t1 (v string, k int); > insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30); > create table t2 (v string, k double); > insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30); > select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and > t1.k<15; > {code} > results in: > {code} > java.lang.AssertionError: type mismatch: > type1: > DOUBLE > type2: > INTEGER > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919) > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) > at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153) > at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884) > at > 
org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:104) > at > org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296) > at > org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271) > at > org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98) > at > org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67) > [...] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17519) Transpose column stats display
[ https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179192#comment-16179192 ] Hive QA commented on HIVE-17519: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/1271/HIVE-17519.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11055 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join20] (batchId=85) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_with_constraints] (batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[escape_comments] (batchId=73) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[show_indexes_syntax] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[unicode_comments] (batchId=37) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_opt_vectorization] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver (batchId=242) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6976/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6976/console Test logs: 
http://104.198.109.242/logs/PreCommit-HIVE-Build-6976/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 1271 - PreCommit-HIVE-Build > Transpose column stats display > -- > > Key: HIVE-17519 > URL: https://issues.apache.org/jira/browse/HIVE-17519 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17519.01.patch, HIVE-17519.02.patch, > HIVE-17519.03.patch > > > Currently, {{describe formatted table1 insert_num}} shows the column > information in a table-like format, which is very hard to read because > there are too many columns > {code} > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment bitVector > > > insert_num int > > > from deserializer > {code} > I think it would be better to show the same information like this: > {code} > col_name insert_num > data_type int > min > max > num_nulls > distinct_count > avg_col_len > max_col_len > num_trues > num_falses > comment from deserializer > bitVector > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17544) Add Parsed Tree as input for Authorization
[ https://issues.apache.org/jira/browse/HIVE-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179186#comment-16179186 ] Na Li commented on HIVE-17544: -- [Object [type=DATABASE, name=default], Object [type=TABLE_OR_VIEW, name=default.t10]] This is from create table t10(x int); > Add Parsed Tree as input for Authorization > -- > > Key: HIVE-17544 > URL: https://issues.apache.org/jira/browse/HIVE-17544 > Project: Hive > Issue Type: Task > Components: Authorization >Affects Versions: 2.1.1 >Reporter: Na Li >Assignee: Aihua Xu >Priority: Critical > > Right now, for authorization 2, the > HiveAuthorizationValidator.checkPrivileges(HiveOperationType var1, > List<HivePrivilegeObject> var2, List<HivePrivilegeObject> var3, > HiveAuthzContext var4) does not contain the parsed SQL command string as > input. Therefore, Sentry has to parse the command again. > The API should be changed to include the parsed result as input, so Sentry > does not need to parse the SQL command string again. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17594) Unit format error in Copy.java
[ https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179084#comment-16179084 ] Hive QA commented on HIVE-17594: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888784/HIVE-17594.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11061 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6975/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6975/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6975/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12888784 - PreCommit-HIVE-Build > Unit format error in Copy.java > -- > > Key: HIVE-17594 > URL: https://issues.apache.org/jira/browse/HIVE-17594 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang >Priority: Minor > Attachments: HIVE-17594.1.patch > > > In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual > value "rows/elapsed/1000.0"; a rows-per-second rate would be "rows/(elapsed/1000.0)". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
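The rate bug in HIVE-17594 comes down to operator precedence: with `elapsed` in milliseconds, `rows/elapsed/1000.0` divides by the millisecond count first (truncating to 0 for short copies) instead of converting the elapsed time to seconds. A minimal sketch of the corrected computation — variable and class names here are illustrative, not the actual Copy.java code:

```java
public class CopyRateSketch {
    // elapsedMs is the copy duration in milliseconds, as printed in the
    // HPL/SQL "COPY completed" summary line.
    static double rowsPerSec(long rows, long elapsedMs) {
        // Buggy form: rows / elapsedMs / 1000.0 -- the integer division by
        // milliseconds truncates to 0 for any short copy.
        // Fixed form: convert elapsed time to seconds first, then divide.
        return rows / (elapsedMs / 1000.0);
    }

    public static void main(String[] args) {
        // 1 row copied in 457 ms, matching the reproduction in the comments below.
        System.out.printf("%.2f rows/sec%n", rowsPerSec(1, 457)); // prints "2.19 rows/sec"
    }
}
```

With the buggy precedence, `1 / 457 / 1000.0` is 0, which matches the "0 rows/sec" output reported in the reproduction.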
[jira] [Updated] (HIVE-17519) Transpose column stats display
[ https://issues.apache.org/jira/browse/HIVE-17519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17519: Attachment: HIVE-17519.03.patch #3) fix minor issues: do not format output for JDBC clients - Beeline was showing padded outputs; I'm not sure what format would be desired for Beeline. > Transpose column stats display > -- > > Key: HIVE-17519 > URL: https://issues.apache.org/jira/browse/HIVE-17519 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17519.01.patch, HIVE-17519.02.patch, > HIVE-17519.03.patch > > > currently {{describe formatted table1 insert_num}} shows the column > information in a table-like format, which is very hard to read because > there are too many columns > {code} > # col_name data_type min > max num_nulls distinct_count > avg_col_len max_col_len num_trues > num_falses comment bitVector > > > insert_num int > > > from deserializer > {code} > I think it would be better to show the same information like this: > {code} > col_name insert_num > data_type int > min > max > num_nulls > distinct_count > avg_col_len > max_col_len > num_trues > num_falses > comment from deserializer > bitVector > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
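The transposition proposed in HIVE-17519 amounts to zipping the header row with the single stats row and printing one "label value" line per column. A hedged sketch of that idea — the class and method names are illustrative, not Hive's actual describe-output formatter:

```java
public class TransposeStatsSketch {
    // Render one wide stats row vertically: one "label value" line per column,
    // with the label left-padded to a fixed width.
    static String transpose(String[] headers, String[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < headers.length; i++) {
            String value = (i < row.length && row[i] != null) ? row[i] : "";
            sb.append(String.format("%-16s%s%n", headers[i], value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] headers = {"col_name", "data_type", "min", "max", "comment"};
        String[] row = {"insert_num", "int", "", "", "from deserializer"};
        System.out.print(transpose(headers, row));
    }
}
```

This turns the single very wide row into the vertical layout shown in the issue description, one column per output line.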
[jira] [Commented] (HIVE-17299) Cannot validate SerDe even if it is in Hadoop classpath
[ https://issues.apache.org/jira/browse/HIVE-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179019#comment-16179019 ] Zoltan Haindrich commented on HIVE-17299: - [~bartimeux] yes... I think if you are using Spark-on-Hive then Spark will be in control of the classpaths, so they might have better insight into what could possibly go wrong in this case. > Cannot validate SerDe even if it is in Hadoop classpath > --- > > Key: HIVE-17299 > URL: https://issues.apache.org/jira/browse/HIVE-17299 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 > Environment: HADOOP_CLASSPATH : > /usr/hdp/2.3.4.0-3485/atlas/hook/hive/*:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/webhcat/java-client/hive-webhcat-java-client-1.2.1.2.3.4.0-3485.jar > 2017-08-10 15:26:38,924 INFO [main]: zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client >
[jira] [Assigned] (HIVE-17598) HS2/HPL Integration: Output wrapper class
[ https://issues.apache.org/jira/browse/HIVE-17598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-17598: - Assignee: Dmitry Tolpeko > HS2/HPL Integration: Output wrapper class > - > > Key: HIVE-17598 > URL: https://issues.apache.org/jira/browse/HIVE-17598 > Project: Hive > Issue Type: Sub-task > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > > When running in CLI mode, HPL/SQL outputs the final results to stdout, and > now when running in embedded mode it has to put them into a result set to be > further consumed by HiveServer2 clients. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17597) HS2/HPL Integration: Avoid direct JDBC calls in HPL/SQL
[ https://issues.apache.org/jira/browse/HIVE-17597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-17597: - > HS2/HPL Integration: Avoid direct JDBC calls in HPL/SQL > --- > > Key: HIVE-17597 > URL: https://issues.apache.org/jira/browse/HIVE-17597 > Project: Hive > Issue Type: Sub-task > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > > HPL/SQL currently uses JDBC to interact with Hive through HiveServer2. This > option will remain for standalone mode (CLI mode), but when HPL/SQL is used > within HiveServer2 it will use the internal Hive API for database access. This > task is to refactor JDBC API calls used in HPL/SQL classes and move them to > wrapper classes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17299) Cannot validate SerDe even if it is in Hadoop classpath
[ https://issues.apache.org/jira/browse/HIVE-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179007#comment-16179007 ] Loïc C. Chanel commented on HIVE-17299: --- Well, as I'm using Hive under Spark, I'm not sure what I should do to reproduce the issue, but SerDe is a part of Hive, isn't it? Still, if you think the issue isn't related to Hive as an execution engine for data requests, I can migrate it to the Spark JIRA. > Cannot validate SerDe even if it is in Hadoop classpath > --- > > Key: HIVE-17299 > URL: https://issues.apache.org/jira/browse/HIVE-17299 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 1.2.1 > Environment: HADOOP_CLASSPATH : > /usr/hdp/2.3.4.0-3485/atlas/hook/hive/*:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/hcatalog/hive-hcatalog-server-extensions-1.2.1.2.3.4.0-3485.jar:/usr/hdp/2.3.4.0-3485/hive-hcatalog/share/webhcat/java-client/hive-webhcat-java-client-1.2.1.2.3.4.0-3485.jar > 2017-08-10 15:26:38,924 INFO [main]: zookeeper.ZooKeeper > (Environment.java:logEnv(100)) - Client >
[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179005#comment-16179005 ] Zoltan Haindrich commented on HIVE-17568: - Failures are unrelated. Relying on the SQL type prevents the regression. [~ashutoshc] could you take another look? > HiveJoinPushTransitivePredicatesRule may exchange predicates which are not > valid on the other branch > > > Key: HIVE-17568 > URL: https://issues.apache.org/jira/browse/HIVE-17568 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, > HIVE-17568.03.patch > > > Joining 2 tables on at least 1 column which is not the same type; > (integer/double for example). > The calcite expressions require double/integer inputs which will become > invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other > branch. > query: > {code} > create table t1 (v string, k int); > insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30); > create table t2 (v string, k double); > insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30); > select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and > t1.k<15; > {code} > results in: > {code} > java.lang.AssertionError: type mismatch: > type1: > DOUBLE > type2: > INTEGER > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919) > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) > at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153) > at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102) > at >
org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:104) > at > org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296) > at > org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271) > at > org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98) > at > org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67) > [...] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17596) HiveServer2 and HPL/SQL Integration
[ https://issues.apache.org/jira/browse/HIVE-17596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17596: -- Attachment: HiveServer2 and HPLSQL Integration.pdf > HiveServer2 and HPL/SQL Integration > --- > > Key: HIVE-17596 > URL: https://issues.apache.org/jira/browse/HIVE-17596 > Project: Hive > Issue Type: New Feature > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > Attachments: HiveServer2 and HPLSQL Integration.pdf > > > The main task for HiveServer2 and HPL/SQL integration. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17596) HiveServer2 and HPL/SQL Integration
[ https://issues.apache.org/jira/browse/HIVE-17596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-17596: - > HiveServer2 and HPL/SQL Integration > --- > > Key: HIVE-17596 > URL: https://issues.apache.org/jira/browse/HIVE-17596 > Project: Hive > Issue Type: New Feature > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko > > The main task for HiveServer2 and HPL/SQL integration. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178990#comment-16178990 ] Hive QA commented on HIVE-17568: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/1207/HIVE-17568.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11062 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) org.apache.hive.jdbc.TestJdbcDriver2.testSelectExecAsync2 (batchId=226) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6974/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6974/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6974/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 1207 - PreCommit-HIVE-Build > HiveJoinPushTransitivePredicatesRule may exchange predicates which are not > valid on the other branch > > > Key: HIVE-17568 > URL: https://issues.apache.org/jira/browse/HIVE-17568 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, > HIVE-17568.03.patch > > > Joining 2 tables on at least 1 column which is not the same type; > (integer/double for example). > The calcite expressions require double/integer inputs which will become > invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other > branch. > query: > {code} > create table t1 (v string, k int); > insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30); > create table t2 (v string, k double); > insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30); > select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and > t1.k<15; > {code} > results in: > {code} > java.lang.AssertionError: type mismatch: > type1: > DOUBLE > type2: > INTEGER > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919) > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) > at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153) > at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:104) > at >
org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296) > at > org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271) > at > org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98) > at > org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67) > [...] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178858#comment-16178858 ] Hive QA commented on HIVE-17483: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12888793/HIVE-17483.8.patch {color:green}SUCCESS:{color} +1 due to 7 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 11068 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=231) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=231) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[union_fast_stats] (batchId=156) org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=170) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=235) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=235) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=202) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testJoinThriftSerializeInTasks (batchId=228) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testMetadataQueriesWithSerializeThriftInTasks (batchId=228) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation (batchId=228) org.apache.hive.jdbc.TestJdbcWithMiniHS2.testParallelCompilation2 (batchId=228) org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle (batchId=223) org.apache.hive.service.cli.session.TestQueryDisplay.testQueryDisplay (batchId=223) org.apache.hive.service.cli.session.TestQueryDisplay.testWebUI (batchId=223) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6973/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6973/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6973/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12888793 - PreCommit-HIVE-Build > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query <queryid>" command that can be run using > ODBC/JDBC against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require the admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load
[ https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek reassigned HIVE-17595: -- > Correct DAG for updating the last.repl.id for a database during bootstrap load > -- > > Key: HIVE-17595 > URL: https://issues.apache.org/jira/browse/HIVE-17595 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > > We update the last.repl.id as a database property. This is done after all the > bootstrap tasks to load the relevant data are done and is the last task to be > run. However, we are currently not setting up the DAG correctly for this task: > it is currently added as the root task, whereas it should be the last > task to be run in the DAG. This becomes more important after the inclusion of > HIVE-17426, since that will lead to parallel execution, and incorrect DAGs > will lead to incorrect results/state of the system. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
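The DAG fix HIVE-17595 describes can be sketched with a toy task type (this is not Hive's real Task API — the class and method names below are illustrative): the property-update task must hang off every leaf of the bootstrap-load DAG instead of being registered as another root.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for a DAG node; Hive's real task class is richer.
class ReplTask {
    final String name;
    final List<ReplTask> children = new ArrayList<>();
    ReplTask(String name) { this.name = name; }
}

public class ReplDagSketch {
    // Attach `updater` under every current leaf, so it can only run after
    // all load tasks have finished (never in parallel with them).
    static void attachAsFinalTask(ReplTask node, ReplTask updater, Set<ReplTask> seen) {
        if (!seen.add(node) || node == updater) {
            return; // already visited via another path, or reached the updater itself
        }
        if (node.children.isEmpty()) {
            node.children.add(updater);
            return;
        }
        for (ReplTask child : new ArrayList<>(node.children)) {
            attachAsFinalTask(child, updater, seen);
        }
    }

    public static void main(String[] args) {
        ReplTask root = new ReplTask("bootstrap");
        ReplTask tableA = new ReplTask("load-tableA");
        ReplTask tableB = new ReplTask("load-tableB");
        root.children.add(tableA);
        root.children.add(tableB);
        ReplTask updater = new ReplTask("set-last.repl.id");
        attachAsFinalTask(root, updater, new HashSet<>());
        // Both load tasks now precede the property update; it is not a root.
        System.out.println(tableA.children.contains(updater)
            && tableB.children.contains(updater)); // prints "true"
    }
}
```

Making the updater a child of every leaf (rather than a sibling root) is what guarantees correct results once HIVE-17426 enables parallel task execution.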
[jira] [Updated] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich updated HIVE-17568: Attachment: HIVE-17568.03.patch #3) also add a {{getSqlTypeName()}} comparison > HiveJoinPushTransitivePredicatesRule may exchange predicates which are not > valid on the other branch > > > Key: HIVE-17568 > URL: https://issues.apache.org/jira/browse/HIVE-17568 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch, > HIVE-17568.03.patch > > > Joining 2 tables on at least 1 column which is not the same type; > (integer/double for example). > The calcite expressions require double/integer inputs which will become > invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other > branch. > query: > {code} > create table t1 (v string, k int); > insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30); > create table t2 (v string, k double); > insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30); > select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and > t1.k<15; > {code} > results in: > {code} > java.lang.AssertionError: type mismatch: > type1: > DOUBLE > type2: > INTEGER > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919) > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) > at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153) > at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884) > at >
org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:104) > at > org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296) > at > org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271) > at > org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98) > at > org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67) > [...] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17568) HiveJoinPushTransitivePredicatesRule may exchange predicates which are not valid on the other branch
[ https://issues.apache.org/jira/browse/HIVE-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178773#comment-16178773 ] Zoltan Haindrich commented on HIVE-17568: - The problem arises from the fact that the two types differ in nullability:
{code}
import static org.junit.Assert.assertEquals;

import org.apache.calcite.jdbc.JavaTypeFactoryImpl;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.sql.type.SqlTypeName;
import org.junit.Ignore;
import org.junit.Test;

public class CalciteTypeCompare {

  JavaTypeFactoryImpl typeFactory = new JavaTypeFactoryImpl();

  RelDataType b0 = typeFactory.builder().add("t", SqlTypeName.BOOLEAN).nullable(true).build();
  RelDataType b1 = typeFactory.builder().add("t", SqlTypeName.BOOLEAN).nullable(false).build();
  RelDataType b1x = typeFactory.builder().add("x", SqlTypeName.BOOLEAN).nullable(false).build();

  @Test
  @Ignore("this test case will fail; because these types are different")
  public void compareTypesIgnoringNullability() {
    assertEquals(b0, b1);
  }

  @Test
  public void typeSqlNameEquals() {
    assertEquals(b0.getSqlTypeName(), b1.getSqlTypeName());
  }

  @Test
  public void typeSqlNameEqualsIgnoresFieldName() {
    assertEquals(b0.getSqlTypeName(), b1x.getSqlTypeName());
  }
}
{code}
I complemented the patch with an additional check on {{getSqlTypeName()}}. > HiveJoinPushTransitivePredicatesRule may exchange predicates which are not > valid on the other branch > > > Key: HIVE-17568 > URL: https://issues.apache.org/jira/browse/HIVE-17568 > Project: Hive > Issue Type: Bug > Components: Logical Optimizer >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17568.01.patch, HIVE-17568.02.patch > > > Joining 2 tables on at least 1 column which is not the same type; > (integer/double for example). > The calcite expressions require double/integer inputs which will become > invalid if {{HiveJoinPushTransitivePredicatesRule}} pushes them to the other > branch.
> query: > {code} > create table t1 (v string, k int); > insert into t1 values ('people', 10), ('strangers', 20), ('parents', 30); > create table t2 (v string, k double); > insert into t2 values ('people', 10), ('strangers', 20), ('parents', 30); > select * from t1 where t1.k in (select t2.k from t2 where t2.v='people') and > t1.k<15; > {code} > results in: > {code} > java.lang.AssertionError: type mismatch: > type1: > DOUBLE > type2: > INTEGER > at org.apache.calcite.util.Litmus$1.fail(Litmus.java:31) > at org.apache.calcite.plan.RelOptUtil.eq(RelOptUtil.java:1841) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:941) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterInputShuttle.visitInputRef(RexProgramBuilder.java:919) > at org.apache.calcite.rex.RexInputRef.accept(RexInputRef.java:112) > at org.apache.calcite.rex.RexShuttle.visitList(RexShuttle.java:153) > at org.apache.calcite.rex.RexShuttle.visitCall(RexShuttle.java:102) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:884) > at > org.apache.calcite.rex.RexProgramBuilder$RegisterShuttle.visitCall(RexProgramBuilder.java:882) > at org.apache.calcite.rex.RexCall.accept(RexCall.java:104) > at > org.apache.calcite.rex.RexProgramBuilder.registerInput(RexProgramBuilder.java:296) > at > org.apache.calcite.rex.RexProgramBuilder.addCondition(RexProgramBuilder.java:271) > at > org.apache.calcite.rel.rules.FilterMergeRule.createProgram(FilterMergeRule.java:98) > at > org.apache.calcite.rel.rules.FilterMergeRule.onMatch(FilterMergeRule.java:67) > [...] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17594) Unit format error in Copy.java
[ https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178768#comment-16178768 ] Saijin Huang edited comment on HIVE-17594 at 9/25/17 9:25 AM: -- [~dmtolpeko], I reproduced the problem according to the test file copy_to_file.sql. -- reproduce: before modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec " the speed is not correct. after modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec " the speed is correct. was (Author: txhsj): [~dmtolpeko], I reproduced the problem according to the test file copy_to_file.sql. -- reproduce: before modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec " the speed is not correct. after modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src1.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec " the speed is correct. > Unit format error in Copy.java > -- > > Key: HIVE-17594 > URL: https://issues.apache.org/jira/browse/HIVE-17594 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang >Priority: Minor > Attachments: HIVE-17594.1.patch > > > In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual > value "rows/elapsed/1000.0". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17594) Unit format error in Copy.java
[ https://issues.apache.org/jira/browse/HIVE-17594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178768#comment-16178768 ] Saijin Huang commented on HIVE-17594: - [~dmtolpeko], I reproduced the problem according to the test file copy_to_file.sql. -- reproduce: before modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 55 ms, 0 rows/sec " the speed is not correct. after modification 1.hive -e "create table src(id int);insert into src values(2)" 2.hplsql -e "copy src to src.txt;" 3.the result is " Ln:1 Query executed: 1 columns, output file: src1.txt Ln:1 COPY completed: 1 row(s), 2 bytes, 457 ms, 2.19 rows/sec " the speed is correct. > Unit format error in Copy.java > -- > > Key: HIVE-17594 > URL: https://issues.apache.org/jira/browse/HIVE-17594 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang >Priority: Minor > Attachments: HIVE-17594.1.patch > > > In Copy.java, line 273, the unit "rows/sec" is inconsistent with the actual > value "rows/elapsed/1000.0". -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17452) HPL/SQL function variable block is not initialized
[ https://issues.apache.org/jira/browse/HIVE-17452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17452: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) > HPL/SQL function variable block is not initialized > -- > > Key: HIVE-17452 > URL: https://issues.apache.org/jira/browse/HIVE-17452 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17452.1.patch > > > Variables inside the declaration block are not initialized: > {code} > CREATE FUNCTION test1() > RETURNS STRING > AS > ret string DEFAULT 'Initial value'; > BEGIN > print(ret); > ret := 'VALUE IS SET'; > print(ret); > END; > test1(); > {code} > Output: > {code} > ret > VALUE IS SET > {code} > Should be: > {code} > Initial value > VALUE IS SET > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17452) HPL/SQL function variable block is not initialized
[ https://issues.apache.org/jira/browse/HIVE-17452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178736#comment-16178736 ] Dmitry Tolpeko commented on HIVE-17452: --- Committed. > HPL/SQL function variable block is not initialized > -- > > Key: HIVE-17452 > URL: https://issues.apache.org/jira/browse/HIVE-17452 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Dmitry Tolpeko >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17452.1.patch > > > Variables inside the declaration block are not initialized: > {code} > CREATE FUNCTION test1() > RETURNS STRING > AS > ret string DEFAULT 'Initial value'; > BEGIN > print(ret); > ret := 'VALUE IS SET'; > print(ret); > END; > test1(); > {code} > Output: > {code} > ret > VALUE IS SET > {code} > Should be: > {code} > Initial value > VALUE IS SET > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17590) upgrade hadoop to 2.8.1
[ https://issues.apache.org/jira/browse/HIVE-17590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178733#comment-16178733 ] Peter Vary commented on HIVE-17590: --- +1 > upgrade hadoop to 2.8.1 > --- > > Key: HIVE-17590 > URL: https://issues.apache.org/jira/browse/HIVE-17590 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > Attachments: HIVE-17590.01.patch, HIVE-17590.01.patch > > > seems like hadoop 2.8.0 has no source attachment: > http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.8.0/ > however > http://central.maven.org/maven2/org/apache/hadoop/hadoop-common/2.8.1/ > has source.jar-s -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-17483: -- Attachment: HIVE-17483.8.patch > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query " command that can be run using > odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-17483: -- Attachment: (was: HIVE-17483.8.patch) > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query " command that can be run using > odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17483) HS2 kill command to kill queries using query id
[ https://issues.apache.org/jira/browse/HIVE-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teddy Choi updated HIVE-17483: -- Attachment: HIVE-17483.8.patch > HS2 kill command to kill queries using query id > --- > > Key: HIVE-17483 > URL: https://issues.apache.org/jira/browse/HIVE-17483 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Thejas M Nair >Assignee: Teddy Choi > Attachments: HIVE-17483.1.patch, HIVE-17483.2.patch, > HIVE-17483.2.patch, HIVE-17483.3.patch, HIVE-17483.4.patch, > HIVE-17483.5.patch, HIVE-17483.6.patch, HIVE-17483.7.patch, HIVE-17483.8.patch > > > For administrators, it is important to be able to kill queries if required. > Currently, there is no clean way to do it. > It would help to have a "kill query " command that can be run using > odbc/jdbc against a HiveServer2 instance, to kill a query with that queryid > running in that instance. > Authorization will have to be done to ensure that the user that is invoking > the API is allowed to perform this action. > In case of SQL std authorization, this would require admin role. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17593) DataWritableWriter strip spaces for CHAR type before writing, but predicate generator doesn't do same thing.
[ https://issues.apache.org/jira/browse/HIVE-17593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16178697#comment-16178697 ] Junjie Chen commented on HIVE-17593: Hive strips spaces for the char(length) type and then stores the value to Parquet. Other Parquet readers may read the stripped value, which is different from the original. public void write(Object value) { String v = inspector.getPrimitiveJavaObject(value).getStrippedValue(); recordConsumer.addBinary(Binary.fromString(v)); } [~Ferd], do you think this is a valid case? Shouldn't it store the real value? > DataWritableWriter strip spaces for CHAR type before writing, but predicate > generator doesn't do same thing. > > > Key: HIVE-17593 > URL: https://issues.apache.org/jira/browse/HIVE-17593 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Junjie Chen > > DataWritableWriter strips spaces for the CHAR type before writing, while the predicate generator does NOT do the same stripping, which can cause data to go missing! > In the current version, it doesn't cause data missing since the predicate is not > pushed down to parquet due to HIVE-17261. > Please see ConvertAstTosearchArg.java: getTypes treats CHAR and STRING as the > same, which will build a predicate with tail spaces. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
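The mismatch described in HIVE-17593 can be illustrated with a small sketch. This is not the actual DataWritableWriter or predicate-generator code; the class and helper names are hypothetical. The point is that if the writer strips trailing spaces from a CHAR value before storing it, but a predicate is built from the space-padded form, an exact-match comparison against the stored data fails.

```java
// Hypothetical illustration of HIVE-17593: the writer strips trailing
// spaces from a CHAR value, but a predicate built from the padded form
// no longer matches the stored value. Not actual Hive source.
public class CharStripDemo {
    // Mimics the trailing-space stripping the writer applies to CHAR values
    static String stripTrailing(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String padded = "ab   ";                 // CHAR(5) in-memory form, padded
        String stored = stripTrailing(padded);   // "ab" is what lands in Parquet

        // A predicate built from the padded literal misses the stored row:
        System.out.println(stored.equals(padded));                // false
        // Stripping the predicate value the same way restores the match:
        System.out.println(stored.equals(stripTrailing(padded))); // true
    }
}
```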