[jira] [Commented] (HIVE-9334) PredicateTransitivePropagate optimizer should run after PredicatePushDown
[ https://issues.apache.org/jira/browse/HIVE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272859#comment-14272859 ] Hive QA commented on HIVE-9334: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12691521/HIVE-9334.1.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6747 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2332/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2332/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2332/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12691521 - PreCommit-HIVE-TRUNK-Build PredicateTransitivePropagate optimizer should run after PredicatePushDown - Key: HIVE-9334 URL: https://issues.apache.org/jira/browse/HIVE-9334 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9334.1.patch, HIVE-9334.patch This way PredicateTransitivePropagate will be more effective, as it has more filters to push to other branches of joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
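To illustrate the issue description above, here is a sketch (the table and column names are invented for illustration and do not come from the JIRA) of the kind of derived filter PredicateTransitivePropagate can produce once PredicatePushDown has already run:

```sql
-- Hypothetical example; t1, t2, key, and value are not from the issue.
EXPLAIN
SELECT t1.key, t2.value
FROM t1 JOIN t2 ON (t1.key = t2.key)
WHERE t1.key = 100;

-- Once PredicatePushDown has pushed (t1.key = 100) down to t1's scan,
-- PredicateTransitivePropagate can combine it with the join condition
-- (t1.key = t2.key) and add the derived filter (t2.key = 100) on the
-- other branch of the join as well, which is why running it after
-- PredicatePushDown gives it more filters to work with.
```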
[jira] [Commented] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272855#comment-14272855 ] Lefty Leverenz commented on HIVE-7209: -- Doc: *hive.security.metastore.authorization.manager* has been updated in the wiki, so I'm removing the TODOC14 label. (Additional documentation will be covered with HIVE-7759, as mentioned two comments back.) * [Configuration Properties -- hive.security.metastore.authorization.manager | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.security.metastore.authorization.manager] allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7209) allow metastore authorization api calls to be restricted to certain invokers
[ https://issues.apache.org/jira/browse/HIVE-7209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7209: - Labels: (was: TODOC14) allow metastore authorization api calls to be restricted to certain invokers Key: HIVE-7209 URL: https://issues.apache.org/jira/browse/HIVE-7209 Project: Hive Issue Type: Bug Components: Authentication, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.14.0 Attachments: HIVE-7209.1.patch, HIVE-7209.2.patch, HIVE-7209.3.patch, HIVE-7209.4.patch Any user who has direct access to metastore can make metastore api calls that modify the authorization policy. The users who can make direct metastore api calls in a secure cluster configuration are usually the 'cluster insiders' such as Pig and MR users, who are not (securely) covered by the metastore based authorization policy. But it makes sense to disallow access from such users as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7397) Set the default threshold for fetch task conversion to 1Gb
[ https://issues.apache.org/jira/browse/HIVE-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14272858#comment-14272858 ] Lefty Leverenz commented on HIVE-7397: -- Doc: The defaults for *hive.fetch.task.conversion* and *hive.fetch.task.conversion.threshold* have been updated in the wiki, so I'm removing the TODOC14 label. * [hive.fetch.task.conversion | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.fetch.task.conversion] * [hive.fetch.task.conversion.threshold | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.fetch.task.conversion.threshold] Set the default threshold for fetch task conversion to 1Gb -- Key: HIVE-7397 URL: https://issues.apache.org/jira/browse/HIVE-7397 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Gopal V Assignee: Gopal V Labels: Performance Fix For: 0.14.0 Attachments: HIVE-7397.1.patch, HIVE-7397.2.patch, HIVE-7397.3.patch, HIVE-7397.4.patch.txt, HIVE-7397.5.patch, HIVE-7397.6.patch.txt Currently, modifying the value of hive.fetch.task.conversion to more results in a dangerous setting where small-scale queries work but large-scale queries crash. This occurs because the default threshold of -1 means this optimization is applied even to a petabyte table. I am testing a variety of queries with the setting more (to make it the default option as suggested by HIVE-887) and will change the default threshold for this feature to a reasonable 1Gb. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
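For reference, the two properties discussed above can be set per session as follows (a sketch of the defaults described in the issue; 1073741824 is simply 1 GB expressed in bytes):

```sql
-- Run simple queries as a direct fetch task instead of a MapReduce job,
-- but only when the input size stays below the threshold.
SET hive.fetch.task.conversion=more;
SET hive.fetch.task.conversion.threshold=1073741824;  -- 1 GB, replacing the old unbounded default of -1
```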
[jira] [Updated] (HIVE-7397) Set the default threshold for fetch task conversion to 1Gb
[ https://issues.apache.org/jira/browse/HIVE-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-7397: - Labels: Performance (was: Performance TODOC14) Set the default threshold for fetch task conversion to 1Gb -- Key: HIVE-7397 URL: https://issues.apache.org/jira/browse/HIVE-7397 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Gopal V Assignee: Gopal V Labels: Performance Fix For: 0.14.0 Attachments: HIVE-7397.1.patch, HIVE-7397.2.patch, HIVE-7397.3.patch, HIVE-7397.4.patch.txt, HIVE-7397.5.patch, HIVE-7397.6.patch.txt Currently, modifying the value of hive.fetch.task.conversion to more results in a dangerous setting where small scale queries function, but large scale queries crash. This occurs because the default threshold of -1 means apply this optimization for a petabyte table. I am testing a variety of queries with the setting more (to make it the default option as suggested by HIVE-887) change the default threshold for this feature to a reasonable 1Gb. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Open (was: Patch Available) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Attachment: HIVE-9039.13.patch update golden files, e.g., union3 Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Patch Available (was: Open) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
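The rewrite described in the issue can be sketched as follows (table and column names are invented for illustration):

```sql
-- New syntax supported by this patch:
SELECT key FROM t1
UNION DISTINCT
SELECT key FROM t2;

-- Internally rewritten to UNION ALL followed by a GROUP BY over all
-- select columns, which removes the duplicate rows:
SELECT key
FROM (SELECT key FROM t1 UNION ALL SELECT key FROM t2) u
GROUP BY key;
```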
[jira] [Commented] (HIVE-7674) Update to Spark 1.2 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-7674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273061#comment-14273061 ] Brock Noland commented on HIVE-7674: Thank you! Since I committed this I updated the wiki. Update to Spark 1.2 [Spark Branch] -- Key: HIVE-7674 URL: https://issues.apache.org/jira/browse/HIVE-7674 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Priority: Blocker Labels: TODOC-SPARK Fix For: spark-branch Attachments: HIVE-7674.1-spark.patch, HIVE-7674.2-spark.patch, HIVE-7674.3-spark.patch In HIVE-8160 we added a custom repo to use Spark 1.2. Once 1.2 is released we need to remove this repo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 28964: HIVE-8121 Create micro-benchmarks for ParquetSerde and evaluate performance
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28964/#review67616 --- Nice work Sergio!! I know that it doesn't fit perfectly into the JMH model, but I think we have to write a non-trivial number of records, such as 1000 rows, in order to get much benefit. Can we try that? itests/hive-jmh/pom.xml https://reviews.apache.org/r/28964/#comment111684 It looks like in this file 1 tab = 4 spaces, whereas in Hive I think we typically say 1 tab = 2 spaces. itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java https://reviews.apache.org/r/28964/#comment111683 During class initialization let's create an array of 100 random values for each type, and then we can iterate through that array for each call to this method. Otherwise columnar formats will lead to unrealistic compression from storing the same values over and over. For example, both Parquet and ORC should be able to collapse a column consisting of the integer 1 to a trivial amount of data. - Brock Noland On Jan. 9, 2015, 6:38 p.m., Sergio Pena wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28964/ --- (Updated Jan. 9, 2015, 6:38 p.m.) Review request for hive, Brock Noland and cheng xu. Bugs: HIVE-8121 https://issues.apache.org/jira/browse/HIVE-8121 Repository: hive-git Description --- This is a new tool used to test ORC and PARQUET file format performance. Diffs - itests/hive-jmh/pom.xml PRE-CREATION itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java PRE-CREATION itests/pom.xml 0a154d6eb8c119e4e6419777c28b59b9d2108ba0 Diff: https://reviews.apache.org/r/28964/diff/ Testing --- Thanks, Sergio Pena
[jira] [Commented] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273075#comment-14273075 ] Hive QA commented on HIVE-9039: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12691565/HIVE-9039.13.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6753 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2333/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2333/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2333/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12691565 - PreCommit-HIVE-TRUNK-Build Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9337) Move more hive.spark.* configurations to HiveConf
Szehon Ho created HIVE-9337: --- Summary: Move more hive.spark.* configurations to HiveConf Key: HIVE-9337 URL: https://issues.apache.org/jira/browse/HIVE-9337 Project: Hive Issue Type: Task Components: Spark Reporter: Szehon Ho Some hive.spark configurations have been added to HiveConf, but there are some like hive.spark.log.dir that are not there. Also some configurations in RpcConfiguration.java might be eligible to be moved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9335) Address review items on HIVE-9257 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9335: --- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Thank you Xuefu! I have committed the patch to spark. Address review items on HIVE-9257 [Spark Branch] Key: HIVE-9335 URL: https://issues.apache.org/jira/browse/HIVE-9335 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Fix For: spark-branch Attachments: HIVE-9335.1-spark.patch, HIVE-9335.2-spark.patch I made a pass through HIVE-9257 and found the following issues: {{HashTableSinkOperator.java}} The fields EMPTY_OBJECT_ARRAY and EMPTY_ROW_CONTAINER are no longer constants and should not be in upper case. {{HivePairFlatMapFunction.java}} We share NumberFormat across threads and it's not thread safe. {{KryoSerializer.java}} we eat the stack trace in deserializeJobConf {{SparkMapRecordHandler}} in processRow we should not be using {{StringUtils.stringifyException}} since LOG can handle stack traces. in close: {noformat} // signal new failure to map-reduce LOG.error("Hit error while closing operators - failing tree"); throw new IllegalStateException("Error while closing operators", e); {noformat} Should be: {noformat} String msg = "Error while closing operators: " + e; throw new IllegalStateException(msg, e); {noformat} {{SparkSessionManagerImpl}} - the method {{canReuseSession}} is useless {{GenSparkSkewJoinProcessor}} {noformat} + // keep it as reference in case we need fetch work +//localPlan.getAliasToFetchWork().put(small_alias.toString(), +//new FetchWork(tblDir, tableDescList.get(small_alias))); {noformat} {{GenSparkWorkWalker}} trim ws {{SparkCompiler}} remote init {{SparkEdgeProperty}} trim ws {{CounterStatsPublisher}} eat exception {{Hadoop23Shims}} unused import of {{ResourceBundles}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8181) Upgrade JavaEWAH version to allow for unsorted bitset creation
[ https://issues.apache.org/jira/browse/HIVE-8181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273202#comment-14273202 ] Hive QA commented on HIVE-8181: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12691583/HIVE-8181.4.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7310 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2334/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2334/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2334/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12691583 - PreCommit-HIVE-TRUNK-Build Upgrade JavaEWAH version to allow for unsorted bitset creation -- Key: HIVE-8181 URL: https://issues.apache.org/jira/browse/HIVE-8181 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.14.0, 0.13.1 Reporter: Gopal V Assignee: Navis Attachments: HIVE-8181.1.patch, HIVE-8181.2.patch.txt, HIVE-8181.3.patch.txt, HIVE-8181.4.patch.txt JavaEWAH has removed the restriction that bitsets can only be set in order in the latest release. Currently the use of {{ewah_bitmap}} UDAF requires a {{SORT BY}}. 
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:824)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
  ... 7 more
Caused by: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap
  at
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8181) Upgrade JavaEWAH version to allow for unsorted bitset creation
[ https://issues.apache.org/jira/browse/HIVE-8181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8181: Attachment: HIVE-8181.4.patch.txt Updated golden file, but cannot reproduce the failure of udaf_percentile_approx_23. Upgrade JavaEWAH version to allow for unsorted bitset creation -- Key: HIVE-8181 URL: https://issues.apache.org/jira/browse/HIVE-8181 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.14.0, 0.13.1 Reporter: Gopal V Assignee: Navis Attachments: HIVE-8181.1.patch, HIVE-8181.2.patch.txt, HIVE-8181.3.patch.txt, HIVE-8181.4.patch.txt JavaEWAH has removed the restriction that bitsets can only be set in order in the latest release. Currently the use of {{ewah_bitmap}} UDAF requires a {{SORT BY}}.
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap
  at org.apache.hadoop.hive.ql.exec.GroupByOperator.processOp(GroupByOperator.java:824)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
  at org.apache.hadoop.hive.ql.exec.ExtractOperator.processOp(ExtractOperator.java:45)
  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
  at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:249)
  ... 7 more
Caused by: java.lang.RuntimeException: Can't set bits out of order with EWAHCompressedBitmap
  at
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9336) Fix Hive throws ParseException while handling Grouping-Sets clauses
[ https://issues.apache.org/jira/browse/HIVE-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaohm3 updated HIVE-9336: -- Status: Patch Available (was: Open) Fix Hive throws ParseException while handling Grouping-Sets clauses --- Key: HIVE-9336 URL: https://issues.apache.org/jira/browse/HIVE-9336 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.13.1 Reporter: zhaohm3 Fix For: 0.14.0 Currently, when Hive parses GROUPING SETS clauses, if an expression is composed of two or more common subexpressions, the first element of that expression can only be a simple identifier without any qualification; otherwise Hive throws a ParseException during the parse stage. Hive therefore fails to parse the following HQLs: drop table test; create table test(tc1 int, tc2 int, tc3 int); explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, test.tc2)); explain select tc1+tc2, tc2 from test group by tc1+tc2, tc2 grouping sets(tc2, (tc1 + tc2, tc2)); drop table test; The following shows a ParseException stack trace: 2015-01-07 09:53:34,718 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,719 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,721 INFO [main]: ql.Driver (Driver.java:checkConcurrency(158)) - Concurrency mode is disabled, not creating a lock manager 2015-01-07 09:53:34,721 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: parse.ParseDriver
(ParseDriver.java:parse(185)) - Parsing command: explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, test.tc2)) 2015-01-07 09:53:34,734 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: ParseException line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' org.apache.hadoop.hive.ql.parse.ParseException: line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:210) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=compile start=1420595614721 end=1420595614745 duration=24 from=org.apache.hadoop.hive.ql.Driver 
2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks start=1420595614745 end=1420595614746 duration=1 from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver 2015-01-07
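Based on the description above (only a simple, unqualified identifier is accepted as the first element of a composite grouping set), the same query appears to parse once the qualifiers are dropped. This is a sketch inferred from the report, not a verified workaround:

```sql
-- Same query as the first failing example, but with unqualified
-- column names inside the GROUPING SETS clause.
explain select test.tc1, test.tc2 from test
group by test.tc1, test.tc2
grouping sets(tc1, (tc1, tc2));
```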
[jira] [Commented] (HIVE-9335) Address review items on HIVE-9257 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273094#comment-14273094 ] Hive QA commented on HIVE-9335: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12691570/HIVE-9335.2-spark.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7301 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection org.apache.hive.hcatalog.streaming.TestStreaming.testInterleavedTransactionBatchCommits org.apache.hive.hcatalog.streaming.TestStreaming.testMultipleTransactionBatchCommits {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/631/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/631/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-631/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12691570 - PreCommit-HIVE-SPARK-Build Address review items on HIVE-9257 [Spark Branch] Key: HIVE-9335 URL: https://issues.apache.org/jira/browse/HIVE-9335 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9335.1-spark.patch, HIVE-9335.2-spark.patch I made a pass through HIVE-9257 and found the following issues: {{HashTableSinkOperator.java}} The fields EMPTY_OBJECT_ARRAY and EMPTY_ROW_CONTAINER are no longer constants and should not be in upper case. {{HivePairFlatMapFunction.java}} We share NumberFormat across threads and it's not thread safe. {{KryoSerializer.java}} we eat the stack trace in deserializeJobConf {{SparkMapRecordHandler}} in processRow we should not be using {{StringUtils.stringifyException}} since LOG can handle stack traces. in close: {noformat} // signal new failure to map-reduce LOG.error("Hit error while closing operators - failing tree"); throw new IllegalStateException("Error while closing operators", e); {noformat} Should be: {noformat} String msg = "Error while closing operators: " + e; throw new IllegalStateException(msg, e); {noformat} {{SparkSessionManagerImpl}} - the method {{canReuseSession}} is useless {{GenSparkSkewJoinProcessor}} {noformat} + // keep it as reference in case we need fetch work +//localPlan.getAliasToFetchWork().put(small_alias.toString(), +//new FetchWork(tblDir, tableDescList.get(small_alias))); {noformat} {{GenSparkWorkWalker}} trim ws {{SparkCompiler}} remote init {{SparkEdgeProperty}} trim ws {{CounterStatsPublisher}} eat exception {{Hadoop23Shims}} unused import of {{ResourceBundles}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7122) Storage format for create like table
[ https://issues.apache.org/jira/browse/HIVE-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7122: Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks Vasanth kumar RJ for the contribution. Storage format for create like table Key: HIVE-7122 URL: https://issues.apache.org/jira/browse/HIVE-7122 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Vasanth kumar RJ Assignee: Vasanth kumar RJ Fix For: 0.15.0 Attachments: HIVE-7122.1.patch, HIVE-7122.patch With CREATE TABLE ... LIKE, the user can specify the table storage format. Example: create table table1 like table2 stored as ORC; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9337) Move more hive.spark.* configurations to HiveConf
[ https://issues.apache.org/jira/browse/HIVE-9337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273175#comment-14273175 ] Szehon Ho commented on HIVE-9337: - [~chengxiang li] I wonder if you have any thoughts about this? I am not familiar enough with the RpcConfiguration.java configurations to know, but I think it would be great to move them, as well as 'hive.spark.log.dir' and any other ones, to HiveConf.java for documentation purposes, consistent with the other hive.* properties, if possible. Move more hive.spark.* configurations to HiveConf - Key: HIVE-9337 URL: https://issues.apache.org/jira/browse/HIVE-9337 Project: Hive Issue Type: Task Components: Spark Reporter: Szehon Ho Some hive.spark configurations have been added to HiveConf, but there are some, like hive.spark.log.dir, that are not there. Also some configurations in RpcConfiguration.java might be eligible to be moved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9340) Address review of HIVE-9257 (ii)
Szehon Ho created HIVE-9340: --- Summary: Address review of HIVE-9257 (ii) Key: HIVE-9340 URL: https://issues.apache.org/jira/browse/HIVE-9340 Project: Hive Issue Type: Task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho Some minor fixes: 1. Get rid of spark_test.q, which was used to test the sparkCliDriver test fw. 2. Get rid of spark-snapshot repository dep in pom 3. Cleanup ExplainTask to get rid of * in imports. 4. Reorder the scala/spark dependencies in pom to fit the main order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273209#comment-14273209 ] Szehon Ho commented on HIVE-9257: - Thanks, I noticed; I was collecting those, along with some other items I saw, in HIVE-9340 Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9340) Address review of HIVE-9257 (ii)
[ https://issues.apache.org/jira/browse/HIVE-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9340: Issue Type: Sub-task (was: Task) Parent: HIVE-7292 Address review of HIVE-9257 (ii) Key: HIVE-9340 URL: https://issues.apache.org/jira/browse/HIVE-9340 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho Some minor fixes: 1. Get rid of spark_test.q, which was used to test the sparkCliDriver test fw. 2. Get rid of spark-snapshot repository dep in pom 3. Cleanup ExplainTask to get rid of * in imports. 4. Reorder the scala/spark dependencies in pom to fit the main order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9336) Fix Hive throws ParseException while handling Grouping-Sets clauses
[ https://issues.apache.org/jira/browse/HIVE-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273152#comment-14273152 ] zhaohm3 commented on HIVE-9336: --- For more details, visit: https://www.zybuluo.com/Spongcer/note/61369 Fix Hive throws ParseException while handling Grouping-Sets clauses --- Key: HIVE-9336 URL: https://issues.apache.org/jira/browse/HIVE-9336 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.13.1 Reporter: zhaohm3 Fix For: 0.14.0 Currently, when Hive parses GROUPING SETS clauses, if an expression is composed of two or more common subexpressions, its first element can only be a simple identifier without any qualification; otherwise Hive throws a ParseException during the parse stage. Therefore, Hive throws a ParseException while parsing the following HQLs: drop table test; create table test(tc1 int, tc2 int, tc3 int); explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, test.tc2)); explain select tc1+tc2, tc2 from test group by tc1+tc2, tc2 grouping sets(tc2, (tc1 + tc2, tc2)); drop table test; The following shows the ParseException stack trace: 2015-01-07 09:53:34,718 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,719 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,721 INFO [main]: ql.Driver (Driver.java:checkConcurrency(158)) - Concurrency mode is disabled, not creating a lock manager 2015-01-07 09:53:34,721 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=parse 
from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, test.tc2)) 2015-01-07 09:53:34,734 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: ParseException line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' org.apache.hadoop.hive.ql.parse.ParseException: line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:210) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG 
method=compile start=1420595614721 end=1420595614745 duration=24 from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks start=1420595614745 end=1420595614746 duration=1 from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) -
[jira] [Resolved] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho resolved HIVE-9257. - Resolution: Fixed Fix Version/s: 0.15.0 Committed to trunk. Thanks Brock and Xuefu for the detailed review! Thanks also to all the contributors to the spark branch for this milestone! Also modified the build machine's default properties (trunk-mr2) to the new properties attached (trunk-mr2-spark-merge), which have configurations to run SparkCliDriver tests. Follow-ups will be taken care of in HIVE-9335 and subsequent JIRAs. In terms of doc, there is one property added to HiveConf (hive.spark.client.future.timeout), but there are potentially more (see HIVE-9337). These will have to be added in: [https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2|https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2] Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9335) Address review items on HIVE-9257 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9335: --- Attachment: HIVE-9335.2-spark.patch Address review items on HIVE-9257 [Spark Branch] Key: HIVE-9335 URL: https://issues.apache.org/jira/browse/HIVE-9335 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9335.1-spark.patch, HIVE-9335.2-spark.patch I made a pass through HIVE-9257 and found the following issues: {{HashTableSinkOperator.java}} The fields EMPTY_OBJECT_ARRAY and EMPTY_ROW_CONTAINER are no longer constants and should not be in upper case. {{HivePairFlatMapFunction.java}} We share NumberFormat across threads and it's not thread-safe. {{KryoSerializer.java}} we eat the stack trace in deserializeJobConf {{SparkMapRecordHandler}} in processRow we should not be using {{StringUtils.stringifyException}} since LOG can handle stack traces. in close: {noformat} // signal new failure to map-reduce LOG.error("Hit error while closing operators - failing tree"); throw new IllegalStateException("Error while closing operators", e); {noformat} Should be: {noformat} String msg = "Error while closing operators: " + e; throw new IllegalStateException(msg, e); {noformat} {{SparkSessionManagerImpl}} - the method {{canReuseSession}} is useless {{GenSparkSkewJoinProcessor}} {noformat} + // keep it as reference in case we need fetch work +//localPlan.getAliasToFetchWork().put(small_alias.toString(), +//new FetchWork(tblDir, tableDescList.get(small_alias))); {noformat} {{GenSparkWorkWalker}} trim ws {{SparkCompiler}} remote init {{SparkEdgeProperty}} trim ws {{CounterStatsPublisher}} eat exception {{Hadoop23Shims}} unused import of {{ResourceBundles}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9310) CLI JLine does not flush history back to ~/.hivehistory
[ https://issues.apache.org/jira/browse/HIVE-9310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273145#comment-14273145 ] Navis commented on HIVE-9310: - Ok, I got it. But can we still flush the history in the signal handler (https://github.com/apache/hive/blob/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java#L325)? CLI JLine does not flush history back to ~/.hivehistory --- Key: HIVE-9310 URL: https://issues.apache.org/jira/browse/HIVE-9310 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.15.0 Reporter: Gopal V Assignee: Gopal V Priority: Minor Attachments: HIVE-9310.1.patch Hive CLI does not seem to be saving history anymore. In JLine with the PersistentHistory class, to keep history across sessions, you need to do {{reader.getHistory().flush()}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
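The flush-on-exit question above can be sketched with a JVM shutdown hook. PersistentHistory below is a self-contained stand-in for JLine's interface of the same name — a hypothetical stub used only to illustrate the pattern, not the real JLine or CliDriver code:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for JLine's PersistentHistory, for illustration only.
interface PersistentHistory {
    void add(String line);
    void flush();
}

// Buffers entries in memory and "persists" them on flush; a real
// implementation would append to ~/.hivehistory on disk.
class BufferedHistory implements PersistentHistory {
    final List<String> pending = new ArrayList<>();
    final List<String> persisted = new ArrayList<>();
    @Override public void add(String line) { pending.add(line); }
    @Override public void flush() { persisted.addAll(pending); pending.clear(); }
}

public class HistoryFlushDemo {
    // Register a JVM shutdown hook so the history is flushed even when the
    // CLI exits abnormally, mirroring the signal-handler idea discussed above.
    static void flushOnExit(PersistentHistory history) {
        Runtime.getRuntime().addShutdownHook(new Thread(history::flush));
    }

    public static void main(String[] args) {
        BufferedHistory history = new BufferedHistory();
        flushOnExit(history);
        history.add("select 1;");
        history.flush(); // flushing explicitly after each command also works
        System.out.println(history.persisted.size());
    }
}
```

Flushing after each command and flushing from a shutdown hook are complementary: the former bounds data loss, the latter covers abrupt exits.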
[jira] [Created] (HIVE-9341) Apply ColumnPrunning for noop PTFs
Navis created HIVE-9341: --- Summary: Apply ColumnPrunning for noop PTFs Key: HIVE-9341 URL: https://issues.apache.org/jira/browse/HIVE-9341 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial Currently, PTF disables CP optimization, which can impose a significant burden. For example, {noformat} select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on part partition by p_mfgr order by p_name ); STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: part Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: p_mfgr (type: string), p_name (type: string) sort order: ++ Map-reduce partition columns: p_mfgr (type: string) Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE value expressions: p_partkey (type: int), p_name (type: string), p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size (type: int), p_container (type: string), p_retailprice (type: double), p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), INPUT__FILE__NAME (type: string), ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) ... {noformat} There should be a generic way to discern referenced columns, but before that, we know CP can be safely applied to noop functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9334) PredicateTransitivePropagate optimizer should run after PredicatePushDown
[ https://issues.apache.org/jira/browse/HIVE-9334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273132#comment-14273132 ] Navis commented on HIVE-9334: - PredicateTransitivePropagate is only for propagating predicates from the ON condition of JOIN operators. Others should be dealt with by the generic PPD optimizer. So the right order is to run PredicateTransitivePropagate before PPD. (Yes, these two could be merged into one optimizer, but PPD was too unstable in those days.) I don't know where the not-null predicates come from, but they are redundant and should not be added (they were once removed by ConstantPropagateOptimizer). PredicateTransitivePropagate optimizer should run after PredicatePushDown - Key: HIVE-9334 URL: https://issues.apache.org/jira/browse/HIVE-9334 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.10.0, 0.11.0, 0.12.0, 0.13.0, 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-9334.1.patch, HIVE-9334.patch This way PredicateTransitivePropagate will be more effective as it has more filters to push for other branches of joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9336) Fix Hive throws ParseException while handling Grouping-Sets clauses
zhaohm3 created HIVE-9336: - Summary: Fix Hive throws ParseException while handling Grouping-Sets clauses Key: HIVE-9336 URL: https://issues.apache.org/jira/browse/HIVE-9336 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 0.13.1 Reporter: zhaohm3 Fix For: 0.14.0 Currently, when Hive parses GROUPING SETS clauses, if an expression is composed of two or more common subexpressions, its first element can only be a simple identifier without any qualification; otherwise Hive throws a ParseException during the parse stage. Therefore, Hive throws a ParseException while parsing the following HQLs: drop table test; create table test(tc1 int, tc2 int, tc3 int); explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, test.tc2)); explain select tc1+tc2, tc2 from test group by tc1+tc2, tc2 grouping sets(tc2, (tc1 + tc2, tc2)); drop table test; The following shows the ParseException stack trace: 2015-01-07 09:53:34,718 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,719 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,721 INFO [main]: ql.Driver (Driver.java:checkConcurrency(158)) - Concurrency mode is disabled, not creating a lock manager 2015-01-07 09:53:34,721 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,724 INFO [main]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1, (test.tc1, 
test.tc2)) 2015-01-07 09:53:34,734 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: ParseException line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' org.apache.hadoop.hive.ql.parse.ParseException: line 1:105 missing ) at ',' near 'EOF' line 1:116 extraneous input ')' expecting EOF near 'EOF' at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:210) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) 2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=compile start=1420595614721 end=1420595614745 duration=24 from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver 
2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks start=1420595614745 end=1420595614746 duration=1 from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver 2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks start=1420595614746 end=1420595614746 duration=0 from=org.apache.hadoop.hive.ql.Driver But Hive will not throw a ParseException while handling the following HQLs: drop table test;
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273177#comment-14273177 ] Xuefu Zhang commented on HIVE-9257: --- Actually my comments on RB were not covered by HIVE-9335, which already has +1 pending. We may need a separate JIRA to cover them. Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9338) Merge from trunk to spark 1/12/2015 [Spark Branch]
Szehon Ho created HIVE-9338: --- Summary: Merge from trunk to spark 1/12/2015 [Spark Branch] Key: HIVE-9338 URL: https://issues.apache.org/jira/browse/HIVE-9338 Project: Hive Issue Type: Task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9339) Optimize split grouping for CombineHiveInputFormat [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273185#comment-14273185 ] Xuefu Zhang commented on HIVE-9339: --- cc: [~lirui] Optimize split grouping for CombineHiveInputFormat [Spark Branch] - Key: HIVE-9339 URL: https://issues.apache.org/jira/browse/HIVE-9339 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang It seems that split generation, especially in terms of grouping inputs, needs to be improved. For this, we may need cluster information. Because of this, we will first try to solve the problem for Spark. As to cluster information, Spark doesn't provide an API (SPARK-5080). However, Spark does have a listener API, with which the Spark driver can get notifications about executors going up/down, tasks starting/finishing, etc. With this information, the Spark client should be able to have a view of the current cluster image. Spark developers mentioned that the listener can only be created after SparkContext is started, at which time some executions may have already started, so the listener will miss some information. This can be fixed. File a JIRA with the Spark project if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9339) Optimize split grouping for CombineHiveInputFormat [Spark Branch]
Xuefu Zhang created HIVE-9339: - Summary: Optimize split grouping for CombineHiveInputFormat [Spark Branch] Key: HIVE-9339 URL: https://issues.apache.org/jira/browse/HIVE-9339 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang It seems that split generation, especially in terms of grouping inputs, needs to be improved. For this, we may need cluster information. Because of this, we will first try to solve the problem for Spark. As to cluster information, Spark doesn't provide an API (SPARK-5080). However, Spark does have a listener API, with which the Spark driver can get notifications about executors going up/down, tasks starting/finishing, etc. With this information, the Spark client should be able to have a view of the current cluster image. Spark developers mentioned that the listener can only be created after SparkContext is started, at which time some executions may have already started, so the listener will miss some information. This can be fixed. File a JIRA with the Spark project if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
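The driver-side "cluster image" described above — a view of live executors maintained from listener notifications — can be sketched as follows. The event methods here are hypothetical stand-ins; in practice they would be invoked from Spark's listener callbacks, which are not modeled in this self-contained sketch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a driver-side cluster image. The onExecutor* methods
// stand in for notifications that would come from Spark's listener API.
public class ClusterImage {
    // executor id -> cores, updated as executors come and go
    private final Map<String, Integer> executorCores = new ConcurrentHashMap<>();

    public void onExecutorAdded(String executorId, int cores) {
        executorCores.put(executorId, cores);
    }

    public void onExecutorRemoved(String executorId) {
        executorCores.remove(executorId);
    }

    // Split grouping can size groups against the currently available parallelism.
    public int totalCores() {
        return executorCores.values().stream().mapToInt(Integer::intValue).sum();
    }
}
```

A ConcurrentHashMap keeps the view safe to update from listener threads while split generation reads it.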
[jira] [Commented] (HIVE-9339) Optimize split grouping for CombineHiveInputFormat [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273191#comment-14273191 ] Rui Li commented on HIVE-9339: -- Using a listener is fine. We currently use listeners to collect metrics as well. Optimize split grouping for CombineHiveInputFormat [Spark Branch] - Key: HIVE-9339 URL: https://issues.apache.org/jira/browse/HIVE-9339 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang It seems that split generation, especially in terms of grouping inputs, needs to be improved. For this, we may need cluster information. Because of this, we will first try to solve the problem for Spark. As to cluster information, Spark doesn't provide an API (SPARK-5080). However, Spark does have a listener API, with which the Spark driver can get notifications about executors going up/down, tasks starting/finishing, etc. With this information, the Spark client should be able to have a view of the current cluster image. Spark developers mentioned that the listener can only be created after SparkContext is started, at which time some executions may have already started, so the listener will miss some information. This can be fixed. File a JIRA with the Spark project if necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9340) Address review of HIVE-9257 (ii)
[ https://issues.apache.org/jira/browse/HIVE-9340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9340: Description: Some minor fixes: 1. Get rid of spark_test.q, which was used to test the sparkCliDriver test fw. 2. Get rid of spark-snapshot repository dep in pom (found by Xuefu) 3. Cleanup ExplainTask to get rid of * in imports. (found by Xuefu) 4. Reorder the scala/spark dependencies in pom to fit the alphabetical order. was: Some minor fixes: 1. Get rid of spark_test.q, which was used to test the sparkCliDriver test fw. 2. Get rid of spark-snapshot repository dep in pom 3. Cleanup ExplainTask to get rid of * in imports. 4. Reorder the scala/spark dependencies in pom to fit the main order. Address review of HIVE-9257 (ii) Key: HIVE-9340 URL: https://issues.apache.org/jira/browse/HIVE-9340 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho Some minor fixes: 1. Get rid of spark_test.q, which was used to test the sparkCliDriver test fw. 2. Get rid of spark-snapshot repository dep in pom (found by Xuefu) 3. Cleanup ExplainTask to get rid of * in imports. (found by Xuefu) 4. Reorder the scala/spark dependencies in pom to fit the alphabetical order. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9335) Address review items on HIVE-9257 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273104#comment-14273104 ] Xuefu Zhang commented on HIVE-9335: --- +1 Address review items on HIVE-9257 [Spark Branch] Key: HIVE-9335 URL: https://issues.apache.org/jira/browse/HIVE-9335 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: spark-branch Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-9335.1-spark.patch, HIVE-9335.2-spark.patch I made a pass through HIVE-9257 and found the following issues: {{HashTableSinkOperator.java}} The fields EMPTY_OBJECT_ARRAY and EMPTY_ROW_CONTAINER are no longer constants and should not be in upper case. {{HivePairFlatMapFunction.java}} We share NumberFormat across threads and it's not thread-safe. {{KryoSerializer.java}} we eat the stack trace in deserializeJobConf {{SparkMapRecordHandler}} in processRow we should not be using {{StringUtils.stringifyException}} since LOG can handle stack traces. in close: {noformat} // signal new failure to map-reduce LOG.error("Hit error while closing operators - failing tree"); throw new IllegalStateException("Error while closing operators", e); {noformat} Should be: {noformat} String msg = "Error while closing operators: " + e; throw new IllegalStateException(msg, e); {noformat} {{SparkSessionManagerImpl}} - the method {{canReuseSession}} is useless {{GenSparkSkewJoinProcessor}} {noformat} + // keep it as reference in case we need fetch work +//localPlan.getAliasToFetchWork().put(small_alias.toString(), +//new FetchWork(tblDir, tableDescList.get(small_alias))); {noformat} {{GenSparkWorkWalker}} trim ws {{SparkCompiler}} remote init {{SparkEdgeProperty}} trim ws {{CounterStatsPublisher}} eat exception {{Hadoop23Shims}} unused import of {{ResourceBundles}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9172) Merging HIVE-5871 into LazySimpleSerDe
[ https://issues.apache.org/jira/browse/HIVE-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273158#comment-14273158 ] Navis commented on HIVE-9172: - [~jdere] Thanks for the valuable comments I had missed. Agreed on reverting MultiDelimitSerde (I hadn't even noticed the base64 encoding problem in the lazy serde, which should also be configurable). On the backward compatibility issue, I agree that the classes in LazySerDe (objects, OIs, utils) might be useful for implementing custom SerDes, but basically they are not for public usage and should not be regarded as such. Also, I think I've minimized the changes, so it should not be that hard to rebase. Merging HIVE-5871 into LazySimpleSerDe -- Key: HIVE-9172 URL: https://issues.apache.org/jira/browse/HIVE-9172 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-9172.1.patch.txt, HIVE-9172.2.patch.txt, HIVE-9172.3.patch.txt Merging multi-character field delimiter support into LazySimpleSerDe -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9257: Labels: TODOC15 (was: ) Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Move ancient Hive issues from Hadoop project to Hive
+1 On Fri, Jan 9, 2015 at 5:48 PM, Ashutosh Chauhan hashut...@apache.org wrote: Hi all, Hive started out as a Hadoop subproject. At that time, Hadoop's JIRA was used to track Hive's bugs and features. As I try to trace the lineage of some very old code in Hive, I sometimes end up on those jiras. It would be nice to move those issues from Hadoop to Hive so that they are easy to search, with all JIRAs relevant to Hive contained in one project. A representative list is: http://s.apache.org/Hive-issues-in-Hadoop Unless someone objects, I will start to move those issues to Hive some time over the next week. Thanks, Ashutosh
[jira] [Updated] (HIVE-9119) ZooKeeperHiveLockManager does not use zookeeper in the proper way
[ https://issues.apache.org/jira/browse/HIVE-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9119: -- Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Na. ZooKeeperHiveLockManager does not use zookeeper in the proper way - Key: HIVE-9119 URL: https://issues.apache.org/jira/browse/HIVE-9119 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Na Yang Assignee: Na Yang Fix For: 0.15.0 Attachments: HIVE-9119.1.patch, HIVE-9119.2.patch, HIVE-9119.3.patch, HIVE-9119.4.patch ZooKeeperHiveLockManager does not use zookeeper in the proper way. Currently a new zookeeper client instance is created for each getlock/releaselock query, which sometimes causes the number of open connections between HiveServer2 and ZooKeeper to exceed the max connection number that the zookeeper server allows. To use zookeeper as a distributed lock, there is no need to create a new zookeeper instance for every getlock attempt. A single zookeeper instance could be reused and shared by ZooKeeperHiveLockManagers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
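The fix described above boils down to sharing one client across lock operations instead of opening a new connection per query. A sketch of that sharing pattern, with a hypothetical Client class standing in for the real org.apache.zookeeper.ZooKeeper handle:

```java
// Hypothetical stand-in for the real ZooKeeper handle, used only to
// illustrate the one-shared-client pattern from the fix above.
public class SharedZkClient {
    static final class Client {
        final String connectString;
        Client(String connectString) { this.connectString = connectString; }
    }

    private static volatile Client instance;

    // Lazily create a single client and reuse it for every lock/unlock call,
    // rather than opening a fresh connection for each getlock/releaselock.
    public static Client get(String connectString) {
        Client c = instance;
        if (c == null) {
            synchronized (SharedZkClient.class) {
                if (instance == null) {
                    instance = new Client(connectString);
                }
                c = instance;
            }
        }
        return c;
    }
}
```

Double-checked locking on a volatile field keeps the common path lock-free; a real implementation would also handle session expiry by replacing the shared handle.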
[jira] [Updated] (HIVE-9119) ZooKeeperHiveLockManager does not use zookeeper in the proper way
[ https://issues.apache.org/jira/browse/HIVE-9119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-9119: -- Labels: TODOC15 (was: ) ZooKeeperHiveLockManager does not use zookeeper in the proper way - Key: HIVE-9119 URL: https://issues.apache.org/jira/browse/HIVE-9119 Project: Hive Issue Type: Improvement Components: Locking Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Na Yang Assignee: Na Yang Labels: TODOC15 Fix For: 0.15.0 Attachments: HIVE-9119.1.patch, HIVE-9119.2.patch, HIVE-9119.3.patch, HIVE-9119.4.patch ZooKeeperHiveLockManager does not use zookeeper in the proper way. Currently a new zookeeper client instance is created for each getlock/releaselock query, which sometimes causes the number of open connections between HiveServer2 and ZooKeeper to exceed the max connection number that the zookeeper server allows. To use zookeeper as a distributed lock, there is no need to create a new zookeeper instance for every getlock attempt. A single zookeeper instance could be reused and shared by ZooKeeperHiveLockManagers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9343) Fix windowing.q for Spark on trunk
[ https://issues.apache.org/jira/browse/HIVE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273261#comment-14273261 ] Rui Li commented on HIVE-9343: -- OK, I'll take a look. Fix windowing.q for Spark on trunk -- Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9342) add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode
Pierre Yin created HIVE-9342: Summary: add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode Key: HIVE-9342 URL: https://issues.apache.org/jira/browse/HIVE-9342 Project: Hive Issue Type: Improvement Components: spark-branch Affects Versions: spark-branch Reporter: Pierre Yin Priority: Minor When I run hive on spark in YARN mode, I want to control some YARN options, such as --num-executors, --executor-cores, and --executor-memory. We can append these options to the argv in SparkClientImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
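The approach proposed above — appending YARN options to the spark-submit argument list built in SparkClientImpl — could look roughly like the following sketch. The class name, the helper method, and the conf-key-to-flag mapping are assumptions for illustration; the actual SparkClientImpl code may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SparkArgvBuilder {
    // Hypothetical helper: translate Spark conf entries into spark-submit flags.
    static List<String> appendYarnOptions(List<String> argv, Map<String, String> conf) {
        // Only pass flags the user actually set; otherwise let Spark use its defaults.
        String[][] mapping = {
            {"spark.executor.instances", "--num-executors"},
            {"spark.executor.cores",     "--executor-cores"},
            {"spark.executor.memory",    "--executor-memory"},
        };
        for (String[] m : mapping) {
            String value = conf.get(m[0]);
            if (value != null && !value.isEmpty()) {
                argv.add(m[1]);
                argv.add(value);
            }
        }
        return argv;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.executor.instances", "4");
        conf.put("spark.executor.memory", "2g");

        List<String> argv = new ArrayList<>(List.of("spark-submit"));
        appendYarnOptions(argv, conf);
        // Unset options (executor-cores here) are simply omitted.
        System.out.println(String.join(" ", argv));
        // spark-submit --num-executors 4 --executor-memory 2g
    }
}
```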
[jira] [Updated] (HIVE-9343) Fix windowing.q for Spark on trunk
[ https://issues.apache.org/jira/browse/HIVE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9343: - Assignee: Rui Li Status: Patch Available (was: Open) Fix windowing.q for Spark on trunk -- Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Rui Li Attachments: HIVE-9343.1.patch After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9343) Fix windowing.q for Spark on trunk
[ https://issues.apache.org/jira/browse/HIVE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-9343: - Attachment: HIVE-9343.1.patch Fix windowing.q for Spark on trunk -- Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Attachments: HIVE-9343.1.patch After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Patch Available (was: Open) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
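The rewrite the HIVE-9039 description mentions — union distinct expressed as union all followed by a group by — can be illustrated with a short SQL sketch. The table and column names here are invented for illustration only:

```sql
-- What the user writes (unsupported before this patch):
SELECT key, value FROM t1
UNION
SELECT key, value FROM t2;

-- What the rewrite produces: UNION ALL, then GROUP BY on every
-- selected column, which eliminates duplicate rows.
SELECT key, value
FROM (
  SELECT key, value FROM t1
  UNION ALL
  SELECT key, value FROM t2
) u
GROUP BY key, value;
```

Grouping on all select columns is semantically equivalent to deduplicating the unioned rows, which is exactly what UNION DISTINCT requires.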
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Attachment: HIVE-9039.14.patch update union remove 25.q Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9338) Merge from trunk to spark 1/12/2015 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9338: Status: Patch Available (was: Open) Merge from trunk to spark 1/12/2015 [Spark Branch] -- Key: HIVE-9338 URL: https://issues.apache.org/jira/browse/HIVE-9338 Project: Hive Issue Type: Task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-9338-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9341) Apply ColumnPrunning for noop PTFs
[ https://issues.apache.org/jira/browse/HIVE-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9341: Attachment: HIVE-9341.1.patch.txt Apply ColumnPrunning for noop PTFs -- Key: HIVE-9341 URL: https://issues.apache.org/jira/browse/HIVE-9341 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-9341.1.patch.txt Currently, PTF disables CP optimization, which can impose a huge burden. For example, {noformat} select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on part partition by p_mfgr order by p_name ); STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: part Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: p_mfgr (type: string), p_name (type: string) sort order: ++ Map-reduce partition columns: p_mfgr (type: string) Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE value expressions: p_partkey (type: int), p_name (type: string), p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size (type: int), p_container (type: string), p_retailprice (type: double), p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), INPUT__FILE__NAME (type: string), ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) ... {noformat} There should be a generic way to discern referenced columns, but before that, we know CP can be safely applied to noop functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9344) Fix flaky test optimize_nullscan
[ https://issues.apache.org/jira/browse/HIVE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-9344: --- Assignee: (was: Brock Noland) Fix flaky test optimize_nullscan Key: HIVE-9344 URL: https://issues.apache.org/jira/browse/HIVE-9344 Project: Hive Issue Type: Bug Reporter: Brock Noland The optimize_nullscan test is extremely flaky. We need to find a way to fix this test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-9344) Fix flaky test optimize_nullscan
[ https://issues.apache.org/jira/browse/HIVE-9344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland reassigned HIVE-9344: -- Assignee: Brock Noland Fix flaky test optimize_nullscan Key: HIVE-9344 URL: https://issues.apache.org/jira/browse/HIVE-9344 Project: Hive Issue Type: Bug Reporter: Brock Noland Assignee: Brock Noland The optimize_nullscan test is extremely flaky. We need to find a way to fix this test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273258#comment-14273258 ] Brock Noland commented on HIVE-9257: I also created HIVE-9344 to fix the optimize_nullscan tests that are so flaky. Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that the spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9341) Apply ColumnPrunning for noop PTFs
[ https://issues.apache.org/jira/browse/HIVE-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273284#comment-14273284 ] Hive QA commented on HIVE-9341: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12691593/HIVE-9341.1.patch.txt {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 7311 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf_streaming org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_windowing org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2335/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2335/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2335/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12691593 - PreCommit-HIVE-TRUNK-Build Apply ColumnPrunning for noop PTFs -- Key: HIVE-9341 URL: https://issues.apache.org/jira/browse/HIVE-9341 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-9341.1.patch.txt Currently, PTF disables CP optimization, which can impose a huge burden. For example, {noformat} select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on part partition by p_mfgr order by p_name ); STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: part Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: p_mfgr (type: string), p_name (type: string) sort order: ++ Map-reduce partition columns: p_mfgr (type: string) Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE value expressions: p_partkey (type: int), p_name (type: string), p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size (type: int), p_container (type: string), p_retailprice (type: double), p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), INPUT__FILE__NAME (type: string), ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) ... {noformat} There should be a generic way to discern referenced columns, but before that, we know CP can be safely applied to noop functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9338) Merge from trunk to spark 1/12/2015 [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-9338: Attachment: HIVE-9338-spark.patch Merge from trunk to spark 1/12/2015 [Spark Branch] -- Key: HIVE-9338 URL: https://issues.apache.org/jira/browse/HIVE-9338 Project: Hive Issue Type: Task Components: Spark Affects Versions: spark-branch Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-9338-spark.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9342) add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode
[ https://issues.apache.org/jira/browse/HIVE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Yin updated HIVE-9342: - Attachment: HIVE-9342.1-spark.patch add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode -- Key: HIVE-9342 URL: https://issues.apache.org/jira/browse/HIVE-9342 Project: Hive Issue Type: Improvement Components: spark-branch Affects Versions: spark-branch Reporter: Pierre Yin Priority: Minor Labels: spark Fix For: spark-branch Attachments: HIVE-9342.1-spark.patch When I run hive on spark in YARN mode, I want to control some YARN options, such as --num-executors, --executor-cores, and --executor-memory. We can append these options to the argv in SparkClientImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273256#comment-14273256 ] Brock Noland commented on HIVE-9257: The only test which failed and is not flaky was the windowing test. I created HIVE-9343 to track the issue. Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that the spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9344) Fix flaky test optimize_nullscan
Brock Noland created HIVE-9344: -- Summary: Fix flaky test optimize_nullscan Key: HIVE-9344 URL: https://issues.apache.org/jira/browse/HIVE-9344 Project: Hive Issue Type: Bug Reporter: Brock Noland The optimize_nullscan test is extremely flaky. We need to find a way to fix this test. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273223#comment-14273223 ] Brock Noland commented on HIVE-9257: FYI - I am running a build here: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/1/ to ensure all the tests are passing post-merge. If there are any failing tests due to the merge, we should address them ASAP tomorrow. Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that the spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9341) Apply ColumnPrunning for noop PTFs
[ https://issues.apache.org/jira/browse/HIVE-9341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-9341: Status: Patch Available (was: Open) Apply ColumnPrunning for noop PTFs -- Key: HIVE-9341 URL: https://issues.apache.org/jira/browse/HIVE-9341 Project: Hive Issue Type: Improvement Components: PTF-Windowing Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-9341.1.patch.txt Currently, PTF disables CP optimization, which can impose a huge burden. For example, {noformat} select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on part partition by p_mfgr order by p_name ); STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: part Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: p_mfgr (type: string), p_name (type: string) sort order: ++ Map-reduce partition columns: p_mfgr (type: string) Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE value expressions: p_partkey (type: int), p_name (type: string), p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size (type: int), p_container (type: string), p_retailprice (type: double), p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), INPUT__FILE__NAME (type: string), ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) ... {noformat} There should be a generic way to discern referenced columns, but before that, we know CP can be safely applied to noop functions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Review Request 29800: Apply ColumnPrunning for noop PTFs
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/29800/ --- Review request for hive. Bugs: HIVE-9341 https://issues.apache.org/jira/browse/HIVE-9341 Repository: hive-git Description --- Currently, PTF disables CP optimization, which can impose a huge burden. For example, {noformat} select p_mfgr, p_name, p_size, rank() over (partition by p_mfgr order by p_name) as r, dense_rank() over (partition by p_mfgr order by p_name) as dr, sum(p_retailprice) over (partition by p_mfgr order by p_name rows between unbounded preceding and current row) as s1 from noop(on part partition by p_mfgr order by p_name ); STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: part Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: p_mfgr (type: string), p_name (type: string) sort order: ++ Map-reduce partition columns: p_mfgr (type: string) Statistics: Num rows: 26 Data size: 3147 Basic stats: COMPLETE Column stats: NONE value expressions: p_partkey (type: int), p_name (type: string), p_mfgr (type: string), p_brand (type: string), p_type (type: string), p_size (type: int), p_container (type: string), p_retailprice (type: double), p_comment (type: string), BLOCK__OFFSET__INSIDE__FILE (type: bigint), INPUT__FILE__NAME (type: string), ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) ... {noformat} There should be a generic way to discern referenced columns, but before that, we know CP can be safely applied to noop functions. Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java afd1738 ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java ee7328e Diff: https://reviews.apache.org/r/29800/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-9039) Support Union Distinct
[ https://issues.apache.org/jira/browse/HIVE-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-9039: -- Status: Open (was: Patch Available) Support Union Distinct -- Key: HIVE-9039 URL: https://issues.apache.org/jira/browse/HIVE-9039 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-9039.01.patch, HIVE-9039.02.patch, HIVE-9039.03.patch, HIVE-9039.04.patch, HIVE-9039.05.patch, HIVE-9039.06.patch, HIVE-9039.07.patch, HIVE-9039.08.patch, HIVE-9039.09.patch, HIVE-9039.10.patch, HIVE-9039.11.patch, HIVE-9039.12.patch, HIVE-9039.13.patch, HIVE-9039.14.patch Current version (Hive 0.14) does not support union (or union distinct). It only supports union all. In this patch, we try to add this new feature by rewriting union distinct to union all followed by group by. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9343) Fix windowing.q for Spark on trunk
Brock Noland created HIVE-9343: -- Summary: Fix windowing.q for Spark on trunk Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Reporter: Brock Noland After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9342) add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode
[ https://issues.apache.org/jira/browse/HIVE-9342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pierre Yin updated HIVE-9342: - Fix Version/s: spark-branch Status: Patch Available (was: Open) Please see the patch in the attachment. add num-executors / executor-cores / executor-memory option support for hive on spark in Yarn mode -- Key: HIVE-9342 URL: https://issues.apache.org/jira/browse/HIVE-9342 Project: Hive Issue Type: Improvement Components: spark-branch Affects Versions: spark-branch Reporter: Pierre Yin Priority: Minor Labels: spark Fix For: spark-branch When I run hive on spark in YARN mode, I want to control some YARN options, such as --num-executors, --executor-cores, and --executor-memory. We can append these options to the argv in SparkClientImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9343) Fix windowing.q for Spark on trunk
[ https://issues.apache.org/jira/browse/HIVE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273253#comment-14273253 ] Brock Noland commented on HIVE-9343: [~chengxiang li] or [~lirui] - any chance you could update the output file for this test? Fix windowing.q for Spark on trunk -- Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9257) Merge from spark to trunk January 2015
[ https://issues.apache.org/jira/browse/HIVE-9257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273295#comment-14273295 ] Szehon Ho commented on HIVE-9257: - Thanks Brock. FYI, I am planning another merge in the next few days to incorporate the review fixes, and also to get rid of the spark-snapshot dependency. It will hopefully be a more manageable patch that can be uploaded this time and tested the normal way :) Merge from spark to trunk January 2015 -- Key: HIVE-9257 URL: https://issues.apache.org/jira/browse/HIVE-9257 Project: Hive Issue Type: Sub-task Components: Spark Affects Versions: 0.15.0 Reporter: Szehon Ho Assignee: Szehon Ho Labels: TODOC15 Fix For: 0.15.0 Attachments: trunk-mr2-spark-merge.properties The hive on spark work has reached a point where we can merge it into the trunk branch. Note that the spark execution engine is optional and no current users should be impacted. This JIRA will be used to track the merge. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9343) Fix windowing.q for Spark on trunk
[ https://issues.apache.org/jira/browse/HIVE-9343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14273296#comment-14273296 ] Szehon Ho commented on HIVE-9343: - Thanks guys for taking care of this! +1 pending tests. Fix windowing.q for Spark on trunk -- Key: HIVE-9343 URL: https://issues.apache.org/jira/browse/HIVE-9343 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Rui Li Attachments: HIVE-9343.1.patch After HIVE-9257 the windowing.q test is failing on trunk since HIVE-9104 was not merged to spark. Details: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/HIVE-TRUNK-HADOOP-2/lastCompletedBuild/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)