Re: [ANNOUNCE] New Hive Committer - Rajesh Balamohan
Congrats Rajesh! :) On Tue, Dec 13, 2016 at 9:36 PM, Pengcheng Xiong wrote: > Congrats Rajesh! :) > > On Tue, Dec 13, 2016 at 6:51 PM, Prasanth Jayachandran < > prasan...@apache.org > > wrote: > > > The Apache Hive PMC has voted to make Rajesh Balamohan a committer on the > > Apache Hive Project. Please join me in congratulating Rajesh. > > > > Congratulations Rajesh! > > > > Thanks > > Prasanth > -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-13730) hybridgrace_hashjoin_1.q test gets stuck
Vikram Dixit K created HIVE-13730: - Summary: hybridgrace_hashjoin_1.q test gets stuck Key: HIVE-13730 URL: https://issues.apache.org/jira/browse/HIVE-13730 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Wei Zheng Priority: Critical I am seeing hybridgrace_hashjoin_1.q getting stuck on master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13628) Support for permanent functions - error handling if no restart
Vikram Dixit K created HIVE-13628: - Summary: Support for permanent functions - error handling if no restart Key: HIVE-13628 URL: https://issues.apache.org/jira/browse/HIVE-13628 Project: Hive Issue Type: Bug Components: llap Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-13628.1.patch Support for permanent functions - error handling if no restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13627) When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task
Vikram Dixit K created HIVE-13627: - Summary: When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task Key: HIVE-13627 URL: https://issues.apache.org/jira/browse/HIVE-13627 Project: Hive Issue Type: Bug Components: llap Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task. This way, the rest of the dependent tasks can fail sooner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13621) compute stats in certain cases fails with NPE
Vikram Dixit K created HIVE-13621: - Summary: compute stats in certain cases fails with NPE Key: HIVE-13621 URL: https://issues.apache.org/jira/browse/HIVE-13621 Project: Hive Issue Type: Bug Components: HBase Metastore, Metastore Affects Versions: 2.1.0, 2.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693) at org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739) at org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13619) Bucket map join plan is incorrect
Vikram Dixit K created HIVE-13619: - Summary: Bucket map join plan is incorrect Key: HIVE-13619 URL: https://issues.apache.org/jira/browse/HIVE-13619 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 2.0.0, 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Same as HIVE-12992. Missed a single line check. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.
Vikram Dixit K created HIVE-13518: - Summary: Hive on Tez: Shuffle joins do not choose the right 'big' table. Key: HIVE-13518 URL: https://issues.apache.org/jira/browse/HIVE-13518 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K Currently the big table is always assumed to be at position 0 but this isn't efficient for some queries as the big table at position 1 could have a lot more keys/skew. We already have a mechanism of choosing the big table that can be leveraged to make the right choice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
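The idea in HIVE-13518 above can be illustrated with a minimal sketch: instead of always treating position 0 as the big table, pick the join input with the largest estimated key count. The class name, method name, and the use of a plain array of estimates are assumptions made for this example; this is not the selection mechanism Hive actually uses.

```java
// Hypothetical helper for illustration only; not Hive's planner code.
final class BigTableChooser {

    /**
     * Returns the position of the input with the largest estimated key count.
     * Ties keep the lowest position, so position 0 stays the default.
     */
    static int choosePosition(long[] estimatedKeyCounts) {
        if (estimatedKeyCounts == null || estimatedKeyCounts.length == 0) {
            throw new IllegalArgumentException("at least one join input is required");
        }
        int best = 0;
        for (int i = 1; i < estimatedKeyCounts.length; i++) {
            if (estimatedKeyCounts[i] > estimatedKeyCounts[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // The input at position 1 has far more keys, so it is chosen.
        System.out.println(choosePosition(new long[] {10L, 5000L}));
    }
}
```

A real planner would also weigh skew and data size; the sketch only shows that the choice need not be fixed at position 0.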
[jira] [Created] (HIVE-13485) Session id appended to thread name multiple times.
Vikram Dixit K created HIVE-13485: - Summary: Session id appended to thread name multiple times. Key: HIVE-13485 URL: https://issues.apache.org/jira/browse/HIVE-13485 Project: Hive Issue Type: Bug Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-13485.1.patch HIVE-13153 addressed a portion of this issue. Follow up from there. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13438) Add a service check script for llap
Vikram Dixit K created HIVE-13438: - Summary: Add a service check script for llap Key: HIVE-13438 URL: https://issues.apache.org/jira/browse/HIVE-13438 Project: Hive Issue Type: Bug Components: llap Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-13438.1.patch We want to have a test script that can be run by an installer such as ambari that makes sure that the service is up and running. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13408) Issue appending HIVE_QUERY_ID without checking if the prefix already exists
Vikram Dixit K created HIVE-13408: - Summary: Issue appending HIVE_QUERY_ID without checking if the prefix already exists Key: HIVE-13408 URL: https://issues.apache.org/jira/browse/HIVE-13408 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} We are resetting the hadoop caller context to HIVE_QUERY_ID:HIVE_QUERY_ID: {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
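The doubled prefix reported in HIVE-13408 above is the usual symptom of prepending without first checking for the prefix. A minimal sketch of an idempotent version follows. The class and method names are hypothetical, and the code does not touch the Hadoop caller context API; it only shows the string check.

```java
// Hypothetical helper for illustration only; not the Hive shim code.
final class CallerContextPrefix {
    static final String PREFIX = "HIVE_QUERY_ID:";

    /** Returns the value carrying exactly one PREFIX, however many times it is applied. */
    static String withPrefix(String value) {
        if (value == null || value.isEmpty()) {
            return PREFIX;
        }
        // Only prepend when the prefix is not already there.
        return value.startsWith(PREFIX) ? value : PREFIX + value;
    }

    public static void main(String[] args) {
        String once = withPrefix("query_1");
        String twice = withPrefix(once); // applying it again changes nothing
        System.out.println(once);
        System.out.println(twice);
    }
}
```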
[jira] [Created] (HIVE-13394) Analyze table fails in tez on empty partitions
Vikram Dixit K created HIVE-13394: - Summary: Analyze table fails in tez on empty partitions Key: HIVE-13394 URL: https://issues.apache.org/jira/browse/HIVE-13394 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 2.0.0, 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:237) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:766) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343) ... 17 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.hive.ql.udf.generic.NumDistinctValueEstimator.deserialize(NumDistinctValueEstimator.java:219) at org.apache.hadoop.hive.ql.udf.generic.NumDistinctValueEstimator.<init>(NumDistinctValueEstimator.java:112) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFComputeStats$GenericUDAFNumericStatsEvaluator.merge(GenericUDAFComputeStats.java:556) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:188) at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:612) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:851) at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695) at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761) ...
18 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_145591034_27748_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13343) Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join
Vikram Dixit K created HIVE-13343: - Summary: Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join Key: HIVE-13343 URL: https://issues.apache.org/jira/browse/HIVE-13343 Project: Hive Issue Type: Bug Components: llap Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Due to performance reasons, we should disable use of hybrid grace hash join in llap when dynamic partition hash join is not used. With dynamic partition hash join, we need hybrid grace hash join due to the possibility of skews. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13342) Improve logging in llap decider for llap
Vikram Dixit K created HIVE-13342: - Summary: Improve logging in llap decider for llap Key: HIVE-13342 URL: https://issues.apache.org/jira/browse/HIVE-13342 Project: Hive Issue Type: Bug Components: llap Affects Versions: 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Currently we do not log our decisions with respect to llap: are we running everything in llap mode, or only parts of the plan? We need more logging. Also, if the llap mode is "all" but for some reason we cannot run the work in llap mode, we should fail and throw an exception advising the user to change the mode to auto. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13329) Hive query id should not be allowed to be modified by users.
Vikram Dixit K created HIVE-13329: - Summary: Hive query id should not be allowed to be modified by users. Key: HIVE-13329 URL: https://issues.apache.org/jira/browse/HIVE-13329 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13286) Query ID is being reused across queries
Vikram Dixit K created HIVE-13286: - Summary: Query ID is being reused across queries Key: HIVE-13286 URL: https://issues.apache.org/jira/browse/HIVE-13286 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Pengcheng Xiong Priority: Critical [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is being reused across queries. This defeats the purpose of a query id. I am not sure what the purpose of the change in that jira is but it breaks the assumption about a query id being unique for each query. Please take a look into this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
Vikram Dixit K created HIVE-13282: - Summary: GroupBy and select operator encounter ArrayIndexOutOfBoundsException Key: HIVE-13282 URL: https://issues.apache.org/jira/browse/HIVE-13282 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 2.0.0, 1.2.1, 2.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K The group by and select operators run into the ArrayIndexOutOfBoundsException when they incorrectly initialize themselves with tag 0 but the incoming tag id is different. {code} select count(*) from (select rt1.id from (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1 join (select rt2.id from (select t2.key as id, t2.value as od from tab_part t2 group by key, value) rt2) vt2 where vt1.id=vt2.id; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[ANNOUNCE] New Hive Committer - Wei Zheng
The Apache Hive PMC has voted to make Wei Zheng a committer on the Apache Hive Project. Please join me in congratulating Wei. Thanks Vikram.
[jira] [Created] (HIVE-13152) JDBC split refactoring and handle some edge cases
Vikram Dixit K created HIVE-13152: - Summary: JDBC split refactoring and handle some edge cases Key: HIVE-13152 URL: https://issues.apache.org/jira/browse/HIVE-13152 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: llap wrap jdbc split allow zero split returns in llapdump allow setting of llap output format as file format option create job conf via dag utils JDBC LlapInputSplit should not be an inner class Hack: Spark uses hive-1.2.1, which does not have HiveConf.ConfVars.LLAP_DAEMON_RPC_PORT, use "hive.llap.daemon.rpc.port" - Use Class.getName() rather than Class.toString() for class name - Add LlapInputSplit Writable test Fix for filesink operator close NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Apache Hive 2.0.0 Release Candidate 3
+1. Downloaded the sources. Ran the rat check and ran tests. Thanks Sergey. On Thu, Feb 11, 2016 at 3:58 PM, Sergey Shelukhin wrote: > Yeah, he gets to do 1.3 ;) > > From: Alan Gates > Reply-To: "dev@hive.apache.org" > Date: Thursday, February 11, 2016 at 11:40 > To: "dev@hive.apache.org" > Subject: Re: [VOTE] Apache Hive 2.0.0 Release Candidate 3 > > +1, checks sigs, build with fresh maven repo, and did a quick smoke test. > > Does the PMC member who votes on the most RC's get a prize? > > Alan. > > Sergey Shelukhin > February 9, 2016 at 18:52 > Another day, another release candidate... this time fixing the logging > issue. > > Apache Hive 2.0.0 Release Candidate 3 is available here: > > http://people.apache.org/~sershe/hive-2.0.0-rc3/ > > Maven artifacts are at > > https://repository.apache.org/content/repositories/orgapachehive-1046/ > > > Source tag for RC3 (github mirror) is: > https://github.com/apache/hive/releases/tag/release-2.0.0-rc3 > > > Voting will conclude in 72 hours. > > Hive PMC Members: Please test and vote. > > > Thanks. > > > -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect
Vikram Dixit K created HIVE-12992: - Summary: Hive on tez: Bucket map join plan is incorrect Key: HIVE-12992 URL: https://issues.apache.org/jira/browse/HIVE-12992 Project: Hive Issue Type: Bug Affects Versions: 1.2.1, 2.0.0 Reporter: Vikram Dixit K TPCH Query 9 fails when bucket map join is enabled: {code} FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, taskVertexName=Map 1, edgeVertexName=Reducer 5, taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, destinationVertexName=Reducer 5, java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88) at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458) at org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386) at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439) at org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382) at org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202) at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172) at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
Vikram Dixit K created HIVE-12947: - Summary: SMB join in tez has ClassCastException when container reuse is on Key: HIVE-12947 URL: https://issues.apache.org/jira/browse/HIVE-12947 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to org.apache.hadoop.hive.ql.exec.DummyStoreOperator at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189) ... 15 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12905) Issue with mapjoin in tez under certain conditions
Vikram Dixit K created HIVE-12905: - Summary: Issue with mapjoin in tez under certain conditions Key: HIVE-12905 URL: https://issues.apache.org/jira/browse/HIVE-12905 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1, 1.0.1, 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K In a specific case where we have an outer join followed by another join on the same key and the non-outer side of the outer join is empty, hive-on-tez produces incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2
Vikram Dixit K created HIVE-12797: - Summary: Synchronization issues with tez/llap session pool in hs2 Key: HIVE-12797 URL: https://issues.apache.org/jira/browse/HIVE-12797 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K The changes introduced as part of HIVE-12674 cause issues while shutting down hs2 when session pools are used. {code} java.util.ConcurrentModificationException at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) ~[?:1.8.0_45] at java.util.LinkedList$ListItr.remove(LinkedList.java:921) ~[?:1.8.0_45] at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288) ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79] at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79] at org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
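A ConcurrentModificationException from a LinkedList iterator, as in the trace above, generally means the list was changed by something other than that iterator while the iteration was in progress. One common way to avoid it during shutdown is to iterate over a snapshot and guard the list with a single lock. The sketch below shows that pattern only; the class SessionPoolSketch and its methods are hypothetical, and this is not a claim about the actual cause or fix in TezSessionPoolManager.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a session pool; not Hive's TezSessionPoolManager.
final class SessionPoolSketch {
    private final List<String> openSessions = new LinkedList<>();

    synchronized void open(String sessionId) {
        openSessions.add(sessionId);
    }

    // Closing a session removes it from the shared list. Calling this while
    // iterating openSessions directly would invalidate the live iterator.
    synchronized void close(String sessionId) {
        openSessions.remove(sessionId);
    }

    // Iterate over a snapshot so that close() can mutate the real list safely.
    synchronized void stop() {
        for (String sessionId : new ArrayList<>(openSessions)) {
            close(sessionId);
        }
    }

    synchronized int openCount() {
        return openSessions.size();
    }

    public static void main(String[] args) {
        SessionPoolSketch pool = new SessionPoolSketch();
        pool.open("s1");
        pool.open("s2");
        pool.open("s3");
        pool.stop();
        System.out.println("open sessions after stop: " + pool.openCount());
    }
}
```

All methods synchronize on the same monitor, so a concurrent open or close from another thread cannot interleave with stop.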
[jira] [Created] (HIVE-12768) Thread safety: binary sortable serde decimal deserialization
Vikram Dixit K created HIVE-12768: - Summary: Thread safety: binary sortable serde decimal deserialization Key: HIVE-12768 URL: https://issues.apache.org/jira/browse/HIVE-12768 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Blocker We see thread safety issues due to static decimal buffer in binary sortable serde. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
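A static scratch buffer shared by all threads, as described in HIVE-12768 above, lets one thread overwrite bytes another thread is still reading. A common remedy is to give each thread its own buffer. The sketch below uses ThreadLocal for that; the class DecimalScratch, the method decodeDigits, and the 64-byte initial size are assumptions for illustration and are not taken from the binary sortable serde.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the Hive BinarySortableSerDe code.
final class DecimalScratch {
    // Unsafe variant (for contrast): a single "static byte[]" used by every thread.
    // Safe variant: one scratch buffer per thread.
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[64]);

    /** Copies length bytes from src into this thread's buffer and decodes them as ASCII digits. */
    static String decodeDigits(byte[] src, int offset, int length) {
        byte[] buf = BUFFER.get();
        if (buf.length < length) {
            // Grow the per-thread buffer when the value does not fit.
            buf = new byte[length];
            BUFFER.set(buf);
        }
        System.arraycopy(src, offset, buf, 0, length);
        return new String(buf, 0, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] raw = "0012345".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeDigits(raw, 2, 5));
    }
}
```

Allocating a fresh buffer per call would also be thread safe; the per-thread buffer keeps the reuse that the static buffer was presumably meant to provide.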
[jira] [Created] (HIVE-12740) NPE with HS2 when using null input format
Vikram Dixit K created HIVE-12740: - Summary: NPE with HS2 when using null input format Key: HIVE-12740 URL: https://issues.apache.org/jira/browse/HIVE-12740 Project: Hive Issue Type: Bug Components: HiveServer2, Tez Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical When we have a query that returns empty rows and when using tez with hs2, we hit NPE: {code} java.util.concurrent.ExecutionException: java.lang.NullPointerException at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:490) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:447) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.writeOldSplits(MRInputHelpers.java:559) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplits(MRInputHelpers.java:619) at org.apache.tez.mapreduce.hadoop.MRInputHelpers.configureMRInputWithLegacySplitGeneration(MRInputHelpers.java:109) at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:617) at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1103) at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:386) at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1816) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1561) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1338) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1154) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1147) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:181) at 
org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:73) at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:234) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:247) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.Utilities.isVectorMode(Utilities.java:3241) at org.apache.hadoop.hive.ql.io.HiveInputFormat.wrapForLlap(HiveInputFormat.java:208) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:267) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:103) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:80) ... 4 more 15/12/17 18:59:06 INFO log.PerfLogger: 15/12/17 18:59:06 ERROR exec.Task: Failed to execute tez graph. 
org.apache.tez.dag.api.TezUncheckedException: Failed to generate InputSplits at org.apache.tez.mapreduce.hadoop.MRInputHelpers.configureMRInputWithLegacySplitGeneration(MRInputHelpers.java:124) at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:617) at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1103) at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:386) at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1816) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1561) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1338) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1154) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1147) at org.apache.hive.service.cli.operation.SQLOperation.runQuery
[jira] [Created] (HIVE-12437) SMB join in tez fails when one of the tables is empty
Vikram Dixit K created HIVE-12437: - Summary: SMB join in tez fails when one of the tables is empty Key: HIVE-12437 URL: https://issues.apache.org/jira/browse/HIVE-12437 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical It looks like a better check for empty tables is to depend on the existence of the record reader for the input from tez. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12387) Bug with logging improvements in ATS
Vikram Dixit K created HIVE-12387: - Summary: Bug with logging improvements in ATS Key: HIVE-12387 URL: https://issues.apache.org/jira/browse/HIVE-12387 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K When indexing in ATS, the space in the value is not useful. We need to change to use the hive query id throughout the logging phase and also add information about what config the user passed in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12254) Improve logging with yarn/hdfs
Vikram Dixit K created HIVE-12254: - Summary: Improve logging with yarn/hdfs Key: HIVE-12254 URL: https://issues.apache.org/jira/browse/HIVE-12254 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K In extension to HIVE-12249, adding info for Yarn/HDFS as well. Both HIVE-12249 and HDFS-9184 are required before this can be resolved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12249) Improve logging with tez
Vikram Dixit K created HIVE-12249: - Summary: Improve logging with tez Key: HIVE-12249 URL: https://issues.apache.org/jira/browse/HIVE-12249 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K We need to improve logging across the board. TEZ-2851 added a caller context so that one can correlate logs with the application. This jira adds a new configuration for users that can be used to correlate the logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-12204) Tez queries stopped running with ApplicationNotRunningException
Vikram Dixit K created HIVE-12204: - Summary: Tez queries stopped running with ApplicationNotRunningException Key: HIVE-12204 URL: https://issues.apache.org/jira/browse/HIVE-12204 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1, 1.0.1, 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K In some error cases, if hive can no longer submit DAGs to tez, there is no use in retrying the submission. We need to exit by throwing an exception in this case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
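The behaviour asked for in HIVE-12204 above can be sketched as a retry loop that distinguishes transient failures from fatal ones and rethrows the fatal ones immediately. Everything named here (SubmitWithRetry, FatalSubmitException, the maxAttempts parameter) is hypothetical; the sketch does not use the Tez client API, and FatalSubmitException merely stands in for a condition such as the application no longer running.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration; not Hive's TezTask submission code.
final class SubmitWithRetry {

    /** Stand-in for failures that retrying cannot fix. */
    static final class FatalSubmitException extends RuntimeException {
        FatalSubmitException(String message) {
            super(message);
        }
    }

    /** Runs attempt up to maxAttempts times, but gives up at once on a fatal failure. */
    static <T> T submit(Callable<T> attempt, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (FatalSubmitException e) {
                throw e; // no use retrying: exit by throwing
            } catch (Exception e) {
                last = e; // possibly transient: try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = submit(() -> {
            if (calls[0]++ < 2) {
                throw new IllegalStateException("busy");
            }
            return "submitted";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```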
[jira] [Created] (HIVE-12201) Tez settings need to be shown in set -v output when execution engine is tez.
Vikram Dixit K created HIVE-12201: - Summary: Tez settings need to be shown in set -v output when execution engine is tez. Key: HIVE-12201 URL: https://issues.apache.org/jira/browse/HIVE-12201 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Minor Attachments: HIVE-12201.1.patch The set -v output currently shows configurations for yarn, hdfs etc. but does not show tez settings when tez is set as the execution engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan
Congrats Ashutosh! On Wed, Sep 16, 2015 at 9:01 PM, Chetna C wrote: > Congrats Ashutosh ! > > Thanks, > Chetna Chaudhari > > On 17 September 2015 at 06:53, Navis Ryu wrote: > > > Congratulations! > > > > 2015-09-17 9:35 GMT+09:00 Xu, Cheng A : > > > Congratulations, Ashutosh! > > > > > > -Original Message- > > > From: Mohammad Islam [mailto:misla...@yahoo.com.INVALID] > > > Sent: Thursday, September 17, 2015 8:23 AM > > > To: u...@hive.apache.org; Hive > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > > > > > Congratulations Asutosh! > > > > > > > > > On Wednesday, September 16, 2015 4:51 PM, Bright Ling < > > brig...@hostworks.com.au> wrote: > > > > > > > > > Congratulations Asutosh! > > From: Sathi Chowdhury [mailto:sathi.chowdh...@lithium.com] > > > Sent: Thursday, 17 September 2015 8:04 AM > > > To: u...@hive.apache.org > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > Congrats Asutosh! From: Sergey Shelukhin > > > Reply-To: "u...@hive.apache.org" > > > Date: Wednesday, September 16, 2015 at 2:31 PM > > > To: "u...@hive.apache.org" > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > Congrats! From: Alpesh Patel > > > Reply-To: "u...@hive.apache.org" > > > Date: Wednesday, September 16, 2015 at 13:24 > > > To: "u...@hive.apache.org" > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > Congratulations Ashutosh On Wed, Sep 16, 2015 at 1:23 PM, Pengcheng > > Xiong wrote: Congratulations Ashutosh! On Wed, > Sep > > 16, 2015 at 1:17 PM, John Pullokkaran > > wrote: Congrats Ashutosh! From: Vaibhav Gumashta < > > vgumas...@hortonworks.com> > > > Reply-To: "u...@hive.apache.org" > > > Date: Wednesday, September 16, 2015 at 1:01 PM > > > To: "u...@hive.apache.org" , " > dev@hive.apache.org" > > > > > Cc: Ashutosh Chauhan > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > Congrats Ashutosh! -Vaibhav From: Prasanth Jayachandran < > > pjayachand...@hortonworks.com> > > > Reply-To: "u...@hive.apache.org" > > > Date: Wednesday, September 16, 2015 at 12:50 PM > > > To: "dev@hive.apache.org" , "u...@hive.apache.org > " > > > > > Cc: "dev@hive.apache.org" , Ashutosh Chauhan < > > hashut...@apache.org> > > > Subject: Re: [ANNOUNCE] New Hive PMC Chair - Ashutosh Chauhan > > Congratulations Ashutosh! > > > > > > On Wed, Sep 16, 2015 at 12:48 PM -0700, "Xuefu Zhang" < > > xzh...@cloudera.com> wrote: Congratulations, Ashutosh!. Well-deserved.
> > > > > > Thanks to Carl also for the hard work in the past few years! > > > > > > --Xuefu > > > > > > On Wed, Sep 16, 2015 at 12:39 PM, Carl Steinbach > wrote: > > > > > >> I am very happy to announce that Ashutosh Chauhan is taking over as > > >> the new VP of the Apache Hive project. Ashutosh has been a longtime > > >> contributor to Hive and has played a pivotal role in many of the major > > >> advances that have been made over the past couple of years. Please > > >> join me in congratulating Ashutosh on his new role! > > >> > > > > > > > > > -- Nothing better than when appreciated
[jira] [Created] (HIVE-11829) Create test for HIVE-11216
Vikram Dixit K created HIVE-11829: - Summary: Create test for HIVE-11216 Key: HIVE-11829 URL: https://issues.apache.org/jira/browse/HIVE-11829 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K We need tests for HIVE-11216. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11806) Create test for HIVE-11174
Vikram Dixit K created HIVE-11806: - Summary: Create test for HIVE-11174 Key: HIVE-11806 URL: https://issues.apache.org/jira/browse/HIVE-11806 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Minor We are lacking tests for HIVE-11174. Adding one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11606) Bucket map joins fail at hash table construction time
Vikram Dixit K created HIVE-11606: - Summary: Bucket map joins fail at hash table construction time Key: HIVE-11606 URL: https://issues.apache.org/jira/browse/HIVE-11606 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.1, 1.0.1 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294) at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
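The assertion in the trace above ("Capacity must be a power of two") suggests that the hash table was handed a capacity that is not a power of two. The following is only a sketch of one common way to satisfy such a check, by rounding the requested capacity up before construction. The class and method names are hypothetical and are not taken from the Hive code base or from the eventual fix for HIVE-11606.

```java
// Hypothetical helper, for illustration only; not Hive's actual code.
final class HashTableCapacity {
    private HashTableCapacity() {}

    /** Returns true when n is positive and has exactly one bit set. */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Rounds a requested capacity up to the nearest power of two. */
    static int nextPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        // 2^30 is the largest power of two that fits in a positive int.
        if (requested > (1 << 30)) {
            throw new IllegalArgumentException("capacity too large: " + requested);
        }
        int highest = Integer.highestOneBit(requested);
        return highest == requested ? requested : highest << 1;
    }
}
```

A caller would pass `HashTableCapacity.nextPowerOfTwo(estimatedRows)` instead of the raw estimate, so a power-of-two check on the capacity would hold.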
[jira] [Created] (HIVE-11605) Incorrect results with bucket map join in tez.
Vikram Dixit K created HIVE-11605: - Summary: Incorrect results with bucket map join in tez. Key: HIVE-11605 URL: https://issues.apache.org/jira/browse/HIVE-11605 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0, 1.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical Attachments: HIVE-11605.1.patch In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: too many 1.*.* unreleased versions on the JIRA
Updated the releases. 1.0.2 and 1.2.2 are not released yet. On Fri, Aug 14, 2015 at 2:38 PM, Sergey Shelukhin ser...@hortonworks.com wrote: Anyone? :) On 15/8/13, 14:52, Sergey Shelukhin ser...@hortonworks.com wrote: On the JIRA, we currently have 1.1.0 marked as unreleased even though 1.2.0 is released (and 1.1.1 is also present); then, we have both 1.0.1 and 1.0.2, plus 1.2.1 and 1.2.2 showing in unreleased. I poked around and cannot see where this can be changed. Release managers for respective releases should probably clean this up, anyway :) -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-11360) Bucketing doesn't work correctly with unpartitioned tables
Vikram Dixit K created HIVE-11360: - Summary: Bucketing doesn't work correctly with unpartitioned tables Key: HIVE-11360 URL: https://issues.apache.org/jira/browse/HIVE-11360 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Vikram Dixit K When we try to create bucket files with unpartitioned tables, enforce bucketing doesn't create the empty bucket files. [~prasanth_j] for your reference. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
Vikram Dixit K created HIVE-11355: - Summary: Hive on tez: memory manager for sort buffers (input/output) and operators Key: HIVE-11355 URL: https://issues.apache.org/jira/browse/HIVE-11355 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 2.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K We need to better manage the sort buffer allocations to ensure better performance. Also, we need to provide configurations to certain operators to stay within memory limits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11356) SMB join on tez fails when one of the tables is empty
Vikram Dixit K created HIVE-11356: - Summary: SMB join on tez fails when one of the tables is empty Key: HIVE-11356 URL: https://issues.apache.org/jira/browse/HIVE-11356 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} :java.lang.IllegalStateException: Unexpected event. All physical sources already initialized at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673) at java.lang.Thread.run(Thread.java:745) ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] Vertex killed, vertexName=Reducer 5, vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:1 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask HQL-FAILED {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive PMC Member - Sushanth Sowmyan
Congrats Sushanth! On Wed, Jul 22, 2015 at 10:27 AM, Prasanth Jayachandran pjayachand...@hortonworks.com wrote: Congratulations Sushanth! On Jul 22, 2015, at 10:16 AM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: Congratulations Sushanth! Gunther. From: Chao Sun c...@cloudera.com Sent: Wednesday, July 22, 2015 9:59 AM To: dev@hive.apache.org Cc: Sushanth Sowmyan Subject: Re: [ANNOUNCE] New Hive PMC Member - Sushanth Sowmyan Congrats! On Wed, Jul 22, 2015 at 9:58 AM, Jesus Camachorodriguez jcamachorodrig...@hortonworks.com wrote: Congrats Sushanth! -- Jesús On 7/22/15, 5:53 PM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congrats Sush! --Vaibhav On 7/22/15, 9:45 AM, Carl Steinbach c...@apache.org wrote: I am pleased to announce that Sushanth Sowmyan has been elected to the Hive Project Management Committee. Please join me in congratulating Sushanth! Thanks. - Carl -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-11292) MiniLlapCliDriver for running tests in llap
Vikram Dixit K created HIVE-11292: - Summary: MiniLlapCliDriver for running tests in llap Key: HIVE-11292 URL: https://issues.apache.org/jira/browse/HIVE-11292 Project: Hive Issue Type: Bug Components: Test Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Create MiniLlapCliDriver for running unit tests in llap mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive Committer - Pengcheng Xiong
Congratulations Pengcheng! On Thu, Jul 16, 2015 at 10:10 AM, Hari Subramaniyan hsubramani...@hortonworks.com wrote: Congrats Pengcheng! From: Chao Sun c...@cloudera.com Sent: Thursday, July 16, 2015 10:06 AM To: dev@hive.apache.org Subject: Re: [ANNOUNCE] New Hive Committer - Pengcheng Xiong Congrats Pengcheng! On Thu, Jul 16, 2015 at 10:03 AM, Szehon Ho sze...@cloudera.com wrote: Congrats! On Thu, Jul 16, 2015 at 6:47 AM, Vaibhav Gumashta vgumas...@hortonworks.com wrote: Congrats Pengcheng! --Vaibhav On 7/16/15, 7:12 PM, Chaoyu Tang ctang...@gmail.com wrote: Congratulations to Pengcheng! On Thu, Jul 16, 2015 at 9:10 AM, Xuefu Zhang xzh...@cloudera.com wrote: Congratulations, Pengcheng! On Thu, Jul 16, 2015 at 4:50 AM, Carl Steinbach c...@apache.org wrote: The Apache Hive PMC has voted to make Pengcheng Xiong a committer on the Apache Hive Project. Please join me in congratulating Pengcheng! Thanks. - Carl -- Nothing better than when appreciated for hard work. -Mark
Re: [VOTE] Apache Hive 1.2.1 Release Candidate 0
+1 built on both profiles and ran a simple query on the rc. Thanks Vikram. On Sat, Jun 20, 2015 at 7:47 AM, Thejas Nair thejas.n...@gmail.com wrote: +1 Checked signatures, checksums Checked release notes Reviewed changes in pom files. Built with hadoop2 and hadoop1. Ran some simple queries in local mode. On Fri, Jun 19, 2015 at 5:00 PM, Gunther Hagleitner ghagleit...@hortonworks.com wrote: +1 Checked signatures, compiled, ran some tests. Thanks, Gunther. -- *From:* Alan Gates alanfga...@gmail.com *Sent:* Friday, June 19, 2015 11:44 AM *To:* dev@hive.apache.org *Subject:* Re: [VOTE] Apache Hive 1.2.1 Release Candidate 0 +1. Checked signatures, looked for binary files, compiled the code, and ran a rat check. Alan. Sushanth Sowmyan khorg...@gmail.com June 19, 2015 at 2:44 Hi Folks, It's been a month since 1.2.0, and I promised to do a stabilization 1.2.1 release, and this is it. A large number of patches have been applied since 1.2.0, and major known issues have been cleared/fixed. A few jiras were deferred out to 1.3/2.0 as not being ready to commit into 1.2.1 at this time. More details are available here : https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status Apache Hive 1.2.1 Release Candidate 0 is available here: https://people.apache.org/~khorgath/releases/1.2.1_RC0/artifacts/ My public key used for signing is as available from the hive committers key list : http://www.apache.org/dist/hive/KEYS Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1040/ Source tag for RC0 is up on the apache git repo as tag release-1.2.1-rc0 (Browseable view over at https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=0f6ee99efc911cbc1566f9bbbc63a51600302703 ) Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, -Sushanth -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative
Vikram Dixit K created HIVE-11027: - Summary: Hive on tez: Bucket map joins fail when hashcode goes negative Key: HIVE-11027 URL: https://issues.apache.org/jira/browse/HIVE-11027 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0 Reporter: Vikram Dixit K Assignee: Prasanth Jayachandran Seeing an issue when dynamic sort optimization is enabled while doing an insert into bucketed table. We seem to be flipping the negative sign on the hashcode instead of taking the complement of it for routing the data correctly. This results in correctness issues in bucket map joins in hive on tez when the hash code goes negative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
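To illustrate why the two treatments of a negative hash code route rows differently, here is a minimal sketch. It assumes the reading side computes the bucket by clearing the sign bit with a mask before taking the modulus; the issue's word "complement" may refer to this or to a similar bitwise operation. The class and method names are hypothetical and are not Hive's actual code.

```java
// Hypothetical illustration of two ways to map a hash code to a bucket.
final class BucketRouting {
    private BucketRouting() {}

    /** Clears the sign bit, then takes the modulus; the result is always non-negative. */
    static int bucketByMask(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    /** Flips the sign of negative hash codes, then takes the modulus. */
    static int bucketByNegation(int hashCode, int numBuckets) {
        int h = hashCode < 0 ? -hashCode : hashCode;
        // Note: -Integer.MIN_VALUE overflows and stays negative,
        // so this variant can even return a negative bucket.
        return h % numBuckets;
    }
}
```

For non-negative hash codes the two methods agree. For a negative hash code such as -7 with 4 buckets they disagree, so a row written with one rule is looked up in the wrong bucket by a join that uses the other rule.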
[jira] [Created] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
Vikram Dixit K created HIVE-10929: - Summary: In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values Key: HIVE-10929 URL: https://issues.apache.org/jira/browse/HIVE-10929 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} create table dummy(i int); insert into table dummy values (1); select * from dummy; create table partunion1(id1 int) partitioned by (part1 string); set hive.exec.dynamic.partition.mode=nonstrict; set hive.execution.engine=tez; explain insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; select * from partunion1; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins
Vikram Dixit K created HIVE-10907: - Summary: Hive on Tez: Classcast exception in some cases with SMB joins Key: HIVE-10907 URL: https://issues.apache.org/jira/browse/HIVE-10907 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K In cases where there is a mix of Map side work and reduce side work, we get a classcast exception because we assume homogeneity in the code. We need to fix this correctly. For now this is a workaround. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10908) Hive on tez: SMB join needs to work with different type of work items (map side with reduce side)
Vikram Dixit K created HIVE-10908: - Summary: Hive on tez: SMB join needs to work with different type of work items (map side with reduce side) Key: HIVE-10908 URL: https://issues.apache.org/jira/browse/HIVE-10908 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K This is related to HIVE-10907. This is going to be the actual enhancement/fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Stable releases from branch-1 and experimental releases from master
+1 for all the reasons outlined. On Tue, May 26, 2015 at 6:13 PM, Thejas Nair thejas.n...@gmail.com wrote: +1 - This is great for users who want to take longer to upgrade from hadoop-1 and care mainly for bug fixes and incremental features, rather than radical new features. - The ability to release initial 2.x releases marked as alpha/beta also helps to get users to try it out, and also lets them choose what is right for them. - This also lets developers focus on major new features without the burden of maintaining hadoop-1 compatibility. On Tue, May 26, 2015 at 11:41 AM, Alan Gates alanfga...@gmail.com wrote: We have discussed this for several weeks now. Some concerns have been raised which I have tried to address. I think it is time to vote on it as our release plan. To be specific, I propose: Hive makes a branch-1 from the current master. This would be used for 1.3 and future 1.x releases. This branch would not deprecate existing functionality. Any new features in this branch would also need to be put on master. An upgrade path for users will be maintained from one 1.x release to the next, as well as from the latest 1.x release to the latest 2.x release. Going forward releases numbered 2.x will be made from master. The purpose of these releases will be to enable users to get access to new features being developed in Hive and allow developers to get feedback. It is expected that for a while these releases will not be production ready and will be clearly so labeled. Some legacy features, such as Hadoop 1 and MapReduce, will no longer be supported in the master. Any critical bug fixes (security, incorrect results, crashes) fixed in master will also be ported to branch-1 for at least a year. This time period may be extended in the future based on the stability and adoption of 2.x releases. Based on Hive's bylaws this release plan vote will be open for 3 days and all active committers have binding votes. Here's my +1. Alan. 
-- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-10742) rename_table_location.q test fails
Vikram Dixit K created HIVE-10742: - Summary: rename_table_location.q test fails Key: HIVE-10742 URL: https://issues.apache.org/jira/browse/HIVE-10742 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0, 1.3.0 Reporter: Vikram Dixit K Assignee: Sushanth Sowmyan The test rename_table_location.q fails all the time but is not being caught by the HiveQA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [VOTE] Apache Hive 1.2.0 release candidate 5
I built against hadoop1 and hadoop2 and ran the rat tool as well. Ran a couple of queries. +1 Thanks Vikram. On Thu, May 14, 2015 at 6:30 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi Folks, We've cleared all the blockers listed for 1.2.0 release, either committing them, or deferring out to an eventual 1.2.1 stabilization release. (Any deferrals were a result of discussion between myself and the committer responsible for the issue.) More details are available here : https://cwiki.apache.org/confluence/display/Hive/Hive+1.2+Release+Status Apache Hive 1.2.0 Release Candidate 5 is available here: https://people.apache.org/~khorgath/releases/1.2.0_RC5/artifacts/ My public key used for signing is as available from the hive committers key list : http://www.apache.org/dist/hive/KEYS Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1039 Source tag for RC5 is up on the apache git repo as tag release-1.2.0-rc5 (Browseable view over at https://git-wip-us.apache.org/repos/asf?p=hive.git;a=tag;h=76b90268084f529852396302884297b3c22fcf00 ) Since this has minimal changes from the previous RC, I would further request that this vote conclude in 20 hours(which is past the 72 hr time from the previous RC announcement) if we have enough +1s in the meanwhile. Hive PMC Members: Please test and vote. Thanks, -Sushanth -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-10719) Hive metastore failure when alter table rename is attempted.
Vikram Dixit K created HIVE-10719: - Summary: Hive metastore failure when alter table rename is attempted. Key: HIVE-10719 URL: https://issues.apache.org/jira/browse/HIVE-10719 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} create database newDB location '/tmp/'; describe database extended newDB; use newDB; create table tab (name string); alter table tab rename to newName; {code} Fails: {code} InvalidOperationException(message:Unable to access old location hdfs://localhost:8020/tmp/tab for table x.tab) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive Committers - Cheng Xu, Dong Chen, and Hari Sankar Sivarama Subramaniyan
Congrats guys! On Mon, May 11, 2015 at 2:34 PM, Sushanth Sowmyan khorg...@gmail.com wrote: Congratulations, and thank you for your contributions! :) On Mon, May 11, 2015 at 2:17 PM, Sergio Pena sergio.p...@cloudera.com wrote: Congratulations Guys !!! :) On Mon, May 11, 2015 at 3:54 PM, Carl Steinbach c...@apache.org wrote: The Apache Hive PMC has voted to make Cheng Xu, Dong Chen, and Hari Sankar Sivarama Subramaniyan committers on the Apache Hive Project. Please join me in congratulating Cheng, Dong, and Hari! Thanks. - Carl -- Nothing better than when appreciated for hard work. -Mark
Re: [DISCUSS] Supporting Hadoop-1 and experimental features
The proposal sounds good. Supporting and maintaining hadoop-1 is hard, and conflicting API changes in hadoop 2.x keep us from using new and better APIs because doing so breaks compilation. +1 Thanks Vikram. On Mon, May 11, 2015 at 7:17 PM, Sergey Shelukhin ser...@hortonworks.com wrote: That sounds like a good idea. Some features could be back ported to branch-1 if viable, but at least new stuff would not be burdened by Hadoop 1/MR code paths. Probably also a good place to enable vectorization and other perf features by default while we make alpha releases. +1 On 15/5/11, 15:38, Alan Gates alanfga...@gmail.com wrote: There is a lot of forward-looking work going on in various branches of Hive: LLAP, the HBase metastore, and the work to drop the CLI. It would be good to have a way to release this code to users so that they can experiment with it. Releasing it will also provide feedback to developers. At the same time there are discussions on whether to keep supporting Hadoop-1. The burden of supporting older, less used functionality such as Hadoop-1 is becoming ever harder as many new features are added. I propose that the best way to deal with this would be to make a branch-1. We could continue to make new feature releases off of this branch (1.3, 1.4, etc.). This branch would not drop old functionality. This provides stability and continuity for users and developers. We could then merge these new feature branches (LLAP, HBase metastore, CLI drop) into the trunk, as well as turn on by default newer features such as vectorization and ACID. We could also drop older, less used features such as support for Hadoop-1 and MapReduce. It will be a while before we are ready to make stable, production ready releases of this code. But we could start making alpha quality releases soon. We would call these releases 2.x, to stress the non-backward compatible changes such as dropping Hadoop-1. 
This will give users a chance to play with the new code and developers a chance to get feedback. Thoughts? -- Nothing better than when appreciated for hard work. -Mark
[jira] [Created] (HIVE-10647) Hive on LLAP: Limit HS2 from overwhelming LLAP
Vikram Dixit K created HIVE-10647: - Summary: Hive on LLAP: Limit HS2 from overwhelming LLAP Key: HIVE-10647 URL: https://issues.apache.org/jira/browse/HIVE-10647 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down
Vikram Dixit K created HIVE-10611: - Summary: Mini tez tests wait for 5 minutes before shutting down Key: HIVE-10611 URL: https://issues.apache.org/jira/browse/HIVE-10611 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Currently, at shutdown, the tez mini cluster waits for the session to close before shutting down the cluster. This ends up being 5 minutes - the default value. We can shut down the session to alleviate this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10542) Full outer joins in tez produce incorrect results in certain cases
Vikram Dixit K created HIVE-10542: - Summary: Full outer joins in tez produce incorrect results in certain cases Key: HIVE-10542 URL: https://issues.apache.org/jira/browse/HIVE-10542 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Blocker If there are no records for one of the tables in the full outer join, we do not read the other input and so fail to produce rows that should be in the result. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
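For reference, the expected semantics can be sketched in a few lines: a full outer join must emit every row from either input, padding the missing side with nulls, even when one input is empty. This is an illustration of the expected result only, under the simplifying assumption of unique join keys; it is not the Tez operator code, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration of full outer join semantics over unique keys.
final class FullOuterJoinSketch {
    private FullOuterJoinSketch() {}

    /** Joins two key-to-value inputs; each output row is "key,leftValue,rightValue". */
    static List<String> fullOuterJoin(Map<Integer, String> left, Map<Integer, String> right) {
        // Visit the union of keys so that neither input is skipped when the other is empty.
        TreeSet<Integer> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        List<String> rows = new ArrayList<>();
        for (Integer key : keys) {
            // Map.get returns null for a missing key, which renders as "null" in the row.
            rows.add(key + "," + left.get(key) + "," + right.get(key));
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Integer, String> right = new HashMap<>();
        right.put(1, "a");
        right.put(2, "b");
        // The left input is empty, yet both right-side rows must still appear.
        System.out.println(fullOuterJoin(Collections.<Integer, String>emptyMap(), right));
    }
}
```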
[jira] [Created] (HIVE-10323) Tez merge join operator does not honor hive.join.emit.interval
Vikram Dixit K created HIVE-10323: - Summary: Tez merge join operator does not honor hive.join.emit.interval Key: HIVE-10323 URL: https://issues.apache.org/jira/browse/HIVE-10323 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K This hurts efficiency when the join keys are skewed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10273) Union with partition tables which have no data fails with NPE
Vikram Dixit K created HIVE-10273: - Summary: Union with partition tables which have no data fails with NPE Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10233) Hive on LLAP: Memory manager
Vikram Dixit K created HIVE-10233: - Summary: Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10232) Map join in tez needs to account for memory limits due to other map join operators possible in the same work
Vikram Dixit K created HIVE-10232: - Summary: Map join in tez needs to account for memory limits due to other map join operators possible in the same work Key: HIVE-10232 URL: https://issues.apache.org/jira/browse/HIVE-10232 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K There seems to be a regression with respect to MR in terms of allowing multiple map joins in the same task by not accounting for the memory consumed in each of the joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
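The accounting the issue describes can be sketched as follows: when several map joins run in the same task, their hash table sizes need to be summed and compared against the task's budget, rather than checking each join on its own. This is a minimal sketch under the assumption that a size estimate is available per join; the class, method, and parameter names are hypothetical and do not reflect Hive's planner code.

```java
// Hypothetical illustration of summing map join memory within one task.
final class MapJoinMemoryBudget {
    private MapJoinMemoryBudget() {}

    /**
     * Returns true only if the hash tables of all map joins in the same
     * work item fit in the task's memory budget when added together.
     */
    static boolean fitsInBudget(long[] estimatedHashTableBytes, long taskBudgetBytes) {
        long total = 0;
        for (long size : estimatedHashTableBytes) {
            // addExact throws ArithmeticException on overflow instead of wrapping silently.
            total = Math.addExact(total, size);
            if (total > taskBudgetBytes) {
                return false;
            }
        }
        return true;
    }
}
```

With a budget of 1000 bytes, three joins of 400 bytes each pass a per-join check but fail the combined check, which is the situation the issue describes.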
[jira] [Created] (HIVE-10201) Hive LLAP needs refactoring of the configuration class
Vikram Dixit K created HIVE-10201: - Summary: Hive LLAP needs refactoring of the configuration class Key: HIVE-10201 URL: https://issues.apache.org/jira/browse/HIVE-10201 Project: Hive Issue Type: Bug Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: llap In order for the client to take decisions regarding resource requirement and availability, we need to move the configuration class to llap-client. In the future, we will need to get the configurations from a service such as zookeeper to keep in sync with what is actually deployed on the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10001) SMB join in reduce side
Vikram Dixit K created HIVE-10001: - Summary: SMB join in reduce side Key: HIVE-10001 URL: https://issues.apache.org/jira/browse/HIVE-10001 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9886) Hive on tez: NPE when converting join to SMB in sub-query
Vikram Dixit K created HIVE-9886: Summary: Hive on tez: NPE when converting join to SMB in sub-query Key: HIVE-9886 URL: https://issues.apache.org/jira/browse/HIVE-9886 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical {code} set hive.auto.convert.sortmerge.join = true; create table t1( id string, od string); create table t2( id string, od string); select vt1.id from (select rt1.id from (select t1.id, row_number() over (partition by id order by od desc) as row_no from t1) rt1 where rt1.row_no=1) vt1 join (select rt2.id from (select t2.id, row_number() over (partition by id order by od desc) as row_no from t2) rt2 where rt2.row_no=1) vt2 where vt1.id=vt2.id; {code} throws NPE: {code} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.getValueObjectInspectors(AbstractMapJoinOperator.java:96) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:167) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:310) at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:72) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:89) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:66) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initializeOp(Operator.java:410) at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:89) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40) at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116) ... 14 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9832) Merge join followed by union and a map join in hive on tez fails.
Vikram Dixit K created HIVE-9832: Summary: Merge join followed by union and a map join in hive on tez fails. Key: HIVE-9832 URL: https://issues.apache.org/jira/browse/HIVE-9832 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical {code} select a.key, b.value from (select x.key as key, y.value as value from srcpart x join srcpart y on (x.key = y.key) union all select key, value from srcpart z) a join src b on (a.value = b.value); {code} {code} TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:214) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) ... 13 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:317) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:196) ... 14 more ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1425055721029_0048_4_09 [Reducer 5] killed/failed due to:null] Vertex killed, vertexName=Reducer 7, vertexId=vertex_1425055721029_0048_4_11, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_11 [Reducer 7] killed/failed due to:null] Vertex killed, vertexName=Reducer 4, vertexId=vertex_1425055721029_0048_4_07, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_07 [Reducer 4] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:2 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9836) Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns)
Vikram Dixit K created HIVE-9836: Summary: Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns) Key: HIVE-9836 URL: https://issues.apache.org/jira/browse/HIVE-9836 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-9836.1.patch {code} explain select a.key, a.value, b.value from tab a join tab_part b on a.key = b.key and a.ds = b.ds; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: VOTE Bylaw for having branch committers in hive
Hi Carl, Here is the list of 17 active PMC members: Brock Noland Carl Steinbach Edward Capriolo Alan Gates Gunther Hagleitner Ashutosh Chauhan Jason Dere Lefty Leverenz Navis Ryu Owen O'Malley Prasad Suresh Mujumdar Prasanth J Harish Butani Szehon Ho Thejas Madhavan Nair Vikram Dixit K Xuefu Zhang Non active members: Ashish Thusoo Kevin Wilfong He Yongqiang Namit Jain Joydeep Sensarma Ning Zhang Raghotham Murthy https://issues.apache.org/jira/issues/?jql=text%20~%20%22kevin%20wilfong%22%20OR%20text%20~%20%22ashish%20thusoo%22%20or%20text%20~%20%22heyongqiang%22%20OR%20text%20~%20%22Namit%20Jain%22%20OR%20text%20~%20%22joydeep%20sensarma%22%20OR%20text%20~%20%22ning%20zhang%22%20OR%20text%20~%20%22raghotham%20murthy%22%20AND%20project%20%3D%20Hive%20ORDER%20BY%20updated%20DESC In the results, only the first 4/5 need to be considered because of the time line of 6 months. All of them were resolved in prior years and the last comments are mostly hudson or closing comments by others. I could not see any mails from them on the mailing lists either during this period. Thus those 7 members haven't met the criterion for being active as specified in the hive bylaws. Should I change the bylaw for this type of vote happening to dev list instead of the user mailing list as it is currently stated? Thanks Vikram. On Wed, Feb 18, 2015 at 12:33 PM, Carl Steinbach cwsteinb...@gmail.com wrote: Hi Vikram, Can you please post the names of the 17 currently active PMC members so that we have it for the records? Also, according to the bylaws this vote was supposed to happen on the user@hive list. Maybe we want to change this? Thanks. - Carl On Wed, Feb 18, 2015 at 12:25 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Yes. The vote passes with 12 +1s out of 17 currently active PMC members. I will update the wiki with the new bylaws. On Wed, Feb 18, 2015 at 11:15 AM, Ashutosh Chauhan hashut...@apache.org wrote: Seems like there is consensus all around. 
Vikram, would you like to update the wiki with new bylaws? Thanks, Ashutosh On Wed, Feb 18, 2015 at 8:58 AM, Prasad Mujumdar pras...@apache.org wrote: +1 thanks Prasad On Mon, Feb 9, 2015 at 2:43 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Hi Folks, We seem to have quite a few projects going around and in the interest of time and the project as a whole, it seems good to have branch committers much like what is there in the Hadoop project. I am proposing an addition to the committer bylaws as follows ( taken from the hadoop project bylaws http://hadoop.apache.org/bylaws.html ) Significant, pervasive features are often developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors, while the initiative is active. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project. Actions: New Branch Committer Description: When a new branch committer is proposed for the project. Approval: Lazy Consensus Binding Votes: Active PMC members Minimum Length: 3 days Mailing List: priv...@hive.apache.org Actions: Removal of Branch Committer Description: When a branch committer is removed from the project. Approval: Consensus Binding Votes: Active PMC members excluding the committer in question if they are PMC members too. Minimum Length: 6 days Mailing List: priv...@hive.apache.org This vote will run for 6 days. PMC members please vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark -- Nothing better than when appreciated for hard work. -Mark
Re: VOTE Bylaw for having branch committers in hive
Yes. The vote passes with 12 +1s out of 17 currently active PMC members. I will update the wiki with the new bylaws. On Wed, Feb 18, 2015 at 11:15 AM, Ashutosh Chauhan hashut...@apache.org wrote: Seems like there is consensus all around. Vikram, would you like to update the wiki with new bylaws? Thanks, Ashutosh On Wed, Feb 18, 2015 at 8:58 AM, Prasad Mujumdar pras...@apache.org wrote: +1 thanks Prasad On Mon, Feb 9, 2015 at 2:43 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Hi Folks, We seem to have quite a few projects going around and in the interest of time and the project as a whole, it seems good to have branch committers much like what is there in the Hadoop project. I am proposing an addition to the committer bylaws as follows ( taken from the hadoop project bylaws http://hadoop.apache.org/bylaws.html ) Significant, pervasive features are often developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors, while the initiative is active. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project. Actions: New Branch Committer Description: When a new branch committer is proposed for the project. Approval: Lazy Consensus Binding Votes: Active PMC members Minimum Length: 3 days Mailing List: priv...@hive.apache.org Actions: Removal of Branch Committer Description: When a branch committer is removed from the project. Approval: Consensus Binding Votes: Active PMC members excluding the committer in question if they are PMC members too. Minimum Length: 6 days Mailing List: priv...@hive.apache.org This vote will run for 6 days. PMC members please vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark
[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely
[ https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320492#comment-14320492 ] Vikram Dixit K commented on HIVE-9683: -- +1 for 1.0 branch. Hive metastore thrift client connections hang indefinitely -- Key: HIVE-9683 URL: https://issues.apache.org/jira/browse/HIVE-9683 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0, 1.0.1 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 1.0.1 Attachments: HIVE-9683.1.patch THRIFT-2788 fixed network-partition problems that affect Thrift client connections. Since hive-1.0 is on thrift-0.9.0 which is affected by the bug, a workaround can be applied to prevent indefinite connection hangs during net-splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Affects Version/s: 1.0.0 Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Fix For: 1.2.0 Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~apivovarov]! Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9523) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied
[ https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9523: - Labels: gsoc2015 (was: ) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied Key: HIVE-9523 URL: https://issues.apache.org/jira/browse/HIVE-9523 Project: Hive Issue Type: Improvement Components: Logical Optimizer, Physical Optimizer, SQL Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Maciek Kocon Labels: gsoc2015 For JOIN conditions where partitioning criteria are used respectively: ⋮ FROM TabA JOIN TabB ON TabA.partCol1 = TabB.partCol2 AND TabA.partCol2 = TabB.partCol2 the optimizer could/should choose to treat it the same way as with bucketed tables: ⋮ FROM TabC JOIN TabD ON TabC.clusteredByCol1 = TabD.clusteredByCol2 AND TabC.clusteredByCol2 = TabD.clusteredByCol2 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join. This is based on fact that same way as buckets translate to separate files, the partitions essentially provide the same mapping. When data locality is known the optimizer could focus only on joining corresponding partitions rather than whole data sets. #side notes: ⦿ Currently Table DDL Syntax where Partitioning and Bucketing defined at the same time is allowed: CREATE TABLE ⋮ PARTITIONED BY(…) CLUSTERED BY(…) INTO … BUCKETS; But in this case optimizer never chooses to use Bucket Map Join or Sort Merge Bucket Map Join which defeats the purpose of creating BUCKETed tables in such scenarios. Should that be raised as a separate BUG? ⦿ Currently partitioning and bucketing are two separate things but serve same purpose - shouldn't the concept be merged (explicit/implicit partitions?) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
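The idea in HIVE-9523 above (join only corresponding partitions instead of whole data sets) can be illustrated with a small sketch. This is not Hive optimizer code; the class name PartitionWiseJoin, the map-of-lists representation of a partitioned table, and the output row format are all hypothetical and chosen only to show the shape of a partition-wise join when the join condition is on the partition column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: each table is modelled as
// partition value -> rows stored in that partition.
class PartitionWiseJoin {

    // Joins two tables on their partition column. Rows are only paired
    // when they live in partitions with the same partition value, so
    // partitions without a counterpart are never read.
    static List<String> join(Map<String, List<String>> left,
                             Map<String, List<String>> right) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> partition : left.entrySet()) {
            List<String> rightRows = right.get(partition.getKey());
            if (rightRows == null) {
                continue; // no matching partition on the other side: skip it entirely
            }
            for (String l : partition.getValue()) {
                for (String r : rightRows) {
                    out.add(partition.getKey() + ":" + l + "|" + r);
                }
            }
        }
        return out;
    }
}
```

With a left table holding partitions a and b and a right table holding partitions a and c, only the two a partitions are ever compared; b and c are skipped without scanning their rows.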
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Fix Version/s: 1.2.0 Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Fix For: 1.2.0 Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9687) Blink DB style approximate querying in hive
Vikram Dixit K created HIVE-9687: Summary: Blink DB style approximate querying in hive Key: HIVE-9687 URL: https://issues.apache.org/jira/browse/HIVE-9687 Project: Hive Issue Type: New Feature Reporter: Vikram Dixit K http://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf There are various pieces here that need to be thought through and implemented. For e.g. sampling offline, run-time sampling selection module etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316923#comment-14316923 ] Vikram Dixit K commented on HIVE-6069: -- +1 LGTM. I will commit this shortly. Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: [ANNOUNCE] New Hive Committers -- Chao Sun, Chengxiang Li, and Rui Li
Congrats guys! On Mon, Feb 9, 2015 at 12:42 PM, Szehon Ho sze...@cloudera.com wrote: Congratulations guys ! On Mon, Feb 9, 2015 at 3:38 PM, Jimmy Xiang jxi...@cloudera.com wrote: Congrats!! On Mon, Feb 9, 2015 at 12:36 PM, Alexander Pivovarov apivova...@gmail.com wrote: Congrats! On Mon, Feb 9, 2015 at 12:31 PM, Carl Steinbach c...@apache.org wrote: The Apache Hive PMC has voted to make Chao Sun, Chengxiang Li, and Rui Li committers on the Apache Hive Project. Please join me in congratulating Chao, Chengxiang, and Rui! Thanks. - Carl -- Nothing better than when appreciated for hard work. -Mark
Proposal for having branch committers
Hi Folks, We seem to have quite a few projects going around and in the interest of time and the project as a whole, it seems good to have branch committers much like what is there in the Hadoop project. I am proposing an addition to the committer bylaws as follows (taken from the hadoop project bylaws http://hadoop.apache.org/bylaws.html): Significant, pervasive features are often developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors, while the initiative is active. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project. I am +1 on this. Thanks Vikram.
Re: Created branch 1.0
The build check in HIVE-8933 fixed in HIVE-8845. On Mon, Feb 9, 2015 at 11:32 AM, Vikram Dixit K vikram.di...@gmail.com wrote: Hi Ed, This was the case with 0.14. It was fixed before 1.0 went out in HIVE-8933. Thanks Vikram. On Mon, Feb 9, 2015 at 9:08 AM, Alan Gates alanfga...@gmail.com wrote: That's fixed, correct? I do not believe there were any SNAPSHOT dependencies in 1.0. Alan. Edward Capriolo edlinuxg...@gmail.com February 9, 2015 at 8:40 Because we can not really have a stable api if by definition we build around snapshot dependencies. On Mon, Feb 9, 2015 at 11:38 AM, Edward Capriolo edlinuxg...@gmail.com edlinuxg...@gmail.com Edward Capriolo edlinuxg...@gmail.com February 9, 2015 at 8:38 Question. https://issues.apache.org/jira/browse/HIVE-8614 Did we not just agree in this thread that hive will no long have dependency that are SNAPSHOT? Brock Noland br...@cloudera.com January 22, 2015 at 22:06 Hi Alan, I agree with Xuefu and what was suggested in your statement. I was thinking we'd release the next release as 0.15 and then later there would be 1.0 off trunk (e.g. what would have been 0.16) and thus be superset (minus anything we intentionally remove). As I have said several times, I'd like to release more often so I feel we could even start the 1.0 work shortly after the 0.15 release. For my part, I do agree with some earlier contributor/user sentiment that it would be good to have some basic public API defined for 1.0. I don't think that will be too hard as it's more or less obvious what our public API is today. Hope this seems reasonable. Cheers, Brock Xuefu Zhang xzh...@cloudera.com January 22, 2015 at 12:31 Hi Thejas/Alan, From all the argument, I think there was an assumption that the proposed 1.0 release will be imminent and 0.15 will happen far after that. Based on that assumption, 0.15 will become 1.1, which is greater in scope than 1.0. However, this assumption may not be true. 
The confusion will be significant if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0. Another concern is that, the proposed release of 1.0 is a subset of of Hive's functionality, and for major releases users are expecting major improvement in functionality as well as stability. Mutating from 0.14.1 release seems falling short in that expectation. Having said that, I'd think it makes more sense to release 0.15 as 0.15, and later we release 1.0 as the major release that supersedes any previous releases. That will fulfill the expectations of a major release. Thanks, Xuefu Alan Gates ga...@hortonworks.com January 22, 2015 at 12:12 I had one clarifying question for Brock and Xuefu. Was your proposal to still call the branch from trunk you are planning in a few days 0.15 (and hence release it as 0.15) and have 1.0 be a later release? Or did you want to call what is now 0.15 1.0? If you wanted 1.0 to be post 0.15, are you ok with stipulating that the next release from trunk after 0.15 (what would have been 0.16) is 1.0? Alan. -- Nothing better than when appreciated for hard work. -Mark -- Nothing better than when appreciated for hard work. -Mark
Re: Proposal for having branch committers
Hi Folks, Creating a new formal vote thread for this. After looking at the bylaws page, it looks like we need to have a formal vote on it by the PMC members. Thanks Vikram. On Mon, Feb 9, 2015 at 1:56 PM, Lefty Leverenz leftylever...@gmail.com wrote: +1 -- Lefty On Mon, Feb 9, 2015 at 1:52 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Hi Folks, We seem to have quite a few projects going around and in the interest of time and the project as a whole, it seems good to have branch committers much like what is there in the Hadoop project. I am proposing an addition to the committer bylaws as follows (taken from the hadoop project bylaws http://hadoop.apache.org/bylaws.html): Significant, pervasive features are often developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors, while the initiative is active. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project. I am +1 on this. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark
VOTE Bylaw for having branch committers in hive
Hi Folks, We seem to have quite a few projects going around and in the interest of time and the project as a whole, it seems good to have branch committers much like what is there in the Hadoop project. I am proposing an addition to the committer bylaws as follows ( taken from the hadoop project bylaws http://hadoop.apache.org/bylaws.html ) Significant, pervasive features are often developed in a speculative branch of the repository. The PMC may grant commit rights on the branch to its consistent contributors, while the initiative is active. Branch committers are responsible for shepherding their feature into an active release and do not cast binding votes or vetoes in the project. Actions: New Branch Committer Description: When a new branch committer is proposed for the project. Approval: Lazy Consensus Binding Votes: Active PMC members Minimum Length: 3 days Mailing List: priv...@hive.apache.org Actions: Removal of Branch Committer Description: When a branch committer is removed from the project. Approval: Consensus Binding Votes: Active PMC members excluding the committer in question if they are PMC members too. Minimum Length: 6 days Mailing List: priv...@hive.apache.org This vote will run for 6 days. PMC members please vote. Thanks Vikram.
Re: Created branch 1.0
Hi Ed, This was the case with 0.14. It was fixed before 1.0 went out in HIVE-8933. Thanks Vikram. On Mon, Feb 9, 2015 at 9:08 AM, Alan Gates alanfga...@gmail.com wrote: That's fixed, correct? I do not believe there were any SNAPSHOT dependencies in 1.0. Alan. Edward Capriolo edlinuxg...@gmail.com February 9, 2015 at 8:40 Because we can not really have a stable api if by definition we build around snapshot dependencies. On Mon, Feb 9, 2015 at 11:38 AM, Edward Capriolo edlinuxg...@gmail.com edlinuxg...@gmail.com Edward Capriolo edlinuxg...@gmail.com February 9, 2015 at 8:38 Question. https://issues.apache.org/jira/browse/HIVE-8614 Did we not just agree in this thread that hive will no long have dependency that are SNAPSHOT? Brock Noland br...@cloudera.com January 22, 2015 at 22:06 Hi Alan, I agree with Xuefu and what was suggested in your statement. I was thinking we'd release the next release as 0.15 and then later there would be 1.0 off trunk (e.g. what would have been 0.16) and thus be superset (minus anything we intentionally remove). As I have said several times, I'd like to release more often so I feel we could even start the 1.0 work shortly after the 0.15 release. For my part, I do agree with some earlier contributor/user sentiment that it would be good to have some basic public API defined for 1.0. I don't think that will be too hard as it's more or less obvious what our public API is today. Hope this seems reasonable. Cheers, Brock Xuefu Zhang xzh...@cloudera.com January 22, 2015 at 12:31 Hi Thejas/Alan, From all the argument, I think there was an assumption that the proposed 1.0 release will be imminent and 0.15 will happen far after that. Based on that assumption, 0.15 will become 1.1, which is greater in scope than 1.0. However, this assumption may not be true. The confusion will be significant if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0. 
Another concern is that, the proposed release of 1.0 is a subset of of Hive's functionality, and for major releases users are expecting major improvement in functionality as well as stability. Mutating from 0.14.1 release seems falling short in that expectation. Having said that, I'd think it makes more sense to release 0.15 as 0.15, and later we release 1.0 as the major release that supersedes any previous releases. That will fulfill the expectations of a major release. Thanks, Xuefu Alan Gates ga...@hortonworks.com January 22, 2015 at 12:12 I had one clarifying question for Brock and Xuefu. Was your proposal to still call the branch from trunk you are planning in a few days 0.15 (and hence release it as 0.15) and have 1.0 be a later release? Or did you want to call what is now 0.15 1.0? If you wanted 1.0 to be post 0.15, are you ok with stipulating that the next release from trunk after 0.15 (what would have been 0.16) is 1.0? Alan. -- Nothing better than when appreciated for hard work. -Mark
[ANNOUNCE] Apache Hive 1.0.0 Released
The Apache Hive team is proud to announce the release of Apache Hive version 1.0.0. The Apache Hive (TM) data warehouse software facilitates querying and managing large datasets residing in distributed storage. Built on top of Apache Hadoop (TM), it provides: * Tools to enable easy data extract/transform/load (ETL) * A mechanism to impose structure on a variety of data formats * Access to files stored either directly in Apache HDFS (TM) or in other data storage systems such as Apache HBase (TM) * Query execution via Apache Hadoop MapReduce and Apache Tez frameworks. For Hive release details and downloads, please visit: https://hive.apache.org/downloads.html Hive 1.0.0 Release Notes are available here: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12329278&styleName=Text&projectId=12310843 We would like to thank the many contributors who made this release possible. Regards, The Apache Hive Team
Re: [VOTE] Apache Hive 1.0 Release Candidate 2
With 3 +1s from the hive PMC, this vote passes. I will be publishing the artifacts to the Apache page shortly. On Sat, Jan 31, 2015 at 1:04 PM, Brock Noland br...@cloudera.com wrote: +1 verified sigs, hashes, verified no SNAPSHOT deps, and ran some queries On Fri, Jan 30, 2015 at 7:48 PM, Prasanth Jayachandran pjayachand...@hortonworks.com wrote: +1 Verified signatures, md5, ran queries from binary and src, compiled src with hadoop-1 and hadoop-2, verified for 1.0.0 version numbers everywhere, no snapshot deps On Jan 30, 2015, at 5:08 PM, Thejas Nair thejas.n...@gmail.com wrote: +1 - verified signatures and checksum - built from source tar.gz - ran simple queries from both bin.tar.gz and newly built package - Verified RELEASE_NOTES.txt, checked LICENSE,NOTICE, README.txt - used schematool to upgrade metastore from hive 0.13.0 to 1.0.0 On Thu, Jan 29, 2015 at 5:05 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Apache Hive 1.0 Release Candidate 2 is available here: http://cp.mcafee.com/d/k-Kr4x8idEI9FKeeecfCXCQrKfnvoppodETsd7b3zaaapJ6XzRTS6mnPqdT3hO_txVCVJohQJJyuMgzI0kjH6to6aNaQVsSjH6to6aNaQVsSMqei2tHcfZvCm7NPVEVWZOWqrz_e3D767KmKzp55l6X_axVZicHs3jq9JATvAXTLuZXTKrKr01Hvlo_-Rrr4_U03xF-cOaNRn2szfVGSS9-n9Oc-nhW_nbNIDxJ3P9ufPrz0KyCMY-qeiWMzFrr4Zwx7o74WNDm1yIiJen1hehD-Rrr4_U02rp79L6MnWhEwdbop3096ziWq811rr4_d40nApYQg8ZsQg0LP_SDCy0iS24AdDVEwjdIe6_9XrzV4T6C0X Maven artifacts are available here: http://cp.mcafee.com/d/k-Kr41EgdEI9FKeeecfCXCQrKfnvoppodETsd7b3zaaapJ6XzRTS6mnPqdT3hO_txVCVJohQJJyuMgzI0kjH6to6aNaQVsSjH6to6aNaQVsSMqei2tHcfZvCm7NPVEVWZOWqrz_e3D767KmKzp55l6X_axVZicHs3jqpJATvAXTLuZXTKrKr9PCJhbczWRqiDm9rJmSNf-00VqI9_2uhZqJ9jH6nQM03GSS9-jApYKztd73q7CiYvCT61t5dxVYQsBRx7iSS9X12eMe9RzeI35oBqsK2yszfZGSS9_M04SOejudwLQzh0qmMO60id6BQQg22SS9-q80L8PVEwhWVEw1vD_Jfd40BI498rfPh0CrosdWDFQxXE4 Source tag for RC1 is at: 
http://cp.mcafee.com/d/avndzgOd6Qm4QT77767PtPqdT7HLIcII6QrK6zBxNB55cSztNWXX3bbVJ6XxEVvKMYPsSI8WmSNfo8hS0a9RzeI35oBqsKr9RzeI35oBqsKrod791eRC7-LPb3UVYQsZuVtddN_D1Pzz3TbnhIyyGzt_BgY-F6lK1FJASOrLOtXTLuZXTdTdw0W6otGSS9_M078-JmCrhT2szfVfidczV_XjOWfnWVudAYdEupbN-rso5QkS7DPhOnm4tbroDI48X0UDmcWMclylFOUa9Oc_SHroD_00jr8VdUS2_id41Fr38o18Qqnjh08broDVEw2YzfCy17HCy05-v-QYQg2mMgAxI_d42pJxMToJfj2RS Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark -- Nothing better than when appreciated for hard work. -Mark
[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297959#comment-14297959 ] Vikram Dixit K commented on HIVE-9436: -- Committed to RC for 1.0. RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.0.0, 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code:
{code}
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
  caughtException = (MetaException) e.getCause();
} else {
  throw e.getCause();
}
{code}
The bug here is that Java's String.matches matches the entire string against the regex, so the match fails if the message contains anything before or after JDO[a-zA-Z]*Exception. The solution is simple: match against (?s).*JDO[a-zA-Z]*Exception.* instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
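The String.matches behaviour described in HIVE-9436 can be reproduced in isolation. The sketch below is not the RetryingMetaStoreClient code; the class name JdoMessageMatch, its helper methods, and the sample message are hypothetical. Only the two regular expressions are taken from the issue text.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone demo of the whole-string matching pitfall.
class JdoMessageMatch {

    // Pattern proposed in the issue: (?s) enables DOTALL so that '.'
    // also matches line breaks inside a multi-line exception message.
    private static final Pattern JDO_ANYWHERE =
        Pattern.compile("(?s).*JDO[a-zA-Z]*Exception.*");

    // Original check: String.matches succeeds only when the WHOLE
    // message is of the form JDO...Exception.
    static boolean strictMatch(String message) {
        return message.matches("JDO[a-zA-Z]*Exception");
    }

    // Relaxed check: the JDO exception name may appear anywhere.
    static boolean containsJdoException(String message) {
        return message != null && JDO_ANYWHERE.matcher(message).matches();
    }

    public static void main(String[] args) {
        String msg = "javax.jdo.JDODataStoreException: connection lost\n\tat ...";
        System.out.println("strict:  " + strictMatch(msg));
        System.out.println("relaxed: " + containsJdoException(msg));
    }
}
```

An alternative with the same effect would be Matcher.find() on the unanchored pattern JDO[a-zA-Z]*Exception, which avoids the leading and trailing .* altogether.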
[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called
[ https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297920#comment-14297920 ] Vikram Dixit K commented on HIVE-9473: -- +1 for 1.0.0 sql std auth should disallow built-in udfs that allow any java methods to be called --- Key: HIVE-9473 URL: https://issues.apache.org/jira/browse/HIVE-9473 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-9473.1.patch As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java methods. This should be disallowed when sql standard authorization is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9436: - Fix Version/s: 1.0.0 RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.0.0, 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code:
{code}
} else if ((e.getCause() instanceof MetaException) &&
    e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) {
  caughtException = (MetaException) e.getCause();
} else {
  throw e.getCause();
}
{code}
The bug here is that Java's String.matches matches the entire string against the regex, so the match fails if the message contains anything before or after JDO[a-zA-Z]*Exception. The solution is simple: match against (?s).*JDO[a-zA-Z]*Exception.* instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9514) schematool is broken in hive 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297860#comment-14297860 ] Vikram Dixit K commented on HIVE-9514: -- +1 LGTM. schematool is broken in hive 1.0.0 -- Key: HIVE-9514 URL: https://issues.apache.org/jira/browse/HIVE-9514 Project: Hive Issue Type: Bug Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.0.0 Attachments: HIVE-9514.1.patch
Schematool gives following error -
{code}
bin/schematool -dbType derby -initSchema
Starting metastore schema initialization to 1.0
org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 1.0
{code}
Metastore schema hasn't changed from 0.14.0 to 1.0.0. So there is no need for new .sql files for 1.0.0. However, schematool needs to be made aware of the metastore schema equivalence. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
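One way to picture the "schema equivalence" mentioned in HIVE-9514 is a lookup table that maps a release version to the schema version whose .sql files it should reuse. This is only a sketch under assumptions: the class name, method names, and script naming scheme below are hypothetical, and it does not claim to show what HIVE-9514.1.patch actually does.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: names and the script naming scheme are assumptions, not Hive's API.
final class SchemaVersionLookup {
    // Release versions whose metastore schema is identical to an earlier schema version.
    private static final Map<String, String> EQUIVALENT_VERSIONS = new HashMap<>();

    static {
        // Per the issue: the schema did not change between 0.14.0 and 1.0.0.
        EQUIVALENT_VERSIONS.put("1.0.0", "0.14.0");
    }

    // Returns the schema version to use for a given Hive release version.
    static String schemaVersionFor(String hiveVersion) {
        return EQUIVALENT_VERSIONS.getOrDefault(hiveVersion, hiveVersion);
    }

    // Builds a hypothetical init script name for a release and database type.
    static String initScriptFor(String hiveVersion, String dbType) {
        return "hive-schema-" + schemaVersionFor(hiveVersion) + "." + dbType + ".sql";
    }

    public static void main(String[] args) {
        System.out.println(initScriptFor("1.0.0", "derby"));
    }
}
```

With such a mapping, no new .sql files are needed for a release that leaves the schema untouched; only the table gains one entry.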
Re: [VOTE] Apache Hive 1.0 Release Candidate 1
Hi Folks,

With the issue that Thejas found with the schematool, I need to spin another RC and cancel this vote. I am including Lefty's webhcat change and the NOTICE and README.txt file changes as mentioned by Chao Sun as well.

Thanks
Vikram.

On Thu, Jan 29, 2015 at 4:09 PM, Prasanth Jayachandran pjayachand...@hortonworks.com wrote:

+1. Checked MD5, signatures, built source with hadoop-1 and 2 profiles, ran some test queries, no snapshot deps.

On Jan 29, 2015, at 10:04 AM, Alan Gates ga...@hortonworks.com wrote:

+1. Downloaded it, checked out the signatures, did a build, checked there were no snapshot dependencies.

Alan.

Vikram Dixit K vikram.di...@gmail.com, January 27, 2015 at 14:28:

Apache Hive 1.0 Release Candidate 1 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0-rc1/

Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1020/

Source tag for RC1 is at: http://svn.apache.org/repos/asf/hive/branches/branch-1.0/

Voting will conclude in 72 hours. Hive PMC Members: Please test and vote.

Thanks
Vikram.

-- Nothing better than when appreciated for hard work. -Mark
[jira] [Updated] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-8807: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch 1.0 Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Assignee: Eugene Koifman Fix For: 1.0.0 Attachments: HIVE8807.patch The defaults for templeton.pig.path and templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). no precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[VOTE] Apache Hive 1.0 Release Candidate 2
Apache Hive 1.0 Release Candidate 2 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0.0-rc2/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1021/ Source tag for RC2 is at: http://svn.apache.org/repos/asf/hive/tags/release-1.0.0-rc2/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295829#comment-14295829 ] Vikram Dixit K commented on HIVE-8807: -- If I end up rolling out a new release and we have a patch for this by then, I will include this in the next roll-out. Thanks Vikram. Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Fix For: 0.14.1 The defaults for templeton.pig.path and templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[VOTE] Apache Hive 1.0 Release Candidate 0
Apache Hive 1.0 Release Candidate 0 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1019/ Source tag for RC0 is at: http://svn.apache.org/repos/asf/hive/branches/branch-1.0/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Fix Version/s: 1.0.0 Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Fix For: 1.0.0 Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch
Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs.
* {{auto_join21.q}}
* {{auto_join29.q}}
* {{auto_join30.q}}
* {{auto_join_filters.q}}
* {{auto_join_nulls.q}}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9141) HiveOnTez: mix of union all, distinct, group by generates error
[ https://issues.apache.org/jira/browse/HIVE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9141: - Fix Version/s: 1.0.0 HiveOnTez: mix of union all, distinct, group by generates error --- Key: HIVE-9141 URL: https://issues.apache.org/jira/browse/HIVE-9141 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.15.0 Reporter: Pengcheng Xiong Assignee: Navis Fix For: 0.15.0, 1.0.0 Attachments: HIVE-9141.1.patch.txt
Here is the way to produce it: in Hive q test setting (with src table)
set hive.execution.engine=tez;
SELECT key, value FROM ( SELECT key, value FROM src UNION ALL SELECT key, key as value FROM ( SELECT distinct key FROM ( SELECT key, value FROM (SELECT key, value FROM src UNION ALL SELECT key, value FROM src )t1 group by key, value )t2 )t3 )t4 group by key, value;
will generate
2014-12-16 23:19:13,593 ERROR ql.Driver (SessionState.java:printError(834)) - FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
 at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:361)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:87)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
 at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:69)
 at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:368)
 at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:202)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1155)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:206)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:158)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:369)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:304)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:834)
 at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:136)
 at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_uniontez2(TestMiniTezCliDriver.java:120)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Created branch 1.0
Hi Folks, It has been a few days and I have done all the work needed to produce a 1.0 RC and think it is better to have a vote on it. I still hope that we can have this release as 1.0 and Brock's release as 1.1. By the end of the day I think having more releases is a good thing for the community as is moving to 1.0 sooner rather than later. Thanks Vikram. On Fri, Jan 23, 2015 at 6:12 PM, Sergey Shelukhin ser...@hortonworks.com wrote: I think the way it is done in Hadoop space is better for Hadoop space (and better wrt consistency, us being in the Hadoop space). Because no single company or QA process controls or covers all the changes to the product, and some changes go unseen by every actor, stabilization period is a must... And anyway enterprise software on trunk model does not cut releases immediately off trunk and ship them. With enterprise software there's lengthy QA, with Hadoop there's lengthy cutting edge release. How about we cut 1.0 with stable version 0.14.1, and instead of 0.15 do 2.0, like HBase did? We can maintain 1.0 as maintenance release; with 2.0 we can add new unstable stuff, and also remove all the old paths we don't care about (old Hadoop support, HiveCLI(?), old Java version support) etc. On Fri, Jan 23, 2015 at 11:40 AM, Szehon Ho sze...@cloudera.com wrote: Wherever I've seen in enterprise software, the trunk-based development model has been the standard where all release branches are cut from trunk and short-lived. I've never heard of a case where a branch originally designated for 0.14 (minor release) is cut again to become 1.0 (major release), and I dont think if you ask anyone they will expect it either. There was also no announced plan when cutting 0.14 branch that it was eventually going to be 1.0. As Brock pointed out in the beginning, Hadoop branch/versioning is the only exception and an anti-pattern, and all the confusion like why 0.xx has features not in 1.0 would not be there if it followed this. 
I would really hate to see the same anti-pattern happen to Hive, so my vote is also against this. Also this standard release branching practice has been in Hive throughout its history, you wouldn't make 0.14 out of 0.13 branch, would you? From the stability and long-term support use-cases that is very definitely the wrong thing to do - to cram code into a 1.0 release. Major release is supposed to be stable. I also don't see how cutting 1.0 from trunk precludes it from stabilizing. Also I don't think those arguments of 0.14 as most stable that can be backed up, what constitutes stability? Bug fixes are just one part, in that case there are always more bug fixes in later Hive versions than earlier ones, so probably API stability is a more measure-able term and should be more important to consider. Thanks, Szehon On Fri, Jan 23, 2015 at 10:42 AM, Gopal V gop...@apache.org wrote: On 1/23/15, 6:59 AM, Xuefu Zhang wrote: While it's true that a release isn't going to include everything from trunk, proposed 1.0 release is branched off 0.14, which was again branched from trunk long time ago. If you compare the code base, you will see the huge difference. From the stability and long-term support use-cases that is very definitely the wrong thing to do - to cram code into a 1.0 release. The huge difference is *THE* really worrying red-flag. Or is the thought behind everything from trunk that 1.0 just a number? 0.14.1 in terms of functionality and stability will be much clearer, meeting the all expectations for a major release. Just to be clear, when hive-14 was released, it was actually a major release. That branch kicked off in Sept and has been updated since then with a known set of critical fixes, giving it pedigree and has already seen customer time. In all this discussion, it doesn't sound like you consider 0.15 to be a major release - that gives me no confidence in your approach. 
Cheers, Gopal On Thu, Jan 22, 2015 at 3:08 PM, Thejas Nair the...@hortonworks.com wrote: On Thu, Jan 22, 2015 at 12:31 PM, Xuefu Zhang xzh...@cloudera.com wrote: Hi Thejas/Alan, From all the argument, I think there was an assumption that the proposed 1.0 release will be imminent and 0.15 will happen far after that. Based on that assumption, 0.15 will become 1.1, which is greater in scope than 1.0. However, this assumption may not be true. The confusion will be significant if 0.15 is released early as 0.15 before 0.14.1 is released as 1.0. Yes, the assumption is that 1.0 will be out very soon, before 0.15 line is ready, and that 0.15 can become 1.1 . Do you think that assumption won't hold true ? (In previous emails in this thread, I talk about reasons why this assumption is reliable). I agree that it does not make sense to release
[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result
[ https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9053: - Fix Version/s: 1.0.0 select constant in union all followed by group by gives wrong result Key: HIVE-9053 URL: https://issues.apache.org/jira/browse/HIVE-9053 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 0.15.0, 0.14.1, 1.0.0 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, HIVE-9053.03.patch, HIVE-9053.04.patch, HIVE-9053.patch-branch-1.0 Here is the way to reproduce with q test: select key from (select '1' as key from src union all select key from src)tab group by key; will give OK NULL 1 This is not correct as src contains many other keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result
[ https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9053: - Fix Version/s: 0.14.1 select constant in union all followed by group by gives wrong result Key: HIVE-9053 URL: https://issues.apache.org/jira/browse/HIVE-9053 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 0.15.0, 0.14.1, 1.0.0 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, HIVE-9053.03.patch, HIVE-9053.04.patch, HIVE-9053.patch-branch-1.0 Here is the way to reproduce with q test: select key from (select '1' as key from src union all select key from src)tab group by key; will give OK NULL 1 This is not correct as src contains many other keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[VOTE] Apache Hive 1.0 Release Candidate 1
Apache Hive 1.0 Release Candidate 1 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0-rc1/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1020/ Source tag for RC1 is at: http://svn.apache.org/repos/asf/hive/branches/branch-1.0/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark
Re: [VOTE] Apache Hive 1.0 Release Candidate 0
It looks like I missed updating the release notes with the latest changes that have gone into 1.0. I will fix that and create a new RC. Thanks Vikram. On Tue, Jan 27, 2015 at 12:29 PM, Vikram Dixit K vikram.di...@gmail.com wrote: Apache Hive 1.0 Release Candidate 0 is available here: http://people.apache.org/~vikram/hive/apache-hive-1.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-1019/ Source tag for RC0 is at: http://svn.apache.org/repos/asf/hive/branches/branch-1.0/ Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks Vikram. -- Nothing better than when appreciated for hard work. -Mark -- Nothing better than when appreciated for hard work. -Mark
[jira] [Commented] (HIVE-9359) Export of a large table causes OOM in Metastore and Client
[ https://issues.apache.org/jira/browse/HIVE-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292686#comment-14292686 ] Vikram Dixit K commented on HIVE-9359: -- +1 for 1.0 Export of a large table causes OOM in Metastore and Client -- Key: HIVE-9359 URL: https://issues.apache.org/jira/browse/HIVE-9359 Project: Hive Issue Type: Bug Components: Import/Export, Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.15.0 Attachments: HIVE-9359.2.patch, HIVE-9359.patch
Running hive export on a table with a large number of partitions winds up making the metastore and client run out of memory. The places where we wind up holding a copy of the entire set of partition objects are as follows:
Metastore
* (temporarily) Metastore MPartition objects
* List<Partition> that gets persisted before sending to thrift
* thrift copy of all of those partitions
Client side
* thrift copy of partitions
* deepcopy of above to create List<Partition> objects
* JSONObject that contains all of those above partition objects
* List<ReadEntity>, each of which encapsulates the aforesaid partition objects.
This memory usage needs to be drastically reduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
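The issue does not say how the memory usage should be reduced. One generic technique for this kind of problem is to process items in bounded batches instead of materializing every copy at once. The sketch below only illustrates that idea: the class, the method, and the use of plain strings as stand-ins for partition names are all hypothetical, and it should not be read as the approach taken in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: a generic batching helper, not Hive code.
final class PartitionBatcher {
    // Hands the items to the handler in chunks of at most batchSize and
    // returns the number of batches processed.
    static <T> int forEachBatch(List<T> items, int batchSize, Consumer<List<T>> handler) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so the handler never holds a view of the full list.
            handler.accept(new ArrayList<>(items.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Hypothetical partition names standing in for real partition objects.
        List<String> names = List.of("ds=1", "ds=2", "ds=3", "ds=4", "ds=5");
        int batches = forEachBatch(names, 2, batch -> System.out.println(batch));
        System.out.println("batches: " + batches);
    }
}
```

The point of the design is that only one batch of the expensive per-partition objects needs to be alive at a time, so peak memory is bounded by the batch size rather than by the partition count.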