[jira] [Created] (HIVE-13730) hybridgrace_hashjoin_1.q test gets stuck
Vikram Dixit K created HIVE-13730:

Summary: hybridgrace_hashjoin_1.q test gets stuck
Key: HIVE-13730
URL: https://issues.apache.org/jira/browse/HIVE-13730
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Wei Zheng
Priority: Critical

I am seeing hybridgrace_hashjoin_1.q getting stuck on master.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13628) Support for permanent functions - error handling if no restart
Vikram Dixit K created HIVE-13628:

Summary: Support for permanent functions - error handling if no restart
Key: HIVE-13628
URL: https://issues.apache.org/jira/browse/HIVE-13628
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-13628.1.patch

Support for permanent functions - error handling if no restart
[jira] [Created] (HIVE-13627) When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task
Vikram Dixit K created HIVE-13627:

Summary: When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task
Key: HIVE-13627
URL: https://issues.apache.org/jira/browse/HIVE-13627
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

When running under LLAP, for regular map joins, throw an error if memory utilization goes above what is allocated to the task. This way, the rest of the dependent tasks can fail sooner.
[jira] [Created] (HIVE-13621) compute stats in certain cases fails with NPE
Vikram Dixit K created HIVE-13621:

Summary: compute stats in certain cases fails with NPE
Key: HIVE-13621
URL: https://issues.apache.org/jira/browse/HIVE-13621
Project: Hive
Issue Type: Bug
Components: HBase Metastore, Metastore
Affects Versions: 2.1.0, 2.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

{code}
FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatistics(StatsUtils.java:693)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.convertColStats(StatsUtils.java:739)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.getTableColumnStats(StatsUtils.java:728)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:183)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:136)
    at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:124)
{code}
[jira] [Created] (HIVE-13619) Bucket map join plan is incorrect
Vikram Dixit K created HIVE-13619:

Summary: Bucket map join plan is incorrect
Key: HIVE-13619
URL: https://issues.apache.org/jira/browse/HIVE-13619
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 2.0.0, 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

Same as HIVE-12992. Missed a single line check.
[jira] [Created] (HIVE-13518) Hive on Tez: Shuffle joins do not choose the right 'big' table.
Vikram Dixit K created HIVE-13518:

Summary: Hive on Tez: Shuffle joins do not choose the right 'big' table.
Key: HIVE-13518
URL: https://issues.apache.org/jira/browse/HIVE-13518
Project: Hive
Issue Type: Bug
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

Currently the big table is always assumed to be at position 0, but this is inefficient for some queries because the table at position 1 could have many more keys or more skew. We already have a mechanism for choosing the big table that can be leveraged to make the right choice.
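As a rough illustration of the idea in HIVE-13518, the sketch below picks the join input with the largest estimated key count instead of always assuming position 0. It is not Hive's actual selection mechanism, which the report only alludes to; the class name `BigTableChooser`, the method `chooseBigTable`, and the per-position estimates are all hypothetical.

```java
// Hypothetical sketch: choose the 'big' table position from per-input estimates
// rather than hard-coding position 0. This is not Hive's real API.
class BigTableChooser {

    /** Returns the position with the largest estimated key count (ties go to the lowest position). */
    static int chooseBigTable(long[] estimatedKeyCounts) {
        if (estimatedKeyCounts == null || estimatedKeyCounts.length == 0) {
            throw new IllegalArgumentException("at least one join input is required");
        }
        int best = 0;
        for (int pos = 1; pos < estimatedKeyCounts.length; pos++) {
            if (estimatedKeyCounts[pos] > estimatedKeyCounts[best]) {
                best = pos;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Position 1 has far more keys than position 0, so it is the better 'big' table.
        System.out.println(chooseBigTable(new long[] {1_000L, 5_000_000L}));
    }
}
```

A real planner would presumably weigh data size and skew as well as key counts; a single estimate per input is used here only to keep the example short.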
[jira] [Created] (HIVE-13485) Session id appended to thread name multiple times.
Vikram Dixit K created HIVE-13485:

Summary: Session id appended to thread name multiple times.
Key: HIVE-13485
URL: https://issues.apache.org/jira/browse/HIVE-13485
Project: Hive
Issue Type: Bug
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-13485.1.patch

HIVE-13153 addressed a portion of this issue. This is a follow-up from there.
[jira] [Created] (HIVE-13438) Add a service check script for llap
Vikram Dixit K created HIVE-13438:

Summary: Add a service check script for llap
Key: HIVE-13438
URL: https://issues.apache.org/jira/browse/HIVE-13438
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-13438.1.patch

We want a test script, runnable by an installer such as Ambari, that verifies that the service is up and running.
[jira] [Created] (HIVE-13408) Issue appending HIVE_QUERY_ID without checking if the prefix already exists
Vikram Dixit K created HIVE-13408:

Summary: Issue appending HIVE_QUERY_ID without checking if the prefix already exists
Key: HIVE-13408
URL: https://issues.apache.org/jira/browse/HIVE-13408
Project: Hive
Issue Type: Bug
Components: Shims
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

{code}
We are resetting the hadoop caller context to HIVE_QUERY_ID:HIVE_QUERY_ID:
{code}
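The doubled `HIVE_QUERY_ID:HIVE_QUERY_ID:` value in HIVE-13408 is what an unconditional prepend produces when it runs twice. A minimal sketch of an idempotent prepend follows, assuming the prefix is the literal string quoted in the report; the class `CallerContextPrefix` and the method `withPrefix` are hypothetical names and are not taken from the Hive shims.

```java
// Hypothetical sketch: add a prefix only when it is not already present.
class CallerContextPrefix {

    // Assumed prefix, copied from the string quoted in the report.
    static final String PREFIX = "HIVE_QUERY_ID:";

    /** Returns the value with exactly one leading PREFIX; null and empty input pass through. */
    static String withPrefix(String value) {
        if (value == null || value.isEmpty()) {
            return value;
        }
        return value.startsWith(PREFIX) ? value : PREFIX + value;
    }

    public static void main(String[] args) {
        String once = withPrefix("query_123");
        String twice = withPrefix(once); // second call leaves the value unchanged
        System.out.println(once);
        System.out.println(twice);
    }
}
```

Applying the method repeatedly yields the same result as applying it once, which is the property the report says was missing.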
[jira] [Created] (HIVE-13394) Analyze table fails in tez on empty partitions
Vikram Dixit K created HIVE-13394:

Summary: Analyze table fails in tez on empty partitions
Key: HIVE-13394
URL: https://issues.apache.org/jira/browse/HIVE-13394
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 2.0.0, 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

{code}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:237)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:766)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
    ... 17 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.hive.ql.udf.generic.NumDistinctValueEstimator.deserialize(NumDistinctValueEstimator.java:219)
    at org.apache.hadoop.hive.ql.udf.generic.NumDistinctValueEstimator.<init>(NumDistinctValueEstimator.java:112)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFComputeStats$GenericUDAFNumericStatsEvaluator.merge(GenericUDAFComputeStats.java:556)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:188)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:612)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processAggr(GroupByOperator.java:851)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:695)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:761)
    ... 18 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_145591034_27748_1_01 [Reducer 2] killed/failed due to:OWN_TASK_FAILURE]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
{code}
[jira] [Created] (HIVE-13343) Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join
Vikram Dixit K created HIVE-13343:

Summary: Need to disable hybrid grace hash join in llap mode except for dynamically partitioned hash join
Key: HIVE-13343
URL: https://issues.apache.org/jira/browse/HIVE-13343
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

For performance reasons, we should disable hybrid grace hash join in llap when dynamically partitioned hash join is not used. With dynamically partitioned hash join, we still need hybrid grace hash join because of the possibility of skew.
[jira] [Created] (HIVE-13342) Improve logging in llap decider for llap
Vikram Dixit K created HIVE-13342:

Summary: Improve logging in llap decider for llap
Key: HIVE-13342
URL: https://issues.apache.org/jira/browse/HIVE-13342
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

Currently we do not log our decisions with respect to llap: are we running everything in llap mode, or only parts of the plan? We need more logging. Also, if the llap mode is "all" but for some reason we cannot run the work in llap mode, we should fail with an exception that advises the user to change the mode to auto.
[jira] [Created] (HIVE-13329) Hive query id should not be allowed to be modified by users.
Vikram Dixit K created HIVE-13329:

Summary: Hive query id should not be allowed to be modified by users.
Key: HIVE-13329
URL: https://issues.apache.org/jira/browse/HIVE-13329
Project: Hive
Issue Type: Bug
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
[jira] [Created] (HIVE-13286) Query ID is being reused across queries
Vikram Dixit K created HIVE-13286:

Summary: Query ID is being reused across queries
Key: HIVE-13286
URL: https://issues.apache.org/jira/browse/HIVE-13286
Project: Hive
Issue Type: Bug
Components: Parser
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Pengcheng Xiong
Priority: Critical

[~aihuaxu] I see this commit made via HIVE-11488. I see that the query id is being reused across queries. This defeats the purpose of a query id. I am not sure what the purpose of the change in that jira is, but it breaks the assumption that a query id is unique for each query. Please take a look into this.
[jira] [Created] (HIVE-13282) GroupBy and select operator encounter ArrayIndexOutOfBoundsException
Vikram Dixit K created HIVE-13282:

Summary: GroupBy and select operator encounter ArrayIndexOutOfBoundsException
Key: HIVE-13282
URL: https://issues.apache.org/jira/browse/HIVE-13282
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 2.0.0, 1.2.1, 2.1.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

The group by and select operators run into the ArrayIndexOutOfBoundsException when they incorrectly initialize themselves with tag 0 but the incoming tag id is different.

{code}
select count(*)
from (select rt1.id from (select t1.key as id, t1.value as od from tab t1 group by key, value) rt1) vt1
join (select rt2.id from (select t2.key as id, t2.value as od from tab_part t2 group by key, value) rt2) vt2
where vt1.id=vt2.id;
{code}
[jira] [Created] (HIVE-13152) JDBC split refactoring and handle some edge cases
Vikram Dixit K created HIVE-13152:

Summary: JDBC split refactoring and handle some edge cases
Key: HIVE-13152
URL: https://issues.apache.org/jira/browse/HIVE-13152
Project: Hive
Issue Type: Sub-task
Components: JDBC
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Fix For: llap

- wrap jdbc split
- allow zero split returns in llapdump
- allow setting of llap output format as file format option
- create job conf via dag utils
- JDBC LlapInputSplit should not be an inner class
- Hack: Spark uses hive-1.2.1, which does not have HiveConf.ConfVars.LLAP_DAEMON_RPC_PORT, use "hive.llap.daemon.rpc.port"
- Use Class.getName() rather than Class.toString() for class name
- Add LlapInputSplit Writable test
- Fix for filesink operator close NPE.
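On the "Use Class.getName() rather than Class.toString()" item: `Class.toString()` prepends `class ` or `interface ` to the binary name, so its result cannot be passed back to a class loader, whereas `Class.getName()` returns the bare name. The short sketch below only shows the difference between the two standard JDK methods; the wrapper class `ClassNameDemo` and its helpers are hypothetical and unrelated to the Hive patch.

```java
// Demonstrates the difference between Class.getName() and Class.toString().
class ClassNameDemo {

    /** The binary class name, suitable for writing into a config or a serialized split. */
    static String viaGetName(Class<?> c) {
        return c.getName();
    }

    /** The human-readable form, which carries a "class " or "interface " prefix. */
    static String viaToString(Class<?> c) {
        return c.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaGetName(String.class));    // java.lang.String
        System.out.println(viaToString(String.class));   // class java.lang.String
        System.out.println(viaToString(Runnable.class)); // interface java.lang.Runnable
    }
}
```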
[jira] [Created] (HIVE-12992) Hive on tez: Bucket map join plan is incorrect
Vikram Dixit K created HIVE-12992:

Summary: Hive on tez: Bucket map join plan is incorrect
Key: HIVE-12992
URL: https://issues.apache.org/jira/browse/HIVE-12992
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.1, 2.0.0
Reporter: Vikram Dixit K

TPCH Query 9 fails when bucket map join is enabled:

{code}
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 5, vertexId=vertex_1450634494433_0007_2_06, diagnostics=[Exception in EdgeManager, vertex=vertex_1450634494433_0007_2_06 [Reducer 5], Fail to sendTezEventToDestinationTasks, event:DataMovementEvent [sourceIndex=0, targetIndex=-1, version=0], sourceInfo:{ producerConsumerType=OUTPUT, taskVertexName=Map 1, edgeVertexName=Reducer 5, taskAttemptId=attempt_1450634494433_0007_2_05_00_0 }, destinationInfo:null, EdgeInfo: sourceVertexName=Map 1, destinationVertexName=Reducer 5, java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.routeDataMovementEventToDestination(CustomPartitionEdge.java:88)
    at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:458)
    at org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:386)
    at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:439)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:4382)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.access$4000(VertexImpl.java:202)
    at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4172)
    at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:4164)
{code}
[jira] [Created] (HIVE-12947) SMB join in tez has ClassCastException when container reuse is on
Vikram Dixit K created HIVE-12947:

Summary: SMB join in tez has ClassCastException when container reuse is on
Key: HIVE-12947
URL: https://issues.apache.org/jira/browse/HIVE-12947
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

{code}
java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.FileSinkOperator cannot be cast to org.apache.hadoop.hive.ql.exec.DummyStoreOperator
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:300)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getJoinParentOp(MapRecordProcessor.java:302)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:189)
    ... 15 more
{code}
[jira] [Created] (HIVE-12905) Issue with mapjoin in tez under certain conditions
Vikram Dixit K created HIVE-12905:

Summary: Issue with mapjoin in tez under certain conditions
Key: HIVE-12905
URL: https://issues.apache.org/jira/browse/HIVE-12905
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1, 1.0.1, 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

In a specific case where we have an outer join followed by another join on the same key and the non-outer side of the outer join is empty, hive-on-tez produces incorrect results.
[jira] [Created] (HIVE-12797) Synchronization issues with tez/llap session pool in hs2
Vikram Dixit K created HIVE-12797:

Summary: Synchronization issues with tez/llap session pool in hs2
Key: HIVE-12797
URL: https://issues.apache.org/jira/browse/HIVE-12797
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

The changes introduced as part of HIVE-12674 cause issues while shutting down hs2 when session pools are used.

{code}
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966) ~[?:1.8.0_45]
    at java.util.LinkedList$ListItr.remove(LinkedList.java:921) ~[?:1.8.0_45]
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:288) ~[hive-exec-2.0.0.2.3.5.0-79.jar:2.0.0.2.3.5.0-79]
    at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:479) [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
    at org.apache.hive.service.server.HiveServer2$2.run(HiveServer2.java:183) [hive-jdbc-2.0.0.2.3.5.0-79-standalone.jar:2.0.0.2.3.5.0-79]
{code}
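The trace in HIVE-12797 shows `LinkedList$ListItr.remove` failing its comodification check, which happens when the list is structurally changed by something other than the iterator between `next()` and `remove()`. The sketch below reproduces that failure in isolation and shows one possible way to avoid it: take a snapshot under a lock and clear the list in one step. The class `SessionPoolSketch` and its methods are hypothetical and do not mirror `TezSessionPoolManager`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of a session pool whose stop() cannot hit a ConcurrentModificationException.
class SessionPoolSketch {
    private final List<String> sessions = new LinkedList<>();

    synchronized void add(String id) { sessions.add(id); }
    synchronized void remove(String id) { sessions.remove(id); }
    synchronized int size() { return sessions.size(); }

    /** Copies and clears the list under the lock; the caller closes the returned sessions. */
    List<String> stop() {
        synchronized (this) {
            List<String> snapshot = new ArrayList<>(sessions);
            sessions.clear();
            return snapshot;
        }
    }

    /** Reproduces the failure mode: the list changes behind the iterator's back. */
    static boolean reproducesFailure() {
        List<String> list = new LinkedList<>(Arrays.asList("s1", "s2", "s3"));
        Iterator<String> it = list.iterator();
        it.next();
        list.remove("s3"); // structural change made outside the iterator
        try {
            it.remove();   // the iterator detects the change here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(reproducesFailure());
        SessionPoolSketch pool = new SessionPoolSketch();
        pool.add("a");
        pool.add("b");
        System.out.println(pool.stop().size());
    }
}
```

Closing the sessions from the snapshot, outside the iteration over the live list, means a session's own close path can call `remove` without invalidating any iterator.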
[jira] [Created] (HIVE-12768) Thread safety: binary sortable serde decimal deserialization
Vikram Dixit K created HIVE-12768:

Summary: Thread safety: binary sortable serde decimal deserialization
Key: HIVE-12768
URL: https://issues.apache.org/jira/browse/HIVE-12768
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Blocker

We see thread safety issues due to a static decimal buffer in the binary sortable serde.
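A static scratch buffer is shared by every thread that deserializes, so two threads can overwrite each other's bytes mid-decode. One common remedy, sketched below, is to give each thread its own buffer through `ThreadLocal`; whether the actual fix for HIVE-12768 took this route is not stated in the report. The class `DecimalBufferSketch`, the method `decodeDigits`, and the buffer size are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: per-thread scratch buffer instead of one shared static buffer.
class DecimalBufferSketch {

    // Unsafe variant for comparison (shared by all threads):
    //   private static final byte[] SHARED = new byte[64];

    // Each thread lazily gets its own buffer, so concurrent decodes cannot interfere.
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[64]);

    /** Copies the ASCII digits src[offset, offset + length) through the scratch buffer and decodes them. */
    static String decodeDigits(byte[] src, int offset, int length) {
        byte[] buf = BUFFER.get();
        if (buf.length < length) {
            buf = new byte[length]; // grow this thread's buffer when the value is longer
            BUFFER.set(buf);
        }
        System.arraycopy(src, offset, buf, 0, length);
        return new String(buf, 0, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "0012345".getBytes(StandardCharsets.US_ASCII);
        System.out.println(decodeDigits(data, 2, 5));
    }
}
```

Allocating a fresh buffer per call would also be thread-safe; the thread-local form keeps the allocation-avoidance that presumably motivated the static buffer.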
[jira] [Created] (HIVE-12740) NPE with HS2 when using null input format
Vikram Dixit K created HIVE-12740:

Summary: NPE with HS2 when using null input format
Key: HIVE-12740
URL: https://issues.apache.org/jira/browse/HIVE-12740
Project: Hive
Issue Type: Bug
Components: HiveServer2, Tez
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical

When a query returns empty rows and we are using tez with hs2, we hit an NPE:

{code}
java.util.concurrent.ExecutionException: java.lang.NullPointerException
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:490)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:447)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.writeOldSplits(MRInputHelpers.java:559)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplits(MRInputHelpers.java:619)
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.configureMRInputWithLegacySplitGeneration(MRInputHelpers.java:109)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:617)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1103)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:386)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1816)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1561)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1338)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1154)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1147)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:181)
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:73)
    at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:234)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:247)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.Utilities.isVectorMode(Utilities.java:3241)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.wrapForLlap(HiveInputFormat.java:208)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getInputFormatFromCache(HiveInputFormat.java:267)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:103)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CheckNonCombinablePathCallable.call(CombineHiveInputFormat.java:80)
    ... 4 more
15/12/17 18:59:06 INFO log.PerfLogger:
15/12/17 18:59:06 ERROR exec.Task: Failed to execute tez graph.
org.apache.tez.dag.api.TezUncheckedException: Failed to generate InputSplits
    at org.apache.tez.mapreduce.hadoop.MRInputHelpers.configureMRInputWithLegacySplitGeneration(MRInputHelpers.java:124)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:617)
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.createVertex(DagUtils.java:1103)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:386)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1816)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1561)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1338)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1154)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1147)
    at
[jira] [Created] (HIVE-12437) SMB join in tez fails when one of the tables is empty
Vikram Dixit K created HIVE-12437:

Summary: SMB join in tez fails when one of the tables is empty
Key: HIVE-12437
URL: https://issues.apache.org/jira/browse/HIVE-12437
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1, 1.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical

It looks like a better check for empty tables is to depend on the existence of the record reader for the input from tez.
[jira] [Created] (HIVE-12387) Bug with logging improvements in ATS
Vikram Dixit K created HIVE-12387:

Summary: Bug with logging improvements in ATS
Key: HIVE-12387
URL: https://issues.apache.org/jira/browse/HIVE-12387
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

When indexing in ATS, the space in the value is not useful. We need to switch to using the hive query id throughout the logging phase and also add information about which config the user passed in.
[jira] [Created] (HIVE-12254) Improve logging with yarn/hdfs
Vikram Dixit K created HIVE-12254:

Summary: Improve logging with yarn/hdfs
Key: HIVE-12254
URL: https://issues.apache.org/jira/browse/HIVE-12254
Project: Hive
Issue Type: Bug
Components: Shims
Affects Versions: 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

As an extension to HIVE-12249, add the same info for Yarn/HDFS as well. Both HIVE-12249 and HDFS-9184 are required before this can be resolved.
[jira] [Created] (HIVE-12249) Improve logging with tez
Vikram Dixit K created HIVE-12249:

Summary: Improve logging with tez
Key: HIVE-12249
URL: https://issues.apache.org/jira/browse/HIVE-12249
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

We need to improve logging across the board. TEZ-2851 added a caller context so that one can correlate logs with the application. This jira adds a new configuration for users that can be used to correlate the logs.
[jira] [Created] (HIVE-12204) Tez queries stopped running with ApplicationNotRunningException
Vikram Dixit K created HIVE-12204:

Summary: Tez queries stopped running with ApplicationNotRunningException
Key: HIVE-12204
URL: https://issues.apache.org/jira/browse/HIVE-12204
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1, 1.0.1, 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

In some error cases, if hive can no longer submit DAGs to tez, there is no use in retrying the submission. We need to exit by throwing an exception in this case.
[jira] [Created] (HIVE-12201) Tez settings need to be shown in set -v output when execution engine is tez.
Vikram Dixit K created HIVE-12201:

Summary: Tez settings need to be shown in set -v output when execution engine is tez.
Key: HIVE-12201
URL: https://issues.apache.org/jira/browse/HIVE-12201
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1, 1.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Minor
Attachments: HIVE-12201.1.patch

The set -v output currently shows configurations for yarn, hdfs etc. but does not show tez settings when tez is set as the execution engine.
[jira] [Created] (HIVE-11829) Create test for HIVE-11216
Vikram Dixit K created HIVE-11829:

Summary: Create test for HIVE-11216
Key: HIVE-11829
URL: https://issues.apache.org/jira/browse/HIVE-11829
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 1.2.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

We need tests for HIVE-11216.
[jira] [Created] (HIVE-11806) Create test for HIVE-11174
Vikram Dixit K created HIVE-11806:

Summary: Create test for HIVE-11174
Key: HIVE-11806
URL: https://issues.apache.org/jira/browse/HIVE-11806
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Minor

We are lacking tests for HIVE-11174. Adding one.
[jira] [Created] (HIVE-11606) Bucket map joins fail at hash table construction time
Vikram Dixit K created HIVE-11606:

Summary: Bucket map joins fail at hash table construction time
Key: HIVE-11606
URL: https://issues.apache.org/jira/browse/HIVE-11606
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.1, 1.0.1
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

{code}
info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: java.lang.AssertionError: Capacity must be a power of two
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:91)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:294)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
{code}
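The assertion in this trace rejects a hash table capacity that is not a power of two. As a general illustration of that invariant (the report does not say how the issue was resolved), the sketch below checks the property and rounds a requested capacity up to the next power of two. The class `PowerOfTwo` and both method names are hypothetical.

```java
// Hypothetical sketch: validating and rounding a hash table capacity.
class PowerOfTwo {

    /** True when n is positive and has exactly one bit set. */
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    /** Smallest power of two that is >= n, for n up to 2^30. */
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        if (n > (1 << 30)) {
            throw new IllegalArgumentException("capacity too large: " + n);
        }
        // highestOneBit(n - 1) is the largest power of two strictly below n;
        // doubling it gives the smallest power of two that is not below n.
        return Integer.highestOneBit(n - 1) << 1;
    }

    public static void main(String[] args) {
        int requested = 100_000; // e.g. a capacity derived from a row-count estimate
        int capacity = nextPowerOfTwo(requested);
        System.out.println(capacity + " " + isPowerOfTwo(capacity));
    }
}
```

Power-of-two capacities let a table map a hash to a slot with `hash & (capacity - 1)` instead of a modulo, which is usually why such an assertion exists.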
[jira] [Created] (HIVE-11605) Incorrect results with bucket map join in tez.
Vikram Dixit K created HIVE-11605:

Summary: Incorrect results with bucket map join in tez.
Key: HIVE-11605
URL: https://issues.apache.org/jira/browse/HIVE-11605
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.0, 1.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Critical
Attachments: HIVE-11605.1.patch

In some cases, we aggressively try to convert to a bucket map join and this ends up producing incorrect results.
[jira] [Created] (HIVE-11360) Bucketing doesn't work correctly with unpartitioned tables
Vikram Dixit K created HIVE-11360:

Summary: Bucketing doesn't work correctly with unpartitioned tables
Key: HIVE-11360
URL: https://issues.apache.org/jira/browse/HIVE-11360
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 1.2.0
Reporter: Vikram Dixit K

When we try to create bucket files with unpartitioned tables, enforce bucketing doesn't create the empty bucket files. [~prasanth_j] for your reference.
[jira] [Created] (HIVE-11355) Hive on tez: memory manager for sort buffers (input/output) and operators
Vikram Dixit K created HIVE-11355:

Summary: Hive on tez: memory manager for sort buffers (input/output) and operators
Key: HIVE-11355
URL: https://issues.apache.org/jira/browse/HIVE-11355
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 2.0.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K

We need to better manage the sort buffer allocations to ensure better performance. Also, we need to provide configurations to certain operators to stay within memory limits.
[jira] [Created] (HIVE-11356) SMB join on tez fails when one of the tables is empty
Vikram Dixit K created HIVE-11356: - Summary: SMB join on tez fails when one of the tables is empty Key: HIVE-11356 URL: https://issues.apache.org/jira/browse/HIVE-11356 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} :java.lang.IllegalStateException: Unexpected event. All physical sources already initialized at com.google.common.base.Preconditions.checkState(Preconditions.java:145) at org.apache.tez.mapreduce.input.MultiMRInput.handleEvents(MultiMRInput.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.handleEvent(LogicalIOProcessorRuntimeTask.java:610) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.access$1100(LogicalIOProcessorRuntimeTask.java:90) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$1.run(LogicalIOProcessorRuntimeTask.java:673) at java.lang.Thread.run(Thread.java:745) ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1437168420060_17787_1_01 [Map 4] killed/failed due to:null] Vertex killed, vertexName=Reducer 5, vertexId=vertex_1437168420060_17787_1_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1437168420060_17787_1_02 [Reducer 5] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:1 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask HQL-FAILED {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11292) MiniLlapCliDriver for running tests in llap
Vikram Dixit K created HIVE-11292: - Summary: MiniLlapCliDriver for running tests in llap Key: HIVE-11292 URL: https://issues.apache.org/jira/browse/HIVE-11292 Project: Hive Issue Type: Bug Components: Test Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Create MiniLlapCliDriver for running unit tests in llap mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-11027) Hive on tez: Bucket map joins fail when hashcode goes negative
Vikram Dixit K created HIVE-11027: - Summary: Hive on tez: Bucket map joins fail when hashcode goes negative Key: HIVE-11027 URL: https://issues.apache.org/jira/browse/HIVE-11027 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0 Reporter: Vikram Dixit K Assignee: Prasanth Jayachandran Seeing an issue when dynamic sort optimization is enabled while doing an insert into bucketed table. We seem to be flipping the negative sign on the hashcode instead of taking the complement of it for routing the data correctly. This results in correctness issues in bucket map joins in hive on tez when the hash code goes negative. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
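The routing mismatch described above can be illustrated with a minimal Java sketch. It assumes the reading side picks a bucket by masking off the sign bit of the hash code (the convention used by Hadoop's HashPartitioner), while the writing side negates negative hash codes. The class and method names are hypothetical and are not taken from the Hive code base.

```java
class BucketRouting {
    // Writer-side behaviour described in the report: flip the sign of a
    // negative hash code, then take the remainder.
    static int bucketByNegation(int hash, int numBuckets) {
        int h = hash < 0 ? -hash : hash;
        return h % numBuckets;
    }

    // Assumed reader-side behaviour: clear the sign bit, then take the
    // remainder. The result is always in [0, numBuckets).
    static int bucketByMask(int hash, int numBuckets) {
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        // The same negative hash code lands in different buckets.
        System.out.println(bucketByNegation(-7, 4)); // 3
        System.out.println(bucketByMask(-7, 4));     // 1

        // Negating Integer.MIN_VALUE overflows and stays negative,
        // so the negation variant can even yield a negative bucket.
        System.out.println(bucketByNegation(Integer.MIN_VALUE, 3)); // -2
        System.out.println(bucketByMask(Integer.MIN_VALUE, 3));     // 0
    }
}
```

When the two sides disagree like this, rows with negative hash codes are written to one bucket file and looked up in another, which is consistent with the incorrect bucket map join results reported here.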
[jira] [Created] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
Vikram Dixit K created HIVE-10929: - Summary: In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values Key: HIVE-10929 URL: https://issues.apache.org/jira/browse/HIVE-10929 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} create table dummy(i int); insert into table dummy values (1); select * from dummy; create table partunion1(id1 int) partitioned by (part1 string); set hive.exec.dynamic.partition.mode=nonstrict; set hive.execution.engine=tez; explain insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; select * from partunion1; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins
Vikram Dixit K created HIVE-10907: - Summary: Hive on Tez: Classcast exception in some cases with SMB joins Key: HIVE-10907 URL: https://issues.apache.org/jira/browse/HIVE-10907 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K In cases where there is a mix of map-side work and reduce-side work, we get a ClassCastException because the code assumes homogeneity. We need to fix this correctly. For now this is a workaround. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10908) Hive on tez: SMB join needs to work with different type of work items (map side with reduce side)
Vikram Dixit K created HIVE-10908: - Summary: Hive on tez: SMB join needs to work with different type of work items (map side with reduce side) Key: HIVE-10908 URL: https://issues.apache.org/jira/browse/HIVE-10908 Project: Hive Issue Type: Improvement Components: Tez Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K This is related to HIVE-10907. This is going to be the actual enhancement/fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10742) rename_table_location.q test fails
Vikram Dixit K created HIVE-10742: - Summary: rename_table_location.q test fails Key: HIVE-10742 URL: https://issues.apache.org/jira/browse/HIVE-10742 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.2.0, 1.3.0 Reporter: Vikram Dixit K Assignee: Sushanth Sowmyan The test rename_table_location.q fails all the time but is not being caught by the HiveQA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10719) Hive metastore failure when alter table rename is attempted.
Vikram Dixit K created HIVE-10719: - Summary: Hive metastore failure when alter table rename is attempted. Key: HIVE-10719 URL: https://issues.apache.org/jira/browse/HIVE-10719 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K {code} create database newDB location '/tmp/'; describe database extended newDB; use newDB; create table tab (name string); alter table tab rename to newName; {code} Fails: {code} InvalidOperationException(message:Unable to access old location hdfs://localhost:8020/tmp/tab for table x.tab) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10647) Hive on LLAP: Limit HS2 from overwhelming LLAP
Vikram Dixit K created HIVE-10647: - Summary: Hive on LLAP: Limit HS2 from overwhelming LLAP Key: HIVE-10647 URL: https://issues.apache.org/jira/browse/HIVE-10647 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10611) Mini tez tests wait for 5 minutes before shutting down
Vikram Dixit K created HIVE-10611: - Summary: Mini tez tests wait for 5 minutes before shutting down Key: HIVE-10611 URL: https://issues.apache.org/jira/browse/HIVE-10611 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Currently, at shutdown, the tez mini cluster waits for the session to close before shutting down the cluster. This ends up being 5 minutes - the default value. We can shut down the session to alleviate this situation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10542) Full outer joins in tez produce incorrect results in certain cases
Vikram Dixit K created HIVE-10542: - Summary: Full outer joins in tez produce incorrect results in certain cases Key: HIVE-10542 URL: https://issues.apache.org/jira/browse/HIVE-10542 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Blocker If there are no records for one of the tables in the full outer join, we do not read the other input and end up not producing rows that we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
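The expected behaviour can be sketched in a few lines of Java. This is only an illustration of full outer join semantics under simplifying assumptions (unique integer keys, string values, in-memory inputs); the class and method names are hypothetical and do not reflect how the Tez merge join operator is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FullOuterJoinSketch {
    // Joins two keyed inputs; a key present on only one side is still
    // emitted, with null standing in for the missing side.
    static List<String> fullOuterJoin(Map<Integer, String> left,
                                      Map<Integer, String> right) {
        TreeMap<Integer, String[]> merged = new TreeMap<>();
        for (Map.Entry<Integer, String> e : left.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new String[2])[0] = e.getValue();
        }
        for (Map.Entry<Integer, String> e : right.entrySet()) {
            merged.computeIfAbsent(e.getKey(), k -> new String[2])[1] = e.getValue();
        }
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, String[]> e : merged.entrySet()) {
            rows.add(e.getKey() + "," + e.getValue()[0] + "," + e.getValue()[1]);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Integer, String> empty = new TreeMap<>();
        Map<Integer, String> right = new TreeMap<>();
        right.put(1, "a");
        right.put(2, "b");
        // Even with an empty left input, both right rows must appear.
        System.out.println(fullOuterJoin(empty, right)); // [1,null,a, 2,null,b]
    }
}
```

The point of the sketch is the second loop: the right input is always read, regardless of whether the left input produced any rows.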
[jira] [Created] (HIVE-10323) Tez merge join operator does not honor hive.join.emit.interval
Vikram Dixit K created HIVE-10323: - Summary: Tez merge join operator does not honor hive.join.emit.interval Key: HIVE-10323 URL: https://issues.apache.org/jira/browse/HIVE-10323 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K This affects efficiency in case of skews. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10273) Union with partition tables which have no data fails with NPE
Vikram Dixit K created HIVE-10273: - Summary: Union with partition tables which have no data fails with NPE Key: HIVE-10273 URL: https://issues.apache.org/jira/browse/HIVE-10273 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10233) Hive on LLAP: Memory manager
Vikram Dixit K created HIVE-10233: - Summary: Hive on LLAP: Memory manager Key: HIVE-10233 URL: https://issues.apache.org/jira/browse/HIVE-10233 Project: Hive Issue Type: Bug Components: Tez Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K We need a memory manager in llap/tez to manage the usage of memory across threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10232) Map join in tez needs to account for memory limits due to other map join operators possible in the same work
Vikram Dixit K created HIVE-10232: - Summary: Map join in tez needs to account for memory limits due to other map join operators possible in the same work Key: HIVE-10232 URL: https://issues.apache.org/jira/browse/HIVE-10232 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K There seems to be a regression with respect to MR in terms of allowing multiple map joins in the same task by not accounting for the memory consumed in each of the joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10201) Hive LLAP needs refactoring of the configuration class
Vikram Dixit K created HIVE-10201: - Summary: Hive LLAP needs refactoring of the configuration class Key: HIVE-10201 URL: https://issues.apache.org/jira/browse/HIVE-10201 Project: Hive Issue Type: Bug Affects Versions: llap Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: llap In order for the client to take decisions regarding resource requirement and availability, we need to move the configuration class to llap-client. In the future, we will need to get the configurations from a service such as zookeeper to keep in sync with what is actually deployed on the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-10001) SMB join in reduce side
Vikram Dixit K created HIVE-10001: - Summary: SMB join in reduce side Key: HIVE-10001 URL: https://issues.apache.org/jira/browse/HIVE-10001 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9886) Hive on tez: NPE when converting join to SMB in sub-query
Vikram Dixit K created HIVE-9886: Summary: Hive on tez: NPE when converting join to SMB in sub-query Key: HIVE-9886 URL: https://issues.apache.org/jira/browse/HIVE-9886 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0, 1.1.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical {code} set hive.auto.convert.sortmerge.join = true; create table t1( id string, od string); create table t2( id string, od string); select vt1.id from (select rt1.id from (select t1.id, row_number() over (partition by id order by od desc) as row_no from t1) rt1 where rt1.row_no=1) vt1 join (select rt2.id from (select t2.id, row_number() over (partition by id order by od desc) as row_no from t2) rt2 where rt2.row_no=1) vt2 where vt1.id=vt2.id; {code} throws NPE: {code} at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:146) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:744) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.getValueObjectInspectors(AbstractMapJoinOperator.java:96) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.getJoinOutputObjectInspector(CommonJoinOperator.java:167) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.initializeOp(CommonJoinOperator.java:310) at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:72) at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.initializeOp(CommonMergeJoinOperator.java:89) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:66) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.Operator.initializeOp(Operator.java:410) at org.apache.hadoop.hive.ql.exec.PTFOperator.initializeOp(PTFOperator.java:89) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425) at org.apache.hadoop.hive.ql.exec.ExtractOperator.initializeOp(ExtractOperator.java:40) at 
org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:116) ... 14 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9832) Merge join followed by union and a map join in hive on tez fails.
Vikram Dixit K created HIVE-9832: Summary: Merge join followed by union and a map join in hive on tez fails. Key: HIVE-9832 URL: https://issues.apache.org/jira/browse/HIVE-9832 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Priority: Critical {code} select a.key, b.value from (select x.key as key, y.value as value from srcpart x join srcpart y on (x.key = y.key) union all select key, value from srcpart z) a join src b on (a.value = b.value); {code} {code} TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:186) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators: null at 
org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:214) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) ... 13 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:317) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:196) ... 14 more ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1425055721029_0048_4_09 [Reducer 5] killed/failed due to:null] Vertex killed, vertexName=Reducer 7, vertexId=vertex_1425055721029_0048_4_11, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_11 [Reducer 7] killed/failed due to:null] Vertex killed, vertexName=Reducer 4, vertexId=vertex_1425055721029_0048_4_07, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1425055721029_0048_4_07 [Reducer 4] killed/failed due to:null] DAG failed due to vertex failure. failedVertices:1 killedVertices:2 FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9836) Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns)
Vikram Dixit K created HIVE-9836: Summary: Hive on tez: fails when virtual columns are present in the join conditions (for e.g. partition columns) Key: HIVE-9836 URL: https://issues.apache.org/jira/browse/HIVE-9836 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.0.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-9836.1.patch {code} explain select a.key, a.value, b.value from tab a join tab_part b on a.key = b.key and a.ds = b.ds; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9683) Hive metastore thrift client connections hang indefinitely
[ https://issues.apache.org/jira/browse/HIVE-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320492#comment-14320492 ] Vikram Dixit K commented on HIVE-9683: -- +1 for 1.0 branch. Hive metastore thrift client connections hang indefinitely -- Key: HIVE-9683 URL: https://issues.apache.org/jira/browse/HIVE-9683 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 1.0.0, 1.0.1 Reporter: Gopal V Assignee: Gopal V Priority: Minor Fix For: 1.0.1 Attachments: HIVE-9683.1.patch THRIFT-2788 fixed network-partition problems that affect Thrift client connections. Since hive-1.0 is on thrift-0.9.0 which is affected by the bug, a workaround can be applied to prevent indefinite connection hangs during net-splits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Affects Version/s: 1.0.0 Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Fix For: 1.2.0 Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~apivovarov]! Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9523) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied
[ https://issues.apache.org/jira/browse/HIVE-9523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9523: - Labels: gsoc2015 (was: ) when columns on which tables are partitioned are used in the join condition same join optimizations as for bucketed tables should be applied Key: HIVE-9523 URL: https://issues.apache.org/jira/browse/HIVE-9523 Project: Hive Issue Type: Improvement Components: Logical Optimizer, Physical Optimizer, SQL Affects Versions: 0.13.0, 0.14.0, 0.13.1 Reporter: Maciek Kocon Labels: gsoc2015 For JOIN conditions where partitioning criteria are used respectively: ⋮ FROM TabA JOIN TabB ON TabA.partCol1 = TabB.partCol2 AND TabA.partCol2 = TabB.partCol2 the optimizer could/should choose to treat it the same way as with bucketed tables: ⋮ FROM TabC JOIN TabD ON TabC.clusteredByCol1 = TabD.clusteredByCol2 AND TabC.clusteredByCol2 = TabD.clusteredByCol2 and use either Bucket Map Join or better, the Sort Merge Bucket Map Join. This is based on fact that same way as buckets translate to separate files, the partitions essentially provide the same mapping. When data locality is known the optimizer could focus only on joining corresponding partitions rather than whole data sets. #side notes: ⦿ Currently Table DDL Syntax where Partitioning and Bucketing defined at the same time is allowed: CREATE TABLE ⋮ PARTITIONED BY(…) CLUSTERED BY(…) INTO … BUCKETS; But in this case optimizer never chooses to use Bucket Map Join or Sort Merge Bucket Map Join which defeats the purpose of creating BUCKETed tables in such scenarios. Should that be raised as a separate BUG? ⦿ Currently partitioning and bucketing are two separate things but serve same purpose - shouldn't the concept be merged (explicit/implicit partitions?) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6069: - Fix Version/s: 1.2.0 Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 1.0.0 Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Fix For: 1.2.0 Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9687) Blink DB style approximate querying in hive
Vikram Dixit K created HIVE-9687: Summary: Blink DB style approximate querying in hive Key: HIVE-9687 URL: https://issues.apache.org/jira/browse/HIVE-9687 Project: Hive Issue Type: New Feature Reporter: Vikram Dixit K http://www.cs.berkeley.edu/~sameerag/blinkdb_eurosys13.pdf There are various pieces here that need to be thought through and implemented, for example offline sampling, a run-time sampling selection module, etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6069) Improve error message in GenericUDFRound
[ https://issues.apache.org/jira/browse/HIVE-6069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316923#comment-14316923 ] Vikram Dixit K commented on HIVE-6069: -- +1 LGTM. I will commit this shortly. Improve error message in GenericUDFRound Key: HIVE-6069 URL: https://issues.apache.org/jira/browse/HIVE-6069 Project: Hive Issue Type: Bug Components: UDF Reporter: Xuefu Zhang Assignee: Alexander Pivovarov Priority: Trivial Attachments: HIVE-6069.1.patch Suggested in HIVE-6039 review board. https://reviews.apache.org/r/16329/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297959#comment-14297959 ] Vikram Dixit K commented on HIVE-9436: -- Committed to RC for 1.0. RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.0.0, 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code: {code} } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) { caughtException = (MetaException) e.getCause(); } else { throw e.getCause(); } {code} The bug here is that Java's String.matches matches the entire string against the regex, and thus the match will fail if the message contains anything before or after JDO[a-zA-Z]*Exception. The solution, however, is very simple: we should match (?s).*JDO[a-zA-Z]*Exception.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
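The whole-string behaviour of String.matches is easy to check in isolation. The following is a minimal sketch; the sample exception message is made up for illustration and is not taken from a real metastore log, and the class name is hypothetical.

```java
class JdoMessageMatch {
    // Pattern as used in the buggy check: must match the entire message.
    static final String ANCHORED = "JDO[a-zA-Z]*Exception";
    // Pattern proposed in the issue: allow any text, including newlines,
    // before and after the exception name.
    static final String UNANCHORED = "(?s).*JDO[a-zA-Z]*Exception.*";

    static boolean matchesAnchored(String message) {
        return message.matches(ANCHORED);
    }

    static boolean matchesUnanchored(String message) {
        return message.matches(UNANCHORED);
    }

    public static void main(String[] args) {
        // Hypothetical multi-line message for illustration only.
        String message = "javax.jdo.JDODataStoreException: connection lost\n"
                + "NestedThrowables: java.sql.SQLException";

        System.out.println(matchesAnchored(message));   // false
        System.out.println(matchesUnanchored(message)); // true
        // Without (?s), '.' does not match the newline, so this also fails.
        System.out.println(message.matches(".*JDO[a-zA-Z]*Exception.*")); // false
    }
}
```

An alternative with the same effect would be `Pattern.compile("JDO[a-zA-Z]*Exception").matcher(message).find()`, which searches for the pattern anywhere in the input instead of requiring a full match.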
[jira] [Commented] (HIVE-9473) sql std auth should disallow built-in udfs that allow any java methods to be called
[ https://issues.apache.org/jira/browse/HIVE-9473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297920#comment-14297920 ] Vikram Dixit K commented on HIVE-9473: -- +1 for 1.0.0 sql std auth should disallow built-in udfs that allow any java methods to be called --- Key: HIVE-9473 URL: https://issues.apache.org/jira/browse/HIVE-9473 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-9473.1.patch As mentioned in HIVE-8893, some udfs can be used to execute arbitrary java methods. This should be disallowed when sql standard authorization is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9436) RetryingMetaStoreClient does not retry JDOExceptions
[ https://issues.apache.org/jira/browse/HIVE-9436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9436: - Fix Version/s: 1.0.0 RetryingMetaStoreClient does not retry JDOExceptions Key: HIVE-9436 URL: https://issues.apache.org/jira/browse/HIVE-9436 Project: Hive Issue Type: Bug Affects Versions: 0.14.0, 0.13.1 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 1.0.0, 1.2.0 Attachments: HIVE-9436.2.patch, HIVE-9436.3.patch, HIVE-9436.patch RetryingMetaStoreClient has a bug in the following bit of code: {code} } else if ((e.getCause() instanceof MetaException) && e.getCause().getMessage().matches("JDO[a-zA-Z]*Exception")) { caughtException = (MetaException) e.getCause(); } else { throw e.getCause(); } {code} The bug here is that Java's String.matches matches the entire string against the regex, and thus the match will fail if the message contains anything before or after JDO[a-zA-Z]*Exception. The solution, however, is very simple: we should match (?s).*JDO[a-zA-Z]*Exception.* -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9514) schematool is broken in hive 1.0.0
[ https://issues.apache.org/jira/browse/HIVE-9514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14297860#comment-14297860 ] Vikram Dixit K commented on HIVE-9514: -- +1 LGTM. schematool is broken in hive 1.0.0 -- Key: HIVE-9514 URL: https://issues.apache.org/jira/browse/HIVE-9514 Project: Hive Issue Type: Bug Components: Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 1.0.0 Attachments: HIVE-9514.1.patch Schematool gives the following error: {code} bin/schematool -dbType derby -initSchema Starting metastore schema initialization to 1.0 org.apache.hadoop.hive.metastore.HiveMetaException: Unknown version specified for initialization: 1.0 {code} The metastore schema hasn't changed from 0.14.0 to 1.0.0, so there is no need for new .sql files for 1.0.0. However, schematool needs to be made aware of the metastore schema equivalence. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
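One way to express the schema equivalence mentioned above is a small lookup table that maps a Hive release to the schema version whose scripts it reuses. The sketch below is an assumption about the general shape of such a fix, not the content of the attached patch; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class SchemaVersionLookup {
    // Hypothetical table: Hive release -> metastore schema version whose
    // .sql scripts should be used for that release.
    private static final Map<String, String> EQUIVALENT = new HashMap<>();
    static {
        // Per the issue text, the schema did not change between these two.
        EQUIVALENT.put("1.0.0", "0.14.0");
    }

    // Returns the schema version to initialize for a given Hive version,
    // falling back to the Hive version itself when no mapping exists.
    static String schemaVersionFor(String hiveVersion) {
        return EQUIVALENT.getOrDefault(hiveVersion, hiveVersion);
    }

    public static void main(String[] args) {
        System.out.println(schemaVersionFor("1.0.0"));  // 0.14.0
        System.out.println(schemaVersionFor("0.14.0")); // 0.14.0
    }
}
```

With a lookup like this, the initialization step would resolve the script version before searching for .sql files, so a release with an unchanged schema would no longer fail with an unknown-version error.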
[jira] [Updated] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-8807: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch 1.0 Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Assignee: Eugene Koifman Fix For: 1.0.0 Attachments: HIVE8807.patch The defaults for templeton.pig.path and templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). no precommit tests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8807) Obsolete default values in webhcat-default.xml
[ https://issues.apache.org/jira/browse/HIVE-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295829#comment-14295829 ] Vikram Dixit K commented on HIVE-8807: -- If I end up rolling out a new release and we have a patch for this by then, I will include this in the next roll-out. Thanks Vikram. Obsolete default values in webhcat-default.xml -- Key: HIVE-8807 URL: https://issues.apache.org/jira/browse/HIVE-8807 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0, 0.13.0, 0.14.0 Reporter: Lefty Leverenz Fix For: 0.14.1 The defaults for templeton.pig.path and templeton.hive.path are 0.11 in webhcat-default.xml but they ought to match current release numbers. The Pig version is 0.12.0 for Hive 0.14 RC0 (as shown in pom.xml). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Fix Version/s: 1.0.0 Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Fix For: 1.0.0 Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: * {{auto_join21.q}} * {{auto_join29.q}} * {{auto_join30.q}} * {{auto_join_filters.q}} * {{auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9141) HiveOnTez: mix of union all, distinct, group by generates error
[ https://issues.apache.org/jira/browse/HIVE-9141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9141: - Fix Version/s: 1.0.0 HiveOnTez: mix of union all, distinct, group by generates error --- Key: HIVE-9141 URL: https://issues.apache.org/jira/browse/HIVE-9141 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.15.0 Reporter: Pengcheng Xiong Assignee: Navis Fix For: 0.15.0, 1.0.0 Attachments: HIVE-9141.1.patch.txt
Here is the way to produce it: in Hive q test setting (with src table)
set hive.execution.engine=tez;
SELECT key, value FROM
  (
    SELECT key, value FROM src
    UNION ALL
    SELECT key, key as value FROM
      (
        SELECT distinct key FROM
          (
            SELECT key, value FROM
              (SELECT key, value FROM src
               UNION ALL
               SELECT key, value FROM src
              )t1
            group by key, value
          )t2
      )t3
  )t4
group by key, value;
will generate
2014-12-16 23:19:13,593 ERROR ql.Driver (SessionState.java:printError(834)) - FAILED: ClassCastException org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.MapWork cannot be cast to org.apache.hadoop.hive.ql.plan.ReduceWork
  at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:361)
  at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
  at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:87)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:103)
  at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:69)
  at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:368)
  at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:202)
  at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
  at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:224)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:419)
  at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:305)
  at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1107)
  at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1155)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1044)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1034)
  at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:206)
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:158)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:369)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:304)
  at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:834)
  at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:136)
  at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_uniontez2(TestMiniTezCliDriver.java:120)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result
[ https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9053: - Fix Version/s: 1.0.0 select constant in union all followed by group by gives wrong result Key: HIVE-9053 URL: https://issues.apache.org/jira/browse/HIVE-9053 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 0.15.0, 0.14.1, 1.0.0 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, HIVE-9053.03.patch, HIVE-9053.04.patch, HIVE-9053.patch-branch-1.0 Here is the way to reproduce with a q test: select key from (select '1' as key from src union all select key from src)tab group by key; will give OK NULL 1 This is not correct, as src contains many other keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9053) select constant in union all followed by group by gives wrong result
[ https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9053: - Fix Version/s: 0.14.1 select constant in union all followed by group by gives wrong result Key: HIVE-9053 URL: https://issues.apache.org/jira/browse/HIVE-9053 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 0.15.0, 0.14.1, 1.0.0 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, HIVE-9053.03.patch, HIVE-9053.04.patch, HIVE-9053.patch-branch-1.0 Here is the way to reproduce with a q test: select key from (select '1' as key from src union all select key from src)tab group by key; will give OK NULL 1 This is not correct, as src contains many other keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9359) Export of a large table causes OOM in Metastore and Client
[ https://issues.apache.org/jira/browse/HIVE-9359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14292686#comment-14292686 ] Vikram Dixit K commented on HIVE-9359: -- +1 for 1.0 Export of a large table causes OOM in Metastore and Client -- Key: HIVE-9359 URL: https://issues.apache.org/jira/browse/HIVE-9359 Project: Hive Issue Type: Bug Components: Import/Export, Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.15.0 Attachments: HIVE-9359.2.patch, HIVE-9359.patch Running hive export on a table with a large number of partitions makes the metastore and client run out of memory. Copies of the entire partitions object end up in the following places: Metastore * (temporarily) Metastore MPartition objects * List<Partition> that gets persisted before sending to thrift * thrift copy of all of those partitions Client side * thrift copy of partitions * deepcopy of above to create List<Partition> objects * JSONObject that contains all of those above partition objects * List<ReadEntity>, each of which encapsulates the aforesaid partition objects. This memory usage needs to be drastically reduced. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9235) Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289810#comment-14289810 ] Vikram Dixit K commented on HIVE-9235: -- Committed to trunk and branches. Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR - Key: HIVE-9235 URL: https://issues.apache.org/jira/browse/HIVE-9235 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9235.01.patch, HIVE-9235.02.patch Title was: Make Parquet Vectorization of these data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR Support for doing vector column assign is missing for some data types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9235) Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9235: - Resolution: Fixed Status: Resolved (was: Patch Available) Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR - Key: HIVE-9235 URL: https://issues.apache.org/jira/browse/HIVE-9235 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9235.01.patch, HIVE-9235.02.patch Title was: Make Parquet Vectorization of these data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR Support for doing vector column assign is missing for some data types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9404) NPE in org.apache.hadoop.hive.metastore.txn.TxnHandler.determineDatabaseProduct()
[ https://issues.apache.org/jira/browse/HIVE-9404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289830#comment-14289830 ] Vikram Dixit K commented on HIVE-9404: -- [~ekoifman] Does this need to be ported to branch 1.0? NPE in org.apache.hadoop.hive.metastore.txn.TxnHandler.determineDatabaseProduct() - Key: HIVE-9404 URL: https://issues.apache.org/jira/browse/HIVE-9404 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.15.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.15.0 Attachments: HIVE-9404.patch {noformat} Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.metastore.txn.TxnHandler.determineDatabaseProduct(TxnHandler.java:1015) at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkRetryable(TxnHandler.java:906) at org.apache.hadoop.hive.metastore.txn.TxnHandler.getOpenTxns(TxnHandler.java:238) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_open_txns(HiveMetaStore.java:5321) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7270) SerDe Properties are not considered by show create table Command
[ https://issues.apache.org/jira/browse/HIVE-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290142#comment-14290142 ] Vikram Dixit K commented on HIVE-7270: -- +1. Go ahead with the branch commits. SerDe Properties are not considered by show create table Command Key: HIVE-7270 URL: https://issues.apache.org/jira/browse/HIVE-7270 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.13.1 Reporter: R J Assignee: Navis Priority: Minor Fix For: 0.15.0 Attachments: HIVE-7270.1.patch.txt, HIVE-7270.2.patch.txt, HIVE-7270.3.patch.txt The Hive table DDL generated by the show create table target_table command does not contain the SerDe properties of the target table, even though the table has specific SerDe properties. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8966) Delta files created by hive hcatalog streaming cannot be compacted
[ https://issues.apache.org/jira/browse/HIVE-8966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286267#comment-14286267 ] Vikram Dixit K commented on HIVE-8966: -- +1 for branch 1.0. Delta files created by hive hcatalog streaming cannot be compacted -- Key: HIVE-8966 URL: https://issues.apache.org/jira/browse/HIVE-8966 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.14.0 Environment: hive Reporter: Jihong Liu Assignee: Alan Gates Priority: Critical Fix For: 0.14.1 Attachments: HIVE-8966.2.patch, HIVE-8966.3.patch, HIVE-8966.4.patch, HIVE-8966.5.patch, HIVE-8966.patch Hive HCatalog streaming also creates a file like bucket_n_flush_length in each delta directory, where n is the bucket number. compactor.CompactorMR thinks this file also needs to be compacted. However, this file cannot be compacted, so compactor.CompactorMR does not continue with the compaction. In a test, after the bucket_n_flush_length file was removed, the alter table partition compact finished successfully. If that file is not deleted, nothing is compacted. This is probably a very severe bug. Both 0.13 and 0.14 have this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8485) HMS on Oracle incompatibility
[ https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286251#comment-14286251 ] Vikram Dixit K commented on HIVE-8485: -- [~sushanth] can this be committed to branch 1.0 when ready instead of branch 0.14? HMS on Oracle incompatibility - Key: HIVE-8485 URL: https://issues.apache.org/jira/browse/HIVE-8485 Project: Hive Issue Type: Bug Components: Metastore Environment: Oracle as metastore DB Reporter: Ryan Pridgeon Assignee: Chaoyu Tang Attachments: HIVE-8485.2.patch, HIVE-8485.patch Oracle does not distinguish between empty strings and NULL, which proves problematic for DataNucleus. If a user creates a table with some property stored as an empty string, the table will no longer be accessible, e.g. TBLPROPERTIES ('serialization.null.format'=''). If they try to select, describe, drop, etc., the client prints the following exception: ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found table name The workaround for this was to go into the hive metastore on the Oracle database and replace NULL with some other string. Users could then drop the tables or alter their data to use the new null format they just set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
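The storage behaviour behind HIVE-8485 can be illustrated with a small simulation. This is a minimal Python sketch that only models the one fact stated in the issue (an empty string comes back from Oracle as NULL); it does not reproduce DataNucleus or the metastore code, and the function names `oracle_like_store` and `round_trip` are hypothetical.

```python
from typing import Dict, Optional


def oracle_like_store(value: Optional[str]) -> Optional[str]:
    """Mimic the behaviour described in the issue: '' is stored as NULL."""
    return None if value == "" else value


def round_trip(params: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Write table parameters to the simulated store and read them back."""
    return {key: oracle_like_store(value) for key, value in params.items()}


written = {"serialization.null.format": ""}
read_back = round_trip(written)

# The value that went in as '' comes back as None (NULL), so the stored
# metadata no longer matches what the client wrote.
print(read_back)  # {'serialization.null.format': None}
```

Any layer that expects the string it wrote, rather than NULL, has to handle this case explicitly when the backing database is Oracle.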
[jira] [Commented] (HIVE-9053) select constant in union all followed by group by gives wrong result
[ https://issues.apache.org/jira/browse/HIVE-9053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286257#comment-14286257 ] Vikram Dixit K commented on HIVE-9053: -- [~pxiong] can you create a patch based on branch 1.0 instead of branch 0.14? select constant in union all followed by group by gives wrong result Key: HIVE-9053 URL: https://issues.apache.org/jira/browse/HIVE-9053 Project: Hive Issue Type: Bug Affects Versions: 0.13.0, 0.14.0 Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Fix For: 0.15.0 Attachments: HIVE-9053.01.patch, HIVE-9053.02.patch, HIVE-9053.03.patch, HIVE-9053.04.patch Here is the way to reproduce with a q test: select key from (select '1' as key from src union all select key from src)tab group by key; will give OK NULL 1 This is not correct, as src contains many other keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9235) Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286534#comment-14286534 ] Vikram Dixit K commented on HIVE-9235: -- +1 for branch 1.0 as well. Turn off Parquet Vectorization until all data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR - Key: HIVE-9235 URL: https://issues.apache.org/jira/browse/HIVE-9235 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Attachments: HIVE-9235.01.patch, HIVE-9235.02.patch Title was: Make Parquet Vectorization of these data types work: DECIMAL, DATE, TIMESTAMP, CHAR, and VARCHAR Support for doing vector column assign is missing for some data types. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8890) HiveServer2 dynamic service discovery: use persistent ephemeral nodes curator recipe
[ https://issues.apache.org/jira/browse/HIVE-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286262#comment-14286262 ] Vikram Dixit K commented on HIVE-8890: -- [~vgumashta] can you commit this to branch 1.0 once this has been reviewed. HiveServer2 dynamic service discovery: use persistent ephemeral nodes curator recipe Key: HIVE-8890 URL: https://issues.apache.org/jira/browse/HIVE-8890 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.1 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.14.1 Attachments: HIVE-8890.1.patch, HIVE-8890.2.patch Using this recipe gives better reliability. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9414) Fixup post HIVE-9264 - Merge encryption branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14284635#comment-14284635 ] Vikram Dixit K commented on HIVE-9414: -- Yeah. Those failures are unrelated. Fixup post HIVE-9264 - Merge encryption branch to trunk --- Key: HIVE-9414 URL: https://issues.apache.org/jira/browse/HIVE-9414 Project: Hive Issue Type: Bug Affects Versions: 0.15.0 Reporter: Brock Noland Assignee: Vikram Dixit K Attachments: HIVE-9414.1.patch.txt See https://issues.apache.org/jira/browse/HIVE-9264?focusedCommentId=14283223page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14283223 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9264) Merge encryption branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9264: - Attachment: HIVE-9264.addendum.patch Portion of patch of HIVE-9038 that was not merged correctly with the committed patch for this jira. Merge encryption branch to trunk Key: HIVE-9264 URL: https://issues.apache.org/jira/browse/HIVE-9264 Project: Hive Issue Type: Sub-task Affects Versions: 0.15.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.15.0 Attachments: HIVE-9264.1.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.3.patch, HIVE-9264.3.patch, HIVE-9264.3.patch, HIVE-9264.addendum.patch The team working on the encryption branch would like to merge their work to trunk. This jira will track that effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8485) HMS on Oracle incompatibility
[ https://issues.apache.org/jira/browse/HIVE-8485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283335#comment-14283335 ] Vikram Dixit K commented on HIVE-8485: -- +1 for branch 0.14 HMS on Oracle incompatibility - Key: HIVE-8485 URL: https://issues.apache.org/jira/browse/HIVE-8485 Project: Hive Issue Type: Bug Components: Metastore Environment: Oracle as metastore DB Reporter: Ryan Pridgeon Assignee: Chaoyu Tang Attachments: HIVE-8485.2.patch, HIVE-8485.patch Oracle does not distinguish between empty strings and NULL, which proves problematic for DataNucleus. If a user creates a table with some property stored as an empty string, the table will no longer be accessible, e.g. TBLPROPERTIES ('serialization.null.format'=''). If they try to select, describe, drop, etc., the client prints the following exception: ERROR ql.Driver: FAILED: SemanticException [Error 10001]: Table not found table name The workaround for this was to go into the hive metastore on the Oracle database and replace NULL with some other string. Users could then drop the tables or alter their data to use the new null format they just set. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9112) Query may generate different results depending on the number of reducers
[ https://issues.apache.org/jira/browse/HIVE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283314#comment-14283314 ] Vikram Dixit K commented on HIVE-9112: -- +1 for 0.14 Query may generate different results depending on the number of reducers Key: HIVE-9112 URL: https://issues.apache.org/jira/browse/HIVE-9112 Project: Hive Issue Type: Bug Components: Logical Optimizer Affects Versions: 0.14.0 Reporter: Chao Assignee: Ted Xu Fix For: 0.15.0 Attachments: HIVE-9112-0.14.1-branch.patch, HIVE-9112.1.patch, HIVE-9112.2.patch, HIVE-9112.patch Some queries may generate different results depending on the number of reducers, for example, tests like ppd_multi_insert.q, join_nullsafe.q, subquery_in.q, etc. Take subquery_in.q as example, if we add {noformat} set mapred.reduce.tasks=3; {noformat} to this test file, the result will be different (and wrong): {noformat} @@ -903,5 +903,3 @@ where li.l_linenumber = 1 and POSTHOOK: type: QUERY POSTHOOK: Input: default@lineitem A masked pattern was here -108570 8571 -4297 1798 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9264) Merge encryption branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283211#comment-14283211 ] Vikram Dixit K commented on HIVE-9264: -- [~brocknoland] It looks like this commit removed some portions of the patch from HIVE-9038. Can you please look into it. Merge encryption branch to trunk Key: HIVE-9264 URL: https://issues.apache.org/jira/browse/HIVE-9264 Project: Hive Issue Type: Sub-task Affects Versions: 0.15.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.15.0 Attachments: HIVE-9264.1.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.3.patch, HIVE-9264.3.patch, HIVE-9264.3.patch The team working on the encryption branch would like to merge their work to trunk. This jira will track that effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Resolution: Fixed Status: Resolved (was: Patch Available) Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: * {{auto_join21.q}} * {{auto_join29.q}} * {{auto_join30.q}} * {{auto_join_filters.q}} * {{auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9264) Merge encryption branch to trunk
[ https://issues.apache.org/jira/browse/HIVE-9264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14283226#comment-14283226 ] Vikram Dixit K commented on HIVE-9264: -- Added an addendum for the missing portions. [~brocknoland] please take a look. Merge encryption branch to trunk Key: HIVE-9264 URL: https://issues.apache.org/jira/browse/HIVE-9264 Project: Hive Issue Type: Sub-task Affects Versions: 0.15.0 Reporter: Brock Noland Assignee: Brock Noland Fix For: 0.15.0 Attachments: HIVE-9264.1.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.2.patch, HIVE-9264.3.patch, HIVE-9264.3.patch, HIVE-9264.3.patch, HIVE-9264.addendum.patch The team working on the encryption branch would like to merge their work to trunk. This jira will track that effort. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14280904#comment-14280904 ] Vikram Dixit K commented on HIVE-9038: -- Test failure is unrelated. The same exception occurs without this change. Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: * {{auto_join21.q}} * {{auto_join29.q}} * {{auto_join30.q}} * {{auto_join_filters.q}} * {{auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Attachment: HIVE-9038.2.patch Address Sergey's comments. Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: {{auto_join21.q, auto_join29.q, auto_join30.q, auto_join_filters.q, auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-9380) Hive on tez: Needs filterTag propagation to produce efficient outer join plans
Vikram Dixit K created HIVE-9380: Summary: Hive on tez: Needs filterTag propagation to produce efficient outer join plans Key: HIVE-9380 URL: https://issues.apache.org/jira/browse/HIVE-9380 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.15.0 Reporter: Vikram Dixit K HIVE-9038 brought up the issue of a missing filter tag in Tez for certain types of queries. This is the jira where we want to fix this the right way. My suggestion is to add a select operator as a parent of the RS operator in the case of map joins; it would add a filter tag when the table has filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9223) HiveServer2 on Tez doesn't support concurrent queries within one session
[ https://issues.apache.org/jira/browse/HIVE-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14277829#comment-14277829 ] Vikram Dixit K commented on HIVE-9223: -- [~pala] I see how this can happen now. In Tez, only one DAG is allowed to run at a time. MR avoids this issue by simply launching multiple jobs when such a parallel submission is done. However, this change is not easy to make in Tez right now, and we need a design to support concurrency at this level. I am exploring some options with the Tez folks and will get back to you on this. HiveServer2 on Tez doesn't support concurrent queries within one session Key: HIVE-9223 URL: https://issues.apache.org/jira/browse/HIVE-9223 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Pala M Muthaia When a user submits multiple queries in the same HS2 session (using the thrift interface) concurrently, the queries go through the same TezSessionState and end up being submitted to the same Tez AM, and the second query fails with the error "App master already running a DAG". Is this by design? I looked into the code, and the comments as well as the code suggest support only for serial execution of queries within the same HiveServer2 session (on Tez). This works for the CLI environment, but in a server it is plausible that a client sends multiple concurrent queries under the same session (e.g. a web app that executes queries for users, such as Cloudera Hue). So shouldn't the HS2 on Tez implementation support concurrent queries? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
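The failure mode discussed in HIVE-9223 can be modelled in a few lines. The following Python sketch is a toy model, not Hive or Tez code: `SingleDagSession` stands in for a session whose app master runs one DAG at a time, and `SerializedSession` shows one conceivable mitigation (queueing submissions behind a lock). Both class names are hypothetical, and the locking approach is an assumption for illustration, not the design the commenters settled on.

```python
import threading


class SingleDagSession:
    """Toy stand-in for a session whose app master runs one DAG at a time."""

    def __init__(self) -> None:
        self._running = False

    def submit(self, dag_name: str) -> None:
        # A second submission while a DAG is running is rejected outright.
        if self._running:
            raise RuntimeError("App master already running a DAG")
        self._running = True

    def finish(self) -> None:
        self._running = False


class SerializedSession:
    """Wrap a session so that callers queue up instead of colliding."""

    def __init__(self, session: SingleDagSession) -> None:
        self._session = session
        self._lock = threading.Lock()

    def run(self, dag_name: str) -> str:
        # Only one caller at a time may hold the lock, so submit() never
        # sees a DAG that is still running.
        with self._lock:
            self._session.submit(dag_name)
            try:
                return f"{dag_name} done"
            finally:
                self._session.finish()
```

Serializing keeps the session usable but gives up parallelism within it, which is why the comment above asks for a proper concurrency design instead.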
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Attachment: HIVE-9038.3.patch Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Attachments: HIVE-9038.1.patch, HIVE-9038.2.patch, HIVE-9038.3.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: {{auto_join21.q, auto_join29.q, auto_join30.q, auto_join_filters.q, auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276373#comment-14276373 ] Vikram Dixit K commented on HIVE-9038: -- This issue, as Navis mentioned, stems from the fact that Tez does not generate a filterTag. In MR, the filterTag is generated by the HashTableSinkOperator, which is not used in Tez. The right solution would be to have a select operator that adds the filterTag to the value field, which would avoid the extra stages incurred by moving this to a shuffle join instead of a map join. However, since this only happens when there are multiple joins on the same key with an outer join doing the filtering, this patch changes the join to a shuffle join in this case for the time being, to avoid the asserts. I will raise a separate jira for the proper fix. Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: {{auto_join21.q, auto_join29.q, auto_join30.q, auto_join_filters.q, auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9278) Cached expression feature broken in one case
[ https://issues.apache.org/jira/browse/HIVE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14276315#comment-14276315 ] Vikram Dixit K commented on HIVE-9278: -- +1 for 0.14 Cached expression feature broken in one case Key: HIVE-9278 URL: https://issues.apache.org/jira/browse/HIVE-9278 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Matt McCline Assignee: Navis Priority: Blocker Fix For: 0.15.0 Attachments: HIVE-9278.1.patch.txt
Different query result depending on whether hive.cache.expr.evaluation is true or false. When true, no query results are produced (this is wrong).
The q file:
{noformat}
set hive.cache.expr.evaluation=true;
CREATE TABLE cache_expr_repro (date_str STRING);
LOAD DATA LOCAL INPATH '../../data/files/cache_expr_repro.txt' INTO TABLE cache_expr_repro;
SELECT MONTH(date_str) AS `mon`, CAST((MONTH(date_str) - 1) / 3 + 1 AS int) AS `quarter`, YEAR(date_str) AS `year`
FROM cache_expr_repro
WHERE ((CAST((MONTH(date_str) - 1) / 3 + 1 AS int) = 1) AND (YEAR(date_str) = 2015))
GROUP BY MONTH(date_str), CAST((MONTH(date_str) - 1) / 3 + 1 AS int), YEAR(date_str) ;
{noformat}
cache_expr_repro.txt
{noformat}
2015-01-01 00:00:00
2015-02-01 00:00:00
2015-01-01 00:00:00
2015-02-01 00:00:00
2015-01-01 00:00:00
2015-01-01 00:00:00
2015-02-01 00:00:00
2015-02-01 00:00:00
2015-01-01 00:00:00
2015-01-01 00:00:00
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
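For reference, the rows that the HIVE-9278 repro query should return can be worked out by hand. The Python sketch below redoes that arithmetic on the sample data from the issue; it assumes that Hive's `/` is floating-point division and that CAST(... AS int) truncates toward zero, and the helper names `quarter` and `expected_groups` are hypothetical.

```python
from datetime import datetime

# Sample rows copied from cache_expr_repro.txt in the issue.
ROWS = [
    "2015-01-01 00:00:00", "2015-02-01 00:00:00",
    "2015-01-01 00:00:00", "2015-02-01 00:00:00",
    "2015-01-01 00:00:00", "2015-01-01 00:00:00",
    "2015-02-01 00:00:00", "2015-02-01 00:00:00",
    "2015-01-01 00:00:00", "2015-01-01 00:00:00",
]


def quarter(month: int) -> int:
    # Mirrors CAST((MONTH(date_str) - 1) / 3 + 1 AS int), assuming `/`
    # is floating-point division and the cast truncates toward zero.
    return int((month - 1) / 3 + 1)


def expected_groups(rows):
    """Apply the WHERE filter and GROUP BY of the repro query."""
    groups = set()
    for row in rows:
        ts = datetime.strptime(row, "%Y-%m-%d %H:%M:%S")
        q = quarter(ts.month)
        if q == 1 and ts.year == 2015:
            groups.add((ts.month, q, ts.year))
    return sorted(groups)


print(expected_groups(ROWS))  # [(1, 1, 2015), (2, 1, 2015)]
```

Both January and February fall in quarter 1 of 2015, so under these assumptions a correct run returns two groups, whereas the issue reports an empty result when hive.cache.expr.evaluation is true.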
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-9038: - Status: Patch Available (was: Open) Join tests fail on Tez -- Key: HIVE-9038 URL: https://issues.apache.org/jira/browse/HIVE-9038 Project: Hive Issue Type: Bug Components: Tests, Tez Reporter: Ashutosh Chauhan Assignee: Vikram Dixit K Attachments: HIVE-9038.1.patch Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs: {{auto_join21.q, auto_join29.q, auto_join30.q, auto_join_filters.q, auto_join_nulls.q}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9038) Join tests fail on Tez
[ https://issues.apache.org/jira/browse/HIVE-9038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-9038:

Attachment: HIVE-9038.1.patch

Join tests fail on Tez

Key: HIVE-9038
URL: https://issues.apache.org/jira/browse/HIVE-9038
Project: Hive
Issue Type: Bug
Components: Tests, Tez
Reporter: Ashutosh Chauhan
Assignee: Vikram Dixit K
Attachments: HIVE-9038.1.patch

Tez doesn't run all tests. But if you run them, the following tests fail with runtime exceptions pointing to bugs:
{{auto_join21.q,auto_join29.q,auto_join30.q,auto_join_filters.q,auto_join_nulls.q}}
[jira] [Commented] (HIVE-9304) [Refactor] remove unused method in SemAly
[ https://issues.apache.org/jira/browse/HIVE-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14270175#comment-14270175 ]

Vikram Dixit K commented on HIVE-9304:

+1 LGTM.

[Refactor] remove unused method in SemAly

Key: HIVE-9304
URL: https://issues.apache.org/jira/browse/HIVE-9304
Project: Hive
Issue Type: Task
Components: Query Processor
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Trivial
Attachments: HIVE-9304.patch

It seems that the method {{genConversionOps}} no longer serves any purpose.
[jira] [Commented] (HIVE-9249) Vectorization: Join involving CHAR/VARCHAR fails during execution
[ https://issues.apache.org/jira/browse/HIVE-9249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14265703#comment-14265703 ]

Vikram Dixit K commented on HIVE-9249:

+1 for 0.14

Vectorization: Join involving CHAR/VARCHAR fails during execution

Key: HIVE-9249
URL: https://issues.apache.org/jira/browse/HIVE-9249
Project: Hive
Issue Type: Bug
Components: Vectorization
Affects Versions: 0.14.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
Attachments: HIVE-9249.01.patch

VectorColumnAssignFactory doesn't handle HiveCharWritable / HiveVarcharWritable objects.
{code}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.HiveVarcharWritable cannot be cast to org.apache.hadoop.hive.common.type.HiveVarchar
  at org.apache.hadoop.hive.ql.exec.vector.VectorColumnAssignFactory$17.assignObjectValue(VectorColumnAssignFactory.java:417)
  at org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.internalForward(VectorMapJoinOperator.java:196)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject(CommonJoinOperator.java:670)
  at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:748)
  at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:299)
  ... 24 more
{code}
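The ClassCastException above is the usual symptom of code that assumes a plain value object but receives its writable wrapper. The sketch below illustrates that pattern with stand-in classes only: Varchar and VarcharWritable are hypothetical simplifications, not the real HiveVarchar / HiveVarcharWritable, and this is not the change made in HIVE-9249.01.patch. It shows an assignment path that checks for the wrapper and unwraps it instead of casting blindly.

```java
// Stand-in types for illustration only; they are NOT the Hive classes.
public class VarcharAssignSketch {
    // Plays the role of the plain value type.
    static final class Varchar {
        final String value;
        Varchar(String value) { this.value = value; }
    }

    // Plays the role of the writable wrapper around the value type.
    static final class VarcharWritable {
        private final Varchar wrapped;
        VarcharWritable(Varchar wrapped) { this.wrapped = wrapped; }
        Varchar get() { return wrapped; }
    }

    // Naive version: a blind cast, which throws ClassCastException for a wrapper.
    static Varchar naiveAssign(Object o) {
        return (Varchar) o;
    }

    // Defensive version: accepts either representation and unwraps when needed.
    static Varchar safeAssign(Object o) {
        if (o instanceof VarcharWritable) {
            return ((VarcharWritable) o).get();
        }
        if (o instanceof Varchar) {
            return (Varchar) o;
        }
        throw new IllegalArgumentException("unsupported type: " + o.getClass().getName());
    }

    public static void main(String[] args) {
        Object fromJoin = new VarcharWritable(new Varchar("abc"));
        try {
            naiveAssign(fromJoin);
        } catch (ClassCastException e) {
            System.out.println("naive cast failed with ClassCastException");
        }
        System.out.println("unwrapped value: " + safeAssign(fromJoin).value);
    }
}
```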
[jira] [Commented] (HIVE-9223) HiveServer2 on Tez doesn't support concurrent queries within one session
[ https://issues.apache.org/jira/browse/HIVE-9223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263199#comment-14263199 ]

Vikram Dixit K commented on HIVE-9223:

[~pala] can you let us know what the scenario is here? Do you mean that multiple queries are being fired concurrently at a single HS2 session? How are you doing that? Does this work in MapReduce? Any light on this matter would be useful for identifying the issue.

HiveServer2 on Tez doesn't support concurrent queries within one session

Key: HIVE-9223
URL: https://issues.apache.org/jira/browse/HIVE-9223
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Pala M Muthaia

When a user submits multiple queries concurrently in the same HS2 session (using the Thrift interface), the queries go through the same TezSessionState and end up being submitted to the same Tez AM, and the second query fails with the error

App master already running a DAG

Is this by design? I looked into the code, and the comments as well as the code suggest support only for serial execution of queries within the same HiveServer2 session (on Tez). This works for a CLI environment, but in a server it is plausible that a client sends multiple concurrent queries under the same session (e.g., a web app that executes queries for users, such as Cloudera Hue). So shouldn't the HS2 on Tez implementation support concurrent queries?
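The failure mode described above can be modeled in a few lines. The sketch below is a toy model, not Hive or Tez code: SingleDagSessionModel and its methods are hypothetical names, and it only assumes what the report states, namely that one session maps to one application master that runs a single DAG at a time. A second submission while the first is in flight is rejected with the reported message; once the first finishes, the session accepts work again.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model (hypothetical, not Hive/Tez code) of a session whose application
// master can run only one DAG at a time.
public class SingleDagSessionModel {
    private final AtomicBoolean dagRunning = new AtomicBoolean(false);

    // Starts a query; throws if another query is already in flight on this session.
    public void submit(String query) {
        if (!dagRunning.compareAndSet(false, true)) {
            throw new IllegalStateException("App master already running a DAG");
        }
    }

    // Marks the in-flight query as finished so the session can accept another one.
    public void finish() {
        dagRunning.set(false);
    }

    public static void main(String[] args) {
        SingleDagSessionModel session = new SingleDagSessionModel();
        session.submit("query 1");
        try {
            session.submit("query 2"); // concurrent with query 1
        } catch (IllegalStateException e) {
            System.out.println("second query rejected: " + e.getMessage());
        }
        session.finish();
        session.submit("query 3"); // serial execution works
        System.out.println("third query accepted after the first finished");
    }
}
```

Supporting concurrent queries under one session would mean replacing this single-slot guard with either queuing or a pool of sessions; which of those fits HS2 on Tez is exactly the open question in the report.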