[jira] [Created] (HIVE-23972) Add external client ID to LLAP external client
Jason Dere created HIVE-23972: - Summary: Add external client ID to LLAP external client Key: HIVE-23972 URL: https://issues.apache.org/jira/browse/HIVE-23972 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere There is currently no good way to tell which running LLAP tasks come from external LLAP clients, and no good way to know which application is submitting these external LLAP requests. One possible solution is to add an option for the external LLAP client to pass in an external client ID, which can be logged by HiveServer2 during the getSplits request as well as displayed in the LLAP executorsStatus. cc [~ShubhamChaurasia] -- This message was sent by Atlassian Jira (v8.3.4#803005)
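One possible shape for the client-side option is sketched below; the configuration key and helper method are hypothetical stand-ins, not Hive's actual API.

```java
// Sketch of the proposed option: the external client tags its request
// configuration with a client ID, which HiveServer2 could then log during
// getSplits and LLAP could surface in executorsStatus. The key name and
// helper below are hypothetical, not Hive's actual API.
import java.util.HashMap;
import java.util.Map;

class ExternalClientIdSketch {
    // hypothetical configuration key
    static final String EXTERNAL_CLIENT_ID_KEY = "hive.llap.external.client.id";

    // Return a copy of the configuration tagged with the caller's ID.
    static Map<String, String> withClientId(Map<String, String> conf, String clientId) {
        Map<String, String> tagged = new HashMap<>(conf);
        tagged.put(EXTERNAL_CLIENT_ID_KEY, clientId);
        return tagged;
    }

    public static void main(String[] args) {
        Map<String, String> conf = withClientId(new HashMap<>(), "spark-hwc-app-1");
        System.out.println(conf.get(EXTERNAL_CLIENT_ID_KEY)); // prints spark-hwc-app-1
    }
}
```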
[jira] [Created] (HIVE-23868) Windowing function spec: support 0 preceding/following
Jason Dere created HIVE-23868: - Summary: Windowing function spec: support 0 preceding/following Key: HIVE-23868 URL: https://issues.apache.org/jira/browse/HIVE-23868 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere HIVE-12574 removed support for 0 PRECEDING/FOLLOWING in window function specifications. We can restore support for this by converting 0 PRECEDING/FOLLOWING to CURRENT ROW in the query plan, which should be equivalent.
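The proposed conversion amounts to a small normalization step at plan time; the types below are illustrative stand-ins, not Hive's actual window frame classes.

```java
// Plan-time normalization sketch: a frame boundary of 0 PRECEDING or
// 0 FOLLOWING offsets the frame edge by zero rows, which is exactly
// CURRENT ROW. Direction is an illustrative stand-in type.
class WindowBoundarySketch {
    enum Direction { PRECEDING, CURRENT, FOLLOWING }

    static Direction normalize(Direction dir, int amount) {
        if (amount == 0 && dir != Direction.CURRENT) {
            return Direction.CURRENT; // 0 PRECEDING/FOLLOWING == CURRENT ROW
        }
        return dir;
    }

    public static void main(String[] args) {
        System.out.println(normalize(Direction.PRECEDING, 0)); // prints CURRENT
        System.out.println(normalize(Direction.FOLLOWING, 3)); // prints FOLLOWING
    }
}
```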
[jira] [Created] (HIVE-23068) Error when submitting fragment to LLAP via external client: IllegalStateException: Only a single registration allowed per entity
Jason Dere created HIVE-23068: - Summary: Error when submitting fragment to LLAP via external client: IllegalStateException: Only a single registration allowed per entity Key: HIVE-23068 URL: https://issues.apache.org/jira/browse/HIVE-23068 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere LLAP external client (via hive-warehouse-connector) somehow seems to be sending duplicate submissions for the same fragment/attempt. When the 2nd request is sent this results in the following error: {noformat} 2020-03-17T06:49:11,239 WARN [IPC Server handler 2 on 15001 ()] org.apache.hadoop.ipc.Server: IPC Server handler 2 on 15001, call Call#75 Retry#0 org.apache.hadoop.hive.llap.protocol.LlapProtocolBlockingPB.submitWork from 19.40.252.114:33906 java.lang.IllegalStateException: Only a single registration allowed per entity. Duplicate for TaskWrapper{task=attempt_1854104024183112753_6052_0_00_000128_1, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in queue)=true, isGuaranteed=false, firstAttemptStartTime=1584442003327, dagStartTime=1584442003327, withinDagPriority=0, vertexParallelism= 2132, selfAndUpstreamParallelism= 2132, selfAndUpstreamComplete= 0} at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.registerForUpdates(QueryInfo.java:233) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.registerForFinishableStateUpdates(QueryInfo.java:205) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.registerForFinishableStateUpdates(QueryFragmentInfo.java:160) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeRegisterForFinishedStateNotifications(TaskExecutorService.java:1167) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at 
org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.schedule(TaskExecutorService.java:564) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.schedule(TaskExecutorService.java:93) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:292) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:610) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.LlapProtocolServerImpl.submitWork(LlapProtocolServerImpl.java:122) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:22695) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_191] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_191] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] {noformat} I think the issue here is that this error occurred too late - based on the stack trace, LLAP has already accepted/registered the fragment. 
The subsequent cleanup of this fragment/attempt also affects the first request, which results in the LLAP crash described in HIVE-23061: {noformat} 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down now... java.lang.IllegalStateException: Cannot invoke unregister on an entity which has not been registered at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3
[jira] [Created] (HIVE-23061) LLAP crash due to unhandled exception: Cannot invoke unregister on an entity which has not been registered
Jason Dere created HIVE-23061: - Summary: LLAP crash due to unhandled exception: Cannot invoke unregister on an entity which has not been registered Key: HIVE-23061 URL: https://issues.apache.org/jira/browse/HIVE-23061 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere The following exception goes uncaught and causes the entire LLAP daemon to shut down: {noformat} 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down now... java.lang.IllegalStateException: Cannot invoke unregister on an entity which has not been registered at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.unregisterFinishableStateUpdate(QueryInfo.java:209) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.unregisterForFinishableStateUpdates(QueryFragmentInfo.java:166) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeUnregisterForFinishedStateNotifications(TaskExecutorService.java:1177) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:980) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:944) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at 
com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1021) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_191] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] {noformat}
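One way to keep the completion thread from dying here is to make the register/unregister bookkeeping idempotent rather than letting Preconditions.checkState() throw; the tracker below is an illustrative sketch, not Hive's actual FinishableStateTracker.

```java
// Idempotent registration sketch: duplicate registers and unmatched
// unregisters become no-op return values instead of IllegalStateExceptions
// that escape onto the executor thread and shut down the daemon.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentTrackerSketch {
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    // Returns false on a duplicate registration instead of failing later.
    boolean register(String entity) {
        return registered.add(entity);
    }

    // Safe to call for an entity that was never registered (or already removed).
    boolean unregister(String entity) {
        return registered.remove(entity);
    }

    public static void main(String[] args) {
        IdempotentTrackerSketch t = new IdempotentTrackerSketch();
        System.out.println(t.register("frag_0_128_1"));   // prints true
        System.out.println(t.register("frag_0_128_1"));   // prints false (duplicate)
        System.out.println(t.unregister("frag_0_128_1")); // prints true
        System.out.println(t.unregister("frag_0_128_1")); // prints false, no exception
    }
}
```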
[jira] [Created] (HIVE-22946) HIVE-20082 removed conversion of complex types to string
Jason Dere created HIVE-22946: - Summary: HIVE-20082 removed conversion of complex types to string Key: HIVE-22946 URL: https://issues.apache.org/jira/browse/HIVE-22946 Project: Hive Issue Type: Bug Components: Types, UDF Reporter: Jason Dere Assignee: Jason Dere Looks like we used to support cast/conversion of complex data types (array, map, struct) to string, and HIVE-20082 removed that.
[jira] [Created] (HIVE-22714) TestScheduledQueryService is flaky
Jason Dere created HIVE-22714: - Summary: TestScheduledQueryService is flaky Key: HIVE-22714 URL: https://issues.apache.org/jira/browse/HIVE-22714 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere {noformat} [ERROR] Failures: [ERROR] TestScheduledQueryService.testScheduledQueryExecution:152 Expected: <5> but: was <0> [INFO] [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0 {noformat} Looks like sometimes we are not waiting long enough for the INSERT query to complete and the SELECT runs before it finishes: {noformat} $ egrep "insert|select" target/surefire-reports/org.apache.hadoop.hive.ql.schq.TestScheduledQueryService-output.txt | grep HOOK PREHOOK: query: insert into tu values(1),(2),(3),(4),(5) 2020-01-09T14:49:09,497 INFO [SchQ 0] SessionState: PREHOOK: query: insert into tu values(1),(2),(3),(4),(5) PREHOOK: query: select 1 from tu 2020-01-09T14:49:11,452 INFO [main] SessionState: PREHOOK: query: select 1 from tu POSTHOOK: query: select 1 from tu 2020-01-09T14:49:11,452 INFO [main] SessionState: POSTHOOK: query: select 1 from tu POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5) 2020-01-09T14:49:12,062 INFO [SchQ 0] SessionState: POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5) {noformat}
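A deterministic fix would be to poll until the scheduled INSERT's row count reaches the expected value (or a timeout expires) instead of running the SELECT after a fixed delay; the names below are illustrative, not the test's actual code.

```java
// Polling-wait sketch for the flaky test: re-check the observed row count
// until it matches the expected value or the deadline passes, then return
// whatever was last seen so the assertion message stays meaningful.
class AwaitRowCount {
    interface RowCounter { int count() throws Exception; }

    static int await(RowCounter counter, int expected, long timeoutMs) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int last = counter.count();
        while (last != expected && System.currentTimeMillis() < deadline) {
            Thread.sleep(20); // back off briefly between polls
            last = counter.count();
        }
        return last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate an INSERT that only completes after a few polls.
        int[] polls = {0};
        int seen = await(() -> ++polls[0] >= 3 ? 5 : 0, 5, 5_000L);
        System.out.println(seen); // prints 5
    }
}
```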
[jira] [Created] (HIVE-22709) NullPointerException during query compilation after HIVE-22578
Jason Dere created HIVE-22709: - Summary: NullPointerException during query compilation after HIVE-22578 Key: HIVE-22709 URL: https://issues.apache.org/jira/browse/HIVE-22709 Project: Hive Issue Type: Bug Reporter: Jason Dere Attachments: results_cache_with_auth.q Getting a NPE during query compilation, when query results cache and Ranger auth is enabled. This seems to have been caused by HIVE-22578. {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getQueryStringFromAst(SemanticAnalyzer.java:14987) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getQueryStringForCache(SemanticAnalyzer.java:15036) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.createLookupInfoForQuery(SemanticAnalyzer.java:15077) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12513) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:283) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:283) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:219) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:103) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:215) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:828) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:774) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:768) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:249) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:415) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:346) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:708) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:678) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:169) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59) {noformat}
[jira] [Created] (HIVE-22599) Query results cache: 733 permissions check is not necessary
Jason Dere created HIVE-22599: - Summary: Query results cache: 733 permissions check is not necessary Key: HIVE-22599 URL: https://issues.apache.org/jira/browse/HIVE-22599 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere The query results cache initialization makes a call to Utilities.ensurePathIsWritable(), which checks the results cache directory for 733 permissions (the default cache dir is {{/tmp/hive/_resultscache_}}). The 733 permissions (at least the 033 part) are not necessary - we don't actually want the results cache directory to be world-writable, and the subdirectories we create within it are created with 700 perms. So the call to Utilities.ensurePathIsWritable() can be removed.
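For reference, the mode-bit arithmetic behind the objection, sketched with plain octal masks rather than Hive's FsPermission code:

```java
// Mode-bit sketch: 0733 = owner rwx (0700) plus group/other wx (0033).
// Only the owner bits matter for the cache dir; the 0033 part grants
// write and execute to group/other, which the cache does not want.
class CachePermSketch {
    static boolean ownerHasFullAccess(int mode) {
        return (mode & 0700) == 0700;
    }

    static boolean isGroupOrOtherWritable(int mode) {
        return (mode & 0033) != 0;
    }

    public static void main(String[] args) {
        System.out.println(ownerHasFullAccess(0700));     // prints true
        System.out.println(isGroupOrOtherWritable(0700)); // prints false
        System.out.println(isGroupOrOtherWritable(0733)); // prints true
    }
}
```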
[jira] [Created] (HIVE-22595) Dynamic partition inserts fail on Avro table with external schema
Jason Dere created HIVE-22595: - Summary: Dynamic partition inserts fail on Avro table table with external schema Key: HIVE-22595 URL: https://issues.apache.org/jira/browse/HIVE-22595 Project: Hive Issue Type: Bug Components: Avro, Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Example qfile test: {noformat} create external table avro_extschema_insert1 (name string) partitioned by (p1 string) stored as avro tblproperties ('avro.schema.url'='${system:test.tmp.dir}/table1.avsc'); create external table avro_extschema_insert2 like avro_extschema_insert1; insert overwrite table avro_extschema_insert1 partition (p1='part1') values ('col1_value', 1, 'col3_value'); insert overwrite table avro_extschema_insert2 partition (p1) select * from avro_extschema_insert1; {noformat} The last statement fails with the following error: {noformat} ], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1575484789169_0003_4_00_00_3:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at 
org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:576) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) ... 
19 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Number of input columns was different than output columns (in = 2 vs out = 1 at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1047) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555) ... 20 more Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Number of input columns was different than output columns (in = 2 vs out = 1
[jira] [Created] (HIVE-22530) Connection pool timeout in TxnHandler.java is hardcoded to 30 secs
Jason Dere created HIVE-22530: - Summary: Connection pool timeout in TxnHandler.java is hardcoded to 30 secs Key: HIVE-22530 URL: https://issues.apache.org/jira/browse/HIVE-22530 Project: Hive Issue Type: Bug Components: Locking Reporter: Jason Dere If the time to acquire locks gets long enough, we can end up running into the time limit for acquiring DB connections in TxnHandler: {noformat} 2019-07-23 11:49:54,285 ERROR [HiveServer2-Background-Pool: Thread-3881156]: operation.Operation (SQLOperation.java:run(258)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Error in acquiring locks: Error communicating with the metastore Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore Caused by: MetaException(message:Unable to update transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object {noformat} This appears to be hard-coded to 30 seconds here: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2359 It may make sense to either make this configurable or eliminate the timeout altogether.
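A configurable variant could look like the sketch below; the property name is hypothetical, and the default mirrors the current hardcoded 30 seconds.

```java
// Configurable pool-checkout timeout sketch. The KEY below is a
// hypothetical property name, not an actual Hive/metastore setting.
import java.util.Properties;

class PoolTimeoutSketch {
    static final String KEY = "metastore.txn.pool.timeout.ms"; // hypothetical
    static final long DEFAULT_MS = 30_000L; // today's hardcoded 30 secs

    static long timeoutMs(Properties conf) {
        return Long.parseLong(conf.getProperty(KEY, Long.toString(DEFAULT_MS)));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(timeoutMs(conf)); // prints 30000 (default)
        conf.setProperty(KEY, "120000");
        System.out.println(timeoutMs(conf)); // prints 120000
    }
}
```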
[jira] [Created] (HIVE-22391) NPE while checking Hive query results cache
Jason Dere created HIVE-22391: - Summary: NPE while checking Hive query results cache Key: HIVE-22391 URL: https://issues.apache.org/jira/browse/HIVE-22391 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere NPE when results cache was enabled: {noformat} 2019-10-21T14:51:55,718 ERROR [b7d7bea8-eef0-4ea4-ae12-951cb5dc96e3 HiveServer2-Handler-Pool: Thread-210]: ql.Driver (:()) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.checkResultsCache(SemanticAnalyzer.java:15061) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12320) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:360) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1869) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1816) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1811) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:575) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:566) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat}
[jira] [Created] (HIVE-22275) OperationManager.queryIdOperation does not properly clean up multiple queryIds
Jason Dere created HIVE-22275: - Summary: OperationManager.queryIdOperation does not properly clean up multiple queryIds Key: HIVE-22275 URL: https://issues.apache.org/jira/browse/HIVE-22275 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Jason Dere Assignee: Jason Dere In the case that multiple statements are run by a single Session before being cleaned up, it appears that OperationManager.queryIdOperation is not cleaned up properly. See the log statements below - with the exception of the first "Removed queryId:" log line, the queryId listed during cleanup is the same, when each of these handles should have their own queryId. Looks like only the last queryId executed is being cleaned up. As a result, HS2 can run out of memory as OperationManager.queryIdOperation grows and never cleans these queryIds/Operations up. {noformat} 2019-09-13T08:37:36,785 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] 2019-09-13T08:37:38,432 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083736_c49cf3cc-cfe8-48a1-bd22-8b924dfb0396 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] with tag: null 2019-09-13T08:37:38,469 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] 2019-09-13T08:37:52,662 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=b983802c-1dec-4fa0-8680-d05ab555321b] 2019-09-13T08:37:56,239 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=75dbc531-2964-47b2-84d7-85b59f88999c] 2019-09-13T08:38:02,551 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=72c79076-9d67-4894-a526-c233fa5450b2] 2019-09-13T08:38:10,558 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=17b30a62-612d-4b70-9ba7-4287d2d9229b] 2019-09-13T08:38:16,930 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=ea97e99d-cc77-470b-b49a-b869c73a4615] 2019-09-13T08:38:20,440 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=a277b789-ebb8-4925-878f-6728d3e8c5fb] 2019-09-13T08:38:26,303 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=9a023ab8-aa80-45db-af88-94790cc83033] 2019-09-13T08:38:30,791 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b697c801-7da0-4544-bcfa-442eb1d3bd77] 2019-09-13T08:39:10,187 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: 
operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=bda93c8f-0822-4592-a61c-4701720a1a5c] 2019-09-13T08:39:15,471 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] with tag: null 2019-09-13T08:39:15,507 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b983802c-1dec
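The leak pattern in the log above can be reproduced with a two-map sketch: if removal looks up the session's latest queryId rather than the one recorded for the operation being closed, stale entries accumulate. Keying removal off the operation's own queryId keeps the map bounded (illustrative, not Hive's actual OperationManager):

```java
// queryId bookkeeping sketch: removal must use the queryId stored for the
// operation being closed, not whatever queryId the session ran most recently,
// or the queryId map grows without bound and HS2 can run out of memory.
import java.util.HashMap;
import java.util.Map;

class QueryIdCleanupSketch {
    private final Map<String, String> queryIdToOp = new HashMap<>();
    private final Map<String, String> opToQueryId = new HashMap<>();

    void addOperation(String opHandle, String queryId) {
        queryIdToOp.put(queryId, opHandle);
        opToQueryId.put(opHandle, queryId);
    }

    void removeOperation(String opHandle) {
        // Look up the queryId recorded for THIS operation handle.
        String queryId = opToQueryId.remove(opHandle);
        if (queryId != null) {
            queryIdToOp.remove(queryId);
        }
    }

    int trackedQueryIds() {
        return queryIdToOp.size();
    }

    public static void main(String[] args) {
        QueryIdCleanupSketch m = new QueryIdCleanupSketch();
        m.addOperation("op1", "hive_query_1");
        m.addOperation("op2", "hive_query_2");
        m.removeOperation("op1");
        m.removeOperation("op2");
        System.out.println(m.trackedQueryIds()); // prints 0
    }
}
```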
[jira] [Created] (HIVE-22050) Enable download of just UDF resources using LlapServiceDriver command line
Jason Dere created HIVE-22050: - Summary: Enable download of just UDF resources using LlapServiceDriver command line Key: HIVE-22050 URL: https://issues.apache.org/jira/browse/HIVE-22050 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere LlapServiceDriver currently has several components that it downloads as part of the LLAP packaging: Tez jars, UDF jars, aux jars, configs. I'd like to add some options to the LlapServiceDriver command line to enable selective downloading of these components, for example to be able to download just the UDF jars.
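The option parsing could be as simple as the sketch below; the flag names are hypothetical, and the default preserves today's download-everything behavior.

```java
// Selective-download flag sketch for the LLAP packaging CLI. Flag names
// below are hypothetical placeholders, not LlapServiceDriver's real options.
import java.util.EnumSet;

class LlapDownloadOptionsSketch {
    enum Component { TEZ_JARS, UDF_JARS, AUX_JARS, CONFIGS }

    static EnumSet<Component> parse(String[] args) {
        EnumSet<Component> wanted = EnumSet.noneOf(Component.class);
        for (String arg : args) {
            switch (arg) {
                case "--downloadUdfJars": wanted.add(Component.UDF_JARS); break;
                case "--downloadTezJars": wanted.add(Component.TEZ_JARS); break;
                case "--downloadAuxJars": wanted.add(Component.AUX_JARS); break;
                case "--downloadConfigs": wanted.add(Component.CONFIGS); break;
                default: break; // other CLI options handled elsewhere
            }
        }
        // No selective flag given: keep today's behavior and fetch everything.
        return wanted.isEmpty() ? EnumSet.allOf(Component.class) : wanted;
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[]{"--downloadUdfJars"})); // prints [UDF_JARS]
        System.out.println(parse(new String[]{}).size());            // prints 4
    }
}
```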
[jira] [Created] (HIVE-22035) HiveStrictManagedMigration settings do not always get set with --hiveconf arguments
Jason Dere created HIVE-22035: - Summary: HiveStrictManagedMigration settings do not always get set with --hiveconf arguments Key: HIVE-22035 URL: https://issues.apache.org/jira/browse/HIVE-22035 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Currently the --hiveconf arguments get added to the System properties. While this allows official HiveConf variables to be set in the conf that is loaded by the HiveStrictManagedMigration utility, there are also utility-specific configuration settings which we would want to be able to set from the command line. For example, since Ambari knows what the Hive system user name is, it would make sense to be able to set strict.managed.tables.migration.owner on the command line when running this utility.
[jira] [Created] (HIVE-22034) HiveStrictManagedMigration updates DB location even with --dryRun setting on
Jason Dere created HIVE-22034: - Summary: HiveStrictManagedMigration updates DB location even with --dryRun setting on Key: HIVE-22034 URL: https://issues.apache.org/jira/browse/HIVE-22034 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere The logic at the end of processDatabase() to update the DB location in the Metastore should only run if runOptions.dryRun == false.
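The missing guard amounts to one condition; a minimal sketch (not the utility's actual code):

```java
// dryRun guard sketch: the metastore DB-location update at the end of
// processDatabase() should be skipped entirely when --dryRun is set.
class DryRunGuardSketch {
    static boolean shouldUpdateDbLocation(boolean dryRun, boolean locationNeedsChange) {
        return locationNeedsChange && !dryRun;
    }

    public static void main(String[] args) {
        System.out.println(shouldUpdateDbLocation(true, true));  // prints false
        System.out.println(shouldUpdateDbLocation(false, true)); // prints true
    }
}
```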
[jira] [Created] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
Jason Dere created HIVE-22001: - Summary: AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time Key: HIVE-22001 URL: https://issues.apache.org/jira/browse/HIVE-22001 Project: Hive Issue Type: Bug Components: Transactions Reporter: Jason Dere Had one user hit the following error during getSplits {noformat} 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: ORC split generation failed with exception: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958) at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779) at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809) ... 17 more Caused by: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. 
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953) at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903) at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormat(AcidUtils.java:1913) at org.apache.hadoop.hive.ql.io.AcidUtils.parsedDelta(AcidUtils.java:947) at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:935) at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java:1250) <--- at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:1071) <--- at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.callInternal(OrcInputFormat.jav
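One common mitigation for this kind of race (a sketch only, not the actual HIVE-22001 patch) is to tolerate a missing child during the per-delta metadata calls and treat a vanished delta directory as already cleaned, rather than failing the whole split generation:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Sketch of a defensive directory scan using local NIO (illustrative only;
// on HDFS the analogous exception is java.io.FileNotFoundException): when a
// per-child call fails because the Cleaner deleted the delta directory
// between the listing and the stat, skip that child instead of failing.
public class TolerantAcidScan {
    public static List<Path> listSurvivingDeltas(Path tableDir) throws IOException {
        List<Path> surviving = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(tableDir)) {
            for (Path child : children) {
                try {
                    // Stand-in for the per-delta check (e.g. isRawFormat's listStatus).
                    Files.readAttributes(child, BasicFileAttributes.class);
                    surviving.add(child);
                } catch (NoSuchFileException gone) {
                    // Deleted concurrently by the Cleaner: treat as already removed.
                }
            }
        }
        return surviving;
    }
}
```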
[jira] [Created] (HIVE-21963) TransactionalValidationListener.validateTableStructure should check the partition directories in the case of partitioned tables
Jason Dere created HIVE-21963: - Summary: TransactionalValidationListener.validateTableStructure should check the partition directories in the case of partitioned tables Key: HIVE-21963 URL: https://issues.apache.org/jira/browse/HIVE-21963 Project: Hive Issue Type: Bug Components: Transactions Reporter: Jason Dere Assignee: Jason Dere The transactional validation check only checks the base table directory, but for partitioned tables it should check the partition directories (some of which may not even be under the base table directory). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21878) Metric for AM to show whether it is currently running a DAG
Jason Dere created HIVE-21878: - Summary: Metric for AM to show whether it is currently running a DAG Key: HIVE-21878 URL: https://issues.apache.org/jira/browse/HIVE-21878 Project: Hive Issue Type: Bug Components: Tez Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21878.1.patch Add a basic gauge metric to indicate whether a Tez AM is currently running a DAG for a Hive query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
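The proposed metric can be sketched in plain Java as follows (illustrative names only; Hive's actual implementation would hook into its metrics system rather than a bare Supplier):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Minimal sketch of the proposed gauge: reports 1 while the Tez AM is
// executing a DAG and 0 otherwise. Class and method names are assumptions.
public class DagRunningMetric {
    private final AtomicBoolean dagRunning = new AtomicBoolean(false);

    // Called by the AM when a DAG starts or finishes.
    public void dagStarted()  { dagRunning.set(true); }
    public void dagFinished() { dagRunning.set(false); }

    // The gauge value, sampled by the metrics system on each scrape.
    public Supplier<Integer> gauge() {
        return () -> dagRunning.get() ? 1 : 0;
    }
}
```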
[jira] [Created] (HIVE-21799) NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column
Jason Dere created HIVE-21799: - Summary: NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column Key: HIVE-21799 URL: https://issues.apache.org/jira/browse/HIVE-21799 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere Following table/query results in NPE: {noformat} create table tez_no_dynpart_hashjoin_on_agg(id int, outcome string, eventid int) stored as orc; explain select a.id, b.outcome from (select id, max(eventid) as event_id_max from tez_no_dynpart_hashjoin_on_agg group by id) a LEFT OUTER JOIN tez_no_dynpart_hashjoin_on_agg b on a.event_id_max = b.eventid; {noformat} Stack trace: {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan(DynamicPartitionPruningOptimization.java:608) at org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.process(DynamicPartitionPruningOptimization.java:239) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) at org.apache.hadoop.hive.ql.parse.TezCompiler.runDynamicPartitionPruning(TezCompiler.java:584) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:165) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:159) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12562) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:370) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:671) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1905) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1852) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1847) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:219) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:676) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:647) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21746) ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled
Jason Dere created HIVE-21746: - Summary: ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled Key: HIVE-21746 URL: https://issues.apache.org/jira/browse/HIVE-21746 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere ArrayIndexOutOfBounds exception during query execution with dynamically partitioned hash join. Found on Hive 2.x. Seems to occur with CBO disabled/failed. Disabling constant propagation seems to allow the query to succeed. {noformat} java.lang.ArrayIndexOutOfBoundsException: 203 at org.apache.hadoop.hive.serde2.io.TimestampWritable.getTotalLength(TimestampWritable.java:217) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:205) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getFieldsAsList(LazyBinaryStruct.java:281) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.unpack(MapJoinBytesTableContainer.java:744) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:730) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:605) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:70) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at 
org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:34) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:819) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:924) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:456) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:359) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:290) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:319) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:189) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:377) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) ~[hadoop-common-2.7.3.2.6.4.119-3.jar:?] at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) ~[hive-llap-server
Re: Review Request 70372: HIVE-21427: Syslog storage handler
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70372/#review214337 --- llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java Line 615 (original), 615 (patched) <https://reviews.apache.org/r/70372/#comment300536> Just curious about this one, was there a difference between rbCtx.getRowColumnTypeInfos() and rbCtx.getDataColumnCount()? Or just the fact that rbCtx.getDataColumnCount() directly returns an int value? ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogSerDe.java Lines 57 (patched) <https://reviews.apache.org/r/70372/#comment300537> Is the list of columns from SyslogSerDe fixed to (facility, severity, version, ts, hostname, app_name, proc_id, msg_id, structured_data, msg, unmatched)? If so then should the column list/types be hardcoded rather than set via LIST_COLUMNS/LIST_COLUMNS_TYPES properties? - Jason Dere On April 2, 2019, 10:29 p.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70372/ > --- > > (Updated April 2, 2019, 10:29 p.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. 
> > > Bugs: HIVE-21427 > https://issues.apache.org/jira/browse/HIVE-21427 > > > Repository: hive-git > > > Description > --- > > HIVE-21427: Syslog storage handler > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > 777f8b51215523fca8e396ddf77139420666311a > data/files/syslog-hs2-2.log PRE-CREATION > data/files/syslog-hs2.log PRE-CREATION > itests/src/test/resources/testconfiguration.properties > 96dfbc4b56b6eb3dff6b8e1e42a2371d090426e7 > > llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java > 9ef7af4eb0c9787a33d2aa4c9a4528b8f356106b > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java > 27fe828b7531584138cd002956a9fcc20f238f71 > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogInputFormat.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogParser.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogSerDe.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogStorageHandler.java > PRE-CREATION > ql/src/test/org/apache/hadoop/hive/ql/log/TestSyslogInputFormat.java > PRE-CREATION > ql/src/test/queries/clientpositive/syslog_parser.q PRE-CREATION > ql/src/test/queries/clientpositive/syslog_parser_file_pruning.q > PRE-CREATION > ql/src/test/results/clientpositive/llap/syslog_parser.q.out PRE-CREATION > ql/src/test/results/clientpositive/llap/syslog_parser_file_pruning.q.out > PRE-CREATION > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java > eafe0c6d46d448bce287e61fabac0384b12b9295 > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolCommandLine.java > 6282078411c4c728beed8e957aa857ed3c02133c > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskCreateLogsTable.java > PRE-CREATION > > > Diff: 
https://reviews.apache.org/r/70372/diff/1/ > > > Testing > --- > > > Thanks, > > Prasanth_J > >
[jira] [Created] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum
Jason Dere created HIVE-21561: - Summary: Revert removal of TableType.INDEX_TABLE enum Key: HIVE-21561 URL: https://issues.apache.org/jira/browse/HIVE-21561 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21561.1.patch Index tables have been removed from Hive as of HIVE-18715. However, in case users still have index tables defined in the metastore, we should keep the TableType.INDEX_TABLE enum around so that users can drop these tables. Without the enum defined Hive cannot do anything with them as it fails with IllegalArgumentException errors when trying to call TableType.valueOf() on INDEX_TABLE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
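The failure mode described above is plain Java enum behavior, sketched below with a hypothetical stand-in enum (not Hive's actual TableType): valueOf() throws IllegalArgumentException for any name that is no longer a declared constant.

```java
// Hypothetical enum standing in for TableType after INDEX_TABLE was removed.
enum TableTypeSketch { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW }

public class TableTypeResolution {
    // valueOf() throws IllegalArgumentException for a name with no matching
    // constant, which is what happens when the metastore still holds rows
    // typed INDEX_TABLE but the enum constant no longer exists.
    public static boolean canResolve(String name) {
        try {
            TableTypeSketch.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```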
[jira] [Created] (HIVE-21528) Add metric to track the number of queries waiting for tez session
Jason Dere created HIVE-21528: - Summary: Add metric to track the number of queries waiting for tez session Key: HIVE-21528 URL: https://issues.apache.org/jira/browse/HIVE-21528 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21518) GenericUDFOPNotEqualNS does not run in LLAP
Jason Dere created HIVE-21518: - Summary: GenericUDFOPNotEqualNS does not run in LLAP Key: HIVE-21518 URL: https://issues.apache.org/jira/browse/HIVE-21518 Project: Hive Issue Type: Bug Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21518.1.patch GenericUDFOPNotEqualNS (Not equal nullsafe operator) does not run in LLAP mode, because it is not registered as a built-in function. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
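A toy model of the behavior described in the bug (not Hive's actual FunctionRegistry API): LLAP only executes UDFs present in the built-in function registry, so a function class that is never registered is rejected at runtime.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative allowlist model: names and methods are assumptions, not
// Hive's real API. A UDF missing from the built-in registry (as
// GenericUDFOPNotEqualNS was) cannot run in LLAP mode.
public class BuiltinFunctionAllowlist {
    private final Set<String> builtins = new HashSet<>();

    public void registerBuiltin(String functionName) {
        builtins.add(functionName);
    }

    public boolean allowedInLlap(String functionName) {
        return builtins.contains(functionName);
    }
}
```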
Re: Review Request 69903: HIVE-21214
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69903/#review212581 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java Line 1829 (original), 1838 (patched) <https://reviews.apache.org/r/69903/#comment298407> No "if" - this dedup strategy does not work with speculative execution enabled. - Jason Dere On Feb. 5, 2019, 10:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69903/ > --- > > (Updated Feb. 5, 2019, 10:10 p.m.) > > > Review request for hive and Jason Dere. > > > Bugs: HIVE-21214 > https://issues.apache.org/jira/browse/HIVE-21214 > > > Repository: hive-git > > > Description > --- > > MoveTask : Use attemptId instead of file size for deduplication of files > compareTempOrDuplicateFiles() > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 > > > Diff: https://reviews.apache.org/r/69903/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 69903: HIVE-21214
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69903/#review212580 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java Lines 1876 (patched) <https://reviews.apache.org/r/69903/#comment298406> nit: add the filenames to the error message - Jason Dere On Feb. 5, 2019, 10:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69903/ > --- > > (Updated Feb. 5, 2019, 10:10 p.m.) > > > Review request for hive and Jason Dere. > > > Bugs: HIVE-21214 > https://issues.apache.org/jira/browse/HIVE-21214 > > > Repository: hive-git > > > Description > --- > > MoveTask : Use attemptId instead of file size for deduplication of files > compareTempOrDuplicateFiles() > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 > > > Diff: https://reviews.apache.org/r/69903/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
[jira] [Created] (HIVE-20998) HiveStrictManagedMigration utility should update DB/Table location as last migration steps
Jason Dere created HIVE-20998: - Summary: HiveStrictManagedMigration utility should update DB/Table location as last migration steps Key: HIVE-20998 URL: https://issues.apache.org/jira/browse/HIVE-20998 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere When processing a database or table, the HiveStrictManagedMigration utility currently changes the database/table locations as the first step in processing that database/table. Unfortunately, if an error occurs while processing this database or table, there may still be migration work that needs to continue for that db/table by running the migration again. However, since the migration tool only processes dbs/tables that still have the old warehouse location, the tool will skip over the db/table when the migration is run again. One fix here is to set the new location as the last step after all of the migration work is done: - The new table location will not be set until all of its partitions have been successfully migrated. - The new database location will not be set until all of its tables have been successfully migrated. For existing migrations that failed with an error, the following workaround can be done so that the db/tables can be re-processed by the migration tool: 1) Use the migration tool logs to find which databases/tables failed during processing. 2) For each db/table, change the location of the database and table back to the old location: ALTER DATABASE tpcds_bin_partitioned_orc_10 SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db'; ALTER TABLE tpcds_bin_partitioned_orc_10.store_sales SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db/store_sales'; 3) Rerun the migration tool -- This message was sent by Atlassian JIRA (v7.6.3#76005)
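The proposed ordering can be sketched as follows (illustrative names, not the utility's actual code): perform all migration work for a table's partitions first, and update the table location only after every partition has succeeded, so a failed run leaves the old location in place and the table is re-processed on the next run.

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch of "set the new location as the last step": if migrating any
// partition throws, setNewTableLocation is never reached and the table
// keeps its old location, so the next migration run picks it up again.
public class MigrateTableLast {
    public static void migrateTable(List<String> partitions,
                                    Consumer<String> migratePartition,
                                    Runnable setNewTableLocation) {
        for (String partition : partitions) {
            migratePartition.accept(partition); // may throw; location stays old
        }
        setNewTableLocation.run(); // only reached when all partitions succeeded
    }
}
```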
[jira] [Created] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats
Jason Dere created HIVE-20900: - Summary: serde2.JsonSerDe no longer supports timestamp.formats Key: HIVE-20900 URL: https://issues.apache.org/jira/browse/HIVE-20900 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Looks like HIVE-18545 broke this. Also json_serde_tsformat.q only tested the hcat version of JsonSerde, and the format in that test used the ISO timestamp format which apparently is now parsed by the default timestamp parsing, so the test was too simple. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20839) "Cannot find field" error during dynamically partitioned hash join
Jason Dere created HIVE-20839: - Summary: "Cannot find field" error during dynamically partitioned hash join Key: HIVE-20839 URL: https://issues.apache.org/jira/browse/HIVE-20839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere {noformat} 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 (1539092085144_8944_1085_28_000996_2)] tez.ReduceRecordProcessor: Hit error while closing operators - failing tree 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 (1539092085144_8944_1085_28_000996_2)] tez.TezProcessor: java.lang.RuntimeException: cannot find field _col304 from [0:_col0, 1:_col1, 2:_col2, 3:_col3, 4:_col4, 5:_col5, 6:_col6, 7:_col7, 8:_col8, 9:_col9, 10:_col10, 11:_col11, 12:_col12, 13:_col13, 14:_col15, 15:_col16, 16:_col17, 17:_col18, 18:_col19, 19:_col20, 20:_col21, 21:_col22, 22:_col23, 23:_col24, 24:_col25, 25:_col26, 26:_col27, 27:_col28, 28:_col29, 29:_col30, 30:_col31, 31:_col32, 32:_col33, 33:_col34, 34:_col35, 35:_col36, 36:_col37, 37:_col38, 38:_col39, 39:_col40, 40:_col41, 41:_col42, 42:_col43, 43:_col44, 44:_col45, 45:_col46, 46:_col47, 47:_col48, 48:_col49, 49:_col50, 50:_col51, 51:_col52, 52:_col53, 53:_col54, 54:_col55, 55:_col56, 56:_col57, 57:_col58, 58:_col59, 59:_col60, 60:_col61, 61:_col62, 62:_col63, 63:_col64, 64:_col65, 65:_col66, 66:_col67, 67:_col68, 68:_col70, 69:_col72, 70:_col73, 71:_col74, 72:_col75, 73:_col76, 74:_col77, 75:_col78, 76:_col79, 77:_col80, 78:_col81, 79:_col82, 80:_col83, 81:_col84, 82:_col85, 83:_col86, 84:_col87, 85:_col88, 86:_col89, 87:_col90, 88:_col91, 89:_col92, 90:_col93, 91:_col94, 92:_col95, 93:_col96, 94:_col97, 95:_col98, 96:_col99, 97:_col100, 98:_col101, 99:_col102, 100:_col103, 101:_col104, 102:_col105, 103:_col106, 104:_col107, 105:_col108, 106:_col109, 107:_col110, 108:_col111, 109:_col112, 110:_col113, 111:_col114, 112:_col115, 113:_col116, 114:_col117, 115:_col118, 116:_col119, 117:_col120, 118:_col121, 119:_col122, 
120:_col123, 121:_col124, 122:_col125, 123:_col126, 124:_col127, 125:_col128, 126:_col129, 127:_col130, 128:_col131, 129:_col132, 130:_col133, 131:_col134, 132:_col135, 133:_col136, 134:_col137, 135:_col138, 136:_col139, 137:_col140, 138:_col141, 139:_col142, 140:_col143, 141:_col144, 142:_col145, 143:_col146, 144:_col147, 145:_col148, 146:_col149, 147:_col150, 148:_col151, 149:_col152, 150:_col153, 151:_col154, 152:_col155, 153:_col156, 154:_col157, 155:_col158, 156:_col159, 157:_col160, 158:_col161, 159:_col162, 160:_col163, 161:_col164, 162:_col165, 163:_col166, 164:_col167, 165:_col168, 166:_col169, 167:_col170, 168:_col171, 169:_col318] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:91) at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:74) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:144) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:374) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:195) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20834) Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query
Jason Dere created HIVE-20834: - Summary: Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query Key: HIVE-20834 URL: https://issues.apache.org/jira/browse/HIVE-20834 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere QueryResultCache.LookupInfo ends up keeping a reference to the SemanticAnalyzer from the cached query, for as long as the cached entry is in the cache. We should not be keeping the SemanticAnalyzer around after the query is done executing since they can hold on to quite a bit of memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 69173: HIVE-20259 Cleanup of results cache directory
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69173/ --- Review request for hive and Gopal V. Bugs: HIVE-20259 https://issues.apache.org/jira/browse/HIVE-20259 Repository: hive-git Description --- Attached patch with utility DirectoryMarkerUpdate/Cleanup classes to create .cacheupdate files in the cache directory, to indicate that this directory should not be cleaned up by any other process performing DirectoryMarkerCleanup. This uses the last-modified date of the .cacheupdate file to determine whether the directory should be cleaned up; if the instance running cleanup determines this date is too old, then the directory will be deleted. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e226a1f82d common/src/java/org/apache/hive/common/util/DirectoryMarkerCleanup.java PRE-CREATION common/src/java/org/apache/hive/common/util/DirectoryMarkerUpdate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java a51b7e750b Diff: https://reviews.apache.org/r/69173/diff/1/ Testing --- Thanks, Jason Dere
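The marker-age scheme can be sketched like this (class and method names are assumptions, not the patch's actual DirectoryMarkerUpdate/Cleanup code): a live instance periodically refreshes the .cacheupdate marker in its cache directory, and a cleanup pass deletes a directory only when the marker's last-modified time is older than the configured threshold.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Sketch of age-based cache-directory cleanup (illustrative names only).
public class CacheDirMarker {
    public static final String MARKER = ".cacheupdate";

    // Refresh the marker so other cleaners see this directory as in use.
    public static void touch(Path cacheDir) throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        } else {
            Files.setLastModifiedTime(marker, FileTime.fromMillis(System.currentTimeMillis()));
        }
    }

    // A directory is eligible for deletion only when its marker is too old.
    public static boolean eligibleForCleanup(Path cacheDir, long maxAgeMs, long nowMs)
            throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            return false; // no marker: not managed by this scheme, leave it alone
        }
        long modified = Files.getLastModifiedTime(marker).toMillis();
        return nowMs - modified > maxAgeMs;
    }
}
```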
Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68946/#review209582 --- ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java Lines 4761 (patched) <https://reviews.apache.org/r/68946/#comment294106> Should this be on by default? If there are a lot of external tables (especially on s3), the metastore could be spending a lot of time doing auto discovery. Could also affect the running of other MetastoreTaskThreads. ql/src/test/results/clientpositive/msck_repair_drop.q.out Line 127 (original), 127 (patched) <https://reviews.apache.org/r/68946/#comment294105> What is the new ordering of these messages? Looks like it could be a potential issue when diffing golden files? standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java Lines 141 (patched) <https://reviews.apache.org/r/68946/#comment294108> Is this variable used? It's logged, but I think retentionSeconds should be used instead. standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java Lines 142 (patched) <https://reviews.apache.org/r/68946/#comment294107> Might want to check for exception from TimeValidator.validate() in getRententionPeriodInSeconds, or else a bad setting in one table can fail here and prevent this from running for any tables. But if you do skip that table, make sure the countdown latch is updated appropriately. - Jason Dere On Oct. 16, 2018, 12:21 a.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68946/ > --- > > (Updated Oct. 16, 2018, 12:21 a.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. 
> > > Bugs: HIVE-20707 > https://issues.apache.org/jira/browse/HIVE-20707 > > > Repository: hive-git > > > Description > --- > > HIVE-20707: Automatic partition management > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 92a1c31 > hbase-handler/src/test/results/positive/external_table_ppd.q.out edcbe7e > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 1209c88 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ccd4148 > hbase-handler/src/test/results/positive/hbase_queries.q.out eeb97f0 > hbase-handler/src/test/results/positive/hbasestats.q.out 5a4aea9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > a9d7468 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 807f159 > ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 46bf088 > ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java 0b4240f > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java > 598bb2e > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java cff32d3 > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > 29f6ecf > ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 27f677e > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java > ce2b186 > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java > 9480d38 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java > a2a0583 > ql/src/test/queries/clientpositive/msck_repair_acid.q PRE-CREATION > ql/src/test/queries/clientpositive/partition_discovery.q PRE-CREATION > ql/src/test/results/clientpositive/create_like.q.out f4a5ed5 > ql/src/test/results/clientpositive/create_like_view.q.out 870f280 > ql/src/test/results/clientpositive/default_file_format.q.out 0adf5ae > ql/src/test/results/clientpositive/druid_topn.q.out 179902a > ql/src/test/results/clientpositive/explain_locks.q.out 
ed7f1e8 > ql/src/test/results/clientpositive/llap/external_table_purge.q.out 24c778e > ql/src/test/results/clientpositive/llap/mm_exim.q.out ee6cf06 > ql/src/test/results/clientpositive/llap/strict_managed_tables2.q.out > f3b6152 > ql/src/test/results/clientpositive/llap/whroot_external1.q.out cac158c > ql/src/test/results/clientpositive/msck_repair_acid.q.out PRE-CREATION > ql/src/test/results/clientpositive/msck_repair_drop.q.out 2456734 > ql/src/test/results/clientpositive/partition_discovery.q.out PRE-CREATION > ql/src/test/results/clientpositive/rename_external_partition_location.q.out > 02cd814 >
Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68946/#review209341 --- ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java Lines 88 (patched) <https://reviews.apache.org/r/68946/#comment293671> Can you use FileUtils.HIDDEN_FILES_PATH_FILTER? I believe standalone-metastore also has a FileUtils.java - Jason Dere On Oct. 8, 2018, 4:16 p.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68946/ > --- > > (Updated Oct. 8, 2018, 4:16 p.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. > > > Bugs: HIVE-20707 > https://issues.apache.org/jira/browse/HIVE-20707 > > > Repository: hive-git > > > Description > --- > > HIVE-20707: Automatic MSCK REPAIR for external tables > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > d0adc35544cb8ae9d007a1d2ccb9b9565eedca88 > data/conf/hive-site.xml 0daf9adc717bc1c4413d2e34691c26a3e2585c77 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > cffa21af33d5abb2162fa16b6b990a469075f03d > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java > e91346228e8724b8253364114145a348a7cbee26 > ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java > 0b4240f5665f0b544b2fc5864fc098eb286a281e > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java > 598bb2ee8b72f1b7f75be7802b4eaae0204c988d > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java > ce2b186b4dceda780106776daa022f18388ec76f > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java > 7e768dacb0b00a0f1a9e64efbe778f9c2daaa31b > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java > a2a0583d4dbdfe9aece1a14ecac24e0e6189cafa > ql/src/test/queries/clientpositive/auto_msck_repair_0.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_1.q PRE-CREATION 
> ql/src/test/queries/clientpositive/auto_msck_repair_2.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_3.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_4.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_batchsize.q > PRE-CREATION > ql/src/test/queries/clientpositive/msck_repair_0.q > aeb4820af5b6687f7ae4163a94bdd2be25a8b0cd > ql/src/test/queries/clientpositive/msck_repair_2.q > be745b2d607d8c727b862c71f153f09d5622a8b5 > ql/src/test/queries/clientpositive/msck_repair_3.q > 140a6904ddc98b165d71a8b24314c56888ccbb9c > ql/src/test/queries/clientpositive/msck_repair_batchsize.q > 5a7afcca5b86c1887308626c0dc4d99916811bea > ql/src/test/queries/clientpositive/msck_repair_drop.q > 9923fb50cbdbdf9e8e07276ccaec073c490770e6 > ql/src/test/results/clientpositive/auto_msck_repair_0.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_1.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_2.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_3.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_4.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_batchsize.q.out > PRE-CREATION > ql/src/test/results/clientpositive/msck_repair_0.q.out > fa6e4a988273a71b0f9dab64a48ddda6320d5f2f > ql/src/test/results/clientpositive/msck_repair_2.q.out > 7fbd934e118e81b9c5f028191c7ea6582a34db75 > ql/src/test/results/clientpositive/msck_repair_3.q.out > 0e153fbe69ba39819fac4629ef1bf5f90c17f37f > ql/src/test/results/clientpositive/msck_repair_batchsize.q.out > ab4b83137dcf1ce36846ce74e0a546528e81358b > ql/src/test/results/clientpositive/msck_repair_drop.q.out > 971c1381276fa626bd91d34488a65e3bfb2781ae > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java > 294dfb728e12efaa13d239ea7b8949587a50fe1f > > 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/api/MetastoreException.java > PRE-CREATION > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java > 7b01678a10f4f0667844fec64ae76695d835bd6e > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java &
[jira] [Created] (HIVE-20603) "Wrong FS" error when inserting to partition after changing table location filesystem
Jason Dere created HIVE-20603: - Summary: "Wrong FS" error when inserting to partition after changing table location filesystem Key: HIVE-20603 URL: https://issues.apache.org/jira/browse/HIVE-20603 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Inserting into an existing partition, after changing a table's location to point to a different HDFS filesystem: {noformat} query += "CREATE TABLE test_managed_tbl (id int, name string, dept string) PARTITIONED BY (year int);\n" query += "INSERT INTO test_managed_tbl PARTITION (year=2016) VALUES (8,'Henry','CSE');\n" query += "ALTER TABLE test_managed_tbl ADD PARTITION (year=2017);\n" query += "ALTER TABLE test_managed_tbl SET LOCATION 'hdfs://ns2/warehouse/tablespace/managed/hive/test_managed_tbl';\n" query += "INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE');\n" {noformat} Results in the following error: {noformat} java.lang.IllegalArgumentException: Wrong FS: hdfs://ns1/warehouse/tablespace/managed/hive/test_managed_tbl/year=2017, expected: hdfs://ns2 at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:240) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1580) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1595) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734) at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:4141) at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1966) at org.apache.hadoop.hive.ql.exec.MoveTask.handleStaticParts(MoveTask.java:477) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:397) at 
org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2701) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2372) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2048) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
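The "Wrong FS" exception above fires because the partition kept its old `hdfs://ns1` location after the table was moved to `hdfs://ns2`, and the path fails the filesystem's scheme/authority check. A simplified, stdlib-only sketch of that comparison (this is an illustration, not Hadoop's actual `FileSystem.checkPath` code):

```java
import java.net.URI;

public class WrongFsCheck {
    // Simplified sketch of the scheme/authority comparison behind a
    // "Wrong FS" IllegalArgumentException; not Hadoop's actual code.
    static void checkPath(URI fsUri, URI pathUri) {
        String scheme = pathUri.getScheme();
        String authority = pathUri.getAuthority();
        if ((scheme != null && !scheme.equalsIgnoreCase(fsUri.getScheme()))
                || (authority != null && !authority.equalsIgnoreCase(fsUri.getAuthority()))) {
            throw new IllegalArgumentException(
                "Wrong FS: " + pathUri + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://ns2");
        // Matches the filesystem: passes silently.
        checkPath(fs, URI.create("hdfs://ns2/warehouse/tbl/year=2017"));
        try {
            // The partition still points at ns1 after the table moved to ns2.
            checkPath(fs, URI.create("hdfs://ns1/warehouse/tbl/year=2017"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```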
[jira] [Created] (HIVE-20515) Empty query results when using results cache and query temp dir, results cache dir in different filesystems
Jason Dere created HIVE-20515: - Summary: Empty query results when using results cache and query temp dir, results cache dir in different filesystems Key: HIVE-20515 URL: https://issues.apache.org/jira/browse/HIVE-20515 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere If the scratchdir for temporary query results and the results cache dir are in different filesystems, moving the query results from the temp directory to the results cache will fail. Looking at the moveResultsToCacheDirectory() logic in QueryResultsCache.java, I see the following issues: - FileSystem.rename() is used, which only works if the files are on the same filesystem. Need to use something like Hive.mvFile, or something similar that can work between different filesystems. - The return code from rename() was not checked, which might have caught the error here. This may not be applicable if a different method from FS.rename() is used in the proper fix. With some filesystems (noticed this with WASB), FileSystem.rename() returns false on failure rather than throwing an exception, so the query ends up with empty results because the return code was not checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
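The failure mode described above, a rename that fails across filesystems with its return code ignored, can be sketched with a stdlib analogue. This does not use the Hadoop FileSystem API, and the helper name `moveAcrossFileSystems` is hypothetical; it only illustrates the "try rename, fall back to copy + delete" pattern the fix calls for:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CacheMove {
    // Hypothetical sketch: attempt an atomic rename first; when source and
    // destination are on different file stores the rename fails, so fall
    // back to copy + delete instead of silently ignoring the failure.
    static void moveAcrossFileSystems(Path src, Path dest) throws IOException {
        try {
            // Analogous to FileSystem.rename(): only valid within one filesystem.
            Files.move(src, dest, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException | FileSystemException e) {
            // Cross-filesystem case: copy the data, then remove the source.
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
            Files.delete(src);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("results-cache");
        Path src = Files.createFile(dir.resolve("query-results.tmp"));
        Path dest = dir.resolve("cached-results");
        moveAcrossFileSystems(src, dest);
        System.out.println(Files.exists(dest) && !Files.exists(src)); // prints true
    }
}
```

Unlike a bare rename, this either moves the file or throws, so a caller can never end up serving an empty cache entry without noticing.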
[jira] [Created] (HIVE-20412) NPE in HiveMetaHook
Jason Dere created HIVE-20412: - Summary: NPE in HiveMetaHook Key: HIVE-20412 URL: https://issues.apache.org/jira/browse/HIVE-20412 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere {noformat} java.lang.NullPointerException: null at org.apache.hadoop.hive.metastore.HiveMetaHook.preAlterTable(HiveMetaHook.java:113) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:427) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:415) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at com.sun.proxy.$Proxy37.alter_table(Unknown Source) ~[?:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2933) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at com.sun.proxy.$Proxy37.alter_table(Unknown Source) ~[?:?] 
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:708) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.ql.util.HiveStrictManagedMigration$HiveUpdater.updateTableProperties(HiveStrictManagedMigration.java:954) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20397) HiveStrictManagedMigration updates
Jason Dere created HIVE-20397: - Summary: HiveStrictManagedMigration updates Key: HIVE-20397 URL: https://issues.apache.org/jira/browse/HIVE-20397 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere - Switch from using a Driver instance to using metastore calls via Hive.alterDatabase/Hive.alterTable - For tables converted from ORC to ACID tables, handle renaming of the files - Fix error handling so the utility does not terminate after the first error encountered -- This message was sent by Atlassian JIRA (v7.6.3#76005)
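The last item, continuing past per-table failures instead of aborting the whole run, can be sketched like this. The loop structure and the `migrateTable` step are illustrative placeholders, not the actual HiveStrictManagedMigration code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class MigrationLoop {
    // Hypothetical sketch: apply a migration step to each table, collecting
    // failures so one bad table does not terminate the entire migration.
    static List<String> migrateAll(List<String> tables, Consumer<String> migrateTable) {
        List<String> failed = new ArrayList<>();
        for (String table : tables) {
            try {
                migrateTable.accept(table);
            } catch (RuntimeException e) {
                // Log and keep going rather than aborting on the first error.
                System.err.println("Migration failed for " + table + ": " + e.getMessage());
                failed.add(table);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = migrateAll(
            Arrays.asList("db.ok_table", "db.bad_table", "db.other_table"),
            t -> { if (t.contains("bad")) throw new RuntimeException("rename failed"); });
        System.out.println(failed); // prints [db.bad_table]
    }
}
```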
[jira] [Created] (HIVE-20298) Illegal null value in column `TBLS`.`WRITE_ID`
Jason Dere created HIVE-20298: - Summary: Illegal null value in column `TBLS`.`WRITE_ID` Key: HIVE-20298 URL: https://issues.apache.org/jira/browse/HIVE-20298 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere Manually upgraded my existing local metastore using upgrade-3.0.0-to-3.1.0.mysql.sql, upgrade-3.1.0-to-3.2.0.mysql.sql, upgrade-3.2.0-to-4.0.0.mysql.sql. When running DESCRIBE EXTENDED of an existing table, I was getting the following error in hive.log. It looks like the ObjectStore/MTable classes don't seem to be able to support null values in the new writeId column that was added to the TBLS table in the metastore. cc [~sershe] [~ekoifman] {noformat} Caused by: javax.jdo.JDODataStoreException: Illegal null value in column `TBLS`.`WRITE_ID` NestedThrowables: org.datanucleus.store.rdbms.exceptions.NullValueException: Illegal null value in column `TBLS`.`WRITE_ID` at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:553) at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:255) at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:1802) at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:1838) at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1424) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) at com.sun.proxy.$Proxy39.getTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:2950) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:2898) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:2882) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ... 36 more Caused by: org.datanucleus.store.rdbms.exceptions.NullValueException: Illegal null value in column `TBLS`.`WRITE_ID` at org.datanucleus.store.rdbms.mapping.datastore.BigIntRDBMSMapping.getLong(BigIntRDBMSMapping.java:140) at org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.getLong(SingleFieldMapping.java:155) at org.datanucleus.store.rdbms.fieldmanager.ResultSetGetter.fetchLongField(ResultSetGetter.java:124) at org.datanucleus.state.AbstractStateManager.replacingLongField(AbstractStateManager.java:1549) at org.datanucleus.state.StateManagerImpl.replacingLongField(StateManagerImpl.java:120) at org.apache.hadoop.hive.metastore.model.MTable.dnReplaceField(MTable.java) at org.apache.hadoop.hive.metastore.model.MTable.dnReplaceFields(MTable.java) at org.datanucleus.state.StateManagerImpl.replaceFields(StateManagerImpl.java:3109) at org.datanucleus.store.rdbms.query.PersistentClassROF$1.fetchFields(PersistentClassROF.java:465) at org.datanucleus.state.StateManagerImpl.loadFieldValues(StateManagerImpl.java:2238) at org.datanucleus.state.StateManagerImpl.initialiseForHollow(StateManagerImpl.java:263) at org.datanucleus.state.ObjectProviderFactoryImpl.newForHollow(ObjectProviderFactoryImpl.java:112) at org.datanucleus.ExecutionContextImpl.findObject(ExecutionContextImpl.java:3097) at 
org.datanucleus.store.rdbms.query.PersistentClassROF.getObjectForDatastoreId(PersistentClassROF.java:460) at org.datanucleus.store.rdbms.query.PersistentClassROF.getObject(PersistentClassROF.java:385) at org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:188) at org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:416) at org.datanucleus.store.rdbms.query.ForwardQueryResult.processNumberOfResults(ForwardQueryResult.java:143) at org.datanucleus.store.rdbms.query.ForwardQueryResult.advanceToEndOfResultSet(ForwardQueryResult.java:171
[jira] [Created] (HIVE-20259) Cleanup of results cache directory
Jason Dere created HIVE-20259: - Summary: Cleanup of results cache directory Key: HIVE-20259 URL: https://issues.apache.org/jira/browse/HIVE-20259 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere The query results cache directory is currently deleted at process exit. This does not work in the case of a kill -9 or a sudden process exit of Hive. There should be some cleanup mechanism in place to take care of any old cache directories that were not deleted at process exit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
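One possible cleanup mechanism for the orphaned directories described above is to scan the cache's parent directory on startup and delete subdirectories older than a cutoff. The directory layout and the age threshold here are assumptions for illustration, not Hive's actual implementation:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class StaleCacheCleanup {
    // Hypothetical sketch: remove cache subdirectories whose last-modified
    // time predates the cutoff, covering dirs orphaned by a kill -9.
    static int deleteStaleDirs(Path cacheRoot, Instant cutoff) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(cacheRoot, Files::isDirectory)) {
            for (Path dir : dirs) {
                FileTime mtime = Files.getLastModifiedTime(dir);
                if (mtime.toInstant().isBefore(cutoff)) {
                    Files.delete(dir); // assumes empty dirs; real code would delete recursively
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("results-cache-root");
        Path stale = Files.createDirectory(root.resolve("results-old"));
        Files.setLastModifiedTime(stale, FileTime.from(Instant.now().minus(2, ChronoUnit.DAYS)));
        Files.createDirectory(root.resolve("results-new"));
        // Anything untouched for more than a day is treated as orphaned.
        int n = deleteStaleDirs(root, Instant.now().minus(1, ChronoUnit.DAYS));
        System.out.println(n); // prints 1
    }
}
```

Keying the scan off modification time (rather than a process-exit hook) is what makes the cleanup survive a kill -9, since the next process run can identify and remove whatever the previous one left behind.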
[jira] [Created] (HIVE-20250) Option to allow external tables to use query results cache
Jason Dere created HIVE-20250: - Summary: Option to allow external tables to use query results cache Key: HIVE-20250 URL: https://issues.apache.org/jira/browse/HIVE-20250 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20242) Query results cache: Improve ability of queries to use pending query results
Jason Dere created HIVE-20242: - Summary: Query results cache: Improve ability of queries to use pending query results Key: HIVE-20242 URL: https://issues.apache.org/jira/browse/HIVE-20242 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere HIVE-19138 allowed a currently running query to wait on the pending results of an already running query. [~gopalv], after testing with high concurrency, suggested further improving this by adding a way to switch to using the results cache even at the end of query compilation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
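The pending-results mechanism from HIVE-19138 that this builds on can be sketched as a cache whose entries expose a future: a second identical query blocks on the first query's completion instead of re-executing. The names below are illustrative, not QueryResultsCache's actual API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class PendingResultsCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
        new ConcurrentHashMap<>();

    // First caller for a query text becomes the producer; later callers get
    // the same (possibly still pending) future and wait instead of re-running.
    CompletableFuture<String> entryFor(String queryText) {
        return entries.computeIfAbsent(queryText, q -> new CompletableFuture<>());
    }

    public static void main(String[] args) throws Exception {
        PendingResultsCache cache = new PendingResultsCache();
        CompletableFuture<String> producer = cache.entryFor("SELECT count(*) FROM t");
        CompletableFuture<String> consumer = cache.entryFor("SELECT count(*) FROM t");
        // The second lookup sees the pending entry rather than a cache miss.
        System.out.println(producer == consumer); // prints true
        producer.complete("42");
        System.out.println(consumer.get()); // prints 42
    }
}
```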
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/clientpositive/spark/groupby8_noskew.q.out 2ef72b7c18 ql/src/test/results/clientpositive/spark/groupby9.q.out 316f936db3 ql/src/test/results/clientpositive/spark/groupby_position.q.out 7bb5f18e41 ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 873717273d ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 571203089d ql/src/test/results/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff 
ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/3/ Changes: https://reviews.apache.org/r/68013/diff/2-3/ Testing --- Thanks, Jason Dere
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/groupby7_map_multi_single_reducer.q.out 9d09491a46 ql/src/test/results/clientpositive/spark/groupby7_map_skew.q.out 5868f7abf9 ql/src/test/results/clientpositive/spark/groupby7_noskew.q.out 53345aac9e ql/src/test/results/clientpositive/spark/groupby7_noskew_multi_single_reducer.q.out 68809005e1 ql/src/test/results/clientpositive/spark/groupby8.q.out c6cac1bf80 ql/src/test/results/clientpositive/spark/groupby8_map.q.out 40d3e7c103 ql/src/test/results/clientpositive/spark/groupby8_map_skew.q.out 053c717d09 ql/src/test/results/clientpositive/spark/groupby8_noskew.q.out 2ef72b7c18 ql/src/test/results/clientpositive/spark/groupby9.q.out 316f936db3 ql/src/test/results/clientpositive/spark/groupby_position.q.out 7bb5f18e41 ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 873717273d ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 571203089d ql/src/test/results/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda 
ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/2/ 
Changes: https://reviews.apache.org/r/68013/diff/1-2/ Testing --- Thanks, Jason Dere
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
> On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 20-21 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line20> > > > > Need to use slf4j. Will fix > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 35 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line35> > > > > Don't see a deleted file of earlier udf in patch. We shall delete that. Deleting UDFToString in the new patch > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 53 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line53> > > > > I guess there can be a string representation for map,array,struct. > > Wasn't earlier udf supporting it? If not, lets leave a TODO here. Just tried a query on Hive master casting complex types to String, it fails elsewhere during query compilation. So there seem to be other obstacles here besides this. 
2018-07-23T14:40:18,381 ERROR [7c182ca7-d2aa-4949-a3f3-ba034754c3c2 main] ql.Driver: FAILED: ClassCastException org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.isRedundantConversionFunction(TypeCheckProcFactory.java:893) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:996) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1468) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:240) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:186) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12684) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12639) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:4614) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4951) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1740) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1688) > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/test/results/clientpositive/char_pad_convert.q.out > > Line 133 (original), 133 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062590#file2062590line133> > > > > Lets add test for cast from dec to char/varchar as well. Both cases > > where size of char/varchar is bigger as well as smaller than decimal's > > scale. Added test in TestObjectInspectorConverters - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68013/#review206338 --- On July 23, 2018, 6:37 a.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68013/ > --- > > (Updated July 23, 2018, 6:37 a.m.) > > > Review request for hive, Ashutosh Chauhan and Sergey Shelukhin. > > > Bugs: HIVE-20082 > https://issues.apache.org/jira/browse/HIVE-20082 > > > Repository: hive-git > > > Description > --- > > preserve decimal 0-padding during decimal-to-string conversion > > > Diffs > - >
Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 
ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/1/ Testing --- Thanks, Jason Dere
Re: Review Request 67974: HIVE-20164
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67974/#review206321 --- ql/src/test/queries/clientpositive/murmur_hash_migration.q Lines 57 (patched) <https://reviews.apache.org/r/67974/#comment289246> Can you make this count(*)? Kind of hard to verify. - Jason Dere On July 20, 2018, 11:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67974/ > --- > > (Updated July 20, 2018, 11:10 p.m.) > > > Review request for hive, Gopal V and Jason Dere. > > > Bugs: HIVE-20164 > https://issues.apache.org/jira/browse/HIVE-20164 > > > Repository: hive-git > > > Description > --- > > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > > > Diffs > - > > itests/src/test/resources/testconfiguration.properties d5a33bd8ca > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1661aeccd7 > ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e > ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION > ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67974/diff/2/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 67974: HIVE-20164
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67974/#review206291 --- ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1672 (patched) <https://reviews.apache.org/r/67974/#comment289187> Remove these comments? ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1695 (patched) <https://reviews.apache.org/r/67974/#comment289188> please add curly braces ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1701 (patched) <https://reviews.apache.org/r/67974/#comment289189> curly braces ql/src/test/queries/clientpositive/murmur_hash_migration.q Lines 36 (patched) <https://reviews.apache.org/r/67974/#comment289194> Does this test also need to query the inserted tables to show that things are working properly? - Jason Dere On July 19, 2018, 6:02 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67974/ > --- > > (Updated July 19, 2018, 6:02 p.m.) > > > Review request for hive, Gopal V and Jason Dere. > > > Bugs: HIVE-20164 > https://issues.apache.org/jira/browse/HIVE-20164 > > > Repository: hive-git > > > Description > --- > > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > > > Diffs > - > > itests/src/test/resources/testconfiguration.properties d08528f319 > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1b433c7498 > ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e > ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION > ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67974/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 67970: HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67970/ --- (Updated July 19, 2018, 8:12 p.m.) Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez. Changes --- Update to fix failures in join45.q,join47.q,mapjoin47.q Bugs: HIVE-20204 https://issues.apache.org/jira/browse/HIVE-20204 Repository: hive-git Description --- Change GenericUDFIn to use FunctionRegistry.getCommonClassForComparison() to match type conversion done during other comparison operations. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 0800a10541 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java 2ae015adf4 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java cf26fce00f ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java c91865b173 ql/src/test/queries/clientpositive/orc_ppd_decimal.q 2134a9f207 ql/src/test/queries/clientpositive/parquet_ppd_decimal.q e8e118d541 ql/src/test/queries/clientpositive/vectorization_parquet_ppd_decimal.q 0b0811b055 ql/src/test/results/clientpositive/llap/orc_ppd_decimal.q.out 4b535d4480 ql/src/test/results/clientpositive/parquet_ppd_decimal.q.out c9a4338dbf ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 Diff: https://reviews.apache.org/r/67970/diff/2/ Changes: https://reviews.apache.org/r/67970/diff/1-2/ Testing --- Thanks, Jason Dere
Review Request 67970: HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67970/ --- Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez. Bugs: HIVE-20204 https://issues.apache.org/jira/browse/HIVE-20204 Repository: hive-git Description --- Change GenericUDFIn to use FunctionRegistry.getCommonClassForComparison() to match type conversion done during other comparison operations. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 0800a10541 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java cf26fce00f ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java c91865b173 ql/src/test/queries/clientpositive/orc_ppd_decimal.q 2134a9f207 ql/src/test/queries/clientpositive/parquet_ppd_decimal.q e8e118d541 ql/src/test/queries/clientpositive/vectorization_parquet_ppd_decimal.q 0b0811b055 ql/src/test/results/clientpositive/llap/orc_ppd_decimal.q.out 4b535d4480 ql/src/test/results/clientpositive/parquet_ppd_decimal.q.out c9a4338dbf ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 Diff: https://reviews.apache.org/r/67970/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-20204) Type conversion during IN () comparisons is using different rules from other comparison operations
Jason Dere created HIVE-20204: - Summary: Type conversion during IN () comparisons is using different rules from other comparison operations Key: HIVE-20204 URL: https://issues.apache.org/jira/browse/HIVE-20204 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Noticed this while looking at HIVE-20082. The type conversion done during GenericUDFIn (via ReturnObjectInspectorResolver) uses FunctionRegistry.getCommonClass(), whereas the other comparison operators (=, <, >, <=, >=) use FunctionRegistry.getCommonClassForComparison(). As a result, dec_column IN ('1.1', '2.2') compares the values as strings, whereas dec_column = '1.1' would compare the values as doubles. This makes a difference for HIVE-20082 since it is related to changing the 0-padding during decimal-to-string conversions. cc [~ashutoshc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
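The inconsistency described in HIVE-20204 can be sketched outside Hive. This is a hypothetical illustration, not Hive's actual code: `in_as_strings` stands in for the old GenericUDFIn behavior (common class of DECIMAL and STRING resolved to STRING), while `in_as_doubles` stands in for getCommonClassForComparison() semantics (DECIMAL vs STRING compared as DOUBLE, matching `dec_column = '1.1'`).

```python
from decimal import Decimal

def in_as_strings(value, literals):
    # Old IN behavior sketch: both sides converted to STRING, so the
    # 0-padded decimal string form "1.10" fails to match "1.1".
    return str(value) in literals

def in_as_doubles(value, literals):
    # Comparison-operator behavior sketch: both sides coerced to DOUBLE
    # first, so numerically equal values match regardless of padding.
    return float(value) in [float(x) for x in literals]

dec = Decimal("1.10")  # decimal whose string form carries trailing-zero padding
print(in_as_strings(dec, ["1.1", "2.2"]))  # False: "1.10" != "1.1" as strings
print(in_as_doubles(dec, ["1.1", "2.2"]))  # True: 1.10 == 1.1 as doubles
```

This is why the ticket matters for HIVE-20082: any change to 0-padding in decimal-to-string conversion silently changes IN results under the string-comparison rule, but not under the double-comparison rule.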
[jira] [Created] (HIVE-19981) Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped
Jason Dere created HIVE-19981: - Summary: Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped Key: HIVE-19981 URL: https://issues.apache.org/jira/browse/HIVE-19981 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Using the HiveStrictManagedMigration utility, tables can be converted to conform to the Hive strict managed tables mode. For managed tables that are converted to external tables by the utility, these tables should keep the "drop data on delete" semantics they had when they were managed tables. One way to do this is to introduce a table property "external.table.purge", which if true (and if the table is an external table), will let Hive know to delete the table data when the table is dropped. This property will be set by the HiveStrictManagedMigration utility when managed tables are converted to external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
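The drop-time semantics proposed in HIVE-19981 can be sketched as a small decision function. This is a simplified model, not Hive's actual drop-table code path; only the property name `external.table.purge` comes from the ticket.

```python
def should_delete_data_on_drop(table_type, table_params):
    """Sketch of the proposed drop-table decision (simplified)."""
    if table_type == "MANAGED_TABLE":
        return True  # managed tables always drop their data
    if table_type == "EXTERNAL_TABLE":
        # External tables keep their data on drop unless the table was
        # marked with external.table.purge=true (e.g. by the migration
        # utility, to preserve its former managed-table semantics).
        return table_params.get("external.table.purge", "false").lower() == "true"
    return False

converted = {"external.table.purge": "true"}       # set by HiveStrictManagedMigration
print(should_delete_data_on_drop("EXTERNAL_TABLE", converted))  # True
print(should_delete_data_on_drop("EXTERNAL_TABLE", {}))         # False
```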
Review Request 67608: HIVE-19898 Disable TransactionalValidationListener when the table is not in the Hive catalog
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67608/ --- Review request for hive and Eugene Koifman. Bugs: HIVE-19898 https://issues.apache.org/jira/browse/HIVE-19898 Repository: hive-git Description --- - Only run TransactionalValidationListener for hive catalog - Added unit test - Listener also did not seem to be getting the configuration that the metastore was being initialized with - made a change to how the conf was being retrieved. Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestTransactionalValidationListener.java PRE-CREATION standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java 56da1151cc standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreClientTest.java a0e9d32546 Diff: https://reviews.apache.org/r/67608/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19898) Disable TransactionalValidationListener when the table is not in the Hive catalog
Jason Dere created HIVE-19898: - Summary: Disable TransactionalValidationListener when the table is not in the Hive catalog Key: HIVE-19898 URL: https://issues.apache.org/jira/browse/HIVE-19898 Project: Hive Issue Type: Bug Components: Metastore, Standalone Metastore Reporter: Jason Dere Assignee: Jason Dere The TransactionalValidationListener does validation of tables specified as transactional tables, as well as enforcing create.as.acid. While this can be useful to Hive, this may not be useful to other catalogs which do not support transactional tables, and would not benefit from being automatically tagged as a transactional table. This should be changed so the TransactionalValidationListener does not run for non-hive catalogs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19892) Disable query results cache for HiveServer2 doAs=true
Jason Dere created HIVE-19892: - Summary: Disable query results cache for HiveServer2 doAs=true Key: HIVE-19892 URL: https://issues.apache.org/jira/browse/HIVE-19892 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere If running HS2 with doAs=true, the temp query results directory will have ownership/permissions based on the doAs user. A subsequent query running as a different user may not be able to access this query results directory. Results caching will have to be disabled in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19883) QTestUtil: initDataset() can be affected by the settings of the previous test
Jason Dere created HIVE-19883: - Summary: QTestUtil: initDataset() can be affected by the settings of the previous test Key: HIVE-19883 URL: https://issues.apache.org/jira/browse/HIVE-19883 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Jason Dere Tried creating a test that set metastore.create.as.acid/hive.create.as.insert.only, and I found that the built-in table default.src was being created as an insert-only transactional table, which will cause errors in other tests that do not set the TxnManager to one that supports transactional tables. It appears that initDataset() uses the old CliDriver that was used for the previous test, which has any settings used during that test: {noformat} java.lang.Exception: Creating src at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4926) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:428) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2659) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2311) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1982) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1683) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1677) [hive-exec-4.0.0-SNAPSHOT.jar:?] 
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.QTestUtil.initDataset(QTestUtil.java:1277) [classes/:?] at org.apache.hadoop.hive.ql.QTestUtil.initDataSetForTest(QTestUtil.java:1259) [classes/:?] at org.apache.hadoop.hive.ql.QTestUtil.cliInit(QTestUtil.java:1328) [classes/:?] at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:176) [classes/:?] at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [classes/:?] at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59) [test-classes/:?] {noformat} A new CliDriver is created for the new test, but only after we've created the dataset tables for the next test (see the line numbers for QTestUtil.cliInit() in both stack traces). {noformat} CliSessionState(SessionState).getConf() line: 317 CliDriver.<init>() line: 110 QTestUtil.cliInit(File, boolean) line: 1360 CoreCliDriver.runTest(String, String, String) line: 176 CoreCliDriver(CliAdapter).runTest(String, File) line: 104 TestMiniLlapLocalCliDriver.testCliDriver() line: 59 {noformat} I think the fix is to move the creation of the new CliDriver higher up in QTestUtil.cliInit(), before we call initDataset(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
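The ordering bug in HIVE-19883 can be modeled with a toy session sketch (hypothetical names; the real classes are CliDriver and QTestUtil): if the dataset is initialized before a fresh driver is created, the dataset tables inherit the previous test's settings.

```python
class FakeCliDriver:
    def __init__(self, conf=None):
        self.conf = dict(conf or {})

def init_dataset(driver):
    # The dataset tables are created under whatever conf the driver carries.
    return {"src_created_with": dict(driver.conf)}

def cli_init_buggy(old_driver):
    # Bug sketch: dataset built with the *previous* test's driver, so its
    # leftover settings (e.g. create.as.acid) leak into default.src.
    dataset = init_dataset(old_driver)
    new_driver = FakeCliDriver()       # fresh driver created too late
    return new_driver, dataset

def cli_init_fixed(old_driver):
    # Fix sketch: create the fresh driver first, then init the dataset.
    new_driver = FakeCliDriver()
    dataset = init_dataset(new_driver)
    return new_driver, dataset

old = FakeCliDriver({"metastore.create.as.acid": "true"})
_, buggy = cli_init_buggy(old)
_, fixed = cli_init_fixed(old)
print(buggy["src_created_with"])  # leaked setting from the previous test
print(fixed["src_created_with"])  # {}
```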
Review Request 67540: HIVE-19861 Fix temp table path generation for acid table export
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67540/ --- Review request for hive and Eugene Koifman. Bugs: HIVE-19861 https://issues.apache.org/jira/browse/HIVE-19861 Repository: hive-git Description --- Change DDLTask so temp tables do not get location generated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e06949928d ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 209fdfb287 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 2e055aba4b ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 04292787a8 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 83490d2d53 Diff: https://reviews.apache.org/r/67540/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19861) Fix temp table path generation for acid table export
Jason Dere created HIVE-19861: - Summary: Fix temp table path generation for acid table export Key: HIVE-19861 URL: https://issues.apache.org/jira/browse/HIVE-19861 Project: Hive Issue Type: Bug Components: Import/Export, Transactions Reporter: Jason Dere Assignee: Jason Dere Temp tables that are analyzed by the SemanticAnalyzer get their default location set to a location in the session directory. Export of Acid tables also creates temp tables, but this is done via a plan transformation, and the temp table creation never goes through the SemanticAnalyzer, meaning the location is not set. There is some other logic in DDLTask (which I am changing in HIVE-19837) which ends up automatically setting this path to the default table location in the warehouse directory. This should be fixed so that the path defaults to a location in the session directory, like with normal temp tables. cc [~ekoifman] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19837) Setting to have different default location for external tables
Jason Dere created HIVE-19837: - Summary: Setting to have different default location for external tables Key: HIVE-19837 URL: https://issues.apache.org/jira/browse/HIVE-19837 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Allow external tables to have a different default location than managed tables -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19778) Flaky test: TestCliDriver#input31
Jason Dere created HIVE-19778: - Summary: Flaky test: TestCliDriver#input31 Key: HIVE-19778 URL: https://issues.apache.org/jira/browse/HIVE-19778 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Jason Dere Noticed this one has been failing occasionally on precommit test runs. {noformat} Running: diff -a /home/hiveptest/35.193.227.186-hiveptest-1/apache-github-source-source/itests/qtest/target/qfile-results/clientpositive/input31.q.out /home/hiveptest/35.193.227.186-hiveptest-1/apache-github-source-source/ql/src/test/results/clientpositive/input31.q.out 128c128 < 496 --- > 242 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19777) NPE in TezSessionState
Jason Dere created HIVE-19777: - Summary: NPE in TezSessionState Key: HIVE-19777 URL: https://issues.apache.org/jira/browse/HIVE-19777 Project: Hive Issue Type: Bug Components: Tez Reporter: Jason Dere Encountered while running "insert into table values (..)" Looks like it is due to the fact that TezSessionState.close() sets console to null at the start of the method, and then calls getSession() which attempts to log to console. {noformat} java.lang.NullPointerException: null at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.getSession(TezSessionState.java:711) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:646) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:353) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:467) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getUnmanagedSession(WorkloadManagerFederation.java:66) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getSession(WorkloadManagerFederation.java:38) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:184) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2497) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2149) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1569) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1563) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121] at org.apache.hadoop.util.RunJar.run(RunJar.java:308) ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.util.RunJar.main(RunJar.java:222) ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?] {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
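The NPE pattern in HIVE-19777 is a plain use-after-clear ordering bug; a toy model (hypothetical names, standing in for TezSessionState, with AttributeError standing in for the Java NPE):

```python
class Session:
    def __init__(self):
        self.console = ["log sink"]
        self.session = "tez-session"

    def get_session(self):
        # getSession() logs to console, so it fails if console is gone.
        if self.console is None:
            raise AttributeError("console is None")
        return self.session

    def close_buggy(self):
        self.console = None        # console cleared at the start of close()...
        return self.get_session()  # ...then a console-using call blows up

    def close_fixed(self):
        s = self.get_session()     # do console-dependent work first
        self.console = None        # clear console last
        return s
```

The sketch suggests the same fix options as the ticket implies: either defer nulling out `console` until the end of `close()`, or null-guard the logging inside `getSession()`.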
[jira] [Created] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode
Jason Dere created HIVE-19768: - Summary: Utility to convert tables to conform to Hive strict managed tables mode Key: HIVE-19768 URL: https://issues.apache.org/jira/browse/HIVE-19768 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere Create a utility that can check existing hive tables and convert them if necessary to conform to strict managed tables mode. - Managed non-transactional ORC tables will be converted to full transactional tables - Managed non-transactional tables of other types will be converted to insert-only transactional tables - Tables with non-native storage/schema will be converted to external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
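The three conversion rules above can be sketched as a mapping function. This is a simplified, hypothetical helper; the real utility is HiveStrictManagedMigration and applies many more checks.

```python
def conversion_action(is_managed, is_transactional, storage, native=True):
    """Sketch of the conversion rules listed in HIVE-19768 (simplified)."""
    if not native:
        # Non-native storage/schema cannot be transactional at all.
        return "convert to external"
    if is_managed and not is_transactional:
        if storage == "ORC":
            return "convert to full transactional"
        return "convert to insert-only transactional"
    return "leave as-is"

print(conversion_action(True, False, "ORC"))                 # full transactional
print(conversion_action(True, False, "TEXTFILE"))            # insert-only
print(conversion_action(True, False, "ORC", native=False))   # external
```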
[jira] [Created] (HIVE-19753) Strict managed tables mode in Hive
Jason Dere created HIVE-19753: - Summary: Strict managed tables mode in Hive Key: HIVE-19753 URL: https://issues.apache.org/jira/browse/HIVE-19753 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Create a mode in Hive which enforces that all managed tables are transactional (both full and insert-only tables allowed). Non-transactional tables, as well as non-native tables, must be created as external tables when this mode is enabled. The idea would be that in strict managed tables mode all of the data written to managed tables would have been done through Hive. The mode would be enabled using the config setting hive.strict.managed.tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19563) Flaky test: TestMiniLlapLocalCliDriver.tez_vector_dynpart_hashjoin_1
Jason Dere created HIVE-19563: - Summary: Flaky test: TestMiniLlapLocalCliDriver.tez_vector_dynpart_hashjoin_1 Key: HIVE-19563 URL: https://issues.apache.org/jira/browse/HIVE-19563 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Jason Dere {noformat} Client Execution succeeded but contained differences (error code = 1) after executing tez_vector_dynpart_hashjoin_1.q 407c407 < -13036 1 --- > -8915 1 410c410 < -8915 1 --- > -13036 1 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67138: HIVE-4367 enhance TRUNCATE syntax to drop data of external table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67138/ --- Review request for hive and Teddy Choi. Bugs: HIVE-4367 https://issues.apache.org/jira/browse/HIVE-4367 Repository: hive-git Description --- Allow TRUNCATE TABLE for external tables with FORCE option Diffs - itests/src/test/resources/testconfiguration.properties cf6d19a593 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f0b9edaf01 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 09a4368984 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 3712a53521 ql/src/test/queries/clientpositive/truncate_external_force.q PRE-CREATION ql/src/test/results/clientpositive/llap/truncate_external_force.q.out PRE-CREATION Diff: https://reviews.apache.org/r/67138/diff/1/ Testing --- qtest Thanks, Jason Dere
[jira] [Created] (HIVE-19489) Disable stats autogather for external tables
Jason Dere created HIVE-19489: - Summary: Disable stats autogather for external tables Key: HIVE-19489 URL: https://issues.apache.org/jira/browse/HIVE-19489 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Jason Dere Assignee: Jason Dere Hive auto-gather of table statistics can result in incorrect generation of stats (and the stats being marked as accurate) in the case of external tables where the data is being written by external apps. To avoid this issue, stats autogather will be disabled on external tables when loading/inserting into a table with existing data, if HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS is enabled. In this situation, users should rely on explicitly calling ANALYZE TABLE on their external tables to make sure the stats are kept up-to-date. Autogather of stats will still be allowed to occur on external tables in the case of INSERT OVERWRITE or LOAD DATA OVERWRITE, since the existing data is being removed and so the stats calculated on the inserted/loaded data should be accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
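The autogather rule described in HIVE-19489 reduces to a small predicate; a sketch (simplified, with `guard_enabled` standing in for the HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS setting):

```python
def autogather_stats_allowed(is_external, is_overwrite, guard_enabled=True):
    """Sketch of the HIVE-19489 stats-autogather decision (simplified)."""
    if not guard_enabled or not is_external:
        return True
    # External table with the guard on: only overwrite operations
    # (INSERT OVERWRITE / LOAD DATA OVERWRITE) may autogather stats,
    # since they replace any externally written data, so the computed
    # stats reflect the full table contents.
    return is_overwrite

print(autogather_stats_allowed(is_external=True, is_overwrite=False))   # False
print(autogather_stats_allowed(is_external=True, is_overwrite=True))    # True
print(autogather_stats_allowed(is_external=False, is_overwrite=False))  # True
```

For the disabled case, users fall back on explicit ANALYZE TABLE runs to keep stats current, as the ticket notes.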
Re: Review Request 66999: HIVE-19453
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66999/#review202699 --- ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g Line 838 (original), 839 (patched) <https://reviews.apache.org/r/66999/#comment284680> Should the inputFileFormat expression be aliased, like '(inputFileFmt=inputFileFormat)?', and referenced in the line below as '$inputFileFmt?' ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g Line 839 (original), 840 (patched) <https://reviews.apache.org/r/66999/#comment284686> Might be useful to be able to pass in SerDe params which are used to initialize the SerDe - this could be useful for some SerDes. For example LazySimpleSerDe allows you to pass in the field separator, or set the timestamp format etc. ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java Lines 475 (patched) <https://reviews.apache.org/r/66999/#comment284684> Is this supposed to be set using the class name (String), or the actual class object (Class)? Do the inputFormat/serde classes need to be validated here? ql/src/test/queries/clientpositive/load_data_using_job.q Lines 90 (patched) <https://reviews.apache.org/r/66999/#comment284685> Previously what would indicate to Hive that an INSERT plan was required, as opposed to just saving the data as-is like is done for a traditional LOAD DATA? - Jason Dere On May 8, 2018, 6:12 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66999/ > --- > > (Updated May 8, 2018, 6:12 a.m.) > > > Review request for hive, Jason Dere and Prasanth_J. > > > Bugs: HIVE-19453 > https://issues.apache.org/jira/browse/HIVE-19453 > > > Repository: hive-git > > > Description > --- > > Extend the load data statement to take the inputformat of the source files > and the serde to interpret it as parameter. 
For eg, > > load data local inpath > '../../data/files/load_data_job/partitions/load_data_2_partitions.txt' INTO > TABLE srcbucket_mapjoin > INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' > SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'; > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g a837d67b96 > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java > 2b88ea651b > ql/src/test/queries/clientpositive/load_data_using_job.q 3928f1fa07 > ql/src/test/results/clientpositive/llap/load_data_using_job.q.out > 116630c237 > > > Diff: https://reviews.apache.org/r/66999/diff/1/ > > > Testing > --- > > Added a test to load_data_using_job.q > > > Thanks, > > Deepak Jaiswal > >
[jira] [Created] (HIVE-19467) Make storage format configurable for temp tables created using LLAP external client
Jason Dere created HIVE-19467: - Summary: Make storage format configurable for temp tables created using LLAP external client Key: HIVE-19467 URL: https://issues.apache.org/jira/browse/HIVE-19467 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Temp tables created for complex queries when using the LLAP external client are created using the default storage format. Default to orc, and make configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66862: HIVE-19258 add originals support to MM tables (and make the conversion a metadata only operation)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66862/#review202395 --- ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java Lines 553 (patched) <https://reviews.apache.org/r/66862/#comment284304> 'fi' - comment chopped off? ql/src/test/queries/clientpositive/mm_conversions.q Lines 28 (patched) <https://reviews.apache.org/r/66862/#comment284204> No golden file changes for this test in this patch. - Jason Dere On May 3, 2018, 2:23 a.m., Sergey Shelukhin wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66862/ > --- > > (Updated May 3, 2018, 2:23 a.m.) > > > Review request for hive and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > see jira > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6358ff3002 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 7e17d5d888 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 3141a7e981 > ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 969c591917 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 183515a0ed > ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java b25bb1de49 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 2337a350e6 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > b698c84080 > ql/src/test/queries/clientpositive/mm_conversions.q 55565a9428 > > > Diff: https://reviews.apache.org/r/66862/diff/2/ > > > Testing > --- > > > Thanks, > > Sergey Shelukhin > >
Review Request 66887: HIVE-19336 Disable SMB/Bucketmap join for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66887/ --- Review request for hive and Deepak Jaiswal. Bugs: HIVE-19336 https://issues.apache.org/jira/browse/HIVE-19336 Repository: hive-git Description --- Disable SMB/Bucketmap join for external tables by default Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 7121bceb22 ql/src/test/queries/clientpositive/bucket_map_join_tez2.q 1361e32c1a ql/src/test/queries/clientpositive/tez_smb_1.q ecfb0dcf79 ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out fa90ccd556 ql/src/test/results/clientpositive/llap/tez_smb_1.q.out faa948627e Diff: https://reviews.apache.org/r/66887/diff/1/ Testing --- qfile tests Thanks, Jason Dere
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review202074 --- ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java Line 341 (original), 352 (patched) <https://reviews.apache.org/r/66567/#comment283728> Can you just add a comment here describing why it is ok to hardcode bucketing version to 2 here? - Jason Dere On April 27, 2018, 1:14 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 27, 2018, 1:14 a.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses Java hash, which is not as good as Murmur for distribution > and efficiency in bucketing a table. > Migrate to Murmur hash but still keep backward compatibility for existing > users so that they don't have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in a high number of result updates. 
> > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > fe2b1c1f3c > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 1a346593fd > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > 
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java > d4363fdf91 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 25035433c7 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java > a42c299537 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/keyseries/VectorKeySeriesSerializedImpl.java > 86f466fc4e > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java > 1bc3fdabac > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 71498a125c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 019682fb10 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java a51fdd322f > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java > 7121bceb22 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java > 5f65f638ca > ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerOperatorFactory.java > 2be3c9b9a2 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOpti
[jira] [Created] (HIVE-19336) Disable SMB/Bucketmap join for external tables
Jason Dere created HIVE-19336: - Summary: Disable SMB/Bucketmap join for external tables Key: HIVE-19336 URL: https://issues.apache.org/jira/browse/HIVE-19336 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19335) Disable runtime filtering (semijoin reduction opt with bloomfilter) for external tables
Jason Dere created HIVE-19335: - Summary: Disable runtime filtering (semijoin reduction opt with bloomfilter) for external tables Key: HIVE-19335 URL: https://issues.apache.org/jira/browse/HIVE-19335 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Even with good stats runtime filtering can cause issues, if they are out of date things are even worse. Disable by default for external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19334) Use actual file size rather than stats for fetch task optimization with external tables
Jason Dere created HIVE-19334: - Summary: Use actual file size rather than stats for fetch task optimization with external tables Key: HIVE-19334 URL: https://issues.apache.org/jira/browse/HIVE-19334 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19333) Disable operator tree branch removal using stats
Jason Dere created HIVE-19333: - Summary: Disable operator tree branch removal using stats Key: HIVE-19333 URL: https://issues.apache.org/jira/browse/HIVE-19333 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Can result in wrong results if branch removal occurs due to out-of-date stats -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19332) Disable compute.query.using.stats for external table
Jason Dere created HIVE-19332: - Summary: Disable compute.query.using.stats for external table Key: HIVE-19332 URL: https://issues.apache.org/jira/browse/HIVE-19332 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19329) Disallow some optimizations/behaviors for external tables
Jason Dere created HIVE-19329: - Summary: Disallow some optimizations/behaviors for external tables Key: HIVE-19329 URL: https://issues.apache.org/jira/browse/HIVE-19329 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere External tables in Hive are often used in situations where the data is being created and managed by other applications outside of Hive. There are several issues that can occur when data is being written to table directories by external apps: - If an application is writing files to a table/partition at the same time that Hive tries to merge files for the same table/partition (ALTER TABLE CONCATENATE, or hive.merge.tezfiles during insert), data can be lost. - When new data has been added to the table by external applications, the Hive table statistics are often way out of date with the current state of the data. This can result in wrong results in the case of answering queries using stats, or bad query plans being generated. Some of these operations should be blocked in Hive. It looks like some already have been (HIVE-17403). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201975 --- hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java Lines 179 (patched) <https://reviews.apache.org/r/66567/#comment283611> Check the existing table params for bucketing_version before hard-coding to v2. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java Lines 143 (patched) <https://reviews.apache.org/r/66567/#comment283612> This derives from Operator? So it should already have the bucketingVersion field from that? ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java Line 339 (original), 339 (patched) <https://reviews.apache.org/r/66567/#comment283613> I think this change is no longer necessary. standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java Lines 89 (patched) <https://reviews.apache.org/r/66567/#comment283614> Is this no longer used? - Jason Dere On April 25, 2018, 7:21 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 25, 2018, 7:21 a.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. 
> > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > fe2b1c1f3c > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 2c1a76d89b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java > d4363fdf91 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6395c31ec7 
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/keyseries/VectorKeySeriesSerializedImpl.java > 86f466fc4e > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java > 4077552a56 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java > 1bc3fdabac > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 71498a125c
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
> On April 24, 2018, 11:29 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java > > Line 156 (original), 156 (patched) > > <https://reviews.apache.org/r/66567/diff/1-5/?file=1996135#file1996135line156> > > > > What is the point of the conf and HIVE_BUCKETING_JAVA_HASH, is it > > supposed to be for testing? I don't see this setting being used anywhere. > > Deepak Jaiswal wrote: > The setting lets users use the old bucketing logic if they want to. I am > working on a testcase to cover it. Why not just allow users to set bucketing_version in the table properties? > On April 24, 2018, 11:29 p.m., Jason Dere wrote: > > ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out > > Lines 181 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996191#file1996191line181> > > > > Why did bucketing version disappear here? > > Deepak Jaiswal wrote: > I removed a place where I was setting it, possibly due to that. Is it a bug that it is not being set here now? - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201866 --- On April 23, 2018, 5:26 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 23, 2018, 5:26 p.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. 
> > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2403d7ac6c > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 3aaa68b11f > 
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > ql/src/jav
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201866 --- ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java Line 138 (original), 139 (patched) <https://reviews.apache.org/r/66567/#comment283474> Remove? ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java Line 156 (original), 156 (patched) <https://reviews.apache.org/r/66567/#comment283478> What is the point of the conf and HIVE_BUCKETING_JAVA_HASH, is it supposed to be for testing? I don't see this setting being used anywhere. ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out Lines 181 (patched) <https://reviews.apache.org/r/66567/#comment283479> Why did bucketing version disappear here? ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java Lines 1605 (patched) <https://reviews.apache.org/r/66567/#comment283473> Can you use bucketing version from OpTraits, rather than having to redefine it here? ql/src/test/results/clientpositive/results_cache_invalidation2.q.out Lines 88 (patched) <https://reviews.apache.org/r/66567/#comment283475> This plan should not change to not using the cache. It's possible this is because of HIVE-19232. - Jason Dere On April 23, 2018, 5:26 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 23, 2018, 5:26 p.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. 
> > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2403d7ac6c > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 3aaa68b11f > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a >
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> On April 16, 2018, 7:45 p.m., Sergey Shelukhin wrote: > > llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java > > Lines 331 (patched) > > <https://reviews.apache.org/r/66514/diff/3/?file=1996069#file1996069line331> > > > > is this related? I threw that in, since this patch (plus this fix) also fixed TestAcidOnTez#testGetSplitsLocks - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/#review201250 --- On April 11, 2018, 7:58 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66514/ > --- > > (Updated April 11, 2018, 7:58 p.m.) > > > Review request for hive, Eugene Koifman and Sergey Shelukhin. > > > Repository: hive-git > > > Description > --- > > Replace usage of SessionState.getTxnMgr() from several places, by doing some > refactoring to make the TxnManager available in fields passed in during > construction/initialization: > - SemanticAnalyzer.genFileSinkPlan() > - ReplicationSemanticAnalyzer.analyzeReplLoad() > - LoadSemanticAnalyzer.analyzeExternal() > - ImportSemanticAnalyzer.prepareImport() > - DDLSemanticAnalyzer.handleTransactionalTable() > > > Diffs > - > > > llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java > 3aec46be51 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 > ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java > 6b333d7184 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java > 60c85f58e5 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java > bc7d0ad0b9 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java > 06adc64727 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java > 1395027159 > > 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java > bb51f36a25 > ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > f38b0bc546 > ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java > 8b639f7922 > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java > e49089b91e > > ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java > e5af95b121 > > ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java > 79b2e48ee2 > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 10982ddbd1 > > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java > 3ccd639d62 > > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java > 4cd75d8128 > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6003ced27e > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java > fe570f0f8e > > > Diff: https://reviews.apache.org/r/66514/diff/3/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
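The refactoring described in the review above replaces static lookups of the transaction manager (`SessionState.getTxnMgr()`) with a manager passed in at construction time. A minimal, hypothetical sketch of that constructor-injection shape (the `TxnManager` interface and class names here are illustrative, not Hive's actual `HiveTxnManager` API):

```java
// Illustrative sketch of constructor injection replacing a static
// session-state lookup. Names are hypothetical, not Hive's real API.
interface TxnManager {
    long openTxn(); // stand-in for whatever the analyzer needs from the manager
}

class SemanticAnalyzerSketch {
    // The manager is a field supplied by the caller during construction,
    // instead of being fetched via a static SessionState accessor.
    private final TxnManager txnMgr;

    SemanticAnalyzerSketch(TxnManager txnMgr) {
        this.txnMgr = txnMgr;
    }

    long genFileSinkPlan() {
        // Uses the injected manager; the method no longer depends on
        // ambient session state, which also makes it testable in isolation.
        return txnMgr.openTxn();
    }
}
```

The benefit is that callers (and tests) control which transaction manager a given analyzer sees, rather than every code path implicitly sharing one global instance.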
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
> On April 14, 2018, 1:13 a.m., Jason Dere wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java > > Lines 813 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996736#file1996736line814> > > > > For these primitive types, might make sense to pre-allocate fixed size > > ByteBuffers of size 2/4/8 which can be used here rather than having to > > allocate new ones for every value. > > Deepak Jaiswal wrote: > That is how I did it before but it would send a byte array of length 8 > all the time. The murmur function would consider all 8 bytes to generate > hash. When I noticed it was creating different hashes for same key, I found > the bug, hence the specific size allocation. Also, it won't affect the > efficiency. What I mean is this is performing an allocation for every call to hashCode() here, which I think could affect the efficiency. This could be avoided by passing in pre-allocated arrays of each size to this method. Also, could you use the other version of hash32() where you can also pass in the array length - that way you could just use the same array of size 8, but pass in length 2/4/8 depending on which type you are hashing. > On April 14, 2018, 1:13 a.m., Jason Dere wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java > > Lines 858 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996736#file1996736line859> > > > > Old impl (based on DateWritable.hashCode()) did hashCode based on > > daysSinceEpoc value, will be faster than doing toString() > > Deepak Jaiswal wrote: > The new one converts it into string format to get bytes array. Are you > suggesting what we get from getPrimitiveWritableObject is daysSinceEpoc? And > since it is integer, it is faster to convert it into byte array directly > rather than doing "toString"? Yes, DateWritable.toString() converts to Date, which then has to call toString() which means date conversion/formatting. 
Simpler to base it on the int value. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201133 --- On April 12, 2018, 6:24 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 12, 2018, 6:24 p.m.) > > > Review request for hive, Eugene Koifman, Jason Dere, and Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. > > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java > 7c2cadefa7 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > 
hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/h
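The buffer-reuse point discussed in the reply above can be sketched concretely: with a length-aware hash function, one pre-allocated 8-byte scratch array can serve all primitive widths, because only the first `len` bytes are read — so stale bytes left over from a previous (wider) value cannot change the result. This is a self-contained illustration using a minimal Murmur3 32-bit implementation, not Hive's actual `ObjectInspectorUtils` or `Murmur3` code:

```java
// Sketch: reuse one scratch buffer across hash calls by passing the
// value's byte length, instead of allocating a fresh array per value.
class HashDemo {
    // Minimal Murmur3 x86 32-bit over the first `len` bytes of `data`.
    static int murmur3_32(byte[] data, int len, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h = seed;
        int i = 0;
        for (; i + 4 <= len; i += 4) {
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        int k = 0;
        switch (len - i) {           // tail: fewer than 4 bytes remain
            case 3: k ^= (data[i + 2] & 0xff) << 16;
            case 2: k ^= (data[i + 1] & 0xff) << 8;
            case 1: k ^= (data[i] & 0xff);
                    k *= c1; k = Integer.rotateLeft(k, 15); k *= c2; h ^= k;
        }
        h ^= len;                    // finalization mix
        h ^= h >>> 16; h *= 0x85ebca6b;
        h ^= h >>> 13; h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // One scratch array, sized for the widest primitive (long = 8 bytes).
    private final byte[] scratch = new byte[8];

    int hashLong(long v) {
        for (int i = 0; i < 8; i++) scratch[i] = (byte) (v >>> (8 * i));
        return murmur3_32(scratch, 8, 0);
    }

    int hashInt(int v) {
        for (int i = 0; i < 4; i++) scratch[i] = (byte) (v >>> (8 * i));
        // Only the first 4 bytes are hashed, so any stale bytes in
        // scratch[4..7] from a previous hashLong() call are ignored.
        return murmur3_32(scratch, 4, 0);
    }
}
```

This avoids the bug the reply mentions (hashing all 8 bytes of a reused buffer gave different hashes for the same key) without paying for a new allocation on every call.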
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201133 --- hbase-handler/src/test/results/positive/external_table_ppd.q.out Lines 59 (patched) <https://reviews.apache.org/r/66567/#comment282148> Are there any tests for the old-style bucketing, to make sure that previously created bucketed tables still work properly? hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java Lines 25 (patched) <https://reviews.apache.org/r/66567/#comment282146> Unnecessary change? itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java Lines 850 (patched) <https://reviews.apache.org/r/66567/#comment282150> missing comment? ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java Line 1053 (original), 1051 (patched) <https://reviews.apache.org/r/66567/#comment282162> If this occurs every row, I wonder if it would be better to determine the bucketing version once during initializeOp() and create some object which knows which bucketing hash code method to call here ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java Lines 469 (patched) <https://reviews.apache.org/r/66567/#comment282170> should we validate that this is a valid bucketing version that we support? ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java Lines 639 (patched) <https://reviews.apache.org/r/66567/#comment282173> Do we also need to check the bucketing type in the case that op is not a TableScan? If op is a ReduceSink or Join, would that end up being bucketingVersion 2? ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java Lines 72 (patched) <https://reviews.apache.org/r/66567/#comment282176> Was this commented code for testing? 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java Lines 411 (patched) <https://reviews.apache.org/r/66567/#comment282178> It seems to me a lot of the logic will treat -1 as bucketing version 1, since there are a lot of (bucketingVersion == 2 ? doVersion2 : doVersion1) statements. Where in the code would SMB be disabled because of -1 bucketingVersion? ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java Lines 187 (patched) <https://reviews.apache.org/r/66567/#comment282180> Maybe make some common utility to parse/validate bucketing version, that both places can use? ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java Lines 198 (patched) <https://reviews.apache.org/r/66567/#comment282179> Validate bucketing version number? ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java Lines 32 (patched) <https://reviews.apache.org/r/66567/#comment282181> Docs for this UDF will probably need to mention that this uses the old hashing/bucketing scheme and that a new one has replaced it. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMurmurHash.java Lines 1 (patched) <https://reviews.apache.org/r/66567/#comment282182> Missing Apache header serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 813 (patched) <https://reviews.apache.org/r/66567/#comment282184> For these primitive types, might make sense to pre-allocate fixed size ByteBuffers of size 2/4/8 which can be used here rather than having to allocate new ones for every value. 
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 858 (patched) <https://reviews.apache.org/r/66567/#comment282185> Old impl (based on DateWritable.hashCode()) did hashCode based on daysSinceEpoc value, will be faster than doing toString() serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 866 (patched) <https://reviews.apache.org/r/66567/#comment282187> Faster to do hashcode based on the underlying values (totalMonths) rather than toString serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 869 (patched) <https://reviews.apache.org/r/66567/#comment282186> Faster to do hashcode based on the underlying values (totalSeconds/nanos) rather than toString - Jason Dere On April 12, 2018, 6:24 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 12, 2018, 6:24 p.m.) > > > Review request for hive, Eugene Koifman, Jason Dere, and Matt McCline. > > > Bugs: HIVE
Re: Review Request 64511: HIVE-18252 Limit the size of the object inspector caches
/hive/serde2/objectinspector/primitive/WritableConstantIntObjectInspector.java 129b681795 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantLongObjectInspector.java 0452def8b4 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantShortObjectInspector.java 3343b1ffc4 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantStringObjectInspector.java ba3183bf82 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantTimestampLocalTZObjectInspector.java bf461c0255 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantTimestampObjectInspector.java dc8fedfdd8 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableVoidObjectInspector.java cdd87018f6 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java 3736a1f8fc Diff: https://reviews.apache.org/r/64511/diff/2/ Changes: https://reviews.apache.org/r/64511/diff/1-2/ Testing --- Added Junit tests Thanks, Jason Dere
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 12, 2018, 8:09 p.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- - When removing invalid entries during lookup, make sure we have exited read lock section. - Add results_cache_transactional.q to testconfiguration.properties Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 ql/src/java/org/apache/hadoop/hive/ql/Driver.java a88453c978 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION 
ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/4/ Changes: https://reviews.apache.org/r/66368/diff/3-4/ Testing --- qtests added. Thanks, Jason Dere
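The invalidation check described above (save the tables' write IDs with the cache entry, compare them at lookup) can be sketched roughly as follows. The map-based stand-in for ValidTxnWriteIdList and the method name are illustrative assumptions, not Hive's actual API:

```java
import java.util.Map;

// Sketch: a cached result is stale if any table it read has newer writes
// than the write IDs recorded when the result was cached.
public class CacheValidity {
    // Hypothetical stand-in for ValidTxnWriteIdList:
    // table name -> high-water write ID observed at cache time / lookup time.
    public static boolean isEntryStillValid(Map<String, Long> savedWriteIds,
                                            Map<String, Long> currentWriteIds) {
        for (Map.Entry<String, Long> saved : savedWriteIds.entrySet()) {
            Long current = currentWriteIds.get(saved.getKey());
            // Table missing, or written to since the entry was cached -> stale.
            if (current == null || current > saved.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```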
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- (Updated April 11, 2018, 7:58 p.m.) Review request for hive, Eugene Koifman and Sergey Shelukhin. Changes --- Added comment to SessionState.getTxnMgr() about avoiding use of this call. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs (updated) - llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 3aec46be51 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f38b0bc546 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java e5af95b121 
ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6003ced27e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java fe570f0f8e Diff: https://reviews.apache.org/r/66514/diff/3/ Changes: https://reviews.apache.org/r/66514/diff/2-3/ Testing --- Thanks, Jason Dere
Re: Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/ --- (Updated April 11, 2018, 6:01 p.m.) Review request for hive, Gopal V and Thejas Nair. Changes --- Using SessionState query timestamp as the cache entry's query time. Bugs: HIVE-19154 https://issues.apache.org/jira/browse/HIVE-19154 Repository: hive-git Description --- - Create NotificationEventPoll to periodically query for notification events, and pass the events to any registered EventConsumers. - Create InvalidationEventConsumer in QueryResultsCache to use the events to invalidate any results cache entries using the updated table. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 3cdad284ef ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_invalidation2.q PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation2.q.out PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 Diff: https://reviews.apache.org/r/66533/diff/2/ Changes: https://reviews.apache.org/r/66533/diff/1-2/ Testing --- Thanks, Jason Dere
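The NotificationEventPoll/EventConsumer design described above can be sketched as a poller that tracks the last-seen event ID and fans new events out to registered consumers. All names here are hypothetical simplifications, not the actual Hive classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: periodically fetch events past the last-seen ID and deliver them
// to registered consumers (e.g. a results-cache invalidator).
public class EventPollSketch {
    public static final class Event {
        public final long id;
        public final String tableName;

        public Event(long id, String tableName) {
            this.id = id;
            this.tableName = tableName;
        }
    }

    private final List<Consumer<Event>> consumers = new ArrayList<>();
    private long lastEventId = 0;

    public void register(Consumer<Event> consumer) {
        consumers.add(consumer);
    }

    // One poll cycle; in Hive the events would come from the metastore's
    // notification log rather than being passed in.
    public void poll(List<Event> fetched) {
        for (Event e : fetched) {
            if (e.id > lastEventId) {       // skip events already delivered
                lastEventId = e.id;
                for (Consumer<Event> c : consumers) {
                    c.accept(e);
                }
            }
        }
    }
}
```

A cache-invalidating consumer would simply drop any entries that read the event's table.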
Re: Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
> On April 11, 2018, 5:26 a.m., Gopal V wrote: > > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java > > Lines 470 (patched) > > <https://reviews.apache.org/r/66533/diff/1/?file=1995232#file1995232line470> > > > > SessionState.get().getQueryCurrentTimestamp() > > > > Possibly pass it in via QueryInfo? Good suggestion, will make the change. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/#review200887 --- On April 10, 2018, 7:19 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66533/ > --- > > (Updated April 10, 2018, 7:19 p.m.) > > > Review request for hive, Gopal V and Thejas Nair. > > > Bugs: HIVE-19154 > https://issues.apache.org/jira/browse/HIVE-19154 > > > Repository: hive-git > > > Description > --- > > - Create NotificationEventPoll to periodically query for notification events, > and pass the events to any registered EventConsumers. > - Create InvalidationEventConsumer in QueryResultsCache to use the events to > invalidate any results cache entries using the updated table. 
> > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd > itests/src/test/resources/testconfiguration.properties 48d62a8bf9 > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java > 3cdad284ef > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java > b1a3646624 > ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java > PRE-CREATION > ql/src/test/queries/clientpositive/results_cache_invalidation2.q > PRE-CREATION > ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out > PRE-CREATION > ql/src/test/results/clientpositive/results_cache_invalidation2.q.out > PRE-CREATION > service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 > > > Diff: https://reviews.apache.org/r/66533/diff/1/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 11, 2018, 4:37 a.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- Rebase with master Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 ql/src/java/org/apache/hadoop/hive/ql/Driver.java a88453c978 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out 
PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/3/ Changes: https://reviews.apache.org/r/66368/diff/2-3/ Testing --- qtests added. Thanks, Jason Dere
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- (Updated April 11, 2018, 1:58 a.m.) Review request for hive, Eugene Koifman and Sergey Shelukhin. Changes --- Updating patch - missed a couple of uses of SessionState.getTxnMgr() from CalcitePlanner/MaterializedViewRebuildSemanticAnalyzer. Also adding a couple of fixes to fix TestAcidOnTez which also depend on the rest of this patch. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs (updated) - llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 3aec46be51 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f38b0bc546 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e 
ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java e5af95b121 ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f0010855b ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java fe570f0f8e Diff: https://reviews.apache.org/r/66514/diff/2/ Changes: https://reviews.apache.org/r/66514/diff/1-2/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19156) TestMiniLlapLocalCliDriver.vectorized_dynamic_semijoin_reduction.q is broken
Jason Dere created HIVE-19156: - Summary: TestMiniLlapLocalCliDriver.vectorized_dynamic_semijoin_reduction.q is broken Key: HIVE-19156 URL: https://issues.apache.org/jira/browse/HIVE-19156 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Looks like this test has been broken for some time -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/ --- Review request for hive, Gopal V and Thejas Nair. Bugs: HIVE-19154 https://issues.apache.org/jira/browse/HIVE-19154 Repository: hive-git Description --- - Create NotificationEventPoll to periodically query for notification events, and pass the events to any registered EventConsumers. - Create InvalidationEventConsumer in QueryResultsCache to use the events to invalidate any results cache entries using the updated table. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 3cdad284ef ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java PRE-CREATION ql/src/test/queries/clientpositive/results_cache_invalidation2.q PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation2.q.out PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 Diff: https://reviews.apache.org/r/66533/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19154) Poll notification events to invalidate the results cache
Jason Dere created HIVE-19154: - Summary: Poll notification events to invalidate the results cache Key: HIVE-19154 URL: https://issues.apache.org/jira/browse/HIVE-19154 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere Related to the work for HIVE-18609. HIVE-18609 will only invalidate entries in the cache if that query is looked up again, which could potentially leave a lot of undetected invalid entries taking up space in the cache, causing other entries to be evicted. To remove these entries in a more timely fashion, add a background thread that periodically checks the notification events for updates to the tables used in the results cache. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 10, 2018, 12:29 a.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- Rebase with master Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0627c35378 itests/src/test/resources/testconfiguration.properties 28c14ebc4c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 79db006c74 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java ac5ae573d6 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b74abacf3 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out 
PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/2/ Changes: https://reviews.apache.org/r/66368/diff/1-2/ Testing --- qtests added. Thanks, Jason Dere
Re: Review Request 66516: HIVE-19138: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
> On April 9, 2018, 10:23 p.m., Gopal V wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Line 14642 (original), 14643 (patched) > > <https://reviews.apache.org/r/66516/diff/1/?file=1994429#file1994429line14643> > > > > Does the loop only exit if cacheEntry is non-null? The loop is a do .. while(false), which normally should exit after a single iteration. The loop should only continue to iterate in the event that cacheEntry.waitForValidStatus() returned false. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66516/#review200772 --- On April 9, 2018, 9:53 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66516/ > --- > > (Updated April 9, 2018, 9:53 p.m.) > > > Review request for hive, Deepak Jaiswal and Gopal V. > > > Repository: hive-git > > > Description > --- > > If the pending query fails, allow Hive to try to check the cache again in > case the cache has another cached/pending result that can be used to answer > the query. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 3b74abacf3 > > > Diff: https://reviews.apache.org/r/66516/diff/1/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
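The retry behavior discussed above, where a failed wait on a pending entry triggers another cache lookup, can be sketched as a loop that repeats only when waitForValidStatus() returns false. The interfaces here are hypothetical stand-ins, not Hive's real types:

```java
// Sketch: retry the cache lookup when the pending entry we waited on fails.
public class CacheRetrySketch {
    public interface Cache { Entry lookup(); }
    public interface Entry { boolean waitForValidStatus(); Object result(); }

    // Returns the entry's result, or null when no usable entry exists
    // (the caller would then fall back to full query compilation).
    public static Object lookupWithRetry(Cache cache) {
        Entry entry;
        boolean valid;
        do {
            entry = cache.lookup();
            if (entry == null) {
                return null;                 // cache miss
            }
            // false means the pending query backing this entry failed;
            // loop back and check the cache for another cached/pending entry.
            valid = entry.waitForValidStatus();
        } while (!valid);
        return entry.result();
    }
}
```

The loop body runs once on the happy path and only repeats on a failed wait, matching the behavior described in the reply.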
Review Request 66516: HIVE-19138: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66516/ --- Review request for hive, Deepak Jaiswal and Gopal V. Repository: hive-git Description --- If the pending query fails, allow Hive to try to check the cache again in case the cache has another cached/pending result that can be used to answer the query. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b74abacf3 Diff: https://reviews.apache.org/r/66516/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19138) Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
Jason Dere created HIVE-19138: - Summary: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails Key: HIVE-19138 URL: https://issues.apache.org/jira/browse/HIVE-19138 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere HIVE-18846 allows the results cache to refer to currently executing queries so that another query can wait for these results to become ready in the results cache. If the pending query fails then Hive will automatically skip the cache and do the full query compilation. Make a fix here so that if the pending query fails, Hive will still try to check the cache again in case the cache has another cached/pending result that can be used to answer the query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- Review request for hive, Eugene Koifman and Sergey Shelukhin. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java fb1efe01dc ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 9e66422904 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ff0a2e6a1b ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 Diff: https://reviews.apache.org/r/66514/diff/1/ Testing --- Thanks, Jason Dere
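The refactoring described above replaces a session-global SessionState.getTxnMgr() lookup with a transaction manager supplied during construction/initialization. A minimal sketch of that pattern, with illustrative names rather than Hive's real classes:

```java
// Sketch: constructor injection in place of a static session-state lookup.
public class TxnInjectionSketch {
    public interface TxnManager { long openTxn(); }

    // Before the refactor, an analyzer would reach into session-global state
    // internally; after, the manager is a field set at construction time,
    // which decouples the analyzer from the session and eases testing.
    public static final class Analyzer {
        private final TxnManager txnMgr;

        public Analyzer(TxnManager txnMgr) {
            this.txnMgr = txnMgr;
        }

        public long analyze() {
            return txnMgr.openTxn();
        }
    }
}
```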
Review Request 66486: HIVE-19127 Concurrency fixes in QueryResultsCache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66486/ --- Review request for hive, Deepak Jaiswal and Gopal V. Bugs: HIVE-19127 https://issues.apache.org/jira/browse/HIVE-19127 Repository: hive-git Description --- - Take a lock on the cache entry when in the process of setting the cache entry from PENDING state to VALID state, so that other threads cannot invalidate the entry - The write lock on the cache was not being taken when removing an entry from the cache. - synchronize access when iterating through the lru keyset Diffs - ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java ac5ae573d6 Diff: https://reviews.apache.org/r/66486/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19127) Concurrency fixes in QueryResultsCache
Jason Dere created HIVE-19127: - Summary: Concurrency fixes in QueryResultsCache Key: HIVE-19127 URL: https://issues.apache.org/jira/browse/HIVE-19127 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere A few fixes around concurrent access in the results cache - Take a lock on the cache entry when in the process of setting the cache entry from PENDING state to VALID state, so that other threads cannot invalidate the entry - The write lock on the cache was not being taken when removing an entry from the cache. - synchronize access when iterating through the lru keyset -- This message was sent by Atlassian JIRA (v7.6.3#76005)
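The three fixes listed above can be sketched in a toy cache: a per-entry lock guarding the PENDING-to-VALID transition, the cache write lock held during removal, and synchronized iteration over the LRU key set. All names are hypothetical simplifications of QueryResultsCache:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the three concurrency fixes, not Hive's actual implementation.
public class CacheConcurrencySketch {
    public enum Status { PENDING, VALID, INVALID }

    public static final class Entry {
        private Status status = Status.PENDING;

        // Fix 1: the entry's own lock guards PENDING -> VALID, so a
        // concurrent invalidation cannot race the transition.
        public synchronized boolean setValid() {
            if (status == Status.PENDING) { status = Status.VALID; return true; }
            return false;
        }
        public synchronized void invalidate() { status = Status.INVALID; }
        public synchronized Status status() { return status; }
    }

    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    // access-order LinkedHashMap as a minimal LRU stand-in
    private final Map<String, Entry> lru = new LinkedHashMap<>(16, 0.75f, true);

    public void put(String key, Entry e) {
        rwLock.writeLock().lock();
        try { synchronized (lru) { lru.put(key, e); } }
        finally { rwLock.writeLock().unlock(); }
    }

    // Fix 2: removal must hold the cache write lock.
    public void remove(String key) {
        rwLock.writeLock().lock();
        try {
            Entry e;
            synchronized (lru) { e = lru.remove(key); }
            if (e != null) { e.invalidate(); }
        } finally { rwLock.writeLock().unlock(); }
    }

    // Fix 3: iterating the LRU key set must be synchronized.
    public int size() {
        rwLock.readLock().lock();
        try {
            synchronized (lru) {
                int n = 0;
                for (String ignored : lru.keySet()) { n++; }
                return n;
            }
        } finally { rwLock.readLock().unlock(); }
    }
}
```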
Re: Review Request 66201: HIVE-19014 utilize YARN-8028 (queue ACL check) in Hive Tez session pool
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66201/#review200326 --- ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java Lines 117 (patched) <https://reviews.apache.org/r/66201/#comment281026> clean up whitespace. - Jason Dere On April 2, 2018, 10:13 p.m., Sergey Shelukhin wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66201/ > --- > > (Updated April 2, 2018, 10:13 p.m.) > > > Review request for hive and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > see jira > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 02367eb433 > ql/src/java/org/apache/hadoop/hive/ql/Driver.java ed3984efe8 > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1de333e985 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java > a051f90195 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java a5f4cb7539 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > ed1c0abdf2 > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLoggedInUser.java > 3ed793ec48 > service/src/java/org/apache/hive/service/server/HiveServer2.java 6308c5cd4f > > > Diff: https://reviews.apache.org/r/66201/diff/6/ > > > Testing > --- > > > Thanks, > > Sergey Shelukhin > >