[jira] [Created] (HIVE-23972) Add external client ID to LLAP external client
Jason Dere created HIVE-23972: - Summary: Add external client ID to LLAP external client Key: HIVE-23972 URL: https://issues.apache.org/jira/browse/HIVE-23972 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere There is currently no good way to tell which running LLAP tasks come from external LLAP clients, and no good way to know which application is submitting these external LLAP requests. One possible solution is to add an option for the external LLAP client to pass in an external client ID, which can be logged by HiveServer2 during the getSplits request as well as displayed in the LLAP executorsStatus. cc [~ShubhamChaurasia] -- This message was sent by Atlassian Jira (v8.3.4#803005)
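One possible shape for the client-side option is sketched below; the configuration key and helper method are hypothetical stand-ins, not Hive's actual API.

```java
// Sketch of the proposed option: the external client tags its request
// configuration with a client ID, which HiveServer2 could then log during
// getSplits and LLAP could surface in executorsStatus. The key name and
// helper below are hypothetical, not Hive's actual API.
import java.util.HashMap;
import java.util.Map;

class ExternalClientIdSketch {
    // hypothetical configuration key
    static final String EXTERNAL_CLIENT_ID_KEY = "hive.llap.external.client.id";

    // Return a copy of the configuration tagged with the caller's ID.
    static Map<String, String> withClientId(Map<String, String> conf, String clientId) {
        Map<String, String> tagged = new HashMap<>(conf);
        tagged.put(EXTERNAL_CLIENT_ID_KEY, clientId);
        return tagged;
    }

    public static void main(String[] args) {
        Map<String, String> conf = withClientId(new HashMap<>(), "spark-hwc-app-1");
        System.out.println(conf.get(EXTERNAL_CLIENT_ID_KEY)); // prints spark-hwc-app-1
    }
}
```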
[jira] [Created] (HIVE-23868) Windowing function spec: support 0 preceding/following
Jason Dere created HIVE-23868: - Summary: Windowing function spec: support 0 preceding/following Key: HIVE-23868 URL: https://issues.apache.org/jira/browse/HIVE-23868 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere HIVE-12574 removed support for 0 PRECEDING/FOLLOWING in window function specifications. We can restore support for this by converting 0 PRECEDING/FOLLOWING to CURRENT ROW in the query plan, which should be equivalent.
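The proposed conversion amounts to a small normalization step at plan time; the types below are illustrative stand-ins, not Hive's actual window frame classes.

```java
// Plan-time normalization sketch: a frame boundary of 0 PRECEDING or
// 0 FOLLOWING offsets the frame edge by zero rows, which is exactly
// CURRENT ROW. Direction is an illustrative stand-in type.
class WindowBoundarySketch {
    enum Direction { PRECEDING, CURRENT, FOLLOWING }

    static Direction normalize(Direction dir, int amount) {
        if (amount == 0 && dir != Direction.CURRENT) {
            return Direction.CURRENT; // 0 PRECEDING/FOLLOWING == CURRENT ROW
        }
        return dir;
    }

    public static void main(String[] args) {
        System.out.println(normalize(Direction.PRECEDING, 0)); // prints CURRENT
        System.out.println(normalize(Direction.FOLLOWING, 3)); // prints FOLLOWING
    }
}
```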
[jira] [Created] (HIVE-23068) Error when submitting fragment to LLAP via external client: IllegalStateException: Only a single registration allowed per entity
Jason Dere created HIVE-23068: - Summary: Error when submitting fragment to LLAP via external client: IllegalStateException: Only a single registration allowed per entity Key: HIVE-23068 URL: https://issues.apache.org/jira/browse/HIVE-23068 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere LLAP external client (via hive-warehouse-connector) somehow seems to be sending duplicate submissions for the same fragment/attempt. When the 2nd request is sent this results in the following error: {noformat} 2020-03-17T06:49:11,239 WARN [IPC Server handler 2 on 15001 ()] org.apache.hadoop.ipc.Server: IPC Server handler 2 on 15001, call Call#75 Retry#0 org.apache.hadoop.hive.llap.protocol.LlapProtocolBlockingPB.submitWork from 19.40.252.114:33906 java.lang.IllegalStateException: Only a single registration allowed per entity. Duplicate for TaskWrapper{task=attempt_1854104024183112753_6052_0_00_000128_1, inWaitQueue=true, inPreemptionQueue=false, registeredForNotifications=true, canFinish=true, canFinish(in queue)=true, isGuaranteed=false, firstAttemptStartTime=1584442003327, dagStartTime=1584442003327, withinDagPriority=0, vertexParallelism= 2132, selfAndUpstreamParallelism= 2132, selfAndUpstreamComplete= 0} at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.registerForUpdates(QueryInfo.java:233) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.registerForFinishableStateUpdates(QueryInfo.java:205) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.registerForFinishableStateUpdates(QueryFragmentInfo.java:160) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeRegisterForFinishedStateNotifications(TaskExecutorService.java:1167) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at 
org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.schedule(TaskExecutorService.java:564) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.schedule(TaskExecutorService.java:93) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.ContainerRunnerImpl.submitWork(ContainerRunnerImpl.java:292) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.submitWork(LlapDaemon.java:610) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.LlapProtocolServerImpl.submitWork(LlapProtocolServerImpl.java:122) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.rpc.LlapDaemonProtocolProtos$LlapDaemonProtocol$2.callBlockingMethod(LlapDaemonProtocolProtos.java:22695) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_191] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_191] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) ~[hadoop-common-3.1.1.3.1.4.26-3.jar:?] {noformat} I think the issue here is that this error occurred too late - based on the stack trace, LLAP has already accepted/registered the fragment. 
The subsequent cleanup of this fragment/attempt also affects the first request, which results in the LLAP crash described in HIVE-23061: {noformat} 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down now... java.lang.IllegalStateException: Cannot invoke unregister on an entity which has not been registered at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3
[jira] [Created] (HIVE-23061) LLAP crash due to unhandled exception: Cannot invoke unregister on an entity which has not been registered
Jason Dere created HIVE-23061: - Summary: LLAP crash due to unhandled exception: Cannot invoke unregister on an entity which has not been registered Key: HIVE-23061 URL: https://issues.apache.org/jira/browse/HIVE-23061 Project: Hive Issue Type: Bug Components: llap Reporter: Jason Dere Assignee: Jason Dere The following exception goes uncaught and causes the entire LLAP daemon to shut down: {noformat} 2020-03-17T06:49:11,304 ERROR [ExecutionCompletionThread #0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[ExecutionCompletionThread #0,5,main] threw an Exception. Shutting down now... java.lang.IllegalStateException: Cannot invoke unregister on an entity which has not been registered at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo$FinishableStateTracker.unregisterForUpdates(QueryInfo.java:256) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryInfo.unregisterFinishableStateUpdate(QueryInfo.java:209) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.QueryFragmentInfo.unregisterForFinishableStateUpdates(QueryFragmentInfo.java:166) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$TaskWrapper.maybeUnregisterForFinishedStateNotifications(TaskExecutorService.java:1177) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:980) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$InternalCompletionListener.onSuccess(TaskExecutorService.java:944) ~[hive-llap-server-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.26-3] at 
com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1021) ~[hive-exec-3.1.0.3.1.4.26-3.jar:3.1.0.3.1.4.32-1] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_191] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] {noformat}
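One way to keep the completion thread from dying here is to make the register/unregister bookkeeping idempotent rather than letting Preconditions.checkState() throw; the tracker below is an illustrative sketch, not Hive's actual FinishableStateTracker.

```java
// Idempotent registration sketch: duplicate registers and unmatched
// unregisters become no-op return values instead of IllegalStateExceptions
// that escape onto the executor thread and shut down the daemon.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentTrackerSketch {
    private final Set<String> registered = ConcurrentHashMap.newKeySet();

    // Returns false on a duplicate registration instead of failing later.
    boolean register(String entity) {
        return registered.add(entity);
    }

    // Safe to call for an entity that was never registered (or already removed).
    boolean unregister(String entity) {
        return registered.remove(entity);
    }

    public static void main(String[] args) {
        IdempotentTrackerSketch t = new IdempotentTrackerSketch();
        System.out.println(t.register("frag_0_128_1"));   // prints true
        System.out.println(t.register("frag_0_128_1"));   // prints false (duplicate)
        System.out.println(t.unregister("frag_0_128_1")); // prints true
        System.out.println(t.unregister("frag_0_128_1")); // prints false, no exception
    }
}
```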
[jira] [Created] (HIVE-22946) HIVE-20082 removed conversion of complex types to string
Jason Dere created HIVE-22946: - Summary: HIVE-20082 removed conversion of complex types to string Key: HIVE-22946 URL: https://issues.apache.org/jira/browse/HIVE-22946 Project: Hive Issue Type: Bug Components: Types, UDF Reporter: Jason Dere Assignee: Jason Dere Looks like we used to support cast/conversion of complex data types (array, map, struct) to string, and HIVE-20082 removed that.
[jira] [Created] (HIVE-22714) TestScheduledQueryService is flaky
Jason Dere created HIVE-22714: - Summary: TestScheduledQueryService is flaky Key: HIVE-22714 URL: https://issues.apache.org/jira/browse/HIVE-22714 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere {noformat} [ERROR] Failures: [ERROR] TestScheduledQueryService.testScheduledQueryExecution:152 Expected: <5> but: was <0> [INFO] [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0 {noformat} Looks like sometimes we are not waiting long enough for the INSERT query to complete and the SELECT runs before it finishes: {noformat} $ egrep "insert|select" target/surefire-reports/org.apache.hadoop.hive.ql.schq.TestScheduledQueryService-output.txt | grep HOOK PREHOOK: query: insert into tu values(1),(2),(3),(4),(5) 2020-01-09T14:49:09,497 INFO [SchQ 0] SessionState: PREHOOK: query: insert into tu values(1),(2),(3),(4),(5) PREHOOK: query: select 1 from tu 2020-01-09T14:49:11,452 INFO [main] SessionState: PREHOOK: query: select 1 from tu POSTHOOK: query: select 1 from tu 2020-01-09T14:49:11,452 INFO [main] SessionState: POSTHOOK: query: select 1 from tu POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5) 2020-01-09T14:49:12,062 INFO [SchQ 0] SessionState: POSTHOOK: query: insert into tu values(1),(2),(3),(4),(5) {noformat}
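A deterministic fix would be to poll until the scheduled INSERT's row count reaches the expected value (or a timeout expires) instead of running the SELECT after a fixed delay; the names below are illustrative, not the test's actual code.

```java
// Polling-wait sketch for the flaky test: re-check the observed row count
// until it matches the expected value or the deadline passes, then return
// whatever was last seen so the assertion message stays meaningful.
class AwaitRowCount {
    interface RowCounter { int count() throws Exception; }

    static int await(RowCounter counter, int expected, long timeoutMs) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int last = counter.count();
        while (last != expected && System.currentTimeMillis() < deadline) {
            Thread.sleep(20); // back off briefly between polls
            last = counter.count();
        }
        return last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate an INSERT that only completes after a few polls.
        int[] polls = {0};
        int seen = await(() -> ++polls[0] >= 3 ? 5 : 0, 5, 5_000L);
        System.out.println(seen); // prints 5
    }
}
```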
[jira] [Created] (HIVE-22709) NullPointerException during query compilation after HIVE-22578
Jason Dere created HIVE-22709: - Summary: NullPointerException during query compilation after HIVE-22578 Key: HIVE-22709 URL: https://issues.apache.org/jira/browse/HIVE-22709 Project: Hive Issue Type: Bug Reporter: Jason Dere Attachments: results_cache_with_auth.q Getting a NPE during query compilation, when query results cache and Ranger auth is enabled. This seems to have been caused by HIVE-22578. {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getQueryStringFromAst(SemanticAnalyzer.java:14987) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getQueryStringForCache(SemanticAnalyzer.java:15036) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.createLookupInfoForQuery(SemanticAnalyzer.java:15077) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12513) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:283) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:283) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:219) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:103) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:215) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:828) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:774) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:768) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:249) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:415) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:346) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:708) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:678) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:169) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59) {noformat}
[jira] [Created] (HIVE-22599) Query results cache: 733 permissions check is not necessary
Jason Dere created HIVE-22599: - Summary: Query results cache: 733 permissions check is not necessary Key: HIVE-22599 URL: https://issues.apache.org/jira/browse/HIVE-22599 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere The query results cache initialization makes a call to Utilities.ensurePathIsWritable(), which checks the results cache directory for 733 permissions (the default cache dir is {{/tmp/hive/_resultscache_}}). The 733 permissions (at least the 033 part) are not necessary - we don't actually want the results cache directory to be world-writable, and the subdirectories we create within it are created with 700 perms. So the call to Utilities.ensurePathIsWritable() can be removed.
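For reference, the mode-bit arithmetic behind the objection, sketched with plain octal masks rather than Hive's FsPermission code:

```java
// Mode-bit sketch: 0733 = owner rwx (0700) plus group/other wx (0033).
// Only the owner bits matter for the cache dir; the 0033 part grants
// write and execute to group/other, which the cache does not want.
class CachePermSketch {
    static boolean ownerHasFullAccess(int mode) {
        return (mode & 0700) == 0700;
    }

    static boolean isGroupOrOtherWritable(int mode) {
        return (mode & 0033) != 0;
    }

    public static void main(String[] args) {
        System.out.println(ownerHasFullAccess(0700));     // prints true
        System.out.println(isGroupOrOtherWritable(0700)); // prints false
        System.out.println(isGroupOrOtherWritable(0733)); // prints true
    }
}
```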
[jira] [Created] (HIVE-22595) Dynamic partition inserts fail on Avro table with external schema
Jason Dere created HIVE-22595: - Summary: Dynamic partition inserts fail on Avro table table with external schema Key: HIVE-22595 URL: https://issues.apache.org/jira/browse/HIVE-22595 Project: Hive Issue Type: Bug Components: Avro, Serializers/Deserializers Reporter: Jason Dere Assignee: Jason Dere Example qfile test: {noformat} create external table avro_extschema_insert1 (name string) partitioned by (p1 string) stored as avro tblproperties ('avro.schema.url'='${system:test.tmp.dir}/table1.avsc'); create external table avro_extschema_insert2 like avro_extschema_insert1; insert overwrite table avro_extschema_insert1 partition (p1='part1') values ('col1_value', 1, 'col3_value'); insert overwrite table avro_extschema_insert2 partition (p1) select * from avro_extschema_insert1; {noformat} The last statement fails with the following error: {noformat} ], TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : attempt_1575484789169_0003_4_00_00_3:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) at 
org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:576) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) ... 
19 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Number of input columns was different than output columns (in = 2 vs out = 1 at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1047) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:927) at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95) at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:994) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:940) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555) ... 20 more Caused by: org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Number of input columns was different than output columns (in = 2 vs out = 1
[jira] [Created] (HIVE-22530) Connection pool timeout in TxnHandler.java is hardcoded to 30 secs
Jason Dere created HIVE-22530: - Summary: Connection pool timeout in TxnHandler.java is hardcoded to 30 secs Key: HIVE-22530 URL: https://issues.apache.org/jira/browse/HIVE-22530 Project: Hive Issue Type: Bug Components: Locking Reporter: Jason Dere If the time to acquire locks gets long enough, we can end up running into the time limit for acquiring DB connections in TxnHandler: {noformat} 2019-07-23 11:49:54,285 ERROR [HiveServer2-Background-Pool: Thread-3881156]: operation.Operation (SQLOperation.java:run(258)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Error in acquiring locks: Error communicating with the metastore Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore Caused by: MetaException(message:Unable to update transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object {noformat} This appears to be hard-coded to 30 seconds here: https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L2359 It may make sense to either make this configurable or eliminate the timeout altogether.
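A configurable variant could look like the sketch below; the property name is hypothetical, and the default mirrors the current hardcoded 30 seconds.

```java
// Configurable pool-checkout timeout sketch. The KEY below is a
// hypothetical property name, not an actual Hive/metastore setting.
import java.util.Properties;

class PoolTimeoutSketch {
    static final String KEY = "metastore.txn.pool.timeout.ms"; // hypothetical
    static final long DEFAULT_MS = 30_000L; // today's hardcoded 30 secs

    static long timeoutMs(Properties conf) {
        return Long.parseLong(conf.getProperty(KEY, Long.toString(DEFAULT_MS)));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(timeoutMs(conf)); // prints 30000 (default)
        conf.setProperty(KEY, "120000");
        System.out.println(timeoutMs(conf)); // prints 120000
    }
}
```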
[jira] [Created] (HIVE-22391) NPE while checking Hive query results cache
Jason Dere created HIVE-22391: - Summary: NPE while checking Hive query results cache Key: HIVE-22391 URL: https://issues.apache.org/jira/browse/HIVE-22391 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere NPE when results cache was enabled: {noformat} 2019-10-21T14:51:55,718 ERROR [b7d7bea8-eef0-4ea4-ae12-951cb5dc96e3 HiveServer2-Handler-Pool: Thread-210]: ql.Driver (:()) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.checkResultsCache(SemanticAnalyzer.java:15061) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12320) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:360) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1869) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1816) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1811) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262) at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:575) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561) at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:566) at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:647) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat}
[jira] [Created] (HIVE-22275) OperationManager.queryIdOperation does not properly clean up multiple queryIds
Jason Dere created HIVE-22275: - Summary: OperationManager.queryIdOperation does not properly clean up multiple queryIds Key: HIVE-22275 URL: https://issues.apache.org/jira/browse/HIVE-22275 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Jason Dere Assignee: Jason Dere In the case that multiple statements are run by a single Session before being cleaned up, it appears that OperationManager.queryIdOperation is not cleaned up properly. See the log statements below - with the exception of the first "Removed queryId:" log line, the queryId listed during cleanup is the same, when each of these handles should have their own queryId. Looks like only the last queryId executed is being cleaned up. As a result, HS2 can run out of memory as OperationManager.queryIdOperation grows and never cleans these queryIds/Operations up. {noformat} 2019-09-13T08:37:36,785 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] 2019-09-13T08:37:38,432 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083736_c49cf3cc-cfe8-48a1-bd22-8b924dfb0396 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] with tag: null 2019-09-13T08:37:38,469 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] 2019-09-13T08:37:52,662 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, 
getHandleIdentifier()=b983802c-1dec-4fa0-8680-d05ab555321b] 2019-09-13T08:37:56,239 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=75dbc531-2964-47b2-84d7-85b59f88999c] 2019-09-13T08:38:02,551 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=72c79076-9d67-4894-a526-c233fa5450b2] 2019-09-13T08:38:10,558 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=17b30a62-612d-4b70-9ba7-4287d2d9229b] 2019-09-13T08:38:16,930 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=ea97e99d-cc77-470b-b49a-b869c73a4615] 2019-09-13T08:38:20,440 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=a277b789-ebb8-4925-878f-6728d3e8c5fb] 2019-09-13T08:38:26,303 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=9a023ab8-aa80-45db-af88-94790cc83033] 2019-09-13T08:38:30,791 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b697c801-7da0-4544-bcfa-442eb1d3bd77] 2019-09-13T08:39:10,187 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: 
operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=bda93c8f-0822-4592-a61c-4701720a1a5c] 2019-09-13T08:39:15,471 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] with tag: null 2019-09-13T08:39:15,507 INFO [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b983802c-1dec
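The leak pattern in the log above can be reproduced with a two-map sketch: if removal looks up the session's latest queryId rather than the one recorded for the operation being closed, stale entries accumulate. Keying removal off the operation's own queryId keeps the map bounded (illustrative, not Hive's actual OperationManager):

```java
// queryId bookkeeping sketch: removal must use the queryId stored for the
// operation being closed, not whatever queryId the session ran most recently,
// or the queryId map grows without bound and HS2 can run out of memory.
import java.util.HashMap;
import java.util.Map;

class QueryIdCleanupSketch {
    private final Map<String, String> queryIdToOp = new HashMap<>();
    private final Map<String, String> opToQueryId = new HashMap<>();

    void addOperation(String opHandle, String queryId) {
        queryIdToOp.put(queryId, opHandle);
        opToQueryId.put(opHandle, queryId);
    }

    void removeOperation(String opHandle) {
        // Look up the queryId recorded for THIS operation handle.
        String queryId = opToQueryId.remove(opHandle);
        if (queryId != null) {
            queryIdToOp.remove(queryId);
        }
    }

    int trackedQueryIds() {
        return queryIdToOp.size();
    }

    public static void main(String[] args) {
        QueryIdCleanupSketch m = new QueryIdCleanupSketch();
        m.addOperation("op1", "hive_query_1");
        m.addOperation("op2", "hive_query_2");
        m.removeOperation("op1");
        m.removeOperation("op2");
        System.out.println(m.trackedQueryIds()); // prints 0
    }
}
```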
[jira] [Created] (HIVE-22050) Enable download of just UDF resources using LlapServiceDriver command line
Jason Dere created HIVE-22050: - Summary: Enable download of just UDF resources using LlapServiceDriver command line Key: HIVE-22050 URL: https://issues.apache.org/jira/browse/HIVE-22050 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere LlapServiceDriver currently has several components that it downloads as part of the LLAP packaging: Tez jars, UDF jars, aux jars, configs. I'd like to add some options to the LlapServiceDriver command line to enable selective downloading of these components, for example to be able to download just the UDF jars.
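The option parsing could be as simple as the sketch below; the flag names are hypothetical, and the default preserves today's download-everything behavior.

```java
// Selective-download flag sketch for the LLAP packaging CLI. Flag names
// below are hypothetical placeholders, not LlapServiceDriver's real options.
import java.util.EnumSet;

class LlapDownloadOptionsSketch {
    enum Component { TEZ_JARS, UDF_JARS, AUX_JARS, CONFIGS }

    static EnumSet<Component> parse(String[] args) {
        EnumSet<Component> wanted = EnumSet.noneOf(Component.class);
        for (String arg : args) {
            switch (arg) {
                case "--downloadUdfJars": wanted.add(Component.UDF_JARS); break;
                case "--downloadTezJars": wanted.add(Component.TEZ_JARS); break;
                case "--downloadAuxJars": wanted.add(Component.AUX_JARS); break;
                case "--downloadConfigs": wanted.add(Component.CONFIGS); break;
                default: break; // other CLI options handled elsewhere
            }
        }
        // No selective flag given: keep today's behavior and fetch everything.
        return wanted.isEmpty() ? EnumSet.allOf(Component.class) : wanted;
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[]{"--downloadUdfJars"})); // prints [UDF_JARS]
        System.out.println(parse(new String[]{}).size());            // prints 4
    }
}
```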
[jira] [Created] (HIVE-22035) HiveStrictManagedMigration settings do not always get set with --hiveconf arguments
Jason Dere created HIVE-22035: - Summary: HiveStrictManagedMigration settings do not always get set with --hiveconf arguments Key: HIVE-22035 URL: https://issues.apache.org/jira/browse/HIVE-22035 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Currently the --hiveconf arguments get added to the System properties. While this allows official HiveConf variables to be set in the conf that is loaded by the HiveStrictManagedMigration utility, there are also utility-specific configuration settings which we would want to be able to set from the command line. For example, since Ambari knows what the Hive system user name is, it would make sense to be able to set strict.managed.tables.migration.owner on the command line when running this utility.
[jira] [Created] (HIVE-22034) HiveStrictManagedMigration updates DB location even with --dryRun setting on
Jason Dere created HIVE-22034: - Summary: HiveStrictManagedMigration updates DB location even with --dryRun setting on Key: HIVE-22034 URL: https://issues.apache.org/jira/browse/HIVE-22034 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere The logic at the end of processDatabase() to update the DB location in the Metastore should only run if runOptions.dryRun == false.
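The missing guard amounts to one condition; a minimal sketch (not the utility's actual code):

```java
// dryRun guard sketch: the metastore DB-location update at the end of
// processDatabase() should be skipped entirely when --dryRun is set.
class DryRunGuardSketch {
    static boolean shouldUpdateDbLocation(boolean dryRun, boolean locationNeedsChange) {
        return locationNeedsChange && !dryRun;
    }

    public static void main(String[] args) {
        System.out.println(shouldUpdateDbLocation(true, true));  // prints false
        System.out.println(shouldUpdateDbLocation(false, true)); // prints true
    }
}
```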
[jira] [Created] (HIVE-22001) AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time
Jason Dere created HIVE-22001: - Summary: AcidUtils.getAcidState() can fail if Cleaner is removing files at the same time Key: HIVE-22001 URL: https://issues.apache.org/jira/browse/HIVE-22001 Project: Hive Issue Type: Bug Components: Transactions Reporter: Jason Dere Had one user hit the following error during getSplits {noformat} 2019-07-06T14:33:03,067 ERROR [4640181a-3eb7-4b3e-9a40-d7a8de9a570c HiveServer2-HttpHandler-Pool: Thread-415519]: SessionState (SessionState.java:printError(1247)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1560947172646_2452_6199_00, diagnostics=[Vertex vertex_1560947172646_2452_6199_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: hive_table initializer failed, vertex=vertex_1560947172646_2452_6199_00 [Map 1], java.lang.RuntimeException: ORC split generation failed with exception: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1870) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1958) at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:524) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:779) at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269) at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809) ... 17 more Caused by: java.io.FileNotFoundException: File hdfs://path/to/hive_table/oiddatemmdd=20190706/delta_0987070_0987070 does not exist. 
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1059) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1119) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1116) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1126) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1868) at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1953) at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.chooseFile(AcidUtils.java:1903) at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormat(AcidUtils.java:1913) at org.apache.hadoop.hive.ql.io.AcidUtils.parsedDelta(AcidUtils.java:947) at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:935) at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java:1250) <--- at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:1071) <--- at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$FileGenerator.callInternal(OrcInputFormat.jav
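One common mitigation for this kind of race (a sketch only, not the actual HIVE-22001 patch) is to tolerate a missing child during the per-delta metadata calls and treat a vanished delta directory as already cleaned, rather than failing the whole split generation:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Sketch of a defensive directory scan using local NIO (illustrative only;
// on HDFS the analogous exception is java.io.FileNotFoundException): when a
// per-child call fails because the Cleaner deleted the delta directory
// between the listing and the stat, skip that child instead of failing.
public class TolerantAcidScan {
    public static List<Path> listSurvivingDeltas(Path tableDir) throws IOException {
        List<Path> surviving = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(tableDir)) {
            for (Path child : children) {
                try {
                    // Stand-in for the per-delta check (e.g. isRawFormat's listStatus).
                    Files.readAttributes(child, BasicFileAttributes.class);
                    surviving.add(child);
                } catch (NoSuchFileException gone) {
                    // Deleted concurrently by the Cleaner: treat as already removed.
                }
            }
        }
        return surviving;
    }
}
```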
[jira] [Created] (HIVE-21963) TransactionalValidationListener.validateTableStructure should check the partition directories in the case of partitioned tables
Jason Dere created HIVE-21963: - Summary: TransactionalValidationListener.validateTableStructure should check the partition directories in the case of partitioned tables Key: HIVE-21963 URL: https://issues.apache.org/jira/browse/HIVE-21963 Project: Hive Issue Type: Bug Components: Transactions Reporter: Jason Dere Assignee: Jason Dere The transactional validation check only checks the base table directory, but for partitioned tables it should check the partition directories (some of which may not even be under the base table directory). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21878) Metric for AM to show whether it is currently running a DAG
Jason Dere created HIVE-21878: - Summary: Metric for AM to show whether it is currently running a DAG Key: HIVE-21878 URL: https://issues.apache.org/jira/browse/HIVE-21878 Project: Hive Issue Type: Bug Components: Tez Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21878.1.patch Add a basic gauge metric to indicate whether a Tez AM is currently running a DAG for a Hive query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
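The proposed metric can be sketched in plain Java as follows (illustrative names only; Hive's actual implementation would hook into its metrics system rather than a bare Supplier):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Minimal sketch of the proposed gauge: reports 1 while the Tez AM is
// executing a DAG and 0 otherwise. Class and method names are assumptions.
public class DagRunningMetric {
    private final AtomicBoolean dagRunning = new AtomicBoolean(false);

    // Called by the AM when a DAG starts or finishes.
    public void dagStarted()  { dagRunning.set(true); }
    public void dagFinished() { dagRunning.set(false); }

    // The gauge value, sampled by the metrics system on each scrape.
    public Supplier<Integer> gauge() {
        return () -> dagRunning.get() ? 1 : 0;
    }
}
```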
[jira] [Created] (HIVE-21799) NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column
Jason Dere created HIVE-21799: - Summary: NullPointerException in DynamicPartitionPruningOptimization, when join key is on aggregation column Key: HIVE-21799 URL: https://issues.apache.org/jira/browse/HIVE-21799 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere Following table/query results in NPE: {noformat} create table tez_no_dynpart_hashjoin_on_agg(id int, outcome string, eventid int) stored as orc; explain select a.id, b.outcome from (select id, max(eventid) as event_id_max from tez_no_dynpart_hashjoin_on_agg group by id) a LEFT OUTER JOIN tez_no_dynpart_hashjoin_on_agg b on a.event_id_max = b.eventid; {noformat} Stack trace: {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.generateSemiJoinOperatorPlan(DynamicPartitionPruningOptimization.java:608) at org.apache.hadoop.hive.ql.optimizer.DynamicPartitionPruningOptimization.process(DynamicPartitionPruningOptimization.java:239) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:74) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) at org.apache.hadoop.hive.ql.parse.TezCompiler.runDynamicPartitionPruning(TezCompiler.java:584) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:165) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:159) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12562) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:370) at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:671) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1905) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1852) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1847) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:219) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:676) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:647) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21746) ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled
Jason Dere created HIVE-21746: - Summary: ArrayIndexOutOfBoundsException during dynamically partitioned hash join, with CBO disabled Key: HIVE-21746 URL: https://issues.apache.org/jira/browse/HIVE-21746 Project: Hive Issue Type: Bug Components: Query Planning Reporter: Jason Dere Assignee: Jason Dere ArrayIndexOutOfBounds exception during query execution with dynamically partitioned hash join. Found on Hive 2.x. Seems to occur with CBO disabled/failed. Disabling constant propagation seems to allow the query to succeed. {noformat} java.lang.ArrayIndexOutOfBoundsException: 203 at org.apache.hadoop.hive.serde2.io.TimestampWritable.getTotalLength(TimestampWritable.java:217) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryUtils.checkObjectByteInfo(LazyBinaryUtils.java:205) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.parse(LazyBinaryStruct.java:142) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.serde2.lazybinary.LazyBinaryStruct.getFieldsAsList(LazyBinaryStruct.java:281) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.unpack(MapJoinBytesTableContainer.java:744) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:730) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$ReusableRowContainer.next(MapJoinBytesTableContainer.java:605) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:70) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at 
org.apache.hadoop.hive.ql.exec.persistence.UnwrapRowContainer.next(UnwrapRowContainer.java:34) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:819) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:924) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:456) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:359) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:290) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:319) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:189) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172) ~[hive-exec-2.1.0.2.6.4.119-3.jar:2.1.0.2.6.4.119-3] at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:377) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) ~[hadoop-common-2.7.3.2.6.4.119-3.jar:?] at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) ~[tez-runtime-internals-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) ~[tez-common-0.8.4.2.6.4.119-3.jar:0.8.4.2.6.4.119-3] at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) ~[hive-llap-server
Re: Review Request 70372: HIVE-21427: Syslog storage handler
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70372/#review214337 --- llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java Line 615 (original), 615 (patched) <https://reviews.apache.org/r/70372/#comment300536> Just curious about this one, was there a difference between rbCtx.getRowColumnTypeInfos() and rbCtx.getDataColumnCount()? Or just the fact that rbCtx.getDataColumnCount() directly returns an int value? ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogSerDe.java Lines 57 (patched) <https://reviews.apache.org/r/70372/#comment300537> Is the list of columns from SyslogSerDe fixed to (facility, severity, version, ts, hostname, app_name, proc_id, msg_id, structured_data, msg, unmatched)? If so then should the column list/types be hardcoded rather than set via LIST_COLUMNS/LIST_COLUMNS_TYPES properties? - Jason Dere On April 2, 2019, 10:29 p.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70372/ > --- > > (Updated April 2, 2019, 10:29 p.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. 
> > > Bugs: HIVE-21427 > https://issues.apache.org/jira/browse/HIVE-21427 > > > Repository: hive-git > > > Description > --- > > HIVE-21427: Syslog storage handler > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > 777f8b51215523fca8e396ddf77139420666311a > data/files/syslog-hs2-2.log PRE-CREATION > data/files/syslog-hs2.log PRE-CREATION > itests/src/test/resources/testconfiguration.properties > 96dfbc4b56b6eb3dff6b8e1e42a2371d090426e7 > > llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapRecordReader.java > 9ef7af4eb0c9787a33d2aa4c9a4528b8f356106b > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/ConvertAstToSearchArg.java > 27fe828b7531584138cd002956a9fcc20f238f71 > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogInputFormat.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogParser.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogSerDe.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/log/syslog/SyslogStorageHandler.java > PRE-CREATION > ql/src/test/org/apache/hadoop/hive/ql/log/TestSyslogInputFormat.java > PRE-CREATION > ql/src/test/queries/clientpositive/syslog_parser.q PRE-CREATION > ql/src/test/queries/clientpositive/syslog_parser_file_pruning.q > PRE-CREATION > ql/src/test/results/clientpositive/llap/syslog_parser.q.out PRE-CREATION > ql/src/test/results/clientpositive/llap/syslog_parser_file_pruning.q.out > PRE-CREATION > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/MetastoreSchemaTool.java > eafe0c6d46d448bce287e61fabac0384b12b9295 > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolCommandLine.java > 6282078411c4c728beed8e957aa857ed3c02133c > > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskCreateLogsTable.java > PRE-CREATION > > > Diff: 
https://reviews.apache.org/r/70372/diff/1/ > > > Testing > --- > > > Thanks, > > Prasanth_J > >
[jira] [Created] (HIVE-21561) Revert removal of TableType.INDEX_TABLE enum
Jason Dere created HIVE-21561: - Summary: Revert removal of TableType.INDEX_TABLE enum Key: HIVE-21561 URL: https://issues.apache.org/jira/browse/HIVE-21561 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21561.1.patch Index tables have been removed from Hive as of HIVE-18715. However, in case users still have index tables defined in the metastore, we should keep the TableType.INDEX_TABLE enum around so that users can drop these tables. Without the enum defined Hive cannot do anything with them as it fails with IllegalArgumentException errors when trying to call TableType.valueOf() on INDEX_TABLE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
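The failure mode described above is plain Java enum behavior, sketched below with a hypothetical stand-in enum (not Hive's actual TableType): valueOf() throws IllegalArgumentException for any name that is no longer a declared constant.

```java
// Hypothetical enum standing in for TableType after INDEX_TABLE was removed.
enum TableTypeSketch { MANAGED_TABLE, EXTERNAL_TABLE, VIRTUAL_VIEW, MATERIALIZED_VIEW }

public class TableTypeResolution {
    // valueOf() throws IllegalArgumentException for a name with no matching
    // constant, which is what happens when the metastore still holds rows
    // typed INDEX_TABLE but the enum constant no longer exists.
    public static boolean canResolve(String name) {
        try {
            TableTypeSketch.valueOf(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```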
[jira] [Created] (HIVE-21528) Add metric to track the number of queries waiting for tez session
Jason Dere created HIVE-21528: - Summary: Add metric to track the number of queries waiting for tez session Key: HIVE-21528 URL: https://issues.apache.org/jira/browse/HIVE-21528 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21518) GenericUDFOPNotEqualNS does not run in LLAP
Jason Dere created HIVE-21518: - Summary: GenericUDFOPNotEqualNS does not run in LLAP Key: HIVE-21518 URL: https://issues.apache.org/jira/browse/HIVE-21518 Project: Hive Issue Type: Bug Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-21518.1.patch GenericUDFOPNotEqualNS (Not equal nullsafe operator) does not run in LLAP mode, because it is not registered as a built-in function. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
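A toy model of the behavior described in the bug (not Hive's actual FunctionRegistry API): LLAP only executes UDFs present in the built-in function registry, so a function class that is never registered is rejected at runtime.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative allowlist model: names and methods are assumptions, not
// Hive's real API. A UDF missing from the built-in registry (as
// GenericUDFOPNotEqualNS was) cannot run in LLAP mode.
public class BuiltinFunctionAllowlist {
    private final Set<String> builtins = new HashSet<>();

    public void registerBuiltin(String functionName) {
        builtins.add(functionName);
    }

    public boolean allowedInLlap(String functionName) {
        return builtins.contains(functionName);
    }
}
```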
Re: Review Request 69903: HIVE-21214
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69903/#review212581 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java Line 1829 (original), 1838 (patched) <https://reviews.apache.org/r/69903/#comment298407> No "if" - this dedup strategy does not work with speculative execution enabled. - Jason Dere On Feb. 5, 2019, 10:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69903/ > --- > > (Updated Feb. 5, 2019, 10:10 p.m.) > > > Review request for hive and Jason Dere. > > > Bugs: HIVE-21214 > https://issues.apache.org/jira/browse/HIVE-21214 > > > Repository: hive-git > > > Description > --- > > MoveTask : Use attemptId instead of file size for deduplication of files > compareTempOrDuplicateFiles() > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 > > > Diff: https://reviews.apache.org/r/69903/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 69903: HIVE-21214
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69903/#review212580 --- ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java Lines 1876 (patched) <https://reviews.apache.org/r/69903/#comment298406> nit: add the filenames to the error message - Jason Dere On Feb. 5, 2019, 10:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/69903/ > --- > > (Updated Feb. 5, 2019, 10:10 p.m.) > > > Review request for hive and Jason Dere. > > > Bugs: HIVE-21214 > https://issues.apache.org/jira/browse/HIVE-21214 > > > Repository: hive-git > > > Description > --- > > MoveTask : Use attemptId instead of file size for deduplication of files > compareTempOrDuplicateFiles() > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43811 > > > Diff: https://reviews.apache.org/r/69903/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
[jira] [Created] (HIVE-20998) HiveStrictManagedMigration utility should update DB/Table location as last migration steps
Jason Dere created HIVE-20998: - Summary: HiveStrictManagedMigration utility should update DB/Table location as last migration steps Key: HIVE-20998 URL: https://issues.apache.org/jira/browse/HIVE-20998 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere When processing a database or table, the HiveStrictManagedMigration utility currently changes the database/table locations as the first step in processing that database/table. Unfortunately, if an error occurs while processing this database or table, there may still be migration work that needs to continue for that db/table by running the migration again. However, since the migration tool only processes dbs/tables that still have the old warehouse location, the tool will skip over the db/table when the migration is run again. One fix here is to set the new location as the last step after all of the migration work is done: - The new table location will not be set until all of its partitions have been successfully migrated. - The new database location will not be set until all of its tables have been successfully migrated. For existing migrations that failed with an error, the following workaround can be done so that the db/tables can be re-processed by the migration tool: 1) Use the migration tool logs to find which databases/tables failed during processing. 2) For each db/table, change the location of the database and table back to the old location: ALTER DATABASE tpcds_bin_partitioned_orc_10 SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db'; ALTER TABLE tpcds_bin_partitioned_orc_10.store_sales SET LOCATION 'hdfs://ns1/apps/hive/warehouse/tpcds_bin_partitioned_orc_10.db/store_sales'; 3) Rerun the migration tool -- This message was sent by Atlassian JIRA (v7.6.3#76005)
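The proposed ordering can be sketched as follows (illustrative names, not the utility's actual code): perform all migration work for a table's partitions first, and update the table location only after every partition has succeeded, so a failed run leaves the old location in place and the table is re-processed on the next run.

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch of "set the new location as the last step": if migrating any
// partition throws, setNewTableLocation is never reached and the table
// keeps its old location, so the next migration run picks it up again.
public class MigrateTableLast {
    public static void migrateTable(List<String> partitions,
                                    Consumer<String> migratePartition,
                                    Runnable setNewTableLocation) {
        for (String partition : partitions) {
            migratePartition.accept(partition); // may throw; location stays old
        }
        setNewTableLocation.run(); // only reached when all partitions succeeded
    }
}
```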
[jira] [Created] (HIVE-20900) serde2.JsonSerDe no longer supports timestamp.formats
Jason Dere created HIVE-20900: - Summary: serde2.JsonSerDe no longer supports timestamp.formats Key: HIVE-20900 URL: https://issues.apache.org/jira/browse/HIVE-20900 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Jason Dere Looks like HIVE-18545 broke this. Also json_serde_tsformat.q only tested the hcat version of JsonSerde, and the format in that test used the ISO timestamp format which apparently is now parsed by the default timestamp parsing, so the test was too simple. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20839) "Cannot find field" error during dynamically partitioned hash join
Jason Dere created HIVE-20839: - Summary: "Cannot find field" error during dynamically partitioned hash join Key: HIVE-20839 URL: https://issues.apache.org/jira/browse/HIVE-20839 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Jason Dere Assignee: Jason Dere {noformat} 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 (1539092085144_8944_1085_28_000996_2)] tez.ReduceRecordProcessor: Hit error while closing operators - failing tree 2018-10-11T04:40:22,724 ERROR [TezTR-85144_8944_1085_28_996_2 (1539092085144_8944_1085_28_000996_2)] tez.TezProcessor: java.lang.RuntimeException: cannot find field _col304 from [0:_col0, 1:_col1, 2:_col2, 3:_col3, 4:_col4, 5:_col5, 6:_col6, 7:_col7, 8:_col8, 9:_col9, 10:_col10, 11:_col11, 12:_col12, 13:_col13, 14:_col15, 15:_col16, 16:_col17, 17:_col18, 18:_col19, 19:_col20, 20:_col21, 21:_col22, 22:_col23, 23:_col24, 24:_col25, 25:_col26, 26:_col27, 27:_col28, 28:_col29, 29:_col30, 30:_col31, 31:_col32, 32:_col33, 33:_col34, 34:_col35, 35:_col36, 36:_col37, 37:_col38, 38:_col39, 39:_col40, 40:_col41, 41:_col42, 42:_col43, 43:_col44, 44:_col45, 45:_col46, 46:_col47, 47:_col48, 48:_col49, 49:_col50, 50:_col51, 51:_col52, 52:_col53, 53:_col54, 54:_col55, 55:_col56, 56:_col57, 57:_col58, 58:_col59, 59:_col60, 60:_col61, 61:_col62, 62:_col63, 63:_col64, 64:_col65, 65:_col66, 66:_col67, 67:_col68, 68:_col70, 69:_col72, 70:_col73, 71:_col74, 72:_col75, 73:_col76, 74:_col77, 75:_col78, 76:_col79, 77:_col80, 78:_col81, 79:_col82, 80:_col83, 81:_col84, 82:_col85, 83:_col86, 84:_col87, 85:_col88, 86:_col89, 87:_col90, 88:_col91, 89:_col92, 90:_col93, 91:_col94, 92:_col95, 93:_col96, 94:_col97, 95:_col98, 96:_col99, 97:_col100, 98:_col101, 99:_col102, 100:_col103, 101:_col104, 102:_col105, 103:_col106, 104:_col107, 105:_col108, 106:_col109, 107:_col110, 108:_col111, 109:_col112, 110:_col113, 111:_col114, 112:_col115, 113:_col116, 114:_col117, 115:_col118, 116:_col119, 117:_col120, 118:_col121, 119:_col122, 
120:_col123, 121:_col124, 122:_col125, 123:_col126, 124:_col127, 125:_col128, 126:_col129, 127:_col130, 128:_col131, 129:_col132, 130:_col133, 131:_col134, 132:_col135, 133:_col136, 134:_col137, 135:_col138, 136:_col139, 137:_col140, 138:_col141, 139:_col142, 140:_col143, 141:_col144, 142:_col145, 143:_col146, 144:_col147, 145:_col148, 146:_col149, 147:_col150, 148:_col151, 149:_col152, 150:_col153, 151:_col154, 152:_col155, 153:_col156, 154:_col157, 155:_col158, 156:_col159, 157:_col160, 158:_col161, 159:_col162, 160:_col163, 161:_col164, 162:_col165, 163:_col166, 164:_col167, 165:_col168, 166:_col169, 167:_col170, 168:_col171, 169:_col318] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:80) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:91) at org.apache.hadoop.hive.ql.exec.AbstractMapJoinOperator.initializeOp(AbstractMapJoinOperator.java:74) at org.apache.hadoop.hive.ql.exec.MapJoinOperator.initializeOp(MapJoinOperator.java:144) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:374) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.init(ReduceRecordProcessor.java:195) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:188) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:172) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20834) Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query
Jason Dere created HIVE-20834: - Summary: Hive QueryResultCache entries keeping reference to SemanticAnalyzer from cached query Key: HIVE-20834 URL: https://issues.apache.org/jira/browse/HIVE-20834 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere QueryResultCache.LookupInfo ends up keeping a reference to the SemanticAnalyzer from the cached query, for as long as the cached entry is in the cache. We should not be keeping the SemanticAnalyzer around after the query is done executing since they can hold on to quite a bit of memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 69173: HIVE-20259 Cleanup of results cache directory
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/69173/ --- Review request for hive and Gopal V. Bugs: HIVE-20259 https://issues.apache.org/jira/browse/HIVE-20259 Repository: hive-git Description --- Attached patch with utility DirectoryMarkerUpdate/Cleanup classes to create .cacheupdate files in the cache directory, to indicate that this directory should not be cleaned up by any other process performing DirectoryMarkerCleanup. This uses the last-modified date of the .cacheupdate file to determine whether the directory should be cleaned up; if the instance running cleanup determines this date is too old, then the directory will be deleted. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e226a1f82d common/src/java/org/apache/hive/common/util/DirectoryMarkerCleanup.java PRE-CREATION common/src/java/org/apache/hive/common/util/DirectoryMarkerUpdate.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java a51b7e750b Diff: https://reviews.apache.org/r/69173/diff/1/ Testing --- Thanks, Jason Dere
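The marker-age scheme can be sketched like this (class and method names are assumptions, not the patch's actual DirectoryMarkerUpdate/Cleanup code): a live instance periodically refreshes the .cacheupdate marker in its cache directory, and a cleanup pass deletes a directory only when the marker's last-modified time is older than the configured threshold.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Sketch of age-based cache-directory cleanup (illustrative names only).
public class CacheDirMarker {
    public static final String MARKER = ".cacheupdate";

    // Refresh the marker so other cleaners see this directory as in use.
    public static void touch(Path cacheDir) throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        } else {
            Files.setLastModifiedTime(marker, FileTime.fromMillis(System.currentTimeMillis()));
        }
    }

    // A directory is eligible for deletion only when its marker is too old.
    public static boolean eligibleForCleanup(Path cacheDir, long maxAgeMs, long nowMs)
            throws IOException {
        Path marker = cacheDir.resolve(MARKER);
        if (!Files.exists(marker)) {
            return false; // no marker: not managed by this scheme, leave it alone
        }
        long modified = Files.getLastModifiedTime(marker).toMillis();
        return nowMs - modified > maxAgeMs;
    }
}
```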
Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68946/#review209582 --- ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java Lines 4761 (patched) <https://reviews.apache.org/r/68946/#comment294106> Should this be on by default? If there are a lot of external tables (especially on s3), the metastore could be spending a lot of time doing auto discovery. Could also affect the running of other MetastoreTaskThreads. ql/src/test/results/clientpositive/msck_repair_drop.q.out Line 127 (original), 127 (patched) <https://reviews.apache.org/r/68946/#comment294105> What is the new ordering of these messages? Looks like it could be a potential issue when diffing golden files? standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java Lines 141 (patched) <https://reviews.apache.org/r/68946/#comment294108> Is this variable used? It's logged, but I think retentionSeconds should be used instead. standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java Lines 142 (patched) <https://reviews.apache.org/r/68946/#comment294107> Might want to check for exception from TimeValidator.validate() in getRententionPeriodInSeconds, or else a bad setting in one table can fail here and prevent this from running for any tables. But if you do skip that table, make sure the countdown latch is updated appropriately. - Jason Dere On Oct. 16, 2018, 12:21 a.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68946/ > --- > > (Updated Oct. 16, 2018, 12:21 a.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. 
> > > Bugs: HIVE-20707 > https://issues.apache.org/jira/browse/HIVE-20707 > > > Repository: hive-git > > > Description > --- > > HIVE-20707: Automatic partition management > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 92a1c31 > hbase-handler/src/test/results/positive/external_table_ppd.q.out edcbe7e > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 1209c88 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ccd4148 > hbase-handler/src/test/results/positive/hbase_queries.q.out eeb97f0 > hbase-handler/src/test/results/positive/hbasestats.q.out 5a4aea9 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > a9d7468 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 807f159 > ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java 46bf088 > ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java 0b4240f > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java > 598bb2e > ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java cff32d3 > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > 29f6ecf > ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 27f677e > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java > ce2b186 > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java > 9480d38 > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java > a2a0583 > ql/src/test/queries/clientpositive/msck_repair_acid.q PRE-CREATION > ql/src/test/queries/clientpositive/partition_discovery.q PRE-CREATION > ql/src/test/results/clientpositive/create_like.q.out f4a5ed5 > ql/src/test/results/clientpositive/create_like_view.q.out 870f280 > ql/src/test/results/clientpositive/default_file_format.q.out 0adf5ae > ql/src/test/results/clientpositive/druid_topn.q.out 179902a > ql/src/test/results/clientpositive/explain_locks.q.out 
ed7f1e8 > ql/src/test/results/clientpositive/llap/external_table_purge.q.out 24c778e > ql/src/test/results/clientpositive/llap/mm_exim.q.out ee6cf06 > ql/src/test/results/clientpositive/llap/strict_managed_tables2.q.out > f3b6152 > ql/src/test/results/clientpositive/llap/whroot_external1.q.out cac158c > ql/src/test/results/clientpositive/msck_repair_acid.q.out PRE-CREATION > ql/src/test/results/clientpositive/msck_repair_drop.q.out 2456734 > ql/src/test/results/clientpositive/partition_discovery.q.out PRE-CREATION > ql/src/test/results/clientpositive/rename_external_partition_location.q.out > 02cd814 >
Re: Review Request 68946: HIVE-20707: Automatic MSCK REPAIR for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68946/#review209341 --- ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java Lines 88 (patched) <https://reviews.apache.org/r/68946/#comment293671> Can you use FileUtils.HIDDEN_FILES_PATH_FILTER? I believe standalone-metastore also has a FileUtils.java - Jason Dere On Oct. 8, 2018, 4:16 p.m., Prasanth_J wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68946/ > --- > > (Updated Oct. 8, 2018, 4:16 p.m.) > > > Review request for hive, Ashutosh Chauhan and Jason Dere. > > > Bugs: HIVE-20707 > https://issues.apache.org/jira/browse/HIVE-20707 > > > Repository: hive-git > > > Description > --- > > HIVE-20707: Automatic MSCK REPAIR for external tables > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java > d0adc35544cb8ae9d007a1d2ccb9b9565eedca88 > data/conf/hive-site.xml 0daf9adc717bc1c4413d2e34691c26a3e2585c77 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > cffa21af33d5abb2162fa16b6b990a469075f03d > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java > e91346228e8724b8253364114145a348a7cbee26 > ql/src/java/org/apache/hadoop/hive/ql/metadata/CheckResult.java > 0b4240f5665f0b544b2fc5864fc098eb286a281e > ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java > 598bb2ee8b72f1b7f75be7802b4eaae0204c988d > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckCreatePartitionsInBatches.java > ce2b186b4dceda780106776daa022f18388ec76f > > ql/src/test/org/apache/hadoop/hive/ql/exec/TestMsckDropPartitionsInBatches.java > 7e768dacb0b00a0f1a9e64efbe778f9c2daaa31b > > ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHiveMetaStoreChecker.java > a2a0583d4dbdfe9aece1a14ecac24e0e6189cafa > ql/src/test/queries/clientpositive/auto_msck_repair_0.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_1.q PRE-CREATION 
> ql/src/test/queries/clientpositive/auto_msck_repair_2.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_3.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_4.q PRE-CREATION > ql/src/test/queries/clientpositive/auto_msck_repair_batchsize.q > PRE-CREATION > ql/src/test/queries/clientpositive/msck_repair_0.q > aeb4820af5b6687f7ae4163a94bdd2be25a8b0cd > ql/src/test/queries/clientpositive/msck_repair_2.q > be745b2d607d8c727b862c71f153f09d5622a8b5 > ql/src/test/queries/clientpositive/msck_repair_3.q > 140a6904ddc98b165d71a8b24314c56888ccbb9c > ql/src/test/queries/clientpositive/msck_repair_batchsize.q > 5a7afcca5b86c1887308626c0dc4d99916811bea > ql/src/test/queries/clientpositive/msck_repair_drop.q > 9923fb50cbdbdf9e8e07276ccaec073c490770e6 > ql/src/test/results/clientpositive/auto_msck_repair_0.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_1.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_2.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_3.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_4.q.out PRE-CREATION > ql/src/test/results/clientpositive/auto_msck_repair_batchsize.q.out > PRE-CREATION > ql/src/test/results/clientpositive/msck_repair_0.q.out > fa6e4a988273a71b0f9dab64a48ddda6320d5f2f > ql/src/test/results/clientpositive/msck_repair_2.q.out > 7fbd934e118e81b9c5f028191c7ea6582a34db75 > ql/src/test/results/clientpositive/msck_repair_3.q.out > 0e153fbe69ba39819fac4629ef1bf5f90c17f37f > ql/src/test/results/clientpositive/msck_repair_batchsize.q.out > ab4b83137dcf1ce36846ce74e0a546528e81358b > ql/src/test/results/clientpositive/msck_repair_drop.q.out > 971c1381276fa626bd91d34488a65e3bfb2781ae > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/Warehouse.java > 294dfb728e12efaa13d239ea7b8949587a50fe1f > > 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/api/MetastoreException.java > PRE-CREATION > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java > 7b01678a10f4f0667844fec64ae76695d835bd6e > > standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java &
[jira] [Created] (HIVE-20603) "Wrong FS" error when inserting to partition after changing table location filesystem
Jason Dere created HIVE-20603: - Summary: "Wrong FS" error when inserting to partition after changing table location filesystem Key: HIVE-20603 URL: https://issues.apache.org/jira/browse/HIVE-20603 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Inserting into an existing partition, after changing a table's location to point to a different HDFS filesystem: {noformat} query += "CREATE TABLE test_managed_tbl (id int, name string, dept string) PARTITIONED BY (year int);\n" query += "INSERT INTO test_managed_tbl PARTITION (year=2016) VALUES (8,'Henry','CSE');\n" query += "ALTER TABLE test_managed_tbl ADD PARTITION (year=2017);\n" query += "ALTER TABLE test_managed_tbl SET LOCATION 'hdfs://ns2/warehouse/tablespace/managed/hive/test_managed_tbl';\n" query += "INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE');\n" {noformat} Results in the following error: {noformat} java.lang.IllegalArgumentException: Wrong FS: hdfs://ns1/warehouse/tablespace/managed/hive/test_managed_tbl/year=2017, expected: hdfs://ns2 at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:240) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583) at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1580) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1595) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734) at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:4141) at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1966) at org.apache.hadoop.hive.ql.exec.MoveTask.handleStaticParts(MoveTask.java:477) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:397) at 
org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:210) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2701) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2372) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2048) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1740) {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
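The "Wrong FS" exception above fires because the partition kept its old `hdfs://ns1` location after the table was moved to `hdfs://ns2`, and the path fails the filesystem's scheme/authority check. A simplified, stdlib-only sketch of that comparison (this is an illustration, not Hadoop's actual `FileSystem.checkPath` code):

```java
import java.net.URI;

public class WrongFsCheck {
    // Simplified sketch of the scheme/authority comparison behind a
    // "Wrong FS" IllegalArgumentException; not Hadoop's actual code.
    static void checkPath(URI fsUri, URI pathUri) {
        String scheme = pathUri.getScheme();
        String authority = pathUri.getAuthority();
        if ((scheme != null && !scheme.equalsIgnoreCase(fsUri.getScheme()))
                || (authority != null && !authority.equalsIgnoreCase(fsUri.getAuthority()))) {
            throw new IllegalArgumentException(
                "Wrong FS: " + pathUri + ", expected: " + fsUri);
        }
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://ns2");
        // Matches the filesystem: passes silently.
        checkPath(fs, URI.create("hdfs://ns2/warehouse/tbl/year=2017"));
        try {
            // The partition still points at ns1 after the table moved to ns2.
            checkPath(fs, URI.create("hdfs://ns1/warehouse/tbl/year=2017"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```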
[jira] [Created] (HIVE-20515) Empty query results when using results cache and query temp dir, results cache dir in different filesystems
Jason Dere created HIVE-20515: - Summary: Empty query results when using results cache and query temp dir, results cache dir in different filesystems Key: HIVE-20515 URL: https://issues.apache.org/jira/browse/HIVE-20515 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere If the scratchdir for temporary query results and the results cache dir are in different filesystems, moving the query results from the temp directory to the results cache will fail. Looking at the moveResultsToCacheDirectory() logic in QueryResultsCache.java, I see the following issues: - FileSystem.rename() is used, which only works if the files are on the same filesystem. Need to use something like Hive.mvFile, or something similar that can work between different filesystems. - The return code from rename() was not checked, which might have caught the error here. This may not be applicable if a different method from FS.rename() is used in the proper fix. With some filesystems (noticed this with WASB), FileSystem.rename() returns false on failure rather than throwing an exception, so the query ends up with empty results because the return code was not checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
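The failure mode described above, a rename that fails across filesystems with its return code ignored, can be sketched with a stdlib analogue. This does not use the Hadoop FileSystem API, and the helper name `moveAcrossFileSystems` is hypothetical; it only illustrates the "try rename, fall back to copy + delete" pattern the fix calls for:

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CacheMove {
    // Hypothetical sketch: attempt an atomic rename first; when source and
    // destination are on different file stores the rename fails, so fall
    // back to copy + delete instead of silently ignoring the failure.
    static void moveAcrossFileSystems(Path src, Path dest) throws IOException {
        try {
            // Analogous to FileSystem.rename(): only valid within one filesystem.
            Files.move(src, dest, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException | FileSystemException e) {
            // Cross-filesystem case: copy the data, then remove the source.
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
            Files.delete(src);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("results-cache");
        Path src = Files.createFile(dir.resolve("query-results.tmp"));
        Path dest = dir.resolve("cached-results");
        moveAcrossFileSystems(src, dest);
        System.out.println(Files.exists(dest) && !Files.exists(src)); // prints true
    }
}
```

Unlike a bare rename, this either moves the file or throws, so a caller can never end up serving an empty cache entry without noticing.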
[jira] [Created] (HIVE-20412) NPE in HiveMetaHook
Jason Dere created HIVE-20412: - Summary: NPE in HiveMetaHook Key: HIVE-20412 URL: https://issues.apache.org/jira/browse/HIVE-20412 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere {noformat} java.lang.NullPointerException: null at org.apache.hadoop.hive.metastore.HiveMetaHook.preAlterTable(HiveMetaHook.java:113) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:427) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table(SessionHiveMetaStoreClient.java:415) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at com.sun.proxy.$Proxy37.alter_table(Unknown Source) ~[?:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2933) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at com.sun.proxy.$Proxy37.alter_table(Unknown Source) ~[?:?] 
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:708) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] at org.apache.hadoop.hive.ql.util.HiveStrictManagedMigration$HiveUpdater.updateTableProperties(HiveStrictManagedMigration.java:954) ~[hive-exec-3.1.0.3.0.1.0-104.jar:3.1.0.3.0.1.0-104] {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20397) HiveStrictManagedMigration updates
Jason Dere created HIVE-20397: - Summary: HiveStrictManagedMigration updates Key: HIVE-20397 URL: https://issues.apache.org/jira/browse/HIVE-20397 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere - Switch from using a Driver instance to using metastore calls via Hive.alterDatabase/Hive.alterTable - For tables converted from ORC to ACID tables, handle renaming of the files - Fix error handling so the utility does not terminate after the first error encountered -- This message was sent by Atlassian JIRA (v7.6.3#76005)
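The last item, continuing past per-table failures instead of aborting the whole run, can be sketched like this. The loop structure and the `migrateTable` step are illustrative placeholders, not the actual HiveStrictManagedMigration code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class MigrationLoop {
    // Hypothetical sketch: apply a migration step to each table, collecting
    // failures so one bad table does not terminate the entire migration.
    static List<String> migrateAll(List<String> tables, Consumer<String> migrateTable) {
        List<String> failed = new ArrayList<>();
        for (String table : tables) {
            try {
                migrateTable.accept(table);
            } catch (RuntimeException e) {
                // Log and keep going rather than aborting on the first error.
                System.err.println("Migration failed for " + table + ": " + e.getMessage());
                failed.add(table);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = migrateAll(
            Arrays.asList("db.ok_table", "db.bad_table", "db.other_table"),
            t -> { if (t.contains("bad")) throw new RuntimeException("rename failed"); });
        System.out.println(failed); // prints [db.bad_table]
    }
}
```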
[jira] [Created] (HIVE-20298) Illegal null value in column `TBLS`.`WRITE_ID`
Jason Dere created HIVE-20298: - Summary: Illegal null value in column `TBLS`.`WRITE_ID` Key: HIVE-20298 URL: https://issues.apache.org/jira/browse/HIVE-20298 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere Manually upgraded my existing local metastore using upgrade-3.0.0-to-3.1.0.mysql.sql, upgrade-3.1.0-to-3.2.0.mysql.sql, upgrade-3.2.0-to-4.0.0.mysql.sql. When running DESCRIBE EXTENDED of an existing table, I was getting the following error in hive.log. It looks like the ObjectStore/MTable classes don't seem to be able to support null values in the new writeId column that was added to the TBLS table in the metastore. cc [~sershe] [~ekoifman] {noformat} Caused by: javax.jdo.JDODataStoreException: Illegal null value in column `TBLS`.`WRITE_ID` NestedThrowables: org.datanucleus.store.rdbms.exceptions.NullValueException: Illegal null value in column `TBLS`.`WRITE_ID` at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:553) at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:255) at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:1802) at org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:1838) at org.apache.hadoop.hive.metastore.ObjectStore.getTable(ObjectStore.java:1424) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) at com.sun.proxy.$Proxy39.getTable(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:2950) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:2898) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:2882) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ... 36 more Caused by: org.datanucleus.store.rdbms.exceptions.NullValueException: Illegal null value in column `TBLS`.`WRITE_ID` at org.datanucleus.store.rdbms.mapping.datastore.BigIntRDBMSMapping.getLong(BigIntRDBMSMapping.java:140) at org.datanucleus.store.rdbms.mapping.java.SingleFieldMapping.getLong(SingleFieldMapping.java:155) at org.datanucleus.store.rdbms.fieldmanager.ResultSetGetter.fetchLongField(ResultSetGetter.java:124) at org.datanucleus.state.AbstractStateManager.replacingLongField(AbstractStateManager.java:1549) at org.datanucleus.state.StateManagerImpl.replacingLongField(StateManagerImpl.java:120) at org.apache.hadoop.hive.metastore.model.MTable.dnReplaceField(MTable.java) at org.apache.hadoop.hive.metastore.model.MTable.dnReplaceFields(MTable.java) at org.datanucleus.state.StateManagerImpl.replaceFields(StateManagerImpl.java:3109) at org.datanucleus.store.rdbms.query.PersistentClassROF$1.fetchFields(PersistentClassROF.java:465) at org.datanucleus.state.StateManagerImpl.loadFieldValues(StateManagerImpl.java:2238) at org.datanucleus.state.StateManagerImpl.initialiseForHollow(StateManagerImpl.java:263) at org.datanucleus.state.ObjectProviderFactoryImpl.newForHollow(ObjectProviderFactoryImpl.java:112) at org.datanucleus.ExecutionContextImpl.findObject(ExecutionContextImpl.java:3097) at 
org.datanucleus.store.rdbms.query.PersistentClassROF.getObjectForDatastoreId(PersistentClassROF.java:460) at org.datanucleus.store.rdbms.query.PersistentClassROF.getObject(PersistentClassROF.java:385) at org.datanucleus.store.rdbms.query.ForwardQueryResult.nextResultSetElement(ForwardQueryResult.java:188) at org.datanucleus.store.rdbms.query.ForwardQueryResult$QueryResultIterator.next(ForwardQueryResult.java:416) at org.datanucleus.store.rdbms.query.ForwardQueryResult.processNumberOfResults(ForwardQueryResult.java:143) at org.datanucleus.store.rdbms.query.ForwardQueryResult.advanceToEndOfResultSet(ForwardQueryResult.java:171
[jira] [Created] (HIVE-20259) Cleanup of results cache directory
Jason Dere created HIVE-20259: - Summary: Cleanup of results cache directory Key: HIVE-20259 URL: https://issues.apache.org/jira/browse/HIVE-20259 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere The query results cache directory is currently deleted at process exit. This does not work in the case of a kill -9 or a sudden process exit of Hive. There should be some cleanup mechanism in place to take care of any old cache directories that were not deleted at process exit. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
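One possible cleanup mechanism for the orphaned directories described above is to scan the cache's parent directory on startup and delete subdirectories older than a cutoff. The directory layout and the age threshold here are assumptions for illustration, not Hive's actual implementation:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class StaleCacheCleanup {
    // Hypothetical sketch: remove cache subdirectories whose last-modified
    // time predates the cutoff, covering dirs orphaned by a kill -9.
    static int deleteStaleDirs(Path cacheRoot, Instant cutoff) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(cacheRoot, Files::isDirectory)) {
            for (Path dir : dirs) {
                FileTime mtime = Files.getLastModifiedTime(dir);
                if (mtime.toInstant().isBefore(cutoff)) {
                    Files.delete(dir); // assumes empty dirs; real code would delete recursively
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("results-cache-root");
        Path stale = Files.createDirectory(root.resolve("results-old"));
        Files.setLastModifiedTime(stale, FileTime.from(Instant.now().minus(2, ChronoUnit.DAYS)));
        Files.createDirectory(root.resolve("results-new"));
        // Anything untouched for more than a day is treated as orphaned.
        int n = deleteStaleDirs(root, Instant.now().minus(1, ChronoUnit.DAYS));
        System.out.println(n); // prints 1
    }
}
```

Keying the scan off modification time (rather than a process-exit hook) is what makes the cleanup survive a kill -9, since the next process run can identify and remove whatever the previous one left behind.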
[jira] [Created] (HIVE-20250) Option to allow external tables to use query results cache
Jason Dere created HIVE-20250: - Summary: Option to allow external tables to use query results cache Key: HIVE-20250 URL: https://issues.apache.org/jira/browse/HIVE-20250 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20242) Query results cache: Improve ability of queries to use pending query results
Jason Dere created HIVE-20242: - Summary: Query results cache: Improve ability of queries to use pending query results Key: HIVE-20242 URL: https://issues.apache.org/jira/browse/HIVE-20242 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere HIVE-19138 allowed a currently running query to wait on the pending results of an already running query. [~gopalv], after testing with high concurrency, suggested further improving this by adding a way to switch to using the results cache even at the end of query compilation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
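The pending-results mechanism from HIVE-19138 that this builds on can be sketched as a cache whose entries expose a future: a second identical query blocks on the first query's completion instead of re-executing. The names below are illustrative, not QueryResultsCache's actual API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class PendingResultsCache {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
        new ConcurrentHashMap<>();

    // First caller for a query text becomes the producer; later callers get
    // the same (possibly still pending) future and wait instead of re-running.
    CompletableFuture<String> entryFor(String queryText) {
        return entries.computeIfAbsent(queryText, q -> new CompletableFuture<>());
    }

    public static void main(String[] args) throws Exception {
        PendingResultsCache cache = new PendingResultsCache();
        CompletableFuture<String> producer = cache.entryFor("SELECT count(*) FROM t");
        CompletableFuture<String> consumer = cache.entryFor("SELECT count(*) FROM t");
        // The second lookup sees the pending entry rather than a cache miss.
        System.out.println(producer == consumer); // prints true
        producer.complete("42");
        System.out.println(consumer.get()); // prints 42
    }
}
```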
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/clientpositive/spark/groupby8_noskew.q.out 2ef72b7c18 ql/src/test/results/clientpositive/spark/groupby9.q.out 316f936db3 ql/src/test/results/clientpositive/spark/groupby_position.q.out 7bb5f18e41 ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 873717273d ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 571203089d ql/src/test/results/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff 
ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/3/ Changes: https://reviews.apache.org/r/68013/diff/2-3/ Testing --- Thanks, Jason Dere
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/groupby7_map_multi_single_reducer.q.out 9d09491a46 ql/src/test/results/clientpositive/spark/groupby7_map_skew.q.out 5868f7abf9 ql/src/test/results/clientpositive/spark/groupby7_noskew.q.out 53345aac9e ql/src/test/results/clientpositive/spark/groupby7_noskew_multi_single_reducer.q.out 68809005e1 ql/src/test/results/clientpositive/spark/groupby8.q.out c6cac1bf80 ql/src/test/results/clientpositive/spark/groupby8_map.q.out 40d3e7c103 ql/src/test/results/clientpositive/spark/groupby8_map_skew.q.out 053c717d09 ql/src/test/results/clientpositive/spark/groupby8_noskew.q.out 2ef72b7c18 ql/src/test/results/clientpositive/spark/groupby9.q.out 316f936db3 ql/src/test/results/clientpositive/spark/groupby_position.q.out 7bb5f18e41 ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out 873717273d ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 571203089d ql/src/test/results/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda 
ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/2/ 
Changes: https://reviews.apache.org/r/68013/diff/1-2/ Testing --- Thanks, Jason Dere
Re: Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
> On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 20-21 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line20> > > > > Need to use slf4j. Will fix > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 35 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line35> > > > > Don't see a deleted file of earlier udf in patch. We shall delete that. Deleting UDFToString in the new patch > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToString.java > > Lines 53 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062580#file2062580line53> > > > > I guess there can be a string representation for map,array,struct. > > Wasn't earlier udf supporting it? If not, lets leave a TODO here. Just tried a query on Hive master casting complex types to String, it fails elsewhere during query compilation. So there seem to be other obstacles here besides this. 
2018-07-23T14:40:18,381 ERROR [7c182ca7-d2aa-4949-a3f3-ba034754c3c2 main] ql.Driver: FAILED: ClassCastException org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo java.lang.ClassCastException: org.apache.hadoop.hive.serde2.typeinfo.ListTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.isRedundantConversionFunction(TypeCheckProcFactory.java:893) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:996) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1468) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) at org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:240) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:186) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12684) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12639) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genSelectLogicalPlan(CalcitePlanner.java:4614) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:4951) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1740) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1688) > On July 23, 2018, 7:09 p.m., Ashutosh Chauhan wrote: > > ql/src/test/results/clientpositive/char_pad_convert.q.out > > Line 133 (original), 133 (patched) > > <https://reviews.apache.org/r/68013/diff/1/?file=2062590#file2062590line133> > > > > Lets add test for cast from dec to char/varchar as well. Both cases > > where size of char/varchar is bigger as well as smaller than decimal's > > scale. Added test in TestObjectInspectorConverters - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/68013/#review206338 --- On July 23, 2018, 6:37 a.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/68013/ > --- > > (Updated July 23, 2018, 6:37 a.m.) > > > Review request for hive, Ashutosh Chauhan and Sergey Shelukhin. > > > Bugs: HIVE-20082 > https://issues.apache.org/jira/browse/HIVE-20082 > > > Repository: hive-git > > > Description > --- > > preserve decimal 0-padding during decimal-to-string conversion > > > Diffs > - >
Review Request 68013: HIVE-20082 HiveDecimal to string conversion doesn't format the decimal correctly - master
/clientpositive/spark/infer_bucket_sort_map_operators.q.out 268dd10450 ql/src/test/results/clientpositive/spark/multi_insert_lateral_view.q.out 22fe91cb2b ql/src/test/results/clientpositive/spark/multi_insert_mixed.q.out 0dde265f8d ql/src/test/results/clientpositive/spark/smb_mapjoin_20.q.out fd0f1c0b26 ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out cecee578db ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out c41dba93ee ql/src/test/results/clientpositive/spark/stats1.q.out b755b4cc3a ql/src/test/results/clientpositive/spark/subquery_multi.q.out f90b353818 ql/src/test/results/clientpositive/spark/union17.q.out 93086a03fe ql/src/test/results/clientpositive/spark/union18.q.out 4b6c32daa7 ql/src/test/results/clientpositive/spark/union19.q.out 6d47270aee ql/src/test/results/clientpositive/spark/union20.q.out b9674089fe ql/src/test/results/clientpositive/spark/union32.q.out 925392b500 ql/src/test/results/clientpositive/spark/union33.q.out 190b6c0128 ql/src/test/results/clientpositive/spark/union6.q.out fca52a3dda ql/src/test/results/clientpositive/spark/union_remove_19.q.out bf8abf1b42 ql/src/test/results/clientpositive/spark/vector_string_concat.q.out cee7995a99 ql/src/test/results/clientpositive/stats1.q.out 10291ce4b5 ql/src/test/results/clientpositive/tablevalues.q.out 74fda005d5 ql/src/test/results/clientpositive/udf3.q.out 0f7c859db8 ql/src/test/results/clientpositive/udf_string.q.out 71b9b293df ql/src/test/results/clientpositive/union17.q.out b7748c0270 ql/src/test/results/clientpositive/union18.q.out 109fa8d4ff ql/src/test/results/clientpositive/union19.q.out f57d8fb4f9 ql/src/test/results/clientpositive/union20.q.out 6cc5eff503 ql/src/test/results/clientpositive/union32.q.out 92ed7d1d19 ql/src/test/results/clientpositive/union33.q.out 1b8b35b9c6 ql/src/test/results/clientpositive/union6.q.out 37c75214c3 ql/src/test/results/clientpositive/union_remove_19.q.out 0c67e67ca5 
ql/src/test/results/clientpositive/vector_case_when_1.q.out 59d813371d ql/src/test/results/clientpositive/vector_char_mapjoin1.q.out 73012578b8 ql/src/test/results/clientpositive/vector_decimal_1.q.out e61691273c ql/src/test/results/clientpositive/vector_decimal_expressions.q.out 0193f3bc88 ql/src/test/results/clientpositive/vector_string_concat.q.out 68b011d2e5 ql/src/test/results/clientpositive/vector_varchar_mapjoin1.q.out f956d58c5f ql/src/test/results/clientpositive/vectorized_casts.q.out a19b5ee67a serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorConverter.java 1e12ccaf3e serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java 6362f2ef57 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestObjectInspectorConverters.java 32fab314a5 serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/primitive/TestPrimitiveObjectInspectorUtils.java 3c2797e979 Diff: https://reviews.apache.org/r/68013/diff/1/ Testing --- Thanks, Jason Dere
Re: Review Request 67974: HIVE-20164
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67974/#review206321 --- ql/src/test/queries/clientpositive/murmur_hash_migration.q Lines 57 (patched) <https://reviews.apache.org/r/67974/#comment289246> Can you make this count(*)? Kind of hard to verify. - Jason Dere On July 20, 2018, 11:10 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67974/ > --- > > (Updated July 20, 2018, 11:10 p.m.) > > > Review request for hive, Gopal V and Jason Dere. > > > Bugs: HIVE-20164 > https://issues.apache.org/jira/browse/HIVE-20164 > > > Repository: hive-git > > > Description > --- > > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > > > Diffs > - > > itests/src/test/resources/testconfiguration.properties d5a33bd8ca > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1661aeccd7 > ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e > ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION > ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67974/diff/2/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 67974: HIVE-20164
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67974/#review206291 --- ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1672 (patched) <https://reviews.apache.org/r/67974/#comment289187> Remove these comments? ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1695 (patched) <https://reviews.apache.org/r/67974/#comment289188> please add curly braces ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java Lines 1701 (patched) <https://reviews.apache.org/r/67974/#comment289189> curly braces ql/src/test/queries/clientpositive/murmur_hash_migration.q Lines 36 (patched) <https://reviews.apache.org/r/67974/#comment289194> Does this test also need to query the inserted tables to show that things are working properly? - Jason Dere On July 19, 2018, 6:02 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/67974/ > --- > > (Updated July 19, 2018, 6:02 p.m.) > > > Review request for hive, Gopal V and Jason Dere. > > > Bugs: HIVE-20164 > https://issues.apache.org/jira/browse/HIVE-20164 > > > Repository: hive-git > > > Description > --- > > Murmur Hash : Make sure CTAS and IAS use correct bucketing version > > > Diffs > - > > itests/src/test/resources/testconfiguration.properties d08528f319 > ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 1b433c7498 > ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java bbce940c2e > ql/src/test/queries/clientpositive/murmur_hash_migration.q PRE-CREATION > ql/src/test/results/clientpositive/llap/murmur_hash_migration.q.out > PRE-CREATION > > > Diff: https://reviews.apache.org/r/67974/diff/1/ > > > Testing > --- > > > Thanks, > > Deepak Jaiswal > >
Re: Review Request 67970: HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67970/ --- (Updated July 19, 2018, 8:12 p.m.) Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez. Changes --- Update to fix failures in join45.q,join47.q,mapjoin47.q Bugs: HIVE-20204 https://issues.apache.org/jira/browse/HIVE-20204 Repository: hive-git Description --- Change GenericUDFIn to use FunctionRegistry.getCommonClassForComparison() to match type conversion done during other comparison operations. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 0800a10541 ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java 2ae015adf4 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java cf26fce00f ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java c91865b173 ql/src/test/queries/clientpositive/orc_ppd_decimal.q 2134a9f207 ql/src/test/queries/clientpositive/parquet_ppd_decimal.q e8e118d541 ql/src/test/queries/clientpositive/vectorization_parquet_ppd_decimal.q 0b0811b055 ql/src/test/results/clientpositive/llap/orc_ppd_decimal.q.out 4b535d4480 ql/src/test/results/clientpositive/parquet_ppd_decimal.q.out c9a4338dbf ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 Diff: https://reviews.apache.org/r/67970/diff/2/ Changes: https://reviews.apache.org/r/67970/diff/1-2/ Testing --- Thanks, Jason Dere
Review Request 67970: HIVE-20204 Type conversion during IN () comparisons is using different rules from other comparison operations
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67970/ --- Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez. Bugs: HIVE-20204 https://issues.apache.org/jira/browse/HIVE-20204 Repository: hive-git Description --- Change GenericUDFIn to use FunctionRegistry.getCommonClassForComparison() to match type conversion done during other comparison operations. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 0800a10541 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIn.java cf26fce00f ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java c91865b173 ql/src/test/queries/clientpositive/orc_ppd_decimal.q 2134a9f207 ql/src/test/queries/clientpositive/parquet_ppd_decimal.q e8e118d541 ql/src/test/queries/clientpositive/vectorization_parquet_ppd_decimal.q 0b0811b055 ql/src/test/results/clientpositive/llap/orc_ppd_decimal.q.out 4b535d4480 ql/src/test/results/clientpositive/parquet_ppd_decimal.q.out c9a4338dbf ql/src/test/results/clientpositive/vectorization_parquet_ppd_decimal.q.out 49d7354b60 Diff: https://reviews.apache.org/r/67970/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-20204) Type conversion during IN () comparisons is using different rules from other comparison operations
Jason Dere created HIVE-20204: - Summary: Type conversion during IN () comparisons is using different rules from other comparison operations Key: HIVE-20204 URL: https://issues.apache.org/jira/browse/HIVE-20204 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Noticed this while looking at HIVE-20082. The type conversion done during GenericUDFIn (via ReturnObjectInspectorResolver) uses FunctionRegistry.getCommonClass(), whereas the other comparison operators (=, <, >, <=, >=) use FunctionRegistry.getCommonClassForComparison(). As a result, dec_column IN ('1.1', '2.2') compares the values as strings, whereas dec_column = '1.1' would compare the values as doubles. This makes a difference for HIVE-20082 since it is related to changing the 0-padding during decimal-to-string conversions. cc [~ashutoshc] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
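The inconsistency described in HIVE-20204 can be sketched outside Hive. This is a hypothetical illustration, not Hive's actual code: `in_as_strings` stands in for the old GenericUDFIn behavior (common class of DECIMAL and STRING resolved to STRING), while `in_as_doubles` stands in for getCommonClassForComparison() semantics (DECIMAL vs STRING compared as DOUBLE, matching `dec_column = '1.1'`).

```python
from decimal import Decimal

def in_as_strings(value, literals):
    # Old IN behavior sketch: both sides converted to STRING, so the
    # 0-padded decimal string form "1.10" fails to match "1.1".
    return str(value) in literals

def in_as_doubles(value, literals):
    # Comparison-operator behavior sketch: both sides coerced to DOUBLE
    # first, so numerically equal values match regardless of padding.
    return float(value) in [float(x) for x in literals]

dec = Decimal("1.10")  # decimal whose string form carries trailing-zero padding
print(in_as_strings(dec, ["1.1", "2.2"]))  # False: "1.10" != "1.1" as strings
print(in_as_doubles(dec, ["1.1", "2.2"]))  # True: 1.10 == 1.1 as doubles
```

This is why the ticket matters for HIVE-20082: any change to 0-padding in decimal-to-string conversion silently changes IN results under the string-comparison rule, but not under the double-comparison rule.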
[jira] [Created] (HIVE-19981) Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped
Jason Dere created HIVE-19981: - Summary: Managed tables converted to external tables by the HiveStrictManagedMigration utility should be set to delete data when the table is dropped Key: HIVE-19981 URL: https://issues.apache.org/jira/browse/HIVE-19981 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Using the HiveStrictManagedMigration utility, tables can be converted to conform to the Hive strict managed tables mode. For managed tables that are converted to external tables by the utility, these tables should keep the "drop data on delete" semantics they had when they were managed tables. One way to do this is to introduce a table property "external.table.purge", which if true (and if the table is an external table), will let Hive know to delete the table data when the table is dropped. This property will be set by the HiveStrictManagedMigration utility when managed tables are converted to external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
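The drop-time semantics proposed in HIVE-19981 can be sketched as a small decision function. This is a simplified model, not Hive's actual drop-table code path; only the property name `external.table.purge` comes from the ticket.

```python
def should_delete_data_on_drop(table_type, table_params):
    """Sketch of the proposed drop-table decision (simplified)."""
    if table_type == "MANAGED_TABLE":
        return True  # managed tables always drop their data
    if table_type == "EXTERNAL_TABLE":
        # External tables keep their data on drop unless the table was
        # marked with external.table.purge=true (e.g. by the migration
        # utility, to preserve its former managed-table semantics).
        return table_params.get("external.table.purge", "false").lower() == "true"
    return False

converted = {"external.table.purge": "true"}       # set by HiveStrictManagedMigration
print(should_delete_data_on_drop("EXTERNAL_TABLE", converted))  # True
print(should_delete_data_on_drop("EXTERNAL_TABLE", {}))         # False
```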
Review Request 67608: HIVE-19898 Disable TransactionalValidationListener when the table is not in the Hive catalog
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67608/ --- Review request for hive and Eugene Koifman. Bugs: HIVE-19898 https://issues.apache.org/jira/browse/HIVE-19898 Repository: hive-git Description --- - Only run TransactionalValidationListener for hive catalog - Added unit test - Listener also did not seem to be getting the configuration that the metastore was being initialized with - made a change to how the conf was being retrieved. Diffs - itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestTransactionalValidationListener.java PRE-CREATION standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/TransactionalValidationListener.java 56da1151cc standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreClientTest.java a0e9d32546 Diff: https://reviews.apache.org/r/67608/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19898) Disable TransactionalValidationListener when the table is not in the Hive catalog
Jason Dere created HIVE-19898: - Summary: Disable TransactionalValidationListener when the table is not in the Hive catalog Key: HIVE-19898 URL: https://issues.apache.org/jira/browse/HIVE-19898 Project: Hive Issue Type: Bug Components: Metastore, Standalone Metastore Reporter: Jason Dere Assignee: Jason Dere The TransactionalValidationListener does validation of tables specified as transactional tables, as well as enforcing create.as.acid. While this can be useful to Hive, this may not be useful to other catalogs which do not support transactional tables, and would not benefit from being automatically tagged as a transactional table. This should be changed so the TransactionalValidationListener does not run for non-hive catalogs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19892) Disable query results cache for HiveServer2 doAs=true
Jason Dere created HIVE-19892: - Summary: Disable query results cache for HiveServer2 doAs=true Key: HIVE-19892 URL: https://issues.apache.org/jira/browse/HIVE-19892 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere If running HS2 with doAs=true, the temp query results directory will have ownership/permissions based on the doAs user. A subsequent query running as a different user may not be able to access this query results directory. Results caching will have to be disabled in this case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19883) QTestUtil: initDataset() can be affected by the settings of the previous test
Jason Dere created HIVE-19883: - Summary: QTestUtil: initDataset() can be affected by the settings of the previous test Key: HIVE-19883 URL: https://issues.apache.org/jira/browse/HIVE-19883 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Jason Dere Tried creating a test that set metastore.create.as.acid/hive.create.as.insert.only, and I found that the built-in table default.src was being created as an insert-only transactional table, which will cause errors in other tests that do not set the TxnManager to one that supports transactional tables. It appears that initDataset() uses the old CliDriver that was used for the previous test, which has any settings used during that test: {noformat} java.lang.Exception: Creating src at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4926) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:428) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2659) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2311) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1982) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1683) [hive-exec-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1677) [hive-exec-4.0.0-SNAPSHOT.jar:?] 
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-4.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.hive.ql.QTestUtil.initDataset(QTestUtil.java:1277) [classes/:?] at org.apache.hadoop.hive.ql.QTestUtil.initDataSetForTest(QTestUtil.java:1259) [classes/:?] at org.apache.hadoop.hive.ql.QTestUtil.cliInit(QTestUtil.java:1328) [classes/:?] at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:176) [classes/:?] at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [classes/:?] at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:59) [test-classes/:?] {noformat} A new CliDriver is created for the new test, but only after we've created the dataset tables for the next test (see the line numbers for QTestUtil.cliInit() in both stack traces). {noformat} CliSessionState(SessionState).getConf() line: 317 CliDriver.<init>() line: 110 QTestUtil.cliInit(File, boolean) line: 1360 CoreCliDriver.runTest(String, String, String) line: 176 CoreCliDriver(CliAdapter).runTest(String, File) line: 104 TestMiniLlapLocalCliDriver.testCliDriver() line: 59 {noformat} I think the fix is to move the creation of the new CliDriver higher up in QTestUtil.cliInit(), before we call initDataset(). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
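The ordering bug in HIVE-19883 can be modeled with a toy session sketch (hypothetical names; the real classes are CliDriver and QTestUtil): if the dataset is initialized before a fresh driver is created, the dataset tables inherit the previous test's settings.

```python
class FakeCliDriver:
    def __init__(self, conf=None):
        self.conf = dict(conf or {})

def init_dataset(driver):
    # The dataset tables are created under whatever conf the driver carries.
    return {"src_created_with": dict(driver.conf)}

def cli_init_buggy(old_driver):
    # Bug sketch: dataset built with the *previous* test's driver, so its
    # leftover settings (e.g. create.as.acid) leak into default.src.
    dataset = init_dataset(old_driver)
    new_driver = FakeCliDriver()       # fresh driver created too late
    return new_driver, dataset

def cli_init_fixed(old_driver):
    # Fix sketch: create the fresh driver first, then init the dataset.
    new_driver = FakeCliDriver()
    dataset = init_dataset(new_driver)
    return new_driver, dataset

old = FakeCliDriver({"metastore.create.as.acid": "true"})
_, buggy = cli_init_buggy(old)
_, fixed = cli_init_fixed(old)
print(buggy["src_created_with"])  # leaked setting from the previous test
print(fixed["src_created_with"])  # {}
```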
Review Request 67540: HIVE-19861 Fix temp table path generation for acid table export
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67540/ --- Review request for hive and Eugene Koifman. Bugs: HIVE-19861 https://issues.apache.org/jira/browse/HIVE-19861 Repository: hive-git Description --- Change DDLTask so temp tables do not get location generated. Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e06949928d ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java 209fdfb287 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 2e055aba4b ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 04292787a8 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 83490d2d53 Diff: https://reviews.apache.org/r/67540/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19861) Fix temp table path generation for acid table export
Jason Dere created HIVE-19861: - Summary: Fix temp table path generation for acid table export Key: HIVE-19861 URL: https://issues.apache.org/jira/browse/HIVE-19861 Project: Hive Issue Type: Bug Components: Import/Export, Transactions Reporter: Jason Dere Assignee: Jason Dere Temp tables that are analyzed by the SemanticAnalyzer get their default location set to a location in the session directory. Export of Acid tables also creates temp tables, but this is done via a plan transformation, and the temp table creation never goes through the SemanticAnalyzer, meaning the location is not set. There is some other logic in DDLTask (which I am changing in HIVE-19837) which ends up automatically setting this path to the default table location in the warehouse directory. This should be fixed so that the path defaults to a location in the session directory, like with normal temp tables. cc [~ekoifman] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19837) Setting to have different default location for external tables
Jason Dere created HIVE-19837: - Summary: Setting to have different default location for external tables Key: HIVE-19837 URL: https://issues.apache.org/jira/browse/HIVE-19837 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Allow external tables to have a different default location than managed tables -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19778) Flaky test: TestCliDriver#input31
Jason Dere created HIVE-19778: - Summary: Flaky test: TestCliDriver#input31 Key: HIVE-19778 URL: https://issues.apache.org/jira/browse/HIVE-19778 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Jason Dere Noticed this one has been failing occasionally on precommit test runs. {noformat} Running: diff -a /home/hiveptest/35.193.227.186-hiveptest-1/apache-github-source-source/itests/qtest/target/qfile-results/clientpositive/input31.q.out /home/hiveptest/35.193.227.186-hiveptest-1/apache-github-source-source/ql/src/test/results/clientpositive/input31.q.out 128c128 < 496 --- > 242 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19777) NPE in TezSessionState
Jason Dere created HIVE-19777: - Summary: NPE in TezSessionState Key: HIVE-19777 URL: https://issues.apache.org/jira/browse/HIVE-19777 Project: Hive Issue Type: Bug Components: Tez Reporter: Jason Dere Encountered while running "insert into table values (..)" Looks like it is due to the fact that TezSessionState.close() sets console to null at the start of the method, and then calls getSession() which attempts to log to console. {noformat} java.lang.NullPointerException: null at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.getSession(TezSessionState.java:711) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:646) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:353) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.getSession(TezSessionPoolManager.java:467) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getUnmanagedSession(WorkloadManagerFederation.java:66) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.WorkloadManagerFederation.getSession(WorkloadManagerFederation.java:38) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:184) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2497) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2149) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1826) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1569) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1563) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683) ~[hive-cli-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_121] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_121] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_121] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_121] at org.apache.hadoop.util.RunJar.run(RunJar.java:308) ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?] at org.apache.hadoop.util.RunJar.main(RunJar.java:222) ~[hadoop-common-3.0.0.3.0.0.0-SNAPSHOT.jar:?] {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
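The NPE pattern in HIVE-19777 is a plain use-after-clear ordering bug; a toy model (hypothetical names, standing in for TezSessionState, with AttributeError standing in for the Java NPE):

```python
class Session:
    def __init__(self):
        self.console = ["log sink"]
        self.session = "tez-session"

    def get_session(self):
        # getSession() logs to console, so it fails if console is gone.
        if self.console is None:
            raise AttributeError("console is None")
        return self.session

    def close_buggy(self):
        self.console = None        # console cleared at the start of close()...
        return self.get_session()  # ...then a console-using call blows up

    def close_fixed(self):
        s = self.get_session()     # do console-dependent work first
        self.console = None        # clear console last
        return s
```

The sketch suggests the same fix options as the ticket implies: either defer nulling out `console` until the end of `close()`, or null-guard the logging inside `getSession()`.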
[jira] [Created] (HIVE-19768) Utility to convert tables to conform to Hive strict managed tables mode
Jason Dere created HIVE-19768: - Summary: Utility to convert tables to conform to Hive strict managed tables mode Key: HIVE-19768 URL: https://issues.apache.org/jira/browse/HIVE-19768 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere Create a utility that can check existing hive tables and convert them if necessary to conform to strict managed tables mode. - Managed non-transactional ORC tables will be converted to full transactional tables - Managed non-transactional tables of other types will be converted to insert-only transactional tables - Tables with non-native storage/schema will be converted to external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
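The three conversion rules above can be sketched as a mapping function. This is a simplified, hypothetical helper; the real utility is HiveStrictManagedMigration and applies many more checks.

```python
def conversion_action(is_managed, is_transactional, storage, native=True):
    """Sketch of the conversion rules listed in HIVE-19768 (simplified)."""
    if not native:
        # Non-native storage/schema cannot be transactional at all.
        return "convert to external"
    if is_managed and not is_transactional:
        if storage == "ORC":
            return "convert to full transactional"
        return "convert to insert-only transactional"
    return "leave as-is"

print(conversion_action(True, False, "ORC"))                 # full transactional
print(conversion_action(True, False, "TEXTFILE"))            # insert-only
print(conversion_action(True, False, "ORC", native=False))   # external
```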
[jira] [Created] (HIVE-19753) Strict managed tables mode in Hive
Jason Dere created HIVE-19753: - Summary: Strict managed tables mode in Hive Key: HIVE-19753 URL: https://issues.apache.org/jira/browse/HIVE-19753 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Create a mode in Hive which enforces that all managed tables are transactional (both full and insert-only tables allowed). Non-transactional tables, as well as non-native tables, must be created as external tables when this mode is enabled. The idea would be that in strict managed tables mode all of the data written to managed tables would have been done through Hive. The mode would be enabled using the config setting hive.strict.managed.tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19563) Flaky test: TestMiniLlapLocalCliDriver.tez_vector_dynpart_hashjoin_1
Jason Dere created HIVE-19563: - Summary: Flaky test: TestMiniLlapLocalCliDriver.tez_vector_dynpart_hashjoin_1 Key: HIVE-19563 URL: https://issues.apache.org/jira/browse/HIVE-19563 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Jason Dere {noformat} Client Execution succeeded but contained differences (error code = 1) after executing tez_vector_dynpart_hashjoin_1.q 407c407 < -13036 1 --- > -8915 1 410c410 < -8915 1 --- > -13036 1 {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 67138: HIVE-4367 enhance TRUNCATE syntax to drop data of external table
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/67138/ --- Review request for hive and Teddy Choi. Bugs: HIVE-4367 https://issues.apache.org/jira/browse/HIVE-4367 Repository: hive-git Description --- Allow TRUNCATE TABLE for external tables with FORCE option Diffs - itests/src/test/resources/testconfiguration.properties cf6d19a593 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f0b9edaf01 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 09a4368984 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 3712a53521 ql/src/test/queries/clientpositive/truncate_external_force.q PRE-CREATION ql/src/test/results/clientpositive/llap/truncate_external_force.q.out PRE-CREATION Diff: https://reviews.apache.org/r/67138/diff/1/ Testing --- qtest Thanks, Jason Dere
[jira] [Created] (HIVE-19489) Disable stats autogather for external tables
Jason Dere created HIVE-19489: - Summary: Disable stats autogather for external tables Key: HIVE-19489 URL: https://issues.apache.org/jira/browse/HIVE-19489 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Jason Dere Assignee: Jason Dere Hive auto-gather of table statistics can result in incorrect generation of stats (and the stats being marked as accurate) in the case of external tables where the data is being written by external apps. To avoid this issue, stats autogather will be disabled on external tables when loading/inserting into a table with existing data, if HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS is enabled. In this situation, users should rely on explicitly calling ANALYZE TABLE on their external tables to make sure the stats are kept up-to-date. Autogather of stats will still be allowed to occur on external tables in the case of INSERT OVERWRITE or LOAD DATA OVERWRITE, since the existing data is being removed and so the stats calculated on the inserted/loaded data should be accurate. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
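The autogather rule described in HIVE-19489 reduces to a small predicate; a sketch (simplified, with `guard_enabled` standing in for the HIVE_DISABLE_UNSAFE_EXTERNALTABLE_OPERATIONS setting):

```python
def autogather_stats_allowed(is_external, is_overwrite, guard_enabled=True):
    """Sketch of the HIVE-19489 stats-autogather decision (simplified)."""
    if not guard_enabled or not is_external:
        return True
    # External table with the guard on: only overwrite operations
    # (INSERT OVERWRITE / LOAD DATA OVERWRITE) may autogather stats,
    # since they replace any externally written data, so the computed
    # stats reflect the full table contents.
    return is_overwrite

print(autogather_stats_allowed(is_external=True, is_overwrite=False))   # False
print(autogather_stats_allowed(is_external=True, is_overwrite=True))    # True
print(autogather_stats_allowed(is_external=False, is_overwrite=False))  # True
```

For the disabled case, users fall back on explicit ANALYZE TABLE runs to keep stats current, as the ticket notes.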
Re: Review Request 66999: HIVE-19453
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66999/#review202699 --- ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g Line 838 (original), 839 (patched) <https://reviews.apache.org/r/66999/#comment284680> Should the inputFileFormat expression be aliased, like '(inputFileFmt=inputFileFormat)?', and referenced in the line below as '$inputFileFmt?' ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g Line 839 (original), 840 (patched) <https://reviews.apache.org/r/66999/#comment284686> Might be useful to be able to pass in SerDe params which are used to initialize the SerDe - this could be useful for some SerDes. For example LazySimpleSerDe allows you to pass in the field separator, or set the timestamp format etc. ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java Lines 475 (patched) <https://reviews.apache.org/r/66999/#comment284684> Is this supposed to be set using the class name (String), or the actual class object (Class)? Do the inputFormat/serde classes need to be validated here? ql/src/test/queries/clientpositive/load_data_using_job.q Lines 90 (patched) <https://reviews.apache.org/r/66999/#comment284685> Previously what would indicate to Hive that an INSERT plan was required, as opposed to just saving the data as-is like is done for a traditional LOAD DATA? - Jason Dere On May 8, 2018, 6:12 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66999/ > --- > > (Updated May 8, 2018, 6:12 a.m.) > > > Review request for hive, Jason Dere and Prasanth_J. > > > Bugs: HIVE-19453 > https://issues.apache.org/jira/browse/HIVE-19453 > > > Repository: hive-git > > > Description > --- > > Extend the load data statement to take the inputformat of the source files > and the serde to interpret it as parameter. 
For eg, > > load data local inpath > '../../data/files/load_data_job/partitions/load_data_2_partitions.txt' INTO > TABLE srcbucket_mapjoin > INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' > SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'; > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g a837d67b96 > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java > 2b88ea651b > ql/src/test/queries/clientpositive/load_data_using_job.q 3928f1fa07 > ql/src/test/results/clientpositive/llap/load_data_using_job.q.out > 116630c237 > > > Diff: https://reviews.apache.org/r/66999/diff/1/ > > > Testing > --- > > Added a test to load_data_using_job.q > > > Thanks, > > Deepak Jaiswal > >
[jira] [Created] (HIVE-19467) Make storage format configurable for temp tables created using LLAP external client
Jason Dere created HIVE-19467: - Summary: Make storage format configurable for temp tables created using LLAP external client Key: HIVE-19467 URL: https://issues.apache.org/jira/browse/HIVE-19467 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere Temp tables created for complex queries when using the LLAP external client are created using the default storage format. Default to orc, and make configurable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66862: HIVE-19258 add originals support to MM tables (and make the conversion a metadata only operation)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66862/#review202395 --- ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java Lines 553 (patched) <https://reviews.apache.org/r/66862/#comment284304> 'fi' - comment chopped off? ql/src/test/queries/clientpositive/mm_conversions.q Lines 28 (patched) <https://reviews.apache.org/r/66862/#comment284204> No golden file changes for this test in this patch. - Jason Dere On May 3, 2018, 2:23 a.m., Sergey Shelukhin wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66862/ > --- > > (Updated May 3, 2018, 2:23 a.m.) > > > Review request for hive and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > see jira > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 6358ff3002 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 7e17d5d888 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 3141a7e981 > ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 969c591917 > ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 183515a0ed > ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java b25bb1de49 > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 2337a350e6 > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java > b698c84080 > ql/src/test/queries/clientpositive/mm_conversions.q 55565a9428 > > > Diff: https://reviews.apache.org/r/66862/diff/2/ > > > Testing > --- > > > Thanks, > > Sergey Shelukhin > >
Review Request 66887: HIVE-19336 Disable SMB/Bucketmap join for external tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66887/ --- Review request for hive and Deepak Jaiswal. Bugs: HIVE-19336 https://issues.apache.org/jira/browse/HIVE-19336 Repository: hive-git Description --- Disable SMB/Bucketmap join for external tables by default Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java 7121bceb22 ql/src/test/queries/clientpositive/bucket_map_join_tez2.q 1361e32c1a ql/src/test/queries/clientpositive/tez_smb_1.q ecfb0dcf79 ql/src/test/results/clientpositive/llap/bucket_map_join_tez2.q.out fa90ccd556 ql/src/test/results/clientpositive/llap/tez_smb_1.q.out faa948627e Diff: https://reviews.apache.org/r/66887/diff/1/ Testing --- qfile tests Thanks, Jason Dere
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review202074 --- ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java Line 341 (original), 352 (patched) <https://reviews.apache.org/r/66567/#comment283728> Can you just add a comment here describing why it is ok to hardcode bucketing version to 2 here? - Jason Dere On April 27, 2018, 1:14 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 27, 2018, 1:14 a.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses Java hash, which is not as good as Murmur for distribution > and efficiency in bucketing a table. > Migrate to Murmur hash but still keep backward compatibility for existing > users so that they don't have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in a high number of result updates. 
> > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > fe2b1c1f3c > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 1a346593fd > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > 
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java > d4363fdf91 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 25035433c7 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java > a42c299537 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/keyseries/VectorKeySeriesSerializedImpl.java > 86f466fc4e > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java > 1bc3fdabac > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 71498a125c > ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 019682fb10 > ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java a51fdd322f > ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java > 7121bceb22 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/FixedBucketPruningOptimizer.java > 5f65f638ca > ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerOperatorFactory.java > 2be3c9b9a2 > > ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedDynPartitionOpti
[jira] [Created] (HIVE-19336) Disable SMB/Bucketmap join for external tables
Jason Dere created HIVE-19336: - Summary: Disable SMB/Bucketmap join for external tables Key: HIVE-19336 URL: https://issues.apache.org/jira/browse/HIVE-19336 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19335) Disable runtime filtering (semijoin reduction opt with bloomfilter) for external tables
Jason Dere created HIVE-19335: - Summary: Disable runtime filtering (semijoin reduction opt with bloomfilter) for external tables Key: HIVE-19335 URL: https://issues.apache.org/jira/browse/HIVE-19335 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Even with good stats runtime filtering can cause issues, if they are out of date things are even worse. Disable by default for external tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19334) Use actual file size rather than stats for fetch task optimization with external tables
Jason Dere created HIVE-19334: - Summary: Use actual file size rather than stats for fetch task optimization with external tables Key: HIVE-19334 URL: https://issues.apache.org/jira/browse/HIVE-19334 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19333) Disable operator tree branch removal using stats
Jason Dere created HIVE-19333: - Summary: Disable operator tree branch removal using stats Key: HIVE-19333 URL: https://issues.apache.org/jira/browse/HIVE-19333 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Can result in wrong results if branch removal occurs due to out-of-date stats -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19332) Disable compute.query.using.stats for external table
Jason Dere created HIVE-19332: - Summary: Disable compute.query.using.stats for external table Key: HIVE-19332 URL: https://issues.apache.org/jira/browse/HIVE-19332 Project: Hive Issue Type: Sub-task Reporter: Jason Dere -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19329) Disallow some optimizations/behaviors for external tables
Jason Dere created HIVE-19329: - Summary: Disallow some optimizations/behaviors for external tables Key: HIVE-19329 URL: https://issues.apache.org/jira/browse/HIVE-19329 Project: Hive Issue Type: Bug Reporter: Jason Dere Assignee: Jason Dere External tables in Hive are often used in situations where the data is being created and managed by other applications outside of Hive. There are several issues that can occur when data is being written to table directories by external apps: - If an application is writing files to a table/partition at the same time that Hive tries to merge files for the same table/partition (ALTER TABLE CONCATENATE, or hive.merge.tezfiles during insert), data can be lost. - When new data has been added to the table by external applications, the Hive table statistics are often way out of date with the current state of the data. This can result in wrong results in the case of answering queries using stats, or bad query plans being generated. Some of these operations should be blocked in Hive. It looks like some already have been (HIVE-17403). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201975 --- hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java Lines 179 (patched) <https://reviews.apache.org/r/66567/#comment283611> Check the existing table params for bucketing_version before hard-coding to v2. ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java Lines 143 (patched) <https://reviews.apache.org/r/66567/#comment283612> This derives from Operator? So it should already have the bucketingVersion field from that? ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java Line 339 (original), 339 (patched) <https://reviews.apache.org/r/66567/#comment283613> I think this change is no longer necessary. standalone-metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/hive_metastoreConstants.java Lines 89 (patched) <https://reviews.apache.org/r/66567/#comment283614> Is this no longer used? - Jason Dere On April 25, 2018, 7:21 a.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 25, 2018, 7:21 a.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. 
> > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > fe2b1c1f3c > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 2c1a76d89b > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java > d4363fdf91 > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6395c31ec7 
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/keyseries/VectorKeySeriesSerializedImpl.java > 86f466fc4e > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkCommonOperator.java > 4077552a56 > > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/reducesink/VectorReduceSinkObjectHashOperator.java > 1bc3fdabac > ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java > 71498a125c
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
> On April 24, 2018, 11:29 p.m., Jason Dere wrote: > > ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java > > Line 156 (original), 156 (patched) > > <https://reviews.apache.org/r/66567/diff/1-5/?file=1996135#file1996135line156> > > > > What is the point of the conf and HIVE_BUCKETING_JAVA_HASH, is it > > supposed to be for testing? I don't see this setting being used anywhere. > > Deepak Jaiswal wrote: > The setting lets users use the old bucketing logic if they want to. I am > working on a testcase to cover it. Why not just allow users to set bucketing_version in the table properties? > On April 24, 2018, 11:29 p.m., Jason Dere wrote: > > ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out > > Lines 181 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996191#file1996191line181> > > > > Why did bucketing version disappear here? > > Deepak Jaiswal wrote: > I removed a place where I was setting it, possibly due to that. Is it a bug that it is not being set here now? - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201866 --- On April 23, 2018, 5:26 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 23, 2018, 5:26 p.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. 
> > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2403d7ac6c > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 3aaa68b11f > 
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a > ql/src/jav
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201866 --- ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java Line 138 (original), 139 (patched) <https://reviews.apache.org/r/66567/#comment283474> Remove? ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java Line 156 (original), 156 (patched) <https://reviews.apache.org/r/66567/#comment283478> What is the point of the conf and HIVE_BUCKETING_JAVA_HASH, is it supposed to be for testing? I don't see this setting being used anywhere. ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out Lines 181 (patched) <https://reviews.apache.org/r/66567/#comment283479> Why did bucketing version disappear here? ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java Lines 1605 (patched) <https://reviews.apache.org/r/66567/#comment283473> Can you use bucketing version from OpTraits, rather than having to redefine it here? ql/src/test/results/clientpositive/results_cache_invalidation2.q.out Lines 88 (patched) <https://reviews.apache.org/r/66567/#comment283475> This plan should not change to not using the cache. It's possible this is because of HIVE-19232. - Jason Dere On April 23, 2018, 5:26 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 23, 2018, 5:26 p.m.) > > > Review request for hive, Ashutosh Chauhan, Eugene Koifman, Jason Dere, and > Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. 
> > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2403d7ac6c > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_table.q.out > ab8ad77074 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_directory.q.out > 2b28a6677e > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_dynamic_partitions.q.out > cdb67dd786 > > itests/hive-blobstore/src/test/results/clientpositive/insert_overwrite_table.q.out > 2c23a7e94f > > itests/hive-blobstore/src/test/results/clientpositive/write_final_output_blobstore.q.out > a1be085ea5 > > 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java > 82ba775286 > itests/src/test/resources/testconfiguration.properties 3aaa68b11f > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java c084fa054c > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d59bf1fb6e > ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java c28ef99621 > ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 21ca04d78a >
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
> On April 16, 2018, 7:45 p.m., Sergey Shelukhin wrote: > > llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java > > Lines 331 (patched) > > <https://reviews.apache.org/r/66514/diff/3/?file=1996069#file1996069line331> > > > > is this related? I threw that in, since this patch (plus this fix) also fixed TestAcidOnTez#testGetSplitsLocks - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/#review201250 --- On April 11, 2018, 7:58 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66514/ > --- > > (Updated April 11, 2018, 7:58 p.m.) > > > Review request for hive, Eugene Koifman and Sergey Shelukhin. > > > Repository: hive-git > > > Description > --- > > Replace usage of SessionState.getTxnMgr() from several places, by doing some > refactoring to make the TxnManager available in fields passed in during > construction/initialization: > - SemanticAnalyzer.genFileSinkPlan() > - ReplicationSemanticAnalyzer.analyzeReplLoad() > - LoadSemanticAnalyzer.analyzeExternal() > - ImportSemanticAnalyzer.prepareImport() > - DDLSemanticAnalyzer.handleTransactionalTable() > > > Diffs > - > > > llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java > 3aec46be51 > ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 > ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java > 6b333d7184 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java > 60c85f58e5 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java > bc7d0ad0b9 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java > 06adc64727 > > ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java > 1395027159 > > 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java > bb51f36a25 > ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > f38b0bc546 > ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java > 8b639f7922 > ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java > e49089b91e > > ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java > e5af95b121 > > ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java > 79b2e48ee2 > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 10982ddbd1 > > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java > 3ccd639d62 > > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java > 4cd75d8128 > ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6003ced27e > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java > fe570f0f8e > > > Diff: https://reviews.apache.org/r/66514/diff/3/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
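The refactoring described in the review above replaces static lookups of the transaction manager (`SessionState.getTxnMgr()`) with a manager passed in at construction time. A minimal, hypothetical sketch of that constructor-injection shape (the `TxnManager` interface and class names here are illustrative, not Hive's actual `HiveTxnManager` API):

```java
// Illustrative sketch of constructor injection replacing a static
// session-state lookup. Names are hypothetical, not Hive's real API.
interface TxnManager {
    long openTxn(); // stand-in for whatever the analyzer needs from the manager
}

class SemanticAnalyzerSketch {
    // The manager is a field supplied by the caller during construction,
    // instead of being fetched via a static SessionState accessor.
    private final TxnManager txnMgr;

    SemanticAnalyzerSketch(TxnManager txnMgr) {
        this.txnMgr = txnMgr;
    }

    long genFileSinkPlan() {
        // Uses the injected manager; the method no longer depends on
        // ambient session state, which also makes it testable in isolation.
        return txnMgr.openTxn();
    }
}
```

The benefit is that callers (and tests) control which transaction manager a given analyzer sees, rather than every code path implicitly sharing one global instance.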
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
> On April 14, 2018, 1:13 a.m., Jason Dere wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java > > Lines 813 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996736#file1996736line814> > > > > For these primitive types, might make sense to pre-allocate fixed size > > ByteBuffers of size 2/4/8 which can be used here rather than having to > > allocate new ones for every value. > > Deepak Jaiswal wrote: > That is how I did it before but it would send a byte array of length 8 > all the time. The murmur function would consider all 8 bytes to generate > hash. When I noticed it was creating different hashes for same key, I found > the bug, hence the specific size allocation. Also, it won't affect the > efficiency. What I mean is this is performing an allocation for every call to hashCode() here, which I think could affect the efficiency. This could be avoided by passing in pre-allocated arrays of each size to this method. Also, could you use the other version of hash32() where you can also pass in the array length - that way you could just use the same array of size 8, but pass in length 2/4/8 depending on which type you are hashing. > On April 14, 2018, 1:13 a.m., Jason Dere wrote: > > serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java > > Lines 858 (patched) > > <https://reviews.apache.org/r/66567/diff/1/?file=1996736#file1996736line859> > > > > Old impl (based on DateWritable.hashCode()) did hashCode based on > > daysSinceEpoc value, will be faster than doing toString() > > Deepak Jaiswal wrote: > The new one converts it into string format to get bytes array. Are you > suggesting what we get from getPrimitiveWritableObject is daysSinceEpoc? And > since it is integer, it is faster to convert it into byte array directly > rather than doing "toString"? Yes, DateWritable.toString() converts to Date, which then has to call toString() which means date conversion/formatting. 
Simpler to base it on the int value. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201133 --- On April 12, 2018, 6:24 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 12, 2018, 6:24 p.m.) > > > Review request for hive, Eugene Koifman, Jason Dere, and Matt McCline. > > > Bugs: HIVE-18910 > https://issues.apache.org/jira/browse/HIVE-18910 > > > Repository: hive-git > > > Description > --- > > Hive uses JAVA hash which is not as good as murmur for better distribution > and efficiency in bucketing a table. > Migrate to murmur hash but still keep backward compatibility for existing > users so that they dont have to reload the existing tables. > > To keep backward compatibility, bucket_version is added as a table property, > resulting in high number of result updates. > > > Diffs > - > > hbase-handler/src/test/results/positive/external_table_ppd.q.out cdc43ee560 > hbase-handler/src/test/results/positive/hbase_binary_storage_queries.q.out > 153613e6d0 > hbase-handler/src/test/results/positive/hbase_ddl.q.out ef3f5f704e > hbase-handler/src/test/results/positive/hbasestats.q.out 5d000d2f4f > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/AbstractRecordWriter.java > 924e233293 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolver.java > 5dd0b8ea5b > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java > 7c2cadefa7 > > hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/MutatorCoordinator.java > ad14c7265f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java > 3733e3d02f > > hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java > 03c28a33c8 > > 
hcatalog/webhcat/java-client/src/main/java/org/apache/hive/hcatalog/api/HCatTable.java > 996329195c > > hcatalog/webhcat/java-client/src/test/java/org/apache/hive/hcatalog/api/TestHCatClient.java > f9ee9d9a03 > > itests/hive-blobstore/src/test/results/clientpositive/insert_into_dynamic_partitions.q.out > caa00292b8 > > itests/h
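The buffer-reuse point discussed in the reply above can be sketched concretely: with a length-aware hash function, one pre-allocated 8-byte scratch array can serve all primitive widths, because only the first `len` bytes are read — so stale bytes left over from a previous (wider) value cannot change the result. This is a self-contained illustration using a minimal Murmur3 32-bit implementation, not Hive's actual `ObjectInspectorUtils` or `Murmur3` code:

```java
// Sketch: reuse one scratch buffer across hash calls by passing the
// value's byte length, instead of allocating a fresh array per value.
class HashDemo {
    // Minimal Murmur3 x86 32-bit over the first `len` bytes of `data`.
    static int murmur3_32(byte[] data, int len, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h = seed;
        int i = 0;
        for (; i + 4 <= len; i += 4) {
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        int k = 0;
        switch (len - i) {           // tail: fewer than 4 bytes remain
            case 3: k ^= (data[i + 2] & 0xff) << 16;
            case 2: k ^= (data[i + 1] & 0xff) << 8;
            case 1: k ^= (data[i] & 0xff);
                    k *= c1; k = Integer.rotateLeft(k, 15); k *= c2; h ^= k;
        }
        h ^= len;                    // finalization mix
        h ^= h >>> 16; h *= 0x85ebca6b;
        h ^= h >>> 13; h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // One scratch array, sized for the widest primitive (long = 8 bytes).
    private final byte[] scratch = new byte[8];

    int hashLong(long v) {
        for (int i = 0; i < 8; i++) scratch[i] = (byte) (v >>> (8 * i));
        return murmur3_32(scratch, 8, 0);
    }

    int hashInt(int v) {
        for (int i = 0; i < 4; i++) scratch[i] = (byte) (v >>> (8 * i));
        // Only the first 4 bytes are hashed, so any stale bytes in
        // scratch[4..7] from a previous hashLong() call are ignored.
        return murmur3_32(scratch, 4, 0);
    }
}
```

This avoids the bug the reply mentions (hashing all 8 bytes of a reused buffer gave different hashes for the same key) without paying for a new allocation on every call.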
Re: Review Request 66567: Migrate to Murmur hash for shuffle and bucketing
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66567/#review201133 --- hbase-handler/src/test/results/positive/external_table_ppd.q.out Lines 59 (patched) <https://reviews.apache.org/r/66567/#comment282148> Are there any tests for the old-style bucketing, to make sure that previously created bucketed tables still work properly? hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/mutate/worker/BucketIdResolverImpl.java Lines 25 (patched) <https://reviews.apache.org/r/66567/#comment282146> Unnecessary change? itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java Lines 850 (patched) <https://reviews.apache.org/r/66567/#comment282150> missing comment? ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java Line 1053 (original), 1051 (patched) <https://reviews.apache.org/r/66567/#comment282162> If this occurs every row, I wonder if it would be better to determine the bucketing version once during initializeOp() and create some object which knows which bucketing hash code method to call here ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java Lines 469 (patched) <https://reviews.apache.org/r/66567/#comment282170> should we validate that this is a valid bucketing version that we support? ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java Lines 639 (patched) <https://reviews.apache.org/r/66567/#comment282173> Do we also need to check the bucketing type in the case that op is not a TableScan? If op is a ReduceSink or Join, would that end up being bucketingVersion 2? ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/AnnotateWithOpTraits.java Lines 72 (patched) <https://reviews.apache.org/r/66567/#comment282176> Was this commented code for testing? 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/metainfo/annotation/OpTraitsRulesProcFactory.java Lines 411 (patched) <https://reviews.apache.org/r/66567/#comment282178> It seems to me a lot of the logic will treat -1 as bucketing version 1, since there are a lot of (bucketingVersion == 2 ? doVersion2 : doVersion1) statements. Where in the code would SMB be disabled because of -1 bucketingVersion? ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java Lines 187 (patched) <https://reviews.apache.org/r/66567/#comment282180> Maybe make some common utility to parse/validate bucketing version, that both places can use? ql/src/java/org/apache/hadoop/hive/ql/plan/TableDesc.java Lines 198 (patched) <https://reviews.apache.org/r/66567/#comment282179> Validate bucketing version number? ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java Lines 32 (patched) <https://reviews.apache.org/r/66567/#comment282181> Docs for this UDF will probably need to mention that this uses the old hashing/bucketing scheme and that a new one has replaced it. ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFMurmurHash.java Lines 1 (patched) <https://reviews.apache.org/r/66567/#comment282182> Missing Apache header serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 813 (patched) <https://reviews.apache.org/r/66567/#comment282184> For these primitive types, might make sense to pre-allocate fixed size ByteBuffers of size 2/4/8 which can be used here rather than having to allocate new ones for every value. 
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 858 (patched) <https://reviews.apache.org/r/66567/#comment282185> Old impl (based on DateWritable.hashCode()) did hashCode based on daysSinceEpoc value, will be faster than doing toString() serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 866 (patched) <https://reviews.apache.org/r/66567/#comment282187> Faster to do hashcode based on the underlying values (totalMonths) rather than toString serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java Lines 869 (patched) <https://reviews.apache.org/r/66567/#comment282186> Faster to do hashcode based on the underlying values (totalSeconds/nanos) rather than toString - Jason Dere On April 12, 2018, 6:24 p.m., Deepak Jaiswal wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66567/ > --- > > (Updated April 12, 2018, 6:24 p.m.) > > > Review request for hive, Eugene Koifman, Jason Dere, and Matt McCline. > > > Bugs: HIVE
Re: Review Request 64511: HIVE-18252 Limit the size of the object inspector caches
/hive/serde2/objectinspector/primitive/WritableConstantIntObjectInspector.java 129b681795 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantLongObjectInspector.java 0452def8b4 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantShortObjectInspector.java 3343b1ffc4 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantStringObjectInspector.java ba3183bf82 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantTimestampLocalTZObjectInspector.java bf461c0255 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableConstantTimestampObjectInspector.java dc8fedfdd8 serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/WritableVoidObjectInspector.java cdd87018f6 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java 3736a1f8fc Diff: https://reviews.apache.org/r/64511/diff/2/ Changes: https://reviews.apache.org/r/64511/diff/1-2/ Testing --- Added Junit tests Thanks, Jason Dere
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 12, 2018, 8:09 p.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- - When removing invalid entries during lookup, make sure we have exited read lock section. - Add results_cache_transactional.q to testconfiguration.properties Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 ql/src/java/org/apache/hadoop/hive/ql/Driver.java a88453c978 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION 
ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/4/ Changes: https://reviews.apache.org/r/66368/diff/3-4/ Testing --- qtests added. Thanks, Jason Dere
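The invalidation check described above (save the tables' write IDs with the cache entry, compare them at lookup) can be sketched roughly as follows. The map-based stand-in for ValidTxnWriteIdList and the method name are illustrative assumptions, not Hive's actual API:

```java
import java.util.Map;

// Sketch: a cached result is stale if any table it read has newer writes
// than the write IDs recorded when the result was cached.
public class CacheValidity {
    // Hypothetical stand-in for ValidTxnWriteIdList:
    // table name -> high-water write ID observed at cache time / lookup time.
    public static boolean isEntryStillValid(Map<String, Long> savedWriteIds,
                                            Map<String, Long> currentWriteIds) {
        for (Map.Entry<String, Long> saved : savedWriteIds.entrySet()) {
            Long current = currentWriteIds.get(saved.getKey());
            // Table missing, or written to since the entry was cached -> stale.
            if (current == null || current > saved.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```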
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- (Updated April 11, 2018, 7:58 p.m.) Review request for hive, Eugene Koifman and Sergey Shelukhin. Changes --- Added comment to SessionState.getTxnMgr() about avoiding use of this call. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs (updated) - llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 3aec46be51 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f38b0bc546 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java e5af95b121 
ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6003ced27e ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java fe570f0f8e Diff: https://reviews.apache.org/r/66514/diff/3/ Changes: https://reviews.apache.org/r/66514/diff/2-3/ Testing --- Thanks, Jason Dere
Re: Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/ --- (Updated April 11, 2018, 6:01 p.m.) Review request for hive, Gopal V and Thejas Nair. Changes --- Using SessionState query timestamp as the cache entry's query time. Bugs: HIVE-19154 https://issues.apache.org/jira/browse/HIVE-19154 Repository: hive-git Description --- - Create NotificationEventPoll to periodically query for notification events, and pass the events to any registered EventConsumers. - Create InvalidationEventConsumer in QueryResultsCache to use the events to invalidate any results cache entries using the updated table. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 3cdad284ef ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_invalidation2.q PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation2.q.out PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 Diff: https://reviews.apache.org/r/66533/diff/2/ Changes: https://reviews.apache.org/r/66533/diff/1-2/ Testing --- Thanks, Jason Dere
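The NotificationEventPoll/EventConsumer design described above can be sketched as a poller that tracks the last-seen event ID and fans new events out to registered consumers. All names here are hypothetical simplifications, not the actual Hive classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch: periodically fetch events past the last-seen ID and deliver them
// to registered consumers (e.g. a results-cache invalidator).
public class EventPollSketch {
    public static final class Event {
        public final long id;
        public final String tableName;

        public Event(long id, String tableName) {
            this.id = id;
            this.tableName = tableName;
        }
    }

    private final List<Consumer<Event>> consumers = new ArrayList<>();
    private long lastEventId = 0;

    public void register(Consumer<Event> consumer) {
        consumers.add(consumer);
    }

    // One poll cycle; in Hive the events would come from the metastore's
    // notification log rather than being passed in.
    public void poll(List<Event> fetched) {
        for (Event e : fetched) {
            if (e.id > lastEventId) {       // skip events already delivered
                lastEventId = e.id;
                for (Consumer<Event> c : consumers) {
                    c.accept(e);
                }
            }
        }
    }
}
```

A cache-invalidating consumer would simply drop any entries that read the event's table.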
Re: Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
> On April 11, 2018, 5:26 a.m., Gopal V wrote: > > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java > > Lines 470 (patched) > > <https://reviews.apache.org/r/66533/diff/1/?file=1995232#file1995232line470> > > > > SessionState.get().getQueryCurrentTimestamp() > > > > Possibly pass it in via QueryInfo? Good suggestion, will make the change. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/#review200887 --- On April 10, 2018, 7:19 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66533/ > --- > > (Updated April 10, 2018, 7:19 p.m.) > > > Review request for hive, Gopal V and Thejas Nair. > > > Bugs: HIVE-19154 > https://issues.apache.org/jira/browse/HIVE-19154 > > > Repository: hive-git > > > Description > --- > > - Create NotificationEventPoll to periodically query for notification events, > and pass the events to any registered EventConsumers. > - Create InvalidationEventConsumer in QueryResultsCache to use the events to > invalidate any results cache entries using the updated table. 
> > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd > itests/src/test/resources/testconfiguration.properties 48d62a8bf9 > itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java > 3cdad284ef > ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java > b1a3646624 > ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java > PRE-CREATION > > ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java > PRE-CREATION > ql/src/test/queries/clientpositive/results_cache_invalidation2.q > PRE-CREATION > ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out > PRE-CREATION > ql/src/test/results/clientpositive/results_cache_invalidation2.q.out > PRE-CREATION > service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 > > > Diff: https://reviews.apache.org/r/66533/diff/1/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 11, 2018, 4:37 a.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- Rebase with master Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 ql/src/java/org/apache/hadoop/hive/ql/Driver.java a88453c978 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 10982ddbd1 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out 
PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/3/ Changes: https://reviews.apache.org/r/66368/diff/2-3/ Testing --- qtests added. Thanks, Jason Dere
Re: Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- (Updated April 11, 2018, 1:58 a.m.) Review request for hive, Eugene Koifman and Sergey Shelukhin. Changes --- Updating patch - missed a couple of uses of SessionState.getTxnMgr() from CalcitePlanner/MaterializedViewRebuildSemanticAnalyzer. Also adding a couple of fixes to fix TestAcidOnTez which also depend on the rest of this patch. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs (updated) - llap-ext-client/src/java/org/apache/hadoop/hive/llap/LlapBaseInputFormat.java 3aec46be51 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java bda2af3a04 ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7a7bdea89d ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java f38b0bc546 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e 
ql/src/java/org/apache/hadoop/hive/ql/parse/MaterializedViewRebuildSemanticAnalyzer.java e5af95b121 ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7f0010855b ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFGetSplits.java fe570f0f8e Diff: https://reviews.apache.org/r/66514/diff/2/ Changes: https://reviews.apache.org/r/66514/diff/1-2/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19156) TestMiniLlapLocalCliDriver.vectorized_dynamic_semijoin_reduction.q is broken
Jason Dere created HIVE-19156: - Summary: TestMiniLlapLocalCliDriver.vectorized_dynamic_semijoin_reduction.q is broken Key: HIVE-19156 URL: https://issues.apache.org/jira/browse/HIVE-19156 Project: Hive Issue Type: Bug Components: Tests Reporter: Jason Dere Assignee: Jason Dere Looks like this test has been broken for some time -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66533: HIVE-19154 Poll notification events to invalidate the results cache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66533/ --- Review request for hive, Gopal V and Thejas Nair. Bugs: HIVE-19154 https://issues.apache.org/jira/browse/HIVE-19154 Repository: hive-git Description --- - Create NotificationEventPoll to periodically query for notification events, and pass the events to any registered EventConsumers. - Create InvalidationEventConsumer in QueryResultsCache to use the events to invalidate any results cache entries using the updated table. Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e540d023bd itests/src/test/resources/testconfiguration.properties 48d62a8bf9 itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 3cdad284ef ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java b1a3646624 ql/src/java/org/apache/hadoop/hive/ql/metadata/events/EventConsumer.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/events/NotificationEventPoll.java PRE-CREATION ql/src/test/queries/clientpositive/results_cache_invalidation2.q PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_invalidation2.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation2.q.out PRE-CREATION service/src/java/org/apache/hive/service/server/HiveServer2.java 47f84b5e73 Diff: https://reviews.apache.org/r/66533/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19154) Poll notification events to invalidate the results cache
Jason Dere created HIVE-19154: - Summary: Poll notification events to invalidate the results cache Key: HIVE-19154 URL: https://issues.apache.org/jira/browse/HIVE-19154 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere Related to the work for HIVE-18609. HIVE-18609 will only invalidate entries in the cache if that query is looked up again, which could potentially leave a lot of undetected invalid entries taking up space in the cache, causing other entries to be evicted. To remove these entries in a more timely fashion, add a background thread that periodically checks the notification events for updates to the tables used in the results cache. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Review Request 66368: HIVE-18609: Results cache invalidation based on table updates
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66368/ --- (Updated April 10, 2018, 12:29 a.m.) Review request for hive, Gopal V and Jesús Camacho Rodríguez. Changes --- Rebase with master Bugs: HIVE-18609 https://issues.apache.org/jira/browse/HIVE-18609 Repository: hive-git Description --- - Save ValidTxnWriteIdList when saving query to the results cache. - Compare the write ID list for each transactional table during results cache lookup. - Add configuration to determine if queries with non-transactional tables should be cached. Diffs (updated) - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0627c35378 itests/src/test/resources/testconfiguration.properties 28c14ebc4c ql/src/java/org/apache/hadoop/hive/ql/Driver.java 79db006c74 ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java ac5ae573d6 ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 44a7496136 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b74abacf3 ql/src/test/queries/clientpositive/results_cache_1.q 4aea60e1e5 ql/src/test/queries/clientpositive/results_cache_2.q 96a90925f6 ql/src/test/queries/clientpositive/results_cache_capacity.q 9f54577009 ql/src/test/queries/clientpositive/results_cache_empty_result.q 621367141e ql/src/test/queries/clientpositive/results_cache_invalidation.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_lifetime.q 60ffe96a04 ql/src/test/queries/clientpositive/results_cache_quoted_identifiers.q 4802f43ba9 ql/src/test/queries/clientpositive/results_cache_temptable.q 9e0de765cb ql/src/test/queries/clientpositive/results_cache_transactional.q PRE-CREATION ql/src/test/queries/clientpositive/results_cache_with_masking.q b4fcdd57eb ql/src/test/results/clientpositive/llap/results_cache_invalidation.q.out PRE-CREATION ql/src/test/results/clientpositive/llap/results_cache_transactional.q.out PRE-CREATION ql/src/test/results/clientpositive/results_cache_invalidation.q.out 
PRE-CREATION ql/src/test/results/clientpositive/results_cache_transactional.q.out PRE-CREATION Diff: https://reviews.apache.org/r/66368/diff/2/ Changes: https://reviews.apache.org/r/66368/diff/1-2/ Testing --- qtests added. Thanks, Jason Dere
Re: Review Request 66516: HIVE-19138: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
> On April 9, 2018, 10:23 p.m., Gopal V wrote: > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > > Line 14642 (original), 14643 (patched) > > <https://reviews.apache.org/r/66516/diff/1/?file=1994429#file1994429line14643> > > > > Does the loop only exit if cacheEntry is non-null? The loop is a do .. while(false), which normally should exit after a single iteration. The loop should only continue to iterate in the event that cacheEntry.waitForValidStatus() returned false. - Jason --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66516/#review200772 --- On April 9, 2018, 9:53 p.m., Jason Dere wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66516/ > --- > > (Updated April 9, 2018, 9:53 p.m.) > > > Review request for hive, Deepak Jaiswal and Gopal V. > > > Repository: hive-git > > > Description > --- > > If the pending query fails, allow Hive to try to check the cache again in > case the cache has another cached/pending result that can be used to answer > the query. > > > Diffs > - > > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java > 3b74abacf3 > > > Diff: https://reviews.apache.org/r/66516/diff/1/ > > > Testing > --- > > > Thanks, > > Jason Dere > >
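The retry behavior discussed above, where a failed wait on a pending entry triggers another cache lookup, can be sketched as a loop that repeats only when waitForValidStatus() returns false. The interfaces here are hypothetical stand-ins, not Hive's real types:

```java
// Sketch: retry the cache lookup when the pending entry we waited on fails.
public class CacheRetrySketch {
    public interface Cache { Entry lookup(); }
    public interface Entry { boolean waitForValidStatus(); Object result(); }

    // Returns the entry's result, or null when no usable entry exists
    // (the caller would then fall back to full query compilation).
    public static Object lookupWithRetry(Cache cache) {
        Entry entry;
        boolean valid;
        do {
            entry = cache.lookup();
            if (entry == null) {
                return null;                 // cache miss
            }
            // false means the pending query backing this entry failed;
            // loop back and check the cache for another cached/pending entry.
            valid = entry.waitForValidStatus();
        } while (!valid);
        return entry.result();
    }
}
```

The loop body runs once on the happy path and only repeats on a failed wait, matching the behavior described in the reply.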
Review Request 66516: HIVE-19138: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66516/ --- Review request for hive, Deepak Jaiswal and Gopal V. Repository: hive-git Description --- If the pending query fails, allow Hive to try to check the cache again in case the cache has another cached/pending result that can be used to answer the query. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 3b74abacf3 Diff: https://reviews.apache.org/r/66516/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19138) Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails
Jason Dere created HIVE-19138: - Summary: Results cache: allow queries waiting on pending cache entries to check cache again if pending query fails Key: HIVE-19138 URL: https://issues.apache.org/jira/browse/HIVE-19138 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere HIVE-18846 allows the results cache to refer to currently executing queries so that another query can wait for these results to become ready in the results cache. If the pending query fails then Hive will automatically skip the cache and do the full query compilation. Make a fix here so that if the pending query fails, Hive will still try to check the cache again in case the cache has another cached/pending result that can be used to answer the query. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Review Request 66514: HIVE-17645 MM tables patch conflicts with HIVE-17482 (Spark/Acid integration)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66514/ --- Review request for hive, Eugene Koifman and Sergey Shelukhin. Repository: hive-git Description --- Replace usage of SessionState.getTxnMgr() from several places, by doing some refactoring to make the TxnManager available in fields passed in during construction/initialization: - SemanticAnalyzer.genFileSinkPlan() - ReplicationSemanticAnalyzer.analyzeReplLoad() - LoadSemanticAnalyzer.analyzeExternal() - ImportSemanticAnalyzer.prepareImport() - DDLSemanticAnalyzer.handleTransactionalTable() Diffs - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java fb1efe01dc ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java a8d851fd81 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/ReplLoadTask.java 6b333d7184 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadConstraint.java 60c85f58e5 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/LoadFunction.java bc7d0ad0b9 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadPartitions.java 06adc64727 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/table/LoadTable.java 1395027159 ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/load/util/Context.java bb51f36a25 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 9e66422904 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 8b639f7922 ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java e49089b91e ql/src/java/org/apache/hadoop/hive/ql/parse/ReplicationSemanticAnalyzer.java 79b2e48ee2 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ff0a2e6a1b ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/MessageHandler.java 3ccd639d62 ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/TableHandler.java 4cd75d8128 Diff: https://reviews.apache.org/r/66514/diff/1/ Testing --- Thanks, Jason Dere
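The refactoring described above replaces a session-global SessionState.getTxnMgr() lookup with a transaction manager supplied during construction/initialization. A minimal sketch of that pattern, with illustrative names rather than Hive's real classes:

```java
// Sketch: constructor injection in place of a static session-state lookup.
public class TxnInjectionSketch {
    public interface TxnManager { long openTxn(); }

    // Before the refactor, an analyzer would reach into session-global state
    // internally; after, the manager is a field set at construction time,
    // which decouples the analyzer from the session and eases testing.
    public static final class Analyzer {
        private final TxnManager txnMgr;

        public Analyzer(TxnManager txnMgr) {
            this.txnMgr = txnMgr;
        }

        public long analyze() {
            return txnMgr.openTxn();
        }
    }
}
```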
Review Request 66486: HIVE-19127 Concurrency fixes in QueryResultsCache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66486/ --- Review request for hive, Deepak Jaiswal and Gopal V. Bugs: HIVE-19127 https://issues.apache.org/jira/browse/HIVE-19127 Repository: hive-git Description --- - Take a lock on the cache entry when in the process of setting the cache entry from PENDING state to VALID state, so that other threads cannot invalidate the entry - The write lock on the cache was not being taken when removing an entry from the cache. - synchronize access when iterating through the lru keyset Diffs - ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java ac5ae573d6 Diff: https://reviews.apache.org/r/66486/diff/1/ Testing --- Thanks, Jason Dere
[jira] [Created] (HIVE-19127) Concurrency fixes in QueryResultsCache
Jason Dere created HIVE-19127: - Summary: Concurrency fixes in QueryResultsCache Key: HIVE-19127 URL: https://issues.apache.org/jira/browse/HIVE-19127 Project: Hive Issue Type: Sub-task Reporter: Jason Dere Assignee: Jason Dere A few fixes around concurrent access in the results cache - Take a lock on the cache entry when in the process of setting the cache entry from PENDING state to VALID state, so that other threads cannot invalidate the entry - The write lock on the cache was not being taken when removing an entry from the cache. - synchronize access when iterating through the lru keyset -- This message was sent by Atlassian JIRA (v7.6.3#76005)
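The three fixes listed above can be sketched in a toy cache: a per-entry lock guarding the PENDING-to-VALID transition, the cache write lock held during removal, and synchronized iteration over the LRU key set. All names are hypothetical simplifications of QueryResultsCache:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the three concurrency fixes, not Hive's actual implementation.
public class CacheConcurrencySketch {
    public enum Status { PENDING, VALID, INVALID }

    public static final class Entry {
        private Status status = Status.PENDING;

        // Fix 1: the entry's own lock guards PENDING -> VALID, so a
        // concurrent invalidation cannot race the transition.
        public synchronized boolean setValid() {
            if (status == Status.PENDING) { status = Status.VALID; return true; }
            return false;
        }
        public synchronized void invalidate() { status = Status.INVALID; }
        public synchronized Status status() { return status; }
    }

    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    // access-order LinkedHashMap as a minimal LRU stand-in
    private final Map<String, Entry> lru = new LinkedHashMap<>(16, 0.75f, true);

    public void put(String key, Entry e) {
        rwLock.writeLock().lock();
        try { synchronized (lru) { lru.put(key, e); } }
        finally { rwLock.writeLock().unlock(); }
    }

    // Fix 2: removal must hold the cache write lock.
    public void remove(String key) {
        rwLock.writeLock().lock();
        try {
            Entry e;
            synchronized (lru) { e = lru.remove(key); }
            if (e != null) { e.invalidate(); }
        } finally { rwLock.writeLock().unlock(); }
    }

    // Fix 3: iterating the LRU key set must be synchronized.
    public int size() {
        rwLock.readLock().lock();
        try {
            synchronized (lru) {
                int n = 0;
                for (String ignored : lru.keySet()) { n++; }
                return n;
            }
        } finally { rwLock.readLock().unlock(); }
    }
}
```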
Re: Review Request 66201: HIVE-19014 utilize YARN-8028 (queue ACL check) in Hive Tez session pool
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66201/#review200326 --- ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java Lines 117 (patched) <https://reviews.apache.org/r/66201/#comment281026> clean up whitespace. - Jason Dere On April 2, 2018, 10:13 p.m., Sergey Shelukhin wrote: > > --- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66201/ > --- > > (Updated April 2, 2018, 10:13 p.m.) > > > Review request for hive and Thejas Nair. > > > Repository: hive-git > > > Description > --- > > see jira > > > Diffs > - > > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 02367eb433 > ql/src/java/org/apache/hadoop/hive/ql/Driver.java ed3984efe8 > ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1de333e985 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java > a051f90195 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java a5f4cb7539 > ql/src/java/org/apache/hadoop/hive/ql/exec/tez/YarnQueueHelper.java > PRE-CREATION > ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java > ed1c0abdf2 > > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFLoggedInUser.java > 3ed793ec48 > service/src/java/org/apache/hive/service/server/HiveServer2.java 6308c5cd4f > > > Diff: https://reviews.apache.org/r/66201/diff/6/ > > > Testing > --- > > > Thanks, > > Sergey Shelukhin > >