[jira] [Created] (HIVE-22019) alter_table_update_status/alter_table_update_status_disable_bitvector/alter_partition_update_status fail when DbNotificationListener is installed

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22019:
-

 Summary: 
alter_table_update_status/alter_table_update_status_disable_bitvector/alter_partition_update_status
 fail when DbNotificationListener is installed
 Key: HIVE-22019
 URL: https://issues.apache.org/jira/browse/HIVE-22019
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


A statement like:
ALTER TABLE src_stat_n0 UPDATE STATISTICS for column key SET 
('numDVs'='','avgColLen'='1.111')
fails when DbNotificationListener is installed, with the message:
{code}
See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or 
check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ 
for specific test cases logs.
 org.apache.hadoop.hive.ql.metadata.HiveException: 
java.lang.IllegalArgumentException: Could not serialize 
JSONUpdateTableColumnStatMessage : 
 at 
org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:5350)
 at 
org.apache.hadoop.hive.ql.exec.ColumnStatsUpdateTask.persistColumnStats(ColumnStatsUpdateTask.java:339)
 at 
org.apache.hadoop.hive.ql.exec.ColumnStatsUpdateTask.execute(ColumnStatsUpdateTask.java:347)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2343)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1995)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1662)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1422)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1416)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:242)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:189)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:408)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:340)
 at 
org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:680)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:651)
 at 
org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:182)
 at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104)
 at 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver(TestCliDriver.java:59)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 at org.junit.runners.Suite.runChild(Suite.java:127)
 at org.junit.runners.Suite.runChild(Suite.java:26)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
 at 
org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
 at 
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
{code}
 

[jira] [Created] (HIVE-22018) Add table id to HMS get methods

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22018:
-

 Summary: Add table id to HMS get methods
 Key: HIVE-22018
 URL: https://issues.apache.org/jira/browse/HIVE-22018
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


It is possible to drop a table and immediately create another table with the 
same name. CachedStore may retrieve the wrong table in this case. We shall add 
a table id to every get_(table/partition) API so HMS can compare it with the 
one stored in TBLS (tableid is part of the Table object); if the ids do not 
match, HMS shall fail the read request. The initial table id can be retrieved 
along with the writeid (in the DbTxnManager.getValidWriteIds call, by joining 
the TBLS table).
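The check could be sketched roughly as follows. This is a minimal illustration, not HMS code: TableSnapshot, readTable, and passing the TBLS id as a plain parameter are all invented names for the idea described above.

```java
// Hypothetical sketch: a cached table is served only when its id matches the
// id currently recorded in TBLS, so a drop-and-recreate under the same name
// is detected instead of silently returning the wrong table.
final class TableIdCheck {
    static final class TableSnapshot {
        final long tableId;
        final String name;
        TableSnapshot(long tableId, String name) {
            this.tableId = tableId;
            this.name = name;
        }
    }

    // idInTbls stands in for the id fetched alongside the writeid
    // (e.g. during DbTxnManager.getValidWriteIds, joining the TBLS table).
    static TableSnapshot readTable(TableSnapshot cached, long idInTbls) {
        if (cached.tableId != idInTbls) {
            // ids differ: the table was dropped and recreated; fail the read
            throw new IllegalStateException("stale table id, failing read");
        }
        return cached;
    }
}
```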



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22017) HMS interface backward compatible after HIVE-21637

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22017:
-

 Summary: HMS interface backward compatible after HIVE-21637
 Key: HIVE-22017
 URL: https://issues.apache.org/jira/browse/HIVE-22017
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


HIVE-21637 changes a number of HMS interfaces to add a writeid to all get_xxx 
calls. Ideally we shall keep the original versions and forward them to the new 
APIs to make the change backward compatible. The downside is doubling the 
number of HMS methods. We shall mark the old ones deprecated and remove them 
in a future version.
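The compatibility pattern this describes might look like the sketch below; the method names and the string-typed writeid list are placeholders, not the real Thrift-generated HMS interface.

```java
// Keep the pre-HIVE-21637 signature as a deprecated thin wrapper that
// forwards to the new writeid-aware variant. All names are illustrative.
final class HmsCompatSketch {
    // new API: callers pass a valid-writeid list (null = no snapshot check)
    static String getTable(String db, String tbl, String validWriteIdList) {
        return db + "." + tbl + "@"
            + (validWriteIdList == null ? "latest" : validWriteIdList);
    }

    /** @deprecated old signature, kept for backward compatibility. */
    @Deprecated
    static String getTable(String db, String tbl) {
        return getTable(db, tbl, null); // forward to the new API
    }
}
```

The cost noted above is visible here: every affected method exists twice until the deprecated overloads are removed.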





[jira] [Created] (HIVE-22016) Do not open transaction for readonly query

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22016:
-

 Summary: Do not open transaction for readonly query
 Key: HIVE-22016
 URL: https://issues.apache.org/jira/browse/HIVE-22016
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


Opening/aborting/committing a transaction increments the transaction id, which 
is an unnecessary burden for read-only queries. In addition, it spams the 
notification log and makes it harder for CachedStore (and of course other 
components that rely on the notification log) to catch up.
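The guard this implies can be sketched as below; isReadOnly handling and openTxnIfNeeded are stand-in names, not the DbTxnManager API.

```java
// Allocate a transaction id only for queries that can write; read-only
// queries get no txn, so nothing is appended to the notification log.
final class TxnGuardSketch {
    private long nextTxnId = 1;

    long openTxnIfNeeded(boolean readOnly) {
        if (readOnly) {
            return 0; // sentinel: no transaction opened, no log entry
        }
        return nextTxnId++;
    }
}
```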





[jira] [Created] (HIVE-22015) Cache table constraints in CachedStore

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22015:
-

 Summary: Cache table constraints in CachedStore
 Key: HIVE-22015
 URL: https://issues.apache.org/jira/browse/HIVE-22015
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


Currently table constraints are not cached. Hive pulls all constraints for the 
tables involved in a query, which results in multiple DB reads (including 
get_primary_keys, get_foreign_keys, get_unique_constraints, etc). The effort 
to cache them is small, as constraints are just another table component.





[jira] [Created] (HIVE-22014) Tear down locks in CachedStore

2019-07-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-22014:
-

 Summary: Tear down locks in CachedStore
 Key: HIVE-22014
 URL: https://issues.apache.org/jira/browse/HIVE-22014
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai


There are a lot of locks in CachedStore. After HIVE-21637, only the 
notification log puller thread will update the cache, and when it processes an 
event, the first thing it does is mark the entry invalid. The only exception 
may be TableWrapperSizeUpdater, but we can also make it synchronous (maybe run 
it once after every iteration of the notification log puller). There should be 
no synchronization issue, so we can tear down the existing locks to simplify 
the code.
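The single-writer invalidate-then-update pattern described above can be sketched as follows; the cache shape and the INVALID sentinel are assumptions for illustration, not CachedStore internals.

```java
import java.util.concurrent.ConcurrentHashMap;

// The single notification-log puller marks an entry invalid before rewriting
// it, so readers never need a lock: on an invalid (or missing) entry they
// simply fall through to the backing store.
final class InvalidatingCacheSketch {
    private static final String INVALID = "\u0000INVALID";
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    // called only from the puller thread
    void applyEvent(String key, String newValue) {
        cache.put(key, INVALID);   // step 1: mark the entry invalid
        cache.put(key, newValue);  // step 2: install the fresh value
    }

    // readers: null means "go to the backing store"
    String read(String key) {
        String v = cache.get(key);
        return (v == null || INVALID.equals(v)) ? null : v;
    }
}
```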





[jira] [Created] (HIVE-21697) Remove periodical full refresh in HMS cache

2019-05-06 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21697:
-

 Summary: Remove periodical full refresh in HMS cache
 Key: HIVE-21697
 URL: https://issues.apache.org/jira/browse/HIVE-21697
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


In HIVE-18661, we added periodical notification-based refresh to the HMS 
cache. We shall remove the periodical full refresh to simplify the code, as it 
will no longer be used. In the meantime, we introduced a mechanism to provide 
monotonic reads through CachedStore.commitTransaction. This will no longer be 
needed after HIVE-21637, so I will remove the related code as well. This will 
provide some performance benefits, including:
1. We don't have to slow down writes to catch up with the notification logs. A 
write can be done immediately and tag the cache with writeids.
2. We can read from the cache even if updateUsingNotificationEvents is 
running. A read will compare the writeids of the cache, so monotonic reads 
will be guaranteed.

I'd like to put up a patch separately from HIVE-21637 so it can be tested 
independently. HMS will use periodical notification-based refresh to update 
the cache, and it will temporarily lift the monotonic-reads guarantee until 
HIVE-21637 is checked in.
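The writeid-based monotonic-read idea in point 2 reduces to a comparison like the one below; the field and method names are invented for illustration.

```java
// Serve a read from cache only when the cached entry is at least as new as
// the writeid the reader requires; otherwise fall back to the backing store.
final class WriteIdGateSketch {
    private final long cachedWriteIdHighWaterMark;

    WriteIdGateSketch(long hwm) {
        this.cachedWriteIdHighWaterMark = hwm;
    }

    boolean canServeFromCache(long requiredWriteId) {
        return cachedWriteIdHighWaterMark >= requiredWriteId;
    }
}
```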





[jira] [Created] (HIVE-21637) Synchronized metastore cache

2019-04-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21637:
-

 Summary: Synchronized metastore cache
 Key: HIVE-21637
 URL: https://issues.apache.org/jira/browse/HIVE-21637
 Project: Hive
  Issue Type: New Feature
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently, HMS has a cache implemented by CachedStore. The cache is 
asynchronized and in HMS HA setting, we can only get eventual consistency. In 
this Jira, we try to make it synchronized.





[jira] [Created] (HIVE-21625) Fix TxnIdUtils.checkEquivalentWriteIds, also provides a comparison method

2019-04-17 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21625:
-

 Summary: Fix TxnIdUtils.checkEquivalentWriteIds, also provides a 
comparison method
 Key: HIVE-21625
 URL: https://issues.apache.org/jira/browse/HIVE-21625
 Project: Hive
  Issue Type: Bug
 Environment: TxnIdUtils.checkEquivalentWriteIds has a bug which considers 
({1,2,3,4}, 6) and ({1,2,3,4,5,6}, 8) equivalent (the notation is (invalid 
list, hwm)). Here is a patch to fix it; it also provides a comparison method 
to check which state is newer.
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-21625.1.patch
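The counterexample above can be checked by expanding both states into their valid-writeid sets. This re-derivation is illustrative only and is not the TxnIdUtils code.

```java
import java.util.HashSet;
import java.util.Set;

// Under the (invalid list, hwm) notation, a writeid w is valid iff
// w <= hwm and w is not in the invalid list. Two states are equivalent
// exactly when their valid sets coincide.
final class WriteIdSets {
    static Set<Long> validIds(long[] invalid, long hwm) {
        Set<Long> bad = new HashSet<>();
        for (long w : invalid) bad.add(w);
        Set<Long> valid = new HashSet<>();
        for (long w = 1; w <= hwm; w++) {
            if (!bad.contains(w)) valid.add(w);
        }
        return valid;
    }

    static boolean equivalent(long[] inv1, long hwm1, long[] inv2, long hwm2) {
        return validIds(inv1, hwm1).equals(validIds(inv2, hwm2));
    }
}
```

({1,2,3,4}, 6) expands to the valid set {5,6}, while ({1,2,3,4,5,6}, 8) expands to {7,8}, so the two states are clearly not equivalent.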







[jira] [Created] (HIVE-21583) KillTriggerActionHandler should use "hive" credential

2019-04-04 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21583:
-

 Summary: KillTriggerActionHandler should use "hive" credential
 Key: HIVE-21583
 URL: https://issues.apache.org/jira/browse/HIVE-21583
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently SessionState.username is set to null, which is invalid, as 
KillQueryImplementation will validate the user privilege.





[jira] [Created] (HIVE-21479) NPE during metastore cache update

2019-03-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21479:
-

 Summary: NPE during metastore cache update
 Key: HIVE-21479
 URL: https://issues.apache.org/jira/browse/HIVE-21479
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Saw the following stack during a long periodical update:
{code}
2019-03-12T10:01:43,015 ERROR [CachedStore-CacheUpdateService: Thread-36] 
cache.CachedStore: Update failure:java.lang.NullPointerException
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.updateTableColStats(CachedStore.java:508)
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.update(CachedStore.java:461)
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:396)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The reason is that we get the table list at a very early stage and then 
refresh the tables one by one. A table may well be removed in the interim. We 
need to handle this case during the cache update.
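The fix amounts to tolerating a null lookup for each table in the stale snapshot. In this sketch a Map stands in for the backing store; the names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Refresh each table from a previously captured name list, skipping any
// table that was dropped in the interim (lookup returns null) instead of
// dereferencing it and hitting an NPE.
final class SafeRefreshSketch {
    static List<String> refresh(List<String> snapshotNames, Map<String, String> store) {
        List<String> refreshed = new ArrayList<>();
        for (String name : snapshotNames) {
            String table = store.get(name); // may be null if dropped meanwhile
            if (table == null) {
                continue; // tolerate the race: skip this table
            }
            refreshed.add(table);
        }
        return refreshed;
    }
}
```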





[jira] [Created] (HIVE-21478) Metastore cache update shall capture exception

2019-03-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21478:
-

 Summary: Metastore cache update shall capture exception
 Key: HIVE-21478
 URL: https://issues.apache.org/jira/browse/HIVE-21478
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-21478.1.patch

We definitely need to capture any exception during 
CacheUpdateMasterWork.update(); otherwise the ScheduledExecutorService would 
refuse to schedule future update() runs.
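This matches the documented behavior of ScheduledExecutorService.scheduleAtFixedRate: once a run ends in an exception, subsequent executions are suppressed. A minimal sketch of the guarded body (the real refresh work is elided):

```java
// Wrap the periodic body so no Throwable escapes to the executor; an
// uncaught exception would otherwise cancel all future scheduled runs.
final class GuardedUpdateSketch {
    int attempts = 0;

    void runOnce(Runnable work) {
        attempts++;
        try {
            work.run(); // stands in for CacheUpdateMasterWork.update()
        } catch (Throwable t) {
            // log and swallow so the next tick is still scheduled
        }
    }
}
```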





[jira] [Created] (HIVE-21389) Hive distribution miss javax.ws.rs-api.jar after HIVE-21247

2019-03-04 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21389:
-

 Summary: Hive distribution miss javax.ws.rs-api.jar after 
HIVE-21247
 Key: HIVE-21389
 URL: https://issues.apache.org/jira/browse/HIVE-21389
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-21379) Mask password in DDL commands for table properties

2019-03-04 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21379:
-

 Summary: Mask password in DDL commands for table properties
 Key: HIVE-21379
 URL: https://issues.apache.org/jira/browse/HIVE-21379
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-21379.1.patch

We need to mask password-related table properties (such as 
hive.sql.dbcp.password) in DDL output, such as describe extended/describe 
formatted/show create table/show tblproperties.
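A sketch of the masking pass: the substring match on "password" is an assumed rule for illustration, not Hive's actual list of masked keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Replace the value of any sensitive-looking table property before it is
// rendered in DDL output; non-sensitive properties pass through unchanged.
final class PropMaskerSketch {
    static Map<String, String> mask(Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            boolean sensitive = e.getKey().toLowerCase().contains("password");
            out.put(e.getKey(), sensitive ? "###" : e.getValue());
        }
        return out;
    }
}
```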





[jira] [Created] (HIVE-21296) Dropping varchar partition throw exception

2019-02-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21296:
-

 Summary: Dropping varchar partition throw exception
 Key: HIVE-21296
 URL: https://issues.apache.org/jira/browse/HIVE-21296
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Dropping a partition fails if the partition column is varchar. For example:
{code:java}
create external table BS_TAB_0_211494(c_date_SAD_29630 date) PARTITIONED BY 
(part_varchar_37229 varchar(56)) STORED AS orc;

INSERT INTO BS_TAB_0_211494 values('4740-04-04','BrNTRsv3c');

ALTER TABLE BS_TAB_0_211494 DROP PARTITION 
(part_varchar_37229='BrNTRsv3c');{code}

Exception:
{code}
2019-02-19T22:12:55,843  WARN [HiveServer2-Handler-Pool: Thread-42] 
thrift.ThriftCLIService: Error executing statement: 
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: 
FAILED: SemanticException [Error 10006]: Partition not found 
(part_varchar_37229 = 'BrNTRsv3c')
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:356)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:206)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:269)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:268) 
~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:576)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_202]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_202]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_202]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_202]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_202]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source) ~[?:?]
at 
org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:568)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_202]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_202]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Partition not 
found (part_varchar_37229 = 'BrNTRsv3c')
at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.addTableDropPartsOutputs(DDLSemanticAnalyzer.java:4110)
{code}
 

[jira] [Created] (HIVE-21295) StorageHandler shall convert date to string using Hive convention

2019-02-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21295:
-

 Summary: StorageHandler shall convert date to string using Hive 
convention
 Key: HIVE-21295
 URL: https://issues.apache.org/jira/browse/HIVE-21295
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-21295.1.patch

If we have a date datatype in mysql and a string datatype defined in hive, 
JdbcStorageHandler will translate the date to a string with the format 
yyyy-MM-dd HH:mm:ss. However, the Hive convention is yyyy-MM-dd, so we shall 
follow the Hive convention. Eg:

mysql: CREATE TABLE test ("datekey" DATE);
hive: CREATE TABLE test (datekey string) STORED BY 
'org.apache.hive.storage.jdbc.JdbcStorageHandler' TBLPROPERTIES 
(.."hive.sql.table" = "test"..);

Then in hive, do: select datekey from test;

We get: 1999-03-24 00:00:00

But should be: 1999-03-24
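The two renderings differ only in the format pattern. This standalone illustration uses java.text.SimpleDateFormat; it does not show the actual JdbcStorageHandler conversion path.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Render the same Date two ways: the timestamp-style string described above,
// and the yyyy-MM-dd form that matches Hive's date convention.
final class DateRenderSketch {
    static String timestampStyle(Date d) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(d);
    }

    static String hiveDateStyle(Date d) {
        return new SimpleDateFormat("yyyy-MM-dd").format(d);
    }
}
```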





[jira] [Created] (HIVE-21255) Remove QueryConditionBuilder in JdbcStorageHandler

2019-02-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21255:
-

 Summary: Remove QueryConditionBuilder in JdbcStorageHandler
 Key: HIVE-21255
 URL: https://issues.apache.org/jira/browse/HIVE-21255
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai


QueryConditionBuilder is not correctly implemented. The following exception 
always appears even when the query finishes successfully:
{code}
2019-02-13 01:09:53,406 [ERROR] [TezChild] |jdbc.QueryConditionBuilder|: Error 
during condition build
java.lang.ArrayIndexOutOfBoundsException: 0
at java.beans.XMLDecoder.readObject(XMLDecoder.java:250)
at 
org.apache.hive.storage.jdbc.QueryConditionBuilder.createConditionString(QueryConditionBuilder.java:125)
at 
org.apache.hive.storage.jdbc.QueryConditionBuilder.buildCondition(QueryConditionBuilder.java:74)
at 
org.apache.hive.storage.jdbc.conf.JdbcStorageConfigManager.getQueryToExecute(JdbcStorageConfigManager.java:155)
at 
org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getRecordIterator(GenericJdbcDatabaseAccessor.java:158)
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:58)
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
We don't actually need QueryConditionBuilder when CBO is enabled, since 
predicate push-down is handled by Calcite (HIVE-20822). One can argue that 
when CBO is disabled we might still need it, since Calcite will not do the 
push-down, but that's a minor code path and removing QueryConditionBuilder 
won't cause any correctness issue. So I'd like to remove it for simplicity.





[jira] [Created] (HIVE-21253) Support DB2 in JDBC StorageHandler

2019-02-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21253:
-

 Summary: Support DB2 in JDBC StorageHandler
 Key: HIVE-21253
 URL: https://issues.apache.org/jira/browse/HIVE-21253
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


Make DB2 a first-class member of JdbcStorageHandler. It can even work before 
the patch by using POSTGRES as the DB type and adding the db2 jdbc jar 
manually; this patch makes it a standard feature. Note this is only for DB2 
tables as external JdbcStorageHandler tables. We haven't tested DB2 as a 
metastore backend and that's not a goal for this ticket.





[jira] [Created] (HIVE-21249) Reduce memory footprint in ObjectStore.refreshPrivileges

2019-02-11 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21249:
-

 Summary: Reduce memory footprint in ObjectStore.refreshPrivileges  
 Key: HIVE-21249
 URL: https://issues.apache.org/jira/browse/HIVE-21249
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


We found there could be many records in TBL_COL_PRIVS for a single table (a 
table granted to many users), resulting in an OOM in 
ObjectStore.listTableAllColumnGrants. We shall reduce the memory footprint of 
ObjectStore.refreshPrivileges. Here is the stack of the OOM:
{code}
org.datanucleus.api.jdo.JDOPersistenceManager.retrieveAll(JDOPersistenceManager.java:690)
org.datanucleus.api.jdo.JDOPersistenceManager.retrieveAll(JDOPersistenceManager.java:710)
org.apache.hadoop.hive.metastore.ObjectStore.listTableAllColumnGrants(ObjectStore.java:6629)
org.apache.hadoop.hive.metastore.ObjectStore.refreshPrivileges(ObjectStore.java:6200)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
com.sun.proxy.$Proxy32.refreshPrivileges(, line not available)
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.refresh_privileges(HiveMetaStore.java:6507)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
com.sun.proxy.$Proxy34.refresh_privileges(, line not available)
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$refresh_privileges.getResult(ThriftHiveMetastore.java:17608)
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$refresh_privileges.getResult(ThriftHiveMetastore.java:17592)
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
java.security.AccessController.doPrivileged(Native method)
javax.security.auth.Subject.doAs(Subject.java:422)
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748)
{code}





[jira] [Created] (HIVE-21248) WebHCat returns HTTP error code 500 rather than 429 when submitting large number of jobs in stress tests

2019-02-11 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21248:
-

 Summary: WebHCat returns HTTP error code 500 rather than 429 when 
submitting large number of jobs in stress tests
 Key: HIVE-21248
 URL: https://issues.apache.org/jira/browse/HIVE-21248
 Project: Hive
  Issue Type: Bug
  Components: WebHCat
Reporter: Daniel Dai
Assignee: Daniel Dai


Saw the following exception in webhcat.log:
{code}
java.lang.NoSuchMethodError: 
javax.ws.rs.core.Response$Status$Family.familyOf(I)Ljavax/ws/rs/core/Response$Status$Family;
at 
org.glassfish.jersey.message.internal.Statuses$StatusImpl.(Statuses.java:63)
 ~[jersey-common-2.25.1.jar:?]
at 
org.glassfish.jersey.message.internal.Statuses$StatusImpl.(Statuses.java:54)
 ~[jersey-common-2.25.1.jar:?]
at 
org.glassfish.jersey.message.internal.Statuses.from(Statuses.java:132) 
~[jersey-common-2.25.1.jar:?]
at 
org.glassfish.jersey.message.internal.OutboundJaxrsResponse$Builder.status(OutboundJaxrsResponse.java:414)
 ~[jersey-common-2.25.1.jar:?]
at javax.ws.rs.core.Response.status(Response.java:128) 
~[jsr311-api-1.1.1.jar:?]
at 
org.apache.hive.hcatalog.templeton.SimpleWebException.buildMessage(SimpleWebException.java:67)
 ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
at 
org.apache.hive.hcatalog.templeton.SimpleWebException.getResponse(SimpleWebException.java:51)
 ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
at 
org.apache.hive.hcatalog.templeton.SimpleExceptionMapper.toResponse(SimpleExceptionMapper.java:33)
 ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
at 
org.apache.hive.hcatalog.templeton.SimpleExceptionMapper.toResponse(SimpleExceptionMapper.java:29)
 ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
at 
com.sun.jersey.spi.container.ContainerResponse.mapException(ContainerResponse.java:480)
 ~[jersey-server-1.19.jar:1.19]
at 
com.sun.jersey.spi.container.ContainerResponse.mapMappableContainerException(ContainerResponse.java:417)
 ~[jersey-server-1.19.jar:1.19]
at 
com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1477)
 ~[jersey-server-1.19.jar:1.19]
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1419)
 ~[jersey-server-1.19.jar:1.19]
at 
com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1409)
 ~[jersey-server-1.19.jar:1.19]
at 
com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:409)
 ~[jersey-servlet-1.19.jar:1.19]
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:558)
 ~[jersey-servlet-1.19.jar:1.19]
at 
com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:733)
 ~[jersey-servlet-1.19.jar:1.19]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) 
~[javax.servlet-api-3.1.0.jar:3.1.0]
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) 
~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.apache.hive.hcatalog.templeton.Main$XFrameOptionsFilter.doFilter(Main.java:299)
 ~[hive-webhcat-3.1.0.3.0.2.0-50.jar:3.1.0.3.0.2.0-50]
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
 ~[hadoop-auth-3.1.1.3.0.2.0-50.jar:?]
at 
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
 ~[hadoop-auth-3.1.1.3.0.2.0-50.jar:?]
at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:90) 
~[hadoop-hdfs-3.1.1.3.0.2.0-50.jar:?]
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) 
[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
 [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) 
[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
 [jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
{code}

[jira] [Created] (HIVE-21247) Webhcat beeline in secure mode

2019-02-11 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21247:
-

 Summary: Webhcat beeline in secure mode
 Key: HIVE-21247
 URL: https://issues.apache.org/jira/browse/HIVE-21247
 Project: Hive
  Issue Type: Improvement
  Components: WebHCat
Reporter: Daniel Dai
Assignee: Daniel Dai


Following up on HIVE-20550, we need to make beeline work in secure mode. That
means we need to obtain a delegation token from HiveServer2 and pass it to
beeline. Similar to HIVE-5133, this makes two changes:
1. Make a JDBC connection to HS2, pull the delegation token from
HiveConnection, and pass it along.
2. In the Hive JDBC driver, check for a token file at
HADOOP_TOKEN_FILE_LOCATION, and extract the delegation token if it exists.
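A minimal sketch of the step-2 check above, using only the JDK. The class and method names are illustrative, not Hive's actual driver code; the real driver would go on to parse Hadoop's token file format, which is omitted here.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper: the JDBC driver looks for a token file at
// HADOOP_TOKEN_FILE_LOCATION and only attempts delegation-token auth
// when the file actually exists.
public class TokenFileLocator {
    static final String ENV_KEY = "HADOOP_TOKEN_FILE_LOCATION";

    // Returns the token file path if the env var is set and points at an
    // existing regular file; otherwise null (caller falls back to normal auth).
    public static Path locateTokenFile(Map<String, String> env) {
        String location = env.get(ENV_KEY);
        if (location == null || location.isEmpty()) {
            return null;
        }
        Path p = Paths.get(location);
        return Files.isRegularFile(p) ? p : null;
    }

    public static void main(String[] args) {
        Path p = locateTokenFile(System.getenv());
        System.out.println(p == null ? "no token file" : "token file: " + p);
    }
}
```

Taking the environment as a parameter keeps the lookup testable without mutating process state.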



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21013) JdbcStorageHandler fail to find partition column in Oracle

2018-12-05 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-21013:
-

 Summary: JdbcStorageHandler fail to find partition column in Oracle
 Key: HIVE-21013
 URL: https://issues.apache.org/jira/browse/HIVE-21013
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Stack:
{code}
ERROR : Vertex failed, vertexName=Map 1, 
vertexId=vertex_1543830849610_0048_1_00, diagnostics=[Task failed, 
taskId=task_1543830849610_0048_1_00_05, diagnostics=[TaskAttempt 0 failed, 
info=[Error: Error while running task ( failure ) : 
attempt_1543830849610_0048_1_00_05_0:java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
java.io.IOException: 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught 
exception while trying to execute query:Cannot find salaries in sql query 
salaries 
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.io.IOException: java.io.IOException: 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught 
exception while trying to execute query:Cannot find salaries in sql query 
salaries 
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: java.io.IOException: java.io.IOException: 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught 
exception while trying to execute query:Cannot find salaries in sql query 
salaries 
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
... 18 more
Caused by: java.io.IOException: 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught 
exception while trying to execute query:Cannot find salaries in sql query 
salaries 
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:85)
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
... 24 more
{code}

[jira] [Created] (HIVE-20978) "hive.jdbc.*" should add to sqlStdAuthSafeVarNameRegexes

2018-11-27 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20978:
-

 Summary: "hive.jdbc.*" should add to sqlStdAuthSafeVarNameRegexes  
 Key: HIVE-20978
 URL: https://issues.apache.org/jira/browse/HIVE-20978
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-20978.1.patch

Users should be able to change hive.jdbc.* settings, including
"hive.jdbc.pushdown.enable".





[jira] [Created] (HIVE-20944) Not validate stats during query compilation

2018-11-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20944:
-

 Summary: Not validate stats during query compilation 
 Key: HIVE-20944
 URL: https://issues.apache.org/jira/browse/HIVE-20944
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


In a discussion with [~ashutoshc], we found that query planning currently only
uses valid stats: if the stats are outdated, Hive does not get any stats at
all. Hive should use whatever stats it can find in the metastore; they do not
need to be up to date during query planning.





[jira] [Created] (HIVE-20937) Postgres jdbc query fail with "LIMIT must not be negative"

2018-11-16 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20937:
-

 Summary: Postgres jdbc query fail with "LIMIT must not be negative"
 Key: HIVE-20937
 URL: https://issues.apache.org/jira/browse/HIVE-20937
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-20937.1.patch

PostgresDatabaseAccessor does not handle limit=-1. This likely affects
Oracle/MSSQL as well.
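A minimal sketch of the fix idea, not Hive's actual accessor code: treat a negative limit as "no limit" instead of emitting "LIMIT -1", which Postgres rejects. Method and class names here are illustrative.

```java
// Hypothetical helper mirroring the limit handling in a database accessor.
public class LimitClause {
    // Appends a LIMIT/OFFSET suffix only when the values are meaningful;
    // a negative limit means "unbounded", so the query is left untouched.
    public static String addLimit(String sql, int limit, int offset) {
        if (limit < 0) {
            return sql;
        }
        if (offset > 0) {
            return sql + " LIMIT " + limit + " OFFSET " + offset;
        }
        return sql + " LIMIT " + limit;
    }

    public static void main(String[] args) {
        System.out.println(addLimit("SELECT * FROM t", -1, 0));  // SELECT * FROM t
        System.out.println(addLimit("SELECT * FROM t", 10, 20)); // SELECT * FROM t LIMIT 10 OFFSET 20
    }
}
```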





[jira] [Created] (HIVE-20921) Oracle backed DbLockManager fail when drop/truncate acid table with large partitions

2018-11-14 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20921:
-

 Summary: Oracle backed DbLockManager fail when drop/truncate acid 
table with large partitions
 Key: HIVE-20921
 URL: https://issues.apache.org/jira/browse/HIVE-20921
 Project: Hive
  Issue Type: Bug
  Components: Locking
Reporter: Daniel Dai
Assignee: Daniel Dai


Stack:
{code}
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Error in acquiring locks: Error communicating with the metastore 
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:324)
 
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:199)
 
at 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
 
at 
org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:422) 
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
at 
org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745) 
Caused by: org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating 
with the metastore 
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:177) 
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:357)
 
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocksWithHeartbeatDelay(DbTxnManager.java:373)
 
at 
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.acquireLocks(DbTxnManager.java:182)
 
at org.apache.hadoop.hive.ql.Driver.acquireLocksAndOpenTxn(Driver.java:1082) 
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1284) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) 
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) 
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
 
... 11 more 
Caused by: MetaException(message:How did we get here, we heartbeated our lock 
before we started! ( lockid:466073 intLockId:701 txnid:0 db:v5x2442 
table:tbstcnf_load_stg_step 
partition:src_system_cd=MAXIMO/src_hostname_cd=PRD1310/src_table_name=LABTRANS 
state:WAITING type:EXCLUSIVE)) 
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2642) 
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:1187) 
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.check_lock(HiveMetaStore.java:6161)
 
at sun.reflect.GeneratedMethodAccessor135.invoke(Unknown Source) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:497) 
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
 
at com.sun.proxy.$Proxy14.check_lock(Unknown Source) 
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.checkLock(HiveMetaStoreClient.java:1984)
 
at sun.reflect.GeneratedMethodAccessor134.invoke(Unknown Source) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:497) 
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
 
at com.sun.proxy.$Proxy15.checkLock(Unknown Source) 
at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.lock(DbLockManager.java:114) 
{code}





[jira] [Created] (HIVE-20896) CachedStore fail to cache stats in multiple code paths

2018-11-08 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20896:
-

 Summary: CachedStore fail to cache stats in multiple code paths
 Key: HIVE-20896
 URL: https://issues.apache.org/jira/browse/HIVE-20896
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


A bunch of issues were discovered in CachedStore's maintenance of column
statistics:
1. The criterion for partitioned vs. non-partitioned tables is wrong
(table.isSetPartitionKeys() is always true).
2. In update(), partition column stats are removed when populating table basic
stats.
3. Dirty flags are true right after prewarm(), so the first update() does
nothing.
4. cacheLock can be invoked without holding the lock, which results in a
freeze in update().





[jira] [Created] (HIVE-20830) JdbcStorageHandler range query assertion failure in some cases

2018-10-29 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20830:
-

 Summary: JdbcStorageHandler range query assertion failure in some 
cases
 Key: HIVE-20830
 URL: https://issues.apache.org/jira/browse/HIVE-20830
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


{code}
2018-10-29T10:10:16,325 ERROR [b4bf5eb2-a986-4aae-908e-93b9908acd32 
HiveServer2-HttpHandler-Pool: Thread-124]: dao.GenericJdbcDatabaseAccessor 
(:()) - Caught exception while trying to execute query
java.lang.IllegalArgumentException: null
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:108) 
~[guava-19.0.jar:?]
at 
org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.addBoundaryToQuery(GenericJdbcDatabaseAccessor.java:238)
 ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99]
at 
org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getRecordIterator(GenericJdbcDatabaseAccessor.java:161)
 ~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99]
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:58) 
~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99]
at 
org.apache.hive.storage.jdbc.JdbcRecordReader.next(JdbcRecordReader.java:35) 
~[hive-jdbc-handler-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-99]
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:569) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2734) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:469)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:910)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:564) 
~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:790)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
 ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
 ~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at org.apache.thrift.server.TServlet.doPost(TServlet.java:83) 
~[hive-exec-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:208)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) 
~[javax.servlet-api-3.1.0.jar:3.1.0]
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) 
~[javax.servlet-api-3.1.0.jar:3.1.0]
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) 
~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:584) 
~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:224)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) 
~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
 ~[jetty-runner-9.3.20.v20170531.jar:9.3.20.v20170531]
{code}

[jira] [Created] (HIVE-20829) JdbcStorageHandler range split throws NPE

2018-10-29 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20829:
-

 Summary: JdbcStorageHandler range split throws NPE
 Key: HIVE-20829
 URL: https://issues.apache.org/jira/browse/HIVE-20829
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


{code}
2018-10-29T06:37:14,982 ERROR [HiveServer2-Background-Pool: Thread-44466]: 
operation.Operation (:()) - Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: 
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, 
vertexId=vertex_1540588928441_0121_2_00, diagnostics=[Vertex 
vertex_1540588928441_0121_2_00 [Map 1] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: employees initializer failed, 
vertex=vertex_1540588928441_0121_2_00 [Map 1], java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:272)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
at 
org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1540588928441_0121_2_01, 
diagnostics=[Vertex received Kill in INITED state., Vertex 
vertex_1540588928441_0121_2_01 [Reducer 2] killed/failed due 
to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. 
failedVertices:1 killedVertices:1
at 
org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:228)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:318)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_161]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_161]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 ~[hadoop-common-3.1.1.3.0.3.0-150.jar:?]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:338)
 ~[hive-service-3.1.0.3.0.3.0-150.jar:3.1.0.3.0.3.0-150]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_161]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_161]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_161]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_161]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, 
vertexName=Map 1, vertexId=vertex_1540588928441_0121_2_00, diagnostics=[Vertex 
vertex_1540588928441_0121_2_00 [Map 1] killed/failed due 
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: employees initializer failed, 
vertex=vertex_1540588928441_0121_2_00 [Map 1], java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:272)
{code}

[jira] [Created] (HIVE-20815) JdbcRecordReader.next shall not eat exception

2018-10-25 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20815:
-

 Summary: JdbcRecordReader.next shall not eat exception
 Key: HIVE-20815
 URL: https://issues.apache.org/jira/browse/HIVE-20815
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20732) conf.HiveConf: HiveConf of name hive.metastore.cached.rawstore.cached.object.whitelist does not exist

2018-10-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20732:
-

 Summary: conf.HiveConf: HiveConf of name 
hive.metastore.cached.rawstore.cached.object.whitelist does not exist
 Key: HIVE-20732
 URL: https://issues.apache.org/jira/browse/HIVE-20732
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Vaibhav Gumashta


[~ndembla] saw this message in the HS2 log. MetastoreConf properties should
also be added to HiveConf.





[jira] [Created] (HIVE-20731) keystore file in JdbcStorageHandler should be authorized

2018-10-11 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20731:
-

 Summary: keystore file in JdbcStorageHandler should be authorized
 Key: HIVE-20731
 URL: https://issues.apache.org/jira/browse/HIVE-20731
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


The keystore file introduced in HIVE-20651 should be authorized with the
configured authorizer. Otherwise, any user who knows the keystore file
location can access the password.





[jira] [Created] (HIVE-20720) Add partition column option to JDBC handler

2018-10-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20720:
-

 Summary: Add partition column option to JDBC handler
 Key: HIVE-20720
 URL: https://issues.apache.org/jira/browse/HIVE-20720
 Project: Hive
  Issue Type: New Feature
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently JdbcStorageHandler does not split input in Tez. The reason is that
the numSplit argument of JdbcInputFormat.getSplits can only be passed via
"mapreduce.job.maps" in Tez, and "mapreduce.job.maps" is not a valid parameter
if Ranger is in use. Users end up always using 1 split.

We need this new feature if we want to support multiple splits. Here is my
proposal:
1. Specify partitionColumn/numPartitions, and optionally lowerBound/upperBound,
in tblproperties if the user wants to split the JDBC data source. If
lowerBound/upperBound is not specified, JdbcStorageHandler will run a max/min
query to get them in the planner. For simplicity, we can initially limit
partitionColumn to numeric/date/timestamp columns.
2. If partitionColumn/numPartitions are not specified, don't split the input.
3. Splits are equal intervals, without regard to data distribution.
4. There is also a "hive.sql.query.split" flag that vetoes the split (it can
be set manually or automatically by Calcite).
5. If partitionColumn is not defined but numPartitions is defined, use the
original limit/offset logic (but don't rely on numSplit).
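Item 3 of the proposal can be sketched as plain interval arithmetic. This is an illustrative implementation, not Hive's actual split generator; the class name and the choice to give the remainder to the last split are assumptions.

```java
// Hypothetical sketch: carve [lowerBound, upperBound) into numPartitions
// equal intervals, ignoring data distribution entirely.
public class IntervalSplitter {
    // Returns {lo, hi} pairs; the last interval absorbs any remainder
    // when the span does not divide evenly.
    public static long[][] split(long lowerBound, long upperBound, int numPartitions) {
        long step = (upperBound - lowerBound) / numPartitions;
        long[][] splits = new long[numPartitions][2];
        for (int i = 0; i < numPartitions; i++) {
            splits[i][0] = lowerBound + i * step;
            splits[i][1] = (i == numPartitions - 1) ? upperBound
                                                    : lowerBound + (i + 1) * step;
        }
        return splits;
    }

    public static void main(String[] args) {
        for (long[] s : split(0, 103, 4)) {
            System.out.println(s[0] + " .. " + s[1]);
        }
    }
}
```

Each split's bounds would then be substituted into the pushed-down query as a WHERE-clause range on the partition column.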





[jira] [Created] (HIVE-20675) Log pollution from PrivilegeSynchronizer if zk is not configured

2018-10-02 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20675:
-

 Summary: Log pollution from PrivilegeSynchronizer if zk is not 
configured
 Key: HIVE-20675
 URL: https://issues.apache.org/jira/browse/HIVE-20675
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Daniel Dai
Assignee: Daniel Dai


We should stop PrivilegeSynchronizer if "hive.zookeeper.quorum" is not
configured. Note that "hive.privilege.synchronizer" is on by default.

{code}
2018-10-02T16:04:12,488  WARN [main-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing 
socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) 
~[?:1.8.0_91]
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) 
~[?:1.8.0_91]
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
 ~[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081) 
~[zookeeper-3.4.6.jar:3.4.6-1569965]
{code}
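The guard described above could look like the following sketch. This is not Hive's actual HiveServer2 code; the helper class and the Map-based configuration are assumptions, though the two property names mirror the ones mentioned in the description.

```java
import java.util.Map;

// Hypothetical guard: only start the privilege synchronizer when it is
// enabled (the default) AND a ZooKeeper quorum is actually configured,
// avoiding the reconnect-loop log pollution shown above.
public class PrivilegeSyncGuard {
    public static boolean shouldStartSynchronizer(Map<String, String> conf) {
        boolean enabled = Boolean.parseBoolean(
            conf.getOrDefault("hive.privilege.synchronizer", "true"));
        String quorum = conf.getOrDefault("hive.zookeeper.quorum", "");
        return enabled && !quorum.trim().isEmpty();
    }
}
```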





[jira] [Created] (HIVE-20674) TestJdbcWithMiniLlapArrow.testKillQuery fail frequently

2018-10-02 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20674:
-

 Summary: TestJdbcWithMiniLlapArrow.testKillQuery fail frequently
 Key: HIVE-20674
 URL: https://issues.apache.org/jira/browse/HIVE-20674
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20658) "show tables" should show view as well

2018-09-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20658:
-

 Summary: "show tables" should show view as well
 Key: HIVE-20658
 URL: https://issues.apache.org/jira/browse/HIVE-20658
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


"show tables" changed behavior in HIVE-19408 to show only real tables (no
views). This breaks backward compatibility, and we should restore the default
behavior.





[jira] [Created] (HIVE-20653) Schema change in HIVE-19166 should also go to hive-schema-4.0.0.hive.sql

2018-09-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20653:
-

 Summary: Schema change in HIVE-19166 should also go to 
hive-schema-4.0.0.hive.sql
 Key: HIVE-20653
 URL: https://issues.apache.org/jira/browse/HIVE-20653
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20652) JdbcStorageHandler push join of two different datasource to jdbc driver

2018-09-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20652:
-

 Summary: JdbcStorageHandler push join of two different datasource 
to jdbc driver
 Key: HIVE-20652
 URL: https://issues.apache.org/jira/browse/HIVE-20652
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Daniel Dai
 Attachments: external_jdbc_table2.q

Test case attached. The following query fails:
{code}
SELECT * FROM ext_auth1 JOIN ext_auth2 ON ext_auth1.ikey = ext_auth2.ikey
{code}
Error message:
{code}
2018-09-28T00:36:23,860 DEBUG [17b954d9-3250-45a9-995e-1b3f8277a681 main] 
dao.GenericJdbcDatabaseAccessor: Query to execute is [SELECT *
FROM (SELECT *
FROM "SIMPLE_DERBY_TABLE1"
WHERE "ikey" IS NOT NULL) AS "t"
INNER JOIN (SELECT *
FROM "SIMPLE_DERBY_TABLE2"
WHERE "ikey" IS NOT NULL) AS "t0" ON "t"."ikey" = "t0"."ikey" {LIMIT 1}]
2018-09-28T00:36:23,864 ERROR [17b954d9-3250-45a9-995e-1b3f8277a681 main] 
dao.GenericJdbcDatabaseAccessor: Error while trying to get column names.
java.sql.SQLSyntaxErrorException: Table/View 'SIMPLE_DERBY_TABLE2' does not 
exist.
at 
org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source) 
~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedPreparedStatement.<init>(Unknown
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedPreparedStatement42.<init>(Unknown
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.jdbc.Driver42.newEmbedPreparedStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at org.apache.derby.impl.jdbc.EmbedConnection.prepareStatement(Unknown 
Source) ~[derby-10.14.1.0.jar:?]
at 
org.apache.commons.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:281)
 ~[commons-dbcp-1.4.jar:1.4]
at 
org.apache.commons.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
 ~[commons-dbcp-1.4.jar:1.4]
at 
org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:74)
 [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:78) 
[hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) 
[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:540) 
[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:90)
 [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:77)
 [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:295)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:277) 
[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genTablePlan(SemanticAnalyzer.java:11100)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11468)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11427)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:525)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12319)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:289)
 [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:669) 
{code}
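The root cause is that the planner pushes the join across two distinct JDBC connections into a single generated query, so one database is asked for the other's table. A minimal sketch of the missing guard (hypothetical names; the real check belongs in the Calcite/JdbcStorageHandler planning code):

```java
public class JoinPushdownGuard {
    // Sketch of the missing check: a join may be pushed down to the external
    // JDBC source only if both sides come from the same connection; otherwise
    // the generated query references a table the other database does not have.
    static boolean canPushJoin(String leftJdbcUrl, String rightJdbcUrl) {
        return leftJdbcUrl != null && leftJdbcUrl.equals(rightJdbcUrl);
    }

    public static void main(String[] args) {
        // Same datasource: push down. Different datasources: join in Hive.
        System.out.println(canPushJoin("jdbc:derby:db1", "jdbc:derby:db1"));
        System.out.println(canPushJoin("jdbc:derby:db1", "jdbc:derby:db2"));
    }
}
```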

[jira] [Created] (HIVE-20651) JdbcStorageHandler password should be encrypted

2018-09-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20651:
-

 Summary: JdbcStorageHandler password should be encrypted
 Key: HIVE-20651
 URL: https://issues.apache.org/jira/browse/HIVE-20651
 Project: Hive
  Issue Type: Improvement
  Components: StorageHandler
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently, an external JDBC table with JdbcStorageHandler stores the password in 
clear text in the "hive.sql.dbcp.password" table property. We should put it in a 
keystore file instead. Here is the proposed change:
{code:java}
….
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
"hive.sql.dbcp.password.keystore" = 
"hdfs:///user/hive/credential/postgres.jceks",
"hive.sql.dbcp.password.key" = "mydb.password"
);
{code}
 
The jceks file is created with:
{code}
hadoop credential create mydb.password -provider 
hdfs:///user/hive/credential/postgres.jceks -v secretpassword
{code}

Users can choose to put all database passwords in one jceks file, or use a 
separate jceks file for each database.
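Under the hood, a .jceks file is a standard Java JCEKS keystore holding each password as a secret-key entry under its alias, which is roughly what the `hadoop credential` command creates. A self-contained sketch of the store-and-read round trip (using an in-memory stream as a stand-in for the HDFS file):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyStore;
import javax.crypto.spec.SecretKeySpec;

public class JceksRoundTrip {
    // Store a secret under an alias in a JCEKS keystore, then read it back.
    static String roundTrip(String alias, String secret, char[] storePass) throws Exception {
        KeyStore ks = KeyStore.getInstance("JCEKS");
        ks.load(null, storePass); // create an empty keystore
        KeyStore.SecretKeyEntry entry = new KeyStore.SecretKeyEntry(
            new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "AES"));
        ks.setEntry(alias, entry, new KeyStore.PasswordProtection(storePass));

        // Serialize and reload, standing in for writing the .jceks file out.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ks.store(out, storePass);
        KeyStore reloaded = KeyStore.getInstance("JCEKS");
        reloaded.load(new ByteArrayInputStream(out.toByteArray()), storePass);
        KeyStore.SecretKeyEntry read = (KeyStore.SecretKeyEntry)
            reloaded.getEntry(alias, new KeyStore.PasswordProtection(storePass));
        return new String(read.getSecretKey().getEncoded(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("mydb.password", "secretpassword", "none".toCharArray()));
    }
}
```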



--
This message was sent by Atlassian JIRA (v7.6.3#76005)


[jira] [Created] (HIVE-20550) Switch WebHCat to use beeline to submit Hive queries

2018-09-13 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20550:
-

 Summary: Switch WebHCat to use beeline to submit Hive queries
 Key: HIVE-20550
 URL: https://issues.apache.org/jira/browse/HIVE-20550
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Since the Hive CLI is deprecated, we should switch WebHCat to use Beeline instead.





[jira] [Created] (HIVE-20549) Allow user to set query tag, and kill query by tag

2018-09-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20549:
-

 Summary: Allow user to set query tag, and kill query by tag
 Key: HIVE-20549
 URL: https://issues.apache.org/jira/browse/HIVE-20549
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


HIVE-19924 added the capability for a replication job to set a query tag and 
kill the replication distcp job by that tag. Here I make it more general: a user 
can set an arbitrary "hive.query.tag" in a SQL script and kill the query by that 
tag. Hive will cancel the corresponding operation in HS2, along with any Tez/MR 
application launched for the query. For example:
{code}
set hive.query.tag=mytag;
select . -- long running query
{code}

In another session:
{code}
kill query 'mytag';
{code}

There are limitations in the implementation:
1. No tag duplication check. Nothing prevents conflicting tags for the same 
user, and kill query will kill all queries that share the tag. However, kill 
query will not kill queries from a different user unless issued by an admin, so 
different users may share the same tag.
2. In a multi-HS2 environment, the kill statement should be issued to every HS2 
instance to make sure the corresponding operation is canceled. When beeline/jdbc 
connects to HS2 the regular way (via a ZooKeeper URL), the session connects to a 
random HS2 instance, which might be different from the one the query runs on. 
Users can use HiveConnection.getAllUrls or beeline --getUrlsFromBeelineSite 
(HIVE-20507) to get a list of all HS2 instances.
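A sketch of how a tag-to-operation registry inside a single HS2 instance could support kill-by-tag (all names are hypothetical; the real implementation tracks operation handles and also kills the Tez/MR application for the query):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class QueryTagRegistry {
    // Hypothetical sketch: HS2 keeps a map from tag to the cancel actions of
    // operations running under that tag; "kill query 'tag'" runs them all.
    private final Map<String, List<Runnable>> cancelActionsByTag = new HashMap<>();

    void register(String tag, Runnable cancelAction) {
        cancelActionsByTag.computeIfAbsent(tag, t -> new ArrayList<>()).add(cancelAction);
    }

    int killByTag(String tag) {
        List<Runnable> actions = cancelActionsByTag.remove(tag);
        if (actions == null) return 0;
        actions.forEach(Runnable::run); // cancel HS2 operation + backend app
        return actions.size();
    }

    public static void main(String[] args) {
        QueryTagRegistry reg = new QueryTagRegistry();
        int[] cancelled = {0};
        reg.register("mytag", () -> cancelled[0]++);
        reg.register("mytag", () -> cancelled[0]++); // two queries share the tag
        System.out.println(reg.killByTag("mytag") + " " + cancelled[0]);
    }
}
```

Note how both queries sharing the tag are killed, matching limitation 1 above.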





[jira] [Created] (HIVE-20494) GenericUDFRestrictInformationSchema is broken after HIVE-19440

2018-08-31 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20494:
-

 Summary: GenericUDFRestrictInformationSchema is broken after 
HIVE-19440
 Key: HIVE-20494
 URL: https://issues.apache.org/jira/browse/HIVE-20494
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai








[jira] [Created] (HIVE-20444) Parameter is not properly quoted in DbNotificationListener.addWriteNotificationLog

2018-08-22 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20444:
-

 Summary: Parameter is not properly quoted in 
DbNotificationListener.addWriteNotificationLog
 Key: HIVE-20444
 URL: https://issues.apache.org/jira/browse/HIVE-20444
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


See exception:
{code}
2018-08-22T04:44:22,758 INFO  [pool-8-thread-190]: 
listener.DbNotificationListener 
(DbNotificationListener.java:addWriteNotificationLog(765)) - Going to execute 
insert 
2018-08-22T04:44:22,773 ERROR [pool-8-thread-190]: metastore.RetryingHMSHandler 
(RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to 
add write notification log org.postgresql.util.PSQLException: ERROR: syntax 
error at or near "UTC"
  Position: 1032
at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2284)
at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2003)
at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:200)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:424)
at 
org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:321)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:313)
at com.zaxxer.hikari.pool.ProxyStatement.execute(ProxyStatement.java:92)
at 
com.zaxxer.hikari.pool.HikariProxyStatement.execute(HikariProxyStatement.java)
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.addWriteNotificationLog(DbNotificationListener.java:766)
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.onAcidWrite(DbNotificationListener.java:657)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$12(MetaStoreListenerNotifier.java:249)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEventWithDirectSql(MetaStoreListenerNotifier.java:305)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.addWriteNotificationLog(TxnHandler.java:1617)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.addTxnWriteNotificationLog(HiveMetaStore.java:7563)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_write_notification_log(HiveMetaStore.java:7589)
at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at com.sun.proxy.$Proxy34.add_write_notification_log(Unknown Source)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_write_notification_log.getResult(ThriftHiveMetastore.java:19071)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_write_notification_log.getResult(ThriftHiveMetastore.java:19056)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.onAcidWrite(DbNotificationListener.java:659)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.lambda$static$12(MetaStoreListenerNotifier.java:249)
at 
org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEventWithDirectSql(MetaStoreListenerNotifier.java:305)
at 
org.apache.hadoop.hive.metastore.txn.TxnHandler.addWriteNotificationLog(TxnHandler.java:1617)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.addTxnWriteNotificationLog(HiveMetaStore.java:7563)
{code}
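The failure suggests a string value (here an event time ending in "UTC") is concatenated into the INSERT statement without quoting. Binding the value through a JDBC PreparedStatement parameter is the robust fix; where literal SQL must be assembled, the value at least needs to be quoted and escaped. A minimal helper, illustrative rather than the actual patch:

```java
public class SqlQuote {
    // Wrap a value in single quotes and double any embedded quotes, so a
    // string such as a "... UTC" timestamp cannot be parsed as bare SQL tokens.
    static String quoteString(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String ts = "2018-08-22 04:44:22 UTC";
        System.out.println("INSERT INTO \"TXN_WRITE_NOTIFICATION_LOG\" (\"WNL_EVENT_TIME\") VALUES ("
            + quoteString(ts) + ")");
    }
}
```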

[jira] [Created] (HIVE-20424) schematool shall not pollute beeline history

2018-08-20 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20424:
-

 Summary: schematool shall not pollute beeline history
 Key: HIVE-20424
 URL: https://issues.apache.org/jira/browse/HIVE-20424
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20420) Provide a fallback authorizer when no other authorizer is in use

2018-08-17 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20420:
-

 Summary: Provide a fallback authorizer when no other authorizer is 
in use
 Key: HIVE-20420
 URL: https://issues.apache.org/jira/browse/HIVE-20420
 Project: Hive
  Issue Type: New Feature
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20413) "cannot insert NULL" for TXN_WRITE_NOTIFICATION_LOG in Oracle

2018-08-17 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20413:
-

 Summary: "cannot insert NULL" for TXN_WRITE_NOTIFICATION_LOG in 
Oracle
 Key: HIVE-20413
 URL: https://issues.apache.org/jira/browse/HIVE-20413
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20389) NPE in SessionStateUserAuthenticator when authenticator=SessionStateUserAuthenticator

2018-08-14 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20389:
-

 Summary: NPE in SessionStateUserAuthenticator when 
authenticator=SessionStateUserAuthenticator
 Key: HIVE-20389
 URL: https://issues.apache.org/jira/browse/HIVE-20389
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai


Introduced in HIVE-20118, get the following stack in schematool:
{code}
Caused by: java.lang.IllegalArgumentException: Null user
at 
org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1221)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1208)
 ~[hadoop-common-3.1.0.jar:?]
at 
org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator.getGroupNames(SessionStateUserAuthenticator.java:44)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.session.SessionState.getGroupsFromAuthenticator(SessionState.java:1288)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFCurrentGroups.initialize(GenericUDFCurrentGroups.java:53)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:260)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:1215)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1516)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lib.ExpressionWalker.walk(ExpressionWalker.java:76) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:241)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:187)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:12752)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12707)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:12675)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3469)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3449)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:10549)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11526)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:11396)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:12160)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:628)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12250)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:356)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
{code}

[jira] [Created] (HIVE-20357) Introduce initOrUpgradeSchema option to schema tool

2018-08-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20357:
-

 Summary: Introduce initOrUpgradeSchema option to schema tool
 Key: HIVE-20357
 URL: https://issues.apache.org/jira/browse/HIVE-20357
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently, schematool has two options: initSchema and upgradeSchema. The user 
needs a different command line for each action. However, from the schema version 
stored in the database, we should be able to figure out whether an init or an 
upgrade is needed, and choose the right action automatically.
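The automatic choice can be sketched as a simple decision on the stored schema version (method and return values are hypothetical):

```java
public class SchemaAction {
    // Sketch of the decision initOrUpgradeSchema could make automatically.
    // A null stored version means the metastore schema was never initialized.
    static String decide(String storedVersion, String targetVersion) {
        if (storedVersion == null) return "initSchema";
        if (storedVersion.equals(targetVersion)) return "nothing to do";
        return "upgradeSchema";
    }

    public static void main(String[] args) {
        System.out.println(decide(null, "3.1.0"));     // fresh database
        System.out.println(decide("2.3.0", "3.1.0"));  // older schema present
        System.out.println(decide("3.1.0", "3.1.0"));  // already current
    }
}
```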





[jira] [Created] (HIVE-20355) Clean up parameter of HiveConnection.setSchema

2018-08-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20355:
-

 Summary: Clean up parameter of HiveConnection.setSchema
 Key: HIVE-20355
 URL: https://issues.apache.org/jira/browse/HIVE-20355
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Daniel Dai
Assignee: Daniel Dai


This is not immediately exploitable, as HS2 only allows one statement at a time. 
But in the future we may support multiple statements in HiveStatement, so it is 
better to clean up the database parameter to avoid potential SQL injection.
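One way to clean the parameter up is to validate it as a plain identifier before interpolating it into the `use` statement (a sketch under that assumption, not the actual HiveConnection code):

```java
import java.util.regex.Pattern;

public class SchemaNameCheck {
    // Sketch: accept only plain identifiers before building "use <schema>";
    // anything containing separators or quotes is rejected outright.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z0-9_]+");

    static String useStatement(String schema) {
        if (!IDENTIFIER.matcher(schema).matches()) {
            throw new IllegalArgumentException("invalid schema name: " + schema);
        }
        return "use " + schema;
    }

    public static void main(String[] args) {
        System.out.println(useStatement("default"));
        try {
            useStatement("default; drop table t"); // injection attempt
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```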





[jira] [Created] (HIVE-20344) PrivilegeSynchronizer for SBA might hit AccessControlException

2018-08-08 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20344:
-

 Summary: PrivilegeSynchronizer for SBA might hit 
AccessControlException
 Key: HIVE-20344
 URL: https://issues.apache.org/jira/browse/HIVE-20344
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai


If "hive" user does not have privilege of corresponding hdfs folders, 
PrivilegeSynchronizer won't be able to get metadata of the table because SBA is 
preventing it. Here is a sample stack:
{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
java.security.AccessControlException: Permission denied: user=hive, 
access=EXECUTE, inode="/tmp/sba_is/sba_db":hrt_7:hrt_qa:dr
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:315)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:242)
at 
org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkDefaultEnforcer(RangerHdfsAuthorizer.java:512)
at 
org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer$RangerAccessControlEnforcer.checkPermission(RangerHdfsAuthorizer.java:305)
at 
org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1850)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1834)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1784)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:7767)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkAccess(NameNodeRpcServer.java:2217)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.checkAccess(ClientNamenodeProtocolServerSideTranslatorPB.java:1659)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)

at 
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:424)
at 
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.checkPermissions(StorageBasedAuthorizationProvider.java:382)
at 
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:355)
at 
org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider.authorize(StorageBasedAuthorizationProvider.java:203)
at 
org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener.authorizeReadTable(AuthorizationPreEventListener.java:192)
... 23 more
{code}
I simply skip the table when that happens. In practice, managed tables are owned 
by the "hive" user, so only external tables are impacted. Users need to grant 
execute permission on the database folder and read permission on the table 
folders to the "hive" user if they want to query the information schema for 
tables whose permissions are granted only via SBA.
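The skip-on-denied behavior can be sketched as follows (hypothetical names; the real code catches the authorization exception raised through the metastore pre-event listener):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class PrivilegeSync {
    // Sketch: tables whose HDFS permissions reject the "hive" user are logged
    // and skipped, and the synchronizer carries on with the remaining tables.
    static List<String> syncAll(List<String> tables, Set<String> deniedTables) {
        List<String> synced = new ArrayList<>();
        for (String table : tables) {
            try {
                if (deniedTables.contains(table)) { // stand-in for SBA throwing
                    throw new SecurityException("Permission denied: " + table);
                }
                synced.add(table);
            } catch (SecurityException e) {
                System.out.println("skipping " + table + ": " + e.getMessage());
            }
        }
        return synced;
    }

    public static void main(String[] args) {
        List<String> ok = syncAll(Arrays.asList("t1", "sba_db.ext", "t2"),
                                  Collections.singleton("sba_db.ext"));
        System.out.println(ok);
    }
}
```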





[jira] [Created] (HIVE-20130) Better logging for information schema synchronizer

2018-07-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20130:
-

 Summary: Better logging for information schema synchronizer
 Key: HIVE-20130
 URL: https://issues.apache.org/jira/browse/HIVE-20130
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-20130.1.patch

The logging of the information schema synchronizer should be more informative.





[jira] [Created] (HIVE-20118) SessionStateUserAuthenticator.getGroupNames() is always empty

2018-07-07 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20118:
-

 Summary: SessionStateUserAuthenticator.getGroupNames() is always 
empty
 Key: HIVE-20118
 URL: https://issues.apache.org/jira/browse/HIVE-20118
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-20002) Shipping jdbc-storage-handler dependency jars in LLAP

2018-06-26 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-20002:
-

 Summary: Shipping jdbc-storage-handler dependency jars in LLAP
 Key: HIVE-20002
 URL: https://issues.apache.org/jira/browse/HIVE-20002
 Project: Hive
  Issue Type: Bug
  Components: llap
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-20002.1.patch

Ship the following jars to LLAP to make the JDBC storage handler work: 
commons-dbcp, commons-pool, and whichever DB-specific JDBC jar exists in the classpath.





[jira] [Created] (HIVE-19938) Upgrade scripts for information schema

2018-06-18 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19938:
-

 Summary: Upgrade scripts for information schema
 Key: HIVE-19938
 URL: https://issues.apache.org/jira/browse/HIVE-19938
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


To make schematool -upgradeSchema work for information schema.





[jira] [Created] (HIVE-19920) Schematool fails in embedded mode when auth is on

2018-06-15 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19920:
-

 Summary: Schematool fails in embedded mode when auth is on
 Key: HIVE-19920
 URL: https://issues.apache.org/jira/browse/HIVE-19920
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


This is a follow-up to HIVE-19775. We need to override more properties in 
embedded HS2.





[jira] [Created] (HIVE-19913) OWNER_TYPE is missing in some metastore upgrade script

2018-06-15 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19913:
-

 Summary: OWNER_TYPE is missing in some metastore upgrade script
 Key: HIVE-19913
 URL: https://issues.apache.org/jira/browse/HIVE-19913
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


OWNER_TYPE, introduced in HIVE-19372, is missing from 
upgrade-2.3.0-to-3.0.0.*.sql for all databases except Derby.





[jira] [Created] (HIVE-19872) hive-schema-3.1.0.hive.sql is missing on master and branch-3

2018-06-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19872:
-

 Summary: hive-schema-3.1.0.hive.sql is missing on master and 
branch-3
 Key: HIVE-19872
 URL: https://issues.apache.org/jira/browse/HIVE-19872
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Information schema initialization will fail with "Unknown version specified for 
initialization: 3.1.0".





[jira] [Created] (HIVE-19862) Postgres init script has a glitch around UNIQUE_DATABASE

2018-06-11 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19862:
-

 Summary: Postgres init script has a glitch around UNIQUE_DATABASE
 Key: HIVE-19862
 URL: https://issues.apache.org/jira/browse/HIVE-19862
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


{code}
ALTER TABLE ONLY "DBS" ADD CONSTRAINT "UNIQUE_DATABASE" UNIQUE ("NAME");
{code}
The unique constraint should also include "CTLG_NAME".





[jira] [Created] (HIVE-19825) HiveServer2 leader selection shall use different zookeeper znode

2018-06-07 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19825:
-

 Summary: HiveServer2 leader selection shall use different 
zookeeper znode
 Key: HIVE-19825
 URL: https://issues.apache.org/jira/browse/HIVE-19825
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently, HiveServer2 leader selection (used only by PrivilegeSynchronizer for 
now) reuses the /hiveserver2 parent znode, which is already used for HiveServer2 
service discovery. This interferes with service discovery. I'd like to switch to 
a different znode, /hiveserver2-leader.





[jira] [Created] (HIVE-19813) SessionState.start doesn't have to be synchronized

2018-06-06 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19813:
-

 Summary: SessionState.start doesn't have to be synchronized
 Key: HIVE-19813
 URL: https://issues.apache.org/jira/browse/HIVE-19813
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


This was introduced in HIVE-14690. However, only the check-and-set block needs 
to be synchronized, not the whole method. The method starts the Tez AM, which is 
a long operation; making the whole method synchronized serializes session starts 
and thus slows down HS2.
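The narrower locking can be sketched as a claim-then-initialize pattern: hold the lock only to check state and claim the start, run the long initialization unlocked, then publish the result under the lock again. This is an illustrative single-session sketch, not the actual SessionState code:

```java
public class SessionStarter {
    private final Object lock = new Object();
    private Object session;
    private boolean starting;
    static int initCount = 0;

    // Only the check-and-claim step holds the lock; the long-running
    // initialization (e.g. starting a Tez AM) runs without it, so concurrent
    // callers are not serialized behind one slow start.
    Object start() throws InterruptedException {
        synchronized (lock) {
            while (starting) lock.wait();      // someone else is initializing
            if (session != null) return session;
            starting = true;                   // claim the right to initialize
        }
        Object s = expensiveInit();            // long operation, no lock held
        synchronized (lock) {
            session = s;
            starting = false;
            lock.notifyAll();
        }
        return s;
    }

    private Object expensiveInit() {
        initCount++;
        return new Object();
    }

    public static void main(String[] args) throws Exception {
        SessionStarter st = new SessionStarter();
        Object a = st.start();
        Object b = st.start(); // second call reuses the initialized session
        System.out.println((a == b) + " " + initCount);
    }
}
```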





[jira] [Created] (HIVE-19810) StorageHandler fail to ship jars in Tez intermittently

2018-06-05 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19810:
-

 Summary: StorageHandler fail to ship jars in Tez intermittently
 Key: HIVE-19810
 URL: https://issues.apache.org/jira/browse/HIVE-19810
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Daniel Dai
Assignee: Daniel Dai


Hive relies on the StorageHandler to ship jars to the backend automatically in 
several cases: JdbcStorageHandler, HBaseStorageHandler, AccumuloStorageHandler. 
This does not work reliably; in particular, the first DAG in the session will 
have those jars but the second will not, unless a container is reused. In the 
latter case, the containers allocated to the first DAG are reused in the second 
DAG, so the container still has the additional resources.





[jira] [Created] (HIVE-19737) Missing update schema version in 3.1 db scripts

2018-05-29 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19737:
-

 Summary: Missing update schema version in 3.1 db scripts
 Key: HIVE-19737
 URL: https://issues.apache.org/jira/browse/HIVE-19737
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-19737.1.patch

I missed several places where the schema version string should be updated in 
standalone-metastore/src/main/sql/xxx/hive-schema-3.1.0.xxx.sql when creating 
those scripts.





[jira] [Created] (HIVE-19440) Make StorageBasedAuthorizer work with information schema

2018-05-07 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19440:
-

 Summary: Make StorageBasedAuthorizer work with information schema
 Key: HIVE-19440
 URL: https://issues.apache.org/jira/browse/HIVE-19440
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai
Assignee: Daniel Dai


With HIVE-19161, the Hive information schema works with external authorizers 
(such as Ranger). However, we also need to make StorageBasedAuthorizer 
synchronization work, as it is also widely used.





[jira] [Created] (HIVE-19381) Function replication in cloud fails when downloading resources from AWS

2018-05-01 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19381:
-

 Summary: Function replication in cloud fails when downloading 
resources from AWS
 Key: HIVE-19381
 URL: https://issues.apache.org/jira/browse/HIVE-19381
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai
 Fix For: 3.0.0, 3.1.0


Another case where replication should use the config in the WITH clause.





[jira] [Created] (HIVE-19331) Repl load config in "with" clause not passed to Context.getStagingDir

2018-04-26 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19331:
-

 Summary: Repl load config in "with" clause not passed to 
Context.getStagingDir
 Key: HIVE-19331
 URL: https://issues.apache.org/jira/browse/HIVE-19331
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


Another failure similar to HIVE-18626, causing an exception when S3 credentials 
are in the "REPL LOAD" WITH clause.

{code}
Caused by: java.lang.IllegalStateException: Error getting FileSystem for 
s3a://nat-yc-r7-nmys-beacon-cloud-s3-2/hive_incremental_testing.db/hive_incremental_testing_new_tabl...:
 org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on 
nat-yc-r7-nmys-beacon-cloud-s3-2: com.amazonaws.AmazonClientException: No AWS 
Credentials provided by BasicAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider SharedInstanceProfileCredentialsProvider 
: com.amazonaws.AmazonClientException: Unable to load credentials from Amazon 
EC2 metadata service: No AWS Credentials provided by 
BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
SharedInstanceProfileCredentialsProvider : com.amazonaws.AmazonClientException: 
Unable to load credentials from Amazon EC2 metadata service
at org.apache.hadoop.hive.ql.Context.getStagingDir(Context.java:359)
at 
org.apache.hadoop.hive.ql.Context.getExternalScratchDir(Context.java:487)
at 
org.apache.hadoop.hive.ql.Context.getExternalTmpPath(Context.java:565)
at 
org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.loadTable(ImportSemanticAnalyzer.java:370)
at 
org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.createReplImportTasks(ImportSemanticAnalyzer.java:926)
at 
org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer.prepareImport(ImportSemanticAnalyzer.java:329)
at 
org.apache.hadoop.hive.ql.parse.repl.load.message.TableHandler.handle(TableHandler.java:43)
... 24 more
{code}
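
For illustration, a command of this shape triggers the failure. The bucket name and the exact config keys below are placeholders, not taken from the actual run; fs.s3a.access.key/fs.s3a.secret.key are the standard Hadoop S3A credential settings:

{code}
REPL LOAD hive_incremental_testing FROM 's3a://some-bucket/repl_dump'
WITH ('fs.s3a.access.key'='...', 'fs.s3a.secret.key'='...');
{code}

The point of the fix is that these WITH-clause credentials must also reach Context.getStagingDir.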





[jira] [Created] (HIVE-19251) ObjectStore.getNextNotification with LIMIT should use less memory

2018-04-19 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19251:
-

 Summary: ObjectStore.getNextNotification with LIMIT should use 
less memory
 Key: HIVE-19251
 URL: https://issues.apache.org/jira/browse/HIVE-19251
 Project: Hive
  Issue Type: Bug
  Components: repl, Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


We experienced an OOM when the Hive metastore tried to retrieve a huge number of 
notification logs, even though there was a LIMIT clause. Hive should retrieve only 
the necessary rows.
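
As a sketch of the intended behavior, the backing query should push the limit down to the database instead of materializing every row. The table and column names below follow the metastore schema, but the exact query is illustrative, not the actual ObjectStore code:

{code}
SELECT * FROM "NOTIFICATION_LOG"
WHERE "EVENT_ID" > :lastEventId
ORDER BY "EVENT_ID"
LIMIT :maxEvents;
{code}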





[jira] [Created] (HIVE-19161) Add authorizations to information schema

2018-04-10 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19161:
-

 Summary: Add authorizations to information schema
 Key: HIVE-19161
 URL: https://issues.apache.org/jira/browse/HIVE-19161
 Project: Hive
  Issue Type: Sub-task
Reporter: Daniel Dai
Assignee: Daniel Dai


We need to control access to the information schema so that users can only query 
the information they are authorized to see.





[jira] [Created] (HIVE-19065) Metastore client compatibility check should include syncMetaStoreClient

2018-03-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19065:
-

 Summary: Metastore client compatibility check should include 
syncMetaStoreClient
 Key: HIVE-19065
 URL: https://issues.apache.org/jira/browse/HIVE-19065
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-19065.1.patch

I saw a case where Hive.get(HiveConf c) reuses syncMetaStoreClient with a different 
config (in my case, hive.metastore.uris was different), which makes 
syncMetaStoreClient connect to the wrong metastore server.





[jira] [Created] (HIVE-19054) Function replication shall use "hive.repl.replica.functions.root.dir" as root

2018-03-26 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-19054:
-

 Summary: Function replication shall use 
"hive.repl.replica.functions.root.dir" as root
 Key: HIVE-19054
 URL: https://issues.apache.org/jira/browse/HIVE-19054
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-19054.1.patch

Function replication wrongly uses fs.defaultFS as the root and ignores the 
"hive.repl.replica.functions.root.dir" setting, thus preventing replication to a 
cloud destination.





[jira] [Created] (HIVE-18879) Disallow embedded element in UDFXPathUtil needs to work if xercesImpl.jar in classpath

2018-03-06 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18879:
-

 Summary: Disallow embedded element in UDFXPathUtil needs to work 
if xercesImpl.jar in classpath
 Key: HIVE-18879
 URL: https://issues.apache.org/jira/browse/HIVE-18879
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai








[jira] [Created] (HIVE-18833) Auto Merge fails when "insert into directory as orcfile"

2018-02-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18833:
-

 Summary: Auto Merge fails when "insert into directory as orcfile"
 Key: HIVE-18833
 URL: https://issues.apache.org/jira/browse/HIVE-18833
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai


Here is the reproduction:
{code}
set mapreduce.job.reduces=2;
set hive.merge.tezfiles=true;
INSERT OVERWRITE DIRECTORY 'output' stored as orcfile select age, avg(gpa) from 
student group by age;
{code}

Error message: the file merge stage after map completion treats the input as 
"input format: org.apache.hadoop.mapred.TextInputFormat" instead of 
"org.apache.hadoop.hive.ql.io.orc.OrcInputFormat".





[jira] [Created] (HIVE-18815) Remove unused feature in HPL/SQL

2018-02-27 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18815:
-

 Summary: Remove unused feature in HPL/SQL
 Key: HIVE-18815
 URL: https://issues.apache.org/jira/browse/HIVE-18815
 Project: Hive
  Issue Type: Bug
  Components: hpl/sql
Reporter: Daniel Dai
Assignee: Daniel Dai


Remove FTP feature in HPL/SQL.





[jira] [Created] (HIVE-18794) Repl load "with" clause does not pass config to tasks for non-partition tables

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18794:
-

 Summary: Repl load "with" clause does not pass config to tasks for 
non-partition tables
 Key: HIVE-18794
 URL: https://issues.apache.org/jira/browse/HIVE-18794
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-18794.1.patch

HIVE-18626 missed one scenario.





[jira] [Created] (HIVE-18789) Disallow embedded element in UDFXPathUtil

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18789:
-

 Summary: Disallow embedded element in UDFXPathUtil
 Key: HIVE-18789
 URL: https://issues.apache.org/jira/browse/HIVE-18789
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-18788) Clean up inputs in JDBC PreparedStatement

2018-02-23 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18788:
-

 Summary: Clean up inputs in JDBC PreparedStatement
 Key: HIVE-18788
 URL: https://issues.apache.org/jira/browse/HIVE-18788
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-18778) Needs to capture input/output entities in explain

2018-02-22 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18778:
-

 Summary: Needs to capture input/output entities in explain
 Key: HIVE-18778
 URL: https://issues.apache.org/jira/browse/HIVE-18778
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai








[jira] [Created] (HIVE-18626) Repl load "with" clause does not pass config to tasks

2018-02-05 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18626:
-

 Summary: Repl load "with" clause does not pass config to tasks
 Key: HIVE-18626
 URL: https://issues.apache.org/jira/browse/HIVE-18626
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


The "with" clause in REPL LOAD is supposed to pass custom Hive config entries to 
replication. However, the config is only effective in BootstrapEventsIterator, 
not in the generated tasks (such as MoveTask and DDLTask).
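
For example (the config key and paths below are made-up placeholders), entries passed like this should take effect in the generated MoveTask and DDLTask as well:

{code}
REPL LOAD testdb FROM '/tmp/repl_dump/testdb'
WITH ('hive.exec.scratchdir'='/tmp/custom_scratch');
{code}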





[jira] [Created] (HIVE-18530) Replication should skip MM table (for now)

2018-01-24 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18530:
-

 Summary: Replication should skip MM table (for now)
 Key: HIVE-18530
 URL: https://issues.apache.org/jira/browse/HIVE-18530
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


Currently replication cannot handle transactional tables (including MM tables) 
until HIVE-18320. HIVE-17504 skips tables with transactional=true explicitly. 
HIVE-18352 changed the logic to use AcidUtils.isAcidTable for the same purpose. 
However, isAcidTable returns false for MM tables, so Hive still dumps MM tables 
during replication.
Here is an error message from dumping an MM table:
{code}
ERROR : FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
java.io.FileNotFoundException: Path is not a file: 
/apps/hive/warehouse/testrepldb5.db/test1/delta_261_261_
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:89)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:75)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1920)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:731)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:424)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

INFO  : Completed executing 
command(queryId=hive_20180119203438_293813df-7630-47fa-bc30-5ef7cbb42842); Time 
taken: 1.119 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
java.io.FileNotFoundException: Path is not a file: 
/apps/hive/warehouse/testrepldb5.db/test1/delta_261_261_
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:89)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:75)
at 
org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:154)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1920)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:731)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:424)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) 
(state=08S01,code=1)
0: jdbc:hive2://ctr-e137-1514896590304-25219-> Closing: 0: 
jdbc:hive2://ctr-e137-1514896590304-25219-02-05.hwx.site:2181,ctr-e137-1514896590304-25219-02-12.hwx.site:2181,ctr-e137-1514896590304-25219-02-09.hwx.site:2181,ctr-e137-1514896590304-25219-02-04.hwx.site:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@example.com
{code}
We should switch to using AcidUtils.isTransactionalTable.





[jira] [Created] (HIVE-18299) DbNotificationListener fail on mysql with "select for update"

2017-12-18 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18299:
-

 Summary: DbNotificationListener fail on mysql with "select for 
update"
 Key: HIVE-18299
 URL: https://issues.apache.org/jira/browse/HIVE-18299
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


This is a continuation of HIVE-17830, which hasn't solved the issue. We need 
to run the "SET \@\@session.sql_mode=ANSI_QUOTES" statement before we run 
"select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\"". We should keep the table 
name quoted to be consistent with the rest of the ObjectStore code. This approach is 
the same as what MetaStoreDirectSql takes (set the session variable before every 
query).
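
Concretely, the sequence on MySQL looks like this sketch (the actual statements are issued from DbNotificationListener):

{code}
-- treat double-quoted identifiers as quoted names for this session
SET @@session.sql_mode=ANSI_QUOTES;
SELECT "NEXT_EVENT_ID" FROM "NOTIFICATION_SEQUENCE" FOR UPDATE;
{code}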



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-18298) Fix TestReplicationScenarios.testConstraints

2017-12-18 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18298:
-

 Summary: Fix TestReplicationScenarios.testConstraints
 Key: HIVE-18298
 URL: https://issues.apache.org/jira/browse/HIVE-18298
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


The test is broken by HIVE-16603. Previously, constraints were created in no 
particular order on the replication destination cluster during bootstrap; after 
HIVE-16603, that is no longer possible. We need to create foreign keys last, after 
all primary keys are created.
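
As a sketch of the required ordering (table and constraint names are placeholders), the bootstrap load should issue statements like:

{code}
-- primary keys first...
alter table parent add constraint pk_parent primary key (id) disable novalidate;
-- ...then the foreign keys that reference them
alter table child add constraint fk_child foreign key (parent_id)
  references parent(id) disable novalidate;
{code}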





[jira] [Created] (HIVE-18227) Tez parallel execution fail

2017-12-05 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18227:
-

 Summary: Tez parallel execution fail
 Key: HIVE-18227
 URL: https://issues.apache.org/jira/browse/HIVE-18227
 Project: Hive
  Issue Type: Bug
  Components: Tez
Reporter: Daniel Dai
Assignee: Daniel Dai


Running Tez DAGs in parallel within a session fails. Here is the test case:
{code}
set hive.exec.parallel=true;
set hive.merge.tezfiles=true;
set tez.grouping.max-size=10;
set tez.grouping.min-size=1;

from student
insert overwrite table student4 select *
insert overwrite table student5 select *
insert overwrite table student6 select *;
{code}

The merge tasks run in parallel, resulting in the exception:
{code}
org.apache.tez.dag.api.TezException: App master already running a DAG
at 
org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1255)
at 
org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118)
at 
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:161)
at 
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2273)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2269)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2267)
{code}





[jira] [Created] (HIVE-18189) Order by position does not work when cbo is disabled

2017-11-30 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18189:
-

 Summary: Order by position does not work when cbo is disabled
 Key: HIVE-18189
 URL: https://issues.apache.org/jira/browse/HIVE-18189
 Project: Hive
  Issue Type: Bug
  Components: Query Planning
Reporter: Daniel Dai
Assignee: Daniel Dai


Investigating a failed query:
{code}
set hive.cbo.enable=false;
set hive.orderby.position.alias=true;
select distinct age from student order by 1 desc limit 20;
{code}

The query does not sort the output correctly when CBO is disabled or inactive. 
I found two issues:
1. The "order by position" query is broken by HIVE-16774.
2. In particular, "select distinct" queries never worked with "order by position".





[jira] [Created] (HIVE-18180) DbNotificationListener broken after HIVE-17967

2017-11-29 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-18180:
-

 Summary: DbNotificationListener broken after HIVE-17967
 Key: HIVE-18180
 URL: https://issues.apache.org/jira/browse/HIVE-18180
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


Exception happens when starting Hive metastore with DbNotificationListener on:
{code}
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.apache.hadoop.hive.metastore.utils.MetaStoreUtils.getMetaStoreListeners(MetaStoreUtils.java:792)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:511)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.(RetryingHMSHandler.java:80)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7426)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:7421)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:7694)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:7611)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassCastException: org.apache.hadoop.conf.Configuration 
cannot be cast to org.apache.hadoop.hive.conf.HiveConf
at 
org.apache.hive.hcatalog.listener.DbNotificationListener.(DbNotificationListener.java:114)
... 24 more
{code}





[jira] [Created] (HIVE-17840) HiveMetaStore eats exception if transactionalListeners.notifyEvent fail

2017-10-18 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17840:
-

 Summary: HiveMetaStore eats exception if 
transactionalListeners.notifyEvent fail
 Key: HIVE-17840
 URL: https://issues.apache.org/jira/browse/HIVE-17840
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


For example, in add_partitions_core, if there is an exception in 
MetaStoreListenerNotifier.notifyEvent(transactionalListeners,), the transaction 
rolls back but no exception is thrown. The client will assume the add partition 
was successful and take the positive path.





[jira] [Created] (HIVE-17497) Constraint import may fail during incremental replication

2017-09-10 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17497:
-

 Summary: Constraint import may fail during incremental replication
 Key: HIVE-17497
 URL: https://issues.apache.org/jira/browse/HIVE-17497
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


During bootstrap repl dump, we may export a constraint twice, in both the bootstrap 
dump and the incremental dump. Consider the following sequence:
1. Get repl_id, dump the table.
2. During the dump, a constraint is added.
3. This constraint will be in both the bootstrap dump and the incremental dump.
4. The incremental repl_id will be newer, so the constraint will be loaded during 
incremental replication.
5. Since the constraint is already in the bootstrap replication, we will get an 
exception.





[jira] [Created] (HIVE-17421) Clear incorrect stats after replication

2017-08-31 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17421:
-

 Summary: Clear incorrect stats after replication
 Key: HIVE-17421
 URL: https://issues.apache.org/jira/browse/HIVE-17421
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


After replication, some stats summaries are incorrect. If 
hive.compute.query.using.stats is set to true, we will get wrong results on the 
destination side.

This will not happen with bootstrap replication, because the stats summary is kept 
in table properties and is replicated to the destination. However, in 
incremental replication this doesn't work. When the table is created, the stats 
summary is empty (e.g., numRows=0). Later, when we insert data, the stats summary is 
updated via update_table_column_statistics/update_partition_column_statistics; 
however, neither event is captured in incremental replication. Thus on the 
destination side we will get count(*)=0. The simple solution is to remove the 
COLUMN_STATS_ACCURATE property after incremental replication.
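
To illustrate the symptom (the table name is a placeholder):

{code}
-- on the destination cluster, after incremental replication:
set hive.compute.query.using.stats=true;
select count(*) from repl_dest_table;
-- answered from the stale numRows=0 summary instead of a scan, so it returns 0
{code}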





[jira] [Created] (HIVE-17366) Constraint replication in bootstrap

2017-08-21 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17366:
-

 Summary: Constraint replication in bootstrap
 Key: HIVE-17366
 URL: https://issues.apache.org/jira/browse/HIVE-17366
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


Incremental constraint replication is tracked in HIVE-15705. This ticket tracks 
bootstrap constraint replication.





[jira] [Created] (HIVE-17254) Skip updating AccessTime of recycled files in ReplChangeManager

2017-08-04 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17254:
-

 Summary: Skip updating AccessTime of recycled files in 
ReplChangeManager
 Key: HIVE-17254
 URL: https://issues.apache.org/jira/browse/HIVE-17254
 Project: Hive
  Issue Type: Bug
  Components: repl
Reporter: Daniel Dai
Assignee: Daniel Dai


For recycled files, we update both ModifyTime and AccessTime:
fs.setTimes(path, now, now);
On some versions of HDFS, this is not allowed when 
"dfs.namenode.accesstime.precision" is set to 0. Though the issue is solved in 
HDFS-9208, we don't use AccessTime in CM, so updating it can be skipped and we don't 
have to fail in this scenario.





[jira] [Created] (HIVE-17208) Repl dump should pass in db/table information to authorization API

2017-07-30 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17208:
-

 Summary: Repl dump should pass in db/table information to 
authorization API
 Key: HIVE-17208
 URL: https://issues.apache.org/jira/browse/HIVE-17208
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Daniel Dai
Assignee: Daniel Dai


"repl dump" does not provide db/table information, which is necessary for 
authorization replication in Ranger.





[jira] [Created] (HIVE-17007) NPE introduced by HIVE-16871

2017-06-30 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-17007:
-

 Summary: NPE introduced by HIVE-16871
 Key: HIVE-17007
 URL: https://issues.apache.org/jira/browse/HIVE-17007
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


Stack:
{code}
2017-06-30T02:39:43,739 ERROR [HiveServer2-Background-Pool: Thread-2873]: 
metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(200)) - 
MetaException(message:java.lang.NullPointerException)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6066)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3993)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_environment_context(HiveMetaStore.java:3944)
at sun.reflect.GeneratedMethodAccessor142.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy32.alter_table_with_environment_context(Unknown 
Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table_with_environmentContext(HiveMetaStoreClient.java:397)
at 
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.alter_table_with_environmentContext(SessionHiveMetaStoreClient.java:325)
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
Source)
at sun.reflect.GeneratedMethodAccessor75.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2306)
at com.sun.proxy.$Proxy33.alter_table_with_environmentContext(Unknown 
Source)
at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:624)
at org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3490)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:383)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:348)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.metastore.cache.SharedCache.getCachedTableColStats(SharedCache.java:140)
at 
org.apache.hadoop.hive.metastore.cache.CachedStore.getTableColumnStatistics(CachedStore.java:1409)
at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at 

[jira] [Created] (HIVE-16871) CachedStore.get_aggr_stats_for has side affect

2017-06-09 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16871:
-

 Summary: CachedStore.get_aggr_stats_for has side affect
 Key: HIVE-16871
 URL: https://issues.apache.org/jira/browse/HIVE-16871
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


Every get_aggr_stats_for call accumulates the stats and propagates them into the 
first partition's stats object. The accumulation produces wrong results in follow-up 
invocations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HIVE-16848) NPE during CachedStore refresh

2017-06-07 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16848:
-

 Summary: NPE during CachedStore refresh
 Key: HIVE-16848
 URL: https://issues.apache.org/jira/browse/HIVE-16848
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


The CachedStore refresh only happens once due to an NPE; the 
ScheduledExecutorService cancels subsequent refreshes:

{code}
java.lang.NullPointerException
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.updateTableColStats(CachedStore.java:458)
at 
org.apache.hadoop.hive.metastore.cache.CachedStore$CacheUpdateMasterWork.run(CachedStore.java:348)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
{code}





[jira] [Created] (HIVE-16779) CachedStore refresher leak PersistenceManager resources

2017-05-26 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16779:
-

 Summary: CachedStore refresher leak PersistenceManager resources
 Key: HIVE-16779
 URL: https://issues.apache.org/jira/browse/HIVE-16779
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


We see an OOM when running CachedStore. We didn't shut down the RawStore in the refresh thread.





[jira] [Created] (HIVE-16662) Fix remaining unit test failures when CachedStore is enabled

2017-05-12 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16662:
-

 Summary: Fix remaining unit test failures when CachedStore is 
enabled
 Key: HIVE-16662
 URL: https://issues.apache.org/jira/browse/HIVE-16662
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


In HIVE-16586, I fixed most of the UT failures for CachedStore. This ticket is for 
the remaining failures, and for regressions when stats methods in CachedStore are enabled.





[jira] [Created] (HIVE-16638) Get rid of magic constant __HIVE_DEFAULT_PARTITION__ in syntax

2017-05-10 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16638:
-

 Summary: Get rid of magic constant __HIVE_DEFAULT_PARTITION__ in 
syntax
 Key: HIVE-16638
 URL: https://issues.apache.org/jira/browse/HIVE-16638
 Project: Hive
  Issue Type: Improvement
Reporter: Daniel Dai


As per the discussion in HIVE-16609, we'd like to get rid of the magic constant 
__HIVE_DEFAULT_PARTITION__ in syntax. There are two use cases I am currently 
aware of:
1. alter table t drop partition(p='__HIVE_DEFAULT_PARTITION__');
2. select * from t where p='__HIVE_DEFAULT_PARTITION__';

Currently we switch p='__HIVE_DEFAULT_PARTITION__' to "p is null" internally 
for processing. It would be good if we could promote this to the syntax level and 
get rid of p='__HIVE_DEFAULT_PARTITION__' completely.





[jira] [Created] (HIVE-16633) username for ATS data shall always be the uid who submits the job

2017-05-10 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16633:
-

 Summary: username for ATS data shall always be the uid who submits 
the job
 Key: HIVE-16633
 URL: https://issues.apache.org/jira/browse/HIVE-16633
 Project: Hive
  Issue Type: Bug
Reporter: Daniel Dai
Assignee: Daniel Dai
 Attachments: HIVE-16633.1.patch

When submitting a query via HS2, the username for ATS data becomes the HS2 process uid 
in case of 
hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator.
 It should always be the real user id, to make ATS data more secure and useful.





[jira] [Created] (HIVE-16609) col='__HIVE_DEFAULT_PARTITION__' condition in select statement may produce wrong result

2017-05-08 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16609:
-

 Summary: col='__HIVE_DEFAULT_PARTITION__' condition in select 
statement may produce wrong result
 Key: HIVE-16609
 URL: https://issues.apache.org/jira/browse/HIVE-16609
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


A variation of drop_partitions_filter4.q produces a wrong result:
{code}
create table ptestfilter (a string, b int) partitioned by (c string, d int);
INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col1', 1, null, null;
INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col2', 2, null, 2;
INSERT OVERWRITE TABLE ptestfilter PARTITION (c,d) select 'Col3', 3, 'Uganda', 
null;
select * from ptestfilter where c='__HIVE_DEFAULT_PARTITION__' or lower(c)='a';
{code}
The "select" statement does not produce the rows containing 
"__HIVE_DEFAULT_PARTITION__".

Note "select * from ptestfilter where c is null or lower(c)='a';" works fine.

In the query, c is a non-string partition column; we need another condition 
containing a UDF so that the condition is not recognized by 
PartFilterExprUtil.makeExpressionTree in ObjectStore. HIVE-11208/HIVE-15923 
address a similar issue in drop partition; however, select is not covered.





[jira] [Created] (HIVE-16586) Fix Unit test failures when CachedStore is enabled

2017-05-04 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16586:
-

 Summary: Fix Unit test failures when CachedStore is enabled
 Key: HIVE-16586
 URL: https://issues.apache.org/jira/browse/HIVE-16586
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


Though we don't plan to turn on CachedStore by default, we want to make sure the 
unit tests pass with CachedStore. I turned on CachedStore in the patch in order 
to run the unit tests with it, but I will turn it off when committing.





[jira] [Created] (HIVE-16520) Cache hive metadata in metastore

2017-04-24 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16520:
-

 Summary: Cache hive metadata in metastore
 Key: HIVE-16520
 URL: https://issues.apache.org/jira/browse/HIVE-16520
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Daniel Dai
Assignee: Daniel Dai


During Hive 2 benchmarks, we found that Hive metastore operations take a lot of time 
and thus slow down Hive compilation. In some extreme cases, they take much longer 
than the actual query run time. In particular, we found the latency of a cloud DB is 
very high, and 90% of total query runtime is spent waiting for metastore SQL database 
operations. Based on this observation, metastore operation performance would 
be greatly enhanced if we had an in-memory structure that caches the database 
query results.





[jira] [Created] (HIVE-16323) HS2 JDOPersistenceManagerFactory.pmCache leaks after HIVE-14204

2017-03-28 Thread Daniel Dai (JIRA)
Daniel Dai created HIVE-16323:
-

 Summary: HS2 JDOPersistenceManagerFactory.pmCache leaks after 
HIVE-14204
 Key: HIVE-16323
 URL: https://issues.apache.org/jira/browse/HIVE-16323
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Daniel Dai
Assignee: Daniel Dai


Hive.loadDynamicPartitions creates threads, each with a new embedded rawstore, but 
never closes them; thus we leak one PersistenceManager per such thread.




