[jira] [Created] (HIVE-19743) hive is not pushing predicate down to HBaseStorageHandler if hive key mapped with hbase is stored as varchar
Rajkumar Singh created HIVE-19743:
-
Summary: hive is not pushing predicate down to HBaseStorageHandler if hive key mapped with hbase is stored as varchar
Key: HIVE-19743
URL: https://issues.apache.org/jira/browse/HIVE-19743
Project: Hive
Issue Type: Bug
Components: HBase Handler, Hive
Affects Versions: 2.1.0
Environment: java8, centos7
Reporter: Rajkumar Singh

Steps to reproduce:
{code}
// hbase table
create 'mytable', 'cf'
put 'mytable', 'ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4', 'cf:message', 'hello world'
put 'mytable', 'ABCDEF1|GHIJK1|ijj123kl-mn4o-4pq5-678r-st90123u0v41', 'cf:foo', 0x0

// hive table with key stored as varchar
show create table hbase_table_4;
CREATE EXTERNAL TABLE `hbase_table_4`(
  `hbase_key` varchar(80) COMMENT 'from deserializer',
  `value` string COMMENT 'from deserializer',
  `value1` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key,cf:foo,cf:message',
  'serialization.format'='1')
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{"BASIC_STATS":"true"}',
  'hbase.table.name'='mytable',
  'numFiles'='0',
  'numRows'='0',
  'rawDataSize'='0',
  'totalSize'='0',
  'transient_lastDdlTime'='1527708430')

// hive table with key stored as string
CREATE EXTERNAL TABLE `hbase_table_5`(
  `hbase_key` string COMMENT 'from deserializer',
  `value` string COMMENT 'from deserializer',
  `value1` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key,cf:foo,cf:message',
  'serialization.format'='1')
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{"BASIC_STATS":"true"}',
  'hbase.table.name'='mytable',
  'numFiles'='0',
  'numRows'='0',
  'rawDataSize'='0',
  'totalSize'='0',
  'transient_lastDdlTime'='1527708520')
{code}
Explain plan on the table with the key stored as varchar:
{code}
explain select * from hbase_table_4 where hbase_key='ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4';

Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      Output:["_col0","_col1","_col2"]
      Filter Operator [FIL_4]
        predicate:(UDFToString(hbase_key) = 'ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4')
        TableScan [TS_0]
          Output:["hbase_key","value","value1"]
{code}
Explain plan on the table with the key stored as string:
{code}
explain select * from hbase_table_5 where hbase_key='ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4';

Plan optimized by CBO.

Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      Output:["_col0","_col1","_col2"]
{code}
With the string key there is no Filter Operator in the plan (the predicate is pushed down to HBase), while with the varchar key the comparison is wrapped in UDFToString and the filter stays in Hive.
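The plans above suggest why pushdown is skipped: the varchar key only reaches the storage handler wrapped in a UDFToString cast, and a comparison is only pushable when it is made on the bare key column. A minimal sketch of that eligibility check (hypothetical names, not Hive's actual predicate analyzer):

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the pushdown decision implied by the plans above:
// a comparison is pushed to the HBase row key only when the key column
// appears bare, not wrapped in a cast such as UDFToString(hbase_key).
public class KeyPushdownCheck {

    // Key types whose equality comparisons map directly onto row-key lookups.
    private static final List<String> PUSHABLE_KEY_TYPES = Arrays.asList("string");

    static boolean canPushToRowKey(String keyColumnType, boolean wrappedInCast) {
        // A cast (varchar -> string here) hides the key column from the
        // predicate analyzer, so the filter stays in Hive's Filter Operator.
        return PUSHABLE_KEY_TYPES.contains(keyColumnType) && !wrappedInCast;
    }
}
```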
[jira] [Created] (HIVE-19831) Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
Rajkumar Singh created HIVE-19831:
-
Summary: Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists
Key: HIVE-19831
URL: https://issues.apache.org/jira/browse/HIVE-19831
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 2.1.0, 1.2.1
Reporter: Rajkumar Singh

With sqlstdauth on, CREATE DATABASE IF NOT EXISTS takes too long if there are too many objects inside the database directory. Hive should not run the doAuth checks on all the objects within the database if the database already exists.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
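The proposed improvement can be sketched as a simple short-circuit (an illustration of the idea, not the committed patch): when the target database already exists and the statement carries IF NOT EXISTS, the statement is a no-op, so nothing under the database directory needs per-object authorization.

```java
import java.util.Collections;
import java.util.List;

// Sketch of the proposed short-circuit: skip the per-object doAuth walk
// entirely when CREATE DATABASE IF NOT EXISTS targets an existing database.
public class CreateDbAuthShortCircuit {

    static List<String> objectsToAuthorize(boolean dbExists, boolean ifNotExists,
                                           List<String> objectsInDb) {
        if (dbExists && ifNotExists) {
            // Statement is a no-op; no object-level auth checks are needed.
            return Collections.emptyList();
        }
        return objectsInDb; // normal path: authorize every object
    }
}
```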
[jira] [Created] (HIVE-19860) HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector
Rajkumar Singh created HIVE-19860:
-
Summary: HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector
Key: HIVE-19860
URL: https://issues.apache.org/jira/browse/HIVE-19860
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 2.1.0
Environment: hiveserver2 Interactive with LLAP.
Reporter: Rajkumar Singh

HiveServer2 starts seeing memory pressure once cachedUnionStructObjectInspector starts growing: [https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java#L345]
I did not see any eviction policy for cachedUnionStructObjectInspector, so we should implement some size- or time-based eviction policy.
!Screen Shot 2018-06-11 at 1.52.50 PM.png!
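The size-based eviction suggested above can be illustrated with a plain JDK access-ordered LinkedHashMap. This is only a sketch of the policy — the real cache in ObjectInspectorFactory is a concurrent map, so an actual fix would need a thread-safe bounded cache rather than this class:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a size-bounded LRU cache: once maxEntries is exceeded, the
// least-recently-accessed inspector is dropped instead of growing forever.
public class BoundedOiCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedOiCache(int maxEntries) {
        super(16, 0.75f, true); // access order => eldest == least recently used
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the LRU entry past the bound
    }
}
```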
[jira] [Created] (HIVE-20080) TxnHandler checkLock direct sql fail with ORA-01795 , if the table has more than 1000 partitions
Rajkumar Singh created HIVE-20080:
-
Summary: TxnHandler checkLock direct sql fail with ORA-01795, if the table has more than 1000 partitions
Key: HIVE-20080
URL: https://issues.apache.org/jira/browse/HIVE-20080
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.1.0
Reporter: Rajkumar Singh

With Oracle as the Metastore, TxnHandler checkLock fails with "checkLockWithRetry(181398,34773) : ORA-01795: maximum number of expressions in a list is 1000" if the table being written to has more than 1000 partitions.

Complete stacktrace:
{code}
txn.TxnHandler (TxnHandler.java:checkRetryable(2099)) - Non-retryable error in checkLockWithRetry(181398,34773) : ORA-01795: maximum number of expressions in a list is 1000 (SQLState=42000, ErrorCode=1795)
2018-06-25 15:09:35,999 ERROR [pool-7-thread-197]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Unable to update transaction database java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
	at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
	at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
	at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
	at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
	at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:195)
	at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:876)
	at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
	at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
	at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1498)
	at oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:406)
	at com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2649)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1126)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:895)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:6123)
	at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
	at com.sun.proxy.$Proxy11.lock(Unknown Source)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:12012)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:11996)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1131)
	at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:895)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:6123)
	at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
	at org.apache.hadoop.hive.metastore.Retryin
{code}
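The usual workaround for ORA-01795 (an assumed approach, not necessarily the committed patch) is to split the IN-list into chunks of at most 1000 expressions and OR the resulting IN clauses together, so the generated SQL stays within Oracle's limit no matter how many partitions the table has:

```java
import java.util.List;

// Sketch of the ORA-01795 workaround: build
//   (col IN (...<=1000 ids...) OR col IN (...) OR ...)
// instead of a single IN-list with more than 1000 expressions.
public class InListBatcher {
    static String inClause(String column, List<Long> ids, int maxPerList) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < ids.size(); i += maxPerList) {
            if (i > 0) sb.append(" OR ");
            List<Long> chunk = ids.subList(i, Math.min(i + maxPerList, ids.size()));
            sb.append(column).append(" IN (");
            for (int j = 0; j < chunk.size(); j++) {
                if (j > 0) sb.append(',');
                sb.append(chunk.get(j));
            }
            sb.append(')');
        }
        return sb.append(')').toString();
    }
}
```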
[jira] [Created] (HIVE-20099) incorrect logger for LlapServlet
Rajkumar Singh created HIVE-20099:
-
Summary: incorrect logger for LlapServlet
Key: HIVE-20099
URL: https://issues.apache.org/jira/browse/HIVE-20099
Project: Hive
Issue Type: Improvement
Affects Versions: 2.1.0
Reporter: Rajkumar Singh

The logger should be LlapServlet, not JMXJsonServlet; logging under the wrong class can mislead users while debugging UI issues.
[jira] [Created] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore
Rajkumar Singh created HIVE-20172:
-
Summary: StatsUpdater failed with GSS Exception while trying to connect to remote metastore
Key: HIVE-20172
URL: https://issues.apache.org/jira/browse/HIVE-20172
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.1.1
Environment: Hive-1.2.1, Hive-2.1, java8
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh

The StatsUpdater task failed with a GSS exception while trying to connect to the remote Metastore.
{code}
org.apache.thrift.transport.TTransportException: GSS initiate failed
	at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
	at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)
	at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526)
	at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558)
	at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533)
	at org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300)
	at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
	at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
	at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
	at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
{code}
Since the metastore client is running inside HMS, there is no need to connect to a remote URI.
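The conclusion above suggests the fix: when the compactor's StatsUpdater already runs inside the metastore process, force an embedded client instead of opening a kerberized Thrift connection back to a remote HMS. A minimal sketch of that decision (assumed behaviour, not the committed patch) — an empty `hive.metastore.uris` value is what makes HiveMetaStoreClient fall back to the embedded, in-process metastore:

```java
// Sketch: choose embedded vs. remote metastore mode for the StatsUpdater.
public class MetastoreClientMode {
    static String effectiveUris(boolean runningInsideHms, String configuredUris) {
        // Empty uris => embedded (in-process) metastore, so no GSS/SASL
        // handshake is attempted at all.
        return runningInsideHms ? "" : configuredUris;
    }
}
```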
[jira] [Created] (HIVE-20275) hive produces incorrect result when using MIN()/MAX() on varchar with hive.vectorized.reuse.scratch.columns enabled
Rajkumar Singh created HIVE-20275:
-
Summary: hive produces incorrect result when using MIN()/MAX() on varchar with hive.vectorized.reuse.scratch.columns enabled
Key: HIVE-20275
URL: https://issues.apache.org/jira/browse/HIVE-20275
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Environment: Hive3.1, java8
Reporter: Rajkumar Singh

Steps to reproduce:
{code}
create table testhive3 (name varchar(8), `time` double);
insert into table testhive3 values ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), ('ABC', 2), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), ('ABC', 2), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), (NULL, NULL), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2), ('ABC', 1), ('ABC', 2), ('ABC', 1), ('ABC', 2), ('DEF', 1), ('DEF', 2);

select name, `time` from testhive3 where name = 'ABC' group by name, `time`;
+-------+-------+
| name  | time  |
+-------+-------+
| ABC   | 1.0   |
| ABC   | 2.0   |
+-------+-------+

select min(name), `time` from testhive3 where name = 'ABC' group by name, `time`;
+-------+-------+
| _c0   | time  |
+-------+-------+
| NULL  | 1.0   |
| NULL  | 2.0   |
+-------+-------+

set hive.vectorized.reuse.scratch.columns=false;
select min(name), `time` from testhive3 where name = 'ABC' group by name, `time`;
+------+-------+
| _c0  | time  |
+------+-------+
| ABC  | 1.0   |
| ABC  | 2.0   |
+------+-------+
{code}
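The failure mode behind scratch-column reuse can be illustrated outside Hive (this is an analogy, not Hive's vectorization code): when a reused output buffer is not reset between uses, stale state from the previous use leaks into the next result — analogous to `min(name)` coming back NULL only while `hive.vectorized.reuse.scratch.columns` is enabled:

```java
import java.util.Arrays;

// Illustration of the reuse hazard: a scratch buffer refilled with a smaller
// batch keeps stale tail values unless it is cleared first.
public class ScratchColumnReuse {
    static String[] fillBatch(String[] scratch, String[] values, boolean resetBeforeUse) {
        if (resetBeforeUse) {
            Arrays.fill(scratch, null); // correct: clear prior contents
        }
        for (int i = 0; i < values.length; i++) {
            scratch[i] = values[i];
        }
        return scratch; // without reset, slots past values.length are stale
    }
}
```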
[jira] [Created] (HIVE-20343) Hive 3: CTAS does not respect transactional_properties
Rajkumar Singh created HIVE-20343:
-
Summary: Hive 3: CTAS does not respect transactional_properties
Key: HIVE-20343
URL: https://issues.apache.org/jira/browse/HIVE-20343
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 3.1.0
Environment: hive-3
Reporter: Rajkumar Singh

Steps to reproduce:
{code}
create table ctasexampleinsertonly stored as orc TBLPROPERTIES ("transactional_properties"="insert_only") as select * from testtable limit 1;

describe formatted ctasexampleinsertonly;
| col_name                     | data_type                                                              | comment |
| # col_name                   | data_type                                                              | comment |
| name                         | varchar(8)                                                             |         |
| time                         | double                                                                 |         |
|                              | NULL                                                                   | NULL    |
| # Detailed Table Information | NULL                                                                   | NULL    |
| Database:                    | default                                                                | NULL    |
| OwnerType:                   | USER                                                                   | NULL    |
| Owner:                       | hive                                                                   | NULL    |
| CreateTime:                  | Wed Aug 08 21:35:15 UTC 2018                                           | NULL    |
| LastAccessTime:              | UNKNOWN                                                                | NULL    |
| Retention:                   | 0                                                                      | NULL    |
| Location:                    | hdfs://xx:8020/warehouse/tablespace/managed/hive/ctasexampleinsertonly | NULL    |
| Table Type:                  | MANAGED_TABLE                                                          | NULL    |
| Table Parameters:            | NULL                                                                   | NULL    |
|                              | COLUMN_STATS_ACCURATE                                                  | {}      |
|                              | bucketing_version                                                      | 2       |
|                              | numFiles                                                               | 1       |
|                              | numRows                                                                | 1       |
|                              | rawDataSize                                                            | 0       |
|                              | totalSize                                                              | 754     |
|                              | transactional                                                          | true    |
|                              | transactional_properties                                               | default |
|                              | transient_lastDdlTime                                                  | 1533764115 |
|                              | NULL                                                                   | NULL    |
| # Storage Information        | NULL                                                                   | NULL    |
| SerDe Library:               | org.apache.hadoop.hive.ql.io.orc.OrcSerde                              | NULL    |
| InputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                        | NULL    |
| OutputFormat:                | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                       | NULL    |
| Compressed:                  | No                                                                     | NULL    |
| Num Buckets:                 | -1                                                                     | NULL    |
| Bucket Columns:              | []                                                                     | NULL    |
| Sort Columns:                | []                                                                     | NULL    |
| Storage Desc Params:         | NULL                                                                   | NULL    |
|                              | serialization.format                                                   | 1       |
{code}
Although the table was created with 'transactional_properties'='insert_only', describe formatted shows transactional_properties = default. This creates a problem with insert
{code}
CREATE TABLE TABLE
[jira] [Created] (HIVE-20409) Hive ACID: Update/delete/merge leave behind the staging directory
Rajkumar Singh created HIVE-20409:
-
Summary: Hive ACID: Update/delete/merge leave behind the staging directory
Key: HIVE-20409
URL: https://issues.apache.org/jira/browse/HIVE-20409
Project: Hive
Issue Type: Bug
Environment: Hive-2.1, java-1.8
Reporter: Rajkumar Singh

UpdateDeleteSemanticAnalyzer creates a new query context while rewriting the query, but it doesn't set hdfsCleanup on it; as a result, the Driver doesn't clear the staging dir.
[jira] [Created] (HIVE-20415) Hive1: Tez Session failed to return if background thread is interrupted
Rajkumar Singh created HIVE-20415:
-
Summary: Hive1: Tez Session failed to return if background thread is interrupted
Key: HIVE-20415
URL: https://issues.apache.org/jira/browse/HIVE-20415
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Rajkumar Singh

The user canceled the query, which interrupts the background thread; because of this interrupt, the background thread fails to put the session back into the pool.
{code}
2018-08-14 15:55:27,581 ERROR exec.Task (TezTask.java:execute(226)) - Failed to execute tez graph.
java.lang.InterruptedException
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
	at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
	at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:350)
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.returnSession(TezSessionPoolManager.java:176)
{code}
We need a fix similar to HIVE-15731.
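The HIVE-15731-style fix referenced above amounts to making the session return interrupt-safe. A minimal sketch (an illustration of the pattern, not the Hive patch): clear the thread's interrupt flag before the blocking `put`, then restore it afterwards, so the session is returned to the pool even when the query was cancelled:

```java
import java.util.concurrent.BlockingQueue;

// Sketch: return a session to the pool even if the calling thread was
// interrupted by a query cancel; the interrupt status is restored for callers.
public class SessionReturner {
    static <T> boolean returnSession(BlockingQueue<T> pool, T session) {
        boolean interrupted = Thread.interrupted(); // clears a pending interrupt
        try {
            pool.put(session); // no longer fails immediately on the stale flag
            return true;
        } catch (InterruptedException e) {
            interrupted = true;
            return false;
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore the flag
            }
        }
    }
}
```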
[jira] [Created] (HIVE-20442) Hive stale lock when the hiveserver2 background thread died abruptly
Rajkumar Singh created HIVE-20442:
-
Summary: Hive stale lock when the hiveserver2 background thread died abruptly
Key: HIVE-20442
URL: https://issues.apache.org/jira/browse/HIVE-20442
Project: Hive
Issue Type: Bug
Components: Hive, Transactions
Affects Versions: 2.1.1
Environment: Hive-2.1
Reporter: Rajkumar Singh

This looks like a race condition where the background thread is not able to release the lock it acquired.

1. The HiveServer2 background thread requests a lock:
{code}
2018-08-20T14:13:38,813 INFO [HiveServer2-Background-Pool: Thread-X]: lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting: queryId=hive_xxx LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)], txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxx)
{code}
2. It acquires the lock and starts heartbeating:
{code}
2018-08-20T14:36:30,233 INFO [HiveServer2-Background-Pool: Thread-X]: lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started heartbeat with delay/interval = 15/15 MILLISECONDS for query: agentInfo:hive_xxx
{code}
3. Between events #1 and #2, the client disconnected and deleteContext cleaned up the session dir:
{code}
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXX]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) - Session disconnected without closing properly.
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-]: thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) - Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-]: service.CompositeService (SessionManager.java:closeSession(363)) - Session closed, SessionHandle [xxx], current sessions:2
{code}
4. The background thread died with an NPE while trying to get the queryId:
{code}
java.lang.NullPointerException: null
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568) ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414) ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211) ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204) ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336) [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
	at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_77]
	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
{code}
It did not get a chance to release the lock, and the heartbeater thread continues heartbeating indefinitely.
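A defensive shape for the fix (an assumption about the remedy, not the actual patch) is to release any acquired lock in a finally block, so that an unexpected error in Driver.execute — here, the NPE on the cleaned-up query state — cannot leave a stale lock behind with its heartbeater still running:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: a lock held across query execution is always released, even when
// the query body dies abruptly with a runtime exception.
public class LockGuard {
    final Set<Long> heldLocks = new HashSet<>();

    void runWithLock(long lockId, Runnable body) {
        heldLocks.add(lockId); // lock acquired, heartbeat started
        try {
            body.run();
        } finally {
            heldLocks.remove(lockId); // release even on NPE / abrupt death
        }
    }
}
```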
[jira] [Created] (HIVE-20499) GetTablesOperation pull all the tables meta irrespective of auth.
Rajkumar Singh created HIVE-20499:
-
Summary: GetTablesOperation pull all the tables meta irrespective of auth.
Key: HIVE-20499
URL: https://issues.apache.org/jira/browse/HIVE-20499
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 3.1.0
Environment: hive-3, java-8, sqlstdauth/ranger auth enabled.
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh

GetTablesOperation pulls the metadata of all tables irrespective of authorization.
Steps to reproduce:
{code}
ResultSet res = con.getMetaData().getTables("", "", "%", new String[] { "TABLE", "VIEW" });
{code}
https://github.com/rajkrrsingh/HiveServer2JDBCSample/blob/master/src/main/java/TestConnection.java#L20
[jira] [Created] (HIVE-20568) GetTablesOperation : There is no need to convert the dbname to pattern
Rajkumar Singh created HIVE-20568:
-
Summary: GetTablesOperation : There is no need to convert the dbname to pattern
Key: HIVE-20568
URL: https://issues.apache.org/jira/browse/HIVE-20568
Project: Hive
Issue Type: Improvement
Components: Hive
Affects Versions: 0.4.0
Environment: Hive-4, Java-8
Reporter: Rajkumar Singh

There is no need to convert the dbname to a pattern; dbNamePattern is just a dbName, which we pass to getTableMeta: https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java#L117
[jira] [Created] (HIVE-20591) hive query hung during compilation if same previous query is unable to invalidate the QueryResultsCache entry
Rajkumar Singh created HIVE-20591:
-
Summary: hive query hung during compilation if same previous query is unable to invalidate the QueryResultsCache entry
Key: HIVE-20591
URL: https://issues.apache.org/jira/browse/HIVE-20591
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 3.0.0
Environment: Hive-3, java-8
Reporter: Rajkumar Singh

I believe this is the sequence of events that reproduces this issue:
1. A query failed with some environment issue while setting up the Tez session.
2. HiveServer2 tries to do query cleanup, which invokes the QueryResultsCache cleanup: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L235
3. For some reason, either of the following two events never happens and the query falls into an endless loop checking for a valid status:
 i. it is unable to set the invalid status and returns the old status: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L260
 ii. or this condition is never reached: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L245

I don't have a complete jstack, so it's tough to say who is waiting on what; the stuck thread's stack snippet looks like:
{code}
java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.apache.hadoop.hive.ql.cache.results.QueryResultsCache$CacheEntry.waitForValidStatus(QueryResultsCache.java:325)
	- locked <0xb32661c0> (a org.apache.hadoop.hive.ql.cache.results.QueryResultsCache$CacheEntry)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.checkResultsCache(SemanticAnalyzer.java:14860)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12200)
{code}
Will add more details after reproducing the issue again.
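One time-bounded alternative to the endless status loop described above (an assumption about a possible fix, not the committed one) is to wait with a deadline: if the cache entry never leaves its pending state, give up and treat it as a cache miss instead of hanging compilation forever:

```java
// Sketch: a deadline-bounded version of a waitForValidStatus-style loop.
public class BoundedStatusWait {
    enum Status { PENDING, VALID, INVALID }

    static Status waitForStatus(Object monitor, Status[] statusRef, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (monitor) {
            while (statusRef[0] == Status.PENDING) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return Status.INVALID; // deadline hit: treat as cache miss
                }
                try {
                    monitor.wait(remaining); // woken by notify or timeout
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Status.INVALID;
                }
            }
            return statusRef[0];
        }
    }
}
```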
[jira] [Created] (HIVE-20616) Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars
Rajkumar Singh created HIVE-20616:
-
Summary: Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars
Key: HIVE-20616
URL: https://issues.apache.org/jira/browse/HIVE-20616
Project: Hive
Issue Type: Bug
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh

With MySQL as the metastore DB, PARTITION_PARAMS.PARAM_VALUE is defined as varchar(4000):
{code}
describe PARTITION_PARAMS;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| PART_ID     | bigint(20)    | NO   | PRI | NULL    |       |
| PARAM_KEY   | varchar(256)  | NO   | PRI | NULL    |       |
| PARAM_VALUE | varchar(4000) | YES  |     | NULL    |       |
+-------------+---------------+------+-----+---------+-------+
{code}
which leads to a MoveTask failure if the PARAM_VALUE exceeds 4000 chars:
{code}
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO `PARTITION_PARAMS` (`PARAM_VALUE`,`PART_ID`,`PARAM_KEY`) VALUES (?,?,?)
	at org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074)
	at org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224)
	at org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158)
	at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522)
	at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
	at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
	at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
	at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
	at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
	at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
	at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
	at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
	at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724)
	at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
	at org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:2442)
	at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
	at com.sun.proxy.$Proxy32.addPartition(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_core(HiveMetaStore.java:3976)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_with_environment_context(HiveMetaStore.java:4032)
	at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
	at com.sun.proxy.$Proxy34.add_partition_with_environment_context(Unknown Source)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15528)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15512)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
	at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
	at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncatio
{code}
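One way to surface this earlier (an assumed guard for illustration — alternatives include widening the PARTITION_PARAMS.PARAM_VALUE column or splitting the value across rows) is to validate the parameter length before the insert, so the user sees a clear error instead of a MysqlDataTruncation deep inside the MoveTask:

```java
// Sketch: fail fast on partition parameter values that cannot fit the
// varchar(4000) PARTITION_PARAMS.PARAM_VALUE column shown above.
public class ParamValueCheck {
    static final int MAX_PARAM_VALUE_LEN = 4000; // matches the MySQL schema

    static String validated(String paramKey, String paramValue) {
        if (paramValue != null && paramValue.length() > MAX_PARAM_VALUE_LEN) {
            throw new IllegalArgumentException(
                "Partition parameter '" + paramKey + "' exceeds "
                + MAX_PARAM_VALUE_LEN + " characters");
        }
        return paramValue;
    }
}
```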
[jira] [Created] (HIVE-20673) vectorized map join fail with Unexpected column vector type STRUCT.
Rajkumar Singh created HIVE-20673: - Summary: vectorized map join fail with Unexpected column vector type STRUCT. Key: HIVE-20673 URL: https://issues.apache.org/jira/browse/HIVE-20673 Project: Hive Issue Type: Bug Components: Hive, Transactions, Vectorization Affects Versions: 3.1.0 Environment: hive-3, java-8 Reporter: Rajkumar Singh update query on ACID table fails with the following exception. UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn); {code} Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector type STRUCT at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:302) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:419) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.initializeOp(VectorMapJoinGenerateResultOperator.java:115) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:572) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:524) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335) {code} STEPS TO REPRODUCE {code} create table census( ssn int, name string, city string, email string) row format delimited fields terminated by ','; insert into census values(100,"raj","san jose","email"); create table census_clus( ssn int, name string, city string, email string) clustered by (ssn) into 4 buckets stored as orc 
TBLPROPERTIES ('transactional'='true'); insert into table census_clus select * from census; UPDATE census_clus SET name = 'updated name' where ssn=100 and EXISTS (select distinct ssn from census where ssn=census_clus.ssn); {code} Looking at the exception, it seems the join operator gets the typeInfo incorrectly while doing the join; _col6 appears to be of struct type. {code} 2018-10-02 22:22:23,392 [INFO] [TezChild] |exec.CommonJoinOperator|: JOIN struct<_col2:string,_col3:string,_col6:struct> totalsz = 3 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20848) After setting UpdateInputAccessTimeHook, queries fail with Table Not Found.
Rajkumar Singh created HIVE-20848: - Summary: After setting UpdateInputAccessTimeHook, queries fail with Table Not Found. Key: HIVE-20848 URL: https://issues.apache.org/jira/browse/HIVE-20848 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh {code} select from_unixtime(1540495168); set hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec; select from_unixtime(1540495168); {code} The second select fails with the following exception: {code} ERROR ql.Driver: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found _dummy_table) org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found _dummy_table at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155) at org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866) at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) {code}
[jira] [Created] (HIVE-20908) Avoid multiple getTableMeta calls per database
Rajkumar Singh created HIVE-20908: - Summary: Avoid multiple getTableMeta calls per database Key: HIVE-20908 URL: https://issues.apache.org/jira/browse/HIVE-20908 Project: Hive Issue Type: Bug Reporter: Rajkumar Singh Following HIVE-19432, we call getTableMeta once for each authorized database; instead, we can pass a database-name pattern to the metastore and fetch the metadata in a single call.
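The trade-off can be sketched with a toy client. `MetaClient`, `get_table_meta`, and the `|`-separated pattern below are illustrative stand-ins, not the real metastore Thrift API; the point is simply N round trips versus one.

```python
# Illustrative sketch: one getTableMeta RPC per authorized database versus a
# single call with a combined database-name pattern. MetaClient is a
# hypothetical stand-in for the metastore client.

class MetaClient:
    def __init__(self, tables):
        self.tables = tables  # db name -> list of table names
        self.calls = 0        # count simulated RPC round trips

    def get_table_meta(self, db_pattern, tbl_pattern="*"):
        self.calls += 1
        # '|'-separated pattern; '*' matches every database (simplified)
        dbs = self.tables if db_pattern == "*" else db_pattern.split("|")
        return [(db, t) for db in dbs for t in self.tables.get(db, [])]

def per_db(client, authorized_dbs):
    # HIVE-19432 behaviour: one call per authorized database
    meta = []
    for db in authorized_dbs:
        meta.extend(client.get_table_meta(db))
    return meta

def single_pattern(client, authorized_dbs):
    # proposed behaviour: one combined pattern, a single round trip
    return client.get_table_meta("|".join(authorized_dbs))
```

With many authorized databases, `per_db` issues one RPC each while `single_pattern` issues exactly one.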
[jira] [Created] (HIVE-21342) Analyze compute stats for columns leaves behind a staging dir on HDFS
Rajkumar Singh created HIVE-21342: - Summary: Analyze compute stats for columns leaves behind a staging dir on HDFS Key: HIVE-21342 URL: https://issues.apache.org/jira/browse/HIVE-21342 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.0 Environment: hive-3.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh The staging dir cleanup does not happen for "analyze table .. compute statistics for columns"; this leaves a stale directory on HDFS. The problem seems to be ColumnStatsSemanticAnalyzer, which does not set the HDFS cleanup flag on the context. https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java#L310
[jira] [Created] (HIVE-21499) should not remove the function if create command failed with AlreadyExistsException
Rajkumar Singh created HIVE-21499: - Summary: should not remove the function if create command failed with AlreadyExistsException Key: HIVE-21499 URL: https://issues.apache.org/jira/browse/HIVE-21499 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.0 Environment: Hive-3.1 Reporter: Rajkumar Singh As a part of HIVE-20953 we remove the function if its creation failed for any reason; this yields the following situation: 1. create function fails because the function already exists 2. on the failure in #1, Hive clears the permanent function from the registry 3. the function is of no use until HiveServer2 is restarted.
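The proposed guard can be sketched as follows. `Registry`, `AlreadyExistsError`, and `create_with_cleanup` are hypothetical stand-ins for Hive's function registry, not its actual classes; the point is that cleanup must not run when the failure was "already exists", since that would unregister the existing, working function.

```python
# Illustrative sketch: only roll a function out of the in-memory registry
# when creation failed for a reason OTHER than "already exists".

class AlreadyExistsError(Exception):
    pass

class Registry:
    def __init__(self):
        self.functions = {}

    def create_function(self, name, impl):
        if name in self.functions:
            raise AlreadyExistsError(name)
        self.functions[name] = impl

def create_with_cleanup(registry, name, impl):
    try:
        registry.create_function(name, impl)
    except AlreadyExistsError:
        # the existing (working) function must stay registered
        raise
    except Exception:
        # genuine failure: remove the half-created entry, then re-raise
        registry.functions.pop(name, None)
        raise
```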
[jira] [Created] (HIVE-21538) Beeline: password sourced through the console reader is not passed to the connection params
Rajkumar Singh created HIVE-21538: - Summary: Beeline: password sourced through the console reader is not passed to the connection params Key: HIVE-21538 URL: https://issues.apache.org/jira/browse/HIVE-21538 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.0 Environment: Hive-3.1 auth set to LDAP Reporter: Rajkumar Singh Beeline: the password sourced through the console reader is not passed to the connection params; this yields an authentication failure in case of LDAP authentication. {code} beeline -n USER -u "jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -p Connecting to jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER Enter password for jdbc:hive2://host:2181/: 19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to host:1 19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs from ZooKeeper Unknown HS2 problem when communicating with Thrift server. Error: Could not open client transport for any of the Server URI's in ZooKeeper: Peer indicated failure: PLAIN auth failed: javax.security.sasl.AuthenticationException: Error validating LDAP user [Caused by javax.naming.AuthenticationException: [LDAP: error code 49 - 80090308: LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 52e, v2580]] (state=08S01,code=0) {code}
[jira] [Created] (HIVE-21601) Hive JDBC Storage Handler queries fail because the projected timestamp max precision is not valid for MySQL
Rajkumar Singh created HIVE-21601: - Summary: Hive JDBC Storage Handler query fail because projected timestamp max precision is not valid for mysql Key: HIVE-21601 URL: https://issues.apache.org/jira/browse/HIVE-21601 Project: Hive Issue Type: Bug Components: Hive, JDBC Affects Versions: 3.1.1 Environment: Hive-3.1 Reporter: Rajkumar Singh Steps to reproduce: {code} --mysql table mysql> show create table dd_timestamp_error; ++--+ | Table | Create Table | ++--+ | dd_timestamp_error | CREATE TABLE `dd_timestamp_error` ( `col1` text, `col2` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 | ++--+ 1 row in set (0.00 sec) -- hive table ++ | createtab_stmt | ++ | CREATE EXTERNAL TABLE `dd_timestamp_error`(| | `col1` string COMMENT 'from deserializer', | | `col2` timestamp COMMENT 'from deserializer')| | ROW FORMAT SERDE | | 'org.apache.hive.storage.jdbc.JdbcSerDe' | | STORED BY | | 'org.apache.hive.storage.jdbc.JdbcStorageHandler' | | WITH SERDEPROPERTIES ( | | 'serialization.format'='1') | | TBLPROPERTIES (| | 'bucketing_version'='2', | | 'hive.sql.database.type'='MYSQL',| | 'hive.sql.dbcp.maxActive'='1', | | 'hive.sql.dbcp.password'='testuser', | | 'hive.sql.dbcp.username'='testuser', | | 'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver', | | 'hive.sql.jdbc.url'='jdbc:mysql://c46-node3.squadron-labs.com/test', | | 'hive.sql.table'='dd_timestamp_error', | | 'transient_lastDdlTime'='1554910389')| ++ --query failure 0: jdbc:hive2://c46-node2.squadron-labs.com:2> select * from dd_timestamp_error where col2 = '2019-04-03 15:54:21.543654'; Error: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'TIMESTAMP(9)) AS `col2` -- explain select * from dd_timestamp_error 
where col2 = '2019-04-03 15:54:21.543654'; TableScan [TS_0] | | Output:["col1","col2"],properties:{"hive.sql.query":"SELECT `col1`, CAST(TIMESTAMP '2019-04-03 15:54:21.543654000' AS TIMESTAMP(9)) AS `col2`\nFROM `dd_timestamp_error`\nWHERE `col2` = TIMESTAMP '2019-04-03 15:54:21.543654000'","hive.sql.query.fieldNames":"col1,col2","hive.sql.query.fieldTypes":"string,timestamp","hive.sql.query.split":"true"} | | {code} The problem seems to be with convertedFilterExpr ( -- where col2 = '2019-04-03 15:54:21.543654';) while comparing a timestamp with a constant: https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java#L856 https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveTypeSystemImpl.java#L38 Hive's MAX_TIMESTAMP_PRECISION is 9, and it appears that Hive pushes that same precision into the query projection (JDBC project) for MySQL, which fails the query since MySQL caps fractional-seconds precision at 6.
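MySQL's TIMESTAMP(fsp) accepts at most fsp=6, so the generated TIMESTAMP(9) cast is a syntax error. A minimal sketch of the kind of clamping fix, with illustrative names (the function and the per-database limit table below are assumptions, not Hive's actual code):

```python
# Sketch: clamp the precision pushed into the generated JDBC query to the
# target database's maximum fractional-seconds precision. The limits table
# and function name are illustrative.

HIVE_MAX_TIMESTAMP_PRECISION = 9   # per HiveTypeSystemImpl
DB_MAX_FSP = {"MYSQL": 6, "POSTGRES": 6, "ORACLE": 9}  # assumed limits

def projected_timestamp_type(db_type, precision=HIVE_MAX_TIMESTAMP_PRECISION):
    # fall back to Hive's own precision when the database is unknown
    fsp = min(precision, DB_MAX_FSP.get(db_type, precision))
    return f"TIMESTAMP({fsp})"
```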
[jira] [Created] (HIVE-21728) WorkloadManager logging fix
Rajkumar Singh created HIVE-21728: - Summary: WorkloadManager logging fix Key: HIVE-21728 URL: https://issues.apache.org/jira/browse/HIVE-21728 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.2.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh The logger skips the following message if HS2 is running at INFO level. https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L705
[jira] [Created] (HIVE-21902) HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header
Rajkumar Singh created HIVE-21902: - Summary: HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header Key: HIVE-21902 URL: https://issues.apache.org/jira/browse/HIVE-21902 Project: Hive Issue Type: Improvement Reporter: Rajkumar Singh Some vulnerabilities are reported for the web server UI: X-Frame-Options or Content-Security-Policy: frame-ancestors HTTP headers missing on port 10002. {code} GET / HTTP/1.1 Host: HOSTNAME:10002 Connection: Keep-Alive X-XSS-Protection HTTP Header missing on port 10002. X-Content-Type-Options HTTP Header missing on port 10002. {code} After the proposed changes: {code} HTTP/1.1 200 OK Date: Thu, 20 Jun 2019 05:29:59 GMT Content-Type: text/html;charset=utf-8 X-Content-Type-Options: nosniff X-FRAME-OPTIONS: SAMEORIGIN X-XSS-Protection: 1; mode=block Set-Cookie: JSESSIONID=15kscuow9cmy7qms6dzaxllqt;Path=/ Expires: Thu, 01 Jan 1970 00:00:00 GMT Content-Length: 3824 Server: Jetty(9.3.25.v20180904) {code}
[jira] [Created] (HIVE-21903) HiveServer2 Query Compilation fails with StackOverflowError if the IN list is too big
Rajkumar Singh created HIVE-21903: - Summary: HiveServer2 Query Compilation fails with StackOverflowError if the IN list is too big Key: HIVE-21903 URL: https://issues.apache.org/jira/browse/HIVE-21903 Project: Hive Issue Type: Bug Affects Versions: 3.1.0 Environment: Hive-3.1, java-8, thread stack size default set to 1024k Reporter: Rajkumar Singh Attachments: thread-progress.log Steps to Reproduce: run a query including some joins and an IN clause containing more than 15000 values. Attaching the handler thread's progress before it runs into the StackOverflowError.
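The failure mode can be reproduced in miniature: if a compiler walks a 15000-plus element IN list as a deeply nested expression tree, it recurses once per element and exhausts the thread stack. The Python sketch below is an analogy only (Hive's compiler is Java and the default 1024k thread stack is what overflows there); the tree shape and `tree_depth` walker are invented for illustration.

```python
# Miniature analogy of the StackOverflowError: a large IN list represented
# as a left-deep nested OR tree forces one recursive call per element, so
# a 20000-element list overflows long before the walk completes.

def nest_in_list(values):
    # build ("or", left_subtree, ("eq", "col", v)) left-deep, iteratively
    tree = ("eq", "col", values[0])
    for v in values[1:]:
        tree = ("or", tree, ("eq", "col", v))
    return tree

def tree_depth(node):
    if node[0] == "eq":
        return 1
    return 1 + tree_depth(node[1])  # one stack frame per IN element

tree = nest_in_list(list(range(20000)))
try:
    tree_depth(tree)
    overflowed = False
except RecursionError:  # Python's analogue of Java's StackOverflowError
    overflowed = True
```

An iterative walk (explicit stack) or flattening the IN list into a single n-ary node avoids the unbounded recursion.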
[jira] [Created] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies
Rajkumar Singh created HIVE-21927: - Summary: HiveServer Web UI: Setting the HttpOnly option in the cookies Key: HIVE-21927 URL: https://issues.apache.org/jira/browse/HIVE-21927 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh The intent of this JIRA is to introduce the HttpOnly option in the cookie. Cookie before the change: {code:java} hdp32b FALSE / FALSE 0 JSESSIONID 8dkibwayfnrc4y4hvpu3vh74 {code} After the change: {code:java} #HttpOnly_hdp32b FALSE / FALSE 0 JSESSIONID e1npdkbo3inj1xnd6gdc6ihws {code}
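What the HttpOnly attribute looks like on the wire can be shown with Python's stdlib cookie class (the HS2 change itself is made on the Jetty session handler; this is just an illustration, and the JSESSIONID value is copied from the example above):

```python
# Sketch: the HttpOnly attribute on a session cookie, rendered as a
# Set-Cookie header. HttpOnly tells browsers to hide the cookie from
# JavaScript (document.cookie), mitigating session theft via XSS.
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["JSESSIONID"] = "e1npdkbo3inj1xnd6gdc6ihws"
cookie["JSESSIONID"]["path"] = "/"
cookie["JSESSIONID"]["httponly"] = True

header = cookie.output(header="Set-Cookie:")
```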
[jira] [Created] (HIVE-21935) Hive Vectorization : Server performance issue with vectorized UDF
Rajkumar Singh created HIVE-21935: - Summary: Hive Vectorization : Server performance issue with vectorized UDF Key: HIVE-21935 URL: https://issues.apache.org/jira/browse/HIVE-21935 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 3.1.1 Environment: Hive-3, JDK-8 Reporter: Rajkumar Singh With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we were seeing severe performance degradation. Looking at the task jstacks, it seems it is running the code that vectorizes the UDF and is stuck in some loop. {code:java} jstack -l 14954 | grep 0x3af0 -A20 "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 runnable [0x7f1547581000] java.lang.Thread.State: RUNNABLE at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573) at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350) at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205) at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271) at org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) at
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) [yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20 "TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 runnable [0x7f1547581000] java.lang.Thread.State: RUNNABLE at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554) at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570) at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350) at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205) at org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150) at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271) at org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146) at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889) at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92) at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) {code} after setting the hive.vectori
[jira] [Created] (HIVE-21972) "show transactions" displays the header twice
Rajkumar Singh created HIVE-21972: - Summary: "show transactions" displays the header twice Key: HIVE-21972 URL: https://issues.apache.org/jira/browse/HIVE-21972 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh show transactions;
| txnid | state | startedtime | lastheartbeattime | user | host |
| Transaction ID | Transaction State | Started Time | Last Heartbeat Time | User | Hostname |
| 896 | ABORTED | 1560209607000 | 1560209607000 | hive | hdp32b.hdp.local |
[jira] [Created] (HIVE-21973) "show locks" prints the header twice.
Rajkumar Singh created HIVE-21973: - Summary: "show locks" prints the header twice. Key: HIVE-21973 URL: https://issues.apache.org/jira/browse/HIVE-21973 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh show locks; -- output {code:java}
| lockid | database | table | partition | lock_state | blocked_by | lock_type | transaction_id | last_heartbeat | acquired_at | user | hostname | agent_info |
| Lock ID | Database | Table | Partition | State | Blocked By | Type | Transaction ID | Last Heartbeat | Acquired At | User | Hostname | Agent Info |
{code}
[jira] [Created] (HIVE-21986) HiveServer Web UI: Setting the Strict-Transport-Security in default response header
Rajkumar Singh created HIVE-21986: - Summary: HiveServer Web UI: Setting the Strict-Transport-Security in default response header Key: HIVE-21986 URL: https://issues.apache.org/jira/browse/HIVE-21986 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Currently, the HiveServer UI HTTP response header does not have Strict-Transport-Security set, so we will add it to the default headers.
[jira] [Created] (HIVE-22081) Hivemetastore Performance: Compaction Initiator thread overwhelmed if there are too many tables/partitions eligible for compaction
Rajkumar Singh created HIVE-22081: - Summary: Hivemetastore Performance: Compaction Initiator thread overwhelmed if there are too many tables/partitions eligible for compaction Key: HIVE-22081 URL: https://issues.apache.org/jira/browse/HIVE-22081 Project: Hive Issue Type: Improvement Components: Transactions Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh If automatic compaction is turned on, the Initiator thread checks for potential tables/partitions that are eligible for compaction and runs some checks in a for loop before requesting compaction for the eligible ones. Though the Initiator thread is configured to run at a 5-minute interval by default, with many objects it keeps on running, as these checks are IO intensive and hog CPU. In the proposed changes, I am planning to: 1. pass fewer objects to the for loop by filtering out objects based on the condition currently checked within the loop 2. determine the compaction type asynchronously using futures (this is where we make FileSystem calls)
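The two proposed changes can be sketched together. All names below (`cheap_filter`, `determine_compaction_type`, the candidate dicts, the thresholds) are illustrative stand-ins, not Hive's Initiator code; the shape is what matters: a cheap pre-filter, then the IO-heavy check fanned out through futures.

```python
# Sketch of the proposal: (1) filter candidates cheaply before the main
# loop, (2) resolve the IO-heavy compaction-type check asynchronously.
from concurrent.futures import ThreadPoolExecutor

def cheap_filter(candidate):
    # stand-in for the inexpensive metadata checks done inline (change 1)
    return candidate.get("delta_count", 0) > 0

def determine_compaction_type(candidate):
    # stand-in for the expensive FileSystem inspection (change 2)
    return "MAJOR" if candidate["delta_pct"] >= 0.1 else "MINOR"

def find_compactions(candidates, workers=4):
    eligible = [c for c in candidates if cheap_filter(c)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with `eligible`
        types = pool.map(determine_compaction_type, eligible)
    return [(c["name"], t) for c, t in zip(eligible, types)]
```

The pre-filter shrinks the expensive fan-out, and the executor keeps the Initiator's own loop from blocking on each FileSystem call in turn.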
[jira] [Created] (HIVE-22118) compaction worker thread won't log the table name while skipping the compaction because it is a sorted table/partition
Rajkumar Singh created HIVE-22118: - Summary: compaction worker thread won't log the table name while skipping the compaction because it is a sorted table/partition Key: HIVE-22118 URL: https://issues.apache.org/jira/browse/HIVE-22118 Project: Hive Issue Type: Improvement Components: Transactions Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Attachments: HIVE-22118.patch From a debugging perspective, it is good to log the full table name while skipping the table for compaction; otherwise it is tedious to find out why compaction is not happening for the target table.
[jira] [Created] (HIVE-22144) HiveServer Web UI: Adding secure flag to the cookies options
Rajkumar Singh created HIVE-22144: - Summary: HiveServer Web UI: Adding secure flag to the cookies options Key: HIVE-22144 URL: https://issues.apache.org/jira/browse/HIVE-22144 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 3.1.1 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Introduce a secure flag in the cookie options.
[jira] [Created] (HIVE-22173) HiveServer2: Query with multiple lateral views hangs forever during compile stage
Rajkumar Singh created HIVE-22173: - Summary: HiveServer2: Query with multiple lateral view hung forever during compile stage Key: HIVE-22173 URL: https://issues.apache.org/jira/browse/HIVE-22173 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.1 Environment: Hive-3.1.1, Java-8 Reporter: Rajkumar Singh Steps To Repro: {code:java} -- create table CREATE EXTERNAL TABLE `jsontable`( `json_string` string) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ; -- Run explain of the query explain SELECT * FROM jsontable lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9 
lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t17 as c17 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t18 as c18 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t19 as c19 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t20 as c20 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t21 as c21 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield'), "\\[|\\]|\"", ""),',')) t22 as c22 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.postalCode'), "\\[|\\]|\"", ""),',')) t23 as c23 
lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.postalCode'), "\\[|\\]|\"", ""),',')) t24 as c24 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.state'), "\\[|\\]|\"", ""),',')) t25 as c25 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.state'), "\\[|\\]|\"", ""),',')) t26 as c26 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2'), "\\[|\\]|\"", ""),',')) t27 as c27 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfield2.streetAddressLine'), "\\[|\\]|\"", ""),',')) t28 as c28 lateral view explode(split(regexp_replace(get_json_object(jsontable.json_string, '$.jsonfi
[jira] [Created] (HIVE-22255) Hive doesn't trigger major compaction automatically if the table contains only base files
Rajkumar Singh created HIVE-22255: - Summary: Hive doesn't trigger major compaction automatically if the table contains only base files Key: HIVE-22255 URL: https://issues.apache.org/jira/browse/HIVE-22255 Project: Hive Issue Type: Bug Components: Hive, Transactions Affects Versions: 3.1.2 Environment: Hive-3.1.1 Reporter: Rajkumar Singh A user may run into this issue if the table consists of only base files and no deltas; then the following condition yields false and automatic major compaction is skipped. [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313] Steps to Reproduce: # create an ACID table {code:java} // create table myacid(id int); {code} # run multiple insert overwrites {code:java} // insert overwrite table myacid values(1);insert overwrite table myacid values(2),(3),(4){code} # DFS ls output {code:java} // dfs -ls -R /warehouse/tablespace/managed/hive/myacid; ++ | DFS Output | ++ | drwxrwx---+ - hive hadoop 0 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_001 | | -rw-rw+ 3 hive hadoop 1 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_001/_orc_acid_version | | -rw-rw+ 3 hive hadoop 610 2019-09-27 16:42 /warehouse/tablespace/managed/hive/myacid/base_001/bucket_0 | | drwxrwx---+ - hive hadoop 0 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_002 | | -rw-rw+ 3 hive hadoop 1 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_002/_orc_acid_version | | -rw-rw+ 3 hive hadoop 633 2019-09-27 16:43 /warehouse/tablespace/managed/hive/myacid/base_002/bucket_0 | ++ {code}
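The gist of the skipped condition can be shown in miniature. This is an illustrative simplification of the check around Initiator.java#L313, not the actual code: if eligibility requires at least one delta directory, a table written only by INSERT OVERWRITE (all `base_*` directories, as in the DFS listing above) is never considered.

```python
# Miniature version of the Initiator eligibility check. A base-only table
# (no delta_* dirs) never passes the delta-count condition, so major
# compaction is never requested for it. Function names are illustrative.

def needs_compaction_buggy(dirs):
    deltas = [d for d in dirs if d.startswith("delta_")]
    return len(deltas) > 0  # yields False for a base-only table

def needs_compaction_fixed(dirs):
    deltas = [d for d in dirs if d.startswith("delta_")]
    bases = [d for d in dirs if d.startswith("base_")]
    # multiple base directories also deserve a major compaction,
    # so the older bases can be cleaned up
    return len(deltas) > 0 or len(bases) > 1
```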
[jira] [Created] (HIVE-22317) Beeline site parser does not handle the variable substitution correctly
Rajkumar Singh created HIVE-22317: - Summary: Beeline site parser does not handle the variable substitution correctly Key: HIVE-22317 URL: https://issues.apache.org/jira/browse/HIVE-22317 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 4.0.0 Environment: Hive-4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh beeline-site.xml (property names and values): {code:java}
beeline.hs2.jdbc.url.container = jdbc:hive2://c3220-node2.host.com:2181,c3220-node3.host.com:2181,c3220-node4.host.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
beeline.hs2.jdbc.url.default = test
beeline.hs2.jdbc.url.test = ${beeline.hs2.jdbc.url.container}?tez.queue.name=myqueue
beeline.hs2.jdbc.url.llap = jdbc:hive2://c3220-node2.host.com:2181,c3220-node3.host.com:2181,c3220-node4.host.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive
{code} Beeline fails to connect because it does not parse the substituted value correctly: {code:java} beeline Error in parsing jdbc url: ${beeline.hs2.jdbc.url.container}?tez.queue.name=myqueue from beeline-site.xml beeline> {code}
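The expected behaviour is Hadoop-configuration-style expansion: `${beeline.hs2.jdbc.url.container}` should be replaced with that property's value before the JDBC URL is parsed. A minimal sketch of such a resolver (a hypothetical helper, not Beeline's actual parser):

```python
# Sketch: resolve ${property} references in beeline-site.xml values before
# handing the string to the JDBC URL parser. The recursion is bounded so a
# cyclic reference cannot loop forever.
import re

def resolve(props, key, depth=10):
    value = props[key]
    for _ in range(depth):
        expanded = re.sub(
            r"\$\{([^}]+)\}",
            # leave unknown references untouched rather than failing
            lambda m: props.get(m.group(1), m.group(0)),
            value,
        )
        if expanded == value:
            return expanded  # fixed point: nothing left to substitute
        value = expanded
    return value
```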
[jira] [Created] (HIVE-22352) Hive JDBC Storage Handler, simple select query fails with NPE if executed using Fetch Task
Rajkumar Singh created HIVE-22352: - Summary: Hive JDBC Storage Handler, simple select query failed with NPE if executed using Fetch Task Key: HIVE-22352 URL: https://issues.apache.org/jira/browse/HIVE-22352 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.1 Environment: Hive-3.1 Reporter: Rajkumar Singh Steps To Repro: {code:java} // MySQL Table CREATE TABLE `visitors` ( `id` bigint(20) unsigned NOT NULL, `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ) // hive table CREATE EXTERNAL TABLE `hive_visitors`( `col1` bigint COMMENT 'from deserializer', `col2` timestamp COMMENT 'from deserializer') ROW FORMAT SERDE 'org.apache.hive.storage.jdbc.JdbcSerDe' STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler' WITH SERDEPROPERTIES ( 'serialization.format'='1') TBLPROPERTIES ( 'bucketing_version'='2', 'hive.sql.database.type'='MYSQL', 'hive.sql.dbcp.maxActive'='1', 'hive.sql.dbcp.password'='hive', 'hive.sql.dbcp.username'='hive', 'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver', 'hive.sql.jdbc.url'='jdbc:mysql://hostname/test', 'hive.sql.table'='visitors', 'transient_lastDdlTime'='1554910389') Query: select * from hive_visitors ; Exception: 2019-10-16T04:04:39,483 WARN [HiveServer2-Handler-Pool: Thread-71]: thrift.ThriftCLIService (:()) - Error fetching results: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.NullPointerException at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) ~[?:?] 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?] at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at com.sun.proxy.$Proxy42.fetchResults(Unknown Source) ~[?:?] 
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:565) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_112] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_112] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112] Caused by: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:602) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509) ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Fetch
[jira] [Created] (HIVE-22353) Hive JDBC Storage Handler: Simple query fails with " Invalid table alias or column reference"
Rajkumar Singh created HIVE-22353: - Summary: Hive JDBC Storage Handler: Simple query fails with " Invalid table alias or column reference" Key: HIVE-22353 URL: https://issues.apache.org/jira/browse/HIVE-22353 Project: Hive Issue Type: Bug Components: Hive Reporter: Rajkumar Singh Steps To Repro:
{code:java}
// show create table (Hive)
CREATE EXTERNAL TABLE `hive_visitors`(
  `col1` bigint COMMENT 'from deserializer',
  `col2` timestamp COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hive.storage.jdbc.JdbcSerDe'
STORED BY
  'org.apache.hive.storage.jdbc.JdbcStorageHandler'
WITH SERDEPROPERTIES (
  'serialization.format'='1')
TBLPROPERTIES (
  'bucketing_version'='2',
  'hive.sql.database.type'='MYSQL',
  'hive.sql.dbcp.maxActive'='1',
  'hive.sql.dbcp.password'='hive',
  'hive.sql.dbcp.username'='hive',
  'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver',
  'hive.sql.jdbc.url'='jdbc:mysql://hostname/test',
  'hive.sql.table'='visitors',
  'transient_lastDdlTime'='1554910389')

// MySql Table
CREATE TABLE `visitors` (
  `id` bigint(20) unsigned NOT NULL,
  `date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
)

// Hive Query
select * from hive_visitors where col2='2018-10-15';

Error: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:34 Invalid table alias or column reference 'col2': (possible column names are: id, date) (state=42000,code=10004)
{code}
col2 is a valid column reference for the Hive table. In some older versions I was able to run the query referencing the Hive column, but it is broken now. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22397) "describe table" statement for a table backed by a custom storage handler fails with CNF
Rajkumar Singh created HIVE-22397: - Summary: "describe table" statement for a table backed by a custom storage handler fails with CNF Key: HIVE-22397 URL: https://issues.apache.org/jira/browse/HIVE-22397 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Steps to Repro:
{code:java}
1) describe customsdtable;
2) ADD JAR hdfs:///user/hive/customsdtable.jar;
3) describe customsdtable;

CNF is expected for #1, but even after adding the custom serde, hive fails with the following exception for statement #3:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.ClassNotFoundException
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22467) Hive-1: does not set jetty request.header.size correctly in case of SSL setup
Rajkumar Singh created HIVE-22467: - Summary: Hive-1: does not set jetty request.header.size correctly in case of SSL setup Key: HIVE-22467 URL: https://issues.apache.org/jira/browse/HIVE-22467 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.2.1, 1.0.0, 1.3.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh On hive-1, if the user has SSL set up with HS2, then SslSelectChannelConnector overrides the connector request header settings: [https://github.com/apache/hive/blob/5740946859fcca44b5e453ef02534b1ec5edcbca/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java#L102] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22630) Do not retrieve Materialized View definition for rebuild if query is test SQL
Rajkumar Singh created HIVE-22630: - Summary: Do not retrieve Materialized View definition for rebuild if query is test SQL Key: HIVE-22630 URL: https://issues.apache.org/jira/browse/HIVE-22630 Project: Hive Issue Type: Bug Environment: Hive-3.1.2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh For queries like select 1, select current_timestamp, and select current_date, Hive retrieves all materialized views from the metastore. If the number of databases is large, this call takes a lot of time, and the situation becomes worse if HiveServer2 receives frequent "select 1" queries (connection pools use them to check whether a connection is still valid). -- This message was sent by Atlassian Jira (v8.3.4#803005)
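A minimal, hypothetical sketch of the proposed short-circuit: recognize the handful of liveness queries up front so the compiler can skip the materialized-view registry lookup entirely. The helper name and the normalization rules below are illustrative only, not Hive's actual API.

```java
import java.util.Locale;
import java.util.Set;

public class TestSqlCheck {
    // The "test SQL" statements the JIRA mentions; connection pools typically send "select 1".
    private static final Set<String> TEST_QUERIES = Set.of(
            "select 1", "select current_timestamp", "select current_date");

    // Returns true if the statement is a trivial liveness query, so callers can
    // skip expensive metastore work such as fetching materialized views.
    public static boolean isTestSql(String sql) {
        String normalized = sql.trim()
                .replaceAll(";+\\s*$", "")   // drop trailing semicolons
                .replaceAll("\\s+", " ")     // collapse whitespace
                .toLowerCase(Locale.ROOT);
        return TEST_QUERIES.contains(normalized);
    }

    public static void main(String[] args) {
        System.out.println(TestSqlCheck.isTestSql("SELECT 1;"));       // true
        System.out.println(TestSqlCheck.isTestSql("select * from t")); // false
    }
}
```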
[jira] [Created] (HIVE-22712) ReExecDriver submits the query to the default queue irrespective of the user-defined queue
Rajkumar Singh created HIVE-22712: - Summary: ReExecDriver submits the query to the default queue irrespective of the user-defined queue Key: HIVE-22712 URL: https://issues.apache.org/jira/browse/HIVE-22712 Project: Hive Issue Type: Bug Components: Hive, HiveServer2 Affects Versions: 3.1.2 Environment: Hive-3 Reporter: Rajkumar Singh Assignee: Rajkumar Singh We unset the queue name intentionally in TezSessionState#startSessionAndContainers; as a result, re-execution creates a new session in the default queue, which causes a problem. It is cumbersome to add reexec.overlay.tez.queue.name at the session level. I could not find a better way of setting the queue name (I am open to suggestions here), since it can create a conflict between the global queue name and the user-defined queue; that is why it is set during initialization of ReExecutionOverlayPlugin. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22855) Do not change DB location to external if DB URI already exists or already refers to a non-managed location
Rajkumar Singh created HIVE-22855: - Summary: Do not change DB location to external if DB URI already exists or already refers to a non-managed location Key: HIVE-22855 URL: https://issues.apache.org/jira/browse/HIVE-22855 Project: Hive Issue Type: Bug Components: Hive Environment: Hive-3 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Attachments: HIVE-22855.patch From Spark:
{code:java}
spark.sql("CREATE DATABASE IF NOT EXISTS test LOCATION '/tmp/test'")
spark.sql("describe database test").show(false)
{code}
The describe output shows that the DB URI is updated to the external warehouse path, so all data will be written to the Hive external warehouse path, which is undesired. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-22945) Hive ACID Data Corruption: Update command messes up other column data and produces incorrect results
Rajkumar Singh created HIVE-22945: - Summary: Hive ACID Data Corruption: Update command messes up other column data and produces incorrect results Key: HIVE-22945 URL: https://issues.apache.org/jira/browse/HIVE-22945 Project: Hive Issue Type: Bug Components: Hive, Transactions Affects Versions: 3.2.0 Reporter: Rajkumar Singh The Hive update operation modifies another column incorrectly and produces incorrect results. Steps to reproduce:
{code:java}
CREATE TABLE `test`(
  `start_dt` timestamp,
  `stop_dt` timestamp
);

INSERT INTO test (start_dt, stop_dt) SELECT CURRENT_TIMESTAMP, CAST(NULL AS TIMESTAMP);

select * from test;
+--------------------------+---------------+
|      test.start_dt       | test.stop_dt  |
+--------------------------+---------------+
| 2020-02-28 20:06:29.116  | NULL          |
+--------------------------+---------------+

UPDATE test SET STOP_DT = CURRENT_TIMESTAMP WHERE CAST(START_DT AS DATE) = CURRENT_DATE;

+------------------------+--------------------------+
|     test.start_dt      |       test.stop_dt       |
+------------------------+--------------------------+
| 2020-02-28 00:00:00.0  | 2020-02-28 20:07:12.248  |
+------------------------+--------------------------+
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23408) Hive on Tez: Kafka storage handler broken in secure environment
Rajkumar Singh created HIVE-23408: - Summary: Hive on Tez: Kafka storage handler broken in secure environment Key: HIVE-23408 URL: https://issues.apache.org/jira/browse/HIVE-23408 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh hive.server2.authentication.kerberos.principal is set in the form hive/_HOST@REALM. A Tez task can start on a random NM host and expands the value of _HOST with the fqdn of the host where it is running; this leads to an authentication issue. For LLAP there is a fallback to the LLAP daemon keytab/principal. Kafka 1.1 onwards supports delegation tokens, and we should take advantage of them for Hive on Tez. -- This message was sent by Atlassian Jira (v8.3.4#803005)
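To make the failure mode concrete, here is a simplified sketch of the _HOST substitution that Kerberos-enabled Hadoop services perform (the real logic lives in Hadoop's org.apache.hadoop.security.SecurityUtil.getServerPrincipal; this standalone version is illustrative). The same principal config resolves to different principals on different hosts, so a Tez task running on a NodeManager ends up with a principal that does not match the HiveServer2 keytab.

```java
import java.util.Locale;

public class HostSubstitution {
    // Replace the _HOST placeholder in a "service/_HOST@REALM" principal with
    // the given fully-qualified hostname, as Hadoop services do at startup.
    static String resolvePrincipal(String principalConfig, String fqdn) {
        String[] parts = principalConfig.split("[/@]");
        if (parts.length == 3 && "_HOST".equals(parts[1])) {
            return parts[0] + "/" + fqdn.toLowerCase(Locale.ROOT) + "@" + parts[2];
        }
        return principalConfig;
    }

    public static void main(String[] args) {
        // Same config, different hosts -> different principals; only the first
        // matches the keytab issued for the HiveServer2 host.
        System.out.println(resolvePrincipal("hive/_HOST@EXAMPLE.COM", "hs2.example.com"));
        System.out.println(resolvePrincipal("hive/_HOST@EXAMPLE.COM", "nm42.example.com"));
    }
}
```

The hostnames and realm above are placeholders, not values from the report.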
[jira] [Created] (HIVE-23457) Hive incorrect result with subquery when the optimizer misses the aggregation stage
Rajkumar Singh created HIVE-23457: - Summary: Hive incorrect result with subquery when the optimizer misses the aggregation stage Key: HIVE-23457 URL: https://issues.apache.org/jira/browse/HIVE-23457 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.2.0 Reporter: Rajkumar Singh Steps to Repro:
{code:java}
create table abc (id int);
insert into table abc values (1),(2),(3),(4),(5),(6);

select * from abc order by id desc;
6
5
4
3
2
1

select `id` from (select * from abc order by id desc ) as tmp;
1
2
3
4
5
6
{code}
Looking at the query plan, it seems that with the subquery the optimizer missed the aggregation stage; I can't see any reduce stage.
{code:java}
set hive.query.results.cache.enabled=false;

explain select * from abc order by id desc;
Plan optimized by CBO.

Vertex dependency in root stage
Reducer 2 <- Map 1 (SIMPLE_EDGE)

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Reducer 2 vectorized
      File Output Operator [FS_8]
        Select Operator [SEL_7] (rows=6 width=4)
          Output:["_col0"]
        <-Map 1 [SIMPLE_EDGE] vectorized
          SHUFFLE [RS_6]
            Select Operator [SEL_5] (rows=6 width=4)
              Output:["_col0"]
              TableScan [TS_0] (rows=6 width=4)
                default@abc,abc, ACID table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"]

explain select `id` from (select * from abc order by id desc ) as tmp;
Plan optimized by CBO.

Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_1]
      Output:["_col0"]
      TableScan [TS_0]
        Output:["id"]
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23498) Disable HTTP Trace method on ThriftServer
Rajkumar Singh created HIVE-23498: - Summary: Disable HTTP Trace method on ThriftServer Key: HIVE-23498 URL: https://issues.apache.org/jira/browse/HIVE-23498 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23512) ReplDumpTask: Adding debug to print opentxn for debugging perspective
Rajkumar Singh created HIVE-23512: - Summary: ReplDumpTask: Adding debug to print opentxn for debugging perspective Key: HIVE-23512 URL: https://issues.apache.org/jira/browse/HIVE-23512 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 3.2.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Often we see ReplDumpTask waiting for hive.repl.bootstrap.dump.open.txn.timeout (1h) to kill open txns and make progress. The only way to know which txns it is waiting on is to query the metastore DB and backtrack the txns in the HS2 logs to determine whether the open txns are genuinely open for this long or there is some other issue. I am adding a debug log to print these txns, which can help in debugging such issues. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23542) Query based compaction failing with ClassCastException
Rajkumar Singh created HIVE-23542: - Summary: Query based compaction failing with ClassCastException Key: HIVE-23542 URL: https://issues.apache.org/jira/browse/HIVE-23542 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh Steps to repro:
create table test(id int);
insert into table test values (1),(2),(3);
insert into table test values (4),(5),(6);
Run the query-based compactor and it will fail with the following exception:
{code:java}
alter table test compact 'major'; -- query based compaction

Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
... 19 more
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to org.apache.hadoop.io.IntWritable
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:967)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23675) WebHcat: java level deadlock in hcat in presence of InMemoryJAAS
Rajkumar Singh created HIVE-23675: - Summary: WebHcat: java level deadlock in hcat in presence of InMemoryJAAS Key: HIVE-23675 URL: https://issues.apache.org/jira/browse/HIVE-23675 Project: Hive Issue Type: Improvement Reporter: Rajkumar Singh ENV: Kerberos/SPNEGO enabled; hive.exec.post.hook is set to org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.atlas.hive.hook.HiveHook, and the Atlas hook uses InMemoryJAASConfiguration. This is the sequence of events when the issue reproduces: WebHcat -> hcat -> Hive Driver -> post hook execution creates ATSHook -> the hook starts the SPNEGO auth and gets stuck while finding the InMemoryJAASConfiguration used by the AtlasHook (this happens in a separate thread, ATS Logger). Hcat jstack:
{code:java}
Found one Java-level deadlock:
=
"ATS Logger 0": waiting to lock monitor 0x7efdc8003a38 (object 0xf3fcfe28, a org.apache.atlas.plugin.classloader.AtlasPluginClassLoader), which is held by "main"
"main": waiting to lock monitor 0x7efdc8003da8 (object 0xc0050d40, a org.apache.hadoop.hive.ql.exec.UDFClassLoader), which is held by "ATS Logger 0"

Java stack information for the threads listed above:
===
"ATS Logger 0":
at org.apache.atlas.security.InMemoryJAASConfiguration.getAppConfigurationEntry(InMemoryJAASConfiguration.java:238)
at sun.security.jgss.LoginConfigImpl.getAppConfigurationEntry(LoginConfigImpl.java:145)
at javax.security.auth.login.LoginContext.init(LoginContext.java:251)
at javax.security.auth.login.LoginContext.<init>(LoginContext.java:512)
at sun.security.jgss.GSSUtil.login(GSSUtil.java:256)
at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:158)
at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:335)
at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:331)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:330)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:145)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at sun.security.jgss.spnego.SpNegoContext.GSS_initSecContext(SpNegoContext.java:882)
at sun.security.jgss.spnego.SpNegoContext.initSecContext(SpNegoContext.java:317)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at sun.net.www.protocol.http.spnego.NegotiatorImpl.init(NegotiatorImpl.java:108)
at sun.net.www.protocol.http.spnego.NegotiatorImpl.<init>(NegotiatorImpl.java:117)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at sun.net.www.protocol.http.Negotiator.getNegotiator(Negotiator.java:63)
at sun.net.www.protocol.http.NegotiateAuthentication.isSupportedImpl(NegotiateAuthentication.java:130)
- locked <0xf48c4d90> (a java.lang.Class for sun.net.www.protocol.http.NegotiateAuthentication)
at sun.net.www.protocol.http.NegotiateAuthentication.isSupported(NegotiateAuthentication.java:102)
- locked <0xc0050d40> (a org.apache.hadoop.hive.ql.exec.UDFClassLoader)
at sun.net.www.protocol.http.AuthenticationHeader.parse(AuthenticationHeader.java:180)
at sun.net.www.protocol.http.AuthenticationHeader.<init>(AuthenticationHeader.java:126)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1660)
- locked <0xf47b7298> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
- locked <0xf47b7298> (a sun.net.www.protocol.https.DelegateHttpsURLConnection)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:191)
at org.apache.hadoop.security.toke
[jira] [Created] (HIVE-23752) Cast as Date for invalid date produce the valid output
Rajkumar Singh created HIVE-23752: - Summary: Cast as Date for invalid date produces a valid output Key: HIVE-23752 URL: https://issues.apache.org/jira/browse/HIVE-23752 Project: Hive Issue Type: Bug Components: Hive Reporter: Rajkumar Singh Hive-3:
{code:java}
select cast("-00-00" as date);
0002-11-30
select cast("2010-27-54" as date);
2012-04-23
select cast("1992-00-74" as date);
1992-02-12
{code}
The reason Hive allows this is that the parser's ResolverStyle is set to LENIENT: https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/common/src/java/org/apache/hadoop/hive/common/type/Date.java#L50. This seems to be an intentional change, as switching the ResolverStyle to STRICT starts failing the tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
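The lenient resolution can be reproduced directly with java.time, which Hive's Date parser is built on. In this standalone sketch the uuuu-MM-dd pattern is illustrative, not Hive's exact formatter; leniency rolls out-of-range fields forward instead of rejecting them, which is why month 27 / day 54 resolves to a real date.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class LenientCastDemo {
    // "uuuu" is the proleptic-year field; unlike "yyyy" it also works under STRICT.
    static final DateTimeFormatter LENIENT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.LENIENT);
    static final DateTimeFormatter STRICT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    static String lenientParse(String s) {
        return LocalDate.parse(s, LENIENT).toString();
    }

    static boolean strictAccepts(String s) {
        try {
            LocalDate.parse(s, STRICT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Month 27 rolls 2010-01 forward 26 months to 2012-03; day 54 then rolls into April.
        System.out.println(lenientParse("2010-27-54")); // 2012-04-23, matching the cast above
        // Month 00 rolls back to 1991-12; day 74 rolls forward into February.
        System.out.println(lenientParse("1992-00-74")); // 1992-02-12
        System.out.println(strictAccepts("2010-27-54")); // false: STRICT rejects it
    }
}
```

This matches the observed cast results, and also shows why flipping the ResolverStyle to STRICT changes behavior for any query relying on the lenient roll-forward.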
[jira] [Created] (HIVE-23753) Make LLAP Secretmanager token path configurable
Rajkumar Singh created HIVE-23753: - Summary: Make LLAP Secretmanager token path configurable Key: HIVE-23753 URL: https://issues.apache.org/jira/browse/HIVE-23753 Project: Hive Issue Type: Bug Components: llap Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh In a very busy LLAP cluster, if for some reason the tokens under the zkdtsm_hive_llap0 zk path are not cleaned, then LLAP daemon startup takes a very long time; this may lead to a service outage if the LLAP daemons are not started and the number of retries while checking the LLAP app status is exceeded. Looking at the jstack of the LLAP daemon, it seems to traverse the zkdtsm_hive_llap0 zk path before starting the secret manager.
{code:java}
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1386)
- locked <0x7fef36cdd338> (a org.apache.zookeeper.ClientCnxn$Packet)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:142)
at org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:138)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.internalRebuildNode(PathChildrenCache.java:591)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:331)
at org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300)
at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:370)
at org.apache.hadoop.hive.llap.security.SecretManager.startThreads(SecretManager.java:82)
at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:223)
at org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:218)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:218)
at org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:212)
at org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.<init>(LlapDaemon.java:279)
{code}
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-23808) "MSCK REPAIR.. DROP Partitions fail" with kryo Exception
Rajkumar Singh created HIVE-23808: - Summary: "MSCK REPAIR.. DROP Partitions fail" with kryo Exception Key: HIVE-23808 URL: https://issues.apache.org/jira/browse/HIVE-23808 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.2.0 Reporter: Rajkumar Singh Steps to repro:
1. Create an external partitioned table.
2. Remove some partition manually by using the hdfs dfs -rm command.
3. Run "MSCK REPAIR.. DROP Partitions", and it will fail with the following exception:
{code:java}
2020-07-06 10:42:11,434 WARN org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork: [HiveServer2-Background-Pool: Thread-210]: Exception thrown while processing using a batch size 2
org.apache.hadoop.hive.metastore.utils.MetastoreException: MetaException(message:Index: 117, Size: 0)
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:479) ~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:432) ~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork.run(RetryUtilities.java:91) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.Msck.dropPartitionsInBatches(Msck.java:496) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.Msck.repair(Msck.java:223) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.ddl.misc.msck.MsckOperation.execute(MsckOperation.java:74) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at java.security.AccessController.doPrivileged(Native Method) [?:1.8.0_242]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_242]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) [hadoop-common-3.1.1.7.1.1.0-565.jar:?]
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340) [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.lang.IndexOutOfBoundsException: Index: 117, Size: 0
at java.util.ArrayList.rangeChec
[jira] [Created] (HIVE-23867) Truncate table fails with AccessControlException if doAs is enabled and the table's database is a source of replication
Rajkumar Singh created HIVE-23867: - Summary: Truncate table fails with AccessControlException if doAs is enabled and the table's database is a source of replication Key: HIVE-23867 URL: https://issues.apache.org/jira/browse/HIVE-23867 Project: Hive Issue Type: Bug Components: Hive, repl Affects Versions: 3.1.1 Reporter: Rajkumar Singh Steps to repro:
1. Enable doAs.
2. As some user (not a superuser), create a database: create database sampledb with dbproperties('repl.source.for'='1,2,3');
3. Create a table: create table sampledb.sampletble (id int);
4. Insert some data into it: insert into sampledb.sampletble values (1),(2),(3);
5. Run the truncate command on the table, which fails with the following error:
{code:java}
org.apache.hadoop.ipc.RemoteException: User username is not a super user (non-super user cannot change owner).
at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1907)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:866)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:531)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1498) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1444) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1354) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at com.sun.proxy.$Proxy31.setOwner(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setOwner(ClientNamenodeProtocolTranslatorPB.java:470) ~[hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at com.sun.proxy.$Proxy32.setOwner(Unknown Source) [?:?]
at org.apache.hadoop.hdfs.DFSClient.setOwner(DFSClient.java:1914) [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1764) [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1761) [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.setOwner(DistributedFileSystem.java:1774) [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
at org.apache.hadoop.hive.metastore.ReplChangeManager.recycle(ReplChangeManager.java:238) [hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
at org.apache.hadoop.hive.metastore.ReplChangeManager.recycle(ReplChangeManager.java:191) [hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152
[jira] [Created] (HIVE-23886) Filter query on external table produces no results if hive.metastore.expression.proxy is set to MsckPartitionExpressionProxy
Rajkumar Singh created HIVE-23886: - Summary: Filter query on external table produces no results if hive.metastore.expression.proxy is set to MsckPartitionExpressionProxy Key: HIVE-23886 URL: https://issues.apache.org/jira/browse/HIVE-23886 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.0 Reporter: Rajkumar Singh
A query such as "select count(1) from tpcds_10_parquet.store_returns where sr_returned_date_sk=2452802" returns a row count of 0 even though the partition has plenty of rows in it. Upon investigation, I found that the partition list passed during StatsUtils.getNumRows is of zero size:
https://github.com/apache/hive/blob/ccaf783a198e142b408cb57415c4262d27b45831/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L438-L439
The partition list is retrieved during partition pruning:
https://github.com/apache/hive/blob/36bf7f00731e3b95af3e5eeaa4ce39b375974a74/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java#L439
Hive serializes this filter expression using Kryo before passing it to HMS:
https://github.com/apache/hive/blob/36bf7f00731e3b95af3e5eeaa4ce39b375974a74/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3931
On the server side, if hive.metastore.expression.proxy is set to MsckPartitionExpressionProxy, it tries to convert this expression into a string:
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckPartitionExpressionProxy.java#L50
https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckPartitionExpressionProxy.java#L56
Because of this bad filter expression, Hive does not retrieve any partitions. To make this work, I think Hive should deserialize the expression the way PartitionExpressionForMetastore does. -- This message was sent by Atlassian Jira (v8.3.4#803005)
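To see why treating a serialized expression as a string cannot work, here is a sketch using Java's built-in serialization as a stand-in for Kryo (the class and helper names below are illustrative, not Hive code): the bytes on the wire only round-trip through deserialization, never through new String(...).

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class SerializedFilterDemo {
    // Serialize an expression the way Hive ships the partition filter to
    // HMS (Hive uses Kryo; ObjectOutputStream stands in for illustration).
    static byte[] serialize(String expr) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(expr);
        }
        return bos.toByteArray();
    }

    // What MsckPartitionExpressionProxy effectively does: treat the raw
    // serialized bytes as the filter string. The result is garbage.
    static String asString(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    // What PartitionExpressionForMetastore does instead: deserialize.
    static String deserialize(byte[] wire) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(wire))) {
            return (String) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        String filter = "sr_returned_date_sk=2452802";
        byte[] wire = serialize(filter);
        System.out.println(asString(wire).equals(filter));     // false: bad filter, no partitions match
        System.out.println(deserialize(wire).equals(filter));  // true: filter recovered
    }
}
```

The same asymmetry holds for Kryo: a serialized byte stream reinterpreted as text never equals the original filter, so the string-based proxy silently matches nothing.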
[jira] [Created] (HIVE-23968) CTAS with TBLPROPERTIES ('transactional'='false') does not honor the translated table location
Rajkumar Singh created HIVE-23968: - Summary: CTAS with TBLPROPERTIES ('transactional'='false') does not honor the translated table location Key: HIVE-23968 URL: https://issues.apache.org/jira/browse/HIVE-23968 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh
The HMS translation layer converts the table to external because the transactional property is set to false, but MoveTask does not honor the translated table location and moves the data to the managed table location.
Steps to reproduce:
{code:java}
create table nontxnal TBLPROPERTIES ('transactional'='false') as select * from abc;
{code}
A select query on the table returns nothing, but the source table has data in it:
{code:java}
select * from nontxnal;
+--------------+
| nontxnal.id  |
+--------------+
+--------------+
{code}
-- show create table
{code:java}
CREATE EXTERNAL TABLE `nontxnal`(
  `id` int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://hostname:8020/warehouse/tablespace/external/hive/nontxnal'
TBLPROPERTIES (
  'TRANSLATED_TO_EXTERNAL'='TRUE',
  'bucketing_version'='2',
  'external.table.purge'='TRUE',
  'transient_lastDdlTime'='1596215634')
{code}
The table data is moved to the managed location instead:
{code:java}
dfs -ls -R hdfs://hostname:8020/warehouse/tablespace/managed/hive/nontxnal;
-rw-rw+ 3 hive hadoop 201 2020-07-31 17:05 hdfs://hostname:8020/warehouse/tablespace/managed/hive/nontxnal/00_0
{code}
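If the fix lands on the MoveTask side, it amounts to resolving the destination from the translated table metadata rather than from the original CTAS target. A minimal sketch of that idea (the helper and the hard-coded warehouse roots are assumptions for illustration; TRANSLATED_TO_EXTERNAL is the property visible in the DDL above):

```java
import java.util.Map;

public class TranslatedLocationDemo {
    static final String MANAGED_ROOT = "/warehouse/tablespace/managed/hive/";
    static final String EXTERNAL_ROOT = "/warehouse/tablespace/external/hive/";

    // Hypothetical helper: pick the data destination the move step should
    // use. If HMS translated the table to external, the external location
    // wins over the managed default.
    static String destinationFor(String tableName, Map<String, String> tblProps) {
        boolean translated = "TRUE".equalsIgnoreCase(tblProps.get("TRANSLATED_TO_EXTERNAL"));
        return (translated ? EXTERNAL_ROOT : MANAGED_ROOT) + tableName;
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of("TRANSLATED_TO_EXTERNAL", "TRUE");
        // The bug: MoveTask writes under the managed root even though the
        // table's metadata points at the external root.
        System.out.println(destinationFor("nontxnal", props));
    }
}
```

The bug report shows exactly this divergence: metadata points at .../external/hive/nontxnal while the file lands under .../managed/hive/nontxnal.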
[jira] [Created] (HIVE-24039) update jquery version to mitigate CVE-2020-11023
Rajkumar Singh created HIVE-24039: - Summary: update jquery version to mitigate CVE-2020-11023 Key: HIVE-24039 URL: https://issues.apache.org/jira/browse/HIVE-24039 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
There is a known vulnerability in the jQuery version used by Hive; the plan with this JIRA is to upgrade to jQuery 3.5.0, where it has been fixed. More details about the vulnerability can be found here: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023
[jira] [Created] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp
Rajkumar Singh created HIVE-24113: - Summary: NPE in GenericUDFToUnixTimeStamp Key: HIVE-24113 URL: https://issues.apache.org/jira/browse/HIVE-24113 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
The following query triggers a getPartitionsByExpr call at HMS. HMS evaluates the filter through the PartitionExpressionForMetastore proxy, which uses the QL packages to evaluate the filter and calls GenericUDFToUnixTimeStamp.
{code:java}
select * from table_name where hour between from_unixtime(unix_timestamp('2020090120', 'MMddHH') - 1*60*60, 'MMddHH') and from_unixtime(unix_timestamp('2020090122', 'MMddHH') + 2*60*60, 'MMddHH');
{code}
I think SessionState in this code path is always NULL, which is why it hits the NPE.
{code:java}
java.lang.NullPointerException: null at org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464) ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764) [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499) [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452) [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112] at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at com.sun.proxy.$Proxy28.getPartitionsByExpr(Unknown Source) [?:?] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:6637) [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja {code} -- This message was sent by Atlassian Jira (v8.3.4#8030
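The failure mode is generic: a UDF that reaches into a thread-local session during initialize() hits an NPE when invoked inside HMS, where no session was ever opened on that thread. A hedged sketch of the pattern (the thread-local below is a stand-in for Hive's SessionState, not the real class):

```java
import java.time.ZoneId;

public class SessionNpeDemo {
    // Stand-in for SessionState: thread-local, populated on HiveServer2
    // query threads but never on HMS handler threads.
    static final ThreadLocal<ZoneId> SESSION_ZONE = new ThreadLocal<>();

    // Buggy pattern: assumes a session always exists; the caller then
    // dereferences the result and throws NullPointerException on HMS.
    static ZoneId zoneUnsafe() {
        return SESSION_ZONE.get();  // null on a thread with no session
    }

    // Defensive pattern: fall back to a default when no session is present.
    static ZoneId zoneSafe() {
        ZoneId z = SESSION_ZONE.get();
        return z != null ? z : ZoneId.systemDefault();
    }

    public static void main(String[] args) {
        System.out.println(zoneUnsafe() == null);  // true: nothing to dereference
        System.out.println(zoneSafe() != null);    // true: usable fallback
    }
}
```

Whether the real fix belongs in GenericUDFToUnixTimeStamp or in how HMS sets up the evaluation context is a judgment call; the sketch only shows the null-guard shape.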
[jira] [Created] (HIVE-24163) Dynamic partitioning insert for MM table fails during Move operation
Rajkumar Singh created HIVE-24163: - Summary: Dynamic partitioning insert for MM table fails during Move operation Key: HIVE-24163 URL: https://issues.apache.org/jira/browse/HIVE-24163 Project: Hive Issue Type: Bug Components: Hive Reporter: Rajkumar Singh Fix For: 3.1.2
-- create MM table
{code:java}
CREATE TABLE `part1`(
  `id` double,
  `n` double,
  `name` varchar(8),
  `sex` varchar(1))
PARTITIONED BY (
  `weight` string,
  `age` string,
  `height` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'='\u0001',
  'line.delim'='\n',
  'serialization.format'='\u0001')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transactional'='true',
  'transactional_properties'='insert_only',
  'transient_lastDdlTime'='1599053368')
{code}
-- create managed table
{code:java}
CREATE TABLE `class`(
  `name` varchar(8),
  `sex` varchar(1),
  `age` double,
  `height` double,
  `weight` double)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://hostname:8020/warehouse/tablespace/managed/hive/class'
TBLPROPERTIES (
  'bucketing_version'='2',
  'transactional'='true',
  'transactional_properties'='default',
  'transient_lastDdlTime'='1599053345')
{code}
-- run the insert query
{code:java}
INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`) SELECT 0, 0, `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
{code}
It fails during MoveTask execution:
{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition
hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3 is not a directory! at org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.
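The message suggests getValidPartitionsInPath treats the stats side file (tmpstats-0_FS_3) in the staging directory as a partition path. A sketch of the defensive listing, using plain java.nio as a stand-in for the HDFS FileStatus API (validPartitions is a hypothetical helper, not the Hive method):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

public class ValidPartitionsDemo {
    // Collect only sub-directories of the staging dir as candidate
    // partitions; side files like tmpstats-* must be skipped rather than
    // rejected as "not a directory".
    static List<String> validPartitions(Path stagingDir) throws IOException {
        List<String> parts = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(stagingDir)) {
            for (Path p : ds) {
                if (Files.isDirectory(p)) {
                    parts.add(p.getFileName().toString());
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        Path staging = Files.createTempDirectory("ext-1");
        Files.createDirectory(staging.resolve("weight=0"));    // a real partition dir
        Files.createFile(staging.resolve("tmpstats-0_FS_3"));  // stats side file
        System.out.println(validPartitions(staging));          // [weight=0]
    }
}
```

Whether Hive should skip the side file here or avoid writing it next to the partition directories in the first place is the actual design question the ticket raises.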
[jira] [Created] (HIVE-24193) Select query on renamed hive acid table does not produce any output
Rajkumar Singh created HIVE-24193: - Summary: Select query on renamed hive acid table does not produce any output Key: HIVE-24193 URL: https://issues.apache.org/jira/browse/HIVE-24193 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
During onRename, HMS updates COMPLETED_TXN_COMPONENTS, which fails with "CTC_DATABASE column does not exist". Upon investigation I found that the enclosing quotes are missing for the column names, which is why the db query fails with this exception.
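On Postgres, an unquoted identifier is folded to lower case, so generated SQL that references CTC_DATABASE bare ends up looking for ctc_database and fails. A sketch of the quoting the generated update statement needs (quoteIdentifier is a made-up helper name, not the Hive method):

```java
public class IdentifierQuotingDemo {
    // Wrap an identifier in double quotes so Postgres preserves its case
    // instead of folding it to lower case; embedded quotes are doubled.
    static String quoteIdentifier(String name) {
        return '"' + name.replace("\"", "\"\"") + '"';
    }

    public static void main(String[] args) {
        // With quoting, the column name survives as CTC_DATABASE.
        String sql = "UPDATE " + quoteIdentifier("COMPLETED_TXN_COMPONENTS")
            + " SET " + quoteIdentifier("CTC_DATABASE") + " = ?";
        System.out.println(sql);
    }
}
```

The same statement without the quotes works on databases that preserve case (or fold the schema the same way), which is why the bug only surfaces on some backends.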
[jira] [Created] (HIVE-24194) Query with column mask fails with ParseException if column name has special char
Rajkumar Singh created HIVE-24194: - Summary: Query with column mask fails with ParseException if column name has special char Key: HIVE-24194 URL: https://issues.apache.org/jira/browse/HIVE-24194 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.2 Reporter: Rajkumar Singh
A Hive query with column masking fails with ParseException.
Table DDL:
{code:java}
CREATE TABLE `emp`( `id` string, `name#` string);
{code}
The following query fails:
{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, CAST(mask_show_first_n(name#, 4, 'x', 'x', 'x', -1, '1') AS string) AS `name#`, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`emp` )`emp`;
{code}
Error: Error while compiling statement: FAILED: ParseException line 1:79 character '#' not supported here (state=42000,code=4)
Quoting manually helps:
{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, CAST(mask_show_first_n(`name#`, 4, 'x', 'x', 'x', -1, '1') AS string) AS `name#`, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`emp` )`emp`;
{code}
A manual query change will not work for the Ranger authorizer, since the following query
{code:java}
select * from emp;
{code}
will be rewritten to
{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, CAST(mask_show_first_n(name#, 4, 'x', 'x', 'x', -1, '1') AS string) AS `name#`, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`emp` )`emp`;
{code}
Ranger applies the transformer for the column here, so we should consider enclosing the column names in backticks to make it work:
https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1332
I have opened https://issues.apache.org/jira/browse/RANGER-3009 for Ranger, but we should also check whether Hive can pass the column name enclosed in backticks while passing the priv-object to Ranger:
https://github.com/apache/hive/blob/7dd12cd9d7720f22159062d3c3e5d7bdd127/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L11977
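The fix on either side is mechanical: enclose the raw column name in backticks before splicing it into the masking expression. A sketch (quoteCol and the expression template are illustrative, not the actual Ranger or Hive code):

```java
public class MaskRewriteDemo {
    // Enclose a Hive column name in backticks; a literal backtick inside
    // the name is escaped by doubling it, per HiveQL quoted identifiers.
    static String quoteCol(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Build the masking projection the authorizer splices into the
    // rewritten query; with quoting, '#' never reaches the parser bare.
    static String maskExpr(String column) {
        return "CAST(mask_show_first_n(" + quoteCol(column)
            + ", 4, 'x', 'x', 'x', -1, '1') AS string) AS " + quoteCol(column);
    }

    public static void main(String[] args) {
        System.out.println(maskExpr("name#"));
    }
}
```

This mirrors the manual workaround shown above: the only difference between the failing and succeeding rewrites is the backticks around name# inside mask_show_first_n.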
[jira] [Created] (HIVE-24276) HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability
Rajkumar Singh created HIVE-24276: - Summary: HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability Key: HIVE-24276 URL: https://issues.apache.org/jira/browse/HIVE-24276 Project: Hive Issue Type: Bug Reporter: Rajkumar Singh Assignee: Rajkumar Singh
[jira] [Created] (HIVE-24469) StatsTask failure while inserting the data into the table partitioned by timestamp
Rajkumar Singh created HIVE-24469: - Summary: StatsTask failure while inserting the data into the table partitioned by timestamp Key: HIVE-24469 URL: https://issues.apache.org/jira/browse/HIVE-24469 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh
Steps to reproduce:
{code:java}
CREATE EXTERNAL TABLE `tblsource`( `x` int, `y` string) STORED AS PARQUET;
CREATE EXTERNAL TABLE `tblinsert`( `x` int) PARTITIONED BY ( `y` timestamp) STORED AS PARQUET;
insert into table tblsource values (5,'2020-11-06 00:00:00.000');
insert into table tblinsert partition(y) select * from tblsource distribute by (y);
{code}
The query fails while executing the stats task, and I can see the following exception in HMS:
{code:java}
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_232] at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_232] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:8629) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:8590) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at com.sun.proxy.$Proxy28.set_aggr_stats_for(Unknown Source) ~[?:?]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:18937) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:18921) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_232] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_232] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) ~[hadoop-common-3.1.1.7.2.0.0-237.jar:?] at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
{code}
I think the problem is the timestamp having all zeros in the nanoseconds. After inserting the value 2020-11-06 00:00:00.000, Hive performs set_aggr_stats_for and constructs the SetPartitionsStatsRequest. During construction of the request, since the nanoseconds are all 0, the Hive FetchOperator converts 2020-11-06 00:00:00.000 to 2020-11-06 00:00:00 (Timestamp.valueOf(string)).
https://github.com/apache/hive/blob/f8aa55f9c8f22c4fd293d9531192f7f46099a420/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L176
On the HMS side, https://github.com/apache/hive/blob/2ab194d25311e15487ae010b8dd113879ccd501b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8626 does not yield any partition, because the filter expression for the partition was 2020-11-06 00:00:00, and hence it fails with the above-mentioned IndexOutOfBoundsException.
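The normalization is easy to reproduce with plain java.time, used here as an illustration in place of Hive's own Timestamp class: printing a parsed value with a zero fractional second drops the fraction, so the resulting string no longer matches the partition value the stats request was built from.

```java
import java.time.LocalDateTime;

public class TimestampNormalizationDemo {
    // Parse an ISO timestamp string and print it back the way
    // LocalDateTime.toString() does: trailing zero fields are dropped.
    static String roundTrip(String isoValue) {
        return LocalDateTime.parse(isoValue).toString();
    }

    public static void main(String[] args) {
        // The inserted value carries an explicit .000 fraction; the round
        // trip drops the zero nanos (and here the zero seconds too), so the
        // string no longer equals the original partition value.
        System.out.println(roundTrip("2020-11-06T00:00:00.000"));  // 2020-11-06T00:00
    }
}
```

Any lookup that compares the re-rendered string against the stored partition name therefore misses, which matches the empty partition list the ticket describes.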
[jira] [Created] (HIVE-24491) setting custom job name is ineffective if the tez session pool is configured or in case of session reuse.
Rajkumar Singh created HIVE-24491: - Summary: setting custom job name is ineffective if the tez session pool is configured or in case of session reuse Key: HIVE-24491 URL: https://issues.apache.org/jira/browse/HIVE-24491 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
HIVE-23026 added the capability to set tez.job.name, but it is not effective if the tez session pool manager is configured or when a tez session is reused.
[jira] [Created] (HIVE-24523) Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES for timestamp
Rajkumar Singh created HIVE-24523: - Summary: Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES for timestamp Key: HIVE-24523 URL: https://issues.apache.org/jira/browse/HIVE-24523 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 3.2.0 Reporter: Rajkumar Singh
Steps to reproduce:
{code:java}
create external table tstable(date_created timestamp) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'timestamp.formats'='MMddHHmmss') stored as textfile;

cat sampledata
2020120517

hdfs dfs -put sampledata /warehouse/tablespace/external/hive/tstable
{code}
Disable fetch task conversion and run select * from tstable, which produces no results; setting hive.vectorized.use.vector.serde.deserialize=false returns the expected output. While parsing the string to a timestamp, https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazy/fast/LazySimpleDeserializeRead.java#L812 does not set the DateTimeFormatter, which results in an IllegalArgumentException while parsing the timestamp through TimestampUtils.stringToTimestamp(strValue).
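The mismatch can be illustrated with java.time. The pattern below is assumed to be yyyyMMddHH to match the 10-character sample row (the quoted property value appears to have lost characters in transit); the point is only that a default ISO-style parser rejects what the table's declared format accepts.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;

public class TimestampFormatsDemo {
    // Formatter for the table's declared layout. "uuuu" is the proleptic
    // year (equivalent to yyyy here); minute and second are absent from
    // the pattern, so default them to zero.
    static final DateTimeFormatter TABLE_FORMAT = new DateTimeFormatterBuilder()
        .appendPattern("uuuuMMddHH")
        .parseDefaulting(ChronoField.MINUTE_OF_HOUR, 0)
        .parseDefaulting(ChronoField.SECOND_OF_MINUTE, 0)
        .toFormatter();

    static LocalDateTime parseWithTableFormat(String raw) {
        return LocalDateTime.parse(raw, TABLE_FORMAT);
    }

    public static void main(String[] args) {
        String raw = "2020120517";  // the sample row from the ticket
        try {
            // Default path: an ISO parser rejects the custom layout, which
            // mirrors the failure the vectorized reader hits.
            LocalDateTime.parse(raw);
        } catch (DateTimeParseException e) {
            System.out.println("default parser rejects it");
        }
        // With the declared format applied, the same string parses fine.
        System.out.println(parseWithTableFormat(raw));  // 2020-12-05T17:00
    }
}
```

The fix the ticket points at is the same idea: the vectorized deserializer must install the formatter built from timestamp.formats instead of falling through to the default parser.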
[jira] [Created] (HIVE-24724) Create table with LIKE operator does not work correctly
Rajkumar Singh created HIVE-24724: - Summary: Create table with LIKE operator does not work correctly Key: HIVE-24724 URL: https://issues.apache.org/jira/browse/HIVE-24724 Project: Hive Issue Type: Bug Components: Hive, HiveServer2 Affects Versions: 4.0.0 Reporter: Rajkumar Singh
Steps to reproduce:
{code:java}
create table atable (id int, str1 string);
alter table atable add constraint pk_atable primary key (id) disable novalidate;
create table btable like atable;
{code}
describe formatted btable lacks the constraint information. CreateTableLikeDesc does not set/fetch the constraints for the LIKE table:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L13594-L13616
Neither does DDLTask fetch/set the constraints for the table:
https://github.com/apache/hive/blob/5ba3dfcb6470ff42c58a3f95f0d5e72050274a42/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/like/CreateTableLikeOperation.java#L58-L83
[jira] [Created] (HIVE-24848) CBO failed with NPE
Rajkumar Singh created HIVE-24848: - Summary: CBO failed with NPE Key: HIVE-24848 URL: https://issues.apache.org/jira/browse/HIVE-24848 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.1 Reporter: Rajkumar Singh
CBO fails for a query having a predicate based on the from_unixtime udf:
{code:java}
select * from classification where CAST(from_unixtime(unix_timestamp(cast(partition_batch_ts as string),'MMddHHmmss')) AS TIMESTAMP) = '2021-02-26 02:00:00';
{code}
{code:java}
2021-03-04 10:08:58,844 ERROR org.apache.hadoop.hive.ql.parse.CalcitePlanner: [4d92f6e5-9a53-41fb-b53f-9003c338ab52 etp2107079200-38767]: CBO failed, skipping CBO. java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.SemanticException: java.lang.NullPointerException at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:159) ~[calcite-core-1.19.0.7.1.3.0-100.jar:1.19.0.7.1.3.0-100] at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:114) ~[calcite-core-1.19.0.7.1.3.0-100.jar:1.19.0.7.1.3.0-100] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1544) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:529) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12667) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:422) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:288) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:221) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104)
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:188) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:598) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:544) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:538) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127) ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hive.service.cli.operation.Operation.run(Operation.java:274) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:565) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:551) ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100] at sun.reflect.GeneratedMethodAccessor207.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24898) Beeline does not honor the credential provided in property-file
Rajkumar Singh created HIVE-24898: - Summary: Beeline does not honor the credential provided in property-file Key: HIVE-24898 URL: https://issues.apache.org/jira/browse/HIVE-24898 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
Beeline reads the params correctly from the properties file but then falls back to the default Beeline connection, which requires the user to provide a username and password.
[jira] [Created] (HIVE-24916) EXPORT TABLE command to ADLS Gen2/s3 fails with org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not supported for file system: abfs://
Rajkumar Singh created HIVE-24916: - Summary: EXPORT TABLE command to ADLS Gen2/s3 fails with org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not supported for file system: abfs:// Key: HIVE-24916 URL: https://issues.apache.org/jira/browse/HIVE-24916 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
The "EXPORT TABLE" command, which copies data using distcp, failed with the following error:
org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not supported for file system: abfs://storage...@xx.core.windows.net
[jira] [Created] (HIVE-24950) Fixing the logger for TaskQueue
Rajkumar Singh created HIVE-24950: - Summary: Fixing the logger for TaskQueue Key: HIVE-24950 URL: https://issues.apache.org/jira/browse/HIVE-24950 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
[jira] [Created] (HIVE-24951) Table created with Uppercase name using CTAS does not produce results for select queries
Rajkumar Singh created HIVE-24951: - Summary: Table created with Uppercase name using CTAS does not produce results for select queries Key: HIVE-24951 URL: https://issues.apache.org/jira/browse/HIVE-24951 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0 Reporter: Rajkumar Singh Assignee: Rajkumar Singh
Steps to reproduce:
{code:java}
CREATE EXTERNAL TABLE MY_TEST AS SELECT * FROM source;
{code}
The table is created with the location /warehouse/tablespace/external/hive/MY_TEST, but no data is moved to it.
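A plausible mechanism, offered as an assumption rather than a confirmed diagnosis, is a case mismatch: the metastore canonicalizes table names to lower case while a path built from the name as typed keeps the upper case, and HDFS paths are case-sensitive. A toy illustration:

```java
import java.util.Locale;

public class CtasCaseDemo {
    static final String EXTERNAL_ROOT = "/warehouse/tablespace/external/hive/";

    // The metastore canonicalizes table names to lower case ...
    static String metastorePath(String tableName) {
        return EXTERNAL_ROOT + tableName.toLowerCase(Locale.ROOT);
    }

    // ... while a path built from the name as typed keeps the upper case.
    static String asTypedPath(String tableName) {
        return EXTERNAL_ROOT + tableName;
    }

    public static void main(String[] args) {
        // On a case-sensitive filesystem these are two different locations,
        // so data written under one is invisible under the other.
        System.out.println(metastorePath("MY_TEST").equals(asTypedPath("MY_TEST")));  // false
    }
}
```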
[jira] [Created] (HIVE-24982) HMS- Postgres: Create table fails if SERDEPROPERTIES contains the NULL character
Rajkumar Singh created HIVE-24982: - Summary: HMS- Postgres: Create table fails if SERDEPROPERTIES contains the NULL character Key: HIVE-24982 URL: https://issues.apache.org/jira/browse/HIVE-24982 Project: Hive Issue Type: Bug Components: Hive Reporter: Rajkumar Singh Fix For: 4.0.0
Postgres does not accept the NULL char ('\u0000') in an insert (ref: https://www.postgresql.org/message-id/1171970019.3101.328.camel%40coppola.muc.ecircle.de ), so a create table with the following SERDEPROPERTIES will fail with an exception.
{code:java}
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' WITH SERDEPROPERTIES ( 'field.delim'='\u0000', 'serialization.format'='\u0000')
{code}
{code:java}
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO "SERDE_PARAMS" ("PARAM_VALUE","SERDE_ID","PARAM_KEY") VALUES (?,?,?) at org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074) at org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224) at org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158) at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138) at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363) at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339) at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080) at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2172) at org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:603) at org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:357) at
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191) at org.datanucleus.state.AbstractStateManager.providedObjectField(AbstractStateManager.java:1460) at org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:120) at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.dnProvideField(MStorageDescriptor.java) at org.apache.hadoop.hive.metastore.model.MStorageDescriptor.dnProvideFields(MStorageDescriptor.java) at org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170) at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:292) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138) at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363) at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339) at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080) at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2172) at org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:603) at org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:357) at org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191) at org.datanucleus.state.AbstractStateManager.providedObjectField(AbstractStateManager.java:1460) at org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:120) at org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java) at org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java) at org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170) at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:292) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162) at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138) at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363) at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339) at org.datanucleus.ExecutionContextImpl.persistObject
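Whatever fix Hive eventually ships, the shape of a guard is simple: a value headed for a Postgres-backed store must not contain U+0000. A minimal sketch of such a guard, assuming a hypothetical helper name (`sanitizeForPostgres` is not a Hive API):

```java
public class SerdePropertySanitizer {
    /**
     * Hypothetical helper: removes the NUL character (U+0000), which
     * Postgres rejects inside text values, from a serde property value.
     * Null input is returned unchanged so absent properties stay absent.
     */
    public static String sanitizeForPostgres(String value) {
        if (value == null) {
            return null;
        }
        return value.replace("\u0000", "");
    }
}
```

An alternative design would be to reject the value with a clear error at DDL time instead of silently stripping it, since a stripped delimiter changes the table's semantics.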
[jira] [Created] (HIVE-24994) get_aggr_stats_for call fails with "Tried to send an out-of-range integer"
Rajkumar Singh created HIVE-24994: - Summary: get_aggr_stats_for call fails with "Tried to send an out-of-range integer" Key: HIVE-24994 URL: https://issues.apache.org/jira/browse/HIVE-24994 Project: Hive Issue Type: Bug Components: Hive Reporter: Rajkumar Singh Assignee: Rajkumar Singh Fix For: 4.0.0 The aggrColStatsForPartitions call fails against Postgres if the number of partitions passed into the direct SQL exceeds 32767: the Postgres wire protocol encodes the bind-parameter count as a signed 16-bit integer, hence the "Tried to send an out-of-range integer" error. {code:java} postgresql.util.PSQLException: An I/O error occurred while sending to the backend. at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:337) ~[postgresql-42.2.8.jar:42.2.8] at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:446) ~[postgresql-42.2.8.jar:42.2.8] at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:370) ~[postgresql-42.2.8.jar:42.2.8] at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:149) ~[postgresql-42.2.8.jar:42.2.8] at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108) ~[postgresql-42.2.8.jar:42.2.8] at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52) ~[HikariCP-2.6.1.jar:?] at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java) [HikariCP-2.6.1.jar:?] at org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:375) [datanucleus-rdbms-4.1.19.jar:?] at org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:552) [datanucleus-rdbms-4.1.19.jar:?] at org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:645) [datanucleus-rdbms-4.1.19.jar:?] at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) [datanucleus-core-4.1.17.jar:?] at org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) [datanucleus-rdbms-4.1.19.jar:?] 
at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368) [datanucleus-api-jdo-4.2.4.jar:?] at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) [datanucleus-api-jdo-4.2.4.jar:?] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2058) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2050) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$1500(MetaStoreDirectSql.java:110) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql$15$1.run(MetaStoreDirectSql.java:1530) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql$15.run(MetaStoreDirectSql.java:1521) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1518) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1489) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:8966) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:8962) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3757) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at 
org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:8981) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:8951) [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4] at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source) ~[?:?] at sun.refle
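The usual workaround for a backend with a bind-parameter ceiling is to split the partition list into batches below the limit; the `Batchable.runBatched` frames in the trace show Hive already has such a mechanism, and the batch size would need to respect Postgres's 32767-parameter cap. A minimal, self-contained sketch of the batching logic (class and method names here are illustrative, not Hive's):

```java
import java.util.ArrayList;
import java.util.List;

public class ParamBatcher {
    // The Postgres wire protocol sends the bind-parameter count as a
    // signed 16-bit integer, so one statement can carry at most 32767.
    static final int PG_MAX_PARAMS = 32767;

    /** Splits a parameter list into chunks no larger than maxBatch. */
    public static <T> List<List<T>> batch(List<T> params, int maxBatch) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < params.size(); i += maxBatch) {
            out.add(params.subList(i, Math.min(i + maxBatch, params.size())));
        }
        return out;
    }
}
```

Each chunk would then be bound into its own `IN (?,?,...)` statement and the per-chunk results aggregated.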
[jira] [Created] (HIVE-25024) Length function on char field yields incorrect result if CBO is enabled
Rajkumar Singh created HIVE-25024: - Summary: Length function on char field yields incorrect result if CBO is enabled Key: HIVE-25024 URL: https://issues.apache.org/jira/browse/HIVE-25024 Project: Hive Issue Type: Bug Components: CBO, Hive Affects Versions: 4.0.0 Reporter: Rajkumar Singh Steps to repro: {code:java}
create table char_test(val char(10));
insert into table char_test values ('abc');
select * from char_test;
+----------------+
| char_test.val  |
+----------------+
| abc            |
+----------------+
select length(val) from char_test where val='abc';
+------+
| _c0  |
+------+
| 10   |
+------+
{code} The problem surfaces when CBO is enabled and the query has a predicate on the char field. The filter constant in this case becomes 'abc       ' (padded to the declared char(10) length) of string type, since this is a constant comparison; for the string type, GenericUDFLength will not strip the trailing pad characters. https://github.com/apache/hive/blob/1758c8c857f8a6dc4c9dc9c522de449f53e5e5cc/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L943 -- This message was sent by Atlassian Jira (v8.3.4#803005)
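The two length semantics can be illustrated outside Hive. `charSemanticsLength` below is an illustrative stand-in for the stripped-length behavior char values are supposed to get, not Hive code:

```java
public class CharLengthDemo {
    /** Length with trailing pad spaces ignored, i.e. char semantics. */
    public static int charSemanticsLength(String padded) {
        int end = padded.length();
        while (end > 0 && padded.charAt(end - 1) == ' ') {
            end--;
        }
        return end;
    }

    public static void main(String[] args) {
        // 'abc' padded to char(10) and then folded to a string constant.
        String folded = "abc       ";
        System.out.println(folded.length());             // string semantics: counts the padding
        System.out.println(charSemanticsLength(folded)); // char semantics: padding ignored
    }
}
```

The bug report amounts to the folded constant taking the first path (length 10) where the user expects the second (length 3).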
[jira] [Created] (HIVE-25172) HMS Postgres: Lock acquisition is failing because table name exceeds the char limit of the metastore table datatype
Rajkumar Singh created HIVE-25172: - Summary: HMS Postgres: Lock acquisition is failing because table name exceeds the char limit of the metastore table datatype Key: HIVE-25172 URL: https://issues.apache.org/jira/browse/HIVE-25172 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0 Reporter: Rajkumar Singh This only affects users running Postgres as the HMS backend database. {code:java}
2021-05-11 19:49:41,040 ERROR org.apache.thrift.ProcessFunction: [pool-7-thread-199]: Internal error processing lock
org.apache.hadoop.hive.metastore.api.MetaException: Unable to update transaction database java.sql.BatchUpdateException:
Batch entry 0 INSERT INTO "TXN_COMPONENTS" ("TC_TXNID", "TC_DATABASE", "TC_TABLE", "TC_PARTITION", "TC_OPERATION_TYPE", "TC_WRITEID")
VALUES (654299, 'default', '$some_big_table_name_exceeding_the_128_char_limit', NULL, 'i', 3)
was aborted: ERROR: value too long for type character varying(128) Call getNextException to see other errors in the batch.
...
Caused by: org.postgresql.util.PSQLException: ERROR: value too long for type character varying(128)
{code} We are hitting the char limit on the table name in TXN_COMPONENTS here, where the column is defined as TC_TABLE character varying(128); either the column limit needs to be increased or table names need to be kept under 128 characters. -- This message was sent by Atlassian Jira (v8.3.4#803005)
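Until the column is widened, one defensive option (a sketch of the second suggestion above, not Hive's actual fix) is to validate identifier lengths against the metastore schema limit at DDL time, so the failure surfaces at CREATE TABLE rather than deep inside lock acquisition:

```java
public class MetastoreNameCheck {
    // TXN_COMPONENTS.TC_TABLE is character varying(128) in the Postgres schema.
    static final int MAX_TABLE_NAME_LEN = 128;

    /** Returns true when the table name fits the metastore column. */
    public static boolean fitsTxnComponents(String tableName) {
        return tableName != null && tableName.length() <= MAX_TABLE_NAME_LEN;
    }
}
```

A real fix would also need to cover TC_DATABASE, TC_PARTITION, and the other varchar columns the lock path writes to.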
[jira] [Created] (HIVE-18398) WITH SERDEPROPERTIES option is broken without org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Rajkumar Singh created HIVE-18398: - Summary: WITH SERDEPROPERTIES option is broken without org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Key: HIVE-18398 URL: https://issues.apache.org/jira/browse/HIVE-18398 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.1 Reporter: Rajkumar Singh Priority: Minor *Steps to reproduce:* 1. Create the table: {code}
create table test_serde(id int, value string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' ESCAPED BY '\\'
{code} 2. SHOW CREATE TABLE produces the following output: {code}
CREATE TABLE `test_serde`(
  `id` int,
  `value` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '|'
WITH SERDEPROPERTIES (
  'escape.delim'='\\')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hdp262a.hdp.local:8020/apps/hive/warehouse/test_serde'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{"BASIC_STATS":"true"}',
  'numFiles'='0',
  'numRows'='0',
  'rawDataSize'='0',
  'totalSize'='0',
  'transient_lastDdlTime'='1515448894')
{code} 3. Running the CREATE TABLE statement produced by SHOW CREATE TABLE fails with a parse error: {code}
NoViableAltException(296@[1876:103: ( tableRowFormatMapKeysIdentifier )?])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
...
FAILED: ParseException line 6:0 cannot recognize input near 'WITH' 'SERDEPROPERTIES' '(' in serde properties specification
{code} 4. A table created with LazySimpleSerDe does not have this issue: {code}
hive> CREATE TABLE `foo`(
    >   `col` string)
    > ROW FORMAT SERDE
    >   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    > WITH SERDEPROPERTIES (
    >   'serialization.encoding'='UTF-8')
    > STORED AS INPUTFORMAT
    >   'org.apache.hadoop.mapred.TextInputFormat'
    > OUTPUTFORMAT
    >   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
OK
Time taken: 0.375 seconds
{code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-19192) HiveServer2 query compilation: query compilation time increases when SQL has multiple unions
Rajkumar Singh created HIVE-19192: - Summary: HiveServer2 query compilation: query compilation time increases when SQL has multiple unions Key: HIVE-19192 URL: https://issues.apache.org/jira/browse/HIVE-19192 Project: Hive Issue Type: Improvement Components: Hive, HiveServer2 Affects Versions: 2.1.0, 1.2.1 Environment: Hive-1.2.1 Hive-2.1.0 Reporter: Rajkumar Singh Attachments: query-with-100-union.q, query-with-200-union.q, query-with-50-union.q Query compilation time suffers badly when the SQL has many unions; here is a simple reproduction of the problem. Attached are queries with 50, 100, and 200 unions (forgive me for this bad SQL). Running EXPLAIN against HiveServer2, the compilation time increases many fold: {code}
query-with-50-union.q   1,671 rows selected (10.662 seconds)
query-with-100-union.q  3,321 rows selected (101.709 seconds)
query-with-200-union.q  6,588 rows selected (1074.487 seconds)
{code} Running such SQL against HiveServer2 can starve other queries, since they compete for the single-threaded compilation stage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19259) Create view on tables having union all fails with "Table not found"
Rajkumar Singh created HIVE-19259: - Summary: Create view on tables having union all fails with "Table not found" Key: HIVE-19259 URL: https://issues.apache.org/jira/browse/HIVE-19259 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 1.2.1 Environment: hive-1.2.1 Reporter: Rajkumar Singh Creating a view over a UNION works fine, while the same view with UNION ALL fails with "Table not found"; here are the reproduction steps. {code}
hive> create table foo(id int);
OK
Time taken: 0.401 seconds
hive> create table bar(id int);
OK
// view on table union
hive> create view unionview as with tmp_1 as ( select * from foo ), tmp_2 as (select * from bar ) select * from tmp_1 union select * from tmp_2;
OK
Time taken: 0.517 seconds
hive> select * from unionview;
OK
Time taken: 5.805 seconds
// view on union all
hive> create view unionallview as with tmp_1 as ( select * from foo ), tmp_2 as (select * from bar ) select * from tmp_1 union all select * from tmp_2;
OK
Time taken: 1.535 seconds
hive> select * from unionallview;
FAILED: SemanticException Line 1:134 Table not found 'tmp_1' in definition of VIEW unionallview [
with tmp_1 as ( select `foo`.`id` from `default`.`foo` ), tmp_2 as (select `bar`.`id` from `default`.`bar` ) select `tmp_1`.`id` from tmp_1 union all select `tmp_2`.`id` from tmp_2
] used as unionallview at Line 1:14
{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if Hive has too many databases and tables
Rajkumar Singh created HIVE-19432: - Summary: HIVE-7575: GetTablesOperation is too slow if Hive has too many databases and tables Key: HIVE-19432 URL: https://issues.apache.org/jira/browse/HIVE-19432 Project: Hive Issue Type: Improvement Components: Hive, HiveServer2 Affects Versions: 2.2.0 Reporter: Rajkumar Singh GetTablesOperation is too slow because it does not check authorization for databases and tries to pull all tables from all databases using getTableMeta. An operation like the following {code}
con.getMetaData().getTables("", "", "%", new String[] { "TABLE", "VIEW" });
{code} builds the getTableMeta call with wildcards: {code}
metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*
{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19469) HiveServer2: SqlStdAuth takes too much time doing checkFileAccessWithImpersonation if the table location has too many files/dirs
Rajkumar Singh created HIVE-19469: - Summary: HiveServer2: SqlStdAuth takes too much time doing checkFileAccessWithImpersonation if the table location has too many files/dirs Key: HIVE-19469 URL: https://issues.apache.org/jira/browse/HIVE-19469 Project: Hive Issue Type: Task Components: HiveServer2 Affects Versions: 2.1.0 Reporter: Rajkumar Singh The HiveServer2 doAuthorization call takes too much time in checkFileAccessWithImpersonation if the table location has too many files/dirs, which increases query compilation time. {code}
at org.apache.hadoop.hive.shims.Hadoop23Shims.checkFileAccess(Hadoop23Shims.java:1006)
at org.apache.hadoop.hive.common.FileUtils.checkFileAccessWithImpersonation(FileUtils.java:378)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:417)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
at org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:752)
{code} The improvement we can make here is to parallelize the checkFileAccessWithImpersonation calls. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
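The proposed improvement, checking paths concurrently instead of in a serial recursion, can be sketched with a standard ExecutorService. The predicate below is a stand-in for the real checkFileAccessWithImpersonation call; none of these names are Hive's:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelAccessCheck {
    /**
     * Returns true only if the access predicate holds for every path.
     * The checks run concurrently on a fixed-size pool instead of
     * one at a time down the directory hierarchy.
     */
    public static boolean allPermitted(List<String> paths,
                                       Predicate<String> accessCheck,
                                       int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (String p : paths) {
                results.add(pool.submit(() -> accessCheck.test(p)));
            }
            for (Future<Boolean> r : results) {
                if (!r.get()) {
                    return false;   // one denied path fails the whole check
                }
            }
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A production version would also want early cancellation of in-flight checks once one path is denied, since the serial code short-circuits on the first failure.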
[jira] [Created] (HIVE-14856) create table with select from table limit fails with NFE if the limit exceeds the allowed 32-bit integer range
Rajkumar Singh created HIVE-14856: - Summary: create table with select from table limit fails with NFE if the limit exceeds the allowed 32-bit integer range Key: HIVE-14856 URL: https://issues.apache.org/jira/browse/HIVE-14856 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 1.2.1 Environment: centos 6.6 Reporter: Rajkumar Singh Assignee: Rajkumar Singh A query with LIMIT fails with a NumberFormatException if the limit exceeds the 32-bit integer range. {code}
create table sample1 as select * from sample limit 2248321440;
FAILED: NumberFormatException For input string: "2248321440"
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
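The failure mode is easy to reproduce outside Hive: 2248321440 exceeds Integer.MAX_VALUE (2147483647), so parsing the literal as a 32-bit int throws while parsing it as a long succeeds. The fix direction would be either to carry the limit in a wider type or to reject it with a clearer error message:

```java
public class LimitParseDemo {
    public static void main(String[] args) {
        String limit = "2248321440";
        // Fits comfortably in 64 bits.
        System.out.println(Long.parseLong(limit));
        try {
            // Exceeds Integer.MAX_VALUE, so this throws,
            // mirroring the NFE Hive reports.
            Integer.parseInt(limit);
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```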