[jira] [Created] (HIVE-19743) hive is not pushing predicate down to HBaseStorageHandler if hive key mapped with hbase is stored as varchar

2018-05-30 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19743:
-

 Summary: hive is not pushing predicate down to HBaseStorageHandler 
if hive key mapped with hbase is stored as varchar
 Key: HIVE-19743
 URL: https://issues.apache.org/jira/browse/HIVE-19743
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler, Hive
Affects Versions: 2.1.0
 Environment: java8,centos7
Reporter: Rajkumar Singh


Steps to Reproduce:

{code}

//hbase table

create 'mytable', 'cf'
put 'mytable', 'ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4', 
'cf:message', 'hello world'
put 'mytable', 'ABCDEF1|GHIJK1|ijj123kl-mn4o-4pq5-678r-st90123u0v41', 'cf:foo', 
0x0

// hive table with key stored as varchar

show create table hbase_table_4;
+------------------------------------------------------------+
|                       createtab_stmt                       |
+------------------------------------------------------------+
| CREATE EXTERNAL TABLE `hbase_table_4`(                     |
|   `hbase_key` varchar(80) COMMENT 'from deserializer',     |
|   `value` string COMMENT 'from deserializer',              |
|   `value1` string COMMENT 'from deserializer')             |
| ROW FORMAT SERDE                                           |
|   'org.apache.hadoop.hive.hbase.HBaseSerDe'                |
| STORED BY                                                  |
|   'org.apache.hadoop.hive.hbase.HBaseStorageHandler'       |
| WITH SERDEPROPERTIES (                                     |
|   'hbase.columns.mapping'=':key,cf:foo,cf:message',        |
|   'serialization.format'='1')                              |
| TBLPROPERTIES (                                            |
|   'COLUMN_STATS_ACCURATE'='{"BASIC_STATS":"true"}',        |
|   'hbase.table.name'='mytable',                            |
|   'numFiles'='0',                                          |
|   'numRows'='0',                                           |
|   'rawDataSize'='0',                                       |
|   'totalSize'='0',                                         |
|   'transient_lastDdlTime'='1527708430')                    |
+------------------------------------------------------------+

 

// hive table with key stored as string

CREATE EXTERNAL TABLE `hbase_table_5`(
  `hbase_key` string COMMENT 'from deserializer',
  `value` string COMMENT 'from deserializer',
  `value1` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY
  'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping'=':key,cf:foo,cf:message',
  'serialization.format'='1')
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{"BASIC_STATS":"true"}',
  'hbase.table.name'='mytable',
  'numFiles'='0',
  'numRows'='0',
  'rawDataSize'='0',
  'totalSize'='0',
  'transient_lastDdlTime'='1527708520')

 

Explain Plan

explain select * from hbase_table_4 where hbase_key='ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4';

Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      Output:["_col0","_col1","_col2"]
      Filter Operator [FIL_4]
        predicate:(UDFToString(hbase_key) = 'ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4')
        TableScan [TS_0]
          Output:["hbase_key","value","value1"]

 

Explain plan on the table with the key stored as string. Note that in the varchar case above the planner wraps the key in UDFToString(), which prevents the HBaseStorageHandler from pushing the predicate down; with a string key there is no cast and no residual Filter Operator remains in the plan:

explain select * from hbase_table_5 where hbase_key='ABCDEF|GHIJK|ijj123kl-mn4o-4pq5-678r-st90123u0v4';

Plan optimized by CBO.

Stage-0
  Fetch Operator
    limit:-1
    Select Operator [SEL_2]
      Output:["_col0","_col1","_col2"]
{code}

[jira] [Created] (HIVE-19831) Hiveserver2 should skip doAuth checks for CREATE DATABASE/TABLE if database/table already exists

2018-06-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19831:
-

 Summary: Hiveserver2 should skip doAuth checks for CREATE 
DATABASE/TABLE if database/table already exists
 Key: HIVE-19831
 URL: https://issues.apache.org/jira/browse/HIVE-19831
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 2.1.0, 1.2.1
Reporter: Rajkumar Singh


With SQL standard authorization enabled, CREATE DATABASE IF NOT EXISTS takes too long if there are many objects inside the database directory. Hive should not run the doAuth checks for all the objects within the database if the database already exists.
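A sketch of the proposed short-circuit, with a plain set standing in for the metastore lookup (class and method names here are illustrative, not HiveServer2's actual API): existence is checked before authorization, so CREATE DATABASE IF NOT EXISTS on an existing database never triggers the per-object doAuth walk.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CreateDbAuthShortCircuit {

    // Stand-in for a metastore database-existence lookup.
    static final Set<String> EXISTING_DBS =
            new HashSet<>(Arrays.asList("default", "testdb"));

    // For CREATE DATABASE IF NOT EXISTS on an existing database the statement
    // is a no-op, so the expensive per-object doAuth walk can be skipped.
    public static boolean needsDoAuthWalk(String dbName, boolean ifNotExists) {
        if (ifNotExists && EXISTING_DBS.contains(dbName)) {
            return false; // database already exists: nothing will be created
        }
        return true;      // creation will happen: run the full auth checks
    }
}
```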



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19860) HiveServer2 ObjectInspectorFactory memory leak with cachedUnionStructObjectInspector

2018-06-11 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19860:
-

 Summary: HiveServer2 ObjectInspectorFactory memory leak with 
cachedUnionStructObjectInspector
 Key: HIVE-19860
 URL: https://issues.apache.org/jira/browse/HIVE-19860
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 2.1.0
 Environment: hiveserver2 Interactive with LLAP.
Reporter: Rajkumar Singh


HiveServer2 starts seeing memory pressure once cachedUnionStructObjectInspector starts growing:

[https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java#L345]

I did not see any eviction policy for cachedUnionStructObjectInspector, so we should implement a size- or time-based eviction policy.
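A size-based policy is the simplest option; the sketch below uses an access-ordered LinkedHashMap so the least-recently-used inspector is evicted once an illustrative bound is exceeded. Hive could equally use a Guava cache with maximumSize/expireAfterAccess; the class name and bound here are assumptions, not Hive code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoundedInspectorCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public BoundedInspectorCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true gives LRU iteration order
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each put; returning true evicts the
    // least-recently-used entry, keeping the cache bounded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```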

 

!Screen Shot 2018-06-11 at 1.52.50 PM.png!





[jira] [Created] (HIVE-20080) TxnHandler checkLock direct sql fail with ORA-01795 , if the table has more than 1000 partitions

2018-07-03 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20080:
-

 Summary: TxnHandler checkLock direct sql fail with ORA-01795 , if 
the table has more than 1000 partitions
 Key: HIVE-20080
 URL: https://issues.apache.org/jira/browse/HIVE-20080
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.0
Reporter: Rajkumar Singh


With Oracle as the metastore DB, TxnHandler checkLock fails with "checkLockWithRetry(181398,34773) : ORA-01795: maximum number of expressions in a list is 1000" if the table being written to has more than 1000 partitions.
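The usual fix for ORA-01795 is to split the id list into chunks of at most 1000 and OR the IN clauses together. A minimal sketch of that batching idea (illustrative names, not Hive's actual TxnHandler/TxnUtils code):

```java
import java.util.List;

public class InClauseBatcher {

    // Oracle rejects IN lists longer than 1000 entries (ORA-01795). Split the
    // ids into chunks of at most maxPerList and OR the IN clauses together.
    public static String buildInClause(String column, List<Long> ids, int maxPerList) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < ids.size(); i += maxPerList) {
            if (i > 0) {
                sb.append(" OR ");
            }
            sb.append(column).append(" IN (");
            List<Long> chunk = ids.subList(i, Math.min(i + maxPerList, ids.size()));
            for (int j = 0; j < chunk.size(); j++) {
                if (j > 0) {
                    sb.append(",");
                }
                sb.append(chunk.get(j));
            }
            sb.append(")");
        }
        return sb.append(")").toString();
    }
}
```

With 1001 partition ids and maxPerList=1000, this emits two IN lists instead of one over-long list, which keeps the query under Oracle's limit.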

Complete stack trace:

{code}

txn.TxnHandler (TxnHandler.java:checkRetryable(2099)) - Non-retryable error in checkLockWithRetry(181398,34773) : ORA-01795: maximum number of expressions in a list is 1000
 (SQLState=42000, ErrorCode=1795)
2018-06-25 15:09:35,999 ERROR [pool-7-thread-197]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Unable to update transaction database java.sql.SQLSyntaxErrorException: ORA-01795: maximum number of expressions in a list is 1000
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
    at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396)
    at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951)
    at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513)
    at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
    at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
    at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:195)
    at oracle.jdbc.driver.T4CStatement.executeForDescribe(T4CStatement.java:876)
    at oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.java:1175)
    at oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.java:1296)
    at oracle.jdbc.driver.OracleStatement.executeQuery(OracleStatement.java:1498)
    at oracle.jdbc.driver.OracleStatementWrapper.executeQuery(OracleStatementWrapper.java:406)
    at com.jolbox.bonecp.StatementHandle.executeQuery(StatementHandle.java:464)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLock(TxnHandler.java:2649)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1126)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:895)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:6123)
    at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
    at com.sun.proxy.$Proxy11.lock(Unknown Source)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:12012)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:11996)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:551)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:546)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:546)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.checkLockWithRetry(TxnHandler.java:1131)
    at org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:895)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:6123)
    at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
    at org.apache.hadoop.hive.metastore.Retryin
{code}

[jira] [Created] (HIVE-20099) incorrect logger for LlapServlet

2018-07-05 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20099:
-

 Summary: incorrect logger for LlapServlet
 Key: HIVE-20099
 URL: https://issues.apache.org/jira/browse/HIVE-20099
 Project: Hive
  Issue Type: Improvement
Affects Versions: 2.1.0
Reporter: Rajkumar Singh


The logger should be LlapServlet, not JMXJsonServlet; the wrong logger name can mislead users while debugging UI issues.
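An illustration of the naming rule (java.util.logging is used here only to keep the example self-contained; Hive itself uses SLF4J, but the point is the same: a servlet's logger should be named after its own class):

```java
import java.util.logging.Logger;

public class ServletLoggers {

    // Wrong: copying the logger name from JMXJsonServlet makes LlapServlet's
    // log lines appear to come from the JMX servlet.
    static final Logger WRONG_LOG = Logger.getLogger("JMXJsonServlet");

    // Right: derive the logger name from the servlet's own class, so log
    // lines are attributed to LlapServlet.
    static class LlapServlet {
        static final Logger LOG = Logger.getLogger(LlapServlet.class.getName());
    }
}
```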





[jira] [Created] (HIVE-20172) StatsUpdater failed with GSS Exception while trying to connect to remote metastore

2018-07-13 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20172:
-

 Summary: StatsUpdater failed with GSS Exception while trying to 
connect to remote metastore
 Key: HIVE-20172
 URL: https://issues.apache.org/jira/browse/HIVE-20172
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 2.1.1
 Environment: Hive-1.2.1,Hive2.1,java8
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


StatsUpdater task failed with GSS Exception while trying to connect to remote 
Metastore.
{code}
org.apache.thrift.transport.TTransportException: GSS initiate failed
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:316)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:487)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1564)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:92)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:138)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:110)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3526)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3558)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:533)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:300)
    at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker$1.run(Worker.java:177)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:534)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:282)
    at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:76)
{code}
Since the metastore client is running inside HMS itself, there is no need to connect to a remote URI.





[jira] [Created] (HIVE-20275) hive produces incorrect result when using MIN()/MAX() on varchar with hive.vectorized.reuse.scratch.columns enabled

2018-07-30 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20275:
-

 Summary: hive produces incorrect result when using MIN()/MAX() on 
varchar with hive.vectorized.reuse.scratch.columns enabled
 Key: HIVE-20275
 URL: https://issues.apache.org/jira/browse/HIVE-20275
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
 Environment: Hive3.1,java8
Reporter: Rajkumar Singh


Steps to reproduce:
{code}
create table testhive3 (name varchar(8), `time` double);

 insert into table testhive3 values
 ('ABC', 1),
 ('ABC', 2),
 ('DEF', 1),
 ('DEF', 2),
 ('DEF', 1),
 ('DEF', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('DEF', 1),
 ('DEF', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('DEF', 1),
 ('DEF', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('DEF', 1),
 ('DEF', 2),
 ('ABC', 1),
 ('ABC', 2),
 ('DEF', 1),
 ('DEF', 2),
 ( 'ABC', 1),
 ( NULL, NULL),
 ( 'ABC', 1),
 ( 'ABC', 2),
 ( 'DEF', 1),
 ('DEF', 2),
 ('ABC', 1),
 ( 'ABC', 2),
 ('ABC', 1),
 ( 'ABC', 2),
 ( 'DEF', 1),
 ('DEF', 2);

select name, `time` from testhive3 where name = 'ABC' group by name, `time`;

+-------+-------+
| name  | time  |
+-------+-------+
| ABC   | 1.0   |
| ABC   | 2.0   |
+-------+-------+

select min(name), `time` from testhive3 where name = 'ABC' group by name, `time`;

+-------+-------+
|  _c0  | time  |
+-------+-------+
| NULL  | 1.0   |
| NULL  | 2.0   |
+-------+-------+

set hive.vectorized.reuse.scratch.columns=false;
select min(name), `time` from testhive3 where name = 'ABC' group by name, `time`;

+------+-------+
| _c0  | time  |
+------+-------+
| ABC  | 1.0   |
| ABC  | 2.0   |
+------+-------+
{code}






[jira] [Created] (HIVE-20343) Hive 3: CTAS does not respect transactional_properties

2018-08-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20343:
-

 Summary: Hive 3: CTAS does not respect transactional_properties
 Key: HIVE-20343
 URL: https://issues.apache.org/jira/browse/HIVE-20343
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
 Environment: hive-3
Reporter: Rajkumar Singh


Steps to reproduce:
{code}
create table ctasexampleinsertonly stored as orc  TBLPROPERTIES 
("transactional_properties"="insert_only") as select * from testtable limit 1;

describe formatted ctasexampleinsertonly;

+-------------------------------+-------------------------------------------------------------------------+-------------+
|           col_name            |                                data_type                                |   comment   |
+-------------------------------+-------------------------------------------------------------------------+-------------+
| # col_name                    | data_type                                                               | comment     |
| name                          | varchar(8)                                                              |             |
| time                          | double                                                                  |             |
|                               | NULL                                                                    | NULL        |
| # Detailed Table Information  | NULL                                                                    | NULL        |
| Database:                     | default                                                                 | NULL        |
| OwnerType:                    | USER                                                                    | NULL        |
| Owner:                        | hive                                                                    | NULL        |
| CreateTime:                   | Wed Aug 08 21:35:15 UTC 2018                                            | NULL        |
| LastAccessTime:               | UNKNOWN                                                                 | NULL        |
| Retention:                    | 0                                                                       | NULL        |
| Location:                     | hdfs://xx:8020/warehouse/tablespace/managed/hive/ctasexampleinsertonly  | NULL        |
| Table Type:                   | MANAGED_TABLE                                                           | NULL        |
| Table Parameters:             | NULL                                                                    | NULL        |
|                               | COLUMN_STATS_ACCURATE                                                   | {}          |
|                               | bucketing_version                                                       | 2           |
|                               | numFiles                                                                | 1           |
|                               | numRows                                                                 | 1           |
|                               | rawDataSize                                                             | 0           |
|                               | totalSize                                                               | 754         |
|                               | transactional                                                           | true        |
|                               | transactional_properties                                                | default     |
|                               | transient_lastDdlTime                                                   | 1533764115  |
|                               | NULL                                                                    | NULL        |
| # Storage Information         | NULL                                                                    | NULL        |
| SerDe Library:                | org.apache.hadoop.hive.ql.io.orc.OrcSerde                               | NULL        |
| InputFormat:                  | org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                         | NULL        |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                        | NULL        |
| Compressed:                   | No                                                                      | NULL        |
| Num Buckets:                  | -1                                                                      | NULL        |
| Bucket Columns:               | []                                                                      | NULL        |
| Sort Columns:                 | []                                                                      | NULL        |
| Storage Desc Params:          | NULL                                                                    | NULL        |
|                               | serialization.format                                                    | 1           |
+-------------------------------+-------------------------------------------------------------------------+-------------+
{code}

Note that transactional_properties shows "default" even though the CTAS specified "insert_only". This creates a problem with INSERT:
{code}
CREATE TABLE TABLE
{code}

[jira] [Created] (HIVE-20409) Hive ACID: Update/delete/merge leave behind the staging directory

2018-08-16 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20409:
-

 Summary: Hive ACID: Update/delete/merge leave behind the staging 
directory
 Key: HIVE-20409
 URL: https://issues.apache.org/jira/browse/HIVE-20409
 Project: Hive
  Issue Type: Bug
 Environment: Hive-2.1,java-1.8
Reporter: Rajkumar Singh


UpdateDeleteSemanticAnalyzer creates a new query context while rewriting the statement but does not set hdfsCleanup on it. As a result, the Driver does not clear the staging dir.





[jira] [Created] (HIVE-20415) Hive1: Tez Session failed to return if background thread is interrupted

2018-08-17 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20415:
-

 Summary: Hive1: Tez Session failed to return if background thread is interrupted
 Key: HIVE-20415
 URL: https://issues.apache.org/jira/browse/HIVE-20415
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.1
Reporter: Rajkumar Singh


The user canceled the query, which interrupts the background thread; because of this interrupt, the background thread fails to put the session back into the pool.
{code}
2018-08-14 15:55:27,581 ERROR exec.Task (TezTask.java:execute(226)) - Failed to 
execute tez graph.
java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
at 
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
at 
java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:350)
at 
org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.returnSession(TezSessionPoolManager.java:176)
{code}
We need a fix similar to the one in HIVE-15731.
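The pattern that fix uses can be sketched as follows: clear any pending interrupt before the blocking put() (ArrayBlockingQueue.put throws InterruptedException immediately if the flag is set, which is exactly the failure in the trace above), return the session, then restore the interrupt status for the caller. Class and method names are illustrative, not Hive's TezSessionPoolManager API.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SessionReturner {

    // Return the session to the pool even if the current thread carries a
    // pending interrupt, then restore the interrupt flag for the caller.
    public static <T> boolean returnToPool(BlockingQueue<T> pool, T session) {
        boolean interrupted = Thread.interrupted(); // clears a pending interrupt
        try {
            while (true) {
                try {
                    pool.put(session);
                    return true;
                } catch (InterruptedException e) {
                    interrupted = true; // remember it and retry the put
                }
            }
        } finally {
            if (interrupted) {
                Thread.currentThread().interrupt(); // restore the flag
            }
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> pool = new ArrayBlockingQueue<>(4);
        Thread.currentThread().interrupt(); // simulate the canceled query
        returnToPool(pool, "tezSession-1"); // succeeds despite the interrupt
    }
}
```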





[jira] [Created] (HIVE-20442) Hive stale lock when the hiveserver2 background thread died abruptly

2018-08-22 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20442:
-

 Summary: Hive stale lock when the hiveserver2 background thread 
died abruptly
 Key: HIVE-20442
 URL: https://issues.apache.org/jira/browse/HIVE-20442
 Project: Hive
  Issue Type: Bug
  Components: Hive, Transactions
Affects Versions: 2.1.1
 Environment: Hive-2.1
Reporter: Rajkumar Singh


This looks like a race condition where the background thread is not able to release the lock it acquired.

1. The HiveServer2 background thread requests the lock:
{code}
2018-08-20T14:13:38,813 INFO  [HiveServer2-Background-Pool: Thread-X]: 
lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting: 
queryId=hive_xxx LockRequest(component:[LockComponent(type:SHARED_READ, 
level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)], 
txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxx)
{code}
2. It acquires the lock and starts heartbeating:
{code}
2018-08-20T14:36:30,233 INFO  [HiveServer2-Background-Pool: Thread-X]: 
lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started 
heartbeat with delay/interval = 15/15 MILLISECONDS for 
query: agentInfo:hive_xxx
{code}

3. In the time between events #1 and #2, the client disconnected and deleteContext cleaned up the session dir:
{code}
2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-XXX]: 
thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) - 
Session disconnected without closing properly.
2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-]: 
thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) - 
Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
2018-08-21T15:39:57,820 INFO  [HiveServer2-Handler-Pool: Thread-]: 
service.CompositeService (SessionManager.java:closeSession(363)) - Session 
closed, SessionHandle [xxx], current sessions:2
{code}

4. The background thread died with an NPE while trying to get the queryId:
{code}
java.lang.NullPointerException: null
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568) 
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414) 
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211) 
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204) 
~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
 [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at 
org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
 [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
 [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
at java.security.AccessController.doPrivileged(Native Method) 
[?:1.8.0_77]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
{code}
It did not get a chance to release the lock, and the heartbeater thread continues heartbeating indefinitely.
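The defensive pattern the fix needs can be sketched as follows, with an AtomicBoolean standing in for the DbLockManager lock and heartbeater (hypothetical and simplified, not Hive's Driver code): whatever kills the background thread, the acquired lock is released in a finally block so it cannot go stale.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class LockGuard {

    // Run the query body with a held "lock"; release it in finally so even an
    // abrupt runtime failure (like the NPE above) cannot leave a stale lock
    // with the heartbeater keeping it alive forever.
    public static boolean runQueryWithLock(AtomicBoolean lock, Runnable query) {
        lock.set(true);          // stand-in for DbLockManager.lock(...)
        try {
            query.run();
            return true;
        } catch (RuntimeException e) {
            return false;        // background thread died abruptly
        } finally {
            lock.set(false);     // stand-in for releasing the lock and
        }                        // stopping the heartbeater
    }
}
```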





[jira] [Created] (HIVE-20499) GetTablesOperation pull all the tables meta irrespective of auth.

2018-09-04 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20499:
-

 Summary: GetTablesOperation pull all the tables meta irrespective 
of auth.
 Key: HIVE-20499
 URL: https://issues.apache.org/jira/browse/HIVE-20499
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0
 Environment: hive-3,java-8,sqlstdaut/ranger auth enabled.
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


GetTablesOperation pulls metadata for all tables irrespective of authorization.
Steps to reproduce:
{code}
ResultSet res = con.getMetaData().getTables("", "", "%", new String[] { 
"TABLE", "VIEW" });
{code}
https://github.com/rajkrrsingh/HiveServer2JDBCSample/blob/master/src/main/java/TestConnection.java#L20





[jira] [Created] (HIVE-20568) GetTablesOperation : There is no need to convert the dbname to pattern

2018-09-14 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20568:
-

 Summary: GetTablesOperation : There is no need to convert the 
dbname to pattern
 Key: HIVE-20568
 URL: https://issues.apache.org/jira/browse/HIVE-20568
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 4.0.0
 Environment: Hive-4,Java-8
Reporter: Rajkumar Singh


There is no need to convert the dbName to a pattern; dbNamePattern is just a dbName that we pass to getTableMeta:
https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java#L117





[jira] [Created] (HIVE-20591) hive query hung during compilation if same previous query is unable to invalidate the QueryResultsCache entry

2018-09-18 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20591:
-

 Summary: hive query hung during compilation if same previous query 
is unable to invalidate the QueryResultsCache entry 
 Key: HIVE-20591
 URL: https://issues.apache.org/jira/browse/HIVE-20591
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.0.0
 Environment: Hive-3,java-8
Reporter: Rajkumar Singh


I believe this is the sequence of events that reproduces the issue:
1. A query fails with some environment issue while setting up the Tez session.
2. HiveServer2 tries to do query cleanup and invokes the QueryResultsCache cleanup:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L235
3. For some reason either of the following two events never happens, and the query falls into an endless loop of checking for a valid status:
i: it is unable to set the invalid status, so the old status is returned:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L260
ii: or this condition is never reached:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/cache/results/QueryResultsCache.java#L245

I don't have the complete jstack, so it's tough to say who is waiting on what; the stuck thread's stack snippet looks like:

{code}
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
  at java.lang.Object.wait(Native Method)
  at 
org.apache.hadoop.hive.ql.cache.results.QueryResultsCache$CacheEntry.waitForValidStatus(QueryResultsCache.java:325)
  - locked <0xb32661c0> (a 
org.apache.hadoop.hive.ql.cache.results.QueryResultsCache$CacheEntry)
  at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.checkResultsCache(SemanticAnalyzer.java:14860)
  at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12200)
{code}
 
will add more details after reproducing the issue again.
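One way to break this kind of hang would be a bounded wait: waitForValidStatus waits with a deadline instead of indefinitely, and a timeout is treated as a cache miss so compilation can fall back to running the query. A simplified stand-in for the CacheEntry (not Hive's actual class; the status strings and timeout are illustrative):

```java
public class CacheEntryStub {

    private String status = "PENDING";

    // Wait for the entry to leave PENDING, but only up to timeoutMs. A timeout
    // or an interrupt is reported as "not valid", so the caller can treat the
    // entry as a cache miss instead of waiting forever.
    public synchronized boolean waitForValidStatus(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while ("PENDING".equals(status)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // gave up: treat as a cache miss
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return "VALID".equals(status);
    }

    public synchronized void setStatus(String newStatus) {
        status = newStatus;
        notifyAll(); // wake up any compiler thread waiting on this entry
    }
}
```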






[jira] [Created] (HIVE-20616) Dynamic Partition Insert failed if PART_VALUE exceeds 4000 chars

2018-09-20 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20616:
-

 Summary: Dynamic Partition Insert failed if PART_VALUE exceeds 
4000 chars
 Key: HIVE-20616
 URL: https://issues.apache.org/jira/browse/HIVE-20616
 Project: Hive
  Issue Type: Bug
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


With MySQL as the metastore DB, PARTITION_PARAMS.PARAM_VALUE is defined as varchar(4000):
{code}
describe PARTITION_PARAMS;
+-------------+---------------+------+-----+---------+-------+
| Field       | Type          | Null | Key | Default | Extra |
+-------------+---------------+------+-----+---------+-------+
| PART_ID     | bigint(20)    | NO   | PRI | NULL    |       |
| PARAM_KEY   | varchar(256)  | NO   | PRI | NULL    |       |
| PARAM_VALUE | varchar(4000) | YES  |     | NULL    |       |
+-------------+---------------+------+-----+---------+-------+
{code}
This leads to a MoveTask failure if the PART_VALUE exceeds 4000 chars.
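A hypothetical guard (the 4000 limit comes from the schema above; this is not the actual fix, just an illustration) could surface the problem at the metastore layer with a clear message instead of a driver-level data-truncation error:

```java
public class ParamValueCheck {

    // MySQL's PARTITION_PARAMS.PARAM_VALUE column is varchar(4000) (see above).
    static final int MAX_PARAM_VALUE_LEN = 4000;

    // Reject an over-long partition parameter value with a clear message
    // instead of letting the INSERT die inside the JDBC driver.
    public static void validate(String key, String value) {
        if (value != null && value.length() > MAX_PARAM_VALUE_LEN) {
            throw new IllegalArgumentException("Value for partition param '" + key
                    + "' is " + value.length()
                    + " chars; the metastore schema allows at most "
                    + MAX_PARAM_VALUE_LEN);
        }
    }
}
```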
{code}
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO `PARTITION_PARAMS` (`PARAM_VALUE`,`PART_ID`,`PARAM_KEY`) VALUES (?,?,?)
 at org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074)
 at org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224)
 at org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158)
 at org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522)
 at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
 at org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
 at org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
 at org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
 at org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
 at org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:1923)
 at org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1778)
 at org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
 at org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:724)
 at org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:749)
 at org.apache.hadoop.hive.metastore.ObjectStore.addPartition(ObjectStore.java:2442)
 at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
 at com.sun.proxy.$Proxy32.addPartition(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_core(HiveMetaStore.java:3976)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.add_partition_with_environment_context(HiveMetaStore.java:4032)
 at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 at com.sun.proxy.$Proxy34.add_partition_with_environment_context(Unknown Source)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15528)
 at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$add_partition_with_environment_context.getResult(ThriftHiveMetastore.java:15512)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
 at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncatio
{code}

[jira] [Created] (HIVE-20673) vectorized map join fail with Unexpected column vector type STRUCT.

2018-10-02 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20673:
-

 Summary: vectorized map join fail with Unexpected column vector 
type STRUCT.
 Key: HIVE-20673
 URL: https://issues.apache.org/jira/browse/HIVE-20673
 Project: Hive
  Issue Type: Bug
  Components: Hive, Transactions, Vectorization
Affects Versions: 3.1.0
 Environment: hive-3, java-8
Reporter: Rajkumar Singh


An update query on an ACID table fails with the following exception.
 
UPDATE census_clus SET name = 'updated name' where ssn=100 and   EXISTS (select 
distinct ssn from census where ssn=census_clus.ssn);

{code}
Caused by: java.lang.RuntimeException: Map operator initialization failed
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:354)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column 
vector type STRUCT
at 
org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:302)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:419)
at 
org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.initializeOp(VectorMapJoinGenerateResultOperator.java:115)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:572)
at 
org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:524)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:335)
{code}

STEPS TO REPRODUCE
{code}
create table census(
ssn int,
name string,
city string,
email string) 
row format delimited 
fields terminated by ',';

insert into census values(100,"raj","san jose","email");
create table census_clus(
ssn int,
name string,
city string,
email string) 
clustered by (ssn) into 4 buckets  stored as orc TBLPROPERTIES 
('transactional'='true');

insert into  table census_clus select *  from census;

UPDATE census_clus SET name = 'updated name' where ssn=100 and   EXISTS (select 
distinct ssn from census where ssn=census_clus.ssn);
{code}

Looking at the exception, it seems the join operator gets the typeInfo incorrectly while doing the join; _col6 appears to be of struct type.

{code}
2018-10-02 22:22:23,392 [INFO] [TezChild] |exec.CommonJoinOperator|: JOIN 
struct<_col2:string,_col3:string,_col6:struct>
 totalsz = 3

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20848) After setting UpdateInputAccessTimeHook query fail with Table Not Found.

2018-10-31 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20848:
-

 Summary: After setting UpdateInputAccessTimeHook query fail with 
Table Not Found.
 Key: HIVE-20848
 URL: https://issues.apache.org/jira/browse/HIVE-20848
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


{code}
 select from_unixtime(1540495168); 
 set 
hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec;
 select from_unixtime(1540495168); 
{code}
The second select fails with the following exception:
{code}
ERROR ql.Driver: FAILED: Hive Internal Error: 
org.apache.hadoop.hive.ql.metadata.InvalidTableException(Table not found 
_dummy_table)
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found 
_dummy_table
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1217)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1168)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1155)
at 
org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec.run(UpdateInputAccessTimeHook.java:67)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1444)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
at 
org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
at 
org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at 
org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{code}
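The fix direction can be sketched in a few lines. This is a hypothetical Python model of the hook's loop, not Hive's Java code; the `_dummy_table` name comes from the stack trace above, everything else (function names, the lookup callback) is illustrative:

```python
DUMMY_TABLES = {"_dummy_table"}  # virtual table Hive uses for table-less selects


def update_access_time(inputs, metastore_lookup):
    """Toy pre-exec hook body: skip Hive's internal dummy table, which does
    not exist in the metastore, instead of failing the whole query."""
    updated = []
    for table in inputs:
        if table in DUMMY_TABLES:
            continue  # nothing to update for a virtual table
        metastore_lookup(table)  # would raise for genuinely missing tables
        updated.append(table)
    return updated
```

The real change would make `UpdateInputAccessTimeHook$PreExec` ignore virtual/dummy inputs before calling `Hive.getTable`.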






[jira] [Created] (HIVE-20908) Avoid multiple getTableMeta calls per database

2018-11-12 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-20908:
-

 Summary: Avoid multiple getTableMeta calls per database
 Key: HIVE-20908
 URL: https://issues.apache.org/jira/browse/HIVE-20908
 Project: Hive
  Issue Type: Bug
Reporter: Rajkumar Singh


Following HIVE-19432, we issue a getTableMeta call for each authorized database; instead, we can pass a combined pattern to the metastore and make a single call.
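A toy sketch of the idea (Python, illustrative only; the real change would live in the Java metastore client). The `'|'`-separated pattern mirrors the metastore's db-pattern matching; the `fetch` callback and table data are assumptions:

```python
# Toy metastore: db -> table names
TABLES = {"db1": ["t1", "t2"], "db2": ["t3"], "db3": []}
CALLS = []


def fetch(pattern):
    """Stand-in for getTableMeta: matches a '|'-separated db pattern."""
    CALLS.append(pattern)
    return [t for db in pattern.split("|") for t in TABLES.get(db, [])]


def per_db(dbs):
    # Current behaviour: one metastore round trip per authorized database.
    return [t for db in dbs for t in fetch(db)]


def batched(dbs):
    # Proposed behaviour: a single call with a combined pattern.
    return fetch("|".join(dbs))
```

Both return the same table metadata, but the batched variant makes one round trip instead of N.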





[jira] [Created] (HIVE-21342) Analyze compute stats for column leave behind staging dir on hdfs

2019-02-27 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21342:
-

 Summary: Analyze compute stats for column leave behind staging dir 
on hdfs
 Key: HIVE-21342
 URL: https://issues.apache.org/jira/browse/HIVE-21342
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
 Environment: hive-3.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Staging directory cleanup does not happen for "analyze table .. compute statistics for columns", which leaves stale directories behind on HDFS.
The problem seems to be with ColumnStatsSemanticAnalyzer, which does not set HDFS cleanup on the context:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java#L310
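A minimal model of why the flag matters (Python, illustrative only; Hive's Context is Java and the field/method names here are hypothetical). Staging dirs registered on a context are only deleted when the cleanup flag was set at creation time, which is the flag the analyzer above fails to set:

```python
import os
import shutil
import tempfile


class Context:
    """Toy analogue of Hive's query Context tracking staging dirs."""

    def __init__(self, hdfs_cleanup=False):
        self.hdfs_cleanup = hdfs_cleanup
        self.scratch_dirs = []

    def get_staging_dir(self):
        d = tempfile.mkdtemp(prefix=".hive-staging_")
        self.scratch_dirs.append(d)
        return d

    def clear(self):
        # Dirs are only removed when the cleanup flag was set on the
        # context; the bug is analogous to creating it with the flag unset.
        if self.hdfs_cleanup:
            for d in self.scratch_dirs:
                shutil.rmtree(d, ignore_errors=True)
```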





[jira] [Created] (HIVE-21499) should not remove the function if create command failed with AlreadyExistsException

2019-03-24 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21499:
-

 Summary: should not remove the function if create command failed 
with AlreadyExistsException
 Key: HIVE-21499
 URL: https://issues.apache.org/jira/browse/HIVE-21499
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
 Environment: Hive-3.1
Reporter: Rajkumar Singh


As part of HIVE-20953 we remove the function if its creation failed for any reason, which leads to the following situation:
1. create function fails because the function already exists
2. on the failure in #1, Hive clears the permanent function from the registry
3. the function is then of no use until HiveServer2 is restarted
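The proposed behaviour can be sketched as follows (Python, illustrative only; names are hypothetical and the real fix is in Hive's Java function registry). The point is that an AlreadyExists failure must not unregister the still-valid existing function:

```python
class FunctionAlreadyExistsError(Exception):
    pass


REGISTRY = {"my_udf": "com.example.MyUdf"}  # hypothetical permanent function


def create_function(name, cls):
    if name in REGISTRY:
        raise FunctionAlreadyExistsError(name)
    REGISTRY[name] = cls


def create_with_cleanup(name, cls):
    """Current behaviour (simplified): unregister on *any* failure."""
    try:
        create_function(name, cls)
    except Exception:
        REGISTRY.pop(name, None)  # wipes the existing, working function
        raise


def create_with_guarded_cleanup(name, cls):
    """Proposed behaviour: keep the registry entry when the function
    already existed, since the registered copy is still valid."""
    try:
        create_function(name, cls)
    except FunctionAlreadyExistsError:
        raise  # do not touch the registry
    except Exception:
        REGISTRY.pop(name, None)
        raise
```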





[jira] [Created] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-03-28 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21538:
-

 Summary: Beeline: password source though the console reader did 
not pass to connection param
 Key: HIVE-21538
 URL: https://issues.apache.org/jira/browse/HIVE-21538
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0
 Environment: Hive-3.1 auth set to LDAP
Reporter: Rajkumar Singh


Beeline: a password sourced through the console reader is not passed to the connection parameters, which results in an authentication failure when LDAP authentication is used.
{code}
beeline -n USER -u 
"jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
 -p

Connecting to 
jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
Enter password for jdbc:hive2://host:2181/: 
19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
host:1
19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs 
from ZooKeeper
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport for any of the Server URI's in 
ZooKeeper: Peer indicated failure: PLAIN auth failed: 
javax.security.sasl.AuthenticationException: Error validating LDAP user [Caused 
by javax.naming.AuthenticationException: [LDAP: error code 49 - 80090308: 
LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 52e, v2580]] 
(state=08S01,code=0)
{code}





[jira] [Created] (HIVE-21601) Hive JDBC Storage Handler query fail because projected timestamp max precision is not valid for mysql

2019-04-10 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21601:
-

 Summary: Hive JDBC Storage Handler query fail because projected 
timestamp max precision is not valid for mysql
 Key: HIVE-21601
 URL: https://issues.apache.org/jira/browse/HIVE-21601
 Project: Hive
  Issue Type: Bug
  Components: Hive, JDBC
Affects Versions: 3.1.1
 Environment: Hive-3.1
Reporter: Rajkumar Singh


Steps to reproduce:
{code}
--mysql table
mysql> show create table dd_timestamp_error;
++--+
| Table  | Create Table 

|
++--+
| dd_timestamp_error | CREATE TABLE `dd_timestamp_error` (
  `col1` text,
  `col2` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE 
CURRENT_TIMESTAMP(6)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
++--+
1 row in set (0.00 sec)

-- hive table 

++
|   createtab_stmt   |
++
| CREATE EXTERNAL TABLE `dd_timestamp_error`(|
|   `col1` string COMMENT 'from deserializer',   |
|   `col2` timestamp COMMENT 'from deserializer')|
| ROW FORMAT SERDE   |
|   'org.apache.hive.storage.jdbc.JdbcSerDe' |
| STORED BY  |
|   'org.apache.hive.storage.jdbc.JdbcStorageHandler'  |
| WITH SERDEPROPERTIES ( |
|   'serialization.format'='1')  |
| TBLPROPERTIES (|
|   'bucketing_version'='2', |
|   'hive.sql.database.type'='MYSQL',|
|   'hive.sql.dbcp.maxActive'='1',   |
|   'hive.sql.dbcp.password'='testuser', |
|   'hive.sql.dbcp.username'='testuser', |
|   'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver',  |
|   'hive.sql.jdbc.url'='jdbc:mysql://c46-node3.squadron-labs.com/test',  |
|   'hive.sql.table'='dd_timestamp_error',   |
|   'transient_lastDdlTime'='1554910389')|
++

--query failure

0: jdbc:hive2://c46-node2.squadron-labs.com:2>  select * from 
dd_timestamp_error where col2 = '2019-04-03 15:54:21.543654';

Error: java.io.IOException: java.io.IOException: 
org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught 
exception while trying to execute query:You have an error in your SQL syntax; 
check the manual that corresponds to your MySQL server version for the right 
syntax to use near 'TIMESTAMP(9)) AS `col2`


--
explain select * from dd_timestamp_error where col2 = '2019-04-03 
15:54:21.543654';

TableScan [TS_0] |
| Output:["col1","col2"],properties:{"hive.sql.query":"SELECT `col1`, 
CAST(TIMESTAMP '2019-04-03 15:54:21.543654000' AS TIMESTAMP(9)) AS `col2`\nFROM 
`dd_timestamp_error`\nWHERE `col2` = TIMESTAMP '2019-04-03 
15:54:21.543654000'","hive.sql.query.fieldNames":"col1,col2","hive.sql.query.fieldTypes":"string,timestamp","hive.sql.query.split":"true"}
 |
|   
{code}

The problem seems to be with convertedFilterExpr ( -- where col2 = '2019-04-03 15:54:21.543654';) while comparing a timestamp with a constant:

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java#L856
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveTypeSystemImpl.java#L38

Hive's MAX_TIMESTAMP_PRECISION seems to be 9, and it appears that Hive pushes the same precision into the query projection (JDBC project) for MySQL, which fails the query.
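A fix direction would be to clamp the pushed-down precision to what the target database accepts. A toy Python sketch (illustrative only; the real change belongs in the Calcite translation layer in Java). MySQL's fractional-seconds cap of 6 is documented; the other entries in the map are assumptions:

```python
HIVE_MAX_TIMESTAMP_PRECISION = 9  # nanoseconds (HiveTypeSystemImpl)
# Assumed per-database fractional-second caps; MySQL's 6 is the documented one.
DB_MAX_PRECISION = {"MYSQL": 6, "POSTGRES": 6, "ORACLE": 9}


def projected_cast(column, db_type, precision=HIVE_MAX_TIMESTAMP_PRECISION):
    # Clamp the precision Hive pushes down to what the target database
    # accepts; MySQL rejects TIMESTAMP(9).
    p = min(precision, DB_MAX_PRECISION.get(db_type, precision))
    return f"CAST(`{column}` AS TIMESTAMP({p}))"
```

With the clamp, the generated MySQL projection would be `CAST(... AS TIMESTAMP(6))` instead of the invalid `TIMESTAMP(9)`.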









[jira] [Created] (HIVE-21728) WorkloadManager logging fix

2019-05-14 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21728:
-

 Summary: WorkloadManager logging fix 
 Key: HIVE-21728
 URL: https://issues.apache.org/jira/browse/HIVE-21728
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.2.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


The logger skips the following message if HS2 is running at INFO level:
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/WorkloadManager.java#L705
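The effect is easy to demonstrate with any leveled logger. A toy Python sketch (the message text and logger name are placeholders, not the actual WorkloadManager strings): a message logged at DEBUG is silently dropped when the process runs at INFO, so the fix is to log it at the level operators actually run at.

```python
import io
import logging


def emit(level_fix=False):
    buf = io.StringIO()
    logger = logging.getLogger("WorkloadManager")
    logger.handlers = [logging.StreamHandler(buf)]
    logger.propagate = False
    logger.setLevel(logging.INFO)  # typical HS2 production level
    msg = "Processing current events"  # placeholder message
    if level_fix:
        logger.info(msg)   # proposed: visible at the default level
    else:
        logger.debug(msg)  # current: silently dropped at INFO level
    return buf.getvalue()
```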






[jira] [Created] (HIVE-21902) HiveServer2 UI: Adding X-XSS-Protection, X-Content-Type-Options to jetty response header

2019-06-19 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21902:
-

 Summary: HiveServer2 UI: Adding X-XSS-Protection, 
X-Content-Type-Options to jetty response header
 Key: HIVE-21902
 URL: https://issues.apache.org/jira/browse/HIVE-21902
 Project: Hive
  Issue Type: Improvement
Reporter: Rajkumar Singh


Some vulnerabilities are reported for the web server UI:


X-Frame-Options or Content-Security-Policy: frame-ancestors HTTP Headers 
missing on port 10002. 
{code}
GET / HTTP/1.1 
Host: HOSTNAME:10002 
Connection: Keep-Alive 



X-XSS-Protection HTTP Header missing on port 10002. 
X-Content-Type-Options HTTP Header missing on port 10002. 
{code}
After the proposed changes:

{code}
HTTP/1.1 200 OK
Date: Thu, 20 Jun 2019 05:29:59 GMT
Content-Type: text/html;charset=utf-8
X-Content-Type-Options: nosniff
X-FRAME-OPTIONS: SAMEORIGIN
X-XSS-Protection: 1; mode=block
Set-Cookie: JSESSIONID=15kscuow9cmy7qms6dzaxllqt;Path=/
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Content-Length: 3824
Server: Jetty(9.3.25.v20180904)
{code}
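The change amounts to merging a fixed set of hardening headers into every response. A minimal sketch of that merge (Python, illustrative only; the actual patch would configure Jetty's response headers in Java):

```python
SECURITY_HEADERS = {
    "X-Content-Type-Options": "nosniff",
    "X-XSS-Protection": "1; mode=block",
    "X-FRAME-OPTIONS": "SAMEORIGIN",
}


def with_security_headers(headers):
    """Return a copy of the response headers with the hardening headers
    merged in, mirroring what the Jetty configuration change would add."""
    merged = dict(headers)
    for name, value in SECURITY_HEADERS.items():
        merged.setdefault(name, value)  # don't clobber explicit values
    return merged
```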





[jira] [Created] (HIVE-21903) HiveServer2 Query Compilation fails with StackOverflowError if the list of IN is too big

2019-06-20 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21903:
-

 Summary: HiveServer2 Query Compilation fails with 
StackOverflowError if the list of IN is too big
 Key: HIVE-21903
 URL: https://issues.apache.org/jira/browse/HIVE-21903
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.0
 Environment: Hive-3.1. java-8, Thread stack size default set to 1024k
Reporter: Rajkumar Singh
 Attachments: thread-progress.log

Steps to Reproduce:
Run a query including some joins and an IN clause containing more than 15000 values.

Attaching the handler thread's progress before it runs into the StackOverflowError.
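One workaround (my assumption, not part of this report) is to keep the huge IN list out of the compiler's expression tree entirely by materializing the values and joining. A toy Python generator of such a rewrite; `tmp_in_values` and its column `v` are hypothetical names:

```python
def in_list_as_semijoin(table, column, values, tmp_table="tmp_in_values"):
    """Hypothetical rewrite: materialize the IN values into a small table
    and use a LEFT SEMI JOIN, avoiding a 15k-element expression tree in
    the compiler."""
    inserts = "INSERT INTO {t} VALUES {vals};".format(
        t=tmp_table, vals=", ".join("({})".format(v) for v in values))
    query = ("SELECT a.* FROM {src} a LEFT SEMI JOIN {t} b "
             "ON a.{col} = b.v;").format(src=table, t=tmp_table, col=column)
    return inserts, query
```

Increasing the HS2 thread stack size (-Xss) is the other common mitigation.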





[jira] [Created] (HIVE-21927) HiveServer Web UI: Setting the HttpOnly option in the cookies

2019-06-26 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21927:
-

 Summary: HiveServer Web UI: Setting the HttpOnly option in the 
cookies
 Key: HIVE-21927
 URL: https://issues.apache.org/jira/browse/HIVE-21927
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


The intent of this JIRA is to introduce the HttpOnly option in the cookie.

Cookie before the change:

{code:java}
hdp32b  FALSE   /   FALSE   0   JSESSIONID  8dkibwayfnrc4y4hvpu3vh74
{code}

After the change:

{code:java}
#HttpOnly_hdp32bFALSE   /   FALSE   0   JSESSIONID  
e1npdkbo3inj1xnd6gdc6ihws

{code}
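The `#HttpOnly_` prefix above is how cookie files mark the flag; on the wire it is an attribute on the Set-Cookie header. A small Python sketch of producing the flagged header (illustrative; the real change is in HiveServer2's Jetty session config):

```python
from http.cookies import SimpleCookie


def session_cookie(session_id, httponly=True):
    c = SimpleCookie()
    c["JSESSIONID"] = session_id
    c["JSESSIONID"]["path"] = "/"
    if httponly:
        # HttpOnly keeps the cookie out of reach of document.cookie,
        # mitigating session theft via XSS.
        c["JSESSIONID"]["httponly"] = True
    return c.output(header="Set-Cookie:")
```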







[jira] [Created] (HIVE-21935) Hive Vectorization : Server performance issue with vectorize UDF

2019-06-28 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21935:
-

 Summary: Hive Vectorization : Server performance issue with 
vectorize UDF  
 Key: HIVE-21935
 URL: https://issues.apache.org/jira/browse/HIVE-21935
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 3.1.1
 Environment: Hive-3, JDK-8
Reporter: Rajkumar Singh


With vectorization turned on and hive.vectorized.adaptor.usage.mode=all, we were seeing severe performance degradation. Looking at the task jstacks, it seems the tasks are running the code that vectorizes the UDF and are stuck in some loop.


{code:java}
jstack -l 14954 | grep 0x3af0 -A20
"TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
runnable [0x7f1547581000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:573)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
[yarn@hdp32b ~]$ jstack -l 14954 | grep 0x3af0 -A20
"TezChild" #15 daemon prio=5 os_prio=0 tid=0x7f157538d800 nid=0x3af0 
runnable [0x7f1547581000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:554)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:570)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.setResult(VectorUDFAdaptor.java:205)
at 
org.apache.hadoop.hive.ql.exec.vector.udf.VectorUDFAdaptor.evaluate(VectorUDFAdaptor.java:150)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:271)
at 
org.apache.hadoop.hive.ql.exec.vector.expressions.ListIndexColScalar.evaluate(ListIndexColScalar.java:59)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:146)
at 
org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:889)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)

{code}

after setting the hive.vectori

[jira] [Created] (HIVE-21972) "show transactions" display the header twice

2019-07-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21972:
-

 Summary: "show transactions" display the header twice
 Key: HIVE-21972
 URL: https://issues.apache.org/jira/browse/HIVE-21972
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


show transactions;
{code:java}
+-----------------+--------------------+----------------+----------------------+-------+-------------------+
|      txnid      |       state        |  startedtime   |  lastheartbeattime   | user  |       host        |
+-----------------+--------------------+----------------+----------------------+-------+-------------------+
| Transaction ID  | Transaction State  | Started Time   | Last Heartbeat Time  | User  | Hostname          |
| 896             | ABORTED            | 1560209607000  | 1560209607000        | hive  | hdp32b.hdp.local  |
+-----------------+--------------------+----------------+----------------------+-------+-------------------+
{code}





[jira] [Created] (HIVE-21973) "show locks" print the header twice.

2019-07-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21973:
-

 Summary: "show locks" print the header twice.
 Key: HIVE-21973
 URL: https://issues.apache.org/jira/browse/HIVE-21973
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


show locks; 
-- output

{code:java}
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+
|  lockid  | database  | table  | partition  | lock_state  | blocked_by  | lock_type  | transaction_id  | last_heartbeat  | acquired_at  | user  | hostname  | agent_info  |
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+
| Lock ID  | Database  | Table  | Partition  | State       | Blocked By  | Type       | Transaction ID  | Last Heartbeat  | Acquired At  | User  | Hostname  | Agent Info  |
+----------+-----------+--------+------------+-------------+-------------+------------+-----------------+-----------------+--------------+-------+-----------+-------------+
{code}






[jira] [Created] (HIVE-21986) HiveServer Web UI: Setting the Strict-Transport-Security in default response header

2019-07-10 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21986:
-

 Summary: HiveServer Web UI: Setting the Strict-Transport-Security 
in default response header
 Key: HIVE-21986
 URL: https://issues.apache.org/jira/browse/HIVE-21986
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Currently, the HiveServer UI HTTP response headers don't include Strict-Transport-Security, so this change adds it to the default headers.





[jira] [Created] (HIVE-22081) Hivemetastore Performance: Compaction Initiator thread overwhelmed if no there are too many Table/partitions are eligible for compaction

2019-08-02 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-22081:
-

 Summary: Hivemetastore Performance: Compaction Initiator thread 
overwhelmed if no there are too many Table/partitions are eligible for 
compaction 
 Key: HIVE-22081
 URL: https://issues.apache.org/jira/browse/HIVE-22081
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


If automatic compaction is turned on, the Initiator thread checks for potential tables/partitions that are eligible for compaction and runs some checks in a for loop before requesting compaction for the eligible ones. Although the Initiator thread is configured to run at a 5-minute interval by default, with many objects it keeps on running, as these checks are IO intensive and hog CPU.
In the proposed changes, I am planning to:
1. pass fewer objects to the for loop by filtering out objects based on the conditions we currently check within the loop.
2. determine the compaction type (where we make the FileSystem calls) asynchronously using futures.
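The two proposed changes can be sketched together (Python, illustrative only; the real Initiator is Java and the callback names here are placeholders): filter with cheap metadata checks first, then fan the IO-heavy type determination out over a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def find_potential_compactions(candidates, cheap_filter, determine_type,
                               workers=4):
    """1) Filter with cheap metadata checks before the expensive loop.
       2) Run the IO-heavy compaction-type determination concurrently."""
    eligible = [c for c in candidates if cheap_filter(c)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with eligible
        types = list(pool.map(determine_type, eligible))
    return list(zip(eligible, types))
```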



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (HIVE-22118) compaction worker thread won't log the table name while skipping the compaction because it's sorted table/partitions

2019-08-15 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-22118:
-

 Summary: compaction worker thread won't log the table name while 
skipping the compaction because it's sorted table/partitions
 Key: HIVE-22118
 URL: https://issues.apache.org/jira/browse/HIVE-22118
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh
 Attachments: HIVE-22118.patch

From a debugging perspective, it is helpful to log the full table name when skipping a table for compaction; otherwise it is tedious to work out why compaction is not happening for the target table.





[jira] [Created] (HIVE-22144) HiveServer Web UI: Adding secure flag to the cookies options

2019-08-26 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22144:
-

 Summary: HiveServer Web UI: Adding secure flag to the cookies 
options
 Key: HIVE-22144
 URL: https://issues.apache.org/jira/browse/HIVE-22144
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.1.1
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Introduce the Secure flag in the cookie options.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (HIVE-22173) HiveServer2: Query with multiple lateral view hung forever during compile stage

2019-09-05 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22173:
-

 Summary: HiveServer2: Query with multiple lateral view hung 
forever during compile stage
 Key: HIVE-22173
 URL: https://issues.apache.org/jira/browse/HIVE-22173
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.1
 Environment: Hive-3.1.1, Java-8
Reporter: Rajkumar Singh


Steps To Repro:
{code:java}
-- create table 

CREATE EXTERNAL TABLE `jsontable`( 
`json_string` string) 
ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;

-- Run explain of the query
explain SELECT
*
FROM jsontable
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr.city'), "\\[|\\]|\"", ""),',')) t1 as c1
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr.country'), "\\[|\\]|\"", ""),',')) t2 as c2
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr'), "\\[|\\]|\"", ""),',')) t3 as c3
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr.postalCode'), "\\[|\\]|\"", ""),',')) t4 as c4
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr.state'), "\\[|\\]|\"", ""),',')) t5 as c5
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.addr.streetAddressLine'), "\\[|\\]|\"", ""),',')) t6 as c6
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t7 as c7
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.dummyfield'), "\\[|\\]|\"", ""),',')) t8 as c8
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.dummyfield.name.suffix'), "\\[|\\]|\"", ""),',')) t9 as c9
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.id.extension'), "\\[|\\]|\"", ""),',')) t10 as c10
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.id'), "\\[|\\]|\"", ""),',')) t11 as c11
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.id.root'), "\\[|\\]|\"", ""),',')) t12 as c12
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.telecom.'), "\\[|\\]|\"", ""),',')) t13 as c13
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.dummyfield1.use'), "\\[|\\]|\"", ""),',')) t14 as c14
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t15 as c15
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield1.dummyfield1.code'), "\\[|\\]|\"", ""),',')) t16 as c16
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield1.dummyfield1.value'), "\\[|\\]|\"", ""),',')) t17 as c17
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t18 as c18
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.city'), "\\[|\\]|\"", ""),',')) t19 as c19
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t20 as c20
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.country'), "\\[|\\]|\"", ""),',')) t21 as c21
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield'), "\\[|\\]|\"", ""),',')) t22 as c22
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.postalCode'), "\\[|\\]|\"", ""),',')) t23 as c23
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.postalCode'), "\\[|\\]|\"", ""),',')) t24 as c24
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.state'), "\\[|\\]|\"", ""),',')) t25 as c25
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.state'), "\\[|\\]|\"", ""),',')) t26 as c26
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2'), "\\[|\\]|\"", ""),',')) t27 as c27
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfield2.streetAddressLine'), "\\[|\\]|\"", ""),',')) t28 as c28
lateral view 
explode(split(regexp_replace(get_json_object(jsontable.json_string, 
'$.jsonfi

[jira] [Created] (HIVE-22255) Hive doesn't trigger Major Compaction automatically if the table contains only base files

2019-09-27 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22255:
-

 Summary: Hive doesn't trigger Major Compaction automatically if the 
table contains only base files 
 Key: HIVE-22255
 URL: https://issues.apache.org/jira/browse/HIVE-22255
 Project: Hive
  Issue Type: Bug
  Components: Hive, Transactions
Affects Versions: 3.1.2
 Environment: Hive-3.1.1
Reporter: Rajkumar Singh


A user may run into this issue if the table consists only of base files with no 
deltas: in that case the following condition yields false and automatic major 
compaction is skipped.

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java#L313]
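The shape of that check can be sketched as follows (a simplified model with hypothetical names, not the actual Initiator code):

```java
// Simplified sketch with hypothetical names: the initiator only considers
// a partition for major compaction when delta directories exist, so a
// directory layout of base_N dirs only (e.g. after repeated INSERT
// OVERWRITE) never produces a compaction request.
public class CompactionInitiatorSketch {
    static boolean shouldInitiateMajor(int deltaCount, boolean hasBase) {
        // Modeled on the condition around Initiator.java#L313:
        // no deltas -> false, however many base directories piled up.
        return hasBase && deltaCount > 0;
    }

    public static void main(String[] args) {
        // Table with only base_001 and base_002, no delta_* dirs:
        System.out.println(shouldInitiateMajor(0, true)); // prints false
    }
}
```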

 

Steps to Reproduce:
 # create Acid table 
{code:java}
create table myacid(id int);
{code}

 # Run multiple insert statements 
{code:java}
insert overwrite table myacid values(1);
insert overwrite table myacid values(2),(3),(4){code}

 # DFS ls output
{code:java}
dfs -ls -R /warehouse/tablespace/managed/hive/myacid;
++
|                     DFS Output                     |
++
| drwxrwx---+  - hive hadoop          0 2019-09-27 16:42 
/warehouse/tablespace/managed/hive/myacid/base_001 |
| -rw-rw+  3 hive hadoop          1 2019-09-27 16:42 
/warehouse/tablespace/managed/hive/myacid/base_001/_orc_acid_version |
| -rw-rw+  3 hive hadoop        610 2019-09-27 16:42 
/warehouse/tablespace/managed/hive/myacid/base_001/bucket_0 |
| drwxrwx---+  - hive hadoop          0 2019-09-27 16:43 
/warehouse/tablespace/managed/hive/myacid/base_002 |
| -rw-rw+  3 hive hadoop          1 2019-09-27 16:43 
/warehouse/tablespace/managed/hive/myacid/base_002/_orc_acid_version |
| -rw-rw+  3 hive hadoop        633 2019-09-27 16:43 
/warehouse/tablespace/managed/hive/myacid/base_002/bucket_0 |
++
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22317) Beeline site parser does not handle the variable substitution correctly

2019-10-09 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22317:
-

 Summary: Beeline site parser does not handle the variable 
substitution correctly
 Key: HIVE-22317
 URL: https://issues.apache.org/jira/browse/HIVE-22317
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 4.0.0
 Environment: Hive-4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


beeline-site.xml
{code:java}
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
 <property>
  <name>beeline.hs2.jdbc.url.container</name>
  <value>jdbc:hive2://c3220-node2.host.com:2181,c3220-node3.host.com:2181,c3220-node4.host.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2</value>
 </property>
 <property>
  <name>beeline.hs2.jdbc.url.default</name>
  <value>test</value>
 </property>
 <property>
  <name>beeline.hs2.jdbc.url.test</name>
  <value>${beeline.hs2.jdbc.url.container}?tez.queue.name=myqueue</value>
 </property>
 <property>
  <name>beeline.hs2.jdbc.url.llap</name>
  <value>jdbc:hive2://c3220-node2.host.com:2181,c3220-node3.host.com:2181,c3220-node4.host.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive</value>
 </property>
</configuration>
 {code}
Beeline fails to connect because it does not resolve the substituted value 
correctly:
{code:java}
beeline
Error in parsing jdbc url: 
${beeline.hs2.jdbc.url.container}?tez.queue.name=myqueue from beeline-site.xml
beeline>  {code}
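The fix amounts to expanding ${...} placeholders against the other properties parsed from beeline-site.xml before the URL reaches the JDBC driver. A minimal sketch of such substitution (hypothetical helper, not Beeline's actual parser):

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical placeholder expansion: replaces ${prop} tokens in a JDBC
// URL with values from the parsed beeline-site.xml properties. Unknown
// properties are left untouched so the failure stays visible.
public class BeelineSiteSubstitution {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    static String expand(String url, Map<String, String> props) {
        Matcher m = VAR.matcher(url);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = props.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
                "beeline.hs2.jdbc.url.container",
                "jdbc:hive2://zk1:2181/;serviceDiscoveryMode=zooKeeper");
        System.out.println(expand(
                "${beeline.hs2.jdbc.url.container}?tez.queue.name=myqueue",
                props));
    }
}
```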





[jira] [Created] (HIVE-22352) Hive JDBC Storage Handler, simple select query failed with NPE if executed using Fetch Task

2019-10-15 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22352:
-

 Summary: Hive JDBC Storage Handler, simple select query failed 
with NPE if executed using Fetch Task
 Key: HIVE-22352
 URL: https://issues.apache.org/jira/browse/HIVE-22352
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.1
 Environment: Hive-3.1
Reporter: Rajkumar Singh


Steps To Repro:

 
{code:java}
// MySQL Table
CREATE TABLE `visitors` ( `id` bigint(20) unsigned NOT NULL, `date` timestamp 
NOT NULL DEFAULT CURRENT_TIMESTAMP )
// hive table
CREATE EXTERNAL TABLE `hive_visitors`( `col1` bigint COMMENT 'from 
deserializer', `col2` timestamp COMMENT 'from deserializer') ROW FORMAT SERDE 
'org.apache.hive.storage.jdbc.JdbcSerDe' STORED BY 
'org.apache.hive.storage.jdbc.JdbcStorageHandler' WITH SERDEPROPERTIES ( 
'serialization.format'='1') TBLPROPERTIES ( 'bucketing_version'='2', 
'hive.sql.database.type'='MYSQL', 'hive.sql.dbcp.maxActive'='1', 
'hive.sql.dbcp.password'='hive', 'hive.sql.dbcp.username'='hive', 
'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver', 
'hive.sql.jdbc.url'='jdbc:mysql://hostname/test', 'hive.sql.table'='visitors', 
'transient_lastDdlTime'='1554910389')
Query:
select * from hive_visitors ;
Exception:
2019-10-16T04:04:39,483 WARN  [HiveServer2-Handler-Pool: Thread-71]: 
thrift.ThriftCLIService (:()) - Error fetching results: 
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
java.lang.NullPointerException at 
org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:478)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:952)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source) ~[?:?] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112] at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_112] at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at 
javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 ~[hadoop-common-3.1.1.3.1.4.0-315.jar:?] at 
org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
com.sun.proxy.$Proxy42.fetchResults(Unknown Source) ~[?:?] at 
org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:565) 
~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:792)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1837)
 ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1822)
 ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
 ~[hive-service-3.1.0.3.1.4.0-315.jar:3.1.0.3.1.4.0-315] at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 ~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
~[?:1.8.0_112] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
~[?:1.8.0_112] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112] Caused 
by: java.io.IOException: java.lang.NullPointerException at 
org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:602) 
~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:509) 
~[hive-exec-3.1.0.3.1.4.0-315.jar:3.1.1000-SNAPSHOT] at 
org.apache.hadoop.hive.ql.exec.Fetch

[jira] [Created] (HIVE-22353) Hive JDBC Storage Handler: Simple query fail with " Invalid table alias or column reference"

2019-10-15 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22353:
-

 Summary: Hive JDBC Storage Handler: Simple query fail with " 
Invalid table alias or column reference"
 Key: HIVE-22353
 URL: https://issues.apache.org/jira/browse/HIVE-22353
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajkumar Singh


Steps To Repro:
{code:java}
// show create table (Hive) 
CREATE EXTERNAL TABLE `hive_visitors`(
  `col1` bigint COMMENT 'from deserializer',   
  `col2` timestamp COMMENT 'from deserializer')
ROW FORMAT SERDE   
  'org.apache.hive.storage.jdbc.JdbcSerDe' 
STORED BY  
  'org.apache.hive.storage.jdbc.JdbcStorageHandler'  
WITH SERDEPROPERTIES ( 
  'serialization.format'='1')  
TBLPROPERTIES (
  'bucketing_version'='2', 
  'hive.sql.database.type'='MYSQL',
  'hive.sql.dbcp.maxActive'='1',   
  'hive.sql.dbcp.password'='hive', 
  'hive.sql.dbcp.username'='hive', 
  'hive.sql.jdbc.driver'='com.mysql.jdbc.Driver',  
  'hive.sql.jdbc.url'='jdbc:mysql://hostname/test',  
  'hive.sql.table'='visitors',   
  'transient_lastDdlTime'='1554910389') 

// MySql Table
CREATE TABLE `visitors` (
`id` bigint(20) unsigned NOT NULL,
`date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
)

// Hive Query

 select * from hive_visitors where col2='2018-10-15';

Error: Error while compiling statement: FAILED: SemanticException [Error 
10004]: Line 1:34 Invalid table alias or column reference 'col2': (possible 
column names are: id, date) (state=42000,code=10004)
{code}
col2 is a valid column reference for the Hive table. In some older versions I was 
able to run the query referencing the Hive column, but it is broken now.

 





[jira] [Created] (HIVE-22397) "describe table" statement for a table backed by a custom storage handler fails with CNF

2019-10-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22397:
-

 Summary: "describe table"  statement for the table backed by 
custom storage handler fail with CNF
 Key: HIVE-22397
 URL: https://issues.apache.org/jira/browse/HIVE-22397
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Steps to Repro:
{code:java}
1) describe customsdtable;

2) ADD JAR hdfs:///user/hive/customsdtable.jar;

3) describe customsdtable;

CNF (ClassNotFoundException) is expected for statement #1, but even after adding 
the custom SerDe jar, Hive fails with the following exception for statement #3:

Error: Error while processing statement: FAILED: Execution Error, return code 1 
from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.ClassNotFoundException

{code}





[jira] [Created] (HIVE-22467) Hive-1: does not set jetty request.header.size correctly in case of SSL setup

2019-11-06 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22467:
-

 Summary: Hive-1: does not set jetty request.header.size correctly 
in case of SSL setup
 Key: HIVE-22467
 URL: https://issues.apache.org/jira/browse/HIVE-22467
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 1.2.1, 1.0.0, 1.3.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


on hive-1:

if the user has SSL set up with HS2, then SslSelectChannelConnector overrides the 
connector's request header settings.

[https://github.com/apache/hive/blob/5740946859fcca44b5e453ef02534b1ec5edcbca/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpCLIService.java#L102]





[jira] [Created] (HIVE-22630) Do not retrieve Materialized View definition for rebuild if query is test SQL

2019-12-11 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22630:
-

 Summary: Do not retrieve Materialized View definition for rebuild 
if query is test SQL
 Key: HIVE-22630
 URL: https://issues.apache.org/jira/browse/HIVE-22630
 Project: Hive
  Issue Type: Bug
 Environment: Hive-3.1.2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


For queries like select 1, select current_timestamp, and select current_date,

Hive retrieves all materialized views from the metastore. If the number of 
databases is large, this call takes a lot of time, and the situation becomes 
worse if HiveServer2 receives frequent "select 1" queries (connection pools use 
them to check whether a connection is still valid).
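A guard of roughly this shape (hypothetical helper, not the actual Hive patch) would let HS2 answer such probes without touching the metastore:

```java
import java.util.Set;

// Hypothetical short-circuit: recognize trivial validation queries
// (commonly issued by JDBC connection pools) so materialized-view
// retrieval from the metastore can be skipped for them.
public class TrivialQueryCheck {
    private static final Set<String> TRIVIAL = Set.of(
            "select 1", "select current_timestamp", "select current_date");

    static boolean isTrivial(String sql) {
        return TRIVIAL.contains(sql.trim().toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(isTrivial("SELECT 1"));          // true
        System.out.println(isTrivial("select * from t1"));  // false
    }
}
```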





[jira] [Created] (HIVE-22712) ReExec Driver execute submit the query in default queue irrespective of user defined queue

2020-01-09 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22712:
-

 Summary: ReExec Driver execute submit the query in default queue 
irrespective of user defined queue
 Key: HIVE-22712
 URL: https://issues.apache.org/jira/browse/HIVE-22712
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 3.1.2
 Environment: Hive-3
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


We unset the queue name intentionally in 
TezSessionState#startSessionAndContainers; 

as a result, re-execution creates a new session in the default queue, which 
causes a problem. It is cumbersome to add reexec.overlay.tez.queue.name at the 
session level.

I could not find a better way of setting the queue name (I am open to 
suggestions here), since it can conflict with the global queue name vs. the 
user-defined queue; that is why it is set during initialization of 
ReExecutionOverlayPlugin.





[jira] [Created] (HIVE-22855) Do not change DB location to external if DB URI already exists or already refers to a non-managed location

2020-02-07 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22855:
-

 Summary: Do not change DB location to external if DB URI already 
exists or already refers to a non-managed location
 Key: HIVE-22855
 URL: https://issues.apache.org/jira/browse/HIVE-22855
 Project: Hive
  Issue Type: Bug
  Components: Hive
 Environment: Hive-3
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh
 Attachments: HIVE-22855.patch

from Spark:
{code:java}
spark.sql("CREATE DATABASE IF NOT EXISTS test LOCATION '/tmp/test'")
spark.sql("describe database test").show(false)
{code}
The describe output suggests that the DB URI is updated to the external 
warehouse path, so all data will be written to the Hive external warehouse path, 
which is undesired.





[jira] [Created] (HIVE-22945) Hive ACID Data Corruption: Update command messes up other column data and produces incorrect results

2020-02-28 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-22945:
-

 Summary: Hive ACID Data Corruption: Update command messes up other 
column data and produces incorrect results
 Key: HIVE-22945
 URL: https://issues.apache.org/jira/browse/HIVE-22945
 Project: Hive
  Issue Type: Bug
  Components: Hive, Transactions
Affects Versions: 3.2.0
Reporter: Rajkumar Singh


The Hive UPDATE operation updates the other column incorrectly and produces 
incorrect results.

Steps to reproduce:
{code:java}
CREATE TABLE `test`(
  `start_dt` timestamp, 
  `stop_dt` timestamp
  );
  
INSERT INTO test (start_dt, stop_dt) SELECT  CURRENT_TIMESTAMP, CAST(NULL AS 
TIMESTAMP);

select * from test; 
+--+---+
|  test.start_dt   | test.stop_dt  |
+--+---+
| 2020-02-28 20:06:29.116  | NULL  |
+--+---+

UPDATE test SET STOP_DT = CURRENT_TIMESTAMP WHERE CAST(START_DT AS DATE) = 
CURRENT_DATE;

select * from test;

++--+
| test.start_dt  |   test.stop_dt   |
++--+
| 2020-02-28 00:00:00.0  | 2020-02-28 20:07:12.248  |
++--+
{code}





[jira] [Created] (HIVE-23408) Hive on Tez: Kafka storage handler broken in secure environment

2020-05-07 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23408:
-

 Summary: Hive on Tez: Kafka storage handler broken in secure 
environment
 Key: HIVE-23408
 URL: https://issues.apache.org/jira/browse/HIVE-23408
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


hive.server2.authentication.kerberos.principal is set in the form 
hive/_HOST@REALM.

A Tez task can start on a random NodeManager host and expands _HOST with the 
FQDN of the host where it is running; this leads to an authentication issue.

For LLAP there is a fallback to the LLAP daemon keytab/principal. Kafka 1.1 
onwards supports delegation tokens, and we should take advantage of that for 
Hive on Tez.
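The problematic expansion can be illustrated with plain string substitution (a sketch only; Hadoop's SecurityUtil performs the real resolution):

```java
import java.net.InetAddress;

// Sketch of the _HOST expansion problem: the principal pattern is
// resolved against the local host, so inside a Tez container it yields
// the NodeManager's FQDN rather than the expected service host.
public class HostPrincipalExpansion {
    static String resolve(String principalPattern, String localFqdn) {
        return principalPattern.replace("_HOST", localFqdn.toLowerCase());
    }

    public static void main(String[] args) throws Exception {
        String pattern = "hive/_HOST@REALM";
        String fqdn = InetAddress.getLocalHost().getCanonicalHostName();
        // On an arbitrary NM host this is not the HS2 principal:
        System.out.println(resolve(pattern, fqdn));
    }
}
```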





[jira] [Created] (HIVE-23457) Hive: incorrect result with subquery when the optimizer misses the aggregation stage

2020-05-12 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23457:
-

 Summary: Hive: incorrect result with subquery when the optimizer 
misses the aggregation stage 
 Key: HIVE-23457
 URL: https://issues.apache.org/jira/browse/HIVE-23457
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.2.0
Reporter: Rajkumar Singh


Steps to Repro:
{code:java}
create table abc (id int);
insert into table abc values (1),(2),(3),(4),(5),(6);
select * from abc order by id desc
6
5
4
3
2
1
select `id` from (select * from abc order by id desc ) as tmp;
1
2
3
4
5
6
 
{code}

Looking at the query plan, it seems that with the subquery the optimizer misses 
the aggregation stage; I can't see any reduce stage.

{code:java}
set hive.query.results.cache.enabled=false;
explain select * from abc order by id desc;
++
|  Explain   |
++
| Plan optimized by CBO. |
||
| Vertex dependency in root stage|
| Reducer 2 <- Map 1 (SIMPLE_EDGE)   |
||
| Stage-0|
|   Fetch Operator   |
| limit:-1   |
| Stage-1|
|   Reducer 2 vectorized |
|   File Output Operator [FS_8]  |
| Select Operator [SEL_7] (rows=6 width=4)   |
|   Output:["_col0"] |
| <-Map 1 [SIMPLE_EDGE] vectorized   |
|   SHUFFLE [RS_6]   |
| Select Operator [SEL_5] (rows=6 width=4) |
|   Output:["_col0"] |
|   TableScan [TS_0] (rows=6 width=4)|
| default@abc,abc, ACID 
table,Tbl:COMPLETE,Col:COMPLETE,Output:["id"] |
||
++


explain select `id` from (select * from abc order by id desc ) as tmp;
+--+
|   Explain|
+--+
| Plan optimized by CBO.   |
|  |
| Stage-0  |
|   Fetch Operator |
| limit:-1 |
| Select Operator [SEL_1]  |
|   Output:["_col0"]   |
|   TableScan [TS_0]   |
| Output:["id"]|
|  |
+--+
{code}






[jira] [Created] (HIVE-23498) Disable HTTP Trace method on ThriftServer

2020-05-18 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23498:
-

 Summary: Disable HTTP Trace method on ThriftServer
 Key: HIVE-23498
 URL: https://issues.apache.org/jira/browse/HIVE-23498
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh








[jira] [Created] (HIVE-23512) ReplDumpTask: add a debug log to print open txns for debugging purposes

2020-05-19 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23512:
-

 Summary: ReplDumpTask: add a debug log to print open txns for 
debugging purposes
 Key: HIVE-23512
 URL: https://issues.apache.org/jira/browse/HIVE-23512
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 3.2.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Often we see ReplDumpTask waiting for 
hive.repl.bootstrap.dump.open.txn.timeout (1h) to kill open txns and make 
progress. The only way to know which txns it is waiting on is to query the 
metastore DB and backtrack the txns in HS2 logs to see whether the open txns are 
genuinely open for this long or there is some other issue.
I am adding a debug log to print these txns, which can help in debugging such 
issues.





[jira] [Created] (HIVE-23542) Query based compaction failing with ClassCastException

2020-05-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23542:
-

 Summary: Query based compaction failing with ClassCastException 
 Key: HIVE-23542
 URL: https://issues.apache.org/jira/browse/HIVE-23542
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Steps to repro:
create table test(id int);
insert into table test values (1),(2),(3);
insert into table test values (4),(5),(6);

Run the query-based compactor and it will fail with the following exception:

{code:java}
alter table test compact 'major'; -- query based compaction 

Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
... 16 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
... 19 more
Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to 
org.apache.hadoop.io.IntWritable
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:967)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
{code}










[jira] [Created] (HIVE-23675) WebHcat: java level deadlock in hcat in presence of InMemoryJAAS

2020-06-10 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23675:
-

 Summary: WebHcat: java level deadlock in hcat in presence of 
InMemoryJAAS
 Key: HIVE-23675
 URL: https://issues.apache.org/jira/browse/HIVE-23675
 Project: Hive
  Issue Type: Improvement
Reporter: Rajkumar Singh


ENV: Kerberos/SPNEGO enabled

set hive.exec.post.hook;
org.apache.hadoop.hive.ql.hooks.ATSHook,org.apache.atlas.hive.hook.HiveHook

The Atlas hook uses InMemoryJAASConfiguration.

This is the sequence of events when the issue reproduces:

WebHCat -> hcat -> Hive Driver -> post-hook execution creates the ATSHook -> the 
hook starts SPNEGO auth and gets stuck while looking up the 
InMemoryJAASConfiguration used by the AtlasHook (this happens in a separate "ATS 
Logger" thread)

Hcat jstack
{code:java}
Found one Java-level deadlock:
 =
 "ATS Logger 0":
   waiting to lock monitor 0x7efdc8003a38 (object 0xf3fcfe28, a 
org.apache.atlas.plugin.classloader.AtlasPluginClassLoader),
   which is held by "main"
 "main":
   waiting to lock monitor 0x7efdc8003da8 (object 0xc0050d40, a 
org.apache.hadoop.hive.ql.exec.UDFClassLoader),
   which is held by "ATS Logger 0"

Java stack information for the threads listed above:
 ===
 "ATS Logger 0":
 at 
org.apache.atlas.security.InMemoryJAASConfiguration.getAppConfigurationEntry(InMemoryJAASConfiguration.java:238)
 at 
sun.security.jgss.LoginConfigImpl.getAppConfigurationEntry(LoginConfigImpl.java:145)
 at javax.security.auth.login.LoginContext.init(LoginContext.java:251)
 at javax.security.auth.login.LoginContext.(LoginContext.java:512)
 at sun.security.jgss.GSSUtil.login(GSSUtil.java:256)
 at sun.security.jgss.krb5.Krb5Util.getTicket(Krb5Util.java:158)
 at 
sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:335)
 at 
sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:331)
 at java.security.AccessController.doPrivileged(Native Method)
 at 
sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:330)
 at 
sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:145)
 at 
sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
 at 
sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
 at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
 at 
sun.security.jgss.spnego.SpNegoContext.GSS_initSecContext(SpNegoContext.java:882)
 at 
sun.security.jgss.spnego.SpNegoContext.initSecContext(SpNegoContext.java:317)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:248)
 at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
 at 
sun.net.www.protocol.http.spnego.NegotiatorImpl.init(NegotiatorImpl.java:108)
 at 
sun.net.www.protocol.http.spnego.NegotiatorImpl.(NegotiatorImpl.java:117)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at sun.net.www.protocol.http.Negotiator.getNegotiator(Negotiator.java:63)
 at 
sun.net.www.protocol.http.NegotiateAuthentication.isSupportedImpl(NegotiateAuthentication.java:130)
 - locked <0xf48c4d90> (a java.lang.Class for 
sun.net.www.protocol.http.NegotiateAuthentication)
 at 
sun.net.www.protocol.http.NegotiateAuthentication.isSupported(NegotiateAuthentication.java:102)
 - locked <0xc0050d40> (a 
org.apache.hadoop.hive.ql.exec.UDFClassLoader)
 at 
sun.net.www.protocol.http.AuthenticationHeader.parse(AuthenticationHeader.java:180)
 at 
sun.net.www.protocol.http.AuthenticationHeader.(AuthenticationHeader.java:126)
 at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1660)
 - locked <0xf47b7298> (a 
sun.net.www.protocol.https.DelegateHttpsURLConnection)
 at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
 - locked <0xf47b7298> (a 
sun.net.www.protocol.https.DelegateHttpsURLConnection)
 at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
 at 
sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:338)
 at 
org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:191)
 at 
org.apache.hadoop.security.toke

[jira] [Created] (HIVE-23752) Cast as Date for an invalid date produces a valid output

2020-06-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23752:
-

 Summary: Cast as Date for an invalid date produces a valid output
 Key: HIVE-23752
 URL: https://issues.apache.org/jira/browse/HIVE-23752
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajkumar Singh


Hive-3:

{code:java}
select cast("-00-00" as date) 
0002-11-30 

select cast("2010-27-54" as date)
 2012-04-23

select cast("1992-00-74" as date) ;
1992-02-12

{code}

The reason Hive allows this is that the parser format is set to LENIENT 
(https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/common/src/java/org/apache/hadoop/hive/common/type/Date.java#L50).
 This seems to be an intentional choice, as changing the ResolverStyle to STRICT 
starts failing the tests.
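The rollover behavior matches java.time's lenient resolver; a standalone illustration (not Hive's code path) reproduces the same outputs shown above:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.ResolverStyle;

// Demonstrates how ResolverStyle.LENIENT turns out-of-range date fields
// into a rolled-over date instead of a parse error, which is why Hive-3
// accepts strings like "2010-27-54".
public class LenientDateParse {
    static LocalDate parseLenient(String s) {
        return LocalDate.parse(s, DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.LENIENT));
    }

    public static void main(String[] args) {
        // Month 27 rolls into 2012, day 54 rolls into April:
        System.out.println(parseLenient("2010-27-54")); // 2012-04-23
        // Month 0 rolls back into 1991-12, day 74 rolls into February:
        System.out.println(parseLenient("1992-00-74")); // 1992-02-12

        // STRICT rejects the same input with a DateTimeParseException:
        try {
            LocalDate.parse("2010-27-54",
                    DateTimeFormatter.ofPattern("uuuu-MM-dd")
                            .withResolverStyle(ResolverStyle.STRICT));
        } catch (Exception e) {
            System.out.println("STRICT rejects it");
        }
    }
}
```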





[jira] [Created] (HIVE-23753) Make LLAP Secretmanager token path configurable

2020-06-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23753:
-

 Summary: Make LLAP Secretmanager token path configurable
 Key: HIVE-23753
 URL: https://issues.apache.org/jira/browse/HIVE-23753
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


In a very busy LLAP cluster, if for some reason the tokens under the 
zkdtsm_hive_llap0 ZK path are not cleaned, LLAP daemon startup takes a very long 
time. This may lead to a service outage if the LLAP daemons are not started 
before the number of retries while checking the LLAP app status is exceeded. 
Looking at the jstack of an LLAP daemon, it seems to traverse the 
zkdtsm_hive_llap0 ZK path before starting the secret manager.


{code:java}
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1386)
- locked <0x7fef36cdd338> (a org.apache.zookeeper.ClientCnxn$Packet)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1153)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:302)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:291)
at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:288)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:279)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:142)
at 
org.apache.curator.framework.imps.GetDataBuilderImpl$2.forPath(GetDataBuilderImpl.java:138)
at 
org.apache.curator.framework.recipes.cache.PathChildrenCache.internalRebuildNode(PathChildrenCache.java:591)
at 
org.apache.curator.framework.recipes.cache.PathChildrenCache.rebuild(PathChildrenCache.java:331)
at 
org.apache.curator.framework.recipes.cache.PathChildrenCache.start(PathChildrenCache.java:300)
at 
org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.startThreads(ZKDelegationTokenSecretManager.java:370)
at 
org.apache.hadoop.hive.llap.security.SecretManager.startThreads(SecretManager.java:82)
at 
org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:223)
at 
org.apache.hadoop.hive.llap.security.SecretManager$1.run(SecretManager.java:218)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:360)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1846)
at 
org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:218)
at 
org.apache.hadoop.hive.llap.security.SecretManager.createSecretManager(SecretManager.java:212)
at 
org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon.(LlapDaemon.java:279)
{code}






[jira] [Created] (HIVE-23808) "MSCK REPAIR.. DROP Partitions fail" with kryo Exception

2020-07-06 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23808:
-

 Summary: "MSCK REPAIR.. DROP Partitions fail" with kryo Exception 
 Key: HIVE-23808
 URL: https://issues.apache.org/jira/browse/HIVE-23808
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.2.0
Reporter: Rajkumar Singh


Steps to repro:
1. Create an external partitioned table.
2. Remove some partitions manually using the hdfs dfs -rm command.
3. Run "MSCK REPAIR .. DROP PARTITIONS", which fails with the following exception:
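For context, the RetryUtilities$ExponentiallyDecayingBatchWork that surfaces in the stack trace retries the work with a shrinking batch size. A minimal Python sketch of that retry pattern (the function and names here are illustrative, not Hive's API):

```python
def run_in_decaying_batches(items, process_batch, initial_batch_size, max_retries):
    """On failure, halve the batch size and retry the whole run,
    roughly in the spirit of Hive's ExponentiallyDecayingBatchWork."""
    batch_size = initial_batch_size
    for attempt in range(max_retries + 1):
        try:
            for i in range(0, len(items), batch_size):
                process_batch(items[i:i + batch_size])
            return batch_size  # the size that finally succeeded
        except Exception:
            if attempt == max_retries or batch_size == 1:
                raise  # retries exhausted: propagate, as MSCK does here
            batch_size = max(1, batch_size // 2)

# Toy processor that only tolerates small batches.
def fragile(batch):
    if len(batch) > 2:
        raise RuntimeError("batch too large")

final_size = run_in_decaying_batches(list(range(10)), fragile, 8, 5)
assert final_size == 2
```

In the failure below, the same MetaException appears to recur at every batch size, so decaying the batch does not help and the command ultimately fails.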


{code:java}
2020-07-06 10:42:11,434 WARN  
org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork:
 [HiveServer2-Background-Pool: Thread-210]: Exception thrown while processing 
using a batch size 2
org.apache.hadoop.hive.metastore.utils.MetastoreException: 
MetaException(message:Index: 117, Size: 0)
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:479) 
~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.Msck$2.execute(Msck.java:432) 
~[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hadoop.hive.metastore.utils.RetryUtilities$ExponentiallyDecayingBatchWork.run(RetryUtilities.java:91)
 [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hadoop.hive.metastore.Msck.dropPartitionsInBatches(Msck.java:496) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.metastore.Msck.repair(Msck.java:223) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hadoop.hive.ql.ddl.misc.msck.MsckOperation.execute(MsckOperation.java:74)
 [hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
[hive-exec-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
 [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
 [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:322)
 [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at java.security.AccessController.doPrivileged(Native Method) 
[?:1.8.0_242]
at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_242]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 [hadoop-common-3.1.1.7.1.1.0-565.jar:?]
at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:340)
 [hive-service-3.1.3000.7.1.1.0-565.jar:3.1.3000.7.1.1.0-565]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_242]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_242]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
// Caused by
java.lang.IndexOutOfBoundsException: Index: 117, Size: 0
at java.util.ArrayList.rangeChec
{code}

[jira] [Created] (HIVE-23867) Truncate table fail with AccessControlException if doAs enabled and tbl database has source of replication

2020-07-16 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23867:
-

 Summary: Truncate table fail with AccessControlException if doAs 
enabled and tbl database has source of replication
 Key: HIVE-23867
 URL: https://issues.apache.org/jira/browse/HIVE-23867
 Project: Hive
  Issue Type: Bug
  Components: Hive, repl
Affects Versions: 3.1.1
Reporter: Rajkumar Singh


Steps to repro:

1. Enable doAs.
2. As a non-super user, create a database:
create database sampledb with dbproperties('repl.source.for'='1,2,3');
3. Create the table: create table sampledb.sampletble (id int);
4. Insert some data: insert into sampledb.sampletble values (1), (2), (3);
5. Run the truncate command on the table, which fails with the following error:


{code:java}
 org.apache.hadoop.ipc.RemoteException: User username is not a super user 
(non-super user cannot change owner).
 at 
org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:85)
 at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1907)
 at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:866)
 at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:531)
 at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
 
 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1498) 
~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at org.apache.hadoop.ipc.Client.call(Client.java:1444) 
~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at org.apache.hadoop.ipc.Client.call(Client.java:1354) 
~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
 ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
 ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at com.sun.proxy.$Proxy31.setOwner(Unknown Source) ~[?:?]
 at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setOwner(ClientNamenodeProtocolTranslatorPB.java:470)
 ~[hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
 at sun.reflect.GeneratedMethodAccessor151.invoke(Unknown Source) ~[?:?]
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_232]
 at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
 at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
 [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
 ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
 ~[hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
 [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
 [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at com.sun.proxy.$Proxy32.setOwner(Unknown Source) [?:?]
 at org.apache.hadoop.hdfs.DFSClient.setOwner(DFSClient.java:1914) 
[hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1764)
 [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$36.doCall(DistributedFileSystem.java:1761)
 [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 [hadoop-common-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.setOwner(DistributedFileSystem.java:1774)
 [hadoop-hdfs-client-3.1.1.3.1.5.0-152.jar:?]
 at 
org.apache.hadoop.hive.metastore.ReplChangeManager.recycle(ReplChangeManager.java:238)
 [hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
 at 
org.apache.hadoop.hive.metastore.ReplChangeManager.recycle(ReplChangeManager.java:191)
 [hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152
{code}

[jira] [Created] (HIVE-23886) Filter Query on External table produce no result if hive.metastore.expression.proxy set to MsckPartitionExpressionProxy

2020-07-20 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23886:
-

 Summary: Filter Query on External table produce no result if 
hive.metastore.expression.proxy set to MsckPartitionExpressionProxy
 Key: HIVE-23886
 URL: https://issues.apache.org/jira/browse/HIVE-23886
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.0
Reporter: Rajkumar Singh


A query such as "select count(1) from tpcds_10_parquet.store_returns where 
sr_returned_date_sk=2452802" returns a row count of 0 even though the partition 
has rows in it.

Upon investigation, I found that the partition list passed to 
StatsUtils.getNumRows is of zero size:
https://github.com/apache/hive/blob/ccaf783a198e142b408cb57415c4262d27b45831/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L438-L439

It seems the partition list is retrieved during partition pruning:
https://github.com/apache/hive/blob/36bf7f00731e3b95af3e5eeaa4ce39b375974a74/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java#L439

Hive serializes this filter expression using Kryo before passing it to HMS:

https://github.com/apache/hive/blob/36bf7f00731e3b95af3e5eeaa4ce39b375974a74/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3931

On the server side, if hive.metastore.expression.proxy is set to 
MsckPartitionExpressionProxy, it tries to convert this expression into a 
string:

https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckPartitionExpressionProxy.java#L50

https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MsckPartitionExpressionProxy.java#L56

Because of this bad filter expression, Hive does not retrieve any partitions. 
To make it work, Hive should deserialize the expression, similar to 
PartitionExpressionForMetastore.
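The mismatch can be illustrated with a small Python analogy (pickle standing in for Kryo; this is not Hive code): treating the serialized expression bytes as text yields an unusable filter, while deserializing first recovers the expression.

```python
import pickle

# Stand-in for Hive's partition filter expression tree.
expr = ("=", "sr_returned_date_sk", 2452802)

payload = pickle.dumps(expr)          # client side serializes (Hive uses Kryo)

# MsckPartitionExpressionProxy-style handling: bytes interpreted as a string.
as_text = payload.decode("latin-1")
assert as_text != "sr_returned_date_sk=2452802"   # not a valid filter string

# PartitionExpressionForMetastore-style handling: deserialize first.
recovered = pickle.loads(payload)
assert recovered == expr              # the original expression is usable again
```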



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23968) CTAS with TBLPROPERTIES ('transactional'='false') does not entertain translated table location

2020-07-31 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-23968:
-

 Summary: CTAS with TBLPROPERTIES ('transactional'='false') does 
not entertain translated table location
 Key: HIVE-23968
 URL: https://issues.apache.org/jira/browse/HIVE-23968
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


The HMS translation layer converts the table to external when the transactional 
property is set to false, but MoveTask does not honor the translated table 
location and moves the data to the managed table location.

steps to repro:

{code:java}
create table nontxnal TBLPROPERTIES ('transactional'='false') as select * from 
abc;
{code}

A select query on the table returns nothing, but the source table has data in it.
{code:java}
select * from nontxnal;
+--+
| nontxnal.id  |
+--+
+--+
{code}

--show create table

{code:java}
CREATE EXTERNAL TABLE `nontxnal`(  |
|   `id` int)|
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION   |
|   'hdfs://hostname:8020/warehouse/tablespace/external/hive/nontxnal' |
| TBLPROPERTIES (|
|   'TRANSLATED_TO_EXTERNAL'='TRUE', |
|   'bucketing_version'='2', |
|   'external.table.purge'='TRUE',   |
|   'transient_lastDdlTime'='1596215634')|

{code}

Table data is moved to the managed location:

{code:java}
dfs -ls -R hdfs://hostname:8020/warehouse/tablespace/managed/hive/nontxnal;
++
| DFS Output |
++
| -rw-rw+  3 hive hadoop201 2020-07-31 17:05 
hdfs://hostname:8020/warehouse/tablespace/managed/hive/nontxnal/00_0 |
++
{code}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24039) update jquery version to mitigate CVE-2020-11023

2020-08-13 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24039:
-

 Summary: update jquery version to mitigate CVE-2020-11023
 Key: HIVE-24039
 URL: https://issues.apache.org/jira/browse/HIVE-24039
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


There is a known vulnerability in the jQuery version used by Hive. With this 
JIRA, the plan is to upgrade to jQuery 3.5.0, where it has been fixed. More 
details about the vulnerability can be found here:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24113) NPE in GenericUDFToUnixTimeStamp

2020-09-02 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24113:
-

 Summary: NPE in GenericUDFToUnixTimeStamp
 Key: HIVE-24113
 URL: https://issues.apache.org/jira/browse/HIVE-24113
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


The following query triggers the getPartitionsByExpr call at HMS. HMS tries to 
evaluate the filter via the PartitionExpressionForMetastore proxy; this proxy 
uses the QL packages to evaluate the filter and calls 
GenericUDFToUnixTimeStamp.

select * from table_name where hour between 
from_unixtime(unix_timestamp('2020090120', 'yyyyMMddHH') - 1*60*60, 
'yyyyMMddHH') and from_unixtime(unix_timestamp('2020090122', 'yyyyMMddHH') + 
2*60*60, 'yyyyMMddHH');

I think SessionState in this code path is always NULL, which is why it hits the 
NPE.
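A minimal Python sketch of the failure pattern (illustrative, not Hive's API): a UDF initializer reads per-query session state from a thread-local that is populated in a HiveServer2 query thread but never in an HMS worker thread, so the lookup returns None and dereferencing it fails.

```python
import threading

_session = threading.local()  # per-thread "SessionState"

def get_session():
    return getattr(_session, "state", None)

def initialize_to_unix_timestamp():
    # Analogue of GenericUDFToUnixTimeStamp.initializeInput reading session config.
    session = get_session()
    return session["timezone"]  # raises TypeError (the NPE analogue) when None

# HiveServer2 path: the query thread set up its session.
_session.state = {"timezone": "UTC"}
assert initialize_to_unix_timestamp() == "UTC"

# HMS path: a fresh worker thread never set up SessionState.
result = {}
def hms_worker():
    try:
        initialize_to_unix_timestamp()
    except TypeError:
        result["error"] = "no session state in this thread"

t = threading.Thread(target=hms_worker)
t.start(); t.join()
assert result["error"] == "no session state in this thread"
```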


{code:java}
java.lang.NullPointerException: null
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initializeInput(GenericUDFToUnixTimeStamp.java:126)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDFToUnixTimeStamp.initialize(GenericUDFToUnixTimeStamp.java:75)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:148)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:146)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:140)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartExprEvalUtils.prepareExpr(PartExprEvalUtils.java:119)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prunePartitionNames(PartitionPruner.java:551)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.ql.optimizer.ppr.PartitionExpressionForMetastore.filterPartitionsByExpr(PartitionExpressionForMetastore.java:82)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:3527)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore.access$1400(ObjectStore.java:252) 
~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3493)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore$10.getJdoResult(ObjectStore.java:3464)
 ~[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3764)
 [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:3499)
 [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:3452)
 [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_112]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_112]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
[hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at com.sun.proxy.$Proxy28.getPartitionsByExpr(Unknown Source) [?:?]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:6637)
 [hive-exec-3.1.0.3.1.5.65-1.jar:3.1.0.3.1.5.65-1]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_112]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.ja


{code}




--
This message was sent by Atlassian Jira
(v8.3.4#8030

[jira] [Created] (HIVE-24163) Dynamic Partitioning Insert for MM table fails during Move Operation

2020-09-14 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24163:
-

 Summary: Dynamic Partitioning Insert for MM table fails during 
Move Operation
 Key: HIVE-24163
 URL: https://issues.apache.org/jira/browse/HIVE-24163
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajkumar Singh
 Fix For: 3.1.2


-- create MM table 
{code:java}
CREATE TABLE `part1`(  |
|   `id` double, |
|   `n` double,  |
|   `name` varchar(8),   |
|   `sex` varchar(1))|
| PARTITIONED BY (   |
|   `weight` string, |
|   `age` string,|
|   `height` string) |
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  |
| WITH SERDEPROPERTIES ( |
|   'field.delim'='\u0001',  |
|   'line.delim'='\n',   |
|   'serialization.format'='\u0001') |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.mapred.TextInputFormat'   |
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION   |
|   'hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1' |
| TBLPROPERTIES (|
|   'bucketing_version'='2', |
|   'transactional'='true',  |
|   'transactional_properties'='insert_only',|
|   'transient_lastDdlTime'='1599053368')
{code}

-- create managed table 

{code:java}
CREATE TABLE `class`(  |
|   `name` varchar(8),   |
|   `sex` varchar(1),|
|   `age` double,|
|   `height` double, |
|   `weight` double) |
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
| LOCATION   |
|   'hdfs://hostname:8020/warehouse/tablespace/managed/hive/class' |
| TBLPROPERTIES (|
|   'bucketing_version'='2', |
|   'transactional'='true',  |
|   'transactional_properties'='default',|
|   'transient_lastDdlTime'='1599053345')  
{code}


-- Run Insert query

{code:java}
INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
`Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
{code}

It fails during MoveTask execution:

{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
 is not a directory!
at 
org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.
{code}

[jira] [Created] (HIVE-24193) Select query on renamed hive acid table does not produce any output

2020-09-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24193:
-

 Summary: Select query on renamed hive acid table does not produce 
any output
 Key: HIVE-24193
 URL: https://issues.apache.org/jira/browse/HIVE-24193
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


During onRename, HMS updates COMPLETED_TXN_COMPONENTS, which fails with 
"CTC_DATABASE column does not exist". Upon investigation, I found that the 
enclosing quotes are missing for the column names, which is why the DB query 
fails with this exception.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24194) Query with column mask fail with ParseException if column name has special char

2020-09-23 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24194:
-

 Summary: Query with column mask fail with ParseException if column 
name has special char
 Key: HIVE-24194
 URL: https://issues.apache.org/jira/browse/HIVE-24194
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.2
Reporter: Rajkumar Singh


A Hive query with column masking fails with a ParseException.

Table DDL

{code:java}
CREATE TABLE `emp`( `id` string, `name#` string);
{code}

The following query fails:
{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, 
CAST(mask_show_first_n(name#, 4, 'x', 'x', 'x', -1, '1') AS string) AS `name#`, 
BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`emp` 
)`emp`;
{code}


Error: Error while compiling statement: FAILED: ParseException line 1:79 
character '#' not supported here (state=42000,code=4)

Quoting the column manually helps:

{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, 
CAST(mask_show_first_n(`name#`, 4, 'x', 'x', 'x', -1, '1') AS string) AS 
`name#`, BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM 
`default`.`emp` )`emp`;
{code}

A manual query change will not work with the Ranger authorizer, as the following query


{code:java}
select * from emp;
{code}

will be rewritten to 

{code:java}
select `emp`.`id`, `emp`.`name#` from (SELECT `id`, 
CAST(mask_show_first_n(name#, 4, 'x', 'x', 'x', -1, '1') AS string) AS `name#`, 
BLOCK__OFFSET__INSIDE__FILE, INPUT__FILE__NAME, ROW__ID FROM `default`.`emp` 
)`emp`;
{code}

Ranger applies the column transformer here, so we should consider enclosing the 
column names in backticks to make it work:

https://github.com/apache/ranger/blob/master/hive-agent/src/main/java/org/apache/ranger/authorization/hive/authorizer/RangerHiveAuthorizer.java#L1332
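A small sketch of the suggested fix (hypothetical helper, not Ranger's or Hive's actual code): wrap each column name in backticks when substituting it into the masking expression, escaping any embedded backticks. Ranger masking transformers use a {col} placeholder for the column name.

```python
def quote_ident(name: str) -> str:
    """Quote a Hive identifier in backticks; Hive escapes an
    embedded backtick by doubling it."""
    return "`" + name.replace("`", "``") + "`"

def build_mask_expr(transformer: str, column: str) -> str:
    # Substitute the quoted column into the transformer template.
    return transformer.replace("{col}", quote_ident(column))

expr = build_mask_expr("mask_show_first_n({col}, 4, 'x', 'x', 'x', -1, '1')", "name#")
assert expr == "mask_show_first_n(`name#`, 4, 'x', 'x', 'x', -1, '1')"
```

With the quoted identifier, the rewritten query parses even when the column name contains a special character such as '#'.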

I have opened https://issues.apache.org/jira/browse/RANGER-3009 for Ranger, but 
we should check whether Hive can pass the column name enclosed in backticks 
when passing the priv-object to Ranger:

https://github.com/apache/hive/blob/7dd12cd9d7720f22159062d3c3e5d7bdd127/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L11977



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24276) HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) Vulnerability

2020-10-14 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24276:
-

 Summary: HiveServer2 loggerconf jsp Cross-Site Scripting (XSS) 
Vulnerability 
 Key: HIVE-24276
 URL: https://issues.apache.org/jira/browse/HIVE-24276
 Project: Hive
  Issue Type: Bug
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24469) StatsTask failure while inserting the data into the table partitioned by timestamp

2020-12-02 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24469:
-

 Summary: StatsTask failure while inserting the data into the table 
partitioned by timestamp
 Key: HIVE-24469
 URL: https://issues.apache.org/jira/browse/HIVE-24469
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


Steps to repro:


{code:java}
CREATE EXTERNAL TABLE `tblsource`(
  `x` int, 
  `y` string)
STORED AS PARQUET;
CREATE EXTERNAL TABLE `tblinsert`(
  `x` int)
PARTITIONED BY ( 
  `y` timestamp)
STORED AS PARQUET;
insert into table tblsource values (5,'2020-11-06 00:00:00.000');
insert into table tblinsert partition(y) select * from tblsource distribute by 
(y);
{code}

The query fails while executing the stats task, and I can see the following exception in HMS:


{code:java}
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:657) ~[?:1.8.0_232]
at java.util.ArrayList.get(ArrayList.java:433) ~[?:1.8.0_232]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:8629)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:8590)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[?:1.8.0_232]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[?:1.8.0_232]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_232]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at com.sun.proxy.$Proxy28.set_aggr_stats_for(Unknown Source) ~[?:?]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:18937)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:18921)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at java.security.AccessController.doPrivileged(Native Method) 
~[?:1.8.0_232]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_232]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
 ~[hadoop-common-3.1.1.7.2.0.0-237.jar:?]
at 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 [hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_232]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_232]
{code}

I think the problem is with timestamps whose nanoseconds are all zeros. After 
inserting the value 2020-11-06 00:00:00.000, Hive performs set_aggr_stats_for 
and constructs the SetPartitionsStatsRequest. During construction of the 
request, since the nanoseconds are all 0, the Hive FetchOperator converts 
2020-11-06 00:00:00.000 to 2020-11-06 00:00:00 (Timestamp.valueOf(string)).

https://github.com/apache/hive/blob/f8aa55f9c8f22c4fd293d9531192f7f46099a420/ql/src/java/org/apache/hadoop/hive/ql/stats/ColStatsProcessor.java#L176

On the HMS side,

https://github.com/apache/hive/blob/2ab194d25311e15487ae010b8dd113879ccd501b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8626

does not yield any partition, because the filter expression for the partition 
is 2020-11-06 00:00:00; hence it fails with the above-mentioned 
IndexOutOfBoundsException.
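The normalization can be illustrated in Python (an analogy for Timestamp.valueOf, not Hive's code): a timestamp whose fractional part is all zeros round-trips to a shorter string, so a filter built from it no longer matches the stored partition value.

```python
from datetime import datetime

stored_partition = "2020-11-06 00:00:00.000"   # as written at insert time

# Parse and re-stringify, as the stats path effectively does.
ts = datetime.strptime(stored_partition, "%Y-%m-%d %H:%M:%S.%f")
filter_value = str(ts)  # zero microseconds are dropped entirely

assert filter_value == "2020-11-06 00:00:00"
assert filter_value != stored_partition  # lookup by this value finds 0 partitions
```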



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24491) setting custom job name is ineffective if the tez session pool is configured or in case of session reuse.

2020-12-04 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24491:
-

 Summary: setting custom job name is ineffective if the tez session 
pool is configured or in case of session reuse.
 Key: HIVE-24491
 URL: https://issues.apache.org/jira/browse/HIVE-24491
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


HIVE-23026 added the capability to set tez.job.name, but it is not effective if 
the Tez session pool manager is configured or Tez sessions are reused.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24523) Vectorized read path for LazySimpleSerde does not honor the SERDEPROPERTIES for timestamp

2020-12-10 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24523:
-

 Summary: Vectorized read path for LazySimpleSerde does not honor 
the SERDEPROPERTIES for timestamp
 Key: HIVE-24523
 URL: https://issues.apache.org/jira/browse/HIVE-24523
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Affects Versions: 3.2.0
Reporter: Rajkumar Singh


Steps to repro:

{code:java}
  create external  table tstable(date_created timestamp)   ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'   WITH SERDEPROPERTIES (  
'timestamp.formats'='MMddHHmmss') stored as textfile;

cat sampledata 
2020120517

hdfs dfs -put sampledata /warehouse/tablespace/external/hive/tstable

{code}

Disable fetch task conversion and run select * from tstable: it produces no 
results. Setting hive.vectorized.use.vector.serde.deserialize=false 
returns the expected output.

While parsing the string to a timestamp, 
https://github.com/apache/hive/blob/master/serde/src/java/org/apache/hadoop/hive/serde2/lazy/fast/LazySimpleDeserializeRead.java#L812
does not set the DateTimeFormatter, which results in an IllegalArgumentException 
when parsing the timestamp through TimestampUtils.stringToTimestamp(strValue).
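A sketch of the two parse paths in Python: the default timestamp parser rejects the sample row, while the custom pattern accepts it. The pattern is assumed to be yyyyMMddHH (%Y%m%d%H) here to match the 10-digit sample, since the pattern quoted in the DDL above looks truncated by Jira markup.

```python
from datetime import datetime

raw = "2020120517"  # the sample row from the text file above

# The vectorized reader falls back to the default timestamp parser,
# which expects "yyyy-MM-dd HH:mm:ss"-shaped input and rejects this value.
try:
    datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
    default_parse_ok = True
except ValueError:
    default_parse_ok = False

# Honoring the serde property (assumed pattern yyyyMMddHH) parses it fine.
parsed = datetime.strptime(raw, "%Y%m%d%H")
```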





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24724) Create table with LIKE operator does not work correctly

2021-02-02 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24724:
-

 Summary: Create table with LIKE operator does not work correctly
 Key: HIVE-24724
 URL: https://issues.apache.org/jira/browse/HIVE-24724
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


Steps to repro:

{code:java}
create table atable (id int, str1 string);
alter table atable add constraint pk_atable primary key (id) disable novalidate;

create table btable like atable;

{code}

describe formatted btable lacks the constraint information.

CreateTableLikeDesc does not set or fetch the constraints for the LIKE table:

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L13594-L13616

nor does DDLTask fetch or set the constraints for the table:

https://github.com/apache/hive/blob/5ba3dfcb6470ff42c58a3f95f0d5e72050274a42/ql/src/java/org/apache/hadoop/hive/ql/ddl/table/create/like/CreateTableLikeOperation.java#L58-L83




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24848) CBO failed with NPE

2021-03-04 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24848:
-

 Summary: CBO failed with NPE
 Key: HIVE-24848
 URL: https://issues.apache.org/jira/browse/HIVE-24848
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 3.1.1
Reporter: Rajkumar Singh


CBO fails for a query having a predicate based on the from_unixtime UDF:
select * from classification where 
CAST(from_unixtime(unix_timestamp(cast(partition_batch_ts as 
string),'MMddHHmmss')) AS TIMESTAMP) = '2021-02-26 02:00:00';
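For reference, this is what the predicate computes, sketched in Python. The pattern below is assumed to be %Y%m%d%H%M%S (the one quoted above appears truncated by Jira markup), and the sample value is hypothetical; the NPE itself is thrown by Calcite while translating the expression, not by this conversion.

```python
from datetime import datetime

# The predicate casts partition_batch_ts to a string, parses it with
# the pattern, and compares the resulting timestamp to a literal.
partition_batch_ts = 20210226020000  # hypothetical sample value
parsed = datetime.strptime(str(partition_batch_ts), "%Y%m%d%H%M%S")
matches = parsed == datetime(2021, 2, 26, 2, 0, 0)
```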


{code:java}
2021-03-04 10:08:58,844 ERROR org.apache.hadoop.hive.ql.parse.CalcitePlanner: 
[4d92f6e5-9a53-41fb-b53f-9003c338ab52 etp2107079200-38767]: CBO failed, 
skipping CBO.
java.lang.RuntimeException: org.apache.hadoop.hive.ql.parse.SemanticException: 
java.lang.NullPointerException
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:159) 
~[calcite-core-1.19.0.7.1.3.0-100.jar:1.19.0.7.1.3.0-100]
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:114) 
~[calcite-core-1.19.0.7.1.3.0-100.jar:1.19.0.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1544)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:529)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12667)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:422)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:288)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:221) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:188) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:598) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:544) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:538) 
~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:127)
 ~[hive-exec-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)
 ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:260)
 ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hive.service.cli.operation.Operation.run(Operation.java:274) 
~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:565)
 ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at 
org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:551)
 ~[hive-service-3.1.3000.7.1.3.0-100.jar:3.1.3000.7.1.3.0-100]
at sun.reflect.GeneratedMethodAccessor207.invoke(Unknown Source) ~[?:?]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_282]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24898) Beeline does not honor the credential provided in property-file

2021-03-17 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24898:
-

 Summary: Beeline does not honor the credential provided in 
property-file
 Key: HIVE-24898
 URL: https://issues.apache.org/jira/browse/HIVE-24898
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Beeline reads the parameters correctly from the properties file but then falls back 
to the default Beeline connection, which requires the user to provide a username and 
password.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24916) EXPORT TABLE command to ADLS Gen2/s3 fail with org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not supported for file system: abfs://

2021-03-19 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24916:
-

 Summary: EXPORT TABLE command to ADLS Gen2/s3 fail with 
org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
supported for file system: abfs://
 Key: HIVE-24916
 URL: https://issues.apache.org/jira/browse/HIVE-24916
 Project: Hive
  Issue Type: Bug
  Components: repl
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


The "EXPORT TABLE" command, which copies data using distcp, failed with the following 
error:

org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
supported for file system: abfs://storage...@xx.core.windows.net



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24950) Fixing the logger for TaskQueue

2021-03-26 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24950:
-

 Summary: Fixing the logger for TaskQueue
 Key: HIVE-24950
 URL: https://issues.apache.org/jira/browse/HIVE-24950
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24951) Table created with Uppercase name using CTAS does not produce result for select queries

2021-03-26 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24951:
-

 Summary: Table created with Uppercase name using CTAS does not 
produce result for select queries
 Key: HIVE-24951
 URL: https://issues.apache.org/jira/browse/HIVE-24951
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 4.0.0
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


Steps to repro:


{code:java}
CREATE EXTERNAL TABLE MY_TEST AS SELECT * FROM source

-- table is created with the location below, but no data is moved into it:
-- /warehouse/tablespace/external/hive/MY_TEST
{code}




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24982) HMS- Postgres: Create table fail if SERDEPROPERTIES contains the NULL character

2021-04-06 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24982:
-

 Summary: HMS- Postgres: Create table fail if SERDEPROPERTIES 
contains the NULL character
 Key: HIVE-24982
 URL: https://issues.apache.org/jira/browse/HIVE-24982
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajkumar Singh
 Fix For: 4.0.0


Postgres does not accept the NUL char ('\u') during an insert (ref: 
https://www.postgresql.org/message-id/1171970019.3101.328.camel%40coppola.muc.ecircle.de
), so a create table with the following SERDEPROPERTIES will fail with an exception.
{code:java}
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
WITH SERDEPROPERTIES ( 
  'field.delim'='\u', 
  'serialization.format'='\u') 
{code}
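A sketch of the kind of pre-insert guard this implies: Postgres text columns cannot store the NUL (0x00) character, so serde property values like the delimiters above would need to be rejected or escaped before reaching the metastore database. The function name is hypothetical, not Hive's actual API.

```python
# Hypothetical sanitizer for serde parameter values bound for Postgres.
def check_serde_param(value: str) -> str:
    if "\x00" in value:
        raise ValueError("NUL character cannot be stored in a Postgres text column")
    return value

ok = check_serde_param("|")       # an ordinary delimiter passes
try:
    check_serde_param("\x00")     # a NUL delimiter is rejected
    nul_rejected = False
except ValueError:
    nul_rejected = True
```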


{code:java}
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO 
"SERDE_PARAMS" ("PARAM_VALUE","SERDE_ID","PARAM_KEY") VALUES (?,?,?)
at 
org.datanucleus.store.rdbms.scostore.JoinMapStore.internalPut(JoinMapStore.java:1074)
at 
org.datanucleus.store.rdbms.scostore.JoinMapStore.putAll(JoinMapStore.java:224)
at 
org.datanucleus.store.rdbms.mapping.java.MapMapping.postInsert(MapMapping.java:158)
at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:522)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
at 
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
at 
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2172)
at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:603)
at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:357)
at 
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
at 
org.datanucleus.state.AbstractStateManager.providedObjectField(AbstractStateManager.java:1460)
at 
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:120)
at 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor.dnProvideField(MStorageDescriptor.java)
at 
org.apache.hadoop.hive.metastore.model.MStorageDescriptor.dnProvideFields(MStorageDescriptor.java)
at 
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170)
at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:292)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
at 
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
at 
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2080)
at 
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2172)
at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObjectAsValue(PersistableMapping.java:603)
at 
org.datanucleus.store.rdbms.mapping.java.PersistableMapping.setObject(PersistableMapping.java:357)
at 
org.datanucleus.store.rdbms.fieldmanager.ParameterSetter.storeObjectField(ParameterSetter.java:191)
at 
org.datanucleus.state.AbstractStateManager.providedObjectField(AbstractStateManager.java:1460)
at 
org.datanucleus.state.StateManagerImpl.providedObjectField(StateManagerImpl.java:120)
at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideField(MTable.java)
at 
org.apache.hadoop.hive.metastore.model.MTable.dnProvideFields(MTable.java)
at 
org.datanucleus.state.StateManagerImpl.provideFields(StateManagerImpl.java:1170)
at 
org.datanucleus.store.rdbms.request.InsertRequest.execute(InsertRequest.java:292)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObjectInTable(RDBMSPersistenceHandler.java:162)
at 
org.datanucleus.store.rdbms.RDBMSPersistenceHandler.insertObject(RDBMSPersistenceHandler.java:138)
at 
org.datanucleus.state.StateManagerImpl.internalMakePersistent(StateManagerImpl.java:3363)
at 
org.datanucleus.state.StateManagerImpl.makePersistent(StateManagerImpl.java:3339)
at 
org.datanucleus.ExecutionContextImpl.persistObject
{code}

[jira] [Created] (HIVE-24994) get_aggr_stats_for call fail with "Tried to send an out-of-range integer"

2021-04-08 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-24994:
-

 Summary: get_aggr_stats_for call fail with "Tried to send an 
out-of-range integer"
 Key: HIVE-24994
 URL: https://issues.apache.org/jira/browse/HIVE-24994
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh
 Fix For: 4.0.0


The aggrColStatsForPartitions call fails against the Postgres bind-parameter limit if 
the number of partitions passed in the direct SQL goes beyond 32767.
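Postgres' wire protocol encodes the bind-parameter count as a signed 16-bit integer, which is where the 32767 cap comes from. A hypothetical batching helper illustrating the usual fix, splitting the partition-name list so each query stays under the cap:

```python
MAX_BIND_PARAMS = 32767  # signed 16-bit maximum

def batched(items, size=MAX_BIND_PARAMS):
    # Yield successive slices no larger than the bind-parameter cap.
    for i in range(0, len(items), size):
        yield items[i:i + size]

partition_names = [f"ds={n}" for n in range(70000)]
batch_sizes = [len(b) for b in batched(partition_names)]
```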

{code:java}
postgresql.util.PSQLException: An I/O error occurred while sending to the 
backend.
 at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:337) 
~[postgresql-42.2.8.jar:42.2.8]
 at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:446) 
~[postgresql-42.2.8.jar:42.2.8]
 at 
org.postgresql.jdbc.PgStatement.execute(PgStatement.java:370) 
~[postgresql-42.2.8.jar:42.2.8]
 at 
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:149)
 ~[postgresql-42.2.8.jar:42.2.8]
 at 
org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:108)
 ~[postgresql-42.2.8.jar:42.2.8]
 at 
com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
 ~[HikariCP-2.6.1.jar:?]
 at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
 [HikariCP-2.6.1.jar:?]
 at 
org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:375)
 [datanucleus-rdbms-4.1.19.jar:?]
 at 
org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:552)
 [datanucleus-rdbms-4.1.19.jar:?]
 at 
org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:645) 
[datanucleus-rdbms-4.1.19.jar:?]
 at 
org.datanucleus.store.query.Query.executeQuery(Query.java:1855) 
[datanucleus-core-4.1.17.jar:?]
 at 
org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) 
[datanucleus-rdbms-4.1.19.jar:?]
 at 
org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368) 
[datanucleus-api-jdo-4.2.4.jar:?]
 at 
org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) 
[datanucleus-api-jdo-4.2.4.jar:?]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2058)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2050)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.access$1500(MetaStoreDirectSql.java:110)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql$15$1.run(MetaStoreDirectSql.java:1530)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
[hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql$15.run(MetaStoreDirectSql.java:1521)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:73) 
[hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.partsFoundForPartitions(MetaStoreDirectSql.java:1518)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1489)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:8966)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.ObjectStore$20.getSqlResult(ObjectStore.java:8962)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3757)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:8981)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at 
org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:8951)
 [hive-exec-3.1.0.3.1.5.6019-4.jar:3.1.0.3.1.5.6019-4]
 at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown 
Source) ~[?:?]
 at 
sun.refle
{code}

[jira] [Created] (HIVE-25024) Length function on char field yield incorrect result if CBO is enable

2021-04-16 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-25024:
-

 Summary: Length function on char field yield incorrect result if 
CBO is enable
 Key: HIVE-25024
 URL: https://issues.apache.org/jira/browse/HIVE-25024
 Project: Hive
  Issue Type: Bug
  Components: CBO, Hive
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


Steps to repro:

{code:java}
create table char_test(val char(10));
insert into table char_test values ('abc')
select * from char_test;
++
| char_test.val  |
++
| abc|
++

 select length(val) from char_test where val='abc';
+--+
| _c0  |
+--+
| 10   |
+--+
{code}

The problem surfaces when CBO is enabled and the query has a predicate on the char 
field. The filter formed in this case is 'abc       ' (padded with extra chars) of 
string type, since this is a constant comparison; for the string type, GenericUDFLength 
will not strip the extra chars.

https://github.com/apache/hive/blob/1758c8c857f8a6dc4c9dc9c522de449f53e5e5cc/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L943
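The padding behavior can be mimicked in Python: char(10) pads the stored value to ten characters, and once CBO folds the comparison into a string-typed constant, length() sees the padded form rather than the trimmed char value.

```python
# 'abc' stored in a char(10) column is padded to ten characters.
val = "abc".ljust(10)

padded_len = len(val)              # the incorrect result shown above
stripped_len = len(val.rstrip())   # the result char semantics call for
```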





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25172) HMS Postgres: Lock acquisition is failing because table name exceeds the char limit of MS table datatype

2021-05-27 Thread Rajkumar Singh (Jira)
Rajkumar Singh created HIVE-25172:
-

 Summary: HMS Postgres: Lock acquisition is failing because table 
name exceeds the char limit of MS table datatype
 Key: HIVE-25172
 URL: https://issues.apache.org/jira/browse/HIVE-25172
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 4.0.0
Reporter: Rajkumar Singh


This only affects users running Postgres as the HMS backend database.


{code:java}
2021-05-11 19:49:41,040 ERROR org.apache.thrift.ProcessFunction: 
[pool-7-thread-199]: Internal error processing lock
org.apache.hadoop.hive.metastore.api.MetaException: Unable to update 
transaction database java.sql.BatchUpdateException: Batch entry 0 INSERT INTO 
"TXN_COMPONENTS" ("TC_TXNID", "TC_DATABASE", "TC_TABLE", "TC_PARTITION", "T
C_OPERATION_TYPE", "TC_WRITEID") VALUES (654299, 'default', 
'$some_big_table_name_exceeding_the_128_char_limit', NULL, 'i', 3) was aborte
d: ERROR: value too long for type character varying(128)  Call getNextException 
to see other errors in the batch.
...
Caused by: org.postgresql.util.PSQLException: ERROR: value too long for type 
character varying(128)
{code}

It seems we are hitting the table-name char limit here, which is defined on 
TXN_COMPONENTS as

TC_TABLE character varying(128);

There is a need to either increase the char limit or keep table names under 
128 chars.
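A sketch of the "limit the table name" option: a guard that rejects names longer than the column allows before the metastore attempts the insert. The function is hypothetical, for illustration only.

```python
TC_TABLE_LIMIT = 128  # width of TXN_COMPONENTS.TC_TABLE

def check_table_name(name: str) -> str:
    # Reject names the varchar(128) column cannot hold.
    if len(name) > TC_TABLE_LIMIT:
        raise ValueError(f"table name exceeds {TC_TABLE_LIMIT} characters")
    return name

ok = check_table_name("some_table")
try:
    check_table_name("x" * 129)
    too_long_accepted = True
except ValueError:
    too_long_accepted = False
```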





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-18398) WITH SERDEPROPERTIES option is broken without org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

2018-01-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-18398:
-

 Summary: WITH SERDEPROPERTIES option is broken without 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 Key: HIVE-18398
 URL: https://issues.apache.org/jira/browse/HIVE-18398
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
Reporter: Rajkumar Singh
Priority: Minor


*Steps to reproduce:*
1. Create table 
{code}
create table test_serde(id int,value string) ROW FORMAT DELIMITED FIELDS 
TERMINATED BY '|' ESCAPED BY '\\' 
{code}
2. show create table produces the following output
{code}
CREATE TABLE `test_serde`(
  `id` int, 
  `value` string)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY '|' 
WITH SERDEPROPERTIES ( 
  'escape.delim'='\\') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hdp262a.hdp.local:8020/apps/hive/warehouse/test_serde'
TBLPROPERTIES (
  'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
  'numFiles'='0', 
  'numRows'='0', 
  'rawDataSize'='0', 
  'totalSize'='0', 
  'transient_lastDdlTime'='1515448894')
{code}

3. Once you run the create table using the output of show create, it runs into a 
parsing error:

{code}
NoViableAltException(296@[1876:103: ( tableRowFormatMapKeysIdentifier )?])
at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
at org.antlr.runtime.DFA.predict(DFA.java:116)
.
FAILED: ParseException line 6:0 cannot recognize input near 'WITH' 
'SERDEPROPERTIES' '(' in serde properties specification
{code}

4. A table created with LazySimpleSerDe doesn't have any such issue:

{code}
hive> CREATE TABLE `foo`( 
> `col` string) 
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> WITH SERDEPROPERTIES ( 
> 'serialization.encoding'='UTF-8') 
> STORED AS INPUTFORMAT 
> 'org.apache.hadoop.mapred.TextInputFormat' 
> OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' ;
OK
Time taken: 0.375 seconds
{code}






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HIVE-19192) HiveServer2 query compilation : query compilation time increases sql has multiple unions

2018-04-12 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19192:
-

 Summary: HiveServer2 query compilation : query compilation time 
increases sql has multiple unions 
 Key: HIVE-19192
 URL: https://issues.apache.org/jira/browse/HIVE-19192
 Project: Hive
  Issue Type: Improvement
  Components: Hive, HiveServer2
Affects Versions: 2.1.0, 1.2.1
 Environment: Hive-1.2.1

Hive-2.1.0

 
Reporter: Rajkumar Singh
 Attachments: query-with-100-union.q, query-with-200-union.q, 
query-with-50-union.q

Query compilation time suffers a lot if the SQL has many unions; here is a simple 
reproduction of the problem. Please find attached queries with 50, 100, and 200 unions 
(forgive me for this bad SQL). When running explain against HiveServer2, I can see the 
compilation time increase many fold.

{code}

query-with-50-union.q
1,671 rows selected (10.662 seconds)

query-with-100-union.q
3,321 rows selected (101.709 seconds)

query-with-200-union.q
6,588 rows selected (1074.487 seconds)

{code}

Running such SQL against HiveServer2 can starve other queries waiting on the 
single-threaded compilation stage.
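A generator for queries of the shape attached to this issue: n copies of the same SELECT glued together with UNION ALL (the table name is a placeholder). In the timings above, compilation time grows roughly tenfold each time n doubles.

```python
def union_query(n: int, table: str = "t") -> str:
    # Join n identical SELECTs with UNION ALL, one per line.
    return "\nunion all\n".join(f"select * from {table}" for _ in range(n))

q = union_query(3)
```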



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19259) Create view on tables having union all fail with "Table not found"

2018-04-20 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19259:
-

 Summary: Create view on tables having union all fail with "Table 
not found"
 Key: HIVE-19259
 URL: https://issues.apache.org/jira/browse/HIVE-19259
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 1.2.1
 Environment: hive-1.2.1

 
Reporter: Rajkumar Singh


Creating a view over a union works well, while "union all" fails with "Table 
not found". Here are the steps to reproduce:

{code}
hive> create table foo(id int);
OK
Time taken: 0.401 seconds
hive> create table bar(id int);
OK

// view on table union
hive> create view unionview as with tmp_1 as ( select * from foo ), tmp_2 as 
(select * from bar ) select * from tmp_1 union  select * from tmp_2;
OK
Time taken: 0.517 seconds
hive> select * from unionview;
OK
Time taken: 5.805 seconds


// view on union all
hive> create view unionallview as with tmp_1 as ( select * from foo ), tmp_2 
as (select * from bar ) select * from tmp_1 union all  select * from tmp_2;
OK
Time taken: 1.535 seconds
hive> select * from unionallview;
FAILED: SemanticException Line 1:134 Table not found 'tmp_1' in definition of 
VIEW unionallview [
with tmp_1 as ( select `foo`.`id` from `default`.`foo` ), tmp_2 as (select 
`bar`.`id` from `default`.`bar` ) select `tmp_1`.`id` from tmp_1 union all  
select `tmp_2`.`id` from tmp_2
] used as unionallview at Line 1:14
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19432) HIVE-7575: GetTablesOperation is too slow if the hive has too many databases and tables

2018-05-05 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19432:
-

 Summary: HIVE-7575: GetTablesOperation is too slow if the hive has 
too many databases and tables
 Key: HIVE-19432
 URL: https://issues.apache.org/jira/browse/HIVE-19432
 Project: Hive
  Issue Type: Improvement
  Components: Hive, HiveServer2
Affects Versions: 2.2.0
Reporter: Rajkumar Singh


GetTablesOperation is too slow since it does not check authorization for databases 
and tries pulling all the tables from all the databases using getTableMeta. For an 
operation like the following

{code}

con.getMetaData().getTables("", "", "%", new String[] { "TABLE", "VIEW" });

{code}

it builds the getTableMeta call with the wildcard *:

{code}

 metastore.HiveMetaStore: 8: get_table_metas : db=* tbl=*

{code}

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19469) HiveServer2: SqlStdAuth take too much time while doing checkFileAccessWithImpersonation if the table location has too many files/dirs

2018-05-08 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-19469:
-

 Summary: HiveServer2: SqlStdAuth take too much time while doing 
checkFileAccessWithImpersonation if the table location has too many files/dirs
 Key: HIVE-19469
 URL: https://issues.apache.org/jira/browse/HIVE-19469
 Project: Hive
  Issue Type: Task
  Components: HiveServer2
Affects Versions: 2.1.0
Reporter: Rajkumar Singh


HiveServer2's doAuthorization call takes too much time doing 
checkFileAccessWithImpersonation if the table location has too many files/dirs, 
which increases the query compilation time.

{code}

at 
org.apache.hadoop.hive.shims.Hadoop23Shims.checkFileAccess(Hadoop23Shims.java:1006)
 at 
org.apache.hadoop.hive.common.FileUtils.checkFileAccessWithImpersonation(FileUtils.java:378)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:417)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
 at 
org.apache.hadoop.hive.common.FileUtils.isActionPermittedForFileHierarchy(FileUtils.java:431)
 at 
org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizer.isURIAccessAllowed(RangerHiveAuthorizer.java:752)

{code}

The improvement we can make here is to parallelize the 
checkFileAccessWithImpersonation calls.
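The stack above walks the hierarchy one path at a time; the idea is to fan the per-path access checks out to a thread pool. A sketch using os.access as a stand-in for the HDFS checkFileAccess call (the helper name and pool size are assumptions, not Hive's actual implementation):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def check_access_parallel(paths, mode=os.R_OK, workers=8):
    # Run one access check per path on the pool and collect the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(paths, pool.map(lambda p: os.access(p, mode), paths)))

results = check_access_parallel([".", ".."])
```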

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-14856) create table with select from table limit is failing with NFE if limit exceed than allowed 32bit integer length

2016-09-28 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-14856:
-

 Summary: create table with select from table limit is failing with 
NFE if limit exceed than allowed 32bit integer length
 Key: HIVE-14856
 URL: https://issues.apache.org/jira/browse/HIVE-14856
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.2.1
 Environment: centos 6.6
Reporter: Rajkumar Singh
Assignee: Rajkumar Singh


A query with LIMIT fails with a NumberFormatException if the limit exceeds the 
32-bit integer range.
create table sample1 as select * from sample limit 2248321440;
FAILED: NumberFormatException For input string: "2248321440"
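The arithmetic behind the failure: 2248321440 does not fit in a signed 32-bit integer, so parsing the LIMIT literal with a 32-bit parser (as Java's Integer.parseInt does) throws NumberFormatException; the value would need a 64-bit parse.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value

limit = 2248321440
fits_in_int32 = limit <= INT32_MAX
```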




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)