[jira] [Created] (HIVE-26374) Query based compaction fails for tables with CDT and columns with Reserved Keywords
Chiran Ravani created HIVE-26374:

Summary: Query based compaction fails for tables with CDT and columns with Reserved Keywords
Key: HIVE-26374
URL: https://issues.apache.org/jira/browse/HIVE-26374
Project: Hive
Issue Type: Bug
Components: Hive, Transactions
Affects Versions: 4.0.0-alpha-1, 3.1.3
Reporter: Chiran Ravani

Query-based compaction fails on tables that have complex data types whose field names are reserved keywords. The compaction fails while creating the temporary table because the generated DDL does not quote the struct field names correctly. Below are the steps to reproduce the issue.

{code:java}
create table complex_dt_compact2(col1 array<struct<arr_col1:int,`timestamp`:string>>);
insert into complex_dt_compact2 SELECT ARRAY(NAMED_STRUCT('arr_col1',1,'timestamp','2022-07-05 21:51:20.371'));
insert into complex_dt_compact2 SELECT ARRAY(NAMED_STRUCT('arr_col1',2,'timestamp','2022-07-05 21:51:20.371'));
alter table complex_dt_compact2 compact 'major' and wait;
{code}

Error:

{code:java}
2022-07-05T22:15:47.710Z hiveserver2-0.hiveserver2-service.compute-1657056457-xkcx.svc.cluster.local hiveserver2 1 dbb4011d-c788-4b99-a31d-06bb6dd7182e [mdc@18060 class="compactor.Worker" level="ERROR" thread="hiveserver2-0.hiveserver2-service.compute-1657056457-xkcx.svc.cluster.local-64_executor"] Caught exception while trying to compact id:3,dbname:default,tableName:complex_dt_compact2,partName:null,state:,type:MAJOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId: null,initiatorId: null,retryRetention: 0. Marking failed to avoid repeated failures
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(`operation` int, `originalTransaction` bigint, `bucket` int, `rowId` bigint, `currentTransaction` bigint, `row` struct<`col1` :array<struct<arr_col1:int,timestamp:string>>>) stored as orc LOCATION 's3a://obfuscated/clusters/obfuscated/obfuscated/warehouse/tablespace/managed/hive/complex_dt_compact2/base_003_v038' TBLPROPERTIES ('compactiontable'='true', 'transactional'='false')
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:120)
    at org.apache.hadoop.hive.ql.txn.compactor.MajorQueryCompactor.runCompaction(MajorQueryCompactor.java:63)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:517)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.lambda$run$0(Worker.java:120)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(`operation` int, `originalTransaction` bigint, `bucket` int, `rowId` bigint, `currentTransaction` bigint, `row` struct<`col1` :array<struct<arr_col1:int,timestamp:string>>>) stored as orc LOCATION 's3a://obfuscated/clusters/obfuscated/obfuscated/warehouse/tablespace/managed/hive/complex_dt_compact2/base_003_v038' TBLPROPERTIES ('compactiontable'='true', 'transactional'='false')
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:73)
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:50)
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:113)
    ... 7 more
Caused by: (responseCode = 4, errorMessage = FAILED: ParseException line 1:241 cannot recognize input near 'timestamp' ':' 'string' in column specification, SQLState = 42000, exception = line 1:241 cannot recognize input near 'timestamp' ':' 'string' in column specification)
    at org.apache.hadoop.hive.ql.DriverUtils.createProcessorException(DriverUtils.java:143)
    at org.apache.hadoop.hive.ql.Compiler.handleException(Compiler.java:466)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:122)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:197)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:636)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:694)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:526)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:515)
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:70)
    ... 9 more
{code}
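For illustration, a hedged sketch of the temporary-table DDL with the struct field names quoted the way the top-level columns already are. Whether the compactor should emit exactly this form is an assumption; the point is that the statement parses once the reserved word timestamp is backquoted (the s3a LOCATION from the log is omitted here):

{code:java}
-- Sketch, not the actual fix: the same generated DDL as in the error above,
-- but with the struct field names backquoted so `timestamp` no longer trips
-- the parser. TBLPROPERTIES semantics are unchanged.
CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(
  `operation` int,
  `originalTransaction` bigint,
  `bucket` int,
  `rowId` bigint,
  `currentTransaction` bigint,
  `row` struct<`col1`:array<struct<`arr_col1`:int,`timestamp`:string>>>)
stored as orc
TBLPROPERTIES ('compactiontable'='true', 'transactional'='false');
{code}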
[jira] [Created] (HIVE-26320) Incorrect case evaluation for Parquet based table.
Chiran Ravani created HIVE-26320:

Summary: Incorrect case evaluation for Parquet based table.
Key: HIVE-26320
URL: https://issues.apache.org/jira/browse/HIVE-26320
Project: Hive
Issue Type: Improvement
Components: HiveServer2, Query Planning
Affects Versions: 4.0.0-alpha-1
Reporter: Chiran Ravani

A query involving a case statement with two or more conditions returns incorrect results for tables in Parquet format. The problem is not observed with ORC or TextFile.

*Steps to reproduce*:

{code:java}
create external table case_test_parquet(kob varchar(2),enhanced_type_code int) stored as parquet;
insert into case_test_parquet values('BB',18),('BC',18),('AB',18);

select case when ( (kob='BB' and enhanced_type_code='18')
               or (kob='BC' and enhanced_type_code='18') )
       then 1 else 0 end as logic_check
from case_test_parquet;
{code}

Result:
{code}
0
0
0
{code}

Expected result:
{code}
1
1
0
{code}

The problem does not appear when setting hive.optimize.point.lookup=false.
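Based on that last observation, a minimal workaround sketch for a session:

{code:java}
-- Workaround sketch from the description: disable the point-lookup rewrite
-- and re-run the reproducer; per the report the query then returns 1, 1, 0.
set hive.optimize.point.lookup=false;

select case when ( (kob='BB' and enhanced_type_code='18')
               or (kob='BC' and enhanced_type_code='18') )
       then 1 else 0 end as logic_check
from case_test_parquet;
{code}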
[jira] [Created] (HIVE-25980) Support HiveMetaStoreChecker.checkTable operation with multi-threaded execution
Chiran Ravani created HIVE-25980:

Summary: Support HiveMetaStoreChecker.checkTable operation with multi-threaded execution
Key: HIVE-25980
URL: https://issues.apache.org/jira/browse/HIVE-25980
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Affects Versions: 3.1.2, 4.0.0
Reporter: Chiran Ravani
Assignee: Chiran Ravani

MSCK REPAIR TABLE on a table with many partitions can run slowly on cloud storage such as S3; in one case we investigated, the slowness was in HiveMetaStoreChecker.checkTable.

{code:java}
"HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
    at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
    at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
    at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
    at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
    at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
    at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
    at com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5437)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5384)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1367)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$10(S3AFileSystem.java:2458)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$$Lambda$437/835000758.apply(Unknown Source)
    at org.apache.hadoop.fs
{code}
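For context, the kind of statement that drives this code path; the table name is hypothetical:

{code:java}
-- Hypothetical example: repairing a heavily partitioned external table. Per
-- the report, HiveMetaStoreChecker.checkTable runs single-threaded today, so
-- on S3 each partition check pays a round trip like the one in the dump above.
MSCK REPAIR TABLE sales_events_ext;
{code}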
[jira] [Created] (HIVE-25661) Cover the test case for HIVE-25626
Chiran Ravani created HIVE-25661:

Summary: Cover the test case for HIVE-25626
Key: HIVE-25661
URL: https://issues.apache.org/jira/browse/HIVE-25661
Project: Hive
Issue Type: Test
Components: Hive
Reporter: Chiran Ravani

Add test coverage for HIVE-25626 so the fix does not regress in the future. HIVE-25594 introduces multiple JDBCStorageHandler test cases; once that is in upstream it will be easy to add this case.
[jira] [Created] (HIVE-25626) JDBCStorageHandler fails when JDBC_PASSWORD_URI is used
Chiran Ravani created HIVE-25626:

Summary: JDBCStorageHandler fails when JDBC_PASSWORD_URI is used
Key: HIVE-25626
URL: https://issues.apache.org/jira/browse/HIVE-25626
Project: Hive
Issue Type: Bug
Components: Hive, JDBC storage handler
Affects Versions: 3.1.2, 4.0.0
Reporter: Chiran Ravani

When a table is created with JDBCStorageHandler and JDBC_PASSWORD_URI is used as the password mechanism, CBO fails, causing all the data to be fetched from the source DB and then processed in Hive.
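A sketch of the affected setup. The property name hive.sql.dbcp.password.uri is an assumption about what JDBC_PASSWORD_URI maps to, and the endpoint, credentials path, and table names are hypothetical:

{code:java}
-- Hedged sketch of a table using the password-URI mechanism; everything below
-- other than the storage handler class and the standard hive.sql.* property
-- family is hypothetical, including the assumed password-URI property name.
CREATE EXTERNAL TABLE pwuri_repro (id INT, name STRING)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "MYSQL",
  "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
  "hive.sql.jdbc.url" = "jdbc:mysql://example-host/exampledb",
  "hive.sql.dbcp.username" = "hiveuser",
  "hive.sql.dbcp.password.uri" = "file:///etc/hive/conf/jdbc.password",
  "hive.sql.table" = "PWURI_REPRO"
);
-- Per the report, querying such a table makes CBO fail, so the whole remote
-- table is fetched and filtered in Hive rather than pushed down.
{code}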
[jira] [Created] (HIVE-25605) JdbcStorageHandler Create table fails when hive.sql.schema is specified and is not the default one
Chiran Ravani created HIVE-25605:

Summary: JdbcStorageHandler Create table fails when hive.sql.schema is specified and is not the default one
Key: HIVE-25605
URL: https://issues.apache.org/jira/browse/HIVE-25605
Project: Hive
Issue Type: Bug
Components: JDBC storage handler
Affects Versions: 4.0.0
Reporter: Chiran Ravani

We have observed a CREATE TABLE failure for JdbcStorageHandler against Oracle when a schema name is specified in the table properties and that schema is not the user's default one.

For example, consider username DI_METADATA with default schema DI_METADATA in Oracle, where this user also has access to other schemas such as CHIRAN. The create statement below then fails in Hive with:

{code}
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Error while trying to get column names: ORA-00942: table or view does not exist
{code}

{code}
CREATE EXTERNAL TABLE if not exists query_fed_oracle.ABCD_TEST_pw_case_jceks_diff(
  YEAR INT,
  QUANTITY INT,
  NAME STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "bucketing_version"="2",
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@//obfuscated.compute-1.amazonaws.com",
  "hive.sql.dbcp.username" = "DI_METADATA",
  "hive.sql.dbcp.password.keystore" = "jceks://s3a@obfuscated-bucket/test.jceks",
  "hive.sql.dbcp.password.key" = "oracle.secret",
  "hive.sql.schema" = "CHIRAN",
  "hive.sql.table" = "ABCD_TEST_1",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}

This can be worked around with "hive.sql.table" = "CHIRAN.ABCD_TEST_1" (see the sketch below), but that breaks CBO because pushdown won't happen. A possible fix would be to also include a schemaName check after the call at:
https://github.com/apache/hive/blob/master/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/conf/JdbcStorageConfigManager.java#L166

Attaching patch 1. Let me know if this looks good.
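The workaround mentioned above, spelled out: the same DDL with hive.sql.schema dropped and the schema folded into hive.sql.table. Per the description this works but sacrifices CBO pushdown:

{code}
CREATE EXTERNAL TABLE if not exists query_fed_oracle.ABCD_TEST_pw_case_jceks_diff(
  YEAR INT,
  QUANTITY INT,
  NAME STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "bucketing_version"="2",
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@//obfuscated.compute-1.amazonaws.com",
  "hive.sql.dbcp.username" = "DI_METADATA",
  "hive.sql.dbcp.password.keystore" = "jceks://s3a@obfuscated-bucket/test.jceks",
  "hive.sql.dbcp.password.key" = "oracle.secret",
  -- schema-qualified table name instead of hive.sql.schema; disables pushdown
  "hive.sql.table" = "CHIRAN.ABCD_TEST_1",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}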
[jira] [Created] (HIVE-24359) Hive Compaction hangs because of doAs when worker set to HS2
Chiran Ravani created HIVE-24359:

Summary: Hive Compaction hangs because of doAs when worker set to HS2
Key: HIVE-24359
URL: https://issues.apache.org/jira/browse/HIVE-24359
Project: Hive
Issue Type: Bug
Components: HiveServer2, Transactions
Reporter: Chiran Ravani

When creating a managed table and inserting data using Impala, with the compaction worker set to HiveServer2, in a secured environment (Kerberized cluster), the worker thread hangs indefinitely expecting the user to provide Kerberos credentials on STDIN.

The problem appears to be that no login context is passed from HS2 to HMS as part of QueryCompactor, and the HS2 JVM has the property javax.security.auth.useSubjectCredsOnly set to false, which causes it to prompt for logins via stdin. Setting it to true also does not help, as the context does not seem to be passed in any case.

Below is what is observed in the HS2 jstack. Note that the thread is waiting on stdin in "com.sun.security.auth.module.Krb5LoginModule.promptForName":

{code}
"c570-node2.abc.host.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 nid=0x1348 runnable [0x7f1beea95000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x9fa38c90> (a java.io.BufferedInputStream)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    - locked <0x8c7d5010> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    - locked <0x8c7d5010> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153)
    at com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120)
    at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708)
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at sun.security.jgss.GSSUtil.login(GSSUtil.java:258)
    at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175)
    at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341)
    at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
    at org.apache.thrift.transport.TSas
{code}
[jira] [Created] (HIVE-24245) Vectorized PTF with count and distinct over partition producing incorrect results.
Chiran Ravani created HIVE-24245:

Summary: Vectorized PTF with count and distinct over partition producing incorrect results.
Key: HIVE-24245
URL: https://issues.apache.org/jira/browse/HIVE-24245
Project: Hive
Issue Type: Bug
Components: Hive, PTF-Windowing, Vectorization
Affects Versions: 3.1.2, 3.1.0
Reporter: Chiran Ravani

Vectorized PTF with count and distinct over a partition is broken: it produces incorrect results. Below is the test case.

{code}
CREATE TABLE bigd781b_new (
  id int,
  txt1 string,
  txt2 string,
  cda_date int,
  cda_job_name varchar(12));

INSERT INTO bigd781b_new VALUES
  (1,'2010005759','7164335675012038',20200528,'load1'),
  (2,'2010005759','7164335675012038',20200528,'load2');
{code}

Running the query below produces incorrect results

{code}
SELECT txt1, txt2,
       count(distinct txt1) over(partition by txt1) as n,
       count(distinct txt2) over(partition by txt2) as m
FROM bigd781b_new
WHERE cda_date = 20200528 and (txt2 = '7164335675012038');
{code}

as below.

{code}
+-------------+-------------------+----+----+
|    txt1     |       txt2        | n  | m  |
+-------------+-------------------+----+----+
| 2010005759  | 7164335675012038  | 2  | 2  |
| 2010005759  | 7164335675012038  | 2  | 2  |
+-------------+-------------------+----+----+
{code}

The correct output would be

{code}
+-------------+-------------------+----+----+
|    txt1     |       txt2        | n  | m  |
+-------------+-------------------+----+----+
| 2010005759  | 7164335675012038  | 1  | 1  |
| 2010005759  | 7164335675012038  | 1  | 1  |
+-------------+-------------------+----+----+
{code}

The problem does not appear after setting hive.vectorized.execution.ptf.enabled=false, as shown below.
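The workaround from the description, as a runnable session sketch:

{code}
-- Per the report: with vectorized PTF disabled, the window counts come back
-- correct (n = 1, m = 1 for the data above).
set hive.vectorized.execution.ptf.enabled=false;

SELECT txt1, txt2,
       count(distinct txt1) over(partition by txt1) as n,
       count(distinct txt2) over(partition by txt2) as m
FROM bigd781b_new
WHERE cda_date = 20200528 and (txt2 = '7164335675012038');
{code}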
[jira] [Created] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
Chiran Ravani created HIVE-23873:

Summary: Querying Hive JDBCStorageHandler table fails with NPE
Key: HIVE-23873
URL: https://issues.apache.org/jira/browse/HIVE-23873
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 3.1.2, 3.1.1, 3.1.0
Reporter: Chiran Ravani

The scenario is a Hive table having the same schema as a table in Oracle; when we query the table with data, it fails with an NPE. Below is the trace.

{code}
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    ... 34 more
Caused by: java.lang.NullPointerException
    at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    ... 34 more
{code}

The problem appears when the column names in Oracle are in upper case. Since Hive forces table and column names to lowercase at creation time, the deserializer receives column names in lower case, and the lookup fails to find the value, so the user runs into the NPE while fetching data:
https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136

{code}
rowVal = ((ObjectWritable)value).get();
{code}

Log snippet:

{code}
2020-07-17T16:49:09,598 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) - Query to execute is [select * from TESTHIVEJDBCSTORAGE]
2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = ID
2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class java.lang.Integer,value=1]}
{code}

Simple reproducer for this case:

1. Create the table in Oracle:
{code}
create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20));
{code}
2. Insert dummy data:
{code}
Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1');
{code}

3. Create the JDBCStorageHandler table in Hive:
{code}
CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME VARCHAR(20))
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@10.96.95.99:49161/XE",
  "hive.sql.dbcp.username" = "chiran",
  "hive.sql.dbcp.password" = "hadoop",
  "hive.sql.table" = "TESTHIVEJDBCSTORAGE",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}

4. Query the Hive table; it fails with the NPE:
{code}
> select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL;
INFO : Compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc): select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testhivejdbcstorage_hive_tbl.id, type:int, comment:null), FieldSchema(name:testhivejdbcstorage_hive_tbl.fname, type:varchar(20), comment:null)
{code}
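One possible workaround sketch; this is an assumption, not something from the report. Oracle preserves case for quoted identifiers, so creating the source table with lowercase quoted column names should make the keys match the lowercased names Hive sends:

{code}
-- Hedged workaround sketch (assumption): quoted lowercase identifiers in
-- Oracle line up with Hive's lowercased column names, so the per-column
-- lookup in JdbcSerDe.deserialize finds its value instead of returning null.
create table "testhivejdbcstorage"("id" INT, "fname" VARCHAR(20));
{code}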
[jira] [Created] (HIVE-23454) Querying hive table which has Materialized view fails with HiveAccessControlException
Chiran Ravani created HIVE-23454:

Summary: Querying hive table which has Materialized view fails with HiveAccessControlException
Key: HIVE-23454
URL: https://issues.apache.org/jira/browse/HIVE-23454
Project: Hive
Issue Type: Bug
Components: Authorization, HiveServer2
Affects Versions: 3.0.0, 3.2.0
Reporter: Chiran Ravani

A query against a table fails with HiveAccessControlException when there is a materialized view pointing to that table which the end user does not have access to, even though the user has all privileges on the actual table.

From the HiveServer2 logs, it looks like, as part of optimization, Hive uses the materialized view to answer the query instead of the table, and since the end user does not have access to the MV we receive HiveAccessControlException.
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveVolcanoPlanner.java#L99

The simplest reproducer for this issue is as below.

1. Create a table as the hive user and insert some data:
{code:java}
create table db1.testmvtable(id int, name string) partitioned by(year int);
insert into db1.testmvtable partition(year=2020) values(1,'Name1');
insert into db1.testmvtable partition(year=2020) values(1,'Name2');
insert into db1.testmvtable partition(year=2016) values(1,'Name1');
insert into db1.testmvtable partition(year=2016) values(1,'Name2');
{code}

2. Create a materialized view on top of the above table, partitioned and with a where clause, as the hive user:
{code:java}
CREATE MATERIALIZED VIEW db2.testmv PARTITIONED ON(year) as
select * from db1.testmvtable tmv where year >= 2018;
{code}

3. Grant all (Select at minimum) access to user 'chiran' via Ranger on database db1.

4. Run a select on the base table db1.testmvtable as 'chiran' with a where clause whose partition value is >= 2018; it runs into HiveAccessControlException on db2.testmv:
{code:java}
0: jdbc:hive2://node2> select * from db1.testmvtable where year=2020;
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [chiran] does not have [SELECT] privilege on [db2/testmv/*] (state=42000,code=4)
{code}

5. This works when the partition value is not covered by the MV:
{code:java}
0: jdbc:hive2://node2> select * from db1.testmvtable where year=2016;
DEBUG : Acquired the compile lock.
INFO : Compiling command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): select * from db1.testmvtable where year=2016
DEBUG : Encoding valid txns info 897:9223372036854775807::893,895,896 txnid:897
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testmvtable.id, type:int, comment:null), FieldSchema(name:testmvtable.name, type:string, comment:null), FieldSchema(name:testmvtable.year, type:int, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); Time taken: 0.222 seconds
DEBUG : Encoding valid txn write ids info 897$db1.testmvtable:4:9223372036854775807:: txnid:897
INFO : Executing command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): select * from db1.testmvtable where year=2016
INFO : Completed executing command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); Time taken: 0.008 seconds
INFO : OK
DEBUG : Shutting down query select * from db1.testmvtable where year=2016
+-----------------+-------------------+-------------------+
| testmvtable.id  | testmvtable.name  | testmvtable.year  |
+-----------------+-------------------+-------------------+
| 1               | Name1             | 2016              |
| 1               | Name2             | 2016              |
+-----------------+-------------------+-------------------+
2 rows selected (0.302 seconds)
0: jdbc:hive2://node2>
{code}
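A session-level probe consistent with the analysis above; the property is standard Hive 3 configuration, though the report itself does not mention it:

{code:java}
-- Hedged sketch: if the optimizer rewrite to db2.testmv is the trigger, then
-- disabling materialized-view rewriting for the session should let the query
-- read db1.testmvtable directly under the user's own table privileges.
set hive.materializedview.rewriting=false;
select * from db1.testmvtable where year=2020;
{code}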
[jira] [Created] (HIVE-23439) Hive sessions over 24 hours encounter Kerberos-related StatsTask errors
Chiran Ravani created HIVE-23439:

Summary: Hive sessions over 24 hours encounter Kerberos-related StatsTask errors
Key: HIVE-23439
URL: https://issues.apache.org/jira/browse/HIVE-23439
Project: Hive
Issue Type: Bug
Components: HiveServer2, Standalone Metastore
Affects Versions: 3.1.0
Reporter: Chiran Ravani

We have an application that uses Hive via JDBC. The interesting thing about it is that it keeps sessions established with HiveServer2 for multiple days. After 24 hours, its queries start failing with StatsTask-related errors. From the logs, it looks like the communication breaks down between HiveServer2 and the Metastore. Below is the error seen:

{code}
2020-04-22T21:25:53,248 ERROR [Thread-1202599]: exec.StatsTask (:()) - Failed to run stats task
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table tennis. Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4927) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.persistColumnStats(ColStatsProcessor.java:189) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.process(ColStatsProcessor.java:86) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:108) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:82) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table tennis. Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1387) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1336) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1316) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1298) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4918) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    ... 6 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:4790) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4858) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4838) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1378) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1336) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1316) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1298) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4918) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    ... 6 more
{code}

The problem appears to be that the delegation token issued by the Hive Metastore could not be renewed by HiveServer2 within the 24-hour period. A similar issue is reported upstream as HIVE-22033; I backported that fix to my local cluster and deployed it, but it does not seem to address this issue. Proble
[jira] [Created] (HIVE-23265) Duplicate rowsets are returned with Limit and Offset ste
Chiran Ravani created HIVE-23265:

Summary: Duplicate rowsets are returned with Limit and Offset ste
Key: HIVE-23265
URL: https://issues.apache.org/jira/browse/HIVE-23265
Project: Hive
Issue Type: Bug
Components: HiveServer2, Vectorization
Affects Versions: 3.1.2, 3.1.0
Reporter: Chiran Ravani
Attachments: 00_0

We have a query which produces duplicate results even though there are no duplicate records in the underlying tables. Sample query:

{code:java}
select * from orderdatatest_ext order by col1 limit 1000,50
{code}

The problem appears when an order by clause is used with col1 having non-unique rows. Apparently the duplicates are produced during the reducer phase of the query; with hive.vectorized.execution.reduce.enabled=false the problem does not occur (see the sketch after the outputs below).

Data in the table is as follows:

{code:java}
1,1
1,2
1,3
.
.
1,1500
{code}

Results with hive.vectorized.execution.reduce.enabled=true:

{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1015                    |
| 1                       | 1016                    |
| 1                       | 1017                    |
| 1                       | 1018                    |
| 1                       | 1019                    |
| 1                       | 1020                    |
| 1                       | 1021                    |
| 1                       | 1022                    |
| 1                       | 1023                    |
| 1                       | 1024                    |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
+-------------------------+-------------------------+
{code}

Results with hive.vectorized.execution.reduce.enabled=false:

{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1
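The setting named above, as a runnable session sketch:

{code:java}
-- Per the report: with reduce-side vectorization off, the LIMIT ... OFFSET
-- query returns the expected rows with no duplicates.
set hive.vectorized.execution.reduce.enabled=false;
select * from orderdatatest_ext order by col1 limit 1000,50;
{code}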
[jira] [Created] (HIVE-22769) Incorrect query results and query failure during Split generation for compressed text files
Chiran Ravani created HIVE-22769:

Summary: Incorrect query results and query failure during Split generation for compressed text files
Key: HIVE-22769
URL: https://issues.apache.org/jira/browse/HIVE-22769
Project: Hive
Issue Type: Bug
Components: File Formats
Affects Versions: 3.1.0, 3.0.0
Reporter: Chiran Ravani
Attachments: testcase1.csv.bz2, testcase2.csv.bz2

Hive queries produce incorrect results when the data is in compressed text format, and for certain data the query fails during split generation. This behavior is seen when skip.header.line.count and skip.footer.line.count are set for the table.

Case 1: A select count/aggregate query produces incorrect row counts, and the plain select displays all rows (when hive.fetch.task.conversion=none; see the sketch at the end of this report).

Steps to reproduce:

1. Create the table as below:
{code}
CREATE EXTERNAL TABLE `testcase1`(id int, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
LOCATION '/user/hive/testcase1'
TBLPROPERTIES ("skip.header.line.count"="1", "skip.footer.line.count"="1");
{code}

2. Upload the attached testcase1.csv.bz2 file to /user/hive/testcase1

3. Run count(*) on the table:
{code}
> select * from testcase1;
INFO : Compiling command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f): select * from testcase1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testcase1.id, type:string, comment:null), FieldSchema(name:testcase1.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f); Time taken: 0.07 seconds
INFO : Executing command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f): select * from testcase1
INFO : Completed executing command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f); Time taken: 0.007 seconds
INFO : OK
+---------------+-----------------+
| testcase1.id  | testcase1.name  |
+---------------+-----------------+
| 2             | 2019-12-31      |
+---------------+-----------------+
1 row selected (0.111 seconds)

> select count(*) from testcase1
INFO : Compiling command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7): select count(*) from testcase1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7); Time taken: 0.073 seconds
INFO : Executing command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7): select count(*) from testcase1
INFO : Query ID = hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7
INFO : Session is already open
INFO : Dag name: select count(*) from testcase1 (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1579811438512_0046)
.
.
.
INFO : Completed executing command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7); Time taken: 4.228 seconds
INFO : OK
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (4.335 seconds)
{code}

Case 2: A select count/aggregate query fails with java.lang.ClassCastException: java.io.PushbackInputStream cannot be cast to org.apache.hadoop.fs.Seekable. The issue is only seen when there is a space in a field (e.g. in "3,2019-12-31 01" the second column has a space).

Steps to reproduce:
1. Create the table as below:
{code}
CREATE EXTERNAL TABLE `testcase2`(id int, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
LOCATION '/user/hive/testcase2'
TBLPROPERTIES ("skip.header.line.count"="1", "skip.footer.line.count"="1");
{code}

2. Upload the attached testcase2.csv.bz2 file to /user/hive/testcase2

3. Run count(*) on the table:
{code}
0: > select * from testcase2;
INFO : Compiling command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134): select * from testcase2
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testcase2.id, type:string, comment:null), FieldSchema(name:testcase2.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134); Time taken: 0.075 seconds
INFO : Executing command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134): select * from testcase2
INFO : Completed executing command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134); Time taken: 0.01 seconds
INFO :
{code}
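For completeness, the session setting Case 1 relies on to bypass the fetch-task shortcut; the table name is from the reproduction above:

{code}
-- From the description of Case 1: with fetch-task conversion disabled, the
-- plain SELECT also goes through split generation, where the header/footer
-- skipping misbehaves on the compressed file.
set hive.fetch.task.conversion=none;
select * from testcase1;
{code}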
[jira] [Created] (HIVE-22758) Create database with permission error when doAs set to true
Chiran Ravani created HIVE-22758:

Summary: Create database with permission error when doAs set to true
Key: HIVE-22758
URL: https://issues.apache.org/jira/browse/HIVE-22758
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Affects Versions: 3.1.0, 3.0.0
Reporter: Chiran Ravani
Assignee: Chiran Ravani

With doAs set to true, running create database on an external location fails with permission denied to write to the specified directory for the hive user (the user the HMS is running as).

Steps to reproduce the issue:

1. Turn on "Hive run as end user" (doAs=true).
2. Connect to Hive as some user other than admin, e.g. chiran.
3. Create a database with an external location:
{code}
create database externaldbexample location '/user/chiran/externaldbexample'
{code}

The above statement fails with an HDFS write permission denied error as below.

{code}
> create database externaldbexample location '/user/chiran/externaldbexample';
INFO : Compiling command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): create database externaldbexample location '/user/chiran/externaldbexample'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); Time taken: 1.377 seconds
INFO : Executing command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): create database externaldbexample location '/user/chiran/externaldbexample'
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException)
INFO : Completed executing command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); Time taken: 0.238 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException) (state=08S01,code=1)
{code}

From the Hive Metastore service log, the below is seen.

{code}
2020-01-22T04:36:27,870 WARN [pool-6-thread-6]: metastore.ObjectStore (ObjectStore.java:getDatabase(1010)) - Failed to get database hive.externaldbexample, returning NoSuchObjectException
2020-01-22T04:36:27,898 INFO [pool-6-thread-6]: metastore.HiveMetaStore (HiveMetaStore.java:run(1339)) - Creating database path in managed directory hdfs://c470-node2.squadron.support.hortonworks.com:8020/user/chiran/externaldbexample
2020-01-22T04:36:27,903 INFO [pool-6-thread-6]: utils.FileUtils (FileUtils.java:mkdir(170)) - Creating directory if it doesn't exist: hdfs://namenodeaddress:8020/user/chiran/externaldbexample
2020-01-22T04:36:27,932 ERROR [pool-6-thread-6]: utils.MetaStoreUtils (MetaStoreUtils.java:logAndThrowMetaException(169)) - Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/chiran":chiran:chiran:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1859)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1843)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1802)
    at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInform
{code}
[jira] [Created] (HIVE-22641) Columns returned in sorted order when show columns query is run with no search pattern.
Chiran Ravani created HIVE-22641:

Summary: Columns returned in sorted order when show columns query is run with no search pattern.
Key: HIVE-22641
URL: https://issues.apache.org/jira/browse/HIVE-22641
Project: Hive
Issue Type: Improvement
Components: Hive, HiveServer2
Affects Versions: 3.0.0
Reporter: Chiran Ravani

In Hive 1.2.1 and 2.0, displaying the columns of a table used to return them in the same order as they were created. For example:

{code}
create table col_order_test(server_name string, task_name string, partition_name string, start_time string, end_time string, table_owner string, table_name string) stored as orc;

show columns in col_order_test;
+-----------------+
|      field      |
+-----------------+
| server_name     |
| task_name       |
| partition_name  |
| start_time      |
| end_time        |
| table_owner     |
| table_name      |
+-----------------+
{code}

In Hive 3, columns are returned in sorted order for the same query; below is the output.

{code}
create table col_order_test(server_name string, task_name string, partition_name string, start_time string, end_time string, table_owner string, table_name string) stored as orc;

show columns in col_order_test;
+-----------------+
|      field      |
+-----------------+
| end_time        |
| partition_name  |
| server_name     |
| start_time      |
| table_name      |
| table_owner     |
| task_name       |
+-----------------+
{code}

The behaviour appears to have changed with the introduction of the column-search feature in [HIVE-18373|https://issues.apache.org/jira/browse/HIVE-18373]. This change can cause code that generates INSERT OVERWRITE statements from the column list to produce a different column order, which may result in query failures.

We would like to ask the community to improve [HIVE-18373|https://issues.apache.org/jira/browse/HIVE-18373] by returning columns in creation order when the search pattern provided by the user is null.
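For contrast, the search form introduced by HIVE-18373, where sorted output is reasonable; the LIKE syntax below is my reading of that Jira rather than something stated in this report:

{code}
-- Hedged sketch: the pattern form added by HIVE-18373. Sorting makes sense
-- here; the request above is only that the no-pattern form keep creation order.
show columns in col_order_test like 'table*';
{code}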
[jira] [Created] (HIVE-22459) Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more
Chiran Ravani created HIVE-22459:

Summary: Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more
Key: HIVE-22459
URL: https://issues.apache.org/jira/browse/HIVE-22459
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Chiran Ravani

The Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more, whereas in Hive 1.2 the results are consistent. Below is the output.

Note: The same query works well on Hive 3 when hive.fetch.task.conversion is set to none.

Steps to reproduce the problem:

{code}
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:datetimecol, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time taken: 0.479 seconds
INFO : Executing command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
INFO : Completed executing command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time taken: 0.013 seconds
INFO : OK
+--------------+
| datetimecol  |
+--------------+
| 2019-07-24   |
+--------------+
1 row selected (0.797 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}

After setting fetch task conversion to none:

{code}
0: jdbc:hive2://c1113-node2.squadron.support.> set hive.fetch.task.conversion=none;
No rows affected (0.017 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> set hive.fetch.task.conversion;
+----------------------------------+
|               set                |
+----------------------------------+
| hive.fetch.task.conversion=none  |
+----------------------------------+
1 row selected (0.015 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling command(queryId=hive_20191105103709_0c38e446-09cf-45dd-9553-365146f42452): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
+----------------------------+
|        datetimecol         |
+----------------------------+
| 2019-09-09T10:45:49+02:00  |
| 2019-07-24                 |
+----------------------------+
2 rows selected (5.327 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}
[jira] [Created] (HIVE-20756) Disable SARG leaf creation for date column until ORC-135
Chiran Ravani created HIVE-20756:

Summary: Disable SARG leaf creation for date column until ORC-135
Key: HIVE-20756
URL: https://issues.apache.org/jira/browse/HIVE-20756
Project: Hive
Issue Type: Bug
Affects Versions: 2.1.1
Reporter: Chiran Ravani
Assignee: Prasanth Jayachandran

Until ORC-135 is committed and the ORC version is updated in Hive, disable SARG leaf creation for timestamp columns in Hive.
[jira] [Created] (HIVE-17829) ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2
Chiran Ravani created HIVE-17829:

Summary: ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2
Key: HIVE-17829
URL: https://issues.apache.org/jira/browse/HIVE-17829
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 2.1.0
Reporter: Chiran Ravani
Priority: Critical

Stack trace:

{code}
2017-10-09T09:39:54,804 ERROR [HiveServer2-Background-Pool: Thread-95]: metadata.Table (Table.java:getColsInternal(642)) - Unable to get field from serde: org.apache.hadoop.hive.hbase.HBaseSerDe
java.lang.ArrayIndexOutOfBoundsException: 1
    at java.util.Arrays$ArrayList.get(Arrays.java:3841) ~[?:1.8.0_77]
    at org.apache.hadoop.hive.serde2.BaseStructObjectInspector.init(BaseStructObjectInspector.java:104) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.init(LazySimpleStructObjectInspector.java:97) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.<init>(LazySimpleStructObjectInspector.java:77) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory.getLazySimpleStructObjectInspector(LazyObjectInspectorFactory.java:115) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.hbase.HBaseLazyObjectFactory.createLazyHBaseStructInspector(HBaseLazyObjectFactory.java:79) ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:127) ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:531) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:424) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:411) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:279) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:261) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:639) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:622) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:833) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4228) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:347) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at java.security.AccessController.doPrivileged(Native Method) ~[?
{code}