[jira] [Created] (HIVE-26374) Query based compaction fails for tables with CDT and columns with Reserved Keywords
Chiran Ravani created HIVE-26374:

Summary: Query based compaction fails for tables with CDT and columns with Reserved Keywords
Key: HIVE-26374
URL: https://issues.apache.org/jira/browse/HIVE-26374
Project: Hive
Issue Type: Bug
Components: Hive, Transactions
Affects Versions: 4.0.0-alpha-1, 3.1.3
Reporter: Chiran Ravani

Query-based compaction fails on tables that have complex data types whose field names are reserved keywords. The compaction fails while creating the temporary table because the generated DDL does not quote the struct field names correctly. Below are the steps to reproduce the issue.

{code:java}
create table complex_dt_compact2(col1 array<struct<arr_col1:int,`timestamp`:string>>);
insert into complex_dt_compact2 SELECT ARRAY(NAMED_STRUCT('arr_col1',1,'timestamp','2022-07-05 21:51:20.371'));
insert into complex_dt_compact2 SELECT ARRAY(NAMED_STRUCT('arr_col1',2,'timestamp','2022-07-05 21:51:20.371'));
alter table complex_dt_compact2 compact 'major' and wait;
{code}

Error:

{code:java}
2022-07-05T22:15:47.710Z hiveserver2-0.hiveserver2-service.compute-1657056457-xkcx.svc.cluster.local hiveserver2 1 dbb4011d-c788-4b99-a31d-06bb6dd7182e [mdc@18060 class="compactor.Worker" level="ERROR" thread="hiveserver2-0.hiveserver2-service.compute-1657056457-xkcx.svc.cluster.local-64_executor"] Caught exception while trying to compact id:3,dbname:default,tableName:complex_dt_compact2,partName:null,state:,type:MAJOR,enqueueTime:0,start:0,properties:null,runAs:hive,tooManyAborts:false,hasOldAbort:false,highestWriteId:3,errorMessage:null,workerId: null,initiatorId: null,retryRetention: 0. Marking failed to avoid repeated failures
java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(`operation` int, `originalTransaction` bigint, `bucket` int, `rowId` bigint, `currentTransaction` bigint, `row` struct<`col1` :array<struct<arr_col1:int,timestamp:string>>>) stored as orc LOCATION 's3a://obfuscated/clusters/obfuscated/obfuscated/warehouse/tablespace/managed/hive/complex_dt_compact2/base_003_v038' TBLPROPERTIES ('compactiontable'='true', 'transactional'='false')
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:120)
    at org.apache.hadoop.hive.ql.txn.compactor.MajorQueryCompactor.runCompaction(MajorQueryCompactor.java:63)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.findNextCompactionAndExecute(Worker.java:517)
    at org.apache.hadoop.hive.ql.txn.compactor.Worker.lambda$run$0(Worker.java:120)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(`operation` int, `originalTransaction` bigint, `bucket` int, `rowId` bigint, `currentTransaction` bigint, `row` struct<`col1` :array<struct<arr_col1:int,timestamp:string>>>) stored as orc LOCATION 's3a://obfuscated/clusters/obfuscated/obfuscated/warehouse/tablespace/managed/hive/complex_dt_compact2/base_003_v038' TBLPROPERTIES ('compactiontable'='true', 'transactional'='false')
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:73)
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:50)
    at org.apache.hadoop.hive.ql.txn.compactor.QueryCompactor.runCompactionQueries(QueryCompactor.java:113)
    ... 7 more
Caused by: (responseCode = 4, errorMessage = FAILED: ParseException line 1:241 cannot recognize input near 'timestamp' ':' 'string' in column specification, SQLState = 42000, exception = line 1:241 cannot recognize input near 'timestamp' ':' 'string' in column specification)
    at org.apache.hadoop.hive.ql.DriverUtils.createProcessorException(DriverUtils.java:143)
    at org.apache.hadoop.hive.ql.Compiler.handleException(Compiler.java:466)
    at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:122)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:197)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:636)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:694)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:526)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:515)
    at org.apache.hadoop.hive.ql.DriverUtils.runOnDriver(DriverUtils.java:70)
    ... 9 more
{code}
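For illustration, a hedged sketch of the temporary-table DDL with the struct field names quoted the way the top-level columns already are. Whether the compactor should emit exactly this form is an assumption; the point is that the statement parses once the reserved word timestamp is backquoted (the s3a LOCATION from the log is omitted here):

{code:java}
-- Sketch, not the actual fix: the same generated DDL as in the error above,
-- but with the struct field names backquoted so `timestamp` no longer trips
-- the parser. TBLPROPERTIES semantics are unchanged.
CREATE temporary external table default_tmp_compactor_complex_dt_compact2_1657059347578(
  `operation` int,
  `originalTransaction` bigint,
  `bucket` int,
  `rowId` bigint,
  `currentTransaction` bigint,
  `row` struct<`col1`:array<struct<`arr_col1`:int,`timestamp`:string>>>)
stored as orc
TBLPROPERTIES ('compactiontable'='true', 'transactional'='false');
{code}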
[jira] [Created] (HIVE-26320) Incorrect case evaluation for Parquet based table.
Chiran Ravani created HIVE-26320:

Summary: Incorrect case evaluation for Parquet based table.
Key: HIVE-26320
URL: https://issues.apache.org/jira/browse/HIVE-26320
Project: Hive
Issue Type: Improvement
Components: HiveServer2, Query Planning
Affects Versions: 4.0.0-alpha-1
Reporter: Chiran Ravani

A query involving a case statement with two or more conditions returns incorrect results for tables in Parquet format. The problem is not observed with ORC or TextFile.

*Steps to reproduce*:

{code:java}
create external table case_test_parquet(kob varchar(2),enhanced_type_code int) stored as parquet;
insert into case_test_parquet values('BB',18),('BC',18),('AB',18);

select case when ( (kob='BB' and enhanced_type_code='18')
               or (kob='BC' and enhanced_type_code='18') )
       then 1 else 0 end as logic_check
from case_test_parquet;
{code}

Result:
{code}
0
0
0
{code}

Expected result:
{code}
1
1
0
{code}

The problem does not appear when setting hive.optimize.point.lookup=false.
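Based on that last observation, a minimal workaround sketch for a session:

{code:java}
-- Workaround sketch from the description: disable the point-lookup rewrite
-- and re-run the reproducer; per the report the query then returns 1, 1, 0.
set hive.optimize.point.lookup=false;

select case when ( (kob='BB' and enhanced_type_code='18')
               or (kob='BC' and enhanced_type_code='18') )
       then 1 else 0 end as logic_check
from case_test_parquet;
{code}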
[jira] [Created] (HIVE-25980) Support HiveMetaStoreChecker.checkTable operation with multi-threaded execution
Chiran Ravani created HIVE-25980:

Summary: Support HiveMetaStoreChecker.checkTable operation with multi-threaded execution
Key: HIVE-25980
URL: https://issues.apache.org/jira/browse/HIVE-25980
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Affects Versions: 3.1.2, 4.0.0
Reporter: Chiran Ravani
Assignee: Chiran Ravani

MSCK REPAIR TABLE on a table with many partitions can run slowly on cloud storage such as S3; in one case we investigated, the slowness was in HiveMetaStoreChecker.checkTable.

{code:java}
"HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
   java.lang.Thread.State: RUNNABLE
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
    at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
    at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
    at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
    at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
    at com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
    at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
    at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
    at com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
    at com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
    at com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
    at com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
    at com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
    at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
    at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
    at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5437)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5384)
    at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1367)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$10(S3AFileSystem.java:2458)
    at org.apache.hadoop.fs.s3a.S3AFileSystem$$Lambda$437/835000758.apply(Unknown Source)
    at org.apache.hadoop.fs
{code}
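For context, the kind of statement that drives this code path; the table name is hypothetical:

{code:java}
-- Hypothetical example: repairing a heavily partitioned external table. Per
-- the report, HiveMetaStoreChecker.checkTable runs single-threaded today, so
-- on S3 each partition check pays a round trip like the one in the dump above.
MSCK REPAIR TABLE sales_events_ext;
{code}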
[jira] [Created] (HIVE-25661) Cover the test case for HIVE-25626
Chiran Ravani created HIVE-25661:

Summary: Cover the test case for HIVE-25626
Key: HIVE-25661
URL: https://issues.apache.org/jira/browse/HIVE-25661
Project: Hive
Issue Type: Test
Components: Hive
Reporter: Chiran Ravani

Add test coverage for HIVE-25626 so the fix does not regress in the future. HIVE-25594 introduces multiple JDBCStorageHandler test cases; once that is in upstream it will be easy to add this case.
[jira] [Created] (HIVE-25626) JDBCStorageHandler fails when JDBC_PASSWORD_URI is used
Chiran Ravani created HIVE-25626:

Summary: JDBCStorageHandler fails when JDBC_PASSWORD_URI is used
Key: HIVE-25626
URL: https://issues.apache.org/jira/browse/HIVE-25626
Project: Hive
Issue Type: Bug
Components: Hive, JDBC storage handler
Affects Versions: 3.1.2, 4.0.0
Reporter: Chiran Ravani

When a table is created with JDBCStorageHandler and JDBC_PASSWORD_URI is used as the password mechanism, CBO fails, causing all the data to be fetched from the source DB and then processed in Hive.
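A sketch of the affected setup. The property name hive.sql.dbcp.password.uri is an assumption about what JDBC_PASSWORD_URI maps to, and the endpoint, credentials path, and table names are hypothetical:

{code:java}
-- Hedged sketch of a table using the password-URI mechanism; everything below
-- other than the storage handler class and the standard hive.sql.* property
-- family is hypothetical, including the assumed password-URI property name.
CREATE EXTERNAL TABLE pwuri_repro (id INT, name STRING)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "MYSQL",
  "hive.sql.jdbc.driver" = "com.mysql.jdbc.Driver",
  "hive.sql.jdbc.url" = "jdbc:mysql://example-host/exampledb",
  "hive.sql.dbcp.username" = "hiveuser",
  "hive.sql.dbcp.password.uri" = "file:///etc/hive/conf/jdbc.password",
  "hive.sql.table" = "PWURI_REPRO"
);
-- Per the report, querying such a table makes CBO fail, so the whole remote
-- table is fetched and filtered in Hive rather than pushed down.
{code}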
[jira] [Created] (HIVE-25605) JdbcStorageHandler Create table fails when hive.sql.schema is specified and is not the default one
Chiran Ravani created HIVE-25605:

Summary: JdbcStorageHandler Create table fails when hive.sql.schema is specified and is not the default one
Key: HIVE-25605
URL: https://issues.apache.org/jira/browse/HIVE-25605
Project: Hive
Issue Type: Bug
Components: JDBC storage handler
Affects Versions: 4.0.0
Reporter: Chiran Ravani

We have observed a CREATE TABLE failure for JdbcStorageHandler against Oracle when a schema name is specified in the table properties and that schema is not the user's default one.

For example, consider username DI_METADATA with default schema DI_METADATA in Oracle, where this user also has access to other schemas such as CHIRAN. The create statement below then fails in Hive with:

{code}
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Error while trying to get column names: ORA-00942: table or view does not exist
{code}

{code}
CREATE EXTERNAL TABLE if not exists query_fed_oracle.ABCD_TEST_pw_case_jceks_diff(
  YEAR INT,
  QUANTITY INT,
  NAME STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "bucketing_version"="2",
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@//obfuscated.compute-1.amazonaws.com",
  "hive.sql.dbcp.username" = "DI_METADATA",
  "hive.sql.dbcp.password.keystore" = "jceks://s3a@obfuscated-bucket/test.jceks",
  "hive.sql.dbcp.password.key" = "oracle.secret",
  "hive.sql.schema" = "CHIRAN",
  "hive.sql.table" = "ABCD_TEST_1",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}

This can be worked around with "hive.sql.table" = "CHIRAN.ABCD_TEST_1" (see the sketch below), but that breaks CBO because pushdown won't happen. A possible fix would be to also include a schemaName check after the call at:
https://github.com/apache/hive/blob/master/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/conf/JdbcStorageConfigManager.java#L166

Attaching patch 1. Let me know if this looks good.
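The workaround mentioned above, spelled out: the same DDL with hive.sql.schema dropped and the schema folded into hive.sql.table. Per the description this works but sacrifices CBO pushdown:

{code}
CREATE EXTERNAL TABLE if not exists query_fed_oracle.ABCD_TEST_pw_case_jceks_diff(
  YEAR INT,
  QUANTITY INT,
  NAME STRING
)
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "bucketing_version"="2",
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@//obfuscated.compute-1.amazonaws.com",
  "hive.sql.dbcp.username" = "DI_METADATA",
  "hive.sql.dbcp.password.keystore" = "jceks://s3a@obfuscated-bucket/test.jceks",
  "hive.sql.dbcp.password.key" = "oracle.secret",
  -- schema-qualified table name instead of hive.sql.schema; disables pushdown
  "hive.sql.table" = "CHIRAN.ABCD_TEST_1",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}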
[jira] [Created] (HIVE-24359) Hive Compaction hangs because of doAs when worker set to HS2
Chiran Ravani created HIVE-24359:

Summary: Hive Compaction hangs because of doAs when worker set to HS2
Key: HIVE-24359
URL: https://issues.apache.org/jira/browse/HIVE-24359
Project: Hive
Issue Type: Bug
Components: HiveServer2, Transactions
Reporter: Chiran Ravani

When creating a managed table and inserting data using Impala, with the compaction worker set to HiveServer2, in a secured environment (Kerberized cluster), the worker thread hangs indefinitely expecting the user to provide Kerberos credentials on STDIN.

The problem appears to be that no login context is passed from HS2 to HMS as part of QueryCompactor, and the HS2 JVM has the property javax.security.auth.useSubjectCredsOnly set to false, which causes it to prompt for logins via stdin. Setting it to true also does not help, as the context does not seem to be passed in any case.

Below is what is observed in the HS2 jstack. Note that the thread is waiting on stdin in "com.sun.security.auth.module.Krb5LoginModule.promptForName":

{code}
"c570-node2.abc.host.com-44_executor" #47 daemon prio=1 os_prio=0 tid=0x01506000 nid=0x1348 runnable [0x7f1beea95000]
   java.lang.Thread.State: RUNNABLE
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:255)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    - locked <0x9fa38c90> (a java.io.BufferedInputStream)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    - locked <0x8c7d5010> (a java.io.InputStreamReader)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    - locked <0x8c7d5010> (a java.io.InputStreamReader)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at com.sun.security.auth.callback.TextCallbackHandler.readLine(TextCallbackHandler.java:153)
    at com.sun.security.auth.callback.TextCallbackHandler.handle(TextCallbackHandler.java:120)
    at com.sun.security.auth.module.Krb5LoginModule.promptForName(Krb5LoginModule.java:862)
    at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:708)
    at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
    at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
    at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
    at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
    at sun.security.jgss.GSSUtil.login(GSSUtil.java:258)
    at sun.security.jgss.krb5.Krb5Util.getInitialTicket(Krb5Util.java:175)
    at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:341)
    at sun.security.jgss.krb5.Krb5InitCredential$1.run(Krb5InitCredential.java:337)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.security.jgss.krb5.Krb5InitCredential.getTgt(Krb5InitCredential.java:336)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:146)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:189)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
    at org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
    at org.apache.thrift.transport.TSas
{code}
[jira] [Created] (HIVE-24245) Vectorized PTF with count and distinct over partition producing incorrect results.
Chiran Ravani created HIVE-24245:

Summary: Vectorized PTF with count and distinct over partition producing incorrect results.
Key: HIVE-24245
URL: https://issues.apache.org/jira/browse/HIVE-24245
Project: Hive
Issue Type: Bug
Components: Hive, PTF-Windowing, Vectorization
Affects Versions: 3.1.2, 3.1.0
Reporter: Chiran Ravani

Vectorized PTF with count and distinct over a partition is broken: it produces incorrect results. Below is the test case.

{code}
CREATE TABLE bigd781b_new (
  id int,
  txt1 string,
  txt2 string,
  cda_date int,
  cda_job_name varchar(12));

INSERT INTO bigd781b_new VALUES
  (1,'2010005759','7164335675012038',20200528,'load1'),
  (2,'2010005759','7164335675012038',20200528,'load2');
{code}

Running the query below produces incorrect results

{code}
SELECT txt1, txt2,
       count(distinct txt1) over(partition by txt1) as n,
       count(distinct txt2) over(partition by txt2) as m
FROM bigd781b_new
WHERE cda_date = 20200528 and (txt2 = '7164335675012038');
{code}

as below.

{code}
+-------------+-------------------+----+----+
|    txt1     |       txt2        | n  | m  |
+-------------+-------------------+----+----+
| 2010005759  | 7164335675012038  | 2  | 2  |
| 2010005759  | 7164335675012038  | 2  | 2  |
+-------------+-------------------+----+----+
{code}

The correct output would be

{code}
+-------------+-------------------+----+----+
|    txt1     |       txt2        | n  | m  |
+-------------+-------------------+----+----+
| 2010005759  | 7164335675012038  | 1  | 1  |
| 2010005759  | 7164335675012038  | 1  | 1  |
+-------------+-------------------+----+----+
{code}

The problem does not appear after setting hive.vectorized.execution.ptf.enabled=false, as shown below.
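The workaround from the description, as a runnable session sketch:

{code}
-- Per the report: with vectorized PTF disabled, the window counts come back
-- correct (n = 1, m = 1 for the data above).
set hive.vectorized.execution.ptf.enabled=false;

SELECT txt1, txt2,
       count(distinct txt1) over(partition by txt1) as n,
       count(distinct txt2) over(partition by txt2) as m
FROM bigd781b_new
WHERE cda_date = 20200528 and (txt2 = '7164335675012038');
{code}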
[jira] [Created] (HIVE-23873) Querying Hive JDBCStorageHandler table fails with NPE
Chiran Ravani created HIVE-23873:

Summary: Querying Hive JDBCStorageHandler table fails with NPE
Key: HIVE-23873
URL: https://issues.apache.org/jira/browse/HIVE-23873
Project: Hive
Issue Type: Bug
Components: HiveServer2, JDBC
Affects Versions: 3.1.2, 3.1.1, 3.1.0
Reporter: Chiran Ravani

The scenario is a Hive table having the same schema as a table in Oracle; when we query the table with data, it fails with an NPE. Below is the trace.

{code}
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:617) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    ... 34 more
Caused by: java.lang.NullPointerException
    at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:164) ~[hive-jdbc-handler-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:598) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:524) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:2739) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:229) ~[hive-exec-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:473) ~[hive-service-3.1.0.3.1.5.0-152.jar:3.1.0.3.1.5.0-152]
    ... 34 more
{code}

The problem appears when the column names in Oracle are in upper case. Since Hive forces table and column names to lowercase at creation time, the deserializer receives column names in lower case, and the lookup fails to find the value, so the user runs into the NPE while fetching data:
https://github.com/apache/hive/blob/rel/release-3.1.2/jdbc-handler/src/main/java/org/apache/hive/storage/jdbc/JdbcSerDe.java#L136

{code}
rowVal = ((ObjectWritable)value).get();
{code}

Log snippet:

{code}
2020-07-17T16:49:09,598 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: dao.GenericJdbcDatabaseAccessor (:()) - Query to execute is [select * from TESTHIVEJDBCSTORAGE]
2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** ColumnKey = ID
2020-07-17T16:49:10,642 INFO [04ed42ec-91d2-4662-aee7-37e840a06036 HiveServer2-Handler-Pool: Thread-104]: jdbc.JdbcSerDe (:()) - *** Blob value = {fname=OW[class=class java.lang.String,value=Name1], id=OW[class=class java.lang.Integer,value=1]}
{code}

Simple reproducer for this case:

1. Create the table in Oracle:
{code}
create table TESTHIVEJDBCSTORAGE(ID INT, FNAME VARCHAR(20));
{code}
2. Insert dummy data:
{code}
Insert into TESTHIVEJDBCSTORAGE values (1, 'Name1');
{code}

3. Create the JDBCStorageHandler table in Hive:
{code}
CREATE EXTERNAL TABLE default.TESTHIVEJDBCSTORAGE_HIVE_TBL (ID INT, FNAME VARCHAR(20))
STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
TBLPROPERTIES (
  "hive.sql.database.type" = "ORACLE",
  "hive.sql.jdbc.driver" = "oracle.jdbc.OracleDriver",
  "hive.sql.jdbc.url" = "jdbc:oracle:thin:@10.96.95.99:49161/XE",
  "hive.sql.dbcp.username" = "chiran",
  "hive.sql.dbcp.password" = "hadoop",
  "hive.sql.table" = "TESTHIVEJDBCSTORAGE",
  "hive.sql.dbcp.maxActive" = "1"
);
{code}

4. Query the Hive table; it fails with the NPE:
{code}
> select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL;
INFO : Compiling command(queryId=hive_20200717164857_cd6f5020-4a69-4a2d-9e63-9db99d0121bc): select * from default.TESTHIVEJDBCSTORAGE_HIVE_TBL
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testhivejdbcstorage_hive_tbl.id, type:int, comment:null), FieldSchema(name:testhivejdbcstorage_hive_tbl.fname, type:varchar(20), comment:null)
{code}
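One possible workaround sketch; this is an assumption, not something from the report. Oracle preserves case for quoted identifiers, so creating the source table with lowercase quoted column names should make the keys match the lowercased names Hive sends:

{code}
-- Hedged workaround sketch (assumption): quoted lowercase identifiers in
-- Oracle line up with Hive's lowercased column names, so the per-column
-- lookup in JdbcSerDe.deserialize finds its value instead of returning null.
create table "testhivejdbcstorage"("id" INT, "fname" VARCHAR(20));
{code}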
[jira] [Created] (HIVE-23454) Querying hive table which has Materialized view fails with HiveAccessControlException
Chiran Ravani created HIVE-23454:

Summary: Querying hive table which has Materialized view fails with HiveAccessControlException
Key: HIVE-23454
URL: https://issues.apache.org/jira/browse/HIVE-23454
Project: Hive
Issue Type: Bug
Components: Authorization, HiveServer2
Affects Versions: 3.0.0, 3.2.0
Reporter: Chiran Ravani

A query against a table fails with HiveAccessControlException when there is a materialized view pointing to that table which the end user does not have access to, even though the user has all privileges on the actual table.

From the HiveServer2 logs, it looks like, as part of optimization, Hive uses the materialized view to answer the query instead of the table, and since the end user does not have access to the MV we receive HiveAccessControlException.
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveVolcanoPlanner.java#L99

The simplest reproducer for this issue is as below.

1. Create a table as the hive user and insert some data:
{code:java}
create table db1.testmvtable(id int, name string) partitioned by(year int);
insert into db1.testmvtable partition(year=2020) values(1,'Name1');
insert into db1.testmvtable partition(year=2020) values(1,'Name2');
insert into db1.testmvtable partition(year=2016) values(1,'Name1');
insert into db1.testmvtable partition(year=2016) values(1,'Name2');
{code}

2. Create a materialized view on top of the above table, partitioned and with a where clause, as the hive user:
{code:java}
CREATE MATERIALIZED VIEW db2.testmv PARTITIONED ON(year) as
select * from db1.testmvtable tmv where year >= 2018;
{code}

3. Grant all (Select at minimum) access to user 'chiran' via Ranger on database db1.

4. Run a select on the base table db1.testmvtable as 'chiran' with a where clause whose partition value is >= 2018; it runs into HiveAccessControlException on db2.testmv:
{code:java}
0: jdbc:hive2://node2> select * from db1.testmvtable where year=2020;
Error: Error while compiling statement: FAILED: HiveAccessControlException Permission denied: user [chiran] does not have [SELECT] privilege on [db2/testmv/*] (state=42000,code=4)
{code}

5. This works when the partition value is not covered by the MV:
{code:java}
0: jdbc:hive2://node2> select * from db1.testmvtable where year=2016;
DEBUG : Acquired the compile lock.
INFO : Compiling command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): select * from db1.testmvtable where year=2016
DEBUG : Encoding valid txns info 897:9223372036854775807::893,895,896 txnid:897
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testmvtable.id, type:int, comment:null), FieldSchema(name:testmvtable.name, type:string, comment:null), FieldSchema(name:testmvtable.year, type:int, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); Time taken: 0.222 seconds
DEBUG : Encoding valid txn write ids info 897$db1.testmvtable:4:9223372036854775807:: txnid:897
INFO : Executing command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a): select * from db1.testmvtable where year=2016
INFO : Completed executing command(queryId=hive_20200507130248_841458fe-7048-4727-8816-3f9472d2a67a); Time taken: 0.008 seconds
INFO : OK
DEBUG : Shutting down query select * from db1.testmvtable where year=2016
+-----------------+-------------------+-------------------+
| testmvtable.id  | testmvtable.name  | testmvtable.year  |
+-----------------+-------------------+-------------------+
| 1               | Name1             | 2016              |
| 1               | Name2             | 2016              |
+-----------------+-------------------+-------------------+
2 rows selected (0.302 seconds)
0: jdbc:hive2://node2>
{code}
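A session-level probe consistent with the analysis above; the property is standard Hive 3 configuration, though the report itself does not mention it:

{code:java}
-- Hedged sketch: if the optimizer rewrite to db2.testmv is the trigger, then
-- disabling materialized-view rewriting for the session should let the query
-- read db1.testmvtable directly under the user's own table privileges.
set hive.materializedview.rewriting=false;
select * from db1.testmvtable where year=2020;
{code}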
[jira] [Created] (HIVE-23439) Hive sessions over 24 hours encounter Kerberos-related StatsTask errors
Chiran Ravani created HIVE-23439:

Summary: Hive sessions over 24 hours encounter Kerberos-related StatsTask errors
Key: HIVE-23439
URL: https://issues.apache.org/jira/browse/HIVE-23439
Project: Hive
Issue Type: Bug
Components: HiveServer2, Standalone Metastore
Affects Versions: 3.1.0
Reporter: Chiran Ravani

We have an application that uses Hive via JDBC. The interesting thing about it is that it keeps sessions established with HiveServer2 for multiple days. After 24 hours, its queries start failing with StatsTask-related errors. From the logs, it looks like the communication breaks down between HiveServer2 and the Metastore. Below is the error seen:

{code}
2020-04-22T21:25:53,248 ERROR [Thread-1202599]: exec.StatsTask (:()) - Failed to run stats task
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table tennis. Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4927) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.persistColumnStats(ColStatsProcessor.java:189) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.stats.ColStatsProcessor.process(ColStatsProcessor.java:86) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:108) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:82) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table tennis. Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1387) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1336) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1316) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1298) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4918) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    ... 6 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:86) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:95) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:148) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:119) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:4790) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4858) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:4838) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1378) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1336) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1316) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1298) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    at org.apache.hadoop.hive.ql.metadata.Hive.setPartitionColumnStatistics(Hive.java:4918) ~[hive-exec-3.1.0.3.1.4.39-6.jar:3.1.0.3.1.4.39-6]
    ... 6 more
{code}

The problem appears to be that the delegation token issued by the Hive Metastore could not be renewed by HiveServer2 within the 24-hour period. A similar issue is reported upstream as HIVE-22033; I backported that fix to my local cluster and deployed it, but it does not seem to address this issue. Proble
[jira] [Created] (HIVE-23265) Duplicate rowsets are returned with Limit and Offset ste
Chiran Ravani created HIVE-23265:

Summary: Duplicate rowsets are returned with Limit and Offset ste
Key: HIVE-23265
URL: https://issues.apache.org/jira/browse/HIVE-23265
Project: Hive
Issue Type: Bug
Components: HiveServer2, Vectorization
Affects Versions: 3.1.2, 3.1.0
Reporter: Chiran Ravani
Attachments: 00_0

We have a query which produces duplicate results even though there are no duplicate records in the underlying tables. Sample query:

{code:java}
select * from orderdatatest_ext order by col1 limit 1000,50
{code}

The problem appears when an order by clause is used with col1 having non-unique rows. Apparently the duplicates are produced during the reducer phase of the query; with hive.vectorized.execution.reduce.enabled=false the problem does not occur (see the sketch after the outputs below).

Data in the table is as follows:

{code:java}
1,1
1,2
1,3
.
.
1,1500
{code}

Results with hive.vectorized.execution.reduce.enabled=true:

{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1015                    |
| 1                       | 1016                    |
| 1                       | 1017                    |
| 1                       | 1018                    |
| 1                       | 1019                    |
| 1                       | 1020                    |
| 1                       | 1021                    |
| 1                       | 1022                    |
| 1                       | 1023                    |
| 1                       | 1024                    |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
| 1                       | 1                       |
+-------------------------+-------------------------+
{code}

Results with hive.vectorized.execution.reduce.enabled=false:

{code:java}
+-------------------------+-------------------------+
| orderdatatest_ext.col1  | orderdatatest_ext.col2  |
+-------------------------+-------------------------+
| 1                       | 1001                    |
| 1                       | 1002                    |
| 1                       | 1003                    |
| 1                       | 1004                    |
| 1                       | 1005                    |
| 1                       | 1006                    |
| 1                       | 1007                    |
| 1                       | 1008                    |
| 1                       | 1009                    |
| 1                       | 1010                    |
| 1                       | 1011                    |
| 1                       | 1012                    |
| 1                       | 1013                    |
| 1                       | 1014                    |
| 1                       | 1
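The setting named above, as a runnable session sketch:

{code:java}
-- Per the report: with reduce-side vectorization off, the LIMIT ... OFFSET
-- query returns the expected rows with no duplicates.
set hive.vectorized.execution.reduce.enabled=false;
select * from orderdatatest_ext order by col1 limit 1000,50;
{code}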
[jira] [Created] (HIVE-22769) Incorrect query results and query failure during Split generation for compressed text files
Chiran Ravani created HIVE-22769:

Summary: Incorrect query results and query failure during Split generation for compressed text files
Key: HIVE-22769
URL: https://issues.apache.org/jira/browse/HIVE-22769
Project: Hive
Issue Type: Bug
Components: File Formats
Affects Versions: 3.1.0, 3.0.0
Reporter: Chiran Ravani
Attachments: testcase1.csv.bz2, testcase2.csv.bz2

Hive queries produce incorrect results when the data is in compressed text format, and for certain data the query fails during split generation. This behavior is seen when skip.header.line.count and skip.footer.line.count are set for the table.

Case 1: A select count/aggregate query produces incorrect row counts, and the plain select displays all rows (when hive.fetch.task.conversion=none; see the sketch at the end of this report).

Steps to reproduce:

1. Create the table as below:
{code}
CREATE EXTERNAL TABLE `testcase1`(id int, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
LOCATION '/user/hive/testcase1'
TBLPROPERTIES ("skip.header.line.count"="1", "skip.footer.line.count"="1");
{code}

2. Upload the attached testcase1.csv.bz2 file to /user/hive/testcase1

3. Run count(*) on the table:
{code}
> select * from testcase1;
INFO : Compiling command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f): select * from testcase1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testcase1.id, type:string, comment:null), FieldSchema(name:testcase1.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f); Time taken: 0.07 seconds
INFO : Executing command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f): select * from testcase1
INFO : Completed executing command(queryId=hive_20200124053854_454b03c1-d4c5-4dba-a2c2-91c09f4b670f); Time taken: 0.007 seconds
INFO : OK
+---------------+-----------------+
| testcase1.id  | testcase1.name  |
+---------------+-----------------+
| 2             | 2019-12-31      |
+---------------+-----------------+
1 row selected (0.111 seconds)

> select count(*) from testcase1
INFO : Compiling command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7): select count(*) from testcase1
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7); Time taken: 0.073 seconds
INFO : Executing command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7): select count(*) from testcase1
INFO : Query ID = hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7
INFO : Session is already open
INFO : Dag name: select count(*) from testcase1 (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1579811438512_0046)
.
.
.
INFO : Completed executing command(queryId=hive_20200124053645_a7d699b7-c7e1-4d92-8d99-666b0a010ba7); Time taken: 4.228 seconds
INFO : OK
+------+
| _c0  |
+------+
| 3    |
+------+
1 row selected (4.335 seconds)
{code}

Case 2: A select count/aggregate query fails with java.lang.ClassCastException: java.io.PushbackInputStream cannot be cast to org.apache.hadoop.fs.Seekable. The issue is only seen when there is a space in a field (e.g. in "3,2019-12-31 01" the second column has a space).

Steps to reproduce:
1. Create the table as below:
{code}
CREATE EXTERNAL TABLE `testcase2`(id int, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
LOCATION '/user/hive/testcase2'
TBLPROPERTIES ("skip.header.line.count"="1", "skip.footer.line.count"="1");
{code}

2. Upload the attached testcase2.csv.bz2 file to /user/hive/testcase2

3. Run count(*) on the table:
{code}
0: > select * from testcase2;
INFO : Compiling command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134): select * from testcase2
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testcase2.id, type:string, comment:null), FieldSchema(name:testcase2.name, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134); Time taken: 0.075 seconds
INFO : Executing command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134): select * from testcase2
INFO : Completed executing command(queryId=hive_20200124053159_5d8ce56a-183d-4359-a147-bd470d82e134); Time taken: 0.01 seconds
INFO :
{code}
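For completeness, the session setting Case 1 relies on to bypass the fetch-task shortcut; the table name is from the reproduction above:

{code}
-- From the description of Case 1: with fetch-task conversion disabled, the
-- plain SELECT also goes through split generation, where the header/footer
-- skipping misbehaves on the compressed file.
set hive.fetch.task.conversion=none;
select * from testcase1;
{code}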
[jira] [Created] (HIVE-22758) Create database with permission error when doAs set to true
Chiran Ravani created HIVE-22758:

Summary: Create database with permission error when doAs set to true
Key: HIVE-22758
URL: https://issues.apache.org/jira/browse/HIVE-22758
Project: Hive
Issue Type: Improvement
Components: Standalone Metastore
Affects Versions: 3.1.0, 3.0.0
Reporter: Chiran Ravani
Assignee: Chiran Ravani

With doAs set to true, running create database on an external location fails with permission denied to write to the specified directory for the hive user (the user the HMS is running as).

Steps to reproduce the issue:

1. Turn on "Hive run as end user" (doAs=true).
2. Connect to Hive as some user other than admin, e.g. chiran.
3. Create a database with an external location:
{code}
create database externaldbexample location '/user/chiran/externaldbexample'
{code}

The above statement fails with an HDFS write permission denied error as below.

{code}
> create database externaldbexample location '/user/chiran/externaldbexample';
INFO : Compiling command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): create database externaldbexample location '/user/chiran/externaldbexample'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); Time taken: 1.377 seconds
INFO : Executing command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d): create database externaldbexample location '/user/chiran/externaldbexample'
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException)
INFO : Completed executing command(queryId=hive_20200122043626_5c95e1fd-ce00-45fd-b58d-54f5e579f87d); Time taken: 0.238 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException) (state=08S01,code=1)
{code}

From the Hive Metastore service log, the below is seen.

{code}
2020-01-22T04:36:27,870 WARN [pool-6-thread-6]: metastore.ObjectStore (ObjectStore.java:getDatabase(1010)) - Failed to get database hive.externaldbexample, returning NoSuchObjectException
2020-01-22T04:36:27,898 INFO [pool-6-thread-6]: metastore.HiveMetaStore (HiveMetaStore.java:run(1339)) - Creating database path in managed directory hdfs://c470-node2.squadron.support.hortonworks.com:8020/user/chiran/externaldbexample
2020-01-22T04:36:27,903 INFO [pool-6-thread-6]: utils.FileUtils (FileUtils.java:mkdir(170)) - Creating directory if it doesn't exist: hdfs://namenodeaddress:8020/user/chiran/externaldbexample
2020-01-22T04:36:27,932 ERROR [pool-6-thread-6]: utils.MetaStoreUtils (MetaStoreUtils.java:logAndThrowMetaException(169)) - Got exception: org.apache.hadoop.security.AccessControlException Permission denied: user=hive, access=WRITE, inode="/user/chiran":chiran:chiran:drwxr-xr-x
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:399)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:255)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:193)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1859)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1843)
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1802)
    at org.apache.hadoop.hdfs.server.namenode.FSDirMkdirOp.mkdirs(FSDirMkdirOp.java:59)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3150)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1126)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:707)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInform
{code}
[jira] [Created] (HIVE-22641) Columns returned in sorted order when show columns query is run with no search pattern.
Chiran Ravani created HIVE-22641:

Summary: Columns returned in sorted order when show columns query is run with no search pattern.
Key: HIVE-22641
URL: https://issues.apache.org/jira/browse/HIVE-22641
Project: Hive
Issue Type: Improvement
Components: Hive, HiveServer2
Affects Versions: 3.0.0
Reporter: Chiran Ravani

In Hive 1.2.1 and 2.0, displaying the columns of a table used to return them in the same order as they were created. For example:

{code}
create table col_order_test(server_name string, task_name string, partition_name string, start_time string, end_time string, table_owner string, table_name string) stored as orc;

show columns in col_order_test;
+-----------------+
|      field      |
+-----------------+
| server_name     |
| task_name       |
| partition_name  |
| start_time      |
| end_time        |
| table_owner     |
| table_name      |
+-----------------+
{code}

In Hive 3, columns are returned in sorted order for the same query; below is the output.

{code}
create table col_order_test(server_name string, task_name string, partition_name string, start_time string, end_time string, table_owner string, table_name string) stored as orc;

show columns in col_order_test;
+-----------------+
|      field      |
+-----------------+
| end_time        |
| partition_name  |
| server_name     |
| start_time      |
| table_name      |
| table_owner     |
| task_name       |
+-----------------+
{code}

The behaviour appears to have changed with the introduction of the column-search feature in [HIVE-18373|https://issues.apache.org/jira/browse/HIVE-18373]. This change can cause code that generates INSERT OVERWRITE statements from the column list to produce a different column order, which may result in query failures.

We would like to ask the community to improve [HIVE-18373|https://issues.apache.org/jira/browse/HIVE-18373] by returning columns in creation order when the search pattern provided by the user is null.
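For contrast, the search form introduced by HIVE-18373, where sorted output is reasonable; the LIKE syntax below is my reading of that Jira rather than something stated in this report:

{code}
-- Hedged sketch: the pattern form added by HIVE-18373. Sorting makes sense
-- here; the request above is only that the no-pattern form keep creation order.
show columns in col_order_test like 'table*';
{code}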
[jira] [Created] (HIVE-22459) Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more
Chiran Ravani created HIVE-22459:

Summary: Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more
Key: HIVE-22459
URL: https://issues.apache.org/jira/browse/HIVE-22459
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 3.0.0
Reporter: Chiran Ravani

The Hive datediff function provides inconsistent results when hive.fetch.task.conversion is set to more, whereas in Hive 1.2 the results are consistent. Below is the output.

Note: The same query works well on Hive 3 when hive.fetch.task.conversion is set to none.

Steps to reproduce the problem:

{code}
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:datetimecol, type:string, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time taken: 0.479 seconds
INFO : Executing command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
INFO : Completed executing command(queryId=hive_20191105103636_1dff22a1-02f3-48a8-b076-0b91272f2268); Time taken: 0.013 seconds
INFO : OK
+--------------+
| datetimecol  |
+--------------+
| 2019-07-24   |
+--------------+
1 row selected (0.797 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}

After setting fetch task conversion to none:

{code}
0: jdbc:hive2://c1113-node2.squadron.support.> set hive.fetch.task.conversion=none;
No rows affected (0.017 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> set hive.fetch.task.conversion;
+----------------------------------+
|               set                |
+----------------------------------+
| hive.fetch.task.conversion=none  |
+----------------------------------+
1 row selected (0.015 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.> select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183;
INFO : Compiling command(queryId=hive_20191105103709_0c38e446-09cf-45dd-9553-365146f42452): select datetimecol from testdatediff where datediff(cast(current_timestamp as string), datetimecol)<183
+----------------------------+
|        datetimecol         |
+----------------------------+
| 2019-09-09T10:45:49+02:00  |
| 2019-07-24                 |
+----------------------------+
2 rows selected (5.327 seconds)
0: jdbc:hive2://c1113-node2.squadron.support.>
{code}
[jira] [Created] (HIVE-20756) Disable SARG leaf creation for date column until ORC-135
Chiran Ravani created HIVE-20756:

Summary: Disable SARG leaf creation for date column until ORC-135
Key: HIVE-20756
URL: https://issues.apache.org/jira/browse/HIVE-20756
Project: Hive
Issue Type: Bug
Affects Versions: 2.1.1
Reporter: Chiran Ravani
Assignee: Prasanth Jayachandran

Until ORC-135 is committed and the ORC version is updated in Hive, disable SARG leaf creation for timestamp columns in Hive.
[jira] [Created] (HIVE-17829) ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2
Chiran Ravani created HIVE-17829:

Summary: ArrayIndexOutOfBoundsException - HBASE-backed tables with Avro schema in Hive2
Key: HIVE-17829
URL: https://issues.apache.org/jira/browse/HIVE-17829
Project: Hive
Issue Type: Bug
Components: HBase Handler
Affects Versions: 2.1.0
Reporter: Chiran Ravani
Priority: Critical

Stack trace:

{code}
2017-10-09T09:39:54,804 ERROR [HiveServer2-Background-Pool: Thread-95]: metadata.Table (Table.java:getColsInternal(642)) - Unable to get field from serde: org.apache.hadoop.hive.hbase.HBaseSerDe
java.lang.ArrayIndexOutOfBoundsException: 1
    at java.util.Arrays$ArrayList.get(Arrays.java:3841) ~[?:1.8.0_77]
    at org.apache.hadoop.hive.serde2.BaseStructObjectInspector.init(BaseStructObjectInspector.java:104) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.init(LazySimpleStructObjectInspector.java:97) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.<init>(LazySimpleStructObjectInspector.java:77) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazyObjectInspectorFactory.getLazySimpleStructObjectInspector(LazyObjectInspectorFactory.java:115) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.hbase.HBaseLazyObjectFactory.createLazyHBaseStructInspector(HBaseLazyObjectFactory.java:79) ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.hbase.HBaseSerDe.initialize(HBaseSerDe.java:127) ~[hive-hbase-handler-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:54) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:531) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:424) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:411) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:279) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:261) ~[hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:639) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:622) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:833) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:869) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4228) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:347) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1905) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1607) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1354) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1123) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) [hive-exec-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:334) [hive-service-2.1.0.2.6.2.0-205.jar:2.1.0.2.6.2.0-205]
    at java.security.AccessController.doPrivileged(Native Method) ~[?
{code}