[jira] [Created] (HIVE-26936) A predicate that compares 0 with -0 yields an incorrect result
Dayakar M created HIVE-26936:
Summary: A predicate that compares 0 with -0 yields an incorrect result
Key: HIVE-26936
URL: https://issues.apache.org/jira/browse/HIVE-26936
Project: Hive
Issue Type: Bug
Reporter: Dayakar M
Assignee: Dayakar M

Steps to reproduce:
{code:sql}
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 DOUBLE);
INSERT INTO t0 VALUES(0);
INSERT INTO t1 VALUES('-0');

SELECT * FROM t0, t1 WHERE t0.c0 = t1.c0; -- expected: {0.0, -0.0}, actual: {}
{code}
The join returns an empty result:
{noformat}
+--------+--------+
| t0.c0  | t1.c0  |
+--------+--------+
+--------+--------+
{noformat}
That the predicate should evaluate to TRUE can be verified with the following statement:
{code:sql}
SELECT t0.c0 = t1.c0 FROM t0, t1;
{code}
{noformat}
+-------+
|  _c0  |
+-------+
| true  |
+-------+
{noformat}
A similar issue was fixed earlier in [HIVE-11174|https://issues.apache.org/jira/browse/HIVE-11174] for WHERE clause conditions; now the join condition has the same problem.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
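The behavior above matches how Java itself treats the two zeros: the primitive `==` operator says `0.0` and `-0.0` are equal (as IEEE-754 requires), while `Double.compare` and boxed `equals` distinguish them. A join implementation that routes through the latter pair can therefore miss matches that a plain equality predicate finds. This is a minimal self-contained illustration of that asymmetry, not Hive code:

```java
public class ZeroCompare {
    public static void main(String[] args) {
        double pos = 0.0d;
        double neg = -0.0d;

        // IEEE-754 primitive comparison: +0.0 and -0.0 are equal.
        System.out.println(pos == neg); // true

        // Double.compare orders -0.0 strictly before +0.0, so it is nonzero.
        System.out.println(Double.compare(pos, neg)); // 1

        // Boxed equality compares bit patterns, so the two zeros differ.
        System.out.println(Double.valueOf(pos).equals(Double.valueOf(neg))); // false
    }
}
```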
Fix the HiveServer2 error (No valid credentials provided) when connecting to the metastore when the HADOOP_USER_NAME environment variable exists and Kerberos is enabled
Hi all,

I found a problem: when Kerberos is enabled and the HADOOP_USER_NAME environment variable is set, HiveServer2 reports an error (No valid credentials provided) when connecting to the Hive metastore. I want to fix it and have submitted a PR. Could someone help review it? Thanks!

JIRA: https://issues.apache.org/jira/browse/HIVE-26739?filter=-2
PR: https://github.com/apache/hive/pull/3764

Weiliang Hao
[jira] [Created] (HIVE-26935) Expose root cause of MetaException to client sides
Wechar created HIVE-26935:
Summary: Expose root cause of MetaException to client sides
Key: HIVE-26935
URL: https://issues.apache.org/jira/browse/HIVE-26935
Project: Hive
Issue Type: Improvement
Components: Hive
Affects Versions: 4.0.0-alpha-2
Reporter: Wechar
Assignee: Wechar

MetaException is generated by Thrift, and only the {{message}} field is transported to the client. We should expose the root cause in the message sent to clients, which has the following advantages:
* More friendly for user troubleshooting
* Some root causes are unrecoverable; exposing them can avoid unnecessary retries

*How to Reproduce:*
- Step 1: Disable direct SQL in HMS for our test case.
- Step 2: Add an illegal {{PART_COL_STATS}} row for a partition.
- Step 3: Try to {{drop table}} with Spark.

The exception in the Hive metastore is:
{code:sh}
2023-01-11T17:13:51,259 ERROR [Metastore-Handler-Pool: Thread-39]: metastore.ObjectStore (ObjectStore.java:run(4369)) -
javax.jdo.JDOUserException: One or more instances could not be deleted
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:625) ~[datanucleus-api-jdo-5.2.8.jar:?]
    at org.datanucleus.api.jdo.JDOQuery.deletePersistentInternal(JDOQuery.java:530) ~[datanucleus-api-jdo-5.2.8.jar:?]
    at org.datanucleus.api.jdo.JDOQuery.deletePersistentAll(JDOQuery.java:499) ~[datanucleus-api-jdo-5.2.8.jar:?]
    at org.apache.hadoop.hive.metastore.QueryWrapper.deletePersistentAll(QueryWrapper.java:108) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsNoTxn(ObjectStore.java:4207) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.access$1000(ObjectStore.java:285) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore$7.run(ObjectStore.java:3086) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:74) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsViaJdo(ObjectStore.java:3074) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.access$400(ObjectStore.java:285) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3058) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3050) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:4362) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsInternal(ObjectStore.java:3061) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitions(ObjectStore.java:3040) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_332]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_332]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_332]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at com.sun.proxy.$Proxy24.dropPartitions(Unknown Source) ~[?:?]
    at org.apache.hadoop.hive.metastore.HMSHandler.dropPartitionsAndGetLocations(HMSHandler.java:3186) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_core(HMSHandler.java:2963) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3211) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3199) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_332]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_332]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_332]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:146)
{code}
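One possible shape for the improvement is to fold the innermost cause into the MetaException message before it crosses the Thrift boundary, since only the message survives the trip to the client. The helper below is a hypothetical sketch to illustrate the idea, not the actual Hive patch; `withRootCause` and its signature are invented for this example:

```java
public class RootCauseDemo {
    // Hypothetical helper: walk the cause chain to the innermost throwable
    // and append it to the message that will be transported to the client.
    static String withRootCause(String message, Throwable t) {
        Throwable root = t;
        while (root.getCause() != null && root.getCause() != root) {
            root = root.getCause();
        }
        return message + " (root cause: " + root.getClass().getSimpleName()
                + ": " + root.getMessage() + ")";
    }

    public static void main(String[] args) {
        // Simulate the nesting from the stack trace above: a JDO failure
        // wrapped by a higher-level drop-table failure.
        Exception root = new IllegalStateException("One or more instances could not be deleted");
        Exception wrapped = new RuntimeException("drop table failed", root);
        System.out.println(withRootCause("MetaException", wrapped));
    }
}
```

With this, the client-side message would carry enough context to decide, for instance, that a retry cannot succeed.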
[jira] [Created] (HIVE-26934) Update TXN_WRITE_NOTIFICATION_LOG table schema to make WNL_TABLE width consistent with other tables
Harshal Patel created HIVE-26934:
Summary: Update TXN_WRITE_NOTIFICATION_LOG table schema to make WNL_TABLE width consistent with other tables
Key: HIVE-26934
URL: https://issues.apache.org/jira/browse/HIVE-26934
Project: Hive
Issue Type: Improvement
Reporter: Harshal Patel

* TXN_WRITE_NOTIFICATION_LOG stores the table name as varchar(128), while other metastore tables use varchar(256) for the Hive table name.
* So, if a user creates a Hive table with a name longer than 128 characters, then during an Incremental Repl Load operation the EVENT_COMMIT_TXN event reads the table name from TXN_WRITE_NOTIFICATION_LOG, gets only the first 128 characters, and eventually constructs a wrong file path.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
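The silent truncation can be illustrated with plain Java string handling, standing in for what a VARCHAR(128) column does to a longer name. This is illustration only, not Hive code; the table name and path are made up:

```java
public class TruncationDemo {
    public static void main(String[] args) {
        // A hypothetical table name longer than 128 characters.
        String table = "t".repeat(200);

        // What a VARCHAR(128) column silently keeps of it.
        String stored = table.substring(0, Math.min(table.length(), 128));

        // A path rebuilt from the stored name no longer matches the real one.
        String realPath = "/warehouse/" + table;
        String rebuiltPath = "/warehouse/" + stored;
        System.out.println(realPath.equals(rebuiltPath)); // false
    }
}
```

Widening the column to varchar(256), as the issue proposes, removes the mismatch rather than papering over it downstream.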
[jira] [Created] (HIVE-26933) Cleanup dump directory for eventId that failed in previous dump cycle
Harshal Patel created HIVE-26933:
Summary: Cleanup dump directory for eventId that failed in previous dump cycle
Key: HIVE-26933
URL: https://issues.apache.org/jira/browse/HIVE-26933
Project: Hive
Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel

# If an Incremental Dump operation fails while dumping an event id to the staging directory, the dump directory for that event id, along with the _dumpmetadata file, still exists in the dump location, which is recorded in the _events_dump file.
# When the user triggers the dump operation for this policy again, it resumes dumping from the failed event id and tries to dump it again; but since the directory for that event id was already created in the previous cycle, it fails with the exception:
{noformat}
[Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.
org.apache.hadoop.fs.FileAlreadyExistsException: /warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata for client 172.27.182.5 already exists
    at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
{noformat}

-- This message was sent by Atlassian Jira (v8.20.10#820010)
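A fix along the lines the issue suggests would delete any leftover event directory before re-dumping it. The sketch below uses `java.nio.file` on a local path purely to illustrate the idea; the actual repl dump code would go through Hadoop's FileSystem API against the HDFS staging directory, and the method name here is invented:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class StaleEventDirCleanup {
    // Illustrative: remove a partially written event directory so the next
    // dump cycle can recreate it instead of failing with "already exists".
    static void cleanupStaleEventDir(Path eventDir) throws IOException {
        if (!Files.exists(eventDir)) {
            return; // nothing left over from a failed cycle
        }
        try (Stream<Path> walk = Files.walk(eventDir)) {
            // Delete children before parents (deepest paths first).
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a leftover event directory with a partial _dumpmetadata file.
        Path dir = Files.createTempDirectory("event-dir");
        Files.writeString(dir.resolve("_dumpmetadata"), "partial");
        cleanupStaleEventDir(dir);
        System.out.println(Files.exists(dir)); // false
    }
}
```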
[jira] [Created] (HIVE-26932) Correct stage name value in replication_metrics.progress column in replication_metrics table
Harshal Patel created HIVE-26932:
Summary: Correct stage name value in replication_metrics.progress column in replication_metrics table
Key: HIVE-26932
URL: https://issues.apache.org/jira/browse/HIVE-26932
Project: Hive
Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel

To improve diagnostic capability for source-to-backup replication, update the replication_metrics table by adding a pre_optimized_bootstrap stage to the progress column during the first cycle of optimized bootstrap.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-26931) REPL LOAD command does not throw any error for incorrect syntax
Subhasis Gorai created HIVE-26931:
Summary: REPL LOAD command does not throw any error for incorrect syntax
Key: HIVE-26931
URL: https://issues.apache.org/jira/browse/HIVE-26931
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Subhasis Gorai

In some cases, users use the REPL LOAD command incorrectly. It does not throw any meaningful error/warning message, and it does not replicate the database either. For example:
{code:java}
repl load target_db with ('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/hive/repl', 'hive.repl.include.external.tables'= 'true', 'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/target_db.db'){code}
The above command does not follow the REPL LOAD syntax. It produces no error message, nor does it replicate the database, so it causes confusion.
{code:java}
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u> repl load test_1_replica with ('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 'hive.repl.include.external.tables'= 'true', 'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db');
INFO : Compiling command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl load test_1_replica with ('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 'hive.repl.include.external.tables'= 'true', 'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time taken: 0.051 seconds
INFO : Executing command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl load test_1_replica with ('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 'hive.repl.include.external.tables'= 'true', 'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO : Completed executing command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time taken: 0.001 seconds
INFO : OK
No rows affected (0.065 seconds)
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u>{code}
Ideally, since this is a wrong command, it should throw an error.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-26930) Support for increased retention of Notification Logs and Change Manager entries
Subhasis Gorai created HIVE-26930: - Summary: Support for increased retention of Notification Logs and Change Manager entries Key: HIVE-26930 URL: https://issues.apache.org/jira/browse/HIVE-26930 Project: Hive Issue Type: Improvement Components: Standalone Metastore Reporter: Subhasis Gorai Assignee: Subhasis Gorai In order to support the Planned/Unplanned Failover use cases, we need the capability to increase the retention period for both the Notification Logs and Change Manager entries until the successful reverse replication is done (i.e. the Optimized Bootstrap). If the relevant Notification logs and Change Manager entries are not retained, we can't perform a successful Optimized Bootstrap. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-26929) Allow creating iceberg tables without column definition when 'metadata_location' tblproperties is set.
Dharmik Thakkar created HIVE-26929:
Summary: Allow creating iceberg tables without column definition when 'metadata_location' tblproperties is set.
Key: HIVE-26929
URL: https://issues.apache.org/jira/browse/HIVE-26929
Project: Hive
Issue Type: Improvement
Components: Iceberg integration
Reporter: Dharmik Thakkar

Allow creating Iceberg tables without a column definition when the 'metadata_location' table property is set. Iceberg supports pointing to an external metadata.json file to infer the table schema. Irrespective of the schema defined in the CREATE TABLE statement, the metadata.json is used to create the table, so we should allow creating a table without a column definition when metadata_location is defined in tblproperties.
{code:java}
create table test_meta (id int, name string, cgpa decimal) stored by iceberg stored as orc;
describe formatted test_meta;
create table test_meta_copy(id int) stored by iceberg tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
As a result of the above SQL, test_meta_copy gets the same schema as test_meta, irrespective of the columns specified in the CREATE TABLE statement:
||col_name||data_type||
|id|int|
|name|string|
|cgpa|decimal(10,0)|
| |NULL|
|*# Detailed Table Information*|NULL|
|Database:|iceberg_test_db_hive|
|OwnerType:|USER|
|Owner:|hive|
|CreateTime:|Tue Jan 10 21:49:08 UTC 2023|
|LastAccessTime:|Fri Dec 12 21:41:41 UTC 1969|
|Retention:|2147483647|
|Location:|s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta|
|Table Type:|EXTERNAL_TABLE|
|Table Parameters:|NULL|
| |EXTERNAL|
| |bucketing_version|
| |engine.hive.enabled|
| |metadata_location|
| |numFiles|
| |numRows|
| |rawDataSize|
| |serialization.format|
| |storage_handler|
| |table_type|
| |totalSize|
| |transient_lastDdlTime|
| |uuid|
| |write.format.default|
| |NULL|
|*# Storage Information*|NULL|
|SerDe Library:|org.apache.iceberg.mr.hive.HiveIcebergSerDe|
|InputFormat:|org.apache.iceberg.mr.hive.HiveIcebergInputFormat|
|OutputFormat:|org.apache.iceberg.mr.hive.HiveIcebergOutputFormat|
|Compressed:|No|
|Sort Columns:|[]|

However, if we skip passing the column definition, the query fails:
{code:java}
create table test_meta_copy2 stored by iceberg tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
error:
{code:java}
INFO : Compiling command(queryId=hive_20230110220019_94ffafef-f531-4532-a07c-0e46e3879f19): create table test_meta_copy2 stored by iceberg tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json')
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: