[jira] [Created] (HIVE-26936) A predicate that compares 0 with -0 yields an incorrect result

2023-01-11 Thread Dayakar M (Jira)
Dayakar M created HIVE-26936:


 Summary: A predicate that compares 0 with -0 yields an incorrect 
result
 Key: HIVE-26936
 URL: https://issues.apache.org/jira/browse/HIVE-26936
 Project: Hive
  Issue Type: Bug
Reporter: Dayakar M
Assignee: Dayakar M


Steps to reproduce:
CREATE TABLE t0(c0 INT);
CREATE TABLE t1(c0 DOUBLE);
INSERT INTO t0 VALUES(0);
INSERT INTO t1 VALUES('-0');

SELECT * FROM t0, t1 WHERE t0.c0 = t1.c0; -- expected: {0.0, -0.0}, actual: {}
+--------+--------+
| t0.c0  | t1.c0  |
+--------+--------+
+--------+--------+
That the predicate should evaluate to TRUE can be verified with the following statement:
SELECT t0.c0 = t1.c0 FROM t0, t1; -- 1
+-------+
|  _c0  |
+-------+
| true  |
+-------+
A similar issue was fixed earlier in [HIVE-11174|https://issues.apache.org/jira/browse/HIVE-11174] for the WHERE clause condition; now the join condition has the same problem.
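For context, this class of bug usually comes down to IEEE 754 semantics: with primitive double comparison, 0.0 == -0.0 is true (which matches SQL equality), but Java's Double.compare, Double.equals, and the raw bit pattern all distinguish the two values by sign bit. A join path that matches keys via Double.compare or doubleToLongBits-based hashing would therefore miss the row; that is a plausible mechanism here, not a confirmed one. A minimal, self-contained illustration in plain Java (independent of Hive's actual comparator code):

```java
// Under IEEE 754, primitive comparison treats 0.0 and -0.0 as equal,
// but Double.compare/Double.equals distinguish them by sign bit.
public class NegativeZero {
    public static void main(String[] args) {
        double pos = 0.0;
        double neg = -0.0;

        System.out.println(pos == neg);               // true  (SQL-style equality)
        System.out.println(Double.compare(pos, neg)); // 1     (0.0 ordered above -0.0)
        System.out.println(Double.valueOf(pos).equals(Double.valueOf(neg))); // false

        // The sign bit survives in the raw representation, which is what
        // any bit-pattern-based key matching or hashing would see:
        System.out.println(Double.doubleToLongBits(pos)
                == Double.doubleToLongBits(neg));     // false
    }
}
```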

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Fix the hiveserver2 error(No valid credentials provided) connecting metastore when HADOOP_USER_NAME environment variable exists and kerberos is enabled

2023-01-11 Thread Weiliang Hao
hi, all
I found a problem: when Kerberos is enabled and the environment variable 
HADOOP_USER_NAME is set, HiveServer2 reports an error (No valid credentials 
provided) when connecting to the Hive metastore.
I would like to fix it and have submitted a PR. Could someone help review it? Thanks!
jira: https://issues.apache.org/jira/browse/HIVE-26739?filter=-2
PR: https://github.com/apache/hive/pull/3764


Weiliang Hao

[jira] [Created] (HIVE-26935) Expose root cause of MetaException to client sides

2023-01-11 Thread Wechar (Jira)
Wechar created HIVE-26935:
-

 Summary: Expose root cause of MetaException to client sides
 Key: HIVE-26935
 URL: https://issues.apache.org/jira/browse/HIVE-26935
 Project: Hive
  Issue Type: Improvement
  Components: Hive
Affects Versions: 4.0.0-alpha-2
Reporter: Wechar
Assignee: Wechar


MetaException is generated by Thrift, and only the {{message}} field is 
transported to the client. We should expose the root cause in the message sent 
to clients, which has the following advantages:
 * It is friendlier for user troubleshooting.
 * Some root causes are unrecoverable; exposing them lets clients skip unnecessary retries.
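As a sketch of the idea (hypothetical code, not Hive's actual implementation): walk the cause chain to find the root cause and fold it into the single message string that Thrift can carry.

```java
// Hypothetical sketch: since Thrift only transports the message string,
// append the root cause's class and message to it before sending.
public class RootCauseDemo {
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        // Follow the cause chain to its end, guarding against self-cycles.
        while (cur.getCause() != null && cur.getCause() != cur) {
            cur = cur.getCause();
        }
        return cur;
    }

    static String messageWithRootCause(Throwable t) {
        Throwable root = rootCause(t);
        if (root == t) {
            return String.valueOf(t.getMessage());
        }
        return t.getMessage() + " (root cause: " + root.getClass().getName()
                + ": " + root.getMessage() + ")";
    }

    public static void main(String[] args) {
        Exception root = new IllegalStateException("One or more instances could not be deleted");
        Exception wrapped = new RuntimeException("Exception while processing", root);
        // Prints the outer message plus the root cause appended to it.
        System.out.println(messageWithRootCause(wrapped));
    }
}
```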

*How to Reproduce:*
 - Step 1: Disable direct SQL in HMS for our test case.
 - Step 2: Add an illegal {{PART_COL_STATS}} entry for a partition.
 - Step 3: Try to {{drop table}} with Spark.

The exception in Hive metastore is:
{code:java}
2023-01-11T17:13:51,259 ERROR [Metastore-Handler-Pool: Thread-39]: metastore.ObjectStore (ObjectStore.java:run(4369)) - javax.jdo.JDOUserException: One or more instances could not be deleted
	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:625) ~[datanucleus-api-jdo-5.2.8.jar:?]
	at org.datanucleus.api.jdo.JDOQuery.deletePersistentInternal(JDOQuery.java:530) ~[datanucleus-api-jdo-5.2.8.jar:?]
	at org.datanucleus.api.jdo.JDOQuery.deletePersistentAll(JDOQuery.java:499) ~[datanucleus-api-jdo-5.2.8.jar:?]
	at org.apache.hadoop.hive.metastore.QueryWrapper.deletePersistentAll(QueryWrapper.java:108) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsNoTxn(ObjectStore.java:4207) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.access$1000(ObjectStore.java:285) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore$7.run(ObjectStore.java:3086) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.Batchable.runBatched(Batchable.java:74) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsViaJdo(ObjectStore.java:3074) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.access$400(ObjectStore.java:285) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3058) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore$6.getJdoResult(ObjectStore.java:3050) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:4362) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitionsInternal(ObjectStore.java:3061) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.ObjectStore.dropPartitions(ObjectStore.java:3040) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_332]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_332]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_332]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at com.sun.proxy.$Proxy24.dropPartitions(Unknown Source) ~[?:?]
	at org.apache.hadoop.hive.metastore.HMSHandler.dropPartitionsAndGetLocations(HMSHandler.java:3186) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_core(HMSHandler.java:2963) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3211) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at org.apache.hadoop.hive.metastore.HMSHandler.drop_table_with_environment_context(HMSHandler.java:3199) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_332]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_332]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_332]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_332]
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:146)
{code}

[jira] [Created] (HIVE-26934) Update TXN_WRITE_NOTIFICATION_LOG table schema to make WNL_TABLE width consistent with other tables

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26934:


 Summary: Update TXN_WRITE_NOTIFICATION_LOG table schema to make 
WNL_TABLE width consistent with other tables
 Key: HIVE-26934
 URL: https://issues.apache.org/jira/browse/HIVE-26934
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel


* TXN_WRITE_NOTIFICATION_LOG stores the table name as varchar(128), while other metastore tables store the Hive table name as varchar(256).
 * So, if a user creates a Hive table whose name is longer than 128 characters, then during an Incremental Repl Load operation the EVENT_COMMIT_TXN event reads the table name from the TXN_WRITE_NOTIFICATION_LOG table, gets only the first 128 characters, and eventually builds a wrong file path.
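The failure mode can be sketched as follows (illustrative plain Java; the column widths are the ones described above, and the path layout is made up for the example):

```java
// Sketch of the failure mode: a name longer than 128 characters is silently
// truncated when stored in a varchar(128) column, so a path rebuilt from the
// stored value no longer matches the path built from the original name.
public class TruncationDemo {
    static final int WNL_TABLE_WIDTH = 128;   // TXN_WRITE_NOTIFICATION_LOG
    static final int OTHER_TABLE_WIDTH = 256; // other metastore tables

    static String store(String name, int width) {
        return name.length() <= width ? name : name.substring(0, width);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 150; i++) sb.append('t');
        String tableName = sb.toString(); // a 150-character table name

        String fromOther = store(tableName, OTHER_TABLE_WIDTH);
        String fromWnl = store(tableName, WNL_TABLE_WIDTH);

        System.out.println(fromOther.equals(tableName)); // true: fits in 256
        System.out.println(fromWnl.equals(tableName));   // false: cut to 128 chars
        // A path built from the truncated name points to the wrong location:
        System.out.println(("/warehouse/db.db/" + fromWnl)
                .equals("/warehouse/db.db/" + tableName)); // false
    }
}
```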





[jira] [Created] (HIVE-26933) Cleanup dump directory for eventId which was failed in previous dump cycle

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26933:


 Summary: Cleanup dump directory for eventId which was failed in 
previous dump cycle
 Key: HIVE-26933
 URL: https://issues.apache.org/jira/browse/HIVE-26933
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel


# If an Incremental Dump operation fails while dumping an event id into the staging directory, the dump directory for that event id, along with the file _dumpmetadata, still exists in the dump location, which is recorded in the _events_dump file.
 # When the user triggers the dump operation for this policy again, it resumes dumping from the failed event id and tries to dump it again, but because that event id's directory was already created in the previous cycle, it fails with the exception:

{noformat}
[Scheduled Query Executor(schedule:repl_policytest7, execution_id:7181)]: 
FAILED: Execution Error, return code 4 from 
org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. 
org.apache.hadoop.fs.FileAlreadyExistsException: 
/warehouse/tablespace/staging/policytest7/dGVzdDc=/14bcf976-662b-4237-b5bb-e7d63a1d089f/hive/137961/_dumpmetadata
 for client 172.27.182.5 already exists
    at 
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:388)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2576)
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2473)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:773)
    at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:490)
    at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
    at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894){noformat}
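A minimal sketch of the proposed cleanup, assuming the fix is simply to delete any leftover event-id directory before re-dumping it (illustrative java.nio code; Hive itself would go through Hadoop's FileSystem API, and the directory layout here is simplified):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch: before re-dumping an event id, remove any partially
// written directory left by a failed previous cycle, so that
// FileAlreadyExistsException cannot occur on resume.
public class EventDumpCleanup {
    static void cleanLeftoverEventDir(Path dumpRoot, long eventId) throws IOException {
        Path eventDir = dumpRoot.resolve(Long.toString(eventId));
        if (Files.exists(eventDir)) {
            try (Stream<Path> walk = Files.walk(eventDir)) {
                // Reverse order deletes children before their parent directories.
                walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("dump");
        Path event = Files.createDirectories(root.resolve("137961"));
        Files.createFile(event.resolve("_dumpmetadata")); // leftover from failed cycle
        cleanLeftoverEventDir(root, 137961L);
        System.out.println(Files.exists(event)); // false: safe to re-dump the event
    }
}
```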
 





[jira] [Created] (HIVE-26932) Correct stage name value in replication_metrics.progress column in replication_metrics table

2023-01-11 Thread Harshal Patel (Jira)
Harshal Patel created HIVE-26932:


 Summary: Correct stage name value in replication_metrics.progress 
column in replication_metrics table
 Key: HIVE-26932
 URL: https://issues.apache.org/jira/browse/HIVE-26932
 Project: Hive
  Issue Type: Improvement
Reporter: Harshal Patel
Assignee: Harshal Patel


 To improve diagnostic capability for source-to-backup replication, update the 
replication_metrics table by adding a pre_optimized_bootstrap stage to the 
progress column during the first cycle of optimized bootstrap.





[jira] [Created] (HIVE-26931) REPL LOAD command does not throw any error for incorrect syntax

2023-01-11 Thread Subhasis Gorai (Jira)
Subhasis Gorai created HIVE-26931:
-

 Summary: REPL LOAD command does not throw any error for incorrect 
syntax
 Key: HIVE-26931
 URL: https://issues.apache.org/jira/browse/HIVE-26931
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Reporter: Subhasis Gorai


In some cases, users use the REPL LOAD command incorrectly. It does not throw 
any meaningful error or warning message, and, as expected, it does not 
replicate the database either.

For example,
{code:java}
repl load target_db with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/hive/repl',
 'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/target_db.db'){code}
The above command does not follow the REPL LOAD syntax. It produces no error 
message, nor does it replicate the database, which causes confusion.
{code:java}
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u> repl load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db');
INFO  : Compiling 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl 
load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time 
taken: 0.051 seconds
INFO  : Executing 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd): repl 
load test_1_replica with 
('hive.repl.rootdir'='hdfs://c3649-node2.coelab.cloudera.com:8020/user/repl', 
'hive.repl.include.external.tables'= 'true', 
'hive.repl.replica.external.table.base.dir'='hdfs://c3649node2.coelab.cloudera.com:8020/warehouse/tablespace/external/hive/test_1_replica.db')
INFO  : Completed executing 
command(queryId=hive_20221201113704_08ee46a6-ede9-4c92-9502-82b9fbc416bd); Time 
taken: 0.001 seconds
INFO  : OK
No rows affected (0.065 seconds)
0: jdbc:hive2://nightly7x-us-bj-3.nightly7x-u>{code}
Ideally, since the command is invalid, it should throw an error.





[jira] [Created] (HIVE-26930) Support for increased retention of Notification Logs and Change Manager entries

2023-01-11 Thread Subhasis Gorai (Jira)
Subhasis Gorai created HIVE-26930:
-

 Summary: Support for increased retention of Notification Logs and 
Change Manager entries
 Key: HIVE-26930
 URL: https://issues.apache.org/jira/browse/HIVE-26930
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: Subhasis Gorai
Assignee: Subhasis Gorai


In order to support the Planned/Unplanned Failover use cases, we need the 
capability to increase the retention period for both the Notification Logs and 
Change Manager entries until the successful reverse replication is done (i.e. 
the Optimized Bootstrap).

If the relevant Notification logs and Change Manager entries are not retained, 
we can't perform a successful Optimized Bootstrap.





[jira] [Created] (HIVE-26929) Allow creating iceberg tables without column definition when 'metadata_location' tblproperties is set.

2023-01-11 Thread Dharmik Thakkar (Jira)
Dharmik Thakkar created HIVE-26929:
--

 Summary: Allow creating iceberg tables without column definition 
when 'metadata_location' tblproperties is set.
 Key: HIVE-26929
 URL: https://issues.apache.org/jira/browse/HIVE-26929
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Dharmik Thakkar


Allow creating iceberg tables without column definition when 
'metadata_location' tblproperties is set.

Iceberg supports pointing at an external metadata.json file to infer the table 
schema. Irrespective of the schema defined in the CREATE TABLE statement, the 
metadata.json is used to create the table. We should therefore allow creating a 
table without a column definition when metadata_location is defined in tblproperties.
{code:java}
create table test_meta (id int, name string, cgpa decimal) stored by iceberg 
stored as orc;
describe formatted test_meta;
create table test_meta_copy(id int) stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
As a result of the above SQL, we get test_meta_copy with the same schema as 
test_meta, irrespective of the columns specified in the CREATE TABLE statement:
|*col_name*|*data_type*|
|*id*|int|
|*name*|string|
|*cgpa*|decimal(10,0)|
| |NULL|
|*# Detailed Table Information*|NULL|
|*Database:*|iceberg_test_db_hive|
|*OwnerType:*|USER|
|*Owner:*|hive|
|*CreateTime:*|Tue Jan 10 21:49:08 UTC 2023|
|*LastAccessTime:*|Fri Dec 12 21:41:41 UTC 1969|
|*Retention:*|2147483647|
|*Location:*|s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta|
|*Table Type:*|EXTERNAL_TABLE|
|*Table Parameters:*|NULL|
| |EXTERNAL|
| |bucketing_version|
| |engine.hive.enabled|
| |metadata_location|
| |numFiles|
| |numRows|
| |rawDataSize|
| |serialization.format|
| |storage_handler|
| |table_type|
| |totalSize|
| |transient_lastDdlTime|
| |uuid|
| |write.format.default|
| |NULL|
|*# Storage Information*|NULL|
|*SerDe Library:*|org.apache.iceberg.mr.hive.HiveIcebergSerDe|
|*InputFormat:*|org.apache.iceberg.mr.hive.HiveIcebergInputFormat|
|*OutputFormat:*|org.apache.iceberg.mr.hive.HiveIcebergOutputFormat|
|*Compressed:*|No|
|*Sort Columns:*|[]|

However, if we skip the column definition, the query fails:
{code:java}
create table test_meta_copy2 stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json');{code}
The error:
{code:java}
INFO  : Compiling 
command(queryId=hive_20230110220019_94ffafef-f531-4532-a07c-0e46e3879f19): 
create table test_meta_copy2 stored by iceberg 
tblproperties('metadata_location'='s3a://qe-s3-bucket-weekly-dj5h-dwx-external/clusters/env-dqdj5h/warehouse-1673341391-kkzh/warehouse/tablespace/external/hive/iceberg_test_db_hive.db/test_meta/metadata/0-7dfd7602-f5e1-4473-97cb-79377d358aa3.metadata.json')
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Created Hive schema: