[jira] [Created] (HIVE-27597) Implement JDBC Connector for HiveServer

2023-08-10 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-27597:


 Summary: Implement JDBC Connector for HiveServer 
 Key: HIVE-27597
 URL: https://issues.apache.org/jira/browse/HIVE-27597
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Naveen Gangam
Assignee: Naveen Gangam


The initial idea of a thrift-based connector, which would enable the Hive 
Metastore to use thrift APIs to interact with a metastore from another 
cluster, has some limitations. Features like column masking become a 
challenge because we may bypass the authz controls on the remote cluster.

Instead, if we could federate a query from one instance of HS2 to another 
instance of HS2 over JDBC, we would address the above concerns. This would 
at least give us the ability to access tables across cluster boundaries.
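
A minimal illustration of the idea, assuming the standard Hive JDBC driver and 
placeholder host/table names: the federated part of a query would run through 
an ordinary JDBC connection to the remote cluster's HS2, so the remote authz 
controls (e.g. column masking) stay in effect.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class RemoteHs2Probe {
  public static void main(String[] args) throws Exception {
    // Illustrative URL; the remote HS2 applies its own authorization,
    // including any column masking policies, to whatever we run here.
    String url = "jdbc:hive2://remote-hs2.example.com:10000/default";
    try (Connection conn = DriverManager.getConnection(url, "user", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SELECT * FROM remote_table LIMIT 10")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}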




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27596) Make the cached InputFormat stateless

2023-08-10 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27596:
--

 Summary: Make the cached InputFormat stateless
 Key: HIVE-27596
 URL: https://issues.apache.org/jira/browse/HIVE-27596
 Project: Hive
  Issue Type: Improvement
Reporter: Zhihua Deng


InputFormat is cached in FetchOperator:

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java#L233]

and HiveFileInputFormat:

[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java#L391]

which makes the InputFormat instance accessible across different sessions, 
so it should be stateless to avoid concurrency problems.
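
A minimal sketch (not the actual Hive code) of what such a cache looks like: 
one shared instance per InputFormat class, which is only safe if that instance 
carries no mutable per-query state.

{code:java}
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.util.ReflectionUtils;

public final class InputFormatCache {
  private static final ConcurrentHashMap<Class<?>, InputFormat<?, ?>> CACHE =
      new ConcurrentHashMap<>();

  // One shared instance per class, handed out to every session that asks.
  // Any mutable field on the instance becomes cross-session shared state,
  // hence the requirement that cached InputFormats be stateless.
  public static InputFormat<?, ?> get(
      Class<? extends InputFormat<?, ?>> clazz, Configuration conf) {
    return CACHE.computeIfAbsent(clazz,
        c -> (InputFormat<?, ?>) ReflectionUtils.newInstance(c, conf));
  }
}
{code}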

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27582) Do not cache HBase table input format in FetchOperator

2023-08-10 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27582:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been merged. Thank you for the PR, [~ganeshas]!

> Do not cache HBase table input format in FetchOperator
> --
>
> Key: HIVE-27582
> URL: https://issues.apache.org/jira/browse/HIVE-27582
> Project: Hive
>  Issue Type: Bug
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Caching of the HBase table input format in FetchOperator causes Hive queries 
> to fail with the following exception:
>  
> {code:java}
> 2023-08-08T09:43:28,800 WARN  [HiveServer2-Handler-Pool: Thread-47([])]: 
> thrift.ThriftCLIService (ThriftCLIService.java:FetchResults(809)) - Error 
> fetching results:
> org.apache.hive.service.cli.HiveSQLException: java.io.IOException: 
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
> Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353
>  rejected from java.util.concurrent.ThreadPoolExecutor@663dd540[Terminated, 
> pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2]
>         at 
> org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:485)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:328)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:926)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at sun.reflect.GeneratedMethodAccessor34.invoke(Unknown Source) ~[?:?]
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_382]
>         at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_382]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_382]
>         at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_382]
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
>  ~[hadoop-common-3.3.3-amzn-2.jar:?]
>         at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>  ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at com.sun.proxy.$Proxy43.fetchResults(Unknown Source) ~[?:?]
>         at 
> org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:568) 
> ~[hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:800)
>  [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1900)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1880)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38) 
> [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38) 
> [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>  [hive-service-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
>  [hive-exec-3.1.3-amzn-3.jar:3.1.3-amzn-3]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_382]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_382]
>         at java.lang.Thread.run(Thread.java:750) [?:1.8.0_382]
> Caused by: java.io.IOException: java.lang.RuntimeException: 
> java.util.concurrent.RejectedExecutionException: Task 
> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@2ae0e353
>  rejected from 

[jira] [Updated] (HIVE-27592) Mockito dependency from root pom.xml to individual modules

2023-08-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27592:
--
Labels: pull-request-available  (was: )

> Mockito dependency from root pom.xml to individual modules
> --
>
> Key: HIVE-27592
> URL: https://issues.apache.org/jira/browse/HIVE-27592
> Project: Hive
>  Issue Type: Task
>Reporter: Zsolt Miskolczi
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27594) Hive 3.1.3 with external PG meta referring to wrong default column in DBS

2023-08-10 Thread Harish (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish updated HIVE-27594:
--
Description: 
I set up Hive/Spark/Hadoop (3.3.0) on the same machine, with the Hive meta on 
an external PG server:

Hive 3.1.3
PG 12 as the remote meta (tried 9.2 as well)
Changed the Spark and Hive site.xml files
Used schematool to populate the default tables
Using Oracle Object Storage as the Hadoop storage. I have replaced the actual 
path with a placeholder.

When I try to open Hive from the terminal I see the error below. I am totally 
confused about the columns it refers to in the error versus those in the DBS 
table. It works fine from Spark, though.

Here CATALOG_NAME is not part of 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-3.1.0.postgres.sql]
so how come the code refers to a different structure?

{noformat}
2023-08-05 09:30:21,188 INFO objectstorage.ObjectStorageClient: Setting endpoint to https://oracle_ojectstorage
2023-08-05 09:30:21,438 INFO jersey.JerseyHttpClientBuilder: Setting connector provider to ApacheConnectorProvider
2023-08-05 09:30:21,548 INFO store.BmcDataStore: Using upload configuration: UploadConfiguration(minimumLengthForMultipartUpload=128, lengthPerUploadPart=128, maxPartsForMultipartUpload=1, enforceMd5BeforeUpload=false, enforceMd5BeforeMultipartUpload=false, allowMultipartUploads=true, allowParallelUploads=true, disableAutoAbort=false)
2023-08-05 09:30:21,551 INFO bmc.ClientRuntime: Using SDK: Oracle-JavaSDK/3.17.1
2023-08-05 09:30:21,551 INFO bmc.ClientRuntime: User agent set to: Oracle-JavaSDK/3.17.1 (Linux/3.10.0-1160.66.1.el7.x86_64; Java/1.8.0_342; OpenJDK 64-Bit Server VM/25.342-b07) Oracle-HDFS_Connector/3.3.4.1.2.0
2023-08-05 09:30:21,556 INFO store.BmcDataStore: Object metadata caching disabled
2023-08-05 09:30:21,556 INFO store.BmcDataStore: fs.oci.caching.object.parquet.enabled is disabled, setting parquet cache spec to 'maximumSize=0', which disables the cache
2023-08-05 09:30:21,557 INFO hdfs.BmcFilesystemImpl: Setting working directory to oci://path/user/user, and initialized uri to oci://path
2023-08-05 09:30:21,570 INFO hdfs.BmcFilesystem: Physically closing delegate for oci://path/
2023-08-05 09:30:21,598 WARN metastore.HiveMetaStore: Retrying creating default database after error: Exception thrown flushing changes to datastore
javax.jdo.JDODataStoreException: Exception thrown flushing changes to datastore
        at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
        at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:171)
        at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:766)
        at org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:954)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
        at com.sun.proxy.$Proxy36.createDatabase(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:753)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:771)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:540)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:80)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8678)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:94)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.utils.JavaUtils.newInstance(JavaUtils.java:84)
        at ...
{noformat}

[jira] [Created] (HIVE-27595) Improve efficiency in the filtering hooks

2023-08-10 Thread Naveen Gangam (Jira)
Naveen Gangam created HIVE-27595:


 Summary: Improve efficiency in the filtering hooks
 Key: HIVE-27595
 URL: https://issues.apache.org/jira/browse/HIVE-27595
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 4.0.0-alpha-2
Reporter: Naveen Gangam
Assignee: Henri Biestro


https://github.com/apache/hive/blob/a406d6d4417277e45b93f1733bed5201afdee29b/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/metastore/HiveMetaStoreAuthorizer.java#L353-L377

In cases where tableList contains a large number of tables (tested with 200k in 
my case), hivePrivilegedObjects can be just as big, so both lists hold 200k 
entries.

Essentially, the code is trying to return the subset of the tableList 
collection that matches the objects returned in hivePrivilegedObjects. This 
results in an N*N iteration that causes bad performance (in my case, the HMS 
client timeout expired and SHOW TABLES failed).

This code needs to be optimized for performance; a set-based lookup would make 
the filtering linear, as sketched below.

We have a similar problem in this code as well:
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java
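
A hedged sketch of the set-based approach (names follow the description above 
and are illustrative, not the actual patch):

{code:java}
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.hive.ql.security.authorization.plugin.HivePrivilegeObject;

public final class AuthzTableFilter {
  // Build a HashSet of authorized "db.table" keys once, then test each
  // candidate with an O(1) lookup: O(n + m) instead of the O(n*m) nested scan.
  public static List<String> filter(String dbName, List<String> tableList,
      List<HivePrivilegeObject> hivePrivilegedObjects) {
    Set<String> allowed = new HashSet<>(hivePrivilegedObjects.size());
    for (HivePrivilegeObject obj : hivePrivilegedObjects) {
      allowed.add(obj.getDbname() + "." + obj.getObjectName());
    }
    List<String> filtered = new ArrayList<>();
    for (String table : tableList) {
      if (allowed.contains(dbName + "." + table)) {
        filtered.add(table);
      }
    }
    return filtered;
  }
}
{code}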




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27594) Hive 3.1.3 with external PG meta referring to wrong default column in DBS

2023-08-10 Thread Harish (Jira)
Harish created HIVE-27594:
-

 Summary: Hive 3.1.3 with external PG meta referring to wrong default 
column in DBS 
 Key: HIVE-27594
 URL: https://issues.apache.org/jira/browse/HIVE-27594
 Project: Hive
  Issue Type: Bug
Reporter: Harish


I set up Hive/Spark/Hadoop (3.3.0) on the same machine, with the Hive meta on 
an external PG server:

Hive 3.1.3
PG 12 as the remote meta (tried 9.2 as well)
Changed the Spark and Hive site.xml files
Used schematool to populate the default tables
Using Oracle Object Storage as the Hadoop storage. I have replaced the actual 
path with a placeholder.

When I try to open Hive from the terminal I see the error below. I am totally 
confused about the columns it refers to in the error versus those in the DBS 
table. It works fine from Spark, though.

Here CATALOG_NAME is not part of 
[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-3.1.0.postgres.sql]
so how come the code refers to a different structure?

{noformat}
2023-08-05 09:30:21,188 INFO objectstorage.ObjectStorageClient: Setting endpoint to https://oracle_ojectstorage
2023-08-05 09:30:21,438 INFO jersey.JerseyHttpClientBuilder: Setting connector provider to ApacheConnectorProvider
2023-08-05 09:30:21,548 INFO store.BmcDataStore: Using upload configuration: UploadConfiguration(minimumLengthForMultipartUpload=128, lengthPerUploadPart=128, maxPartsForMultipartUpload=1, enforceMd5BeforeUpload=false, enforceMd5BeforeMultipartUpload=false, allowMultipartUploads=true, allowParallelUploads=true, disableAutoAbort=false)
2023-08-05 09:30:21,551 INFO bmc.ClientRuntime: Using SDK: Oracle-JavaSDK/3.17.1
2023-08-05 09:30:21,551 INFO bmc.ClientRuntime: User agent set to: Oracle-JavaSDK/3.17.1 (Linux/3.10.0-1160.66.1.el7.x86_64; Java/1.8.0_342; OpenJDK 64-Bit Server VM/25.342-b07) Oracle-HDFS_Connector/3.3.4.1.2.0
2023-08-05 09:30:21,556 INFO store.BmcDataStore: Object metadata caching disabled
2023-08-05 09:30:21,556 INFO store.BmcDataStore: fs.oci.caching.object.parquet.enabled is disabled, setting parquet cache spec to 'maximumSize=0', which disables the cache
2023-08-05 09:30:21,557 INFO hdfs.BmcFilesystemImpl: Setting working directory to oci://path/user/user, and initialized uri to oci://path
2023-08-05 09:30:21,570 INFO hdfs.BmcFilesystem: Physically closing delegate for oci://path/
2023-08-05 09:30:21,598 WARN metastore.HiveMetaStore: Retrying creating default database after error: Exception thrown flushing changes to datastore
javax.jdo.JDODataStoreException: Exception thrown flushing changes to datastore
        at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
        at org.datanucleus.api.jdo.JDOTransaction.commit(JDOTransaction.java:171)
        at org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:766)
        at org.apache.hadoop.hive.metastore.ObjectStore.createDatabase(ObjectStore.java:954)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
        at com.sun.proxy.$Proxy36.createDatabase(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:753)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:771)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:540)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:80)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8678)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:169)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:94)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at ...
{noformat}

[jira] [Updated] (HIVE-27589) Iceberg: Branches of Merge/Update statements should be committed atomically

2023-08-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27589:
--
Labels: pull-request-available  (was: )

> Iceberg: Branches of Merge/Update statements should be committed atomically
> ---
>
> Key: HIVE-27589
> URL: https://issues.apache.org/jira/browse/HIVE-27589
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Simhadri Govindappa
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26522) Test for HIVE-22033 and backport to 3.1 and 2.3

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-26522:

Fix Version/s: 2.3.10

> Test for HIVE-22033 and backport to 3.1 and 2.3
> ---
>
> Key: HIVE-26522
> URL: https://issues.apache.org/jira/browse/HIVE-26522
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 2.3.8, 3.1.3
>Reporter: Pavan Lanka
>Assignee: Pavan Lanka
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10, 4.0.0-beta-1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> HIVE-22033 fixes the issue with Hive Delegation tokens so that the renewal 
> time is effective.
> This looks at adding a test for HIVE-22033 and backporting this fix to 3.1 
> and 2.3 branches in Hive.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26882) Allow transactional check of Table parameter before altering the Table

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-26882:

Fix Version/s: 2.3.10

> Allow transactional check of Table parameter before altering the Table
> --
>
> Key: HIVE-26882
> URL: https://issues.apache.org/jira/browse/HIVE-26882
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10, 4.0.0-beta-1
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> We should add the possibility to transactionally check whether a Table 
> parameter has changed before altering the table in the HMS.
> This would provide an alternative, less error-prone and faster way to commit 
> an Iceberg table, as a commit currently needs to:
> - Create an exclusive lock
> - Get the table metadata to check that the current snapshot has not changed
> - Update the table metadata
> - Release the lock
> After the change, these 4 HMS calls could be substituted with a single alter 
> table call, as in the sketch below.
> We could also avoid cases where locks are left hanging by failed processes.
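> A minimal sketch of the resulting call pattern, assuming a connected 
> {{IMetaStoreClient}} ({{client}}), a fetched {{Table}} ({{table}}), and 
> illustrative expected-parameter key names (the real names are defined by the 
> implementation):
> {code:java}
> // Compare-and-set style alter_table: the HMS applies the change only if
> // "metadata_location" still holds the value we last read, replacing the
> // lock / check / update / unlock sequence with one atomic call.
> EnvironmentContext ctx = new EnvironmentContext();
> ctx.putToProperties("expected_parameter_key", "metadata_location");
> ctx.putToProperties("expected_parameter_value", currentMetadataLocation);
> 
> table.getParameters().put("metadata_location", newMetadataLocation);
> client.alter_table_with_environmentContext(dbName, tableName, table, ctx);
> // On a concurrent modification the call fails and the commit can be retried.
> {code}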



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25616) Backport HIVE-24741 to Hive 2, 3

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-25616:

Fix Version/s: 2.3.10

> Backport HIVE-24741 to Hive 2, 3 
> -
>
> Key: HIVE-25616
> URL: https://issues.apache.org/jira/browse/HIVE-25616
> Project: Hive
>  Issue Type: Improvement
>Reporter: Neelesh Srinivas Salian
>Assignee: Neelesh Srinivas Salian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> HIVE-24741 adds a major improvement to the {{get_partitions_ps_with_auth}} 
> API that is used by Spark to retrieve all partitions of a table.
> This has caused problems in Spark 3 running on a Hive 2.3.x metastore. 
> Patching this in my org helped get past problems reading metadata for large 
> partitioned tables through Spark.
> [~vihangk1], thoughts?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26890) Disable TestSSL#testConnectionWrongCertCN (Done as part of HIVE-22621 in master)

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-26890:

Fix Version/s: 2.3.10

> Disable TestSSL#testConnectionWrongCertCN (Done as part of HIVE-22621 in 
> master)
> 
>
> Key: HIVE-26890
> URL: https://issues.apache.org/jira/browse/HIVE-26890
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.2.0, 2.3.10
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> TestSSL fails with the following error (this happens in the Hive 3.1.3 
> release as well, so we are disabling this test):
> {code:java}
> [ERROR] Tests run: 10, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 
> 23.143 s <<< FAILURE! - in org.apache.hive.jdbc.TestSSL
> [ERROR] testConnectionWrongCertCN(org.apache.hive.jdbc.TestSSL)  Time 
> elapsed: 0.64 s  <<< FAILURE!
> java.lang.AssertionError
>         at org.junit.Assert.fail(Assert.java:86)
>         at org.junit.Assert.assertTrue(Assert.java:41)
>         at org.junit.Assert.assertTrue(Assert.java:52)
>         at 
> org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN(TestSSL.java:408)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>         at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>         at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>         at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>         at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>         at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>         at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>         at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>         at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26880) Upgrade Apache Directory Server to 1.5.7 for release 3.2.

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-26880:

Fix Version/s: 2.3.10

> Upgrade Apache Directory Server to 1.5.7 for release 3.2.
> -
>
> Key: HIVE-26880
> URL: https://issues.apache.org/jira/browse/HIVE-26880
> Project: Hive
>  Issue Type: Improvement
>  Components: Test
>Reporter: Chris Nauroth
>Assignee: Chris Nauroth
>Priority: Minor
>  Labels: hive-3.2.0-must, pull-request-available
> Fix For: 3.2.0, 2.3.10
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> branch-3 uses Apache Directory Server in some tests. It currently uses 
> version 1.5.6. This version has a transitive dependency on a SNAPSHOT, making 
> it awkward to build and release. We can upgrade to 1.5.7 to remove the 
> SNAPSHOT dependency.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-25173:

Fix Version/s: 2.3.10

> Fix build failure of hive-pre-upgrade due to missing dependency on 
> pentaho-aggdesigner-algorithm
> 
>
> Key: HIVE-25173
> URL: https://issues.apache.org/jira/browse/HIVE-25173
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.2
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10, 4.0.0-alpha-1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {noformat}
> [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve 
> dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: 
> Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27518) 2.3 - Upgrade log4j2 from 2.17.0 to 2.17.2

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun resolved HIVE-27518.
-
Fix Version/s: 2.3.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

> 2.3 - Upgrade log4j2 from 2.17.0 to 2.17.2
> --
>
> Key: HIVE-27518
> URL: https://issues.apache.org/jira/browse/HIVE-27518
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Assignee: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27518) 2.3 - Upgrade log4j2 from 2.17.0 to 2.17.2

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned HIVE-27518:
---

Assignee: Cheng Pan

> 2.3 - Upgrade log4j2 from 2.17.0 to 2.17.2
> --
>
> Key: HIVE-27518
> URL: https://issues.apache.org/jira/browse/HIVE-27518
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.9
>Reporter: Cheng Pan
>Assignee: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27581) Backport jackson upgrade related patch to branch-2.3

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun reassigned HIVE-27581:
---

Assignee: Yuming Wang

> Backport jackson upgrade related patch to branch-2.3
> 
>
> Key: HIVE-27581
> URL: https://issues.apache.org/jira/browse/HIVE-27581
> Project: Hive
>  Issue Type: Task
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>
> 2.9.4 -> 2.9.5: 
> https://github.com/apache/hive/commit/33e208c0709fac5bd6380aacfba49448412d112b
> 2.9.5 -> 2.9.8: 
> https://github.com/apache/hive/commit/2fa22bf360898dc8fd1408bfcc96e1c6aeaf9a53
> 2.9.8 -> 2.9.9: 
> https://github.com/apache/hive/commit/7fc5a88a149cf0767a5846cbb6ace22d8e99a63c
> 2.9.9 -> 2.10.0: 
> https://github.com/apache/hive/commit/31935896a78f95ae0792ae7f29960d1b604fbe9d
> 2.10.0 -> 2.10.5: 
> https://github.com/apache/hive/commit/aa5b6b7968d90d027c5336bf430719acbff70f68
> 2.10.5 -> 2.12.0: 
> https://github.com/apache/hive/commit/1e8cc12f2d60973b7674813ae82c8f3372423d54
> ---
> 2.12.0 -> 2.12.7: 
> https://github.com/apache/hive/commit/568ded4b22a020f4d2d3567f15b287b25a3f2b71
> 2.12.7 -> 2.13.5: 
> https://github.com/apache/hive/commit/8236426ed7aa87430e82d47effe946e38fa1f7f2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27581) Backport jackson upgrade related patch to branch-2.3

2023-08-10 Thread Chao Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun resolved HIVE-27581.
-
Fix Version/s: 2.3.10
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Backport jackson upgrade related patch to branch-2.3
> 
>
> Key: HIVE-27581
> URL: https://issues.apache.org/jira/browse/HIVE-27581
> Project: Hive
>  Issue Type: Task
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.3.10
>
>
> 2.9.4 -> 2.9.5: 
> https://github.com/apache/hive/commit/33e208c0709fac5bd6380aacfba49448412d112b
> 2.9.5 -> 2.9.8: 
> https://github.com/apache/hive/commit/2fa22bf360898dc8fd1408bfcc96e1c6aeaf9a53
> 2.9.8 -> 2.9.9: 
> https://github.com/apache/hive/commit/7fc5a88a149cf0767a5846cbb6ace22d8e99a63c
> 2.9.9 -> 2.10.0: 
> https://github.com/apache/hive/commit/31935896a78f95ae0792ae7f29960d1b604fbe9d
> 2.10.0 -> 2.10.5: 
> https://github.com/apache/hive/commit/aa5b6b7968d90d027c5336bf430719acbff70f68
> 2.10.5 -> 2.12.0: 
> https://github.com/apache/hive/commit/1e8cc12f2d60973b7674813ae82c8f3372423d54
> ---
> 2.12.0 -> 2.12.7: 
> https://github.com/apache/hive/commit/568ded4b22a020f4d2d3567f15b287b25a3f2b71
> 2.12.7 -> 2.13.5: 
> https://github.com/apache/hive/commit/8236426ed7aa87430e82d47effe946e38fa1f7f2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-27578) Refactor genJoinRelNode to use genAllRexNode instead of genAllExprNodeDesc

2023-08-10 Thread Soumyakanti Das (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752851#comment-17752851
 ] 

Soumyakanti Das commented on HIVE-27578:


Hey [~zabetak], I have updated the Jira with more context. Hopefully this helps!

> Refactor genJoinRelNode to use genAllRexNode instead of genAllExprNodeDesc
> --
>
> Key: HIVE-27578
> URL: https://issues.apache.org/jira/browse/HIVE-27578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> Currently the {{genJoinRelNode}} method relies on {{genAllExprNodeDesc}} for 
> adding backticks to the ON clause conditions, but we can use the 
> {{genAllRexNode}} method instead and not rely on ExprNodes.
> There was a previous effort to try to get RexNodes directly from AST, and 
> this method call was probably overlooked. We can see that changes were made 
> around this method call to use RexNodes instead of ExprNodes, 
> [here|https://github.com/apache/hive/pull/970/files#diff-fc58b141b1cc612eb221bb781c83e1a5c98e054790b2803be60b4842d0e9a5d9R2753].
>  
> Relevant previous Jiras: 
>  # HIVE-23100
>  # HIVE-22746
> With this change, we can avoid going through the method 
> {{genAllExprNodeDesc}} and avoid mixing RexNodes and ExprNodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27578) Refactor genJoinRelNode to use genAllRexNode instead of genAllExprNodeDesc

2023-08-10 Thread Soumyakanti Das (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Soumyakanti Das updated HIVE-27578:
---
Description: 
Currently the {{genJoinRelNode}} method relies on {{genAllExprNodeDesc}} for 
adding backticks to the ON clause conditions, but we can use the 
{{genAllRexNode}} method instead and not rely on ExprNodes.

There was a previous effort to try to get RexNodes directly from AST, and this 
method call was probably overlooked. We can see that changes were made around 
this method call to use RexNodes instead of ExprNodes, 
[here|https://github.com/apache/hive/pull/970/files#diff-fc58b141b1cc612eb221bb781c83e1a5c98e054790b2803be60b4842d0e9a5d9R2753].
 

Relevant previous Jiras: 
 # HIVE-23100
 # HIVE-22746

With this change, we can avoid going through the method {{genAllExprNodeDesc}} 
and avoid mixing RexNodes and ExprNodes.

  was:Currently the {{genJoinRelNode}} method relies on {{genAllExprNodeDesc}} 
for adding backticks to the ON clause, but we can use the {{genAllRexNode}} 
method instead, which is better because the method already uses RexNodes and 
this change will make it depend only on RexNodes.


> Refactor genJoinRelNode to use genAllRexNode instead of genAllExprNodeDesc
> --
>
> Key: HIVE-27578
> URL: https://issues.apache.org/jira/browse/HIVE-27578
> Project: Hive
>  Issue Type: Improvement
>Reporter: Soumyakanti Das
>Assignee: Soumyakanti Das
>Priority: Major
>  Labels: pull-request-available
>
> Currently the {{genJoinRelNode}} method relies on {{genAllExprNodeDesc}} for 
> adding backticks to the ON clause conditions, but we can use the 
> {{genAllRexNode}} method instead and not rely on ExprNodes.
> There was a previous effort to try to get RexNodes directly from AST, and 
> this method call was probably overlooked. We can see that changes were made 
> around this method call to use RexNodes instead of ExprNodes, 
> [here|https://github.com/apache/hive/pull/970/files#diff-fc58b141b1cc612eb221bb781c83e1a5c98e054790b2803be60b4842d0e9a5d9R2753].
>  
> Relevant previous Jiras: 
>  # HIVE-23100
>  # HIVE-22746
> With this change, we can avoid going through the method 
> {{genAllExprNodeDesc}} and avoid mixing RexNodes and ExprNodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-26806) Precommit tests in CI are timing out after HIVE-26796

2023-08-10 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752828#comment-17752828
 ] 

Zoltan Haindrich commented on HIVE-26806:
-

Seems like there is a helpful feature in the parallel-test-executor plugin:
https://github.com/jenkinsci/parallel-test-executor-plugin/commit/c9145a5f849f01d6e99c2240eb51d9aaf283ef6a
Upgrading to a version above 380 could make this go away.

> Precommit tests in CI are timing out after HIVE-26796
> -
>
> Key: HIVE-26806
> URL: https://issues.apache.org/jira/browse/HIVE-26806
> Project: Hive
>  Issue Type: Bug
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> http://ci.hive.apache.org/job/hive-precommit/job/master/1506/
> {noformat}
> Cancelling nested steps due to timeout
> 15:22:08  Sending interrupt signal to process
> 15:22:08  Killing processes
> 15:22:09  kill finished with exit code 0
> 15:22:19  Terminated
> 15:22:19  script returned exit code 143
> [Pipeline] }
> [Pipeline] // withEnv
> [Pipeline] }
> 15:22:19  Deleting 1 temporary files
> [Pipeline] // configFileProvider
> [Pipeline] }
> [Pipeline] // stage
> [Pipeline] stage
> [Pipeline] { (PostProcess)
> [Pipeline] sh
> [Pipeline] sh
> [Pipeline] sh
> [Pipeline] junit
> 15:22:25  Recording test results
> 15:22:32  [Checks API] No suitable checks publisher found.
> [Pipeline] }
> [Pipeline] // stage
> [Pipeline] }
> [Pipeline] // container
> [Pipeline] }
> [Pipeline] // node
> [Pipeline] }
> [Pipeline] // timeout
> [Pipeline] }
> [Pipeline] // podTemplate
> [Pipeline] }
> 15:22:32  Failed in branch split-01
> [Pipeline] // parallel
> [Pipeline] }
> [Pipeline] // stage
> [Pipeline] stage
> [Pipeline] { (Archive)
> [Pipeline] podTemplate
> [Pipeline] {
> [Pipeline] timeout
> 15:22:33  Timeout set to expire in 6 hr 0 min
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HIVE-25576) Add config to parse date with older date format

2023-08-10 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752810#comment-17752810
 ] 

Stamatis Zampetakis commented on HIVE-25576:


[~ashish-kumar-sharma] I would like to take this ticket to completion. The 
current PR has been closed due to inactivity. I will open a new one along with 
some improvements that I think can be done in terms of code & tests. If you are 
still interested in this PR let me know and I can ping you to review it once I 
am done.

> Add config to parse date with older date format
> ---
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
>Reporter: Ashish Sharma
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* -
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* -
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation mentions that "Unfortunately, the API for these functions was 
> not amenable to internationalization" and that the corresponding methods in 
> Date are deprecated. Because of that, this path produces the wrong result.
> *Master branch* -
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* -
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* -
> *SimpleDateFormat* has now been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes issues 
> when migrating to the new version, because older data written using Hive 1.x 
> or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values:
> 1. *True* - use *SimpleDateFormat*
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faces the same issue: 
> https://issues.apache.org/jira/browse/SPARK-30668
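> A self-contained sketch reproducing the discrepancy described above (the 
> printed values are the ones reported in this ticket):
> {code:java}
> import java.text.SimpleDateFormat;
> import java.time.ZoneId;
> import java.time.ZonedDateTime;
> import java.time.format.DateTimeFormatter;
> import java.time.format.DateTimeFormatterBuilder;
> import java.util.Date;
> import java.util.TimeZone;
> 
> public class LegacyDateParseDemo {
>   public static void main(String[] args) throws Exception {
>     TimeZone.setDefault(TimeZone.getTimeZone("Asia/Bangkok"));
>     String text = "1800-01-01 00:00:00 UTC";
>     String pattern = "yyyy-MM-dd HH:mm:ss z";
> 
>     // Hive 1.x/2.x path: SimpleDateFormat, rendered in the JVM default zone.
>     Date legacy = new SimpleDateFormat(pattern).parse(text);
>     System.out.println(new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(legacy));
>     // prints 1800-01-01 07:00:00
> 
>     // Master path: java.time, which honors the pre-1880 LMT offset of
>     // Asia/Bangkok (+06:42:04) for such old dates.
>     DateTimeFormatter fmt = new DateTimeFormatterBuilder()
>         .parseCaseInsensitive().appendPattern(pattern).toFormatter();
>     ZonedDateTime modern = ZonedDateTime.parse(text, fmt)
>         .withZoneSameInstant(ZoneId.of("Asia/Bangkok"));
>     System.out.println(modern.toLocalDateTime());
>     // prints 1800-01-01T06:42:04
>   }
> }
> {code}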



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-25576) Add config to parse date with older date format

2023-08-10 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-25576:
--

Assignee: Stamatis Zampetakis  (was: Ashish Sharma)

> Add config to parse date with older date format
> ---
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
>Reporter: Ashish Sharma
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* -
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* -
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation mentions that "Unfortunately, the API for these functions was 
> not amenable to internationalization" and that the corresponding methods in 
> Date are deprecated. Because of that, this path produces the wrong result.
> *Master branch* -
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* -
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* -
> *SimpleDateFormat* has now been replaced with *DateTimeFormatter*, which 
> gives the correct result but is not backward compatible. This causes issues 
> when migrating to the new version, because older data written using Hive 1.x 
> or 2.x is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values:
> 1. *True* - use *SimpleDateFormat*
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faces the same issue: 
> https://issues.apache.org/jira/browse/SPARK-30668



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27589) Iceberg: Branches of Merge/Update statements should be committed atomically

2023-08-10 Thread Simhadri Govindappa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simhadri Govindappa reassigned HIVE-27589:
--

Assignee: Simhadri Govindappa

> Iceberg: Branches of Merge/Update statements should be committed atomically
> ---
>
> Key: HIVE-27589
> URL: https://issues.apache.org/jira/browse/HIVE-27589
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Assignee: Simhadri Govindappa
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27396) Use -strict argument for Thrift code generation to prevent compatibility issues

2023-08-10 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27396:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Fix has been merged. Thank you for the PR, [~joemcdonnell]!

> Use -strict argument for Thrift code generation to prevent compatibility 
> issues
> ---
>
> Key: HIVE-27396
> URL: https://issues.apache.org/jira/browse/HIVE-27396
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Affects Versions: 4.0.0
>Reporter: Joe McDonnell
>Assignee: Joe McDonnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> When generating code, the Thrift compiler has a "-strict" option that errors 
> out for certain warnings. Specifically, it errors out when there are implicit 
> field keys:
> {noformat}
>         pwarning(1, "No field key specified for %s, resulting protocol may 
> have conflicts or not be backwards compatible!\n", $6);
>         if (g_strict >= 192) {
>           yyerror("Implicit field keys are deprecated and not allowed with 
> -strict");
>           exit(1);
>         }{noformat}
> [https://github.com/apache/thrift/blob/master/compiler/cpp/src/thrift/thrifty.yy#L824-L828]
> This is a warning that has been introduced in the past (see HIVE-27103 and 
> HIVE-20365).
> I think that it would be useful to add "-strict" to the arguments for Thrift 
> code generation. It would prevent the introduction of new compatibility 
> issues, because the command would fail rather than generating a warning that 
> is easy to miss.
> The current Thrift files already work with -strict, so this should be a 
> painless thing.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27593) Iceberg: Keep iceberg properties in sync with hms properties

2023-08-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27593:
--
Labels: pull-request-available  (was: )

> Iceberg: Keep iceberg properties in sync with hms properties
> 
>
> Key: HIVE-27593
> URL: https://issues.apache.org/jira/browse/HIVE-27593
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>
> In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
> mode for iceberg v2 tables in two scenarios:
>  # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
> specified
>  # when upgrading a v1 table to v2, the delete mode is set to *mor*
>  
> In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
> the *iceberg properties* (metadata json file), and HIVE-26596 only changes 
> the HMS table properties. Therefore it is fine for HS2 to operate on an 
> iceberg table by checking the cow/mor mode from the HMS properties, but other 
> engines like Spark check cow/mor from the *iceberg properties* (metadata json 
> file).
> Before we implement the full COW mode, we need to keep the iceberg properties 
> in sync with the HMS properties so that users have the same experience across 
> engines (HS2 & Spark).
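> A minimal sketch of the sync on the Iceberg side, assuming a resolved Iceberg 
> {{Table}} handle (the keys are Iceberg's standard write-mode properties):
> {code:java}
> import org.apache.iceberg.Table;
> import org.apache.iceberg.TableProperties;
> 
> // Mirror the enforced mor mode into the table's own metadata (the metadata
> // json file), so engines like Spark see the same mode that HS2 reads from
> // the HMS table properties.
> void syncWriteModes(Table icebergTable) {
>   icebergTable.updateProperties()
>       .set(TableProperties.DELETE_MODE, "merge-on-read")
>       .set(TableProperties.UPDATE_MODE, "merge-on-read")
>       .set(TableProperties.MERGE_MODE, "merge-on-read")
>       .commit();
> }
> {code}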



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27593) Iceberg: Keep iceberg properties in sync with hms properties

2023-08-10 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-27593:
--
Description: 
In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
mode for iceberg v2 tables in two scenarios:
 # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
specified
 # when upgrading a v1 table to v2, the delete mode is set to *mor*

In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
the *iceberg properties* (metadata json file), and HIVE-26596 only changes the 
HMS table properties. Therefore it is fine for HS2 to operate on an iceberg 
table by checking the cow/mor mode from the HMS properties, but other engines 
like Spark check cow/mor from the *iceberg properties* (metadata json file).

Before we implement the full COW mode, we need to keep the iceberg properties 
in sync with the HMS properties so that users have the same experience across 
engines (HS2 & Spark).

  was:
In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
mode for iceberg v2 tables in two scenarios:
 # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
specified
 # when upgrading a v1 table to v2, the delete mode is set to *mor*

In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
the *iceberg properties* (metadata json file), and HIVE-26596 only changes the 
HMS table properties. Therefore it is fine for HS2 to operate on an iceberg 
table by checking the cow/mor mode from the HMS properties, but other engines 
like Spark check cow/mor from the *iceberg properties* (metadata json file).

Before we implement the full COW mode, we need to keep the iceberg properties 
in sync with the HMS properties so that users have the same experience across 
engines (HS2 & Spark).


> Iceberg: Keep iceberg properties in sync with hms properties
> 
>
> Key: HIVE-27593
> URL: https://issues.apache.org/jira/browse/HIVE-27593
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>
> In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
> mode for iceberg v2 tables in two scenarios:
>  # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
> specified
>  # when upgrading a v1 table to v2, the delete mode is set to *mor*
>  
> In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
> the *iceberg properties* (metadata json file), and HIVE-26596 only changes 
> the HMS table properties. Therefore it is fine for HS2 to operate on an 
> iceberg table by checking the cow/mor mode from the HMS properties, but other 
> engines like Spark check cow/mor from the *iceberg properties* (metadata json 
> file).
> Before we implement the full COW mode, we need to keep the iceberg properties 
> in sync with the HMS properties so that users have the same experience across 
> engines (HS2 & Spark).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27593) Iceberg: Keep iceberg properties in sync with hms properties

2023-08-10 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-27593:
--
Description: 
In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
mode for iceberg v2 tables in two scenarios:
 # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
specified
 # when upgrading a v1 table to v2, the delete mode is set to *mor*

In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
the *iceberg properties* (metadata json file), and HIVE-26596 only changes the 
HMS table properties. Therefore it is fine for HS2 to operate on an iceberg 
table by checking the cow/mor mode from the HMS properties, but other engines 
like Spark check cow/mor from the *iceberg properties* (metadata json file).

Before we implement the full COW mode, we need to keep the iceberg properties 
in sync with the HMS properties so that users have the same experience across 
engines (HS2 & Spark).

> Iceberg: Keep iceberg properties in sync with hms properties
> 
>
> Key: HIVE-27593
> URL: https://issues.apache.org/jira/browse/HIVE-27593
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>
> In HIVE-26596, as we have not implemented the full COW mode, we enforce *mor* 
> mode for iceberg v2 tables in two scenarios:
>  # when creating a v2 iceberg table, the delete mode is set to *mor* if not 
> specified
>  # when upgrading a v1 table to v2, the delete mode is set to *mor*
>  
> In HS2, we check the mode (cow/mor) from the HMS table properties instead of 
> the *iceberg properties* (metadata json file), and HIVE-26596 only changes 
> the HMS table properties. Therefore it is fine for HS2 to operate on an 
> iceberg table by checking the cow/mor mode from the HMS properties, but other 
> engines like Spark check cow/mor from the *iceberg properties* (metadata json 
> file).
> Before we implement the full COW mode, we need to keep the iceberg properties 
> in sync with the HMS properties so that users have the same experience across 
> engines (HS2 & Spark).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27593) Iceberg: Keep iceberg properties in sync with hms properties

2023-08-10 Thread zhangbutao (Jira)
zhangbutao created HIVE-27593:
-

 Summary: Iceberg: Keep iceberg properties in sync with hms 
properties
 Key: HIVE-27593
 URL: https://issues.apache.org/jira/browse/HIVE-27593
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: zhangbutao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27593) Iceberg: Keep iceberg properties in sync with hms properties

2023-08-10 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao reassigned HIVE-27593:
-

Assignee: zhangbutao

> Iceberg: Keep iceberg properties in sync with hms properties
> 
>
> Key: HIVE-27593
> URL: https://issues.apache.org/jira/browse/HIVE-27593
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvhu updated HIVE-27590:

Priority: Blocker  (was: Major)

> Make LINES TERMINATED BY work when creating table
> -
>
> Key: HIVE-27590
> URL: https://issues.apache.org/jira/browse/HIVE-27590
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, SQL
>Affects Versions: 3.1.3
> Environment: Any
>Reporter: lvhu
>Assignee: lvhu
>Priority: Blocker
>
> *The only way to set line delimiters when creating tables in current Hive is 
> like this:*
> {code:java}
> package abc.hive;
> public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
> implements JobConfigurable {
>  ...
> }
> -- then reference the custom class when creating the table:
> create table test (
>     id string,
>     name string
> )
> STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
> If there are multiple different record delimiters, multiple TextInputFormat 
> subclasses need to be rewritten.
> Unfortunately, the ideal approach is not supported yet:
> {code:java}
> create table test  (  
>     id string,  
>     name string  
> )  
> row format delimited fields terminated by '\t'  -- supported
> LINES TERMINATED BY '|@|' ;   -- not supported  {code}
> I have a solution that supports setting line delimiters when creating tables, 
> just like above.
> *1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.*
> The HiveTextInputFormat class reads a configuration file so that the record 
> delimiter for input files can be set based on the prefix of the file path.
> {code:java}
> public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
>     implements JobConfigurable {
>
>   // required by JobConfigurable; no per-job state is kept here
>   public void configure(JobConf job) {
>   }
>
>   public RecordReader<LongWritable, Text> getRecordReader(
>       InputSplit genericSplit, JobConf job, Reporter reporter)
>       throws IOException {
>
>     reporter.setStatus(genericSplit.toString());
>     // default delimiter
>     String delimiter = job.get("textinputformat.record.delimiter");
>     // obtain the path of the file being read
>     String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
>     // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
>     Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
>     for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
>       String configPath = entry.getKey();
>       // if configPath is a prefix of filePath, use the delimiter configured for it
>       if (filePath.startsWith(configPath)) {
>         delimiter = entry.getValue();
>       }
>     }
>     byte[] recordDelimiterBytes = null;
>     if (null != delimiter) {
>       recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
>     }
>     return new LineRecordReader(job, (FileSplit) genericSplit,
>         recordDelimiterBytes);
>   }
> } {code}
> *2. Modify Hive's create table handling to support:*
> {code:java}
> create table test (
>     id string,
>     name string
> )
> LINES TERMINATED BY '|@|'
> LOCATION 'hdfs_path'; {code}
> If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
> configuration file.
> Set HiveTextInputFormat as the default INPUTFORMAT.
> Looking forward to receiving your suggestions and feedback!
> *If you accept my idea, I hope you can assign the task to me. My GitHub 
> account is: _lvhu-goodluck_*
> I really hope to contribute code to the community.
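The proposal above calls parsePathToDelimiter(job) without defining it. A minimal 
sketch of what such a helper could look like, assuming, purely for illustration, a 
tab-separated config file of "pathPrefix<TAB>delimiter" lines whose location is 
carried in a hypothetical job property:
{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

public final class PathDelimiterConfig {

  // "hive.textinputformat.delimiter.config" is a made-up property name, used
  // here only to show where the config file location could come from.
  private static final String CONFIG_FILE_KEY = "hive.textinputformat.delimiter.config";

  public static Map<String, String> parsePathToDelimiter(JobConf job) throws IOException {
    Map<String, String> map = new HashMap<>();
    String configFile = job.get(CONFIG_FILE_KEY);
    if (configFile == null) {
      return map; // nothing configured; callers fall back to the default delimiter
    }
    Path path = new Path(configFile);
    FileSystem fs = path.getFileSystem(job);
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        String[] parts = line.split("\t", 2);
        if (parts.length == 2) {
          map.put(parts[0], parts[1]); // pathPrefix -> delimiter
        }
      }
    }
    return map;
  }
}
{code}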



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27591) Remove PowerMock

2023-08-10 Thread Zsolt Miskolczi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Miskolczi updated HIVE-27591:
---
Description: 
PowerMock seems to be a dead project. 

 

Discussions:

[https://lists.apache.org/thread/xrwt6lw09snwl0o3w06v6v5k5h090p68]

[https://lists.apache.org/thread/xpvc76kh84ddd56z82mo0pqo0yylqys5]

 

Affected modules: 
 * ql
 * llap-client
 * beeline
 * itests/hive-jmh
 * jdbc-handler
 * service

  was:
PowerMock seems to be a dead project. 

 

Upstream discussions:

[https://lists.apache.org/thread/xrwt6lw09snwl0o3w06v6v5k5h090p68]

[https://lists.apache.org/thread/xpvc76kh84ddd56z82mo0pqo0yylqys5]

 

Affected modules: 
 * ql
 * llap-client
 * beeline
 * itests/hive-jmh
 * jdbc-handler
 * service


> Remove PowerMock
> 
>
> Key: HIVE-27591
> URL: https://issues.apache.org/jira/browse/HIVE-27591
> Project: Hive
>  Issue Type: Test
>Reporter: Zsolt Miskolczi
>Priority: Major
>
> PowerMock seems to be a dead project. 
>  
> Discussions:
> [https://lists.apache.org/thread/xrwt6lw09snwl0o3w06v6v5k5h090p68]
> [https://lists.apache.org/thread/xpvc76kh84ddd56z82mo0pqo0yylqys5]
>  
> Affected modules: 
>  * ql
>  * llap-client
>  * beeline
>  * itests/hive-jmh
>  * jdbc-handler
>  * service



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvhu updated HIVE-27590:

Environment: Any  (was: {code:java}
// code placeholder
{code})

> Make LINES TERMINATED BY work when creating table
> -
>
> Key: HIVE-27590
> URL: https://issues.apache.org/jira/browse/HIVE-27590
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, SQL
>Affects Versions: 3.1.3
> Environment: Any
>Reporter: lvhu
>Assignee: lvhu
>Priority: Major
>
> *The only way to set line delimiters when creating tables in current Hive is 
> like this:*
> {code:java}
> package abc.hive;
> public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
> implements JobConfigurable {
>  ...
> }
> -- then reference the custom class when creating the table:
> create table test (
>     id string,
>     name string
> )
> STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
> If there are multiple different record delimiters, multiple TextInputFormat 
> subclasses need to be rewritten.
> Unfortunately, the ideal approach is not supported yet:
> {code:java}
> create table test  (  
>     id string,  
>     name string  
> )  
> row format delimited fields terminated by '\t'  -- supported
> LINES TERMINATED BY '|@|' ;   -- not supported  {code}
> I have a solution that supports setting line delimiters when creating tables, 
> just like above.
> *1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.*
> The HiveTextInputFormat class reads a configuration file so that the record 
> delimiter for input files can be set based on the prefix of the file path.
> {code:java}
> public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
>     implements JobConfigurable {
>
>   // required by JobConfigurable; no per-job state is kept here
>   public void configure(JobConf job) {
>   }
>
>   public RecordReader<LongWritable, Text> getRecordReader(
>       InputSplit genericSplit, JobConf job, Reporter reporter)
>       throws IOException {
>
>     reporter.setStatus(genericSplit.toString());
>     // default delimiter
>     String delimiter = job.get("textinputformat.record.delimiter");
>     // obtain the path of the file being read
>     String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
>     // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
>     Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
>     for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
>       String configPath = entry.getKey();
>       // if configPath is a prefix of filePath, use the delimiter configured for it
>       if (filePath.startsWith(configPath)) {
>         delimiter = entry.getValue();
>       }
>     }
>     byte[] recordDelimiterBytes = null;
>     if (null != delimiter) {
>       recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
>     }
>     return new LineRecordReader(job, (FileSplit) genericSplit,
>         recordDelimiterBytes);
>   }
> } {code}
> *2. Modify Hive's create table handling to support:*
> {code:java}
> create table test (
>     id string,
>     name string
> )
> LINES TERMINATED BY '|@|'
> LOCATION 'hdfs_path'; {code}
> If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
> configuration file.
> Set HiveTextInputFormat as the default INPUTFORMAT.
> Looking forward to receiving your suggestions and feedback!
> *If you accept my idea, I hope you can assign the task to me. My GitHub 
> account is: _lvhu-goodluck_*
> I really hope to contribute code to the community.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27592) Mockito dependency from root pom.xml to individual modules

2023-08-10 Thread Zsolt Miskolczi (Jira)
Zsolt Miskolczi created HIVE-27592:
--

 Summary: Mockito dependency from root pom.xml to individual modules
 Key: HIVE-27592
 URL: https://issues.apache.org/jira/browse/HIVE-27592
 Project: Hive
  Issue Type: Task
Reporter: Zsolt Miskolczi






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HIVE-27515) Upgrade mockito to 4.x from 3.x

2023-08-10 Thread Zsolt Miskolczi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zsolt Miskolczi resolved HIVE-27515.

  Assignee: Zsolt Miskolczi
Resolution: Won't Do

PowerMock is incompatible with Mockito 4.x, so we cannot upgrade until PowerMock 
is removed.
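For context, a minimal sketch of the kind of test this migration enables, assuming 
mockito-inline (which supports static mocking without PowerMock's custom runner) is 
on the test classpath; HiveUtils is a made-up stand-in, not a real Hive class:
{code:java}
import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mockStatic;

import org.junit.Test;
import org.mockito.MockedStatic;

public class StaticMockExampleTest {

  // hypothetical class under test, standing in for real Hive utility classes
  static class HiveUtils {
    static String clusterName() {
      return "real-cluster";
    }
  }

  @Test
  public void staticMockingWithoutPowerMock() {
    // the static mock is scoped to the try-with-resources block
    try (MockedStatic<HiveUtils> mocked = mockStatic(HiveUtils.class)) {
      mocked.when(HiveUtils::clusterName).thenReturn("mock-cluster");
      assertEquals("mock-cluster", HiveUtils.clusterName());
    }
    // outside the block the real implementation is back
    assertEquals("real-cluster", HiveUtils.clusterName());
  }
}
{code}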

> Upgrade mockito to 4.x from 3.x
> ---
>
> Key: HIVE-27515
> URL: https://issues.apache.org/jira/browse/HIVE-27515
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Zsolt Miskolczi
>Assignee: Zsolt Miskolczi
>Priority: Major
>
> Full context in this PR: [https://github.com/apache/hive/pull/3798]
>  
> This is a prerequisite for moving from PowerMock to mockito-inline



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27591) Remove PowerMock

2023-08-10 Thread Zsolt Miskolczi (Jira)
Zsolt Miskolczi created HIVE-27591:
--

 Summary: Remove PowerMock
 Key: HIVE-27591
 URL: https://issues.apache.org/jira/browse/HIVE-27591
 Project: Hive
  Issue Type: Test
Reporter: Zsolt Miskolczi


PowerMock seems to be a dead project. 

 

Upstream discussions:

[https://lists.apache.org/thread/xrwt6lw09snwl0o3w06v6v5k5h090p68]

[https://lists.apache.org/thread/xpvc76kh84ddd56z82mo0pqo0yylqys5]

 

Affected modules: 
 * ql
 * llap-client
 * beeline
 * itests/hive-jmh
 * jdbc-handler
 * service



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvhu reassigned HIVE-27590:
---

Assignee: lvhu

> Make LINES TERMINATED BY work when creating table
> -
>
> Key: HIVE-27590
> URL: https://issues.apache.org/jira/browse/HIVE-27590
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, SQL
>Affects Versions: 3.1.3
> Environment: {code:java}
> // code placeholder
> {code}
>Reporter: lvhu
>Assignee: lvhu
>Priority: Major
>
> *The only way to set line delimiters when creating tables in current Hive is 
> like this:*
> {code:java}
> package abc.hive;
> public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
> implements JobConfigurable {
>  ...
> }
> -- then reference the custom class when creating the table:
> create table test (
>     id string,
>     name string
> )
> STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
> If there are multiple different record delimiters, multiple TextInputFormat 
> subclasses need to be rewritten.
> Unfortunately, the ideal approach is not supported yet:
> {code:java}
> create table test  (  
>     id string,  
>     name string  
> )  
> row format delimited fields terminated by '\t'  -- supported
> LINES TERMINATED BY '|@|' ;   -- not supported  {code}
> I have a solution that supports setting line delimiters when creating tables, 
> just like above.
> *1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.*
> The HiveTextInputFormat class reads a configuration file so that the record 
> delimiter for input files can be set based on the prefix of the file path.
> {code:java}
> public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
>     implements JobConfigurable {
>
>   // required by JobConfigurable; no per-job state is kept here
>   public void configure(JobConf job) {
>   }
>
>   public RecordReader<LongWritable, Text> getRecordReader(
>       InputSplit genericSplit, JobConf job, Reporter reporter)
>       throws IOException {
>
>     reporter.setStatus(genericSplit.toString());
>     // default delimiter
>     String delimiter = job.get("textinputformat.record.delimiter");
>     // obtain the path of the file being read
>     String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
>     // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
>     Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
>     for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
>       String configPath = entry.getKey();
>       // if configPath is a prefix of filePath, use the delimiter configured for it
>       if (filePath.startsWith(configPath)) {
>         delimiter = entry.getValue();
>       }
>     }
>     byte[] recordDelimiterBytes = null;
>     if (null != delimiter) {
>       recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
>     }
>     return new LineRecordReader(job, (FileSplit) genericSplit,
>         recordDelimiterBytes);
>   }
> } {code}
> *2. Modify Hive's create table handling to support:*
> {code:java}
> create table test (
>     id string,
>     name string
> )
> LINES TERMINATED BY '|@|'
> LOCATION 'hdfs_path'; {code}
> If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
> configuration file.
> Set HiveTextInputFormat as the default INPUTFORMAT.
> Looking forward to receiving your suggestions and feedback!
> *If you accept my idea, I hope you can assign the task to me. My GitHub 
> account is: _lvhu-goodluck_*
> I really hope to contribute code to the community.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvhu reassigned HIVE-27590:
---

Assignee: (was: lvhu)

> Make LINES TERMINATED BY work when creating table
> -
>
> Key: HIVE-27590
> URL: https://issues.apache.org/jira/browse/HIVE-27590
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, SQL
>Affects Versions: 3.1.3
> Environment: {code:java}
> // code placeholder
> {code}
>Reporter: lvhu
>Priority: Major
>
> *The only way to set line delimiters when creating tables in current Hive is 
> like this:*
> {code:java}
> package abc.hive;
> public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
> implements JobConfigurable {
>  ...
> }
> -- then reference the custom class when creating the table:
> create table test (
>     id string,
>     name string
> )
> STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
> If there are multiple different record delimiters, multiple TextInputFormat 
> subclasses need to be rewritten.
> Unfortunately, the ideal approach is not supported yet:
> {code:java}
> create table test  (  
>     id string,  
>     name string  
> )  
> row format delimited fields terminated by '\t'  -- supported
> LINES TERMINATED BY '|@|' ;   -- not supported  {code}
> I have a solution that supports setting line delimiters when creating tables, 
> just like above.
> *1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.*
> The HiveTextInputFormat class reads a configuration file so that the record 
> delimiter for input files can be set based on the prefix of the file path.
> {code:java}
> public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
>     implements JobConfigurable {
>
>   // required by JobConfigurable; no per-job state is kept here
>   public void configure(JobConf job) {
>   }
>
>   public RecordReader<LongWritable, Text> getRecordReader(
>       InputSplit genericSplit, JobConf job, Reporter reporter)
>       throws IOException {
>
>     reporter.setStatus(genericSplit.toString());
>     // default delimiter
>     String delimiter = job.get("textinputformat.record.delimiter");
>     // obtain the path of the file being read
>     String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
>     // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
>     Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
>     for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
>       String configPath = entry.getKey();
>       // if configPath is a prefix of filePath, use the delimiter configured for it
>       if (filePath.startsWith(configPath)) {
>         delimiter = entry.getValue();
>       }
>     }
>     byte[] recordDelimiterBytes = null;
>     if (null != delimiter) {
>       recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
>     }
>     return new LineRecordReader(job, (FileSplit) genericSplit,
>         recordDelimiterBytes);
>   }
> } {code}
> *2. Modify Hive's create table handling to support:*
> {code:java}
> create table test (
>     id string,
>     name string
> )
> LINES TERMINATED BY '|@|'
> LOCATION 'hdfs_path'; {code}
> If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
> configuration file.
> Set HiveTextInputFormat as the default INPUTFORMAT.
> Looking forward to receiving your suggestions and feedback!
> *If you accept my idea, I hope you can assign the task to me. My GitHub 
> account is: _lvhu-goodluck_*
> I really hope to contribute code to the community.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lvhu updated HIVE-27590:

Description: 
*The only way to set line delimiters when creating tables in current Hive is 
like this:*
{code:java}
package abc.hive;
public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
implements JobConfigurable {
 ...
}
-- then reference the custom class when creating the table:
create table test (
    id string,
    name string
)
STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
If there are multiple different record delimiters, multiple TextInputFormat 
subclasses need to be rewritten.

Unfortunately, the ideal approach is not supported yet:
{code:java}
create table test  (  
    id string,  
    name string  
)  
row format delimited fields terminated by '\t'  -- supported
LINES TERMINATED BY '|@|' ;   -- not supported  {code}
I have a solution that supports setting line delimiters when creating tables, 
just like above.

*1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.*

The HiveTextInputFormat class reads a configuration file so that the record 
delimiter for input files can be set based on the prefix of the file path.
{code:java}
public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
    implements JobConfigurable {

  // required by JobConfigurable; no per-job state is kept here
  public void configure(JobConf job) {
  }

  public RecordReader<LongWritable, Text> getRecordReader(
      InputSplit genericSplit, JobConf job, Reporter reporter)
      throws IOException {

    reporter.setStatus(genericSplit.toString());
    // default delimiter
    String delimiter = job.get("textinputformat.record.delimiter");
    // obtain the path of the file being read
    String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
    // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
    Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
    for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
      String configPath = entry.getKey();
      // if configPath is a prefix of filePath, use the delimiter configured for it
      if (filePath.startsWith(configPath)) {
        delimiter = entry.getValue();
      }
    }
    byte[] recordDelimiterBytes = null;
    if (null != delimiter) {
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    }
    return new LineRecordReader(job, (FileSplit) genericSplit,
        recordDelimiterBytes);
  }
} {code}
*2. Modify Hive's create table handling to support:*
{code:java}
create table test (
    id string,
    name string
)
LINES TERMINATED BY '|@|'
LOCATION 'hdfs_path'; {code}
If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
configuration file.

Set HiveTextInputFormat as the default INPUTFORMAT.

Looking forward to receiving your suggestions and feedback!

*If you accept my idea, I hope you can assign the task to me. My GitHub account 
is: _lvhu-goodluck_*

I really hope to contribute code to the community.

  was:
*The only way to set line delimiters when creating tables in current Hive is 
like this:*
{code:java}
package abc.hive;
public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
implements JobConfigurable {
 ...
}
-- then reference the custom class when creating the table:
create table test (
    id string,
    name string
)
STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
If there are multiple different record delimiters, multiple TextInputFormat 
subclasses need to be rewritten.

Unfortunately, the ideal approach is not supported yet:
{code:java}
create table test  (  
    id string,  
    name string  
)  
row format delimited fields terminated by '\t'  -- supported
LINES TERMINATED BY '|@|' ;   -- not supported  {code}
I have a solution that supports setting line delimiters when creating tables 
just like above.

*1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.* 
The HiveTextInputFormat class reads a configuration file so that the record 
delimiter for input files can be set based on the prefix of the file path.
{code:java}
public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
    implements JobConfigurable {

  // required by JobConfigurable; no per-job state is kept here
  public void configure(JobConf job) {
  }

  public RecordReader<LongWritable, Text> getRecordReader(
      InputSplit genericSplit, JobConf job, Reporter reporter)
      throws IOException {

    reporter.setStatus(genericSplit.toString());
    // default delimiter
    String delimiter = job.get("textinputformat.record.delimiter");
    // obtain the path of the file being read
    String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
    // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
    Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
    for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
      String configPath = entry.getKey();
      // if configPath is a prefix of filePath, use the delimiter configured for it
      if (filePath.startsWith(configPath)) {
        delimiter = entry.getValue();
      }
    }
    byte[] recordDelimiterBytes = null;
    if (null != delimiter) {
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    }
    return new LineRecordReader(job, (FileSplit) genericSplit,
        recordDelimiterBytes);
  }
} {code}

[jira] [Created] (HIVE-27590) Make LINES TERMINATED BY work when creating table

2023-08-10 Thread lvhu (Jira)
lvhu created HIVE-27590:
---

 Summary: Make LINES TERMINATED BY work when creating table
 Key: HIVE-27590
 URL: https://issues.apache.org/jira/browse/HIVE-27590
 Project: Hive
  Issue Type: Improvement
  Components: Hive, SQL
Affects Versions: 3.1.3
 Environment: {code:java}
// code placeholder
{code}
Reporter: lvhu
Assignee: lvhu


*The only way to set line delimiters when creating tables in current Hive is 
like this:*
{code:java}
package abc.hive;
public class MyFstTextInputFormat extends FileInputFormat<LongWritable, Text> 
implements JobConfigurable {
 ...
}
-- then reference the custom class when creating the table:
create table test (
    id string,
    name string
)
STORED AS INPUTFORMAT 'abc.hive.MyFstTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';   {code}
If there are multiple different record delimiters, multiple TextInputFormat 
subclasses need to be rewritten.

Unfortunately, the ideal approach is not supported yet:
{code:java}
create table test  (  
    id string,  
    name string  
)  
row format delimited fields terminated by '\t'  -- supported
LINES TERMINATED BY '|@|' ;   -- not supported  {code}
I have a solution that supports setting line delimiters when creating tables 
just like above.

*1. Create a new HiveTextInputFormat class to replace the TextInputFormat class.* 
The HiveTextInputFormat class reads a configuration file so that the record 
delimiter for input files can be set based on the prefix of the file path.
{code:java}
public class HiveTextInputFormat extends FileInputFormat<LongWritable, Text>
    implements JobConfigurable {

  // required by JobConfigurable; no per-job state is kept here
  public void configure(JobConf job) {
  }

  public RecordReader<LongWritable, Text> getRecordReader(
      InputSplit genericSplit, JobConf job, Reporter reporter)
      throws IOException {

    reporter.setStatus(genericSplit.toString());
    // default delimiter
    String delimiter = job.get("textinputformat.record.delimiter");
    // obtain the path of the file being read
    String filePath = ((FileSplit) genericSplit).getPath().toUri().getPath();
    // obtain the path-prefix-to-delimiter mapping by parsing the configuration file
    Map<String, String> pathToDelimiterMap = parsePathToDelimiter(job);
    for (Map.Entry<String, String> entry : pathToDelimiterMap.entrySet()) {
      String configPath = entry.getKey();
      // if configPath is a prefix of filePath, use the delimiter configured for it
      if (filePath.startsWith(configPath)) {
        delimiter = entry.getValue();
      }
    }
    byte[] recordDelimiterBytes = null;
    if (null != delimiter) {
      recordDelimiterBytes = delimiter.getBytes(Charsets.UTF_8);
    }
    return new LineRecordReader(job, (FileSplit) genericSplit,
        recordDelimiterBytes);
  }
} {code}
*2. Modify Hive's create table handling to support:*
{code:java}
create table test (
    id string,
    name string
)
LINES TERMINATED BY '|@|'
LOCATION 'hdfs_path'; {code}
If users execute the above SQL, Hive will insert (hdfs_path, '|@|') into the 
configuration file.

Looking forward to receiving your suggestions and feedback!

*If you accept my idea, I hope you can assign the task to me. My GitHub account 
is: _lvhu-goodluck_*

I really hope to contribute code to the community.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-27589) Iceberg: Branches of Merge/Update statements should be committed atomically

2023-08-10 Thread Denys Kuzmenko (Jira)
Denys Kuzmenko created HIVE-27589:
-

 Summary: Iceberg: Branches of Merge/Update statements should be 
committed atomically
 Key: HIVE-27589
 URL: https://issues.apache.org/jira/browse/HIVE-27589
 Project: Hive
  Issue Type: Task
Reporter: Denys Kuzmenko






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27586) Parse dates from strings ignoring trailing (potentially) invalid chars

2023-08-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27586:
--
Labels: backwards-compatibility pull-request-available  (was: 
backwards-compatibility)

> Parse dates from strings ignoring trailing (potentially) invalid chars
> -
>
> Key: HIVE-27586
> URL: https://issues.apache.org/jira/browse/HIVE-27586
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: backwards-compatibility, pull-request-available
>
> The goal of this ticket is to extract and return a valid date from a string 
> value when there is a valid date prefix in the string.
> The following table contains a few illustrative examples highlighting what 
> happens now and what will happen after the proposed changes to ignore 
> trailing characters. HIVE-20007 introduced some behavior changes around this 
> area, so the table also displays the Hive behavior before that change.
> ||ID||String value||Before HIVE-20007||Current behavior||Ignore trailing 
> chars||
> |1|2023-08-03_16:02:00|2023-08-03|null|2023-08-03|
> |2|2023-08-03-16:02:00|2023-08-03|null|2023-08-03|
> |3|2023-08-0316:02:00|2024-06-11|null|2023-08-03|
> |4|03-08-2023|0009-02-12|null|0003-08-20|
> |5|2023-08-03 GARBAGE|2023-08-03|2023-08-03|2023-08-03|
> |6|2023-08-03TGARBAGE|2023-08-03|2023-08-03|2023-08-03|
> |7|2023-08-03_GARBAGE|2023-08-03|null|2023-08-03|
> This change partially (see examples 3 and 4) restores the behavior that existed 
> before HIVE-20007 and at the same time makes the handling of trailing invalid 
> chars more uniform.
> This change will have an impact on various Hive SQL functions and operators 
> (+/-) that accept dates from string values. A partial list of affected 
> functions is outlined below:
> * CAST (V AS DATE)
> * CAST (V AS TIMESTAMP)
> * TO_DATE
> * DATE_ADD
> * DATE_DIFF
> * WEEKOFYEAR
> * DAYOFWEEK
> * TRUNC
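For illustration only (this is not the actual Hive patch), prefix-only date parsing 
is straightforward with java.time, since DateTimeFormatter.parse(CharSequence, 
ParsePosition) stops at the first character it cannot consume; note it follows 
strict ISO rules, so it does not reproduce every row of the table above (e.g. 
example 4):
{code:java}
import java.text.ParsePosition;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAccessor;

public class LenientDateParse {

  private static final DateTimeFormatter FMT = DateTimeFormatter.ISO_LOCAL_DATE;

  // returns the date encoded by a valid prefix of s, or null when no prefix parses
  public static LocalDate parseDatePrefix(String s) {
    try {
      ParsePosition pos = new ParsePosition(0);
      TemporalAccessor parsed = FMT.parse(s, pos);
      return LocalDate.from(parsed);
    } catch (Exception e) {
      return null;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseDatePrefix("2023-08-03_16:02:00")); // 2023-08-03
    System.out.println(parseDatePrefix("2023-08-03 GARBAGE"));  // 2023-08-03
    System.out.println(parseDatePrefix("GARBAGE"));             // null
  }
}
{code}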



--
This message was sent by Atlassian Jira
(v8.20.10#820010)