[jira] [Created] (HIVE-27781) HMS crashes with OOM even though there is enough heap space

2023-10-09 Thread Pravin Sinha (Jira)
Pravin Sinha created HIVE-27781:
---

 Summary: HMS crashes with OOM even though there is enough heap space
 Key: HIVE-27781
 URL: https://issues.apache.org/jira/browse/HIVE-27781
 Project: Hive
  Issue Type: Bug
Reporter: Pravin Sinha
Assignee: Pravin Sinha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27765) Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966

2023-10-09 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27765:

Description: 
* HIVE-20052: Arrow serde should fill ArrowColumnVector(Decimal) with the 
given schema precision/scale
* HIVE-20093: LlapOutputFormatService: Use ArrowBuf with Netty for Accounting
* HIVE-20203: Arrow SerDe leaks a DirectByteBuffer
* HIVE-20290: Lazy initialize ArrowColumnarBatchSerDe so it doesn't 
allocate buffers during GetSplits
* HIVE-20300: VectorFileSinkArrowOperator
* HIVE-20312: Allow arrow clients to use their own BufferAllocator with 
LlapOutputFormatService
* HIVE-20044: Arrow Serde should pad char values and handle empty strings 
correctly
* HIVE-21966: Llap external client - Arrow Serializer throws 
ArrayIndexOutOfBoundsException in some cases

> Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, 
> HIVE-20312, HIVE-20044, HIVE-21966
> --
>
> Key: HIVE-27765
> URL: https://issues.apache.org/jira/browse/HIVE-27765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.3
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>
> * HIVE-20052: Arrow serde should fill ArrowColumnVector(Decimal) with the 
> given schema precision/scale
> * HIVE-20093: LlapOutputFormatService: Use ArrowBuf with Netty for 
> Accounting
> * HIVE-20203: Arrow SerDe leaks a DirectByteBuffer
> * HIVE-20290: Lazy initialize ArrowColumnarBatchSerDe so it doesn't 
> allocate buffers during GetSplits
> * HIVE-20300: VectorFileSinkArrowOperator
> * HIVE-20312: Allow arrow clients to use their own BufferAllocator with 
> LlapOutputFormatService
> * HIVE-20044: Arrow Serde should pad char values and handle empty strings 
> correctly
> * HIVE-21966: Llap external client - Arrow Serializer throws 
> ArrayIndexOutOfBoundsException in some cases





[jira] [Resolved] (HIVE-27765) Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966

2023-10-09 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-27765.
-
Fix Version/s: 3.2.0
   Resolution: Fixed

> Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, 
> HIVE-20312, HIVE-20044, HIVE-21966
> --
>
> Key: HIVE-27765
> URL: https://issues.apache.org/jira/browse/HIVE-27765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>






[jira] [Updated] (HIVE-27765) Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966

2023-10-09 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-27765:

Affects Version/s: 3.1.3
   (was: 3.2.0)

> Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, 
> HIVE-20312, HIVE-20044, HIVE-21966
> --
>
> Key: HIVE-27765
> URL: https://issues.apache.org/jira/browse/HIVE-27765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.1.3
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>






[jira] [Assigned] (HIVE-22989) Don't close parent classloader when session being closed

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-22989:
--

Assignee: (was: Zhihua Deng)

> Don't close parent classloader when session being closed
> 
>
> Key: HIVE-22989
> URL: https://issues.apache.org/jira/browse/HIVE-22989
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-22989.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When HiveServer2 loads UDFs, Registry uses the session's specified 
> classloader to load them and caches that classloader. When the user doesn't 
> set the aux jars, the cached classloader is the session's parent 
> classloader. In our case, we don't set the aux jars, and we periodically 
> update the session's parent classloader to pick up user jars dynamically. 
> Registry should do a sanity check when it closes the cached classloaders.
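
The sanity check asked for in the description above could look roughly like the following. This is a minimal sketch, not Hive's Registry code: the class LoaderCacheSketch and its methods cache and closeCached are hypothetical names introduced only for illustration, and the sketch assumes the cached loaders are URLClassLoader instances.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a classloader cache; this is not Hive's Registry class. */
class LoaderCacheSketch {
    private final List<ClassLoader> cached = new ArrayList<>();

    void cache(ClassLoader loader) {
        cached.add(loader);
    }

    /** Closes the cached loaders but skips the session's parent loader; returns how many were closed. */
    int closeCached(ClassLoader sessionParent) {
        int closed = 0;
        for (ClassLoader loader : cached) {
            // Sanity check: the parent loader is shared, so it must stay open.
            if (loader == sessionParent) {
                continue;
            }
            if (loader instanceof URLClassLoader) {
                try {
                    ((URLClassLoader) loader).close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                closed++;
            }
        }
        cached.clear();
        return closed;
    }

    public static void main(String[] args) {
        URLClassLoader parent = new URLClassLoader(new URL[0]);
        URLClassLoader udfLoader = new URLClassLoader(new URL[0], parent);
        LoaderCacheSketch cache = new LoaderCacheSketch();
        cache.cache(parent);     // the case from the description: no aux jars set
        cache.cache(udfLoader);
        // The parent is skipped, so only udfLoader is closed.
        System.out.println("closed loaders: " + cache.closeCached(parent));
    }
}
```

An identity comparison (==) is used on purpose: the question is whether the cached object is the very loader the session inherited, not whether two loaders are equivalent.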





[jira] [Comment Edited] (HIVE-19818) SessionState getQueryId returns an empty string

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503185#comment-16503185
 ] 

Zhihua Deng edited comment on HIVE-19818 at 10/10/23 1:24 AM:
--

One possible way to solve the problem is to get the query id from the job 
configuration:

HiveConf.getVar(job, HiveConf.ConfVars.HIVEQUERYID, "").trim()


was (Author: dengzh):
one possible way to solve the problem is that getting the query id from the job 
configuration like:

HiveConf.getVar(job, HiveConf.ConfVars.HIVEQUERYID, "").trim()

> SessionState getQueryId returns an empty string
> ---
>
> Key: HIVE-19818
> URL: https://issues.apache.org/jira/browse/HIVE-19818
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Minor
> Attachments: HIVE-19818.patch
>
>
> When we execute SQL asynchronously, a new configuration based on the one the 
> session holds is created and passed to the driver instance, which results in 
> an empty string being returned when SessionState#getQueryId is called later 
> on. This problem can be seen in HadoopJobExecHelper.java.





[jira] [Comment Edited] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17139107#comment-17139107
 ] 

Zhihua Deng edited comment on HIVE-23720 at 10/10/23 1:23 AM:
--

This needs further research, as I see that releaseDriverContext() is not called.

https://issues.apache.org/jira/browse/HIVE-16426


was (Author: dengzh):
Should do a further research as I see there no releaseDriverContext() is called.

https://issues.apache.org/jira/browse/HIVE-16426

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, SQLOperation cancels the background task only when this condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition evaluates to false when the state is OperationState.CANCELED or 
> OperationState.TIMEDOUT, but operations in those states should also stop the 
> background tasks to release resources.
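
To make the behaviour described above concrete, here is a small sketch that evaluates the quoted guard for each state. The enum OpState and the method quotedGuard are hypothetical stand-ins for illustration; they are not Hive's OperationState or SQLOperation.

```java
/** Hypothetical illustration of the guard quoted in the issue; not Hive code. */
class CancelGuardSketch {
    enum OpState { RUNNING, FINISHED, CANCELED, TIMEDOUT }

    /** Same shape as the quoted condition: true means the background task gets canceled. */
    static boolean quotedGuard(boolean runAsync, OpState state) {
        return runAsync && state != OpState.CANCELED && state != OpState.TIMEDOUT;
    }

    public static void main(String[] args) {
        for (OpState state : OpState.values()) {
            // For CANCELED and TIMEDOUT the guard is false, so no cancellation happens.
            System.out.println(state + " -> cancel background task: " + quotedGuard(true, state));
        }
    }
}
```

The two states for which the guard is false are exactly the ones in which the issue says the background task should be stopped.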





[jira] [Comment Edited] (HIVE-23720) Background task may not be interrupted when operation being canceled or timeout

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140382#comment-17140382
 ] 

Zhihua Deng edited comment on HIVE-23720 at 10/10/23 1:23 AM:
--

The running Task can be interrupted by releaseTaskQueue() in driver::close(), 
so it's OK to shut down this way instead of canceling the background task.


was (Author: dengzh):
The running Task can be interrupted by releaseTaskQueue() in driver::close(), 
so it's ok to shutdown by this instead of canceling the background.

> Background task may not be interrupted when operation being canceled or 
> timeout
> ---
>
> Key: HIVE-23720
> URL: https://issues.apache.org/jira/browse/HIVE-23720
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, SQLOperation cancels the background task only when this condition is met:
> if (shouldRunAsync() && state != OperationState.CANCELED && state != 
> OperationState.TIMEDOUT)
> The condition evaluates to false when the state is OperationState.CANCELED or 
> OperationState.TIMEDOUT, but operations in those states should also stop the 
> background tasks to release resources.





[jira] [Assigned] (HIVE-23185) Historic queries lost after HS2 restart

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-23185:
--

Assignee: (was: Zhihua Deng)

> Historic queries lost after HS2 restart
> ---
>
> Key: HIVE-23185
> URL: https://issues.apache.org/jira/browse/HIVE-23185
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Priority: Major
>
> QueryInfoCache caches historic queries in memory, so when HS2 restarts due to 
> OOM or an upgrade, the queries are no longer visible in the web UI.





[jira] [Assigned] (HIVE-23727) Improve SQLOperation log handling when canceling background

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-23727:
--

Assignee: (was: Zhihua Deng)

> Improve SQLOperation log handling when canceling background
> ---
>
> Key: HIVE-23727
> URL: https://issues.apache.org/jira/browse/HIVE-23727
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> SQLOperation checks _if (shouldRunAsync() && state != 
> OperationState.CANCELED && state != OperationState.TIMEDOUT)_ to cancel the 
> background task. If the check is true, the state cannot be 
> OperationState.CANCELED, so the logging under state == OperationState.CANCELED can never happen.
>  





[jira] [Comment Edited] (HIVE-24422) Throw SemanticException when CTE alias is conflicted with table name

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17252750#comment-17252750
 ] 

Zhihua Deng edited comment on HIVE-24422 at 10/10/23 1:21 AM:
--

It seems I was wrong. In trunk, for the query 
{code:java}
with game_info as ( select distinct ext_id, dev_app_id, game_name from 
game_info_extend ) select count(game_name) from game_info;{code}
game_info is materialized, which avoids recomputing game_info no matter 
how many times it is referenced.


was (Author: dengzh):
Seems I am wrong. In the trunk, the query 
{code:java}
with game_info as ( select distinct ext_id, dev_app_id, game_name from 
game_info_extend ) select count(game_name) from game_info;{code}
The game_info is firstly materialized, this can reduce recomputing of game_info 
no matter how many times the game_info is referenced , the materialized view is 
used other than the table game_info for analyzing the tree later on.

> Throw SemanticException when CTE alias is conflicted with table name
> 
>
> Key: HIVE-24422
> URL: https://issues.apache.org/jira/browse/HIVE-24422
> Project: Hive
>  Issue Type: Improvement
>  Components: Parser
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If the alias of a CTE conflicts with a table name, we use the alias to 
> fetch the table rather than replacing it with the ASTNode tree, which may 
> cause confusing problems. For example:
> {noformat}
> create table game_info (game_name string);
> with game_info as (
> select distinct ext_id, dev_app_id, game_name
> from game_info_extend )
> select count(game_name) from game_info;{noformat}
> The query returns the number of rows of the table game_info, instead of 
> game_info_extend. It would be better to throw an exception to avoid such 
> cases.
>  





[jira] [Assigned] (HIVE-24351) Report progress to prevent merge task from timeout

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24351:
--

Assignee: (was: Zhihua Deng)

> Report progress to prevent merge task from timeout
> --
>
> Key: HIVE-24351
> URL: https://issues.apache.org/jira/browse/HIVE-24351
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If the MergeFileTask tries to merge lots of empty files, the task may be 
> terminated due to a task timeout. It’s rare, but it happens. Report 
> progress regularly to prevent the mapper from timing out.
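
The idea of reporting progress regularly can be sketched as below. This is an assumption-laden illustration, not the MergeFileTask implementation: MergeProgressSketch, mergeAll, and the Runnable used as a progress reporter are all hypothetical, and the per-file merge work is left as a comment.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of periodic progress reporting; not Hive's MergeFileTask. */
class MergeProgressSketch {
    /** Walks over the files and pings the reporter after every `interval` files; returns the ping count. */
    static int mergeAll(List<String> files, int interval, Runnable reporter) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        int pings = 0;
        for (int i = 0; i < files.size(); i++) {
            // The actual merge of files.get(i) would happen here. An empty file
            // produces no output, so nothing else signals that the task is alive.
            if ((i + 1) % interval == 0) {
                reporter.run();
                pings++;
            }
        }
        return pings;
    }

    public static void main(String[] args) {
        List<String> emptyFiles = Collections.nCopies(10, "empty-file");
        int pings = mergeAll(emptyFiles, 3, () -> System.out.println("still merging"));
        System.out.println("progress reports sent: " + pings);
    }
}
```

Reporting by file count keeps the sketch simple; a time-based interval would be another reasonable choice when individual files can take long to process.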





[jira] [Assigned] (HIVE-24639) Raises SemanticException other than ClassCastException when filter has non-boolean expressions

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-24639:
--

Assignee: (was: Zhihua Deng)

> Raises SemanticException other than ClassCastException when filter has 
> non-boolean expressions
> --
>
> Key: HIVE-24639
> URL: https://issues.apache.org/jira/browse/HIVE-24639
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Sometimes we see a ClassCastException in filters when fetching some rows of a 
> table or executing the query. The 
> GenericUDFOPOr/GenericUDFOPAnd/FilterOperator assume that the output of their 
> conditions is a boolean, but this is not guaranteed. For example: 
> _select * from ccn_table where src + 1;_ 
> will throw ClassCastException:
> {code:java}
> Caused by: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Boolean
> at org.apache.hadoop.hive.ql.exec.FilterOperator.process(FilterOperator.java:125)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:173)
> at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:153)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:553)
> ...{code}
> It would be better to validate the filter during analysis instead of at 
> runtime and provide more meaningful messages.
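
The difference between the runtime failure and an analysis-time check can be sketched as follows. Everything here is hypothetical and only illustrates the idea: FilterTypeCheckSketch is not a Hive class, the type is passed as a plain string, and IllegalArgumentException stands in for the SemanticException the issue title asks for.

```java
/** Hypothetical sketch of runtime cast versus analysis-time validation; not Hive code. */
class FilterTypeCheckSketch {
    /** Runtime behaviour from the description: a blind cast of the predicate result. */
    static boolean evalAtRuntime(Object predicateResult) {
        // Throws ClassCastException when the value is not a Boolean.
        return (Boolean) predicateResult;
    }

    /** Analysis-time alternative: reject the predicate based on its declared type name. */
    static void validateAtAnalysis(String exprText, String typeName) {
        if (!"boolean".equals(typeName)) {
            // Stand-in for a SemanticException with a meaningful message.
            throw new IllegalArgumentException(
                "Filter expression '" + exprText + "' has type " + typeName + ", expected boolean");
        }
    }

    public static void main(String[] args) {
        try {
            evalAtRuntime(Integer.valueOf(2)); // like evaluating "src + 1" as a filter
        } catch (ClassCastException e) {
            System.out.println("runtime failure: " + e.getClass().getSimpleName());
        }
        try {
            validateAtAnalysis("src + 1", "int");
        } catch (IllegalArgumentException e) {
            System.out.println("analysis failure: " + e.getMessage());
        }
    }
}
```

The analysis-time check fails before any rows are processed and names the offending expression, which is the more meaningful message the description asks for.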





[jira] [Assigned] (HIVE-25294) Optimise the metadata count queries for local mode

2023-10-09 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-25294:
--

Assignee: (was: Zhihua Deng)

> Optimise the metadata count queries for local mode
> --
>
> Key: HIVE-25294
> URL: https://issues.apache.org/jira/browse/HIVE-25294
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When Metastore is in local mode, the client uses its own private HMSHandler 
> to get the metadata, and the HMSHandler must be initialized before it is 
> ready to serve. When metrics are enabled, HMSHandler counts the number 
> of databases, tables, and partitions on initialization, which could lead to 
> performance problems.





[jira] [Comment Edited] (HIVE-27201) Inconsistency between session Hive and thread-local Hive may cause HS2 deadlock

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17716189#comment-17716189
 ] 

Zhihua Deng edited comment on HIVE-27201 at 10/10/23 1:14 AM:
--

Attached a document to explain the case.

Thank you all for testing, reviewing, and merging the PR!


was (Author: dengzh):
Attach a document to elaborate the case.

Thank you all for the test, review and merge the PR!

> Inconsistency between session Hive and thread-local Hive may cause HS2 
> deadlock
> ---
>
> Key: HIVE-27201
> URL: https://issues.apache.org/jira/browse/HIVE-27201
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
> Attachments: HIVE-27201.pdf
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> A HiveServer2 server handler can switch to processing an operation from 
> another session. In that case, the Hive cached in ThreadLocal is not the same 
> as the Hive in SessionState, and can be referenced by another session. 
> If two handlers swap their sessions to process a DatabaseMetaData 
> request, and the HiveMetastoreClientFactory obtains the Hive via Hive.get(), 
> then there is a chance that a deadlock can happen.





[jira] [Comment Edited] (HIVE-27352) Support both LDAP and Kerberos Auth in HS2

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729681#comment-17729681
 ] 

Zhihua Deng edited comment on HIVE-27352 at 10/10/23 1:13 AM:
--

Thanks for the review and merge, [~hemanth619]!


was (Author: dengzh):
Thanks for the review and merge, [~hemanth619]!

For Http mode, we must configure the new auth method to the end of original 
hive.server2.authentication in order to keep compatible with the old HS2 client.

> Support both LDAP and Kerberos Auth in HS2
> --
>
> Key: HIVE-27352
> URL: https://issues.apache.org/jira/browse/HIVE-27352
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>
> Currently, HS2 supports a single form of auth in binary mode, and some 
> limited multiple auth methods in http mode. Some analysis tools based on Hive 
> JDBC support mixing Kerberos and LDAP; however, HS2 does not 
> support this combined auth type in either http or binary mode.





[jira] [Commented] (HIVE-27597) Implement JDBC Connector for HiveServer

2023-10-09 Thread zhangbutao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773426#comment-17773426
 ] 

zhangbutao commented on HIVE-27597:
---

[~ngangam] I probably still don't have edit permission, and I can't find the 
edit button. Could you please check it again? Thanks.

[https://cwiki.apache.org/confluence/display/Hive/Data+Connectors+in+Hive]

 

> Implement JDBC Connector for HiveServer 
> 
>
> Key: HIVE-27597
> URL: https://issues.apache.org/jira/browse/HIVE-27597
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>
> The initial idea of having a thrift based connector, that would enable Hive 
> Metastore to use thrift APIs to interact with another metastore from another 
> cluster, has some limitations. Features like column masking support become a 
> challenge as we may bypass the authz controls on the remote cluster.
> Instead, if we could federate a query from one instance of HS2 to another 
> instance of HS2 over JDBC, we would address the above concerns. This will 
> at least give us the ability to access tables across cluster boundaries.





[jira] [Resolved] (HIVE-27597) Implement JDBC Connector for HiveServer

2023-10-09 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-27597.
--
Resolution: Fixed

Fix has been merged to master. Thank you [~ayushtkn] and [~zhangbutao] for the 
review. Closing the jira.

[~zhangbutao] I have granted you write access to the Data connectors page. Hope 
it works. Thank you

> Implement JDBC Connector for HiveServer 
> 
>
> Key: HIVE-27597
> URL: https://issues.apache.org/jira/browse/HIVE-27597
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>
> The initial idea of having a thrift based connector, that would enable Hive 
> Metastore to use thrift APIs to interact with another metastore from another 
> cluster, has some limitations. Features like column masking support become a 
> challenge as we may bypass the authz controls on the remote cluster.
> Instead, if we could federate a query from one instance of HS2 to another 
> instance of HS2 over JDBC, we would address the above concerns. This will 
> at least give us the ability to access tables across cluster boundaries.





[jira] [Updated] (HIVE-27780) Implement direct SQL for get_all_functions

2023-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27780:
--
Labels: pull-request-available  (was: )

> Implement direct SQL for get_all_functions
> --
>
> Key: HIVE-27780
> URL: https://issues.apache.org/jira/browse/HIVE-27780
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Resolved] (HIVE-27771) Iceberg: Allow expire snapshot by time range

2023-10-09 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HIVE-27771.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> Iceberg: Allow expire snapshot by time range
> 
>
> Key: HIVE-27771
> URL: https://issues.apache.org/jira/browse/HIVE-27771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Allow expiring snapshot by time range.
> Alter table ice01 execute expire_snapshot BETWEEN 'some time' AND 'some time'





[jira] [Commented] (HIVE-27771) Iceberg: Allow expire snapshot by time range

2023-10-09 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773254#comment-17773254
 ] 

Ayush Saxena commented on HIVE-27771:
-

Committed to master.

Thanks [~dkuzmenko] for the review!

> Iceberg: Allow expire snapshot by time range
> 
>
> Key: HIVE-27771
> URL: https://issues.apache.org/jira/browse/HIVE-27771
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>
> Allow expiring snapshot by time range.
> Alter table ice01 execute expire_snapshot BETWEEN 'some time' AND 'some time'





[jira] [Assigned] (HIVE-27780) Implement direct SQL for get_all_functions

2023-10-09 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao reassigned HIVE-27780:
-

Assignee: zhangbutao

> Implement direct SQL for get_all_functions
> --
>
> Key: HIVE-27780
> URL: https://issues.apache.org/jira/browse/HIVE-27780
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>






[jira] [Created] (HIVE-27780) Implement direct SQL for get_all_functions

2023-10-09 Thread zhangbutao (Jira)
zhangbutao created HIVE-27780:
-

 Summary: Implement direct SQL for get_all_functions
 Key: HIVE-27780
 URL: https://issues.apache.org/jira/browse/HIVE-27780
 Project: Hive
  Issue Type: Improvement
  Components: Standalone Metastore
Reporter: zhangbutao








[jira] [Updated] (HIVE-27779) Iceberg: Drop partition support

2023-10-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27779:
--
Labels: pull-request-available  (was: )

> Iceberg: Drop partition support
> ---
>
> Key: HIVE-27779
> URL: https://issues.apache.org/jira/browse/HIVE-27779
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sourabh Badhya
>Assignee: Sourabh Badhya
>Priority: Major
>  Labels: pull-request-available
>
> A logical extension of TRUNCATE PARTITION; however, DROP PARTITION allows 
> expressions with >, <, >=, <=, != etc.
> The syntax is as follows: 
> {code:java}
> alter table tableName drop partition (c='US', d<'2');{code}
> The drop partition command also allows multiple partition expressions.





[jira] [Updated] (HIVE-27777) CBO fails on multi insert overwrites with common group expression

2023-10-09 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-27777:
---
Description: 
The following statement is failing in CBO:

 
{code:java}
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}
The failure happens when there is a filter to a constant value in the FROM 
clause, the value is referenced in the filter in the INSERT OVERWRITE, and 
there is a common group existing across the insert overwrites.

CBO is pulling up the key = 5 expression into the select clause as a constant 
(i.e. select 5 key, f1 FROM tbl1 where key = 5).  After it gets converted back 
into AST and then re-compiled, there is code in the common group method that 
expects all columns to be non-constants which is causing the failure.

The failure stacktrace is shown below:
{noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: Line 6:16 Expression not in GROUP BY key '0'
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:13509)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13451)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:13419)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3727)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:3707)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlan1ReduceMultiGBY(SemanticAnalyzer.java:6514)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:11415)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12343)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:12209)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:634)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:13073)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:465)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:180)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:224)
at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:107)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:519)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:471)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:436)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:430)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:121)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:227)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:257)
at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:425)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:356)
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:733)
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:703)
at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:115)
at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157)
{noformat}


  was:
The following statement is failing in CBO:

 
{code:java}
FROM (select key, f1 FROM tbl1 where key=5) a
INSERT OVERWRITE TABLE tbl2 partition(key=5)
select f1 WHERE key > 0 GROUP by f1
INSERT OVERWRITE TABLE tbl2 partition(key=6)
select f1 WHERE key > 0 GROUP by f1;
{code}
The failure happens when there is a filter on a constant value in the FROM 
clause, the value is referenced in the filter in the INSERT OVERWRITE, and 
a common group exists across the insert overwrites.

CBO pulls up the key = 5 expression into the select clause as a constant 
(i.e. select 5 key, f1 FROM tbl1 where key = 5). After the plan is converted back 
into AST and re-compiled, code in the common group method expects all columns 
to be non-constants, which causes the failure.


> CBO fails on multi insert overwrites with 

[jira] [Created] (HIVE-27779) Iceberg: Drop partition support

2023-10-09 Thread Sourabh Badhya (Jira)
Sourabh Badhya created HIVE-27779:
-

 Summary: Iceberg: Drop partition support
 Key: HIVE-27779
 URL: https://issues.apache.org/jira/browse/HIVE-27779
 Project: Hive
  Issue Type: Improvement
Reporter: Sourabh Badhya
Assignee: Sourabh Badhya


DROP PARTITION is a logical extension of TRUNCATE PARTITION; however, it also 
allows expressions with >, <, >=, <=, != etc.
The syntax is as follows:
{code:java}
alter table tableName drop partition (c='US', d<'2');{code}
The DROP PARTITION command also allows multiple partition expressions.
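
As a rough illustration of how such a partition spec could select partitions, here is a minimal Python sketch. It is not Hive's or Iceberg's implementation. It assumes that the comma-separated expressions in one spec are combined with AND and that partition values are compared as strings; the names `OPS`, `matches`, and `partitions_to_drop` are hypothetical.

```python
import operator

# Comparison operators named in the ticket text.
OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(partition, predicates):
    """Return True when the partition satisfies every (column, op, value) predicate."""
    return all(OPS[op](partition[col], value) for col, op, value in predicates)


def partitions_to_drop(partitions, predicates):
    """Select the partitions one spec would cover (assumption: predicates are ANDed)."""
    return [p for p in partitions if matches(p, predicates)]


# Hypothetical partition list for a table partitioned by (c, d).
partitions = [
    {"c": "US", "d": "1"},
    {"c": "US", "d": "2"},
    {"c": "IN", "d": "1"},
]

# Models: alter table tableName drop partition (c='US', d<'2');
spec = [("c", "=", "US"), ("d", "<", "2")]
print(partitions_to_drop(partitions, spec))
```

With this sample data only the partition (c='US', d='1') is selected, since '2' < '2' is false and the third partition fails the check on c.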





[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-09 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773173#comment-17773173
 ] 

Stamatis Zampetakis commented on HIVE-27755:


The queries just above do not come from the validate command but from the test 
itself. 

The {{TestSchemaToolForMetastore.testValidateSequences}} case sends some 
[hardcoded DML 
commands|https://github.com/apache/hive/blob/20be17d19c19a1482dc3b1b753ce1f342b940e36/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/tools/schematool/TestSchemaToolForMetastore.java#L122]
 before/after validation. 

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database, identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> ticket aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally into reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.
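
To make the quoting idea concrete, here is a minimal Python sketch. It is not the HiveSchemaHelper code. It assumes backticks for MySQL/MariaDB and double quotes for Postgres, with an embedded quote character escaped by doubling it; the helper names and the sample query (including the NEXT_VAL column) are illustrative only.

```python
# Quote character per database type; this mapping is an assumption of the sketch.
QUOTE_CHARS = {
    "postgres": '"',
    "mysql": "`",
}


def quote(identifier, db_type):
    """Wrap an identifier in the database's quote character, doubling embedded quotes."""
    q = QUOTE_CHARS.get(db_type)
    if q is None:
        return identifier  # other databases: leave the identifier unquoted
    return q + identifier.replace(q, q + q) + q


def sequence_lookup_sql(db_type):
    """Build a hypothetical validation-style query against SEQUENCE_TABLE."""
    return "select {} from {}".format(
        quote("NEXT_VAL", db_type), quote("SEQUENCE_TABLE", db_type)
    )


print(sequence_lookup_sql("mysql"))
print(sequence_lookup_sql("postgres"))
```

Quoted this way, SEQUENCE_TABLE is parsed as an identifier even on a server that treats the bare word as a keyword.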





[jira] [Comment Edited] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773136#comment-17773136
 ] 

Zhihua Deng edited comment on HIVE-27755 at 10/9/23 6:49 AM:
-

In the TestSchemaToolForMetastore-validateSequences-after.txt,  do these come 
from the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}


was (Author: dengzh):
In the TestSchemaToolForMetastore-validateSequences-after.txt,  are these come 
from the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database, identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> ticket aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally into reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.





[jira] [Comment Edited] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773136#comment-17773136
 ] 

Zhihua Deng edited comment on HIVE-27755 at 10/9/23 6:48 AM:
-

In the TestSchemaToolForMetastore-validateSequences-after.txt,  are these come 
from the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}


was (Author: dengzh):
In the TestSchemaToolForMetastore-validateSequences-after.txt,  are these from 
the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database, identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> ticket aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally into reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.





[jira] [Commented] (HIVE-27755) Quote identifiers in SQL emitted by SchemaTool for MySQL

2023-10-09 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17773136#comment-17773136
 ] 

Zhihua Deng commented on HIVE-27755:


In the TestSchemaToolForMetastore-validateSequences-after.txt,  are these from 
the validate command?
{noformat}
2023-09-28T11:28:55.582079Z 9 Query delete from SEQUENCE_TABLE
2023-09-28T11:28:55.584104Z 9 Query insert into SEQUENCE_TABLE 
values('org.apache.hadoop.hive.metastore.model.MDatabase', 100){noformat}

> Quote identifiers in SQL emitted by SchemaTool for MySQL
> 
>
> Key: HIVE-27755
> URL: https://issues.apache.org/jira/browse/HIVE-27755
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-beta-1
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: TestMysql-upgrade-after.txt, 
> TestMysql-upgrade-before.txt, 
> TestSchemaToolForMetastore-validateSequences-after.txt, 
> TestSchemaToolForMetastore-validateSequences-before.txt, 
> TestSchemaToolForMetastore-validateTables-after.txt, 
> TestSchemaToolForMetastore-validateTables-before.txt
>
>
> Various SchemaTool options/tasks (e.g., "validate") generate and run SQL 
> statements on the underlying database. Depending on the database, identifiers 
> in the SQL statements may be quoted (see 
> [https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/HiveSchemaHelper.java#L173]).
> Currently, all identifiers are quoted when the database is Postgres and this 
> ticket aims to do the same for MySQL/MariaDB.
> The main motivation behind this change is to avoid surprises and 
> query failures when/if the database decides to turn some of the 
> tables/columns we are using internally into reserved keywords.
> As a concrete example, the Percona fork of MySQL recently turned 
> SEQUENCE_TABLE into a reserved keyword 
> ([https://docs.percona.com/percona-server/8.0/flexibility/sequence_table.html])
>  and this comes in conflict with our internal metastore table.
> The installation scripts do not fail since in that case SEQUENCE_TABLE is 
> quoted 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0-beta-2.mysql.sql#L447])
>  but validation queries emitted by the SchemaTool will fail 
> ([https://github.com/apache/hive/blob/2dbfbeefc1a73d6a50f1c829658846fc827fc780/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/schematool/SchemaToolTaskValidate.java#L117])
>  if we don't use quoted identifiers.





[jira] [Updated] (HIVE-27765) Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966

2023-10-09 Thread Aman Raj (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Raj updated HIVE-27765:

Summary: Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, 
HIVE-20300, HIVE-20312, HIVE-20044, HIVE-21966  (was: Backport of HIVE-20052: 
Arrow serde should fill ArrowColumnVector(Decimal) with the given schema 
precision/scale)

> Backport of HIVE-20052, HIVE-20093, HIVE-20203, HIVE-20290, HIVE-20300, 
> HIVE-20312, HIVE-20044, HIVE-21966
> --
>
> Key: HIVE-27765
> URL: https://issues.apache.org/jira/browse/HIVE-27765
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.2.0
>Reporter: Aman Raj
>Assignee: Aman Raj
>Priority: Major
>  Labels: pull-request-available
>



