[jira] [Assigned] (HIVE-27370) SUBSTR UDF return '?' against 4-bytes character

2023-08-01 Thread Ryu Kobayashi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryu Kobayashi reassigned HIVE-27370:


Assignee: Ryu Kobayashi  (was: Ryu Kobayashi)

> SUBSTR UDF return '?' against 4-bytes character
> ---
>
> Key: HIVE-27370
> URL: https://issues.apache.org/jira/browse/HIVE-27370
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Ryu Kobayashi
>Assignee: Ryu Kobayashi
>Priority: Major
>  Labels: pull-request-available
>
> SUBSTR doesn't seem to support 4-byte characters. This also happens on the 
> master branch. It does not occur in vectorized mode, so the problem is 
> specific to non-vectorized execution. An example is below:
> {code:java}
> -- vectorized mode
> create temporary table foo (str string) stored as orc;
> insert into foo values('安佐町大字久地字野𨵱4614番地'), ('あa🤎いiうu');
> SELECT
>   SUBSTR(str, 1, 10) as a1,
>   SUBSTR(str, 10, 3) as a2,
>   SUBSTR(str, -7) as a3,
>   substr(str, 1, 3) as b1,
>   substr(str, 3) as b2,
>   substr(str, -5) as b3
> from foo
> ;
> 安佐町大字久地字野𨵱  𨵱4614番地  安佐町       町大字久地字野𨵱4614番地     614番地
> あa🤎             あa🤎いiうu        あa🤎        🤎いiうu    🤎いiうu {code}
> {code:java}
> -- non-vectorized
> SELECT
>   SUBSTR('安佐町大字久地字野𨵱4614番地', 1, 10) as a1,
>   SUBSTR('安佐町大字久地字野𨵱4614番地', 10, 3) as a2,
>   SUBSTR('安佐町大字久地字野𨵱4614番地', -7) as a3,
>   substr('あa🤎いiうu', 1, 3) as b1,
>   substr('あa🤎いiうu', 3) as b2,
>   substr('あa🤎いiうu', -5) as b3
> ; 
> 安佐町大字久地字野?    �4   ?4614番地     あa?   �いiうu    ?いiうu{code}
>  
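The output above suggests the non-vectorized path slices strings by fixed-width units rather than Unicode code points, so the surrogate pairs that encode 4-byte UTF-8 characters (𨵱, 🤎) get split and render as '?'. A minimal codepoint-aware sketch of the expected behavior follows; it is illustrative only, not Hive's actual GenericUDFSubstr, and it uses 0-based indexing for brevity whereas SQL SUBSTR is 1-based:

```java
public class CodepointSubstr {
    // Substring by Unicode code points, so supplementary characters
    // (stored as surrogate pairs in Java Strings) are never split.
    static String substrByCodePoint(String s, int start, int len) {
        int total = s.codePointCount(0, s.length());
        if (start < 0) start = Math.max(total + start, 0); // negative start counts from the end
        if (start >= total || len <= 0) return "";
        int end = Math.min(start + len, total);
        return s.substring(s.offsetByCodePoints(0, start),
                           s.offsetByCodePoints(0, end));
    }

    public static void main(String[] args) {
        String v = "あa\uD83E\uDD0Eいiうu"; // "あa🤎いiうu"; 🤎 is a surrogate pair
        System.out.println(substrByCodePoint(v, 0, 3)); // counts 🤎 as one character
        // Naive char-based slicing cuts the pair in half, leaving a lone
        // high surrogate that later renders as '?':
        System.out.println(v.substring(0, 3));
    }
}
```

Vectorized mode presumably already walks UTF-8 byte sequences by their leading-byte widths, which would explain why it is unaffected.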



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-27556) Add Unit Test for KafkaStorageHandlerInfo

2023-08-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-27556:
--
Labels: pull-request-available  (was: )

> Add Unit Test for KafkaStorageHandlerInfo
> -
>
> Key: HIVE-27556
> URL: https://issues.apache.org/jira/browse/HIVE-27556
> Project: Hive
>  Issue Type: Test
>  Components: kafka integration, StorageHandler
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>  Labels: pull-request-available
>
> Adding unit tests for KafkaStorageHandlerInfo.





[jira] [Assigned] (HIVE-27556) Add Unit Test for KafkaStorageHandlerInfo

2023-08-01 Thread Kokila N (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kokila N reassigned HIVE-27556:
---

Assignee: Kokila N

> Add Unit Test for KafkaStorageHandlerInfo
> -
>
> Key: HIVE-27556
> URL: https://issues.apache.org/jira/browse/HIVE-27556
> Project: Hive
>  Issue Type: Test
>  Components: kafka integration, StorageHandler
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>






[jira] [Updated] (HIVE-27556) Add Unit Test for KafkaStorageHandlerInfo

2023-08-01 Thread Kokila N (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kokila N updated HIVE-27556:

Description: Adding unit tests for 

> Add Unit Test for KafkaStorageHandlerInfo
> -
>
> Key: HIVE-27556
> URL: https://issues.apache.org/jira/browse/HIVE-27556
> Project: Hive
>  Issue Type: Test
>  Components: kafka integration, StorageHandler
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>
> Adding unit tests for 





[jira] [Updated] (HIVE-27556) Add Unit Test for KafkaStorageHandlerInfo

2023-08-01 Thread Kokila N (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kokila N updated HIVE-27556:

Description: Adding unit tests for KafkaStorageHandlerInfo.  (was: Adding 
unit tests for )

> Add Unit Test for KafkaStorageHandlerInfo
> -
>
> Key: HIVE-27556
> URL: https://issues.apache.org/jira/browse/HIVE-27556
> Project: Hive
>  Issue Type: Test
>  Components: kafka integration, StorageHandler
>Reporter: Kokila N
>Assignee: Kokila N
>Priority: Major
>
> Adding unit tests for KafkaStorageHandlerInfo.





[jira] [Created] (HIVE-27556) Add Unit Test for KafkaStorageHandlerInfo

2023-08-01 Thread Kokila N (Jira)
Kokila N created HIVE-27556:
---

 Summary: Add Unit Test for KafkaStorageHandlerInfo
 Key: HIVE-27556
 URL: https://issues.apache.org/jira/browse/HIVE-27556
 Project: Hive
  Issue Type: Test
  Components: kafka integration, StorageHandler
Reporter: Kokila N








[jira] [Commented] (HIVE-27487) NPE in Hive JDBC storage handler

2023-08-01 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750056#comment-17750056
 ] 

Zhihua Deng commented on HIVE-27487:


When registering the remote Hive table fails, the given message makes it hard 
to identify the cause; this change propagates the detailed exception message to 
the client to help diagnose the problem.

Tested the PR on both secure (Kerberos-based) and unsecured HS2:
    a. An unsecured HS2 can't connect to a Kerberos-based HS2;
    b. A Kerberos-based HS2 can connect to both unsecured and Kerberos-based 
HS2.

With the PR, the remote table can be queried with no NPE.
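The gist of propagating the detail can be sketched as a small wrapper; the helper name and exception type here are illustrative, not the actual patch:

```java
public class RegistrationErrors {
    // Illustrative helper: surface the root-cause message when registering
    // a remote Hive table fails, instead of letting an opaque NPE escape.
    static RuntimeException wrapRegistrationFailure(String tableName, Exception cause) {
        return new IllegalStateException(
            "Failed to register remote Hive table '" + tableName + "': "
            + cause.getMessage(), cause); // chain the original as the cause
    }

    public static void main(String[] args) {
        // Hypothetical Kerberos failure for the table from the description:
        Exception e = wrapRegistrationFailure("sample_nightly",
            new RuntimeException("GSS initiate failed"));
        System.out.println(e.getMessage());
    }
}
```

Chaining the cause keeps the full stack trace in HS2 logs while the client sees a message it can act on.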

> NPE in Hive JDBC storage handler
> 
>
> Key: HIVE-27487
> URL: https://issues.apache.org/jira/browse/HIVE-27487
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC storage handler
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>
> A simple query against a Hive JDBC table: "select * from sample_nightly" 
> would fail due to:
> {noformat}
>  Caused by: java.lang.NullPointerException
>     at org.apache.hive.storage.jdbc.JdbcSerDe.deserialize(JdbcSerDe.java:168)
>     at 
> org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:623)
>     ... 21 more{noformat}





[jira] [Updated] (HIVE-27555) Kudu table storage handler doesn't update correctly in backend db

2023-08-01 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-27555:
---
Description: 
In HIVE-27457, we update the serde lib and (input/output) format of the Kudu 
table in the backend db. In the upgrade scripts, we join "SDS"."SD_ID" with 
"TABLE_PARAMS"."TBL_ID":

https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.mysql.sql#L37-L39

Since "SD_ID" is the primary key of SDS and "TBL_ID" is the primary key of 
TBLS, we can't join the two tables on these columns.

  was:
In HIVE-27457, we try to update the serde lib, (input/output)format of the kudu 
table in back db. In the upgrade scripts, we join the  "SDS"."SD_ID" with 
"TABLE_PARAMS"."TBL_ID", 

[https://github.com/zchovan/hive/blob/454b3000b74f8df8114895d60f4befbbd64d62ec/standalone-metastore/metastore-server/src/main/sql/derby/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.derby.sql#L36-L38]

as "SD_ID" is the primary key of SDS, and "TBL_ID" is the primary key of TBLS, 
we can't join the two tables using these two columns.
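Since SDS and TABLE_PARAMS share no key, a corrected script presumably has to route through TBLS, which carries both "SD_ID" and "TBL_ID". The following is a sketch of that shape only, assuming the standard metastore schema, not the actual patch:

```sql
-- Route SDS -> TBLS -> TABLE_PARAMS: TBLS.SD_ID matches SDS.SD_ID and
-- TBLS.TBL_ID matches TABLE_PARAMS.TBL_ID.
UPDATE "SDS"
   SET "INPUT_FORMAT" = 'org.apache.hadoop.hive.kudu.KuduInputFormat'
 WHERE "SD_ID" IN (
   SELECT "TBLS"."SD_ID"
     FROM "TBLS"
     JOIN "TABLE_PARAMS"
       ON "TABLE_PARAMS"."TBL_ID" = "TBLS"."TBL_ID"
    WHERE "TABLE_PARAMS"."PARAM_KEY" = 'storage_handler'
      AND "TABLE_PARAMS"."PARAM_VALUE" =
          'org.apache.hadoop.hive.kudu.KuduStorageHandler'
 );
```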


> Kudu table storage handler doesn't update correctly in backend db
> -
>
> Key: HIVE-27555
> URL: https://issues.apache.org/jira/browse/HIVE-27555
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhihua Deng
>Priority: Major
>
> In HIVE-27457, we update the serde lib and (input/output) format of the 
> Kudu table in the backend db. In the upgrade scripts, we join "SDS"."SD_ID" 
> with "TABLE_PARAMS"."TBL_ID": 
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.mysql.sql#L37-L39
> Since "SD_ID" is the primary key of SDS and "TBL_ID" is the primary key of 
> TBLS, we can't join the two tables on these columns.





[jira] [Commented] (HIVE-27195) Add database authorization for drop table command

2023-08-01 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749786#comment-17749786
 ] 

Stamatis Zampetakis commented on HIVE-27195:


Thanks for merging this [~ngangam]. In the future, please remember to credit 
contributors and reviewers in the commit message, since we mostly gather stats 
from there when inviting new committers/PMC members.

> Add database authorization for drop table command
> -
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Include authorization of the database object during the "drop table" command. 
> As with "create table", DB permissions should be verified for "drop table" 
> too. Add the database object, along with the table object, to the list of 
> output objects sent for privilege verification. This change ensures that for 
> a non-existent table or a temporary table (skipped from authorization after 
> HIVE-20051), the authorizer still verifies privileges on the database object.
> This also prevents DROP TABLE IF EXISTS failures for temporary or 
> non-existent tables with `RangerHiveAuthorizer`. For a temporary or 
> non-existent table, empty input and output HivePrivilegeObject lists are 
> sent to the Ranger authorizer, and after 
> https://issues.apache.org/jira/browse/RANGER-3407 the authorization request 
> is built from the command when the objects are empty. Hence, DROP TABLE IF 
> EXISTS fails with HiveAccessControlException.
> Steps to Repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: user [rtrivedi] does not have [DROP] privilege on 
> [test/temp_table] (state=42000,code=4) {code}





[jira] [Resolved] (HIVE-27195) Add database authorization for drop table command

2023-08-01 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-27195.
--
Fix Version/s: 4.0.0-beta-1
   Resolution: Fixed

The fix has been merged to master for the beta-1 release. Thank you for the patch.

> Add database authorization for drop table command
> -
>
> Key: HIVE-27195
> URL: https://issues.apache.org/jira/browse/HIVE-27195
> Project: Hive
>  Issue Type: Bug
>Reporter: Riju Trivedi
>Assignee: Riju Trivedi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Include authorization of the database object during the "drop table" command. 
> As with "create table", DB permissions should be verified for "drop table" 
> too. Add the database object, along with the table object, to the list of 
> output objects sent for privilege verification. This change ensures that for 
> a non-existent table or a temporary table (skipped from authorization after 
> HIVE-20051), the authorizer still verifies privileges on the database object.
> This also prevents DROP TABLE IF EXISTS failures for temporary or 
> non-existent tables with `RangerHiveAuthorizer`. For a temporary or 
> non-existent table, empty input and output HivePrivilegeObject lists are 
> sent to the Ranger authorizer, and after 
> https://issues.apache.org/jira/browse/RANGER-3407 the authorization request 
> is built from the command when the objects are empty. Hence, DROP TABLE IF 
> EXISTS fails with HiveAccessControlException.
> Steps to Repro:
> {code:java}
> use test; CREATE TEMPORARY TABLE temp_table (id int);
> drop table if exists test.temp_table;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: user [rtrivedi] does not have [DROP] privilege on 
> [test/temp_table] (state=42000,code=4) {code}





[jira] [Created] (HIVE-27555) Kudu table storage handler doesn't update correctly in backend db

2023-08-01 Thread Zhihua Deng (Jira)
Zhihua Deng created HIVE-27555:
--

 Summary: Kudu table storage handler doesn't update correctly in 
backend db
 Key: HIVE-27555
 URL: https://issues.apache.org/jira/browse/HIVE-27555
 Project: Hive
  Issue Type: Bug
Reporter: Zhihua Deng


In HIVE-27457, we update the serde lib and (input/output) format of the Kudu 
table in the backend db. In the upgrade scripts, we join "SDS"."SD_ID" with 
"TABLE_PARAMS"."TBL_ID":

[https://github.com/zchovan/hive/blob/454b3000b74f8df8114895d60f4befbbd64d62ec/standalone-metastore/metastore-server/src/main/sql/derby/upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.derby.sql#L36-L38]

Since "SD_ID" is the primary key of SDS and "TBL_ID" is the primary key of 
TBLS, we can't join the two tables on these columns.





[jira] [Resolved] (HIVE-27445) Improve SSO handling in HiveJdbcBrowserClient

2023-08-01 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-27445.

Resolution: Duplicate

> Improve SSO handling in HiveJdbcBrowserClient
> -
>
> Key: HIVE-27445
> URL: https://issues.apache.org/jira/browse/HIVE-27445
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Reporter: Stamatis Zampetakis
>Priority: Major
>






[jira] [Updated] (HIVE-27554) Validate URL used by SSO workflow for JDBC connection

2023-08-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HIVE-27554:

Fix Version/s: (was: 4.0.0)

> Validate URL used by SSO workflow for JDBC connection
> -
>
> Key: HIVE-27554
> URL: https://issues.apache.org/jira/browse/HIVE-27554
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 3.1.3
>Reporter: Henri Biestro
>Assignee: Henri Biestro
>Priority: Minor
>  Labels: pull-request-available
>
> Add a validation to ensure the URL used during SSO workflow is proper 
> (http/https).





[jira] [Commented] (HIVE-27553) After upgrading from Hive1 to Hive3, Decimal computation experiences a loss of precision

2023-08-01 Thread zhangbutao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749598#comment-17749598
 ] 

zhangbutao commented on HIVE-27553:
---

This issue was caused by HIVE-15331, which emulated SQL Server's decimal 
behavior. 
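Under those rules, multiplying decimal(p1,s1) by decimal(p2,s2) yields precision p1+p2+1 and scale s1+s2, and when the precision overflows 38 the excess is taken out of the scale (keeping at least 6 fractional digits). A sketch of that derivation, based on my reading of Hive's HiveDecimalUtils.adjustPrecScale; treat the exact constants as assumptions:

```java
public class DecimalMultiplyType {
    // Result type of decimal(p1,s1) * decimal(p2,s2) under the
    // SQL Server-style rules adopted in HIVE-15331 (sketch).
    static int[] multiplyType(int p1, int s1, int p2, int s2) {
        int prec = p1 + p2 + 1;    // worst-case integer + fractional digits
        int scale = s1 + s2;
        if (prec > 38) {           // overflow is absorbed by the scale,
            int delta = prec - 38; // but at least 6 fractional digits survive
            scale = Math.max(scale - delta, Math.min(scale, 6));
            prec = 38;
        }
        return new int[]{prec, scale};
    }

    public static void main(String[] args) {
        // decimal(38,8) * decimal(38,8) -> decimal(38,6): only 6 fractional
        // digits remain, whereas Hive 1.x kept scale s1+s2 = 16.
        int[] t = multiplyType(38, 8, 38, 8);
        System.out.println(t[0] + "," + t[1]);
    }
}
```

This would explain the repro: 0.8 * 0.00015 comes back at scale 6 on Hive 3 instead of the wider scale Hive 1 produced.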

> After upgrading from Hive1 to Hive3, Decimal computation experiences a loss 
> of precision
> 
>
> Key: HIVE-27553
> URL: https://issues.apache.org/jira/browse/HIVE-27553
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.3
>Reporter: ZhengBowen
>Priority: Major
> Attachments: image-2023-07-31-20-40-00-679.png, 
> image-2023-07-31-20-40-35-050.png, image-2023-07-31-20-43-05-379.png, 
> image-2023-07-31-20-43-49-775.png
>
>
> I can reproduce this bug.
> {code:java}
> create table decimal_test (
>   id int,
>   quantity decimal(38,8),
>   cost decimal(38,8)
> ) stored as textfile;
> 
> insert into decimal_test values (1, 0.8000, 0.00015000);
> 
> select quantity * cost from decimal_test;
> {code}
> *1. The following are the execution results and execution plan on Hive 1.0.1:*
> !image-2023-07-31-20-40-00-679.png|width=550,height=230!
> !image-2023-07-31-20-43-05-379.png|width=540,height=144!
> *2. The following are the execution results and execution plan on Hive 3.1.3:*
> !image-2023-07-31-20-40-35-050.png|width=538,height=257!
> !image-2023-07-31-20-43-49-775.png|width=533,height=142!





[jira] [Resolved] (HIVE-27050) Iceberg: MOR: Restrict reducer extrapolation to contain number of small files being created

2023-08-01 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-27050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko resolved HIVE-27050.
---
Fix Version/s: 4.0.0-beta-1
   Resolution: Fixed

> Iceberg: MOR: Restrict reducer extrapolation to contain number of small files 
> being created
> ---
>
> Key: HIVE-27050
> URL: https://issues.apache.org/jira/browse/HIVE-27050
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-beta-1
>
>
> Scenario:
>  # Create a simple table in Iceberg (MOR mode), e.g. store_sales_delete_1
>  # Insert some data into it.
>  # Run an update statement as follows
>  ## "update store_sales_delete_1 set ss_sold_time_sk=699060 where 
> ss_sold_time_sk=69906"
> Hive estimates the number of reducers as "1", but 
> "hive.tez.max.partition.factor", which defaults to "2.0", doubles the 
> number of reducers.
> To put it in perspective, this creates very small positional delete files 
> spread across different reducers, which causes problems during reading, 
> since all of the files must be opened.
>  
>  # When Iceberg MOR tables are involved in updates/deletes/merges, disable 
> "hive.tez.max.partition.factor", or set it to "1.0" irrespective of the user 
> setting;
>  # Add explicit logs for easier debugging; the user shouldn't be confused 
> about why the setting is not taking effect.
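Until the automatic handling lands, the mitigation in item 1 above can be applied per-session with the existing property, using the table from the scenario:

```sql
-- Cap reducer extrapolation so the MOR update doesn't fan tiny positional
-- delete files out across doubled reducers:
SET hive.tez.max.partition.factor=1.0;

update store_sales_delete_1 set ss_sold_time_sk=699060 where ss_sold_time_sk=69906;
```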





[jira] [Commented] (HIVE-27050) Iceberg: MOR: Restrict reducer extrapolation to contain number of small files being created

2023-08-01 Thread Denys Kuzmenko (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-27050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17749559#comment-17749559
 ] 

Denys Kuzmenko commented on HIVE-27050:
---

Merged to master.
Thanks [~difin] for the patch and [~okumin] for the review!

> Iceberg: MOR: Restrict reducer extrapolation to contain number of small files 
> being created
> ---
>
> Key: HIVE-27050
> URL: https://issues.apache.org/jira/browse/HIVE-27050
> Project: Hive
>  Issue Type: Improvement
>  Components: Iceberg integration
>Reporter: Rajesh Balamohan
>Assignee: Dmitriy Fingerman
>Priority: Major
>  Labels: pull-request-available
>
> Scenario:
>  # Create a simple table in Iceberg (MOR mode), e.g. store_sales_delete_1
>  # Insert some data into it.
>  # Run an update statement as follows
>  ## "update store_sales_delete_1 set ss_sold_time_sk=699060 where 
> ss_sold_time_sk=69906"
> Hive estimates the number of reducers as "1", but 
> "hive.tez.max.partition.factor", which defaults to "2.0", doubles the 
> number of reducers.
> To put it in perspective, this creates very small positional delete files 
> spread across different reducers, which causes problems during reading, 
> since all of the files must be opened.
>  
>  # When Iceberg MOR tables are involved in updates/deletes/merges, disable 
> "hive.tez.max.partition.factor", or set it to "1.0" irrespective of the user 
> setting;
>  # Add explicit logs for easier debugging; the user shouldn't be confused 
> about why the setting is not taking effect.


