[jira] [Work logged] (HIVE-25194) Add support for STORED AS ORC/PARQUET/AVRO for Iceberg

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25194?focusedWorklogId=607651&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607651
 ]

ASF GitHub Bot logged work on HIVE-25194:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 06:57
Start Date: 07/Jun/21 06:57
Worklog Time Spent: 10m 
  Work Description: lcspinter commented on a change in pull request #2348:
URL: https://github.com/apache/hive/pull/2348#discussion_r646317509



##
File path: 
iceberg/iceberg-handler/src/test/queries/negative/create_iceberg_table_stored_as_with_serdeproperties_failure.q
##
@@ -0,0 +1,3 @@
+set hive.vectorized.execution.enabled=false;

Review comment:
   I've added one to the HBase qtests. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607651)
Time Spent: 2h 10m  (was: 2h)

> Add support for STORED AS ORC/PARQUET/AVRO for Iceberg
> --
>
> Key: HIVE-25194
> URL: https://issues.apache.org/jira/browse/HIVE-25194
> Project: Hive
>  Issue Type: New Feature
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently we have to specify the file format in TBLPROPERTIES when issuing 
> Iceberg CREATE TABLE statements.
> The ideal syntax would be:
> CREATE TABLE tbl STORED BY ICEBERG STORED AS ORC ...
> One complication is that STORED BY and STORED AS are currently not permitted 
> within the same query, so the grammar needs to be amended.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25150) Tab characters are not removed before decimal conversion similar to space character which is fixed as part of HIVE-24378

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25150?focusedWorklogId=607629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607629
 ]

ASF GitHub Bot logged work on HIVE-25150:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 05:43
Start Date: 07/Jun/21 05:43
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #2308:
URL: https://github.com/apache/hive/pull/2308#issuecomment-855597524


   LGTM +1




Issue Time Tracking
---

Worklog Id: (was: 607629)
Time Spent: 1h  (was: 50m)

> Tab characters are not removed before decimal conversion similar to space 
> character which is fixed as part of HIVE-24378
> 
>
> Key: HIVE-25150
> URL: https://issues.apache.org/jira/browse/HIVE-25150
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Taraka Rama Rao Lethavadla
>Assignee: Taraka Rama Rao Lethavadla
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Test case: 
>  column values containing space and tab characters 
> {noformat}
> bash-4.2$ cat data/files/test_dec_space.csv
> 1,0
> 2, 1
> 3,2{noformat}
> {noformat}
> create external table test_dec_space (id int, value decimal) ROW FORMAT 
> DELIMITED
>  FIELDS TERMINATED BY ',' location '/tmp/test_dec_space';
> {noformat}
> The output of select * from test_dec_space would be:
> {noformat}
> 1 0
> 2 1
> 3 NULL{noformat}
> The behaviour in MySQL when there are tab and space characters in decimal values:
> {noformat}
> bash-4.2$ cat /tmp/insert.csv 
> "1","aa",11.88
> "2","bb", 99.88
> "4","dd", 209.88{noformat}
>  
> {noformat}
> MariaDB [test]> load data local infile '/tmp/insert.csv' into table t2 fields 
> terminated by ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';
>  Query OK, 3 rows affected, 3 warnings (0.00 sec) 
>  Records: 3 Deleted: 0 Skipped: 0 Warnings: 3
> MariaDB [test]> select * from t2;
> +--+--+---+
> | id   | name | score |
> +--+--+---+
> | 1| aa   |12 |
> | 2| bb   |   100 |
> | 4| dd   |   210 |
> +--+--+---+
>  3 rows in set (0.00 sec)
> {noformat}
> So in Hive we can likewise make this work by skipping the tab character before 
> decimal conversion.
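The fix described in the issue (treating the tab character like the already-handled space character) can be sketched in Python. This is only an illustrative model of the conversion behaviour, not Hive's actual decimal converter; `parse_decimal` is a hypothetical helper:

```python
from decimal import Decimal, InvalidOperation

def parse_decimal(raw):
    """Convert a CSV field to a decimal, stripping leading/trailing
    spaces and tabs first; unparseable values become None (SQL NULL)."""
    try:
        return Decimal(raw.strip())  # str.strip() removes spaces and tabs
    except InvalidOperation:
        return None

# The three values from the test case above; the third carries a tab.
print([parse_decimal(v) for v in ["0", " 1", "\t2"]])
# → [Decimal('0'), Decimal('1'), Decimal('2')] — no NULL for the tab row
```

With strict parsing that rejects surrounding whitespace, the tab-prefixed value would come back as NULL, which is exactly the behaviour the issue reports.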





[jira] [Resolved] (HIVE-24284) NPE when parsing druid logs using Hive

2021-06-06 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-24284.

Resolution: Fixed

> NPE when parsing druid logs using Hive
> --
>
> Key: HIVE-24284
> URL: https://issues.apache.org/jira/browse/HIVE-24284
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The current syslog parser always expects a valid proc id. But as per RFC 3164 
> and RFC 5424, the proc id can be omitted, so Hive should handle this by using 
> the NILVALUE/empty string in case the proc id is null.
>  
> {code:java}
> Caused by: java.lang.NullPointerException: null
> at java.lang.String.<init>(String.java:566)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogParser.createEvent(SyslogParser.java:361)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogParser.readEvent(SyslogParser.java:326)
> at 
> org.apache.hadoop.hive.ql.log.syslog.SyslogSerDe.deserialize(SyslogSerDe.java:95)
>  {code}
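The proposed handling can be sketched as follows. RFC 5424 uses "-" as the NILVALUE for an absent PROCID field; mapping that, or a missing field, to an empty string means downstream string construction never receives null. This is a hypothetical Python helper illustrating the idea, not the actual SyslogParser code:

```python
NILVALUE = "-"  # RFC 5424's placeholder for an absent header field

def normalize_procid(field):
    """Map an absent or NILVALUE proc id to an empty string so that
    downstream string construction never receives None/null."""
    if field is None or field == NILVALUE:
        return ""
    return field

print(normalize_procid("1234"))  # → 1234
print(normalize_procid("-"))     # → (empty string)
```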





[jira] [Updated] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-25206:
--
Attachment: HIVE-25206.1.patch
Status: Patch Available  (was: Open)

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-25206.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
> {code}
> Some metadata tables in hive-schema-4.0.0.mysql.sql don't have a primary key, 
> e.g. *TXN_COMPONENTS* and *COMPLETED_TXN_COMPONENTS*. This will cause an 
> exception when the backend MySQL sets strict parameters such as *global 
> pxc_strict_mode='ENFORCING'*.
> {code:java}
> Caused by: org.apache.hadoop.hive.metastore.api.MetaException: Unable to 
> clean up java.sql.SQLException: Percona-XtraDB-Cluster prohibits use of DML 
> command on a table (hive4s.txn_components) without an explicit primary key 
> with pxc_strict_mode = ENFORCING or MASTER
> at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1078)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4187)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4119)
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2570)
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2731)
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2814)
> at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1813)
> at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1727)
> at 
> org.apache.hive.com.zaxxer.hikari.pool.ProxyStatement.executeUpdate(ProxyStatement.java:117)
> at 
> org.apache.hive.com.zaxxer.hikari.pool.HikariProxyStatement.executeUpdate(HikariProxyStatement.java)
> at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:3962)
> at 
> org.apache.hadoop.hive.metastore.AcidEventListener.onDropDatabase(AcidEventListener.java:58)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$23.notify(MetaStoreListenerNotifier.java:94)
> at 
> org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:305)
> at 
> org.apache.hadoop.hive.metastore.HMSHandler.drop_database_core(HMSHandler.java:1893)
> at 
> org.apache.hadoop.hive.metastore.HMSHandler.drop_database(HMSHandler.java:1954)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> at com.sun.proxy.$Proxy28.drop_database(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_database.getResult(ThriftHiveMetastore.java:17577)
> at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$drop_database.getResult(ThriftHiveMetastore.java:17556)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:313)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
>  





[jira] [Updated] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-25206:
--
Fix Version/s: 4.0.0

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25206?focusedWorklogId=607615&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607615
 ]

ASF GitHub Bot logged work on HIVE-25206:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 04:29
Start Date: 07/Jun/21 04:29
Worklog Time Spent: 10m 
  Work Description: butaozhang opened a new pull request #2355:
URL: https://github.com/apache/hive/pull/2355


   
   ### What changes were proposed in this pull request?
   Add primary key for partial metadata script
   
   
   ### Why are the changes needed?
   Refer to  HIVE-25206
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Local test
   




Issue Time Tracking
---

Worklog Id: (was: 607615)
Remaining Estimate: 0h
Time Spent: 10m

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>

[jira] [Updated] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25206:
--
Labels: pull-request-available  (was: )

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Commented] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread zhangbutao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358347#comment-17358347
 ] 

zhangbutao commented on HIVE-25206:
---

We should add a primary key for all metadata tables in 
hive-schema-4.0.0.mysql.sql. The metadata tables without a primary key are as 
follows:
{code:java}
MV_TABLES_USED
TXN_COMPONENTS
COMPLETED_TXN_COMPONENTS
TXN_LOCK_TBL
NEXT_LOCK_ID
NEXT_COMPACTION_QUEUE_ID
WRITE_SET
TXN_TO_WRITE_ID
NEXT_WRITE_ID
{code}
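One way to find such tables is to scan the schema script for CREATE TABLE statements that declare no PRIMARY KEY. The sketch below is a hypothetical helper; the regex is tailored to the simplified DDL fragment shown here, not to every statement in the real hive-schema-4.0.0.mysql.sql:

```python
import re

# Simplified fragment in the style of hive-schema-4.0.0.mysql.sql
DDL = """
CREATE TABLE `TXN_COMPONENTS` (
  `TC_TXNID` bigint NOT NULL
) ENGINE=InnoDB;
CREATE TABLE `TXNS` (
  `TXN_ID` bigint NOT NULL,
  PRIMARY KEY (`TXN_ID`)
) ENGINE=InnoDB;
"""

def tables_without_pk(ddl):
    """Return the names of CREATE TABLE statements lacking a PRIMARY KEY."""
    missing = []
    for match in re.finditer(r"CREATE TABLE `(\w+)` \((.*?)\) ENGINE", ddl, re.S):
        name, body = match.groups()
        if "PRIMARY KEY" not in body:
            missing.append(name)
    return missing

print(tables_without_pk(DDL))  # → ['TXN_COMPONENTS']
```

Against a live database, the same check could instead join information_schema.tables with information_schema.table_constraints on constraint_type = 'PRIMARY KEY'.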

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>




[jira] [Assigned] (HIVE-25206) Add primary key for partial metadata script

2021-06-06 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao reassigned HIVE-25206:
-

Assignee: zhangbutao

> Add primary key for partial metadata script
> ---
>
> Key: HIVE-25206
> URL: https://issues.apache.org/jira/browse/HIVE-25206
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25205) Reduce overhead of adding write notification log during batch loading of partition.

2021-06-06 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25205:
---
Labels: performance  (was: )

> Reduce overhead of adding write notification log during batch loading of 
> partition.
> ---
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> During batch loading of partitions, a write notification log entry is added 
> for each partition. This causes delays in execution, as a separate call to 
> HMS is made for each partition. This can be optimised by adding a new HMS 
> API that accepts a batch of partitions, so that the whole batch can be added 
> to the backend database together. Once the HMS side has a batch of 
> notification log entries, the code can be optimised to add them with a 
> single call to the backend RDBMS.
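The batching idea above can be sketched as follows. This is a minimal illustration under assumed names (`NotificationBatcher` and its methods are hypothetical, not Hive/HMS APIs): entries are accumulated and flushed in fixed-size batches, so N partitions cost roughly N/batchSize backend round-trips instead of N.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batching write notification log entries; the class
// and method names are illustrative, not actual Hive/HMS APIs.
public class NotificationBatcher {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private int flushes = 0; // number of round-trips to the backend RDBMS

    public NotificationBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    public void add(String partitionEvent) {
        pending.add(partitionEvent);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    public void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // A real implementation would issue one multi-row INSERT here,
        // e.g. via JDBC addBatch()/executeBatch().
        flushes++;
        pending.clear();
    }

    public int getFlushCount() {
        return flushes;
    }

    public static void main(String[] args) {
        NotificationBatcher b = new NotificationBatcher(100);
        for (int i = 0; i < 250; i++) {
            b.add("partition-" + i);
        }
        b.flush(); // flush the remainder
        System.out.println(b.getFlushCount()); // prints 3 instead of 250 calls
    }
}
```

Loading 250 partitions with a batch size of 100 here costs 3 backend calls rather than 250.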





[jira] [Updated] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics

2021-06-06 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25204:
---
Labels: performance  (was: )

> Reduce overhead of adding notification log for update partition column 
> statistics
> -
>
> Key: HIVE-25204
> URL: https://issues.apache.org/jira/browse/HIVE-25204
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: performance
>
> The notification logs for partition column statistics can be optimised by 
> adding them in a batch. In the current implementation they are added one by 
> one, causing multiple SQL executions in the backend RDBMS. These SQL 
> executions can be batched to reduce the execution time.





[jira] [Assigned] (HIVE-25205) Reduce overhead of adding write notification log during batch loading of partition.

2021-06-06 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-25205:
--


> Reduce overhead of adding write notification log during batch loading of 
> partition.
> ---
>
> Key: HIVE-25205
> URL: https://issues.apache.org/jira/browse/HIVE-25205
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> During batch loading of partitions, a write notification log entry is added 
> for each partition. This causes delays in execution, as a separate call to 
> HMS is made for each partition. This can be optimised by adding a new HMS 
> API that accepts a batch of partitions, so that the whole batch can be added 
> to the backend database together. Once the HMS side has a batch of 
> notification log entries, the code can be optimised to add them with a 
> single call to the backend RDBMS.





[jira] [Assigned] (HIVE-25204) Reduce overhead of adding notification log for update partition column statistics

2021-06-06 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-25204:
--


> Reduce overhead of adding notification log for update partition column 
> statistics
> -
>
> Key: HIVE-25204
> URL: https://issues.apache.org/jira/browse/HIVE-25204
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> The notification logs for partition column statistics can be optimised by 
> adding them in a batch. In the current implementation they are added one by 
> one, causing multiple SQL executions in the backend RDBMS. These SQL 
> executions can be batched to reduce the execution time.





[jira] [Work logged] (HIVE-24390) Spelling

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24390?focusedWorklogId=607580&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607580
 ]

ASF GitHub Bot logged work on HIVE-24390:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:25
Start Date: 07/Jun/21 00:25
Worklog Time Spent: 10m 
  Work Description: jsoref commented on pull request #1674:
URL: https://github.com/apache/hive/pull/1674#issuecomment-855492358


   I'm not sure what to do about this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607580)
Time Spent: 1h  (was: 50m)

> Spelling
> 
>
> Key: HIVE-24390
> URL: https://issues.apache.org/jira/browse/HIVE-24390
> Project: Hive
>  Issue Type: Bug
>Reporter: Josh Soref
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24463) Add special case for Derby and MySQL in Get Next ID DbNotificationListener

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24463?focusedWorklogId=607576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607576
 ]

ASF GitHub Bot logged work on HIVE-24463:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1727:
URL: https://github.com/apache/hive/pull/1727#issuecomment-855489373


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607576)
Time Spent: 50m  (was: 40m)

> Add special case for Derby and MySQL in Get Next ID DbNotificationListener
> --
>
> Key: HIVE-24463
> URL: https://issues.apache.org/jira/browse/HIVE-24463
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> * Derby does not support {{SELECT FOR UPDATE}} statements
>  * MySQL can be optimized to use {{LAST_INSERT_ID()}}
>  
> Derby tables are locked in other parts of the code already, but not in this 
> path.
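One way to express the special-casing described above is to pick the next-id SQL per database product. This is a hedged sketch: the method name is hypothetical, and the table/column names are illustrative stand-ins for the metastore's notification sequence table; `LAST_INSERT_ID()` is a real MySQL function, and Derby's lack of `SELECT ... FOR UPDATE` is the reason for its branch.

```java
// Hypothetical sketch of per-backend SQL selection for fetching the next
// notification id; strings and names are illustrative, not Hive's actual code.
public class NextIdSql {
    public static String forProduct(String dbProduct) {
        switch (dbProduct) {
            case "DERBY":
                // Derby does not support SELECT ... FOR UPDATE, so rely on a
                // table lock taken elsewhere and use a plain SELECT.
                return "SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE";
            case "MYSQL":
                // MySQL can bump and read the id in one statement via
                // LAST_INSERT_ID(expr), avoiding a separate locking SELECT.
                return "UPDATE NOTIFICATION_SEQUENCE "
                     + "SET NEXT_EVENT_ID = LAST_INSERT_ID(NEXT_EVENT_ID) + 1";
            default:
                // Other backends take a row lock explicitly.
                return "SELECT NEXT_EVENT_ID FROM NOTIFICATION_SEQUENCE FOR UPDATE";
        }
    }

    public static void main(String[] args) {
        // Derby's statement must not contain FOR UPDATE.
        System.out.println(forProduct("DERBY").contains("FOR UPDATE")); // false
    }
}
```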





[jira] [Work logged] (HIVE-24804) Introduce check: RANGE with offset PRECEDING/FOLLOWING requires exactly one ORDER BY column

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24804?focusedWorklogId=607574&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607574
 ]

ASF GitHub Bot logged work on HIVE-24804:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2000:
URL: https://github.com/apache/hive/pull/2000#issuecomment-855489248


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607574)
Time Spent: 20m  (was: 10m)

> Introduce check: RANGE with offset PRECEDING/FOLLOWING requires exactly one 
> ORDER BY column
> ---
>
> Key: HIVE-24804
> URL: https://issues.apache.org/jira/browse/HIVE-24804
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, in Hive, we can run a windowing function with range specification 
> but without an ORDER BY clause:
> {code}
> create table vector_ptf_part_simple_text(p_mfgr string, p_name string, 
> p_retailprice double, rowindex string);
> select p_mfgr, p_name, rowindex,
> count(*) over(partition by p_mfgr range between 1 preceding and current row) 
> as cs1,
> count(*) over(partition by p_mfgr range between 3 preceding and current row) 
> as cs2
> from vector_ptf_part_simple_text;
> {code}
> This is confusing, because without an ORDER BY clause the range is out of 
> context: we don't know which column the range should be calculated over.
> Tested on Postgres, this throws an exception:
> {code}
> create table vector_ptf_part_simple_text(p_mfgr varchar(10), p_name 
> varchar(10), p_retailprice integer, rowindex varchar(10));
> select p_mfgr, p_name, rowindex,
> count(*) over(partition by p_mfgr range between 1 preceding and current row) 
> as cs1,
> count(*) over(partition by p_mfgr range between 3 preceding and current row) 
> as cs2
> from vector_ptf_part_simple_text;
> *RANGE with offset PRECEDING/FOLLOWING requires exactly one ORDER BY column*
> {code}
> further references:
> https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts
> {code}
> RANGE: Computes the window frame based on a logical range of rows around the 
> current row, based on the current row’s ORDER BY key value. The provided 
> range value is added or subtracted to the current row's key value to define a 
> starting or ending range boundary for the window frame. In a range-based 
> window frame, there must be exactly one expression in the ORDER BY clause, 
> and the expression must have a numeric type.
> {code}
> https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/window-functions-frames.html
> {code}
> Without ORDER BY: The default frame includes all partition rows (because, 
> without ORDER BY, all partition rows are peers). The default is equivalent to 
> this frame specification:
> RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
> {code}
> I believe this could only make sense if no range is specified; otherwise 
> the SQL statement expresses something different from what the engine 
> returns.





[jira] [Work logged] (HIVE-23785) Databases, Catalogs and Partitions should have unique id

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23785?focusedWorklogId=607570&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607570
 ]

ASF GitHub Bot logged work on HIVE-23785:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1935:
URL: https://github.com/apache/hive/pull/1935#issuecomment-855489326


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607570)
Time Spent: 0.5h  (was: 20m)

> Databases, Catalogs and Partitions should have unique id
> 
>
> Key: HIVE-23785
> URL: https://issues.apache.org/jira/browse/HIVE-23785
> Project: Hive
>  Issue Type: Improvement
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-20556 introduced an id field to the Table object. This is useful 
> information, since a table which is dropped and recreated with the same name 
> will have a different id. If an HMS client is caching such a table object, 
> the id can be used to determine whether the table present on the client side 
> matches the one in the HMS.
> We can expand this idea to other HMS objects like Databases, Catalogs and 
> Partitions, and add a new id field.
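The id-based cache validation the description relies on can be sketched like this. All names here (`IdValidatedCache`, `getIfCurrent`) are hypothetical illustrations, not HMS client APIs: the cached object is trusted only while its id matches the id the server currently reports for that name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of id-validated caching; names are illustrative only.
public class IdValidatedCache {
    private static final class Entry {
        final long id;
        final String payload;
        Entry(long id, String payload) { this.id = id; this.payload = payload; }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    public void put(String name, long id, String payload) {
        cache.put(name, new Entry(id, payload));
    }

    // Drop-and-recreate under the same name yields a new id, so stale
    // entries are detected even though the name still matches.
    public String getIfCurrent(String name, long currentId) {
        Entry e = cache.get(name);
        return (e != null && e.id == currentId) ? e.payload : null;
    }

    public static void main(String[] args) {
        IdValidatedCache c = new IdValidatedCache();
        c.put("db1", 42L, "cached-db1-metadata");
        System.out.println(c.getIfCurrent("db1", 42L)); // cached-db1-metadata
        System.out.println(c.getIfCurrent("db1", 43L)); // null: db1 was recreated
    }
}
```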





[jira] [Work logged] (HIVE-24707) Apply Sane Default for Tez Containers as Last Resort

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24707?focusedWorklogId=607566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607566
 ]

ASF GitHub Bot logged work on HIVE-24707:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1933:
URL: https://github.com/apache/hive/pull/1933#issuecomment-855489341


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607566)
Time Spent: 3h 50m  (was: 3h 40m)

> Apply Sane Default for Tez Containers as Last Resort
> 
>
> Key: HIVE-24707
> URL: https://issues.apache.org/jira/browse/HIVE-24707
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: Panagiotis Garefalakis
>Priority: Trivial
>  Labels: pull-request-available
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> {code:java|title=DagUtils.java}
> public static Resource getContainerResource(Configuration conf) {
> int memory = HiveConf.getIntVar(conf, 
> HiveConf.ConfVars.HIVETEZCONTAINERSIZE) > 0 ?
>   HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCONTAINERSIZE) :
>   conf.getInt(MRJobConfig.MAP_MEMORY_MB, 
> MRJobConfig.DEFAULT_MAP_MEMORY_MB);
> int cpus = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCPUVCORES) > 
> 0 ?
>   HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVETEZCPUVCORES) :
>   conf.getInt(MRJobConfig.MAP_CPU_VCORES, 
> MRJobConfig.DEFAULT_MAP_CPU_VCORES);
> return Resource.newInstance(memory, cpus);
>   }
> {code}
> If the Tez container size or vcores is an invalid value (<= 0), the code 
> falls back onto the MapReduce configurations; but if the MapReduce 
> configurations also have invalid values (<= 0), they are accepted 
> regardless, and this will cause failures down the road.
> This code should also check the MapReduce values and fall back to the 
> MapReduce default values if they are <= 0.
> Also, some logging would be nice here, reporting where the configuration 
> values came from.
>  
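The suggested fix reduces to a "first positive value wins" chain ending at a hard-coded sane default. A minimal sketch, assuming the hard defaults (1024 MB / 1 vcore are the usual `MRJobConfig` defaults, but treat them here as stand-ins):

```java
// Sketch of the proposed fallback chain: validate each configured value and
// fall through to the next source, ending at a sane hard-coded default.
public class ContainerResourceDefaults {
    // Assumed defaults, standing in for MRJobConfig.DEFAULT_MAP_MEMORY_MB
    // and MRJobConfig.DEFAULT_MAP_CPU_VCORES.
    static final int DEFAULT_MEMORY_MB = 1024;
    static final int DEFAULT_VCORES = 1;

    // Return the first value that is > 0, or the hard default as last resort.
    public static int firstPositive(int hiveValue, int mrValue, int hardDefault) {
        if (hiveValue > 0) {
            return hiveValue;
        }
        if (mrValue > 0) {
            return mrValue; // MapReduce config is valid, use it as fallback
        }
        return hardDefault; // both sources invalid: never return <= 0
    }

    public static void main(String[] args) {
        // Both Hive and MapReduce configs invalid -> sane default wins.
        System.out.println(firstPositive(-1, 0, DEFAULT_MEMORY_MB)); // prints 1024
        // Hive config invalid, MapReduce config valid -> MapReduce value used.
        System.out.println(firstPositive(0, 2048, DEFAULT_MEMORY_MB)); // prints 2048
    }
}
```

The same helper would be applied to both the memory and the vcores lookup, and a log line at each fallback step would cover the logging request above.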





[jira] [Work logged] (HIVE-24515) Analyze table job can be skipped when stats populated are already accurate

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24515?focusedWorklogId=607564&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607564
 ]

ASF GitHub Bot logged work on HIVE-24515:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1834:
URL: https://github.com/apache/hive/pull/1834#issuecomment-855489364


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607564)
Time Spent: 2h 50m  (was: 2h 40m)

> Analyze table job can be skipped when stats populated are already accurate
> --
>
> Key: HIVE-24515
> URL: https://issues.apache.org/jira/browse/HIVE-24515
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> For non-partitioned tables, stats detail should be present at the table 
> level, 
> e.g
> {noformat}
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"d_current_day":"true"...
>  }}
>   {noformat}
> For partitioned tables, stats detail should be present at the partition 
> level, 
> {noformat}
> store_sales(ss_sold_date_sk=2451819)
> {totalSize=0, numRows=0, rawDataSize=0, 
> COLUMN_STATS_ACCURATE={"BASIC_STATS":"true","COLUMN_STATS":{"ss_addr_sk":"true"}}
>  
>  {noformat}
> When stats populated are already accurate, {{analyze table tn compute 
> statistics for columns}} should skip launching the job.
>  
> For ACID tables, stats are auto computed and it can skip computing stats 
> again when stats are accurate.
>  
>  
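The skip decision described above amounts to inspecting the `COLUMN_STATS_ACCURATE` table/partition parameter. A loose sketch, with an assumed helper name and a deliberately naive string check (real code would parse the JSON value properly):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (assumed helper name): decide whether an ANALYZE job can be skipped
// by inspecting the COLUMN_STATS_ACCURATE parameter shown in the examples.
public class StatsCheck {
    // Naive containment check for illustration; a real implementation would
    // deserialize the JSON and look the column up in COLUMN_STATS.
    public static boolean canSkipAnalyze(Map<String, String> params, String column) {
        String v = params.get("COLUMN_STATS_ACCURATE");
        return v != null
            && v.contains("\"BASIC_STATS\":\"true\"")
            && v.contains("\"" + column + "\":\"true\"");
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("COLUMN_STATS_ACCURATE",
            "{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"ss_addr_sk\":\"true\"}}");
        System.out.println(canSkipAnalyze(params, "ss_addr_sk"));      // true
        System.out.println(canSkipAnalyze(params, "ss_sold_date_sk")); // false
    }
}
```

For a partitioned table the same check would run per partition, skipping the job only when every requested partition/column is already accurate.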





[jira] [Work logged] (HIVE-24390) Spelling

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24390?focusedWorklogId=607575&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607575
 ]

ASF GitHub Bot logged work on HIVE-24390:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1674:
URL: https://github.com/apache/hive/pull/1674#issuecomment-855489379


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607575)
Time Spent: 50m  (was: 40m)

> Spelling
> 
>
> Key: HIVE-24390
> URL: https://issues.apache.org/jira/browse/HIVE-24390
> Project: Hive
>  Issue Type: Bug
>Reporter: Josh Soref
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=607565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607565
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1849:
URL: https://github.com/apache/hive/pull/1849#issuecomment-855489358


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607565)
Time Spent: 1h 50m  (was: 1h 40m)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know about the ticket 
> https://issues.apache.org/jira/browse/HIVE-17128, but HS2 still leaks the 
> log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!





[jira] [Work logged] (HIVE-24817) "not in" clause returns incorrect data when there is coercion

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24817?focusedWorklogId=607568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607568
 ]

ASF GitHub Bot logged work on HIVE-24817:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2012:
URL: https://github.com/apache/hive/pull/2012#issuecomment-855489227


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607568)
Time Spent: 2h 50m  (was: 2h 40m)

> "not in" clause returns incorrect data when there is coercion
> -
>
> Key: HIVE-24817
> URL: https://issues.apache.org/jira/browse/HIVE-24817
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Steve Carlin
>Assignee: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When a query has a WHERE clause that checks an integer column with "not in" 
> against a decimal value, the decimal value is changed to null, causing 
> incorrect results.
> This is a sample query that fails:
> select count(*) from my_tbl where int_col not in (355.8);
> Since int_col can never be 355.8, one would expect all the rows to be 
> returned, but the 355.8 is changed into a null value, so no rows are 
> returned.
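The expected semantics can be illustrated with plain Java, as a hedged sketch (not Hive's actual coercion code): comparing an int column against a decimal literal should widen the int to decimal and compare exactly, rather than nulling out the literal.

```java
import java.math.BigDecimal;

// Sketch of the intended semantics: widen the integer to decimal for the
// comparison instead of converting the decimal literal to null.
public class NotInCoercion {
    public static boolean notIn(int intCol, BigDecimal literal) {
        // Exact decimal comparison after widening the int.
        return new BigDecimal(intCol).compareTo(literal) != 0;
    }

    public static void main(String[] args) {
        // 355 is never equal to 355.8, so NOT IN (355.8) keeps the row.
        System.out.println(notIn(355, new BigDecimal("355.8"))); // true
    }
}
```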





[jira] [Work logged] (HIVE-24310) Allow specified number of deserialize errors to be ignored

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24310?focusedWorklogId=607569&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607569
 ]

ASF GitHub Bot logged work on HIVE-24310:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1607:
URL: https://github.com/apache/hive/pull/1607#issuecomment-855489399


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607569)
Time Spent: 2h 20m  (was: 2h 10m)

> Allow specified number of deserialize errors to be ignored
> --
>
> Key: HIVE-24310
> URL: https://issues.apache.org/jira/browse/HIVE-24310
> Project: Hive
>  Issue Type: Improvement
>  Components: Operators
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Sometimes we see corrupted records in a user's raw data, e.g. a single 
> corrupted record in a file containing thousands of records. Today the user 
> has to either give up all of the records or replay the whole dataset in 
> order to run successfully on Hive, so we should provide a way to ignore 
> such corrupted records. 
>  
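The "tolerate up to N errors" behaviour described above can be sketched generically. This is an illustration under assumed names (`TolerantReader`, `readAll` are hypothetical, not Hive operators): failed records are skipped and counted, and the task aborts only once the configured limit is exceeded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of error-tolerant deserialization with a cap.
public class TolerantReader {
    public static <T, R> List<R> readAll(List<T> raw,
                                         Function<T, R> deserialize,
                                         int maxErrors) {
        List<R> out = new ArrayList<>();
        int errors = 0;
        for (T rec : raw) {
            try {
                out.add(deserialize.apply(rec));
            } catch (RuntimeException e) {
                if (++errors > maxErrors) {
                    // Too many corrupted records: fail the task as before.
                    throw new RuntimeException(
                        "too many deserialize errors: " + errors, e);
                }
                // Otherwise skip the corrupted record and continue.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("1", "2", "oops", "4");
        // One corrupted record out of four, limit of 1 tolerated error.
        List<Integer> ok = readAll(raw, Integer::parseInt, 1);
        System.out.println(ok); // [1, 2, 4]
    }
}
```

With `maxErrors = 0` the current all-or-nothing behaviour is preserved, so the feature is opt-in.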





[jira] [Updated] (HIVE-24830) Revise RowSchema mutability usage

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24830:
--
Labels: pull-request-available  (was: )

> Revise RowSchema mutability usage
> -
>
> Key: HIVE-24830
> URL: https://issues.apache.org/jira/browse/HIVE-24830
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> RowSchema is essentially a container class for a list of fields.
> * it can be constructed from a "list"
> * the list can be set
> * the list can be accessed
> none of the above methods try to protect the data inside; hence the following 
> could easily  happen:
> {code}
> s=o1.getSchema();
> col=s.getCol("favourite")
> col.setInternalName("asd"); // will modify o1 schema
> newSchema.add(col);
> o2.setSchema(newSchema);
> o2.getSchema().get("asd").setInternalName("xxx"); // will modify o1 and o2 
> schema
> [...]
> {code}
> Not sure how much of this is actually crucial; an exploratory test run 
> revealed some cases:
> https://github.com/apache/hive/pull/2019
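One conventional remedy for the aliasing shown above is to copy on the way in and expose an unmodifiable view on the way out. A minimal sketch under assumed names (`SafeRowSchema` with `String` columns stands in for the real `RowSchema`/`ColumnInfo` pair):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of a mutation-safe schema container; RowSchema itself
// holds ColumnInfo objects, simplified here to Strings.
public class SafeRowSchema {
    private final List<String> signature;

    public SafeRowSchema(List<String> cols) {
        // Defensive copy: later mutation of the caller's list cannot leak in.
        this.signature = new ArrayList<>(cols);
    }

    public List<String> getSignature() {
        // Read-only view: callers cannot mutate the internal list.
        return Collections.unmodifiableList(signature);
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>(List.of("favourite"));
        SafeRowSchema s = new SafeRowSchema(cols);
        cols.add("extra"); // does not affect the schema
        System.out.println(s.getSignature()); // [favourite]
    }
}
```

Note this only protects the list structure; the elements themselves (mutable `ColumnInfo` objects in the real class) would also need copying to fully prevent the `setInternalName` aliasing in the example.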





[jira] [Work logged] (HIVE-24753) Non blocking DROP PARTITION implementation

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24753?focusedWorklogId=607572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607572
 ]

ASF GitHub Bot logged work on HIVE-24753:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2017:
URL: https://github.com/apache/hive/pull/2017#issuecomment-855489220


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607572)
Time Spent: 2h 20m  (was: 2h 10m)

> Non blocking DROP PARTITION implementation
> --
>
> Key: HIVE-24753
> URL: https://issues.apache.org/jira/browse/HIVE-24753
> Project: Hive
>  Issue Type: New Feature
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Implement a way to execute drop partition operations in a way that doesn't 
> have to wait for currently running read operations to be finished.





[jira] [Work logged] (HIVE-24778) Unify hive.strict.timestamp.conversion and hive.strict.checks.type.safety properties

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24778?focusedWorklogId=607571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607571
 ]

ASF GitHub Bot logged work on HIVE-24778:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1982:
URL: https://github.com/apache/hive/pull/1982#issuecomment-855489270


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607571)
Time Spent: 1h 20m  (was: 1h 10m)

> Unify hive.strict.timestamp.conversion and hive.strict.checks.type.safety 
> properties
> 
>
> Key: HIVE-24778
> URL: https://issues.apache.org/jira/browse/HIVE-24778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The majority of strict type checks can be controlled by 
> {{hive.strict.checks.type.safety}} property. HIVE-24157 introduced another 
> property, namely  {{hive.strict.timestamp.conversion}}, to control the 
> implicit comparisons between numerics and timestamps.
> The name and description of {{hive.strict.checks.type.safety}} imply that the 
> property covers all strict checks so having others for specific cases appears 
> confusing and can easily lead to unexpected behavior.
> The goal of this issue is to unify those properties to facilitate 
> configuration and improve code reuse.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24830) Revise RowSchema mutability usage

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24830?focusedWorklogId=607577&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607577
 ]

ASF GitHub Bot logged work on HIVE-24830:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2019:
URL: https://github.com/apache/hive/pull/2019#issuecomment-855489215


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607577)
Remaining Estimate: 0h
Time Spent: 10m

> Revise RowSchema mutability usage
> -
>
> Key: HIVE-24830
> URL: https://issues.apache.org/jira/browse/HIVE-24830
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> RowSchema is essentially a container class for a list of fields.
> * it can be constructed from a "list"
> * the list can be set
> * the list can be accessed
> none of the above methods try to protect the data inside; hence the following 
> could easily happen:
> {code}
> s=o1.getSchema();
> col=s.getCol("favourite")
> col.setInternalName("asd"); // will modify o1 schema
> newSchema.add(col);
> o2.setSchema(newSchema);
> o2.getSchema().get("asd").setInternalName("xxx"); // will modify o1 and o2 
> schema
> [...]
> {code}
> not sure how much of this is actually crucial; an exploratory test run revealed 
> some cases:
> https://github.com/apache/hive/pull/2019
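The aliasing hazard described above can be closed with defensive copies. The sketch below is illustrative only — `SafeRowSchema` and its nested `ColumnInfo` are stand-ins for the idea, not Hive's actual RowSchema API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for RowSchema: columns are copied on the way in
// and on the way out, so callers cannot mutate the internal list through
// a leaked reference.
public class SafeRowSchema {
    public static class ColumnInfo {
        private String internalName;
        public ColumnInfo(String internalName) { this.internalName = internalName; }
        public ColumnInfo(ColumnInfo other) { this.internalName = other.internalName; }
        public String getInternalName() { return internalName; }
        public void setInternalName(String n) { this.internalName = n; }
    }

    private final List<ColumnInfo> signature = new ArrayList<>();

    public void add(ColumnInfo col) {
        signature.add(new ColumnInfo(col));   // copy on the way in
    }

    public List<ColumnInfo> getSignature() {
        List<ColumnInfo> copy = new ArrayList<>();
        for (ColumnInfo c : signature) {
            copy.add(new ColumnInfo(c));      // copy on the way out
        }
        return copy;
    }

    public static void main(String[] args) {
        SafeRowSchema s1 = new SafeRowSchema();
        s1.add(new ColumnInfo("favourite"));

        // Mutating the returned column no longer changes s1's schema.
        ColumnInfo leaked = s1.getSignature().get(0);
        leaked.setInternalName("asd");
        System.out.println(s1.getSignature().get(0).getInternalName()); // still "favourite"
    }
}
```

The trade-off is extra allocation per access, which is why a real fix may instead make the returned list unmodifiable or restrict the mutators.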



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24445) Non blocking DROP table implementation

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24445?focusedWorklogId=607578&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607578
 ]

ASF GitHub Bot logged work on HIVE-24445:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2020:
URL: https://github.com/apache/hive/pull/2020#issuecomment-855489209


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607578)
Time Spent: 4h 10m  (was: 4h)

> Non blocking DROP table implementation
> --
>
> Key: HIVE-24445
> URL: https://issues.apache.org/jira/browse/HIVE-24445
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Implement drop table operations so that they do not have to wait for 
> currently running read operations to finish.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24742) Support router path or view fs path in Hive table location

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24742?focusedWorklogId=607573&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607573
 ]

ASF GitHub Bot logged work on HIVE-24742:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1973:
URL: https://github.com/apache/hive/pull/1973#issuecomment-855489281


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607573)
Time Spent: 1h  (was: 50m)

> Support router path or view fs path in Hive table location
> --
>
> Key: HIVE-24742
> URL: https://issues.apache.org/jira/browse/HIVE-24742
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24742.patch
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In 
> [FileUtils.java|https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/FileUtils.java#L747],
>  the equalsFileSystem function checks the base URL to determine whether source and 
> destination are on the same cluster and decides whether to copy or move the data. That 
> will not work for a viewfs or router-based file system, since 
> viewfs://ns-default/a and viewfs://ns-default/b may be on different physical 
> clusters.
> FileSystem in HDFS provides a resolvePath() function to resolve to the physical 
> path. We can support viewfs and router through that function.
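A self-contained sketch of the proposed comparison. It uses java.net.URI and a hypothetical mount table in place of Hadoop's FileSystem.resolvePath(), purely to show why resolving before comparing matters:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: compare file systems by the *resolved* physical URI
// rather than the raw scheme/authority, mimicking what resolvePath() would
// do for viewfs/router paths. The mount table below is invented.
public class ResolvedFsCheck {
    private static final Map<String, String> MOUNTS = new HashMap<>();
    static {
        MOUNTS.put("/a", "hdfs://cluster1");
        MOUNTS.put("/b", "hdfs://cluster2");
    }

    static URI resolve(URI u) {
        if (!"viewfs".equals(u.getScheme())) {
            return u;
        }
        for (Map.Entry<String, String> m : MOUNTS.entrySet()) {
            if (u.getPath().startsWith(m.getKey())) {
                return URI.create(m.getValue() + u.getPath());
            }
        }
        return u;
    }

    // Same cluster only if scheme and authority match after resolution.
    static boolean equalsFileSystem(URI src, URI dst) {
        URI rs = resolve(src), rd = resolve(dst);
        return String.valueOf(rs.getScheme()).equals(String.valueOf(rd.getScheme()))
            && String.valueOf(rs.getAuthority()).equals(String.valueOf(rd.getAuthority()));
    }

    public static void main(String[] args) {
        URI a = URI.create("viewfs://ns-default/a/tbl");
        URI b = URI.create("viewfs://ns-default/b/tbl");
        // Raw scheme/authority comparison would say "same cluster";
        // comparison after resolution does not.
        System.out.println(equalsFileSystem(a, b)); // false
    }
}
```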



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24748) The result of TimestampWritableV2.toString() is wrong when year is larger than 9999.

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24748?focusedWorklogId=607567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607567
 ]

ASF GitHub Bot logged work on HIVE-24748:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1953:
URL: https://github.com/apache/hive/pull/1953#issuecomment-855489315


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607567)
Time Spent: 20m  (was: 10m)

> The result of TimestampWritableV2.toString() is wrong when year is larger 
> than 9999.
> 
>
> Key: HIVE-24748
> URL: https://issues.apache.org/jira/browse/HIVE-24748
> Project: Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Affects Versions: All Versions
> Environment: Master branch
>Reporter: Qiang.Kang
>Assignee: Qiang.Kang
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Hi, I found that `TimestampWritableV2.toString()` can be wrong when year is 
> larger than 9999. Here is what I get:
>  - test code:
> {code:java}
> // code placeholder
> @Test
> public void testTimestampWritableV2toString() {
>   TimestampWritableV2 timestampWritableV2 = new TimestampWritableV2(
>   Timestamp.valueOf("10001-01-01 01:01:23"));
>   
>   assertEquals("+10001-01-01 01:01:23", timestampWritableV2.toString());
> }{code}
>  - output:
> {code:java}
> // code placeholder
> org.junit.ComparisonFailure:
>  Expected :+10001-01-01 01:01:23
>  Actual :+10001-01-01 01:01:2323
> {code}
> The patch below removes some 'wrong' code in 
> `TimestampWritableV2.toString()`, because the length of 
> `org.apache.hadoop.hive.common.type.Timestamp#toString` can be larger than 
> 19 even when its nano-of-second is zero.
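A minimal stand-alone illustration of the length assumption (not the patched Hive code itself): a five-digit year yields a 20-character timestamp string even with zero nanoseconds, which breaks any logic that assumes a fixed 19-character `yyyy-MM-dd HH:mm:ss` prefix:

```java
// Demonstrates why a fixed length-19 assumption breaks for 5-digit years:
// "10001-01-01 01:01:23" is 20 characters even with zero nanoseconds, so
// code that re-appends the seconds field after a fixed offset duplicates
// "23", producing "01:01:2323".
public class TimestampLengthDemo {
    public static void main(String[] args) {
        String normal = "2021-01-01 01:01:23";
        String wide   = "10001-01-01 01:01:23";
        System.out.println(normal.length()); // 19
        System.out.println(wide.length());   // 20
    }
}
```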



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24240) Implement missing features in UDTFStatsRule

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24240?focusedWorklogId=607563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607563
 ]

ASF GitHub Bot logged work on HIVE-24240:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1984:
URL: https://github.com/apache/hive/pull/1984#issuecomment-855489263


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607563)
Time Spent: 0.5h  (was: 20m)

> Implement missing features in UDTFStatsRule
> ---
>
> Key: HIVE-24240
> URL: https://issues.apache.org/jira/browse/HIVE-24240
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: okumin
>Assignee: okumin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add the following steps:
>  * Handle the case in which the number of rows is zero
>  * Compute runtime stats in case of a re-execution



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24073) Execution exception in sort-merge semijoin

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24073?focusedWorklogId=607559&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607559
 ]

ASF GitHub Bot logged work on HIVE-24073:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1476:
URL: https://github.com/apache/hive/pull/1476#issuecomment-855489405


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607559)
Time Spent: 1.5h  (was: 1h 20m)

> Execution exception in sort-merge semijoin
> --
>
> Key: HIVE-24073
> URL: https://issues.apache.org/jira/browse/HIVE-24073
> Project: Hive
>  Issue Type: Bug
>  Components: Operators
>Reporter: Jesus Camacho Rodriguez
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Working on HIVE-24041, we trigger an additional SJ conversion that leads to 
> this exception at execution time:
> {code}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1063)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:685)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:707)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:462)
>   ... 16 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to overwrite 
> nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1037)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1060)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Attempting to 
> overwrite nextKeyWritables[1]
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.processKey(CommonMergeJoinOperator.java:564)
>   at 
> org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.process(CommonMergeJoinOperator.java:243)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.TezDummyStoreOperator.process(TezDummyStoreOperator.java:49)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:887)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1003)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1020)
>   ... 23 more
> {code}
> To reproduce, just set {{hive.auto.convert.sortmerge.join}} to {{true}} in 
> the last query in {{auto_sortmerge_join_10.q}} after HIVE-24041 has been 
> merged.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24815) Remove "IDXS" Table from Metastore Schema

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24815?focusedWorklogId=607562&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607562
 ]

ASF GitHub Bot logged work on HIVE-24815:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:18
Start Date: 07/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2010:
URL: https://github.com/apache/hive/pull/2010#issuecomment-855489234


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607562)
Time Spent: 20m  (was: 10m)

> Remove "IDXS" Table from Metastore Schema
> -
>
> Key: HIVE-24815
> URL: https://issues.apache.org/jira/browse/HIVE-24815
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore, Standalone Metastore
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 3.2.0, 4.0.0
>Reporter: Hunter Logan
>Assignee: Hunter Logan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In Hive 3 the rarely used "INDEXES" feature was removed from the DDL
> https://issues.apache.org/jira/browse/HIVE-18448
>  
> There are a few issues here:
>  # The Standalone-Metastore schemas for Hive 3+ all include the "IDXS" table, 
> which serves no function.
>  ** 
> [https://github.com/apache/hive/tree/master/standalone-metastore/metastore-server/src/main/sql/mysql]
>  # The upgrade schemas from 2.x -> 3.x do not do any cleanup of the IDXS table
>  ** If a user used the "INDEXES" feature in 2.x and then upgrades their 
> metastore to 3.x+ they cannot drop any table that has an index on it due to 
> "IDXS_FK1" constraint since the TBLS entry is referenced in the IDXS table
>  ** Since INDEX is no longer in the DDL they cannot run any command from Hive 
> to drop the index.
>  ** Users can manually connect to the metastore and either drop the IDXS 
> table or the foreign key constraint
>  
> Since indexes provide no benefits in Hive 3+ it should be fine to drop them 
> completely in the schema upgrade scripts. At the very least the 2.x -> 3.x+ 
> scripts should drop the fk constraint.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24762) StringValueBoundaryScanner ignores boundary which leads to incorrect results

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24762?focusedWorklogId=607556&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607556
 ]

ASF GitHub Bot logged work on HIVE-24762:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1965:
URL: https://github.com/apache/hive/pull/1965#issuecomment-855489287


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607556)
Time Spent: 20m  (was: 10m)

>  StringValueBoundaryScanner ignores boundary which leads to incorrect results
> -
>
> Key: HIVE-24762
> URL: https://issues.apache.org/jira/browse/HIVE-24762
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/udf/ptf/ValueBoundaryScanner.java#L901
> {code}
>   public boolean isDistanceGreater(Object v1, Object v2, int amt) {
> ...
> return s1 != null && s2 != null && s1.compareTo(s2) > 0;
> {code}
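The effect of the ignored parameter can be shown without any Hive classes. In this stand-alone rendering — the comparison body is copied from the snippet above, while the surrounding class and sample strings are illustrative — the answer is identical for any `amt`:

```java
// Stand-alone rendering of the comparison above: because `amt` is never
// consulted, windows "1 preceding" and "3 preceding" yield identical
// ranges over the same ordered string column.
public class BoundaryDemo {
    static boolean isDistanceGreater(String s1, String s2, int amt) {
        return s1 != null && s2 != null && s1.compareTo(s2) > 0; // amt ignored
    }

    public static void main(String[] args) {
        String a = "almond aquamarine burnished black steel";
        String b = "almond antique salmon chartreuse burlywood";
        System.out.println(isDistanceGreater(a, b, 1)); // true
        System.out.println(isDistanceGreater(a, b, 3)); // true -> same window
    }
}
```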
> Like other boundary scanners, StringValueBoundaryScanner should take amt into 
> account, otherwise it'll result in the same range regardless of the given 
> window size. This typically affects queries where the range is defined on a 
> string column:
> {code}
> select p_mfgr, p_name, p_retailprice,
> count(*) over(partition by p_mfgr order by p_name range between 1 preceding 
> and current row) as cs1,
> count(*) over(partition by p_mfgr order by p_name range between 3 preceding 
> and current row) as cs2
> from vector_ptf_part_simple_orc;
> {code} 
 0" cs1 and cs2">
> with "> 0", cs1 and cs2 will be calculated on the same window, so cs1 == cs2, 
> but they should actually differ; this is the correct result (see "almond 
> antique olive coral navajo"):
> {code}
> +-+-+--+--+
> | p_mfgr  |   p_name| cs1  | cs2  
> |
> +-+-+--+--+
> | Manufacturer#1  | almond antique burnished rose metallic  | 2| 2
> |
> | Manufacturer#1  | almond antique burnished rose metallic  | 2| 2
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6| 6
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6| 6
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6| 6
> |
> | Manufacturer#1  | almond antique chartreuse lavender yellow   | 6| 6
> |
> | Manufacturer#1  | almond antique salmon chartreuse burlywood  | 1| 1
> |
> | Manufacturer#1  | almond aquamarine burnished black steel | 1| 8
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle | 4| 4
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle | 4| 4
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle | 4| 4
> |
> | Manufacturer#1  | almond aquamarine pink moccasin thistle | 4| 4
> |
> | Manufacturer#2  | almond antique violet chocolate turquoise   | 1| 1
> |
> | Manufacturer#2  | almond antique violet turquoise frosted | 3| 3
> |
> | Manufacturer#2  | almond antique violet turquoise frosted | 3| 3
> |
> | Manufacturer#2  | almond antique violet turquoise frosted | 3| 3
> |
> | Manufacturer#2  | almond aquamarine midnight light salmon | 1| 5
> |
> | Manufacturer#2  | almond aquamarine rose maroon antique   | 2| 2
> |
> | Manufacturer#2  | almond aquamarine rose maroon antique   | 2| 2
> |
> | Manufacturer#2  | almond aquamarine sandy cyan gainsboro  | 3| 3
> |
> | Manufacturer#3  | almond antique chartreuse khaki white   | 1| 1
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod| 4| 5
> |
> | Manufacturer#3  | almond antique forest lavender goldenrod| 4| 5
> |
> | Manufacturer#3  | almond antique forest lavender goldenr

[jira] [Work logged] (HIVE-24313) Optimise stats collection for file sizes on cloud storage

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24313?focusedWorklogId=607558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607558
 ]

ASF GitHub Bot logged work on HIVE-24313:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1636:
URL: https://github.com/apache/hive/pull/1636#issuecomment-855489390


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607558)
Time Spent: 1h 40m  (was: 1.5h)

> Optimise stats collection for file sizes on cloud storage
> -
>
> Key: HIVE-24313
> URL: https://issues.apache.org/jira/browse/HIVE-24313
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When stats information is not present (e.g. for an external table), RelOptHiveTable 
> computes basic stats at runtime.
> Following is the codepath.
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/RelOptHiveTable.java#L598]
> {code:java}
> Statistics stats = StatsUtils.collectStatistics(hiveConf, partitionList,
> hiveTblMetadata, hiveNonPartitionCols, 
> nonPartColNamesThatRqrStats, colStatsCached,
> nonPartColNamesThatRqrStats, true);
>  {code}
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L322]
> {code:java}
> for (Partition p : partList.getNotDeniedPartns()) {
> BasicStats basicStats = 
> basicStatsFactory.build(Partish.buildFor(table, p));
> partStats.add(basicStats);
>   }
>  {code}
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/stats/BasicStats.java#L205]
>  
> {code:java}
> try {
> ds = getFileSizeForPath(path);
>   } catch (IOException e) {
> ds = 0L;
>   }
>  {code}
>  
> For a table & query with a large number of partitions, this takes a long time to 
> compute statistics and increases compilation time. It would be good to fix 
> it with a ForkJoinPool, e.g. 
> partList.getNotDeniedPartns().parallelStream().forEach(p -> ...)
>  
>  
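A hedged sketch of the suggested change: fetch per-partition file sizes with a parallel stream so that slow cloud-storage listing calls overlap. The partition names and the size lookup are stand-ins, not Hive's Partish/BasicStats API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ForkJoinPool;

// Illustrative parallel stats collection. A dedicated ForkJoinPool bounds
// the parallelism instead of sharing the JVM-wide common pool.
public class ParallelStatsSketch {
    // Hypothetical stand-in for getFileSizeForPath(path).
    static long fileSize(String partition) {
        return partition.length() * 1024L;
    }

    static long totalSize(List<String> partitions) throws Exception {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            // parallelStream() submitted from inside the pool runs on it.
            return pool.submit(() ->
                partitions.parallelStream()
                          .mapToLong(ParallelStatsSketch::fileSize)
                          .sum()
            ).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> parts = Arrays.asList("p=2021-01-01", "p=2021-01-02", "p=2021-01-03");
        System.out.println(totalSize(parts)); // 36864
    }
}
```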



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24834) Cannot add comment for kafka table

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24834?focusedWorklogId=607553&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607553
 ]

ASF GitHub Bot logged work on HIVE-24834:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2028:
URL: https://github.com/apache/hive/pull/2028#issuecomment-855489202


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607553)
Time Spent: 1h  (was: 50m)

> Cannot add comment for kafka table
> --
>
> Key: HIVE-24834
> URL: https://issues.apache.org/jira/browse/HIVE-24834
> Project: Hive
>  Issue Type: Bug
>  Components: kafka integration
>Affects Versions: 2.3.7
>Reporter: ChangjiGuo
>Assignee: ChangjiGuo
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When using the kafka-handler to create a Kafka table, whether or not the user 
> specifies a column comment, the comment becomes 'from deserializer' 
> when the 'show create table' command is used to view the table structure.
> You can refer to the following example:
> {code:sql}
> CREATE EXTERNAL TABLE `kafka_table`(
>   `id` string, 
>   `info` string comment 'comment 1', 
>   `jsoninfo` struct comment 'comment 2')
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.kafka.KafkaSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.kafka.KafkaStorageHandler' 
> WITH SERDEPROPERTIES ( 
>   'serialization.format'='1')
> LOCATION
>   'hdfs://offlinehdfs/user/hive/warehouse/kafka_table'
> TBLPROPERTIES (
>   'properties.bootstrap.servers'='', 
>   'topic'=''
>   ..)
> {code}
> The result is as follows:
> {code:sql}
> CREATE EXTERNAL TABLE `kafka_table`(
>   `id` string COMMENT 'from deserializer', 
>   `info` string COMMENT 'from deserializer', 
>   `jsoninfo` struct COMMENT 'from deserializer', 
>   `__key` binary COMMENT 'from deserializer', 
>   `__partition` int COMMENT 'from deserializer', 
>   `__offset` bigint COMMENT 'from deserializer', 
>   `__timestamp` bigint COMMENT 'from deserializer')
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.kafka.KafkaSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.kafka.KafkaStorageHandler' 
> WITH SERDEPROPERTIES ( 
>   'serialization.format'='1')
> LOCATION
>   'hdfs://offlinehdfs/user/hive/warehouse/kafka_table'
> TBLPROPERTIES (
>   ...)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24860) Shut down DbTxnManager heartbeatExecutorService at session close

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24860?focusedWorklogId=607555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607555
 ]

ASF GitHub Bot logged work on HIVE-24860:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2052:
URL: https://github.com/apache/hive/pull/2052#issuecomment-855489155


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607555)
Time Spent: 1h 40m  (was: 1.5h)

> Shut down DbTxnManager heartbeatExecutorService at session close
> 
>
> Key: HIVE-24860
> URL: https://issues.apache.org/jira/browse/HIVE-24860
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> If the HeartBeaterExecutorService isn't shut down, UDF classes belonging to 
> the session's UDFClassLoader (which was the session thread's context 
> classloader at HeartBeaterExecutorService creation) could pile up and cause 
> Metaspace OOM.
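One common pattern for such a fix — illustrative only, not the actual DbTxnManager code — is to tie the executor's lifecycle to the session object:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative session-scoped heartbeater: the executor is shut down when
// the session closes, so its threads (and the context classloader they
// pin) become collectable instead of piling up in Metaspace.
public class SessionScopedHeartbeater implements AutoCloseable {
    private final ExecutorService heartbeatExecutor =
        Executors.newSingleThreadScheduledExecutor();

    public void start(Runnable heartbeat) {
        heartbeatExecutor.submit(heartbeat);
    }

    @Override
    public void close() {
        heartbeatExecutor.shutdownNow();   // stop accepting and interrupt work
        try {
            heartbeatExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public boolean isShutDown() {
        return heartbeatExecutor.isShutdown();
    }

    public static void main(String[] args) {
        SessionScopedHeartbeater hb = new SessionScopedHeartbeater();
        hb.start(() -> { /* send lock heartbeat */ });
        hb.close();                        // at session close
        System.out.println(hb.isShutDown()); // true
    }
}
```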



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24846) Log When HS2 Goes OOM

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24846?focusedWorklogId=607552&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607552
 ]

ASF GitHub Bot logged work on HIVE-24846:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2038:
URL: https://github.com/apache/hive/pull/2038#issuecomment-855489192


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 607552)
Time Spent: 40m  (was: 0.5h)

> Log When HS2 Goes OOM
> -
>
> Key: HIVE-24846
> URL: https://issues.apache.org/jira/browse/HIVE-24846
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Otherwise the server just shuts down without any justification.
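One possible approach — an assumption for illustration, not necessarily the committed patch — is a default uncaught-exception handler that logs fatal errors such as OutOfMemoryError before the process exits:

```java
// Illustrative handler: logs OutOfMemoryError (and other uncaught
// throwables) to stderr so the shutdown is explained in the server log.
public class OomLoggingHandler implements Thread.UncaughtExceptionHandler {
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (e instanceof OutOfMemoryError) {
            System.err.println("FATAL: thread " + t.getName()
                + " died with OutOfMemoryError: " + e.getMessage());
        } else {
            System.err.println("Uncaught exception in " + t.getName() + ": " + e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread.setDefaultUncaughtExceptionHandler(new OomLoggingHandler());
        // Simulate a worker dying with an OOM (thrown directly here,
        // not a real heap exhaustion).
        Thread worker = new Thread(() -> { throw new OutOfMemoryError("simulated"); });
        worker.start();
        worker.join(); // the handler logs the error as the thread dies
    }
}
```

Note that logging after a genuine heap exhaustion is best-effort; production setups often pair this with JVM flags that act on OOM at the process level.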



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24820) MaterializedViewCache enables adding multiple entries of the same Materialization instance

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24820?focusedWorklogId=607549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607549
 ]

ASF GitHub Bot logged work on HIVE-24820:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2030:
URL: https://github.com/apache/hive/pull/2030#issuecomment-855489198


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607549)
Time Spent: 1h 10m  (was: 1h)

> MaterializedViewCache enables adding multiple entries of the same 
> Materialization instance
> --
>
> Key: HIVE-24820
> URL: https://issues.apache.org/jira/browse/HIVE-24820
> Project: Hive
>  Issue Type: Bug
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24522) Support DATE and VARCHAR column type in Kudu SerDe

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24522?focusedWorklogId=607560&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607560
 ]

ASF GitHub Bot logged work on HIVE-24522:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1997:
URL: https://github.com/apache/hive/pull/1997#issuecomment-855489252


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607560)
Time Spent: 50m  (was: 40m)

> Support DATE and VARCHAR column type in Kudu SerDe
> --
>
> Key: HIVE-24522
> URL: https://issues.apache.org/jira/browse/HIVE-24522
> Project: Hive
>  Issue Type: Sub-task
>  Components: Hive
>Affects Versions: 2.0.0
>Reporter: Greg Solovyev
>Assignee: Grant Henke
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> DATE type was added to Kudu in Kudu 1.12 (KUDU-2632)





[jira] [Work logged] (HIVE-24798) refactor TxnHandler.cleanupRecords to use predefined query strings

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24798?focusedWorklogId=607561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607561
 ]

ASF GitHub Bot logged work on HIVE-24798:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1996:
URL: https://github.com/apache/hive/pull/1996#issuecomment-855489255


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607561)
Time Spent: 20m  (was: 10m)

> refactor TxnHandler.cleanupRecords to use predefined query strings
> --
>
> Key: HIVE-24798
> URL: https://issues.apache.org/jira/browse/HIVE-24798
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> TxnHandler.cleanupRecords should use predefined query strings instead of 
> building the delete queries with a StringBuffer.
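As a rough illustration of the refactor's idea (sketched here in Python with sqlite3; the table and column names are illustrative placeholders, not the actual metastore schema), predefined parameterized query strings replace string-buffer concatenation:

```python
import sqlite3

# Predefined, parameterized delete statements instead of building SQL with a
# string buffer at call time. Table/column names are illustrative only.
CLEANUP_QUERIES = (
    "DELETE FROM TXN_COMPONENTS WHERE TC_DATABASE = ? AND TC_TABLE = ?",
    "DELETE FROM COMPLETED_TXN_COMPONENTS WHERE CTC_DATABASE = ? AND CTC_TABLE = ?",
)

def cleanup_records(conn, db_name, table_name):
    for query in CLEANUP_QUERIES:
        conn.execute(query, (db_name, table_name))
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TXN_COMPONENTS (TC_DATABASE TEXT, TC_TABLE TEXT)")
conn.execute("CREATE TABLE COMPLETED_TXN_COMPONENTS (CTC_DATABASE TEXT, CTC_TABLE TEXT)")
conn.execute("INSERT INTO TXN_COMPONENTS VALUES ('db1', 't1')")
cleanup_records(conn, "db1", "t1")
print(conn.execute("SELECT COUNT(*) FROM TXN_COMPONENTS").fetchone()[0])  # 0
```

Beyond readability, parameterized statements avoid quoting bugs and let the driver reuse the statement plan.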





[jira] [Work logged] (HIVE-24814) Harmonize Hive Date-Time Formats

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24814?focusedWorklogId=607557&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607557
 ]

ASF GitHub Bot logged work on HIVE-24814:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2009:
URL: https://github.com/apache/hive/pull/2009#issuecomment-855489242


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607557)
Time Spent: 1h 20m  (was: 1h 10m)

> Harmonize Hive Date-Time Formats
> 
>
> Key: HIVE-24814
> URL: https://issues.apache.org/jira/browse/HIVE-24814
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Harmonize Hive on JDK date-time formats courtesy of {{DateTimeFormatter}}
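A minimal sketch of the "harmonize" idea (in Python rather than Java's DateTimeFormatter, and with patterns chosen purely for illustration): keep one shared, ordered list of accepted layouts instead of ad-hoc parsing scattered across the code base.

```python
from datetime import datetime

# One shared, ordered list of accepted layouts; patterns are illustrative.
PATTERNS = ["%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S", "%Y-%m-%d"]

def parse_timestamp(text):
    # Try the most specific layout first, fall through to coarser ones.
    for pattern in PATTERNS:
        try:
            return datetime.strptime(text, pattern)
        except ValueError:
            continue
    raise ValueError(f"unparseable timestamp: {text!r}")

print(parse_timestamp("2021-06-06 10:17:00"))  # 2021-06-06 10:17:00
print(parse_timestamp("2021-06-06"))           # 2021-06-06 00:00:00
```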





[jira] [Work logged] (HIVE-22601) Some columns will be lost when a UDTF has multiple aliases in some cases

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22601?focusedWorklogId=607554&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607554
 ]

ASF GitHub Bot logged work on HIVE-22601:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2042:
URL: https://github.com/apache/hive/pull/2042#issuecomment-855489180


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607554)
Time Spent: 50m  (was: 40m)

> Some columns will be lost when a UDTF has multiple aliases in some cases
> 
>
> Key: HIVE-22601
> URL: https://issues.apache.org/jira/browse/HIVE-22601
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.1.1, 2.2.0, 2.3.6, 3.1.2
>Reporter: okumin
>Assignee: okumin
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22601.1.patch, HIVE-22601.2.patch, 
> HIVE-22601.3.patch, HIVE-22601.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Only one column will be retained when putting UDTFs with multiple aliases and 
> a top-level UNION together.
> For example, the result of the following SQL should have three columns, c1, 
> c2 and c3.
> {code:java}
> SELECT stack(1, 'a', 'b', 'c') AS (c1, c2, c3)
> UNION ALL
> SELECT stack(1, 'd', 'e', 'f') AS (c1, c2, c3);
> {code}
> However, only the c3 column is returned.
> {code:java}
> +---------+
> | _u1.c3  |
> +---------+
> | c       |
> | f       |
> +---------+
> {code}





[jira] [Work logged] (HIVE-24866) FileNotFoundException during alter table concat

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24866?focusedWorklogId=607551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607551
 ]

ASF GitHub Bot logged work on HIVE-24866:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2057:
URL: https://github.com/apache/hive/pull/2057#issuecomment-855489135


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607551)
Time Spent: 40m  (was: 0.5h)

> FileNotFoundException during alter table concat
> ---
>
> Key: HIVE-24866
> URL: https://issues.apache.org/jira/browse/HIVE-24866
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.4.0, 3.2.0, 4.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Because of the way the combine file input format groups files based on node 
> and rack locality, there are cases where a single big ORC file gets spread 
> across two or more combine Hive splits. When the first task completes, as part 
> of jobCloseOp the source ORC file of the concatenation is moved/renamed, which 
> can lead to a FileNotFoundException in subsequent mappers that have a partial 
> split of that file. 
> A simple fix would be for the mapper with the start of the split to own the 
> entire ORC file for concatenation. If a mapper gets a partial split that is 
> not the start, it can skip the entire file. 
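The ownership rule described above can be sketched as follows (a hypothetical illustration; the `(path, offset, length)` split representation is an assumption, not Hive's actual split classes):

```python
# Only the mapper whose chunk starts at offset 0 "owns" a file for
# concatenation; mappers holding a partial chunk skip the whole file.
def files_to_concatenate(splits):
    """splits: list of (file_path, offset, length) chunks in one mapper's split."""
    owned = []
    for path, offset, _length in splits:
        if offset == 0:
            owned.append(path)  # this mapper owns the entire file
        # non-zero offset: partial chunk of a file owned by another mapper
    return owned

mapper1 = [("/warehouse/t/big.orc", 0, 128), ("/warehouse/t/a.orc", 0, 64)]
mapper2 = [("/warehouse/t/big.orc", 128, 128)]  # partial split, not the start
print(files_to_concatenate(mapper1))  # ['/warehouse/t/big.orc', '/warehouse/t/a.orc']
print(files_to_concatenate(mapper2))  # []
```

Because exactly one mapper sees offset 0 for each file, no file is concatenated twice and no mapper touches a file another task may rename.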





[jira] [Work logged] (HIVE-24529) Metastore truncates milliseconds while storing timestamp column stats

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24529?focusedWorklogId=607550&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607550
 ]

ASF GitHub Bot logged work on HIVE-24529:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2041:
URL: https://github.com/apache/hive/pull/2041#issuecomment-855489186


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607550)
Time Spent: 20m  (was: 10m)

> Metastore truncates milliseconds while storing timestamp column stats
> -
>
> Key: HIVE-24529
> URL: https://issues.apache.org/jira/browse/HIVE-24529
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Steps to reproduce the issue:
> create table tnikhil (t timestamp);
> insert into tnikhil values ('2019-01-01 23:12:45.123456');
> analyze table tnikhil compute statistics for columns;
> select * from tnikhil;
> {noformat}
> +-----------------------------+
> |          tnikhil.t          |
> +-----------------------------+
> | 2019-01-01 23:12:45.123456  |
> +-----------------------------+
> {noformat}
> desc formatted tnikhil t; 
> {noformat}
> +------------+-------------+
> |  col_name  |  data_type  |
> +------------+-------------+
> | col_name   | t           |
> | data_type  | timestamp   |
> | min        | 1546384365  |
> | max        | 1546384365  |
> +------------+-------------+
> {noformat}
>  
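The min/max values shown by `desc formatted` are whole epoch seconds, consistent with the fractional part being truncated on write. A minimal illustration (not Hive's actual code path) of that loss:

```python
from datetime import datetime, timezone

# Storing a timestamp as whole epoch seconds drops the fractional part,
# which matches the min/max value 1546384365 reported above.
ts = datetime(2019, 1, 1, 23, 12, 45, 123456, tzinfo=timezone.utc)
epoch_seconds = int(ts.timestamp())   # truncates .123456
print(epoch_seconds)                  # 1546384365
restored = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(restored.isoformat())           # 2019-01-01T23:12:45+00:00 (microseconds lost)
```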





[jira] [Work logged] (HIVE-23391) Change requested lock for ALTER TABLE ADD COLUMN to DDL_SHARED

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23391?focusedWorklogId=607548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607548
 ]

ASF GitHub Bot logged work on HIVE-23391:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2054:
URL: https://github.com/apache/hive/pull/2054#issuecomment-855489151


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607548)
Time Spent: 0.5h  (was: 20m)

> Change requested lock for ALTER TABLE ADD COLUMN to DDL_SHARED
> --
>
> Key: HIVE-23391
> URL: https://issues.apache.org/jira/browse/HIVE-23391
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Chovan
>Assignee: Zoltan Chovan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23391.2.patch, HIVE-23391.3.patch, 
> HIVE-23391.4.patch, HIVE-23391.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A long-running query can block a simple ADD COLUMN query, because ADD COLUMN 
> currently requires a DDL_EXCLUSIVE lock. By changing this to a shared lock, 
> this metadata-only query can be executed without having to wait for the 
> previous query to finish.
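A toy compatibility check (not Hive's actual lock manager; lock names are simplified to SHARED/EXCLUSIVE for illustration) shows why the change unblocks the DDL:

```python
# Shared locks are compatible with each other; an exclusive lock conflicts
# with every other lock on the same resource.
def compatible(held, requested):
    return held == "SHARED" and requested == "SHARED"

held = "SHARED"  # e.g. a long-running read query holding a shared lock
print(compatible(held, "EXCLUSIVE"))  # False: exclusive ADD COLUMN must wait
print(compatible(held, "SHARED"))     # True: shared ADD COLUMN can proceed
```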





[jira] [Work logged] (HIVE-24843) Remove unnecessary throw-catch in Deadline

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24843?focusedWorklogId=607547&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607547
 ]

ASF GitHub Bot logged work on HIVE-24843:
-

Author: ASF GitHub Bot
Created on: 07/Jun/21 00:17
Start Date: 07/Jun/21 00:17
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2047:
URL: https://github.com/apache/hive/pull/2047#issuecomment-855489173


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 607547)
Time Spent: 20m  (was: 10m)

> Remove unnecessary throw-catch in Deadline
> --
>
> Key: HIVE-24843
> URL: https://issues.apache.org/jira/browse/HIVE-24843
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Miklos Szurap
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-24843.1.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The 
> [Deadline|https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Deadline.java]
>  class has a throw-catch which is unnecessary. Previously HIVE-16450 refactored 
> most of the exceptions, but missed the one in the check() method.
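The pattern being removed can be illustrated schematically (a hedged Python sketch, not the actual Deadline code): an exception is raised only to be caught immediately in the same method, where a plain conditional would do.

```python
def check_with_throw_catch(deadline_exceeded):
    # Anti-pattern: raise an exception and immediately catch it locally.
    try:
        if deadline_exceeded:
            raise TimeoutError("deadline exceeded")
        return True
    except TimeoutError:
        return False

def check_direct(deadline_exceeded):
    # Equivalent logic without constructing and unwinding an exception.
    return not deadline_exceeded

for exceeded in (True, False):
    assert check_with_throw_catch(exceeded) == check_direct(exceeded)
```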





[jira] [Work logged] (HIVE-25128) Remove Thrift Exceptions From RawStore alterCatalog

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25128?focusedWorklogId=607516&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607516
 ]

ASF GitHub Bot logged work on HIVE-25128:
-

Author: ASF GitHub Bot
Created on: 06/Jun/21 19:34
Start Date: 06/Jun/21 19:34
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on a change in pull request #2291:
URL: https://github.com/apache/hive/pull/2291#discussion_r646175377



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaRuntimeException.java
##
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore;
+
+import com.google.errorprone.annotations.FormatMethod;
+
+/**
+ * This is the root of all Hive Runtime Exceptions. If a client can reasonably
+ * be expected to recover from an exception, make it a checked exception. If a
+ * client cannot do anything to recover from the exception, make it an 
unchecked
+ * exception.
+ */
+public class HiveMetaRuntimeException extends RuntimeException {

Review comment:
   Is the plan to use this class as a parent for all runtime exceptions in 
the HMS? I am just wondering why we have both HiveMetaRuntimeException and 
HiveMetaDataAccessRuntimeException. Will there be other sub-types of this 
class?

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -656,10 +656,9 @@ public void createCatalog(Catalog cat) throws 
MetaException {
   }
 
   @Override
-  public void alterCatalog(String catName, Catalog cat)
-  throws MetaException, InvalidOperationException {
+  public void alterCatalog(String catName, Catalog cat) {
 if (!cat.getName().equals(catName)) {
-  throw new InvalidOperationException("You cannot change a catalog's 
name");
+  throw new HiveMetaRuntimeException("You cannot change a catalog's name: 
" + cat.getName() + " -> " + catName);

Review comment:
   Not a very enterprise-grade exception message. Should we re-word it to 
something like "Catalogs cannot be renamed"?

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaDataAccessException.java
##
@@ -0,0 +1,36 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.metastore;
+
+/**
+ * Hive Exception when the metastore's underlying storage mechanism cannot be
+ * accessed.
+ */
+public class HiveMetaDataAccessException extends HiveMetaRuntimeException {

Review comment:
   nit: the class name feels incorrect
   1) It suggests that this would be thrown on "access" operations, i.e. read-only 
operations, yet it is being thrown on an alterCatalog() operation, which is a 
WRITE operation. Would something like HiveMetaDataException be suitable for both 
READ+WRITE operations? (other candidates: HiveAccessException or just 
MetaDataRuntimeException)
   2) It feels too long

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
##
@@ -656,10 +656,9 @@ public void createCatalog(Catalog cat) throws 
MetaException {
   }
 
   @Override
-  public void alterCatalog(String catName, Catalog cat)

[jira] [Work logged] (HIVE-24135) Drop database doesn't delete directory in managed location

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24135?focusedWorklogId=607501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607501
 ]

ASF GitHub Bot logged work on HIVE-24135:
-

Author: ASF GitHub Bot
Created on: 06/Jun/21 15:49
Start Date: 06/Jun/21 15:49
Worklog Time Spent: 10m 
  Work Description: nrg4878 opened a new pull request #2354:
URL: https://github.com/apache/hive/pull/2354


   
   
   ### What changes were proposed in this pull request?
   The database's managed location is not deleted when the database is dropped. 
This fix addresses the issue.
   
   ### Why are the changes needed?
   Even though db.getManagedLocation() might be null, there is always a default 
managed location for the database. On drop database, we currently check whether 
this location is null and drop it only if it is not. But regardless of whether 
it is explicitly set, there is always an assigned managed location that needs 
to be dropped.
   
   ### Does this PR introduce _any_ user-facing change?
   NO
   
   ### How was this patch tested?
   Unit test + manually




Issue Time Tracking
---

Worklog Id: (was: 607501)
Time Spent: 1h 10m  (was: 1h)

> Drop database doesn't delete directory in managed location
> --
>
> Key: HIVE-24135
> URL: https://issues.apache.org/jira/browse/HIVE-24135
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Karen Coppage
>Assignee: Naveen Gangam
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Repro:
>  say the default managed location is managed/hive and the default external 
> location is external/hive.
> {code:java}
> create database db1; -- creates: external/hive/db1.db
> create table db1.table1 (i int); -- creates: managed/hive/db1.db and  
> managed/hive/db1.db/table1
> drop database db1 cascade; -- removes : external/hive/db1.db and 
> managed/hive/db1.db/table1
> {code}
> Problem: Directory managed/hive/db1.db remains.
> Since HIVE-22995, dbs have a managed (managedLocationUri) and an external 
> location (locationUri). I think the issue is that 
> HiveMetaStore.HMSHandler#drop_database_core deletes only the db directory in 
> the external location.
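The fix's intent can be sketched as follows (hypothetical Python with made-up paths and helper names; the real change lives in HiveMetaStore.HMSHandler#drop_database_core): delete the managed location too, falling back to the default managed path when none was explicitly set.

```python
import shutil
import tempfile
from pathlib import Path

def drop_database(name, external_root, managed_root, managed_location=None):
    external = Path(external_root) / f"{name}.db"
    if external.exists():
        shutil.rmtree(external)                 # e.g. external/hive/db1.db
    # Even when no managed location was explicitly set, a default one exists.
    managed = Path(managed_location) if managed_location \
        else Path(managed_root) / f"{name}.db"
    if managed.exists():
        shutil.rmtree(managed)                  # e.g. managed/hive/db1.db

root = Path(tempfile.mkdtemp())
(root / "external/hive/db1.db").mkdir(parents=True)
(root / "managed/hive/db1.db/table1").mkdir(parents=True)
drop_database("db1", root / "external/hive", root / "managed/hive")
print((root / "managed/hive/db1.db").exists())  # False
```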





[jira] [Resolved] (HIVE-25165) Generate & track statistics per event type for incremental load in replication metrics

2021-06-06 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-25165.

Resolution: Fixed

> Generate & track statistics per event type for incremental load in 
> replication metrics
> --
>
> Key: HIVE-25165
> URL: https://issues.apache.org/jira/browse/HIVE-25165
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Generate and track statistics like mean, median, standard deviation, variance, 
> etc. per event type during incremental load and store them in the replication 
> statistics.
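The kind of per-event-type aggregation described can be sketched as follows (event names and timings invented for illustration; Hive's actual metrics plumbing differs):

```python
import statistics
from collections import defaultdict

# Replay event durations grouped by event type, then summary stats per type.
durations_ms = [("ALTER_TABLE", 120), ("ALTER_TABLE", 80), ("INSERT", 200),
                ("INSERT", 240), ("INSERT", 220)]

by_type = defaultdict(list)
for event_type, ms in durations_ms:
    by_type[event_type].append(ms)

for event_type, values in sorted(by_type.items()):
    print(event_type,
          "mean=%.1f" % statistics.mean(values),
          "median=%.1f" % statistics.median(values),
          "stdev=%.1f" % (statistics.stdev(values) if len(values) > 1 else 0.0))
# ALTER_TABLE mean=100.0 median=100.0 stdev=28.3
# INSERT mean=220.0 median=220.0 stdev=20.0
```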





[jira] [Work logged] (HIVE-25165) Generate & track statistics per event type for incremental load in replication metrics

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25165?focusedWorklogId=607482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607482
 ]

ASF GitHub Bot logged work on HIVE-25165:
-

Author: ASF GitHub Bot
Created on: 06/Jun/21 10:17
Start Date: 06/Jun/21 10:17
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #2321:
URL: https://github.com/apache/hive/pull/2321


   




Issue Time Tracking
---

Worklog Id: (was: 607482)
Time Spent: 1.5h  (was: 1h 20m)

> Generate & track statistics per event type for incremental load in 
> replication metrics
> --
>
> Key: HIVE-25165
> URL: https://issues.apache.org/jira/browse/HIVE-25165
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Generate and track statistics like mean, median, standard deviation, variance, 
> etc. per event type during incremental load and store them in the replication 
> statistics.





[jira] [Commented] (HIVE-25165) Generate & track statistics per event type for incremental load in replication metrics

2021-06-06 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358051#comment-17358051
 ] 

Aasha Medhi commented on HIVE-25165:


+1 Committed to master. Thank you for the patch [~ayushtkn]

> Generate & track statistics per event type for incremental load in 
> replication metrics
> --
>
> Key: HIVE-25165
> URL: https://issues.apache.org/jira/browse/HIVE-25165
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Generate and track statistics like mean, median, standard deviation, variance, 
> etc. per event type during incremental load and store them in the replication 
> statistics.





[jira] [Resolved] (HIVE-25133) Allow custom configs for database level paths in external table replication

2021-06-06 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi resolved HIVE-25133.

Resolution: Fixed

> Allow custom configs for database level paths in external table replication
> ---
>
> Key: HIVE-25133
> URL: https://issues.apache.org/jira/browse/HIVE-25133
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Allow a way to provide configurations that are used only by the external data 
> copy task for database-level paths.





[jira] [Work logged] (HIVE-25133) Allow custom configs for database level paths in external table replication

2021-06-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25133?focusedWorklogId=607481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607481
 ]

ASF GitHub Bot logged work on HIVE-25133:
-

Author: ASF GitHub Bot
Created on: 06/Jun/21 10:13
Start Date: 06/Jun/21 10:13
Worklog Time Spent: 10m 
  Work Description: aasha merged pull request #2296:
URL: https://github.com/apache/hive/pull/2296


   




Issue Time Tracking
---

Worklog Id: (was: 607481)
Time Spent: 20m  (was: 10m)

> Allow custom configs for database level paths in external table replication
> ---
>
> Key: HIVE-25133
> URL: https://issues.apache.org/jira/browse/HIVE-25133
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Allow a way to provide configurations that are used only by the external data 
> copy task for database-level paths.





[jira] [Commented] (HIVE-25133) Allow custom configs for database level paths in external table replication

2021-06-06 Thread Aasha Medhi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17358050#comment-17358050
 ] 

Aasha Medhi commented on HIVE-25133:


+1 Thank you for the patch [~ayushtkn]

> Allow custom configs for database level paths in external table replication
> ---
>
> Key: HIVE-25133
> URL: https://issues.apache.org/jira/browse/HIVE-25133
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Allow a way to provide configurations that are used only by the external data 
> copy task for database-level paths.


