[jira] [Comment Edited] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo

2021-09-16 Thread Csomor Viktor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416465#comment-17416465
 ] 

Csomor Viktor edited comment on HIVE-25518 at 9/17/21, 5:54 AM:


This was caused by a faulty IntelliJ jar assembly; it cannot be reproduced 
using the single binary.


was (Author: vcsomor):
This caused a wrong Intellij jar assembly

> CompactionTxHandler NPE if no CompactionInfo
> 
>
> Key: HIVE-25518
> URL: https://issues.apache.org/jira/browse/HIVE-25518
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Reporter: Csomor Viktor
>Assignee: Csomor Viktor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE_25518_CompactionTxnHandle_NPE.txt
>
>
> If no {{CompactionInfo}} is provided to 
> {{CompactionTxnHandler#markFailed()}}, an NPE is thrown at the beginning of 
> the method, and no information is recorded in COMPLETED_COMPACTION.
> Stacktrace:
> {noformat}
> [TThreadPoolServer WorkerProcess-%d] ERROR 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler - 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler.markFailed(CompactionTxnHandler.java:1116)
>   at 
> org.apache.hadoop.hive.metastore.HMSHandler.mark_failed(HMSHandler.java:8716)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy13.mark_failed(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$mark_failed.getResult(ThriftHiveMetastore.java:23846)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$mark_failed.getResult(ThriftHiveMetastore.java:23825)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {noformat}
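A minimal defensive fix can be sketched as a fail-fast null guard at the top of the method. The `CompactionInfo` stand-in and the `markFailed` signature below are illustrative only, not Hive's actual classes:

```java
// Illustrative stand-in for Hive's CompactionInfo; not the real class.
class CompactionInfo {
    long id;

    CompactionInfo(long id) {
        this.id = id;
    }
}

public class MarkFailedGuard {
    // Sketch of a null guard for a markFailed-like method: reject a null
    // CompactionInfo up front instead of hitting an NPE deep in the body.
    static String markFailed(CompactionInfo ci) {
        if (ci == null) {
            throw new IllegalArgumentException("CompactionInfo must not be null");
        }
        return "marked compaction " + ci.id + " as failed";
    }

    public static void main(String[] args) {
        System.out.println(markFailed(new CompactionInfo(42)));
        try {
            markFailed(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast with a descriptive exception also gives the Thrift caller a meaningful error instead of a bare NullPointerException in the server log.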



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo

2021-09-16 Thread Csomor Viktor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416465#comment-17416465
 ] 

Csomor Viktor edited comment on HIVE-25518 at 9/17/21, 5:54 AM:


This was caused by a faulty IntelliJ jar assembly; it cannot be reproduced 
using the binary.


was (Author: vcsomor):
This caused a wrong Intellij jar assembly. can't reproduce by using the single 
binary



[jira] [Resolved] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo

2021-09-16 Thread Csomor Viktor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csomor Viktor resolved HIVE-25518.
--
Resolution: Won't Fix

This was caused by a faulty IntelliJ jar assembly.



[jira] [Work stopped] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo

2021-09-16 Thread Csomor Viktor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25518 stopped by Csomor Viktor.



[jira] [Updated] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-16 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla updated HIVE-25534:
--
Description: 
Do not preserve extended attributes (xattrs) when invoking DistCp.
{code:java}
2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
[HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least one 
file system: 
 org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
supported for file system: s3a://hmangla1-dev
 at 
org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
 ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
 at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
 at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
 at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
 at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]{code}
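The intended change can be sketched as filtering `FileAttribute.XATTR` out of the set of attributes handed to DistCp before the job is created. The enum and helper below are simplified stand-ins, not Hadoop's actual `DistCpOptions` API:

```java
import java.util.EnumSet;

public class XattrFilter {
    // Illustrative stand-in for org.apache.hadoop.tools.DistCpOptions.FileAttribute.
    enum FileAttribute { REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, XATTR, ACL }

    // Drop XATTR so DistCp never tries to preserve extended attributes on
    // file systems (e.g. s3a) that do not support them.
    static EnumSet<FileAttribute> withoutXattr(EnumSet<FileAttribute> preserve) {
        EnumSet<FileAttribute> filtered = EnumSet.copyOf(preserve);
        filtered.remove(FileAttribute.XATTR);
        return filtered;
    }

    public static void main(String[] args) {
        EnumSet<FileAttribute> preserve = EnumSet.of(
            FileAttribute.USER, FileAttribute.GROUP, FileAttribute.XATTR);
        // XATTR is removed; the other attributes are kept.
        System.out.println(withoutXattr(preserve));
    }
}
```

Filtering at the call site keeps the remaining preserve semantics (owner, group, etc.) intact while avoiding the `XAttrsNotSupportedException` shown above.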

  

  was:
Remove the preserve xattr while calling distcp.
2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: 
[HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least one 
file system: 
org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not 
supported for file system: s3a://hmangla1-dev
at 
org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513)
 ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
at 
org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) 
~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
 


> Don't preserve FileAttribute.XATTR to initialise distcp.
> 
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
>  Issue Type: Bug
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>


[jira] [Assigned] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.

2021-09-16 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla reassigned HIVE-25534:
-




[jira] [Updated] (HIVE-25533) With CBO enabled, Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Summary: With CBO enabled, Incorrect query result when using where CLAUSE 
to query data from 2 "UNION ALL" parts  (was: Incorrect query result when using 
where CLAUSE to query data from 2 "UNION ALL" parts)

> With CBO enabled, Incorrect query result when using where CLAUSE to query 
> data from 2 "UNION ALL" parts
> ---
>
> Key: HIVE-25533
> URL: https://issues.apache.org/jira/browse/HIVE-25533
> Project: Hive
>  Issue Type: Bug
>  Components: Database/Schema
>Affects Versions: 3.1.0
> Environment: Azure HDInsight 4.1.7.5
> Hive 3.1.0
>Reporter: Needn Yu
>Priority: Critical
> Attachments: hive.png
>
>
> When querying from a view or CTE that applies "UNION ALL" to 2 tables, such 
> as the following script shows:
> {code:java}
> CREATE TABLE n1 (c1 STRING);
> INSERT OVERWRITE TABLE n1 VALUES('needn');
> CREATE VIEW v1 
> AS
> SELECT 'maggie'  AS c1 FROM n1
> UNION ALL
> SELECT c1 FROM n1;
> {code}
> Querying this view returns an incorrect result when "=" or "IN" is used with 
> a single element. For example, the following two queries return nothing.
> {code:java}
> SELECT * FROM v1 WHERE c1 = 'maggie';
> SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
>  
> However, the correct result is returned when "LIKE" or "IN" is used with 
> multiple elements. For example, the following two queries return the 
> expected result.
> {code:java}
> SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
> SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
> {code}
>  
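Since the updated summary attributes the wrong results to CBO, a possible session-level workaround while the issue is open is to disable cost-based optimization via the standard {{hive.cbo.enable}} setting (behavior should be verified on the affected version):

```sql
-- Workaround sketch: disable CBO for the current session, then rerun the query.
SET hive.cbo.enable=false;
SELECT * FROM v1 WHERE c1 = 'maggie';
```

This trades optimizer quality for correctness and should only be applied to the affected queries.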





[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Description: 
When querying from a view or CTE that applies "UNION ALL" to 2 tables, such as 
the following script shows:
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1 
AS
SELECT 'maggie'  AS c1 FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Querying this view returns an incorrect result when "=" or "IN" is used with a 
single element.

For example, the following two queries return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, the correct result is returned when "LIKE" or "IN" is used with 
multiple elements.

For example, the following two queries return the expected result.
{code:java}
SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
{code}
 

  was:
When querying form a view or CTE which "union all" 2 tables, such as the 
following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1AS
SELECT 'maggie' FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Return the incorrect result when using "=" or "IN" with single element.

For example, the following 2 querys return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, I can get correct result when using "LIKE" or "IN" with multiple 
element.

For example, the following 2 querys return expected result.
{code:java}
SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
{code}
 




[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=652088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652088
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 17/Sep/21 03:49
Start Date: 17/Sep/21 03:49
Worklog Time Spent: 10m 
  Work Description: maheshk114 merged pull request #2645:
URL: https://github.com/apache/hive/pull/2645


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 652088)
Time Spent: 1h 50m  (was: 1h 40m)

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> If the executor host has gone down, activeInstances will be updated with 
> null, so we need to check for empty/null values before accessing it.
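The described fix can be sketched as a defensive lookup. The map name and scheduling logic below are simplified stand-ins for the LLAP scheduler's actual `activeInstances` handling, not Hive's real code:

```java
import java.util.HashMap;
import java.util.Map;

public class ActiveInstanceLookup {
    // Simplified stand-in: host -> instance descriptor. An entry may be null
    // (or the whole map may be null/empty) once an executor host goes down.
    static String pickExecutor(Map<String, String> activeInstances, String host) {
        if (activeInstances == null || activeInstances.isEmpty()) {
            return null; // nothing to schedule on; caller can retry instead of crashing
        }
        String instance = activeInstances.get(host);
        if (instance == null) {
            return null; // host entry removed or nulled out after node failure
        }
        return instance;
    }

    public static void main(String[] args) {
        Map<String, String> instances = new HashMap<>();
        instances.put("node-1", "llap-daemon-1");
        instances.put("node-2", null); // node-2 went down
        System.out.println(pickExecutor(instances, "node-1"));
        System.out.println(pickExecutor(instances, "node-2"));
        System.out.println(pickExecutor(null, "node-1"));
    }
}
```

Returning null (or an empty choice) lets the scheduler fall back to a retry path rather than exiting with a fatal error.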





[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Description: 
When querying form a view or CTE which "union all" 2 tables, such as the 
following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1AS
SELECT 'maggie' FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Return the incorrect result when using "=" or "IN" with single element.

For example, the following 2 querys return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, I can get correct result when using "LIKE" or "IN" with multiple 
element.

For example, the following 2 querys return expected result.
{code:java}
SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
{code}
 

  was:
When querying form a view or CTE which "union all" 2 tables, such as the 
following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1AS
SELECT 'maggie' FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Return the incorrect result when using "=" or "IN" with single element.

For example, the following 2 querys return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, I can get correct result when using "LIKE" or "IN" with multiple 
element.

For example, the following 2 querys return expected result.
{code:java}
SELECT * FROM v1WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1WHERE c1 LIKE 'maggie%';
{code}
 




[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Description: 
When querying form a view or CTE which "union all" 2 tables, such as the 
following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1 VALUES('needn');

CREATE VIEW v1AS
SELECT 'maggie' FROM n1
UNION ALL
SELECT c1 FROM n1;
{code}
Return the incorrect result when using "=" or "IN" with single element.

For example, the following 2 querys return nothing.
{code:java}
SELECT * FROM v1 WHERE c1 = 'maggie';
SELECT * FROM v1 WHERE c1 IN ('maggie');{code}
 

However, I can get correct result when using "LIKE" or "IN" with multiple 
element.

For example, the following 2 querys return expected result.
{code:java}
SELECT * FROM v1WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1WHERE c1 LIKE 'maggie%';
{code}
 

  was:
When querying form a view or CTE which "union all" 2 tables, such as the 
following script shows
{code:java}
CREATE TABLE n1 (c1 STRING);

INSERT OVERWRITE TABLE n1VALUES('needn');

CREATE VIEW v1AS
SELECT 'maggie' FROM n1
UNION ALL
SELECT c1 FROM v1;
{code}
Return the incorrect result when using "=" or "IN" with single element.

For example, the following 2 querys return nothing.
{code:java}
SELECT * FROM v1WHERE c1 = 'maggie';
SELECT * FROM v1WHERE c1 IN ('maggie');{code}
 

However, I can get correct result when using "LIKE" or "IN" with multiple 
element.

For example, the following 2 querys return expected result.
{code:java}
SELECT * FROM v1WHERE c1 IN ('maggie','This is a bug');
SELECT * FROM v1WHERE c1 LIKE 'maggie%';
{code}
 




[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Attachment: hive.png



[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

2021-09-16 Thread Needn Yu (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Needn Yu updated HIVE-25533:

Attachment: (was: 微信图片_20210917111715.png)






[jira] [Updated] (HIVE-25532) Fix authorization support for Kill Query Command

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25532:
--
Labels: pull-request-available  (was: )

> Fix authorization support for Kill Query Command
> 
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We added authorization for the Kill Query command some time back with the 
> help of Ranger; see https://issues.apache.org/jira/browse/RANGER-1851
> However, we have observed that this hasn't been working as expected. The 
> Ranger service expects Hive to send a privilege object of the type 
> SERVICE_NAME, but as we can see at
>  
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
>  it is sending an empty array list.
>  Since the Ranger service never throws an exception for this, any user can 
> kill any query even without the necessary permissions.
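The description boils down to this: an empty privilege-object list gives the authorizer nothing to check, so every caller passes. A minimal sketch with a hypothetical `Privilege` type and `checkPrivileges` method (not the actual Hive `HivePrivilegeObject` or Ranger API) illustrates the difference:

```java
import java.util.List;

public class KillQueryAuthSketch {

    // Hypothetical privilege object; stands in for Hive's HivePrivilegeObject.
    static final class Privilege {
        final String type;
        Privilege(String type) { this.type = type; }
    }

    // Hypothetical authorizer: iterating an empty list authorizes nothing,
    // so every caller passes -- this mirrors the bug in KillQueryImpl.
    static void checkPrivileges(boolean callerIsAdmin, List<Privilege> objects) {
        for (Privilege p : objects) {
            if ("SERVICE_NAME".equals(p.type) && !callerIsAdmin) {
                throw new SecurityException("caller lacks admin permission on the service");
            }
        }
    }

    public static void main(String[] args) {
        // Buggy call: empty list, so a non-admin passes unchallenged.
        checkPrivileges(false, List.of());

        // Fixed call: a SERVICE_NAME object is passed, so a non-admin is rejected.
        boolean denied = false;
        try {
            checkPrivileges(false, List.of(new Privilege("SERVICE_NAME")));
        } catch (SecurityException e) {
            denied = true;
        }
        System.out.println(denied);   // prints: true
    }
}
```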





[jira] [Work logged] (HIVE-25532) Fix authorization support for Kill Query Command

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?focusedWorklogId=651991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651991
 ]

ASF GitHub Bot logged work on HIVE-25532:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 20:46
Start Date: 16/Sep/21 20:46
Worklog Time Spent: 10m 
  Work Description: achennagiri opened a new pull request #2649:
URL: https://github.com/apache/hive/pull/2649


   ### What changes were proposed in this pull request?
We added authorization support for the Kill Query command a while back; see 
the ticket https://issues.apache.org/jira/browse/RANGER-1851
   
   However, we have observed that this hasn't been working as expected. The 
Ranger service expects Hive to send a privilege object of the type 
SERVICE_NAME, but as we can see at
   
https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131
 it is sending an empty array list. 
   Since the Ranger service never throws an exception for this, any user can 
kill any query even without the necessary permissions.
   
   
   ### Why are the changes needed?
Currently, any user can kill any other query using the query id. KILL QUERY is 
an admin-level command, and a user must have the necessary permissions to 
execute it; without them, the command should fail.
   We need this fix to address that bug.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   
   ### How was this patch tested?
This patch was used to build the hive-service jar, and that dev jar was 
deployed on a cluster running the Hive and Ranger services. The HiveServer2 
logs confirmed that the checkPrivileges() call throws an exception for a user 
without sufficient permissions (any user without the SERVICE_ADMIN permission 
is not allowed to execute KILL QUERY).
   
   
   Also, the logs are audited in Ranger and are as expected.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651991)
Remaining Estimate: 0h
Time Spent: 10m

> Fix authorization support for Kill Query Command
> 
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651976
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 19:39
Start Date: 16/Sep/21 19:39
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710421984



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
 this.conf = conf;
 
+int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+long getConnectionTimeoutMs = 30000;
 synchronized (TxnHandler.class) {
   if (connPool == null) {
-Connection dbConn = null;
-// Set up the JDBC connection pool
-try {
-  int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-  long getConnectionTimeoutMs = 30000;
-  connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);
-  /*the mutex pools should ideally be somewhat larger since some 
operations require 1
+connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);

Review comment:
   > What I mean is if there is some fatal issue in the DB connection for 
instance, all the threads will try the same path and fail. It's better to just 
fail once instead.
   
   By the way, in our case it would have recovered as the db connection becomes 
available after a while :), but yeah, it's hard to generalize the case
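The pattern in the diff above reduces to the following sketch (plain Java, not the actual TxnHandler code): configuration values are read outside the lock, and the pool is created at most once under a class-level lock by whichever thread first finds it null. If setup throws, the next caller retries, which is the trade-off being debated in this review thread.

```java
public class LazyPoolSketch {

    private static Object connPool;   // stands in for the JDBC connection pool

    // Same shape as TxnHandler.setConf after the patch: cheap config reads happen
    // before the lock; pool creation happens at most once under the class lock.
    public static Object getPool(int maxPoolSize, long getConnectionTimeoutMs) {
        synchronized (LazyPoolSketch.class) {
            if (connPool == null) {
                connPool = setupJdbcConnectionPool(maxPoolSize, getConnectionTimeoutMs);
            }
            return connPool;
        }
    }

    // Placeholder for the real pool setup; it may throw on a bad DB connection,
    // in which case the next caller will hit the null check and retry.
    private static Object setupJdbcConnectionPool(int maxPoolSize, long timeoutMs) {
        return new Object();
    }

    public static void main(String[] args) {
        Object a = getPool(10, 30000L);
        Object b = getPool(10, 30000L);
        System.out.println(a == b);   // prints: true -- the pool is created once
    }
}
```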






Issue Time Tracking
---

Worklog Id: (was: 651976)
Time Spent: 3h  (was: 2h 50m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We randomly hit a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651973
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 19:33
Start Date: 16/Sep/21 19:33
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710418471



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
 this.conf = conf;
 
+int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+long getConnectionTimeoutMs = 30000;
 synchronized (TxnHandler.class) {
   if (connPool == null) {
-Connection dbConn = null;
-// Set up the JDBC connection pool
-try {
-  int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-  long getConnectionTimeoutMs = 30000;
-  connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);
-  /*the mutex pools should ideally be somewhat larger since some 
operations require 1
+connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);

Review comment:
   Yeah, it is maybe better... Though I don't see anything about turning off the 
txn handler; the only option is metastore.txn.store.impl, but it always has a 
default value.
   
   @pvary  @deniskuzZ any thoughts on whether we prefer eager initialization?






Issue Time Tracking
---

Worklog Id: (was: 651973)
Time Spent: 2h 50m  (was: 2h 40m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651951&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651951
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 18:59
Start Date: 16/Sep/21 18:59
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710396047



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
 this.conf = conf;
 
+int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+long getConnectionTimeoutMs = 30000;
 synchronized (TxnHandler.class) {
   if (connPool == null) {
-Connection dbConn = null;
-// Set up the JDBC connection pool
-try {
-  int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-  long getConnectionTimeoutMs = 30000;
-  connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);
-  /*the mutex pools should ideally be somewhat larger since some 
operations require 1
+connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);

Review comment:
> potentially add more connections for HMS users who do not use any 
Txn functions
   
   Yea that's true .. I wonder if there is a way to know that the txn feature 
will be used beforehand.
   
   > Not sure all the threads will inevitably fail though, after the first 
succeeds they will skip
   
   What I mean is if there is some fatal issue in the DB connection for 
instance, all the threads will try the same path and fail. It's better to just 
fail once instead.






Issue Time Tracking
---

Worklog Id: (was: 651951)
Time Spent: 2h 40m  (was: 2.5h)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>

[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651945&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651945
 ]

ASF GitHub Bot logged work on HIVE-25317:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 18:51
Start Date: 16/Sep/21 18:51
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2459:
URL: https://github.com/apache/hive/pull/2459#discussion_r710388364



##
File path: llap-server/pom.xml
##
@@ -38,6 +38,7 @@
   org.apache.hive
   hive-exec
   ${project.version}
+  core

Review comment:
   we could relocate/shade away those deps to make it possible for other 
projects to use the normal artifact  - seems like there is a very good list in 
the trino project.






Issue Time Tracking
---

Worklog Id: (was: 651945)
Time Spent: 3h 40m  (was: 3.5h)

> Relocate dependencies in shaded hive-exec module
> 
>
> Key: HIVE-25317
> URL: https://issues.apache.org/jira/browse/HIVE-25317
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.8
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> When we want to use the shaded version of hive-exec (i.e., w/o classifier), more 
> dependencies conflict with Spark. We need to relocate these dependencies too.





[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651928
 ]

ASF GitHub Bot logged work on HIVE-25317:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 18:23
Start Date: 16/Sep/21 18:23
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2459:
URL: https://github.com/apache/hive/pull/2459#discussion_r710369202



##
File path: llap-server/pom.xml
##
@@ -38,6 +38,7 @@
   org.apache.hive
   hive-exec
   ${project.version}
+  core

Review comment:
   @kgyrtkirk Guava is shaded in branch-2.3 via 
https://issues.apache.org/jira/browse/HIVE-23980. The issue is, in order for 
Spark to use the shaded `hive-exec`, Hive will need to relocate more classes and 
at the same time make sure it won't break other modules (for instance, if a 
shaded class appears in a certain API and another module imports the unshaded 
version of the class by itself).
   
   Currently we've abandoned this approach and decided to shade the 
`hive-exec-core` within Spark itself, following similar approach in Trino (see 
https://github.com/trinodb/trino-hive-apache).
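For context, the relocation being discussed is typically configured with the maven-shade-plugin. A minimal illustrative fragment (the package patterns here are examples only, not the actual relocation list used by Hive or by the Trino project) looks like:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Example only: move Guava under a Hive-private package so it cannot
           clash with the Guava version already on Spark's classpath. -->
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.hive.com.google.common</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

Each `<relocation>` rewrites both the class files and the references to them inside the shaded jar, which is why every leaked dependency package needs its own entry.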






Issue Time Tracking
---

Worklog Id: (was: 651928)
Time Spent: 3.5h  (was: 3h 20m)

> Relocate dependencies in shaded hive-exec module
> 
>
> Key: HIVE-25317
> URL: https://issues.apache.org/jira/browse/HIVE-25317
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.8
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When we want to use the shaded version of hive-exec (i.e., w/o classifier), more 
> dependencies conflict with Spark. We need to relocate these dependencies too.





[jira] [Updated] (HIVE-25532) Fix authorization support for Kill Query Command

2021-09-16 Thread Abhay (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhay updated HIVE-25532:
-
Description: 
We added authorization for the Kill Query command some time back with the help 
of Ranger; see https://issues.apache.org/jira/browse/RANGER-1851

However, we have observed that this hasn't been working as expected. The Ranger 
service expects Hive to send a privilege object of the type SERVICE_NAME, but 
as we can see at
 
[https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
 it is sending an empty array list.
 Since the Ranger service never throws an exception for this, any user can kill 
any query even without the necessary permissions.

  was:
We added authorization for Kill Query command some time back with the help of 
Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851

However, we have observed that this hasn't been working as expected. The Ranger 
service expects Hive to send in a privilege object of the type SERVICE_NAME but 
we can see below
[https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
 that it is sending an empty array list. 
The Ranger service never throws an exception to this and this results in any 
user being able to kill any other query even though they don't have necessary 
permissions.


> Fix authorization support for Kill Query Command
> 
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>





[jira] [Assigned] (HIVE-25532) Fix authorization support for Kill Query Command

2021-09-16 Thread Abhay (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhay reassigned HIVE-25532:



> Fix authorization support for Kill Query Command
> 
>
> Key: HIVE-25532
> URL: https://issues.apache.org/jira/browse/HIVE-25532
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Abhay
>Assignee: Abhay
>Priority: Major
>
> We added authorization for Kill Query command some time back with the help of 
> Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851
> However, we have observed that this hasn't been working as expected. The 
> Ranger service expects Hive to send in a privilege object of the type 
> SERVICE_NAME but we can see below
> [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131]
>  that it is sending an empty array list. 
> The Ranger service never throws an exception to this and this results in any 
> user being able to kill any other query even though they don't have necessary 
> permissions.





[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651911
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 17:37
Start Date: 16/Sep/21 17:37
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710336872



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
 this.conf = conf;
 
+int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+long getConnectionTimeoutMs = 30000;
 synchronized (TxnHandler.class) {
   if (connPool == null) {
-Connection dbConn = null;
-// Set up the JDBC connection pool
-try {
-  int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-  long getConnectionTimeoutMs = 30000;
-  connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);
-  /*the mutex pools should ideally be somewhat larger since some 
operations require 1
+connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);

Review comment:
   Do you mean eager initialization on HMS startup?  I had thought about it too 
and initially did not want to change the behavior and potentially add more 
connections for HMS users who do not use any Txn functions.  Is that what you 
mean?  (Not sure all the threads will inevitably fail though; after the first 
succeeds they will skip)

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
 this.conf = conf;
 
+int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+long getConnectionTimeoutMs = 30000;
 synchronized (TxnHandler.class) {
   if (connPool == null) {
-Connection dbConn = null;
-// Set up the JDBC connection pool
-try {
-  int maxPoolSize = MetastoreConf.getIntVar(conf, 
ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-  long getConnectionTimeoutMs = 30000;
-  connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);
-  /*the mutex pools should ideally be somewhat larger since some 
operations require 1
+connPool = setupJdbcConnectionPool(conf, maxPoolSize, 
getConnectionTimeoutMs);

Review comment:
   Do you mean eager initialization on HMS startup?  It's possible, and I had 
thought about it too, but initially did not want to change the behavior and 
potentially add more connections for HMS users who do not use any Txn 
functions.  Is that what you mean?  (Not sure all the threads will inevitably 
fail though; after the first succeeds they will skip)






Issue Time Tracking
---

Worklog Id: (was: 651911)
Time Spent: 2.5h  (was: 2h 20m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651910&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651910
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 17:36
Start Date: 16/Sep/21 17:36
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710336872



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
     this.conf = conf;
 
+    int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+    long getConnectionTimeoutMs = 30000;
     synchronized (TxnHandler.class) {
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 30000;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);

Review comment:
   Do you mean eager initialization on HMS startup?  I had thought of it too, 
but initially did not want to change the behavior and add more connections for 
HMS users who do not use any Txn functions.  Is that what you mean?  (Not sure 
all the threads will inevitably fail though; after the first one succeeds, the 
rest will skip the setup.)
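The "after the first succeeds they will skip" point can be demonstrated with a minimal, self-contained sketch. The names below are simplified stand-ins for TxnHandler's static pool initialization, not the real Hive code: the class-level lock plus the null check guarantee that only one thread ever runs the expensive setup.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyPoolInit {
    private static Object connPool;                           // stand-in for the shared DataSource
    static final AtomicInteger setupCalls = new AtomicInteger();

    static void setConf() {
        synchronized (LazyPoolInit.class) {
            if (connPool == null) {                           // later threads see non-null and skip
                connPool = setupPool();
            }
        }
    }

    private static Object setupPool() {
        setupCalls.incrementAndGet();                         // counts threads that actually ran setup
        return new Object();
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(LazyPoolInit::setConf);
        }
        for (Thread t : workers) t.start();
        for (Thread t : workers) t.join();
        System.out.println(setupCalls.get());                 // prints 1
    }
}
```

Whether the remaining threads fail thus depends only on whether the first attempt succeeded: on success they skip the setup entirely, which is the behavior the comment describes.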






Issue Time Tracking
---

Worklog Id: (was: 651910)
Time Spent: 2h 20m  (was: 2h 10m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651909
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 17:36
Start Date: 16/Sep/21 17:36
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710336872



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
     this.conf = conf;
 
+    int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+    long getConnectionTimeoutMs = 30000;
     synchronized (TxnHandler.class) {
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 30000;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);

Review comment:
   Do you mean eager initialization on HMS startup?  I had thought of it too, 
but initially did not want to change the behavior and add more connections, 
potentially for nothing, for HMS users who do not use any Txn functions.  Is 
that what you mean?  (Not sure all the threads will inevitably fail though; 
after the first one succeeds, the rest will skip the setup.)






Issue Time Tracking
---

Worklog Id: (was: 651909)
Time Spent: 2h 10m  (was: 2h)


[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651893
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 17:16
Start Date: 16/Sep/21 17:16
Worklog Time Spent: 10m 
  Work Description: sunchao commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710321412



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
     this.conf = conf;
 
+    int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+    long getConnectionTimeoutMs = 30000;
     synchronized (TxnHandler.class) {
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 30000;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);

Review comment:
   I wonder if we should initialize these static fields separately (perhaps 
in HiveMetaStore) instead of within each handler thread. Otherwise, all the 
handler threads will try this code and fail inevitably. 
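The alternative raised here can be sketched as follows; the names (initSharedState, a startup hook in the server) are hypothetical, not HiveMetaStore's actual API. With eager initialization, the one-time setup runs once before handler threads start, so a broken backend fails fast at startup rather than inside every handler thread.

```java
public class EagerStartupInit {
    private static Object connPool;   // stand-in for the shared JDBC pool

    /** Hypothetical startup hook: call once before the handler thread pool starts. */
    static void initSharedState() {
        if (connPool == null) {
            connPool = new Object();  // stands in for setupJdbcConnectionPool(conf, ...)
        }
    }

    /** Handler threads only read the field; no locking or lazy setup on the hot path. */
    static boolean ready() {
        return connPool != null;
    }

    public static void main(String[] args) {
        initSharedState();            // server startup, before any handler runs
        System.out.println(ready());  // prints true
    }
}
```

The trade-off the thread discusses is exactly this: eager setup opens connections even for HMS users who never call any Txn functions, while lazy setup defers the cost but moves failures into the handler threads.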






Issue Time Tracking
---

Worklog Id: (was: 651893)
Time Spent: 2h  (was: 1h 50m)


[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651862&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651862
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 16:32
Start Date: 16/Sep/21 16:32
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on pull request #2647:
URL: https://github.com/apache/hive/pull/2647#issuecomment-921054439


   Thanks for the review.  Yeah, that's another option (or an 'initialized' 
boolean set only at the end), but I was afraid it would leave behind some 
connections or other state needing cleanup if we partially initialize and then 
re-initialize, though maybe it would not.




Issue Time Tracking
---

Worklog Id: (was: 651862)
Time Spent: 1h 50m  (was: 1h 40m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer: 
> Error occurred during processing of message.
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>  ~[hive-exec-3.1.2.jar:3.1.2]
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) 
> ~[hive-exec-3.1.2.jar:3.1.2]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>  ~[hive-exec-3.1.2.jar:3.1.2]
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown 
> Source) ~[?:?]
>   at 
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:?]
>   at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>
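The stack trace above can be reduced to a small reconstruction of the suspected failure mode (simplified stand-ins below, not the real TxnHandler): in the old code a single null check guarded the whole setup block, so a transient failure after connPool was assigned but before sqlGenerator was assigned left every later setConf() call skipping the block, and lock() then dereferenced the still-null field.

```java
public class PartialInitNpe {
    private static Object connPool;
    private static Object sqlGenerator;    // read later by lock()

    static void setConf(boolean dbHealthy) {
        synchronized (PartialInitNpe.class) {
            if (connPool == null) {        // one null check guards the whole block, as in the old code
                try {
                    connPool = new Object();
                    if (!dbHealthy) {
                        throw new IllegalStateException("DB unreachable");
                    }
                    sqlGenerator = new Object();
                } catch (IllegalStateException e) {
                    // failure swallowed: connPool is set, sqlGenerator is not
                }
            }
        }
    }

    static boolean lockWouldNpe() {
        try {
            sqlGenerator.toString();       // the dereference that NPEs in enqueueLockWithRetry
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        setConf(false);                     // first caller hits a transient failure mid-setup
        setConf(true);                      // later callers skip: connPool is already non-null
        System.out.println(lockWouldNpe()); // prints true
    }
}
```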

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651861&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651861
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 16:31
Start Date: 16/Sep/21 16:31
Worklog Time Spent: 10m 
  Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710281336



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
 
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 30000;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 30000;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
           connection from each pool and we want to avoid taking a connection from primary pool
           and then blocking because mutex pool is empty.  There is only 1 thread in any HMS trying
           to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock.  The CheckLock operation gets a
           connection from connPool first, then connPoolMutex.  All others, go in the opposite
           order (not very elegant...).  So number of connection requests for connPoolMutex cannot
           exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
           determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
         } catch (SQLException e) {
           String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
   Done

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {
     DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
     if (dsp != null) {
-      return dsp.create(conf);
+      try {
+        return dsp.create(conf);
+      } catch (SQLException e) {
+        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
+        LOG.error(msg);
+        throw new RuntimeException(e);

Review comment:
   Done

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {

Review comment:
   Done

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
 this.conf = conf;
 
 synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);

Review comment:
   Done
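Two mechanics from this batch of changes can be sketched together; the types below are hypothetical stand-ins, nothing Hive-specific. Try-with-resources guarantees the probe connection is closed even when its body throws, and the checked exception is wrapped in a RuntimeException the way the reworked setupJdbcConnectionPool does.

```java
public class TryWithResourcesSketch {
    static boolean closed = false;

    /** Stand-in for a JDBC Connection. */
    static class FakeConn implements AutoCloseable {
        @Override
        public void close() {
            closed = true;                 // runs before the exception propagates
        }
    }

    static void probe(boolean fail) {
        try (FakeConn conn = new FakeConn()) {
            if (fail) {
                throw new IllegalStateException("boom");   // stand-in for SQLException
            }
        } catch (IllegalStateException e) {
            // wrap-and-rethrow unchecked, as the patched setupJdbcConnectionPool does
            throw new RuntimeException("Unable to instantiate JDBC connection pooling, " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        try {
            probe(true);
        } catch (RuntimeException e) {
            System.out.println(closed);    // prints true: the connection was closed despite the failure
        }
    }
}
```

This is why the patch could drop the manually tracked `Connection dbConn = null;` variable: the resource's close path no longer depends on reaching a finally block by hand.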





[jira] [Work logged] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25529?focusedWorklogId=651816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651816
 ]

ASF GitHub Bot logged work on HIVE-25529:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:34
Start Date: 16/Sep/21 15:34
Worklog Time Spent: 10m 
  Work Description: marton-bod merged pull request #2644:
URL: https://github.com/apache/hive/pull/2644


   




Issue Time Tracking
---

Worklog Id: (was: 651816)
Time Spent: 20m  (was: 10m)

> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25485?focusedWorklogId=651811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651811
 ]

ASF GitHub Bot logged work on HIVE-25485:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:26
Start Date: 16/Sep/21 15:26
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2608:
URL: https://github.com/apache/hive/pull/2608#discussion_r710227073



##
File path: ql/src/test/results/clientpositive/llap/union_literals.q.out
##
@@ -0,0 +1,397 @@
+PREHOOK: query: explain
+SELECT * FROM (
+   VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+   AS Colors
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+ A masked pattern was here 
+POSTHOOK: query: explain
+SELECT * FROM (
+   VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+   AS Colors
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+ A masked pattern was here 
+STAGE DEPENDENCIES:
+  Stage-0 is a root stage
+
+STAGE PLANS:
+  Stage: Stage-0
+Fetch Operator
+  limit: -1
+  Processor Tree:
+TableScan
+  alias: _dummy_table
+  Row Limit Per Split: 1
+  Select Operator
+expressions: array(const struct(1,'1'),const struct(2,'orange'),const struct(5,'yellow'),const struct(10,'green'),const struct(11,'blue'),const struct(12,'indigo'),const struct(20,'violet')) (type: array<struct<col1:int,col2:string>>)
+outputColumnNames: _col0
+UDTF Operator
+  function name: inline
+  Select Operator
+expressions: col1 (type: int), col2 (type: string)
+outputColumnNames: _col0, _col1
+ListSink
+
+PREHOOK: query: explain
+SELECT * FROM (
+   VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+   AS Colors
+union all
+  select 2,'2'
+union all
+  select 2,'2'
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+ A masked pattern was here 
+POSTHOOK: query: explain
+SELECT * FROM (
+   VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+   AS Colors
+union all
+  select 2,'2'
+union all
+  select 2,'2'
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+ A masked pattern was here 
+STAGE DEPENDENCIES:
+  Stage-1 is a root stage
+  Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+  Stage: Stage-1
+Tez
+ A masked pattern was here 
+  Edges:
+Map 1 <- Union 2 (CONTAINS)
+Map 3 <- Union 2 (CONTAINS)
+ A masked pattern was here 
+  Vertices:
+Map 1 
+Map Operator Tree:
+TableScan
+  alias: _dummy_table
+  Row Limit Per Split: 1
+  Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE 
Column stats: COMPLETE
+  Select Operator
+expressions: array(const struct(2,'2'),const struct(2,'2')) (type: array<struct<col1:int,col2:string>>)
+outputColumnNames: _col0
+Statistics: Num rows: 1 Data size: 56 Basic stats: 
COMPLETE Column stats: COMPLETE
+UDTF Operator
+  Statistics: Num rows: 1 Data size: 56 Basic stats: 
COMPLETE Column stats: COMPLETE
+  function name: inline
+  Select Operator
+expressions: col1 (type: int), col2 (type: string)
+outputColumnNames: _col0, _col1
+Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE
+File Output Operator
+  compressed: false
+  Statistics: Num rows: 2 Data size: 16 Basic stats: 
COMPLETE Column stats: COMPLETE
+  table:
+  input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
+  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+  serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+Execution mode: llap
+LLAP IO: no inputs
+Map 3 
+Map Operator Tree:
+TableScan
+  alias: _dummy_table
+  Row Limit Per Split: 1
+  Statistics: Num rows: 1 Data size: 10 Basic stats: 

[jira] [Work logged] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25485?focusedWorklogId=651787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651787
 ]

ASF GitHub Bot logged work on HIVE-25485:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:07
Start Date: 16/Sep/21 15:07
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2608:
URL: https://github.com/apache/hive/pull/2608#discussion_r710210766



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveTransformSimpleSelectsToInlineTableInUnion.java
##
@@ -0,0 +1,214 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.optimizer.calcite.rules;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelRecordType;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException;
+import org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable;
+import org.apache.hadoop.hive.ql.optimizer.calcite.TraitsUtil;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableFunctionScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveUnion;
+import org.apache.hadoop.hive.ql.optimizer.calcite.translator.SqlFunctionConverter;
+import org.apache.hadoop.hive.ql.parse.SemanticAnalyzer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.collect.ImmutableList;
+
+/**
+ * Transforms SELECTS of literals under UNION ALL into inline table scans.
+ */
+public class HiveTransformSimpleSelectsToInlineTableInUnion extends RelOptRule {

Review comment:
   I was unable to give it a decent name
   renamed the class :+1: 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651787)
Time Spent: 50m  (was: 40m)

> Transform selects of literals under a UNION ALL to inline table scan
> 
>
> Key: HIVE-25485
> URL: https://issues.apache.org/jira/browse/HIVE-25485
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
> select 1
> union all
> select 1
> union all
> [...]
> union all
> select 1
> {code}
> results in a very big plan, which will have vertexes proportional to the 
> number of union all branches - hence it could be slow to execute
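The rewrite the ticket describes can be pictured with a small, library-free sketch: instead of keeping one union input (and hence one plan vertex) per literal-only SELECT branch, all such branches are folded into a single inline-table scan. The class and method names below are illustrative only, not Hive's actual rule API:

```java
import java.util.ArrayList;
import java.util.List;

public class UnionLiteralCollapse {

    // A branch qualifies for folding when every projected expression is a
    // literal (mirrors the rule's precondition: SELECTs of literals only).
    static boolean isLiteralOnly(List<?> row) {
        for (Object e : row) {
            if (!(e instanceof Number || e instanceof String || e instanceof Boolean)) {
                return false;
            }
        }
        return true;
    }

    // Returns how many union inputs survive the rewrite: every non-literal
    // branch stays, and all literal-only branches collapse into one
    // inline-table scan (if there were any).
    static int survivingUnionInputs(List<? extends List<?>> branches) {
        int nonLiteral = 0;
        int literal = 0;
        for (List<?> b : branches) {
            if (isLiteralOnly(b)) {
                literal++;
            } else {
                nonLiteral++;
            }
        }
        return nonLiteral + (literal > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        // 1000 copies of "select 1" union all'd together...
        List<List<?>> branches = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            branches.add(List.of(1));
        }
        // ...collapse from 1000 union inputs to a single inline-table scan.
        System.out.println(survivingUnionInputs(branches)); // prints 1
    }
}
```

The point of the counting is only to make the complexity argument concrete: the number of vertexes stops growing with the number of literal branches.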



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651785
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:06
Start Date: 16/Sep/21 15:06
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710210043



##
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1820,6 +1830,8 @@ private static boolean removeFromRunningTaskMap(TreeMap

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651784
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:06
Start Date: 16/Sep/21 15:06
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710209656



##
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1447,23 +1454,26 @@ private SelectHostResult selectHost(TaskInfo request, Map
         if (request.shouldForceLocality()) {
           requestedHostsWillBecomeAvailable = true;
         } else {
-          LlapServiceInstance inst = activeInstances.getByHost(host).stream().findFirst().get();
-          NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
-          if (nodeInfo != null && nodeInfo.getEnableTime() > request.getLocalityDelayTimeout()
-              && nodeInfo.isDisabled() && nodeInfo.hadCommFailure()) {
-            LOG.debug("Host={} will not become available within requested timeout", nodeInfo);
-            // This node will likely be activated after the task timeout expires.
-          } else {
-            // Worth waiting for the timeout.
-            requestedHostsWillBecomeAvailable = true;
+          for (LlapServiceInstance inst : activeInstancesByHost) {
+            NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
+            if (nodeInfo == null) {
+              LOG.warn("Null NodeInfo when attempting to get host {}", host);
+              // Leave requestedHostWillBecomeAvailable as is. If some other host is found - delay,
+              // else ends up allocating to a random host immediately.
+              continue;

Review comment:
   we can avoid continue by changing second if to else if
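The reviewer's suggestion can be illustrated with a stdlib-only sketch: the `continue` on a missing `NodeInfo` is equivalent to folding the two checks into an if / else-if / else chain. The record and method names here are stand-ins, not the actual scheduler types:

```java
import java.util.List;
import java.util.Map;

public class ElseIfOverContinue {
    // Minimal stand-in for the scheduler's NodeInfo.
    record NodeInfo(boolean disabled, boolean commFailure, long enableTime) {}

    // Shape used in the patch: skip unknown nodes with continue.
    static boolean willBecomeAvailableWithContinue(List<String> insts, Map<String, NodeInfo> nodes, long timeout) {
        boolean available = false;
        for (String id : insts) {
            NodeInfo n = nodes.get(id);
            if (n == null) {
                continue; // unknown node: leave the flag as is
            }
            if (n.enableTime() > timeout && n.disabled() && n.commFailure()) {
                // will not come back in time: leave the flag as is
            } else {
                available = true; // worth waiting for the timeout
            }
        }
        return available;
    }

    // Reviewer's shape: identical behavior as an if / else-if / else chain.
    static boolean willBecomeAvailableWithElseIf(List<String> insts, Map<String, NodeInfo> nodes, long timeout) {
        boolean available = false;
        for (String id : insts) {
            NodeInfo n = nodes.get(id);
            if (n == null) {
                // unknown node: leave the flag as is
            } else if (n.enableTime() > timeout && n.disabled() && n.commFailure()) {
                // will not come back in time: leave the flag as is
            } else {
                available = true; // worth waiting for the timeout
            }
        }
        return available;
    }
}
```

Both versions compute the same result; the else-if form just makes the three mutually exclusive cases explicit without a jump.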




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651784)
Time Spent: 1.5h  (was: 1h 20m)

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25516) ITestDbTxnManager is broken after HIVE-24120

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25516?focusedWorklogId=651783&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651783
 ]

ASF GitHub Bot logged work on HIVE-25516:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:05
Start Date: 16/Sep/21 15:05
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged pull request #2637:
URL: https://github.com/apache/hive/pull/2637


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651783)
Time Spent: 20m  (was: 10m)

> ITestDbTxnManager is broken after HIVE-24120
> 
>
> Key: HIVE-25516
> URL: https://issues.apache.org/jira/browse/HIVE-25516
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651782
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:05
Start Date: 16/Sep/21 15:05
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710209073



##
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1447,23 +1454,26 @@ private SelectHostResult selectHost(TaskInfo request, Map
         if (request.shouldForceLocality()) {
           requestedHostsWillBecomeAvailable = true;
         } else {
-          LlapServiceInstance inst = activeInstances.getByHost(host).stream().findFirst().get();
-          NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
-          if (nodeInfo != null && nodeInfo.getEnableTime() > request.getLocalityDelayTimeout()
-              && nodeInfo.isDisabled() && nodeInfo.hadCommFailure()) {
-            LOG.debug("Host={} will not become available within requested timeout", nodeInfo);
-            // This node will likely be activated after the task timeout expires.
-          } else {
-            // Worth waiting for the timeout.
-            requestedHostsWillBecomeAvailable = true;
+          for (LlapServiceInstance inst : activeInstancesByHost) {
+            NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
+            if (nodeInfo == null) {
+              LOG.warn("Null NodeInfo when attempting to get host {}", host);
+              // Leave requestedHostWillBecomeAvailable as is. If some other host is found - delay,
+              // else ends up allocating to a random host immediately.
+              continue;

Review comment:
   we can avoid continue by changing second if to else if




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651782)
Time Spent: 1h 20m  (was: 1h 10m)

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651776&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651776
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 15:00
Start Date: 16/Sep/21 15:00
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710204368



##
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1820,6 +1830,8 @@ private static boolean removeFromRunningTaskMap(TreeMap

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651774&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651774
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 14:58
Start Date: 16/Sep/21 14:58
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710203031



##
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1430,6 +1430,13 @@ private SelectHostResult selectHost(TaskInfo request, Map
     boolean requestedHostsWillBecomeAvailable = false;
     for (String host : requestedHosts) {
       prefHostCount++;
+
+      // Check if the host is removed from the registry after availableHostMap is created.
+      Set<LlapServiceInstance> activeInstancesByHost = activeInstances.getByHost(host);
+      if (activeInstancesByHost == null || activeInstancesByHost.isEmpty()) {
+        continue;
+      }
+
       // Pick the first host always. Weak attempt at cache affinity.
       if (availableHostMap.containsKey(host)) {

Review comment:
   i think having this separate check makes the code more readable.
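The guard under discussion is a classic check-then-use pattern: between snapshotting the available hosts and iterating over the requested ones, a host can be deregistered, so the registry lookup may return null or an empty set. A stdlib-only sketch of the pattern (the registry is modeled as a plain map; names are illustrative, not the scheduler's actual API):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

public class HostGuard {

    // Skip hosts whose instance set vanished from the registry; dereferencing
    // a null/empty set here is what caused the fatal error in the scheduler.
    static int countUsableHosts(Iterable<String> requestedHosts, Map<String, Set<String>> instancesByHost) {
        int usable = 0;
        for (String host : requestedHosts) {
            Set<String> instances = instancesByHost.get(host);
            if (instances == null || instances.isEmpty()) {
                continue; // host gone from the registry: skip instead of dereferencing
            }
            usable++;
        }
        return usable;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> registry = Map.of(
                "host-1", Set.of("worker-1"),
                "host-2", Set.<String>of()); // host-2 lost all executors
        // host-3 was never registered at all; without the guard, get() returns
        // null and the next access would throw.
        System.out.println(countUsableHosts(List.of("host-1", "host-2", "host-3"), registry)); // prints 1
    }
}
```

Keeping the null and empty cases in one explicit check, as the reviewer notes, also documents the race for future readers.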




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651774)
Time Spent: 1h  (was: 50m)

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-25531) Remove the core classified hive-exec artifact

2021-09-16 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-25531:
---


> Remove the core classified hive-exec artifact
> -
>
> Key: HIVE-25531
> URL: https://issues.apache.org/jira/browse/HIVE-25531
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> * this artifact was introduced in HIVE-7423 
> * loading this artifact and the shaded hive-exec (along with the jdbc driver) 
> could create interesting classpath problems
> * if other projects have issues with the shaded hive-exec artifact we must 
> start fixing those problems



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views

2021-09-16 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416148#comment-17416148
 ] 

Stamatis Zampetakis commented on HIVE-25530:


This seems related to HIVE-23479. I think that by fixing HIVE-23479 the 
{{AssertionError}} described here may also disappear. 

> AssertionError when query involves multiple JDBC tables and views
> -
>
> Key: HIVE-25530
> URL: https://issues.apache.org/jira/browse/HIVE-25530
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, HiveServer2
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Assignee: Soumyakanti Das
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: engesc_6056.q
>
>
> An {{AssertionError}} is thrown during compilation when a query contains 
> multiple external JDBC tables and there are available materialized views 
> which can be used to answer the query. 
> The problem can be reproduced by running the scenario in [^engesc_6056.q].
> {code:bash}
> mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q 
> -Dtest.output.overwrite
> {code}
> The stacktrace is shown below:
> {noformat}
> java.lang.AssertionError: Rule's description should be unique; existing 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406)
>   at 
> org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
>   at 
> org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1717)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1589)
>   at 
> org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1341)
>   at 
> 

[jira] [Commented] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views

2021-09-16 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416145#comment-17416145
 ] 

Stamatis Zampetakis commented on HIVE-25530:


Basically any query with more than one external JDBC table and at least one 
view can trigger the problem.

From a quick look the culprit seems to be that we are [instantiating|https://github.com/apache/hive/blob/3e861c5f2775cda4821199681e0e2e9d25654371/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L3023] a new {{JdbcConvention}} for every table in the query. Then when the {{VolcanoPlanner}} runs it will [register the rules|https://github.com/apache/calcite/blob/f3baf348598fcc6bc4f97a0abee3f99309e5bf76/core/src/main/java/org/apache/calcite/plan/AbstractRelOptPlanner.java#L239] for every convention that is not registered till now. Since there is a new convention per table it will trigger the registering of the rules as many times as distinct tables in the query.
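The failure mode described above can be reproduced in miniature without Calcite: a planner that asserts rule-description uniqueness will throw the moment a second convention for the same data source re-registers the same rule. The classes below are a deliberately simplified, hypothetical model of that interaction, not Calcite's actual API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RulePlannerSketch {

    // Stand-in for the planner's rule registry: like Calcite's VolcanoPlanner,
    // it requires every rule description to be unique.
    static class Planner {
        private final Set<String> ruleDescriptions = new HashSet<>();

        void addRule(String description) {
            if (!ruleDescriptions.add(description)) {
                throw new AssertionError("Rule's description should be unique; rule=" + description);
            }
        }
    }

    // Buggy shape: a fresh convention per table means the same converter rule
    // is registered once per table of the same data source.
    static void registerConventionPerTable(Planner planner, int derbyTables) {
        for (int t = 0; t < derbyTables; t++) {
            planner.addRule("JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE)");
        }
    }

    // Fixed shape: cache one convention per data source, so its rules are
    // registered exactly once however many tables reference it.
    static void registerConventionShared(Planner planner, String[] tableSources, Map<String, Boolean> cache) {
        for (String source : tableSources) {
            if (cache.putIfAbsent(source, Boolean.TRUE) == null) {
                planner.addRule("JdbcToEnumerableConverterRule(in:JDBC." + source + ",out:ENUMERABLE)");
            }
        }
    }

    public static void main(String[] args) {
        // Sharing the convention: two DERBY tables, rules registered once.
        registerConventionShared(new Planner(), new String[] {"DERBY", "DERBY"}, new HashMap<>());
        try {
            // One convention per table: the second registration trips the assertion.
            registerConventionPerTable(new Planner(), 2);
        } catch (AssertionError expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

The sketch only argues the mechanism; how the deduplication should actually be keyed in Hive is a design decision for the fix.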

> AssertionError when query involves multiple JDBC tables and views
> -
>
> Key: HIVE-25530
> URL: https://issues.apache.org/jira/browse/HIVE-25530
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, HiveServer2
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Assignee: Soumyakanti Das
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: engesc_6056.q
>
>
> An {{AssertionError}} is thrown during compilation when a query contains 
> multiple external JDBC tables and there are available materialized views 
> which can be used to answer the query. 
> The problem can be reproduced by running the scenario in [^engesc_6056.q].
> {code:bash}
> mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q 
> -Dtest.output.overwrite
> {code}
> The stacktrace is shown below:
> {noformat}
> java.lang.AssertionError: Rule's description should be unique; existing 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406)
>   at 
> org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
>   at 
> org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027)
>   at 
> 

[jira] [Assigned] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views

2021-09-16 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-25530:
--


> AssertionError when query involves multiple JDBC tables and views
> -
>
> Key: HIVE-25530
> URL: https://issues.apache.org/jira/browse/HIVE-25530
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, HiveServer2
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Assignee: Soumyakanti Das
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: engesc_6056.q
>
>
> An {{AssertionError}} is thrown during compilation when a query contains 
> multiple external JDBC tables and there are available materialized views 
> which can be used to answer the query. 
> The problem can be reproduced by running the scenario in [^engesc_6056.q].
> {code:bash}
> mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q 
> -Dtest.output.overwrite
> {code}
> The stacktrace is shown below:
> {noformat}
> java.lang.AssertionError: Rule's description should be unique; existing 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new 
> rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406)
>   at 
> org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229)
>   at 
> org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1717)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1589)
>   at 
> org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1341)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12549)
>   at 
> 

[jira] [Updated] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25529:
--
Labels: pull-request-available  (was: )

> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25529?focusedWorklogId=651644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651644
 ]

ASF GitHub Bot logged work on HIVE-25529:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 13:02
Start Date: 16/Sep/21 13:02
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2644:
URL: https://github.com/apache/hive/pull/2644#discussion_r710095882



##
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java
##
@@ -299,4 +311,68 @@ public static void validateDataWithSQL(TestHiveShell shell, String tableName, Li
       }
     }
   }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param equalityFields List of field names that should play a role in the equality check
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param rowsToDelete The rows that should be deleted. It's enough to fill out the fields that are relevant for the
+   * equality check, as listed in equalityFields, the rest of the fields are ignored
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createEqualityDeleteFile(Table table, String deleteFilePath, List equalityFields,
+      FileFormat fileFormat, List rowsToDelete) throws IOException {
+    List equalityFieldIds = equalityFields.stream()
+        .map(id -> table.schema().findField(id).fieldId())
+        .collect(Collectors.toList());
+    Schema eqDeleteRowSchema = table.schema().select(equalityFields.toArray(new String[]{}));
+
+    FileAppenderFactory appenderFactory = new GenericAppenderFactory(table.schema(), table.spec(),
+        ArrayUtil.toIntArray(equalityFieldIds), eqDeleteRowSchema, null);
+    EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
+        new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));
+
+    PartitionKey part = new PartitionKey(table.spec(), eqDeleteRowSchema);
+    part.partition(rowsToDelete.get(0));
+    EqualityDeleteWriter eqWriter = appenderFactory.newEqDeleteWriter(outputFile, fileFormat, part);
+    try (EqualityDeleteWriter writer = eqWriter) {
+      writer.deleteAll(rowsToDelete);
+    }
+    return eqWriter.toDeleteFile();
+  }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param partitionValues A map of partition values (partitionKey=partitionVal, ...) to be used for the delete file
+   * @param deletes The list of position deletes, each containing the data file path, the position of the row in the
+   *                data file and the row itself that should be deleted
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath, FileFormat fileFormat,

Review comment:
   Thx for the investigation
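One detail worth noting in the quoted helper is the close ordering: the writer is closed through try-with-resources before `toDeleteFile()` is called, because a delete writer's result is only finalized on close. A minimal stand-in sketch of that ordering, using a hypothetical `FakeDeleteWriter` class rather than the real Iceberg API:

```java
import java.util.List;

public class WriterCloseOrder {
    // Stand-in for a delete writer: its result is valid only after close().
    static class FakeDeleteWriter implements AutoCloseable {
        private boolean closed = false;
        private int written = 0;

        void deleteAll(List<String> rows) { written += rows.size(); }

        @Override
        public void close() { closed = true; }

        // Mirrors toDeleteFile(): only meaningful once the writer is closed.
        int toResult() {
            if (!closed) {
                throw new IllegalStateException("writer not closed yet");
            }
            return written;
        }
    }

    static int writeDeletes(List<String> rows) {
        FakeDeleteWriter writer = new FakeDeleteWriter();
        // try-with-resources guarantees close() runs before toResult() below.
        try (FakeDeleteWriter w = writer) {
            w.deleteAll(rows);
        }
        return writer.toResult();
    }

    public static void main(String[] args) {
        System.out.println(writeDeletes(List.of("a", "b", "c"))); // prints 3
    }
}
```

Keeping a reference to the writer outside the try block, as the patch does with `eqWriter`, is what allows the finalized result to be read after the resource scope ends.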




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651644)
Remaining Estimate: 0h
Time Spent: 10m

> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.





[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651641&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651641
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:44
Start Date: 16/Sep/21 12:44
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710079309



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
 
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 3;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
           connection from each pool and we want to avoid taking a connection from primary pool
           and then blocking because mutex pool is empty.  There is only 1 thread in any HMS trying
           to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock.  The CheckLock operation gets a
           connection from connPool first, then connPoolMutex.  All others, go in the opposite
          order (not very elegant...).  So number of connection requests for connPoolMutex cannot
          exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
           determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
         } catch (SQLException e) {
           String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
   Should we update the exception text here, as you are handling pooling exceptions inside setupJdbcConnectionPool? Something like "Unable to determine database product"?
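The refactor quoted above splits one large try block into independent null guards, so each static resource (`connPool`, `connPoolMutex`, `dbProduct`) is initialized exactly once under the class lock. A reduced, self-contained sketch of that pattern (the field and method names here are illustrative, not the actual Hive members):

```java
public class LazyInit {
    private static Object connPool;
    private static Object connPoolMutex;
    private static int initCalls = 0;

    static Object create() { initCalls++; return new Object(); }

    // Each resource gets its own null guard: a failure while initializing a
    // later resource does not force re-creating the earlier ones on retry.
    public static void setConf() {
        synchronized (LazyInit.class) {
            if (connPool == null) {
                connPool = create();
            }
            if (connPoolMutex == null) {
                connPoolMutex = create();
            }
        }
    }

    public static int initCount() { return initCalls; }

    public static void main(String[] args) {
        setConf();
        setConf(); // second call is a no-op: both guards see non-null fields
        System.out.println(initCount()); // prints 2
    }
}
```

Because every guard runs inside the same `synchronized (LazyInit.class)` block, concurrent callers still observe at most one initialization per field, matching the original single-block semantics.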






Issue Time Tracking
---

Worklog Id: (was: 651641)
Time Spent: 1.5h  (was: 1h 20m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651639
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:42
Start Date: 16/Sep/21 12:42
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710079309



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
 
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 3;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
          connection from each pool and we want to avoid taking a connection from primary pool
          and then blocking because mutex pool is empty.  There is only 1 thread in any HMS trying
          to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock.  The CheckLock operation gets a
          connection from connPool first, then connPoolMutex.  All others, go in the opposite
          order (not very elegant...).  So number of connection requests for connPoolMutex cannot
          exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
          determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
        } catch (SQLException e) {
          String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
   Should we update the exception text here, as you are handling pooling exceptions inside setupJdbcConnectionPool?






Issue Time Tracking
---

Worklog Id: (was: 651639)
Time Spent: 1h 20m  (was: 1h 10m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651638
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:40
Start Date: 16/Sep/21 12:40
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r71004



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {
     DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
     if (dsp != null) {
-      return dsp.create(conf);
+      try {
+        return dsp.create(conf);
+      } catch (SQLException e) {
+        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
+        LOG.error(msg);
+        throw new RuntimeException(e);

Review comment:
   Could we please add msg to the exception, i.e. throw new RuntimeException(msg, e)
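The reviewer's point is that `new RuntimeException(e)` alone sets the wrapper's message to `e.toString()`, whereas `new RuntimeException(msg, e)` preserves both the human-readable text and the original cause. A small stand-alone illustration (the `SQLException` is constructed directly here, not raised by a real connection pool):

```java
import java.sql.SQLException;

public class WrapWithMessage {
    // Wraps as suggested in the review: message and cause are both preserved.
    static RuntimeException wrap(SQLException e) {
        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
        return new RuntimeException(msg, e);
    }

    public static void main(String[] args) {
        SQLException cause = new SQLException("connection refused");
        RuntimeException wrapped = wrap(cause);
        // getMessage() keeps the custom text; getCause() keeps the SQLException
        // for callers that need the vendor error code or SQL state.
        System.out.println(wrapped.getMessage());
        System.out.println(wrapped.getCause() == cause); // prints true
    }
}
```

Callers that previously matched on the log text or unwrapped the `SQLException` keep working, since neither piece of information is lost in the wrapper.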






Issue Time Tracking
---

Worklog Id: (was: 651638)
Time Spent: 1h 10m  (was: 1h)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651634
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:37
Start Date: 16/Sep/21 12:37
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710052878



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {
     DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
     if (dsp != null) {
-      return dsp.create(conf);
+      try {
+        return dsp.create(conf);
+      } catch (SQLException e) {
+        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
+        LOG.error(msg);
+        throw new RuntimeException(e);

Review comment:
   Why is this change required? There is handling of checked SQLException in setConf.






Issue Time Tracking
---

Worklog Id: (was: 651634)
Time Spent: 1h  (was: 50m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651621
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:05
Start Date: 16/Sep/21 12:05
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710052878



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {
     DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
     if (dsp != null) {
-      return dsp.create(conf);
+      try {
+        return dsp.create(conf);
+      } catch (SQLException e) {
+        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
+        LOG.error(msg);
+        throw new RuntimeException(e);

Review comment:
   Why is this change required? There is handling of checked SQLException in setConf.






Issue Time Tracking
---

Worklog Id: (was: 651621)
Time Spent: 50m  (was: 40m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651619&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651619
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 12:01
Start Date: 16/Sep/21 12:01
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710050068



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List<Long> txnids)
     }
   }
 
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {

Review comment:
   I don't think it needs to be static and synchronized. The only place where it's used is the setConf method, under the synchronized block.






Issue Time Tracking
---

Worklog Id: (was: 651619)
Time Spent: 40m  (was: 0.5h)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer: 
> Error occurred during processing of message.
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>  ~[hive-exec-3.1.2.jar:3.1.2]
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) 
> 

[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651617&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651617
 ]

ASF GitHub Bot logged work on HIVE-25522:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 11:50
Start Date: 16/Sep/21 11:50
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710042962



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
 
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);

Review comment:
   minor: maxPoolSize & connectionTimeoutMs could be moved outside of the synchronized block
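The suggestion above is about keeping the critical section short: reading immutable configuration values does not require the lock, so those reads can be hoisted out of the synchronized block. A reduced sketch of the idea, with a plain `Map` standing in for the Hive `Configuration` object (all names here are illustrative):

```java
import java.util.Map;

public class HoistOutsideLock {
    private static Object pool;

    // Config reads are side-effect free, so they run before taking the lock;
    // only the actual one-time initialization is guarded.
    public static void init(Map<String, Integer> conf) {
        int maxPoolSize = conf.getOrDefault("maxPoolSize", 10); // read outside the lock
        long timeoutMs = 30000;
        synchronized (HoistOutsideLock.class) {
            if (pool == null) {
                pool = "pool(size=" + maxPoolSize + ",timeout=" + timeoutMs + ")";
            }
        }
    }

    public static Object pool() { return pool; }

    public static void main(String[] args) {
        init(Map.of("maxPoolSize", 5));
        System.out.println(pool()); // prints pool(size=5,timeout=30000)
    }
}
```

Whether this matters in practice depends on how expensive the config lookup is; here it mainly reduces the time other threads spend blocked on the class lock during startup.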






Issue Time Tracking
---

Worklog Id: (was: 651617)
Time Spent: 0.5h  (was: 20m)

> NullPointerException in TxnHandler
> --
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore.  Iceberg 
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] 
> metastore.RetryingHMSHandler: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>   at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy27.lock(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>   at java.base/java.security.AccessController.doPrivileged(Native Method)
>   at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at 
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer: 
> Error occurred during processing of message.
> java.lang.NullPointerException: null
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
>  ~[hive-exec-3.1.2.jar:3.1.2]
>   at 
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) 
> ~[hive-exec-3.1.2.jar:3.1.2]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
>  ~[hive-exec-3.1.2.jar:3.1.2]
>   at 

[jira] [Assigned] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files

2021-09-16 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod reassigned HIVE-25529:
-


> Add tests for reading/writing Iceberg V2 tables with delete files
> -
>
> Key: HIVE-25529
> URL: https://issues.apache.org/jira/browse/HIVE-25529
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 
> tables can be created/read/written by Hive. While Hive has no delete 
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to 
> create delete files manually and then check if Hive honors those deletes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25503?focusedWorklogId=651590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651590
 ]

ASF GitHub Bot logged work on HIVE-25503:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 10:25
Start Date: 16/Sep/21 10:25
Worklog Time Spent: 10m 
  Work Description: deniskuzZ edited a comment on pull request #2612:
URL: https://github.com/apache/hive/pull/2612#issuecomment-920777855


   > Can we add test cases for this method?
   > Can we at least manually run these tests with the supported databases - 
the queries look scary 
   
   - added unit test 
   - performed manual test on all supported dbs using ITestDbTxnManager


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651590)
Time Spent: 40m  (was: 0.5h)

> Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
> --
>
> Key: HIVE-25503
> URL: https://issues.apache.org/jira/browse/HIVE-25503
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Performance improvement: accumulated entries in COMPLETED_TXN_COMPONENTS can 
> lead to query performance degradation.
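
The cleanup idea can be modeled in memory. This is an illustrative sketch only; the real patch does this in SQL against the COMPLETED_TXN_COMPONENTS table, and the field names below are assumptions, not the actual schema. For each (db, table, partition) key, keep only the entry with the highest write id and drop the older duplicates.

```java
import java.util.*;

// In-memory model of the duplicate cleanup: keep the highest-writeId entry
// per (db, table, partition) key, preserving first-seen key order.
public class TxnComponentCleanup {
    record Entry(String db, String table, String partition, long writeId) {}

    static List<Entry> dedupe(List<Entry> rows) {
        Map<String, Entry> latest = new LinkedHashMap<>();
        for (Entry e : rows) {
            String key = e.db() + "." + e.table() + "/" + e.partition();
            // merge keeps whichever entry has the larger writeId
            latest.merge(key, e, (a, b) -> a.writeId() >= b.writeId() ? a : b);
        }
        return new ArrayList<>(latest.values());
    }
}
```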



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25503?focusedWorklogId=651589&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651589
 ]

ASF GitHub Bot logged work on HIVE-25503:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 10:24
Start Date: 16/Sep/21 10:24
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on pull request #2612:
URL: https://github.com/apache/hive/pull/2612#issuecomment-920777855


   > Can we add test cases for this method?
   > Can we at least manually run these tests with the supported databases - 
the queries look scary 
   
   added unit test 
   performed manual test on all supported dbs using ITestDbTxnManager


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651589)
Time Spent: 0.5h  (was: 20m)

> Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
> --
>
> Key: HIVE-25503
> URL: https://issues.apache.org/jira/browse/HIVE-25503
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Performance improvement: accumulated entries in COMPLETED_TXN_COMPONENTS can 
> lead to query performance degradation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down

2021-09-16 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25527:
---
Summary: LLAP Scheduler task exits with fatal error if the executor node is 
down  (was: LLAP Scheduler task exits with fatal error if the executor node is 
down.)

> LLAP Scheduler task exits with fatal error if the executor node is down
> ---
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.

2021-09-16 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416034#comment-17416034
 ] 

Stamatis Zampetakis commented on HIVE-25527:


Hey [~maheshk114], can you please also include the error in the summary of this 
ticket?

> LLAP Scheduler task exits with fatal error if the executor node is down.
> 
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=651556&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651556
 ]

ASF GitHub Bot logged work on HIVE-25303:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 09:14
Start Date: 16/Sep/21 09:14
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2442:
URL: https://github.com/apache/hive/pull/2442#discussion_r709936548



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
##
@@ -472,6 +474,28 @@ private void setLoadFileLocation(
   loc = cmv.getLocation();
 }
 Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new 
Path(loc);
+if (pCtx.getQueryProperties().isCTAS()) {
+  boolean isExternal = pCtx.getCreateTable().isExternal();
+  boolean isAcid = pCtx.getCreateTable().getTblProps().getOrDefault(
+  hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false").equalsIgnoreCase("true") ||
+  
pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
+  if ((HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || isExternal) && !isAcid) {

Review comment:
   1. doesn't matter; if that becomes a performance bottleneck later we will 
handle it - but as it is now, it only adds complexity and reduces 
readability... this stuff works incorrectly because it's too complicated 
already; it would be better to stop adding twists...
   2. add a parameter to the rpc call and pass the value of 
`hive.create.as.external.legacy` over to the transformer so it will be able to 
handle that as well.
   3. can't we keep a full table object in the `ctd` instead of just some parts 
of it? could that help overcome that issue?
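
The acid-detection condition quoted in the diff can be isolated as a small predicate. This is a self-contained sketch: the constant values mirror the `hive_metastoreConstants` names used in the diff but are inlined here, so treat them as assumptions about the actual constants.

```java
import java.util.Map;

// Isolated form of the isAcid check from the diff: a table is treated as ACID
// when 'transactional' is set to true, or when 'transactional_properties' is
// present at all. Constants inlined for a self-contained example.
public class AcidCheck {
    static final String TABLE_IS_TRANSACTIONAL = "transactional";
    static final String TABLE_TRANSACTIONAL_PROPERTIES = "transactional_properties";

    static boolean isAcid(Map<String, String> tblProps) {
        return tblProps.getOrDefault(TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true")
            || tblProps.containsKey(TABLE_TRANSACTIONAL_PROPERTIES);
    }
}
```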




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651556)
Time Spent: 3h 10m  (was: 3h)

> CTAS hive.create.as.external.legacy tries to place data files in managed WH 
> path
> 
>
> Key: HIVE-25303
> URL: https://issues.apache.org/jira/browse/HIVE-25303
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Under legacy table creation mode (hive.create.as.external.legacy=true), when 
> a database has been created in a specific LOCATION, in a session where that 
> database is used, tables created using the following command:
> {code:java}
> CREATE TABLE  AS SELECT {code}
> should inherit the HDFS path from the database's location. Instead, Hive is 
> trying to write the table data into 
> /warehouse/tablespace/managed/hive//
> +Design+: 
> In the CTAS query, the data is first written to the target directory (which 
> happens in HS2) and then the table is created (this happens in HMS). So two 
> decisions are being made here: i) the target directory location, and ii) how 
> the table should be created (table type, sd, etc.).
> When HS2 needs the target location to be set, it will make a create-table 
> dry-run call to HMS (where table translation happens); decisions i) and ii) 
> are made within HMS, which returns the table object. HS2 will then use 
> the location set by HMS for placing the data.
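
The flow described above can be modeled as a minimal stub. All names here are hypothetical stand-ins, not the actual HS2/HMS interfaces: HS2 asks HMS to "translate" the table it is about to create, HMS decides the table type and target location, and HS2 then writes the CTAS data to that location.

```java
// Minimal model of the CTAS location flow: the location decision is made on
// the HMS side (translateLocation), and HS2 only consumes the result.
// Paths and method names are illustrative assumptions.
public class CtasLocationFlow {
    // Stand-in for the HMS-side table translation.
    static String translateLocation(String dbLocation, String tableName, boolean external) {
        return external
            ? dbLocation + "/" + tableName                        // inherit the db LOCATION
            : "/warehouse/tablespace/managed/hive/" + tableName;  // managed warehouse default
    }

    static String planCtas(String dbLocation, String tableName, boolean legacyExternal) {
        // HS2 obtains the target directory from HMS before writing any data.
        return translateLocation(dbLocation, tableName, legacyExternal);
    }
}
```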



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=651554&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651554
 ]

ASF GitHub Bot logged work on HIVE-25303:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 09:09
Start Date: 16/Sep/21 09:09
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2442:
URL: https://github.com/apache/hive/pull/2442#discussion_r709932488



##
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java
##
@@ -472,6 +474,32 @@ private void setLoadFileLocation(
   loc = cmv.getLocation();
 }
 Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new 
Path(loc);
+boolean isExternal = false;
+boolean isAcid = false;
+if (pCtx.getQueryProperties().isCTAS()) {
+  isExternal = pCtx.getCreateTable().isExternal();
+  isAcid = pCtx.getCreateTable().getTblProps().getOrDefault(
+  hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, 
"false").equalsIgnoreCase("true") ||
+  
pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES);
+  if ((HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) {

Review comment:
   that seems to me a premature optimization which may just bite back 
later... it would be simpler to run everything related to location through the 
translator, and even move the handling of `CREATE_TABLE_AS_EXTERNAL` there - 
so that everything is on the same page.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651554)
Time Spent: 3h  (was: 2h 50m)

> CTAS hive.create.as.external.legacy tries to place data files in managed WH 
> path
> 
>
> Key: HIVE-25303
> URL: https://issues.apache.org/jira/browse/HIVE-25303
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Standalone Metastore
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Under legacy table creation mode (hive.create.as.external.legacy=true), when 
> a database has been created in a specific LOCATION, in a session where that 
> database is used, tables created using the following command:
> {code:java}
> CREATE TABLE  AS SELECT {code}
> should inherit the HDFS path from the database's location. Instead, Hive is 
> trying to write the table data into 
> /warehouse/tablespace/managed/hive//
> +Design+: 
> In the CTAS query, the data is first written to the target directory (which 
> happens in HS2) and then the table is created (this happens in HMS). So two 
> decisions are being made here: i) the target directory location, and ii) how 
> the table should be created (table type, sd, etc.).
> When HS2 needs the target location to be set, it will make a create-table 
> dry-run call to HMS (where table translation happens); decisions i) and ii) 
> are made within HMS, which returns the table object. HS2 will then use 
> the location set by HMS for placing the data.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651541
 ]

ASF GitHub Bot logged work on HIVE-25317:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 08:42
Start Date: 16/Sep/21 08:42
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2459:
URL: https://github.com/apache/hive/pull/2459#discussion_r709911738



##
File path: llap-server/pom.xml
##
@@ -38,6 +38,7 @@
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-exec</artifactId>
   <version>${project.version}</version>
+  <classifier>core</classifier>

Review comment:
   note: on branch-2, guava is most likely not properly shaded away; see HIVE-22126




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651541)
Time Spent: 3h 20m  (was: 3h 10m)

> Relocate dependencies in shaded hive-exec module
> 
>
> Key: HIVE-25317
> URL: https://issues.apache.org/jira/browse/HIVE-25317
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.8
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> When we want to use shaded version of hive-exec (i.e., w/o classifier), more 
> dependencies conflict with Spark. We need to relocate these dependencies too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651539&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651539
 ]

ASF GitHub Bot logged work on HIVE-25317:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 08:38
Start Date: 16/Sep/21 08:38
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2459:
URL: https://github.com/apache/hive/pull/2459#discussion_r709908764



##
File path: llap-server/pom.xml
##
@@ -38,6 +38,7 @@
   <groupId>org.apache.hive</groupId>
   <artifactId>hive-exec</artifactId>
   <version>${project.version}</version>
+  <classifier>core</classifier>

Review comment:
   we should fix the issues with using the normal `hive-exec` artifact, if 
there are any - loading the core jar could cause trouble...




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651539)
Time Spent: 3h 10m  (was: 3h)

> Relocate dependencies in shaded hive-exec module
> 
>
> Key: HIVE-25317
> URL: https://issues.apache.org/jira/browse/HIVE-25317
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.3.8
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> When we want to use shaded version of hive-exec (i.e., w/o classifier), more 
> dependencies conflict with Spark. We need to relocate these dependencies too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651519&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651519
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 08:03
Start Date: 16/Sep/21 08:03
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r709881332



##
File path: 
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1820,6 +1830,8 @@ private static boolean 
removeFromRunningTaskMap(TreeMap

> LLAP Scheduler task exits with fatal error if the executor node is down.
> 
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651517&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651517
 ]

ASF GitHub Bot logged work on HIVE-25527:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 08:01
Start Date: 16/Sep/21 08:01
Worklog Time Spent: 10m 
  Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r709880097



##
File path: 
llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##
@@ -1430,6 +1430,13 @@ private SelectHostResult selectHost(TaskInfo request, 
Map
 boolean requestedHostsWillBecomeAvailable = false;
 for (String host : requestedHosts) {
   prefHostCount++;
+
+  // Check if the host is removed from the registry after 
availableHostMap is created.
+  Set activeInstancesByHost = 
activeInstances.getByHost(host);
+  if (activeInstancesByHost == null || 
activeInstancesByHost.isEmpty()) {
+continue;
+  }
+
   // Pick the first host always. Weak attempt at cache affinity.
   if (availableHostMap.containsKey(host)) {

Review comment:
   I would avoid the continue statement above and modify the condition to:
   if (availableHostMap.containsKey(host) && activeInstancesByHost != null && 
!activeInstancesByHost.isEmpty())
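
The reviewer's suggested shape can be shown as a standalone predicate. The maps below are simplified stand-ins for availableHostMap and activeInstances.getByHost(), not the scheduler's real types: the liveness check is folded into the existing availability condition instead of an early continue.

```java
import java.util.Map;
import java.util.Set;

// Sketch of the suggested condition: a host counts only if it is still in the
// availability map AND still has registered active instances. The maps stand
// in for availableHostMap and activeInstances.getByHost().
public class HostSelection {
    static boolean hostUsable(Map<String, Integer> availableHostMap,
                              Map<String, Set<String>> activeInstancesByHost,
                              String host) {
        Set<String> instances = activeInstancesByHost.get(host);
        // Combined check: available, registered, and non-empty.
        return availableHostMap.containsKey(host) && instances != null && !instances.isEmpty();
    }
}
```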




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651517)
Time Spent: 40m  (was: 0.5h)

> LLAP Scheduler task exits with fatal error if the executor node is down.
> 
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with 
> null. So we need to check for empty/null values before accessing it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24263) Create an HMS endpoint to list partition locations

2021-09-16 Thread Szehon Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho resolved HIVE-24263.
--
Resolution: Duplicate

> Create an HMS endpoint to list partition locations
> --
>
> Key: HIVE-24263
> URL: https://issues.apache.org/jira/browse/HIVE-24263
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24263.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In our company, we have a use-case to quickly get a list of partition 
> locations.  Currently it is done via listPartitions, which is a very heavy 
> operation in terms of memory and performance.
> This JIRA proposes an API: Map<String, String> listPartitionLocations(String 
> db, String table, short max) that returns a map of partition names to 
> locations.
> For example, we have an integration from the output of a Hive pipeline to Spark 
> jobs that consume directly from HDFS.  The Spark job scheduler needs to know 
> the partition paths that are available for consumption (the partition name is 
> not sufficient, as its input is an HDFS path), so we have to do a heavy 
> listPartitions() for this.
> Another use-case is an HDFS data removal tool that does a nightly crawl to 
> see if there are associated Hive partitions mapped to a given partition path. 
> The nightly crawling job could be much less resource-intensive if we had a 
> listPartitionLocations().
> As there is already an internal method in the ObjectStore that does this for 
> dropPartitions, it is only a matter of exposing this API to 
> HiveMetaStoreClient.
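
The proposed API's contract can be sketched against an in-memory stand-in. The signature follows the proposal in the description; the backing map is a stub for the metastore's ObjectStore, and the key layout is an assumption for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory sketch of the proposed contract: return at most 'max'
// partition-name -> location entries for a table, without materializing full
// Partition objects. The store map is a stand-in for the ObjectStore.
public class PartitionLocations {
    static Map<String, String> listPartitionLocations(
            Map<String, String> store, String db, String table, short max) {
        Map<String, String> result = new LinkedHashMap<>();
        String prefix = db + "." + table + "/";
        for (Map.Entry<String, String> e : store.entrySet()) {
            if (max >= 0 && result.size() >= max) break;  // honor the cap
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }
}
```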



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-24263) Create an HMS endpoint to list partition locations

2021-09-16 Thread Szehon Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24263 started by Szehon Ho.

> Create an HMS endpoint to list partition locations
> --
>
> Key: HIVE-24263
> URL: https://issues.apache.org/jira/browse/HIVE-24263
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24263.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In our company, we have a use-case to quickly get a list of partition 
> locations.  Currently it is done via listPartitions, which is a very heavy 
> operation in terms of memory and performance.
> This JIRA proposes an API: Map<String, String> listPartitionLocations(String 
> db, String table, short max) that returns a map of partition names to 
> locations.
> For example, we have an integration from the output of a Hive pipeline to Spark 
> jobs that consume directly from HDFS.  The Spark job scheduler needs to know 
> the partition paths that are available for consumption (the partition name is 
> not sufficient, as its input is an HDFS path), so we have to do a heavy 
> listPartitions() for this.
> Another use-case is an HDFS data removal tool that does a nightly crawl to 
> see if there are associated Hive partitions mapped to a given partition path. 
> The nightly crawling job could be much less resource-intensive if we had a 
> listPartitionLocations().
> As there is already an internal method in the ObjectStore that does this for 
> dropPartitions, it is only a matter of exposing this API to 
> HiveMetaStoreClient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24263) Create an HMS endpoint to list partition locations

2021-09-16 Thread Szehon Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-24263:
-
Status: Open  (was: Patch Available)

> Create an HMS endpoint to list partition locations
> --
>
> Key: HIVE-24263
> URL: https://issues.apache.org/jira/browse/HIVE-24263
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24263.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In our company, we have a use-case to quickly get a list of partition 
> locations.  Currently it is done via listPartitions, which is a very heavy 
> operation in terms of memory and performance.
> This JIRA proposes an API: Map<String, String> listPartitionLocations(String 
> db, String table, short max) that returns a map of partition names to 
> locations.
> For example, we have an integration from the output of a Hive pipeline to Spark 
> jobs that consume directly from HDFS.  The Spark job scheduler needs to know 
> the partition paths that are available for consumption (the partition name is 
> not sufficient, as its input is an HDFS path), so we have to do a heavy 
> listPartitions() for this.
> Another use-case is an HDFS data removal tool that does a nightly crawl to 
> see if there are associated Hive partitions mapped to a given partition path. 
> The nightly crawling job could be much less resource-intensive if we had a 
> listPartitionLocations().
> As there is already an internal method in the ObjectStore that does this for 
> dropPartitions, it is only a matter of exposing this API to 
> HiveMetaStoreClient.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2021-09-16 Thread Szehon Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-20789:
-
Status: Open  (was: Patch Available)

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.2.patch, HIVE-20789.patch
>
>
> We have had a scenario where health checks sending 0 bytes to HiveServer2 
> sockets would DDoS the HiveServer2: if for some reason they hang or otherwise 
> don't send TCP FIN, all HiveServer2 Thrift thread-pool threads will 
> block reading the socket.
> This is the stack (we are running an older version of Hive here)
> {noformat}
> "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> at java.net.SocketInputStream.read(SocketInputStream.java:171)
> at java.net.SocketInputStream.read(SocketInputStream.java:141)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> - locked <23781b74> (a java.io.BufferedInputStream)
> at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346)
> at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423)
> at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405)
> at org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
> at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
> at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27)
> at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){noformat}
> Eventually HiveServer2 has no more free threads left.
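The report above boils down to an unbounded blocking read: without a read timeout, every handler thread stuck in SocketInputStream.socketRead0 is lost for good. The remedy the issue title asks for can be sketched with plain java.net sockets: setting SO_TIMEOUT bounds the read, so an idle client frees the thread instead of pinning it. This is a minimal illustrative demo (the class name ReadTimeoutDemo is hypothetical; it is not Hive or Thrift code):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /** Returns true if a read on an idle connection hits the timeout. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0);
             // "Health check" client: connects but never sends a byte or a FIN.
             Socket client = new Socket("localhost", server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis); // bound the blocking read
            try {
                accepted.getInputStream().read(); // would block forever without the timeout
                return false;
            } catch (SocketTimeoutException expected) {
                return true; // handler thread is freed instead of stuck
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("idle read timed out: " + readTimesOut(200));
    }
}
```

In a thread-pool server the same idea applies per accepted connection: a timed-out read lets the worker close the socket and return to the pool, so stalled clients can no longer exhaust it.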



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets

2021-09-16 Thread Szehon Ho (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho resolved HIVE-20789.
--
Resolution: Won't Fix

> HiveServer2 should have Timeouts against clients that never close sockets
> -
>
> Key: HIVE-20789
> URL: https://issues.apache.org/jira/browse/HIVE-20789
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
>Priority: Major
> Attachments: HIVE-20789.2.patch, HIVE-20789.patch
>
>





[jira] [Commented] (HIVE-25526) Run create_table Q test from TestCliDriver

2021-09-16 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17415936#comment-17415936
 ] 

László Pintér commented on HIVE-25526:
--

Submitted to master. Thanks for the patch, [~lvegh]!

> Run create_table Q test from TestCliDriver
> --
>
> Key: HIVE-25526
> URL: https://issues.apache.org/jira/browse/HIVE-25526
> Project: Hive
>  Issue Type: Test
>  Components: Hive
>Reporter: Laszlo Vegh
>Assignee: Laszlo Vegh
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> create_table QTest should be picked up by the TestCliDriver.





[jira] [Resolved] (HIVE-25526) Run create_table Q test from TestCliDriver

2021-09-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér resolved HIVE-25526.
--
Resolution: Fixed






[jira] [Work logged] (HIVE-25526) Run create_table Q test from TestCliDriver

2021-09-16 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25526?focusedWorklogId=651501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651501
 ]

ASF GitHub Bot logged work on HIVE-25526:
-

Author: ASF GitHub Bot
Created on: 16/Sep/21 07:28
Start Date: 16/Sep/21 07:28
Worklog Time Spent: 10m 
  Work Description: lcspinter merged pull request #2643:
URL: https://github.com/apache/hive/pull/2643


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 651501)
Time Spent: 0.5h  (was: 20m)



