[jira] [Comment Edited] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo
[ https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416465#comment-17416465 ]

Csomor Viktor edited comment on HIVE-25518 at 9/17/21, 5:54 AM:
This was caused by a wrong IntelliJ jar assembly; it cannot be reproduced using the single binary.

was (Author: vcsomor): This caused a wrong Intellij jar assembly

> CompactionTxHandler NPE if no CompactionInfo
>
> Key: HIVE-25518
> URL: https://issues.apache.org/jira/browse/HIVE-25518
> Project: Hive
> Issue Type: Bug
> Components: Standalone Metastore
> Reporter: Csomor Viktor
> Assignee: Csomor Viktor
> Priority: Major
> Fix For: 4.0.0
> Attachments: HIVE_25518_CompactionTxnHandle_NPE.txt
>
> If no {{CompactionInfo}} is provided to {{CompactionTxnHandler#markFailed()}}, an NPE occurs at the beginning of the method, and no information is recorded in the COMPLETED_COMPACTIONS table.
> Stacktrace:
> {noformat}
> [TThreadPoolServer WorkerProcess-%d] ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler - java.lang.NullPointerException
>     at org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler.markFailed(CompactionTxnHandler.java:1116)
>     at org.apache.hadoop.hive.metastore.HMSHandler.mark_failed(HMSHandler.java:8716)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>     at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>     at com.sun.proxy.$Proxy13.mark_failed(Unknown Source)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$mark_failed.getResult(ThriftHiveMetastore.java:23846)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$mark_failed.getResult(ThriftHiveMetastore.java:23825)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>     at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
>     at org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>     at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
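The NPE described above can be avoided with a defensive null check at the top of markFailed(). The sketch below is a hypothetical illustration, not the actual Hive patch: the CompactionInfo class and the method body are simplified stand-ins for org.apache.hadoop.hive.metastore.txn.CompactionTxnHandler, used only to show the guard rejecting a null argument with a descriptive exception instead of letting an NPE surface from deep inside the method.

```java
// Hypothetical sketch of a guard for the NPE in HIVE-25518.
// CompactionInfo here is a simplified stand-in for
// org.apache.hadoop.hive.metastore.txn.CompactionInfo.
public class MarkFailedGuard {

    static class CompactionInfo {
        String dbname;
        String tableName;
        long id;
    }

    // Fail fast with a clear message instead of an NPE at the first
    // dereference of ci inside the method body.
    static void markFailed(CompactionInfo ci) {
        if (ci == null) {
            throw new IllegalArgumentException(
                "markFailed called with null CompactionInfo");
        }
        // ... the real method would record the failure in the
        // COMPLETED_COMPACTIONS table here ...
    }

    // Helper demonstrating that the guard fires for null input.
    static boolean rejectsNull() {
        try {
            markFailed(null);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("null rejected: " + rejectsNull());
    }
}
```

Whether to throw, log and return, or synthesize a placeholder entry is a design choice; the point is that the null case is handled explicitly before any field of the CompactionInfo is dereferenced.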
[jira] [Comment Edited] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo
[ https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416465#comment-17416465 ]

Csomor Viktor edited comment on HIVE-25518 at 9/17/21, 5:54 AM:
This was caused by a wrong IntelliJ jar assembly; it cannot be reproduced using the binary.

was (Author: vcsomor): This caused a wrong Intellij jar assembly. can't reproduce by using the single binary

> CompactionTxHandler NPE if no CompactionInfo

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo
[ https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csomor Viktor resolved HIVE-25518.
Resolution: Won't Fix

This was caused by a wrong IntelliJ jar assembly.

> CompactionTxHandler NPE if no CompactionInfo

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work stopped] (HIVE-25518) CompactionTxHandler NPE if no CompactionInfo
[ https://issues.apache.org/jira/browse/HIVE-25518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-25518 stopped by Csomor Viktor.

> CompactionTxHandler NPE if no CompactionInfo

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.
[ https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haymant Mangla updated HIVE-25534:
Description: (stacktrace now wrapped in a {code:java} block; text otherwise unchanged)

> Don't preserve FileAttribute.XATTR to initialise distcp.
>
> Key: HIVE-25534
> URL: https://issues.apache.org/jira/browse/HIVE-25534
> Project: Hive
> Issue Type: Bug
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
>
> Remove the preserve-XATTR option while calling distcp; target file systems such as s3a do not support XAttrs and distcp fails:
> {code:java}
> 2021-08-23 10:06:18,485 ERROR org.apache.hadoop.tools.DistCp: [HiveServer2-Background-Pool: Thread-73]: XAttrs not supported on at least one file system:
> org.apache.hadoop.tools.CopyListing$XAttrsNotSupportedException: XAttrs not supported for file system: s3a://hmangla1-dev
>     at org.apache.hadoop.tools.util.DistCpUtils.checkFileSystemXAttrSupport(DistCpUtils.java:513) ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>     at org.apache.hadoop.tools.DistCp.configureOutputFormat(DistCp.java:337) ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>     at org.apache.hadoop.tools.DistCp.createJob(DistCp.java:304) ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>     at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:214) ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
>     at org.apache.hadoop.tools.DistCp.execute(DistCp.java:193) ~[hadoop-distcp-3.1.1.7.1.6.0-297.jar:?]
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
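The proposed change above can be illustrated with a small sketch. This is hypothetical and not Hive's actual code: the enum below is a simplified stand-in for Hadoop's DistCpOptions.FileAttribute. The idea is simply that the set of attributes the copy asks distcp to preserve omits XATTR, so distcp never probes the target file system for xattr support.

```java
import java.util.EnumSet;

// Hypothetical sketch for HIVE-25534: build the set of file attributes to
// preserve WITHOUT XATTR, so distcp does not fail on file systems (such as
// s3a) that do not support extended attributes. FileAttribute is a
// simplified stand-in for org.apache.hadoop.tools.DistCpOptions.FileAttribute.
public class DistCpPreserveSketch {

    enum FileAttribute { REPLICATION, BLOCKSIZE, USER, GROUP, PERMISSION, XATTR }

    // Attributes the copy would ask distcp to preserve; XATTR is
    // deliberately left out.
    static EnumSet<FileAttribute> preservedAttributes() {
        return EnumSet.of(FileAttribute.USER,
                          FileAttribute.GROUP,
                          FileAttribute.PERMISSION);
    }

    public static void main(String[] args) {
        System.out.println("preserving: " + preservedAttributes());
    }
}
```

The exact set of remaining attributes to preserve is an assumption here; the fix in the ticket is only about not requesting XATTR preservation.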
[jira] [Assigned] (HIVE-25534) Don't preserve FileAttribute.XATTR to initialise distcp.
[ https://issues.apache.org/jira/browse/HIVE-25534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haymant Mangla reassigned HIVE-25534.

> Don't preserve FileAttribute.XATTR to initialise distcp.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25533) With CBO enabled, Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Summary: With CBO enabled, Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts (was: Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts)

> With CBO enabled, Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
>
> Key: HIVE-25533
> URL: https://issues.apache.org/jira/browse/HIVE-25533
> Project: Hive
> Issue Type: Bug
> Components: Database/Schema
> Affects Versions: 3.1.0
> Environment: Azure HDInsight 4.1.7.5, Hive 3.1.0
> Reporter: Needn Yu
> Priority: Critical
> Attachments: hive.png
>
> When querying from a view or CTE that applies UNION ALL to 2 tables, as in the following script:
> {code:java}
> CREATE TABLE n1 (c1 STRING);
> INSERT OVERWRITE TABLE n1 VALUES('needn');
> CREATE VIEW v1
> AS
> SELECT 'maggie' AS c1 FROM n1
> UNION ALL
> SELECT c1 FROM n1;
> {code}
> the query returns an incorrect result when using "=" or "IN" with a single element. For example, the following 2 queries return nothing:
> {code:java}
> SELECT * FROM v1 WHERE c1 = 'maggie';
> SELECT * FROM v1 WHERE c1 IN ('maggie');
> {code}
> However, the correct result comes back when using "LIKE" or "IN" with multiple elements. For example, the following 2 queries return the expected rows:
> {code:java}
> SELECT * FROM v1 WHERE c1 IN ('maggie','This is a bug');
> SELECT * FROM v1 WHERE c1 LIKE 'maggie%';
> {code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Description: (fixed the view DDL in the repro script: "CREATE VIEW v1 AS" and "SELECT 'maggie' AS c1 FROM n1"; text otherwise unchanged)

> Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=652088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-652088 ]

ASF GitHub Bot logged work on HIVE-25527:
Author: ASF GitHub Bot
Created on: 17/Sep/21 03:49
Start Date: 17/Sep/21 03:49
Worklog Time Spent: 10m

Work Description: maheshk114 merged pull request #2645:
URL: https://github.com/apache/hive/pull/2645

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking: Worklog Id: (was: 652088) Time Spent: 1h 50m (was: 1h 40m)

> LLAP Scheduler task exits with fatal error if the executor node is down
>
> Key: HIVE-25527
> URL: https://issues.apache.org/jira/browse/HIVE-25527
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with null, so we need to check for empty/null values before accessing it.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
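The one-line summary in the issue above (check activeInstances for empty/null before accessing it) can be sketched as follows. This is a hypothetical illustration with simplified types, not the actual LLAP scheduler code: a lookup for a host that has gone down may return null, and the caller must treat that the same as "no executor available" rather than dereferencing it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch for HIVE-25527: guard the activeInstances lookup so a
// downed executor host yields "no instance" instead of an NPE that kills the
// scheduler task. The map and instance type are simplified stand-ins.
public class ActiveInstanceGuard {

    static final Map<String, List<String>> activeInstances = new HashMap<>();

    // Returns an executor instance registered for the host, or empty if the
    // host is unknown or has gone down (null/empty entry).
    static Optional<String> pickInstance(String host) {
        List<String> instances = activeInstances.get(host);
        if (instances == null || instances.isEmpty()) {
            return Optional.empty();   // caller can reschedule elsewhere
        }
        return Optional.of(instances.get(0));
    }

    public static void main(String[] args) {
        activeInstances.put("node1", Collections.singletonList("llap-exec-0"));
        System.out.println(pickInstance("node1").orElse("none"));
        System.out.println(pickInstance("downed-node").orElse("none"));
    }
}
```

Returning Optional (or an empty result) pushes the "host is gone" case to the scheduling loop, which can retry on another node instead of exiting with a fatal error.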
[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Description: (fixed missing spaces in the example queries, e.g. "FROM v1WHERE" to "FROM v1 WHERE"; text otherwise unchanged)

> Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Description: (fixed the repro script: "INSERT OVERWRITE TABLE n1 VALUES('needn')" spacing, and "SELECT c1 FROM n1" instead of the mistaken self-reference "SELECT c1 FROM v1"; text otherwise unchanged)

> Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Attachment: hive.png

> Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25533) Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts
[ https://issues.apache.org/jira/browse/HIVE-25533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Needn Yu updated HIVE-25533:
Attachment: (was: 微信图片_20210917111715.png)

> Incorrect query result when using where CLAUSE to query data from 2 "UNION ALL" parts

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25532) Fix authorization support for Kill Query Command
[ https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25532: -- Labels: pull-request-available (was: ) > Fix authorization support for Kill Query Command > > > Key: HIVE-25532 > URL: https://issues.apache.org/jira/browse/HIVE-25532 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Abhay >Assignee: Abhay >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We added authorization for Kill Query command some time back with the help of > Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851 > However, we have observed that this hasn't been working as expected. The > Ranger service expects Hive to send in a privilege object of the type > SERVICE_NAME but we can see below > > [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131] > that it is sending an empty array list. > The Ranger service never throws an exception to this and this results in any > user being able to kill any query even though they don't have necessary > permissions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
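The failure mode described above — an empty privilege-object list handed to the authorizer, which therefore has nothing to deny — can be sketched in a small self-contained way. This is NOT Hive's or Ranger's real API; `PrivObject`, `PrivObjectType`, and `isAllowed` are illustrative stand-ins for the classes under `org.apache.hadoop.hive.ql.security.authorization.plugin`, and the fix shape (passing a SERVICE_NAME-typed object) mirrors only the intent of the patch:

```java
import java.util.Collections;
import java.util.List;

public class KillQueryAuthSketch {
    // Hypothetical stand-ins for Hive's authorization types.
    enum PrivObjectType { SERVICE_NAME }
    record PrivObject(PrivObjectType type, String name) {}

    // Ranger-style check: it only evaluates the objects it is handed, so an
    // empty list trivially passes -- the behavior described in the report.
    static boolean isAllowed(String user, List<PrivObject> objs) {
        for (PrivObject o : objs) {
            if (o.type() == PrivObjectType.SERVICE_NAME && !"admin".equals(user)) {
                return false; // non-admin lacks the service-level privilege
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Buggy call shape (empty list, as in KillQueryImpl): always allowed.
        System.out.println(isAllowed("mallory", Collections.emptyList()));
        // Fixed call shape: include a SERVICE_NAME object so the check can fire.
        System.out.println(isAllowed("mallory",
            List.of(new PrivObject(PrivObjectType.SERVICE_NAME, "hiveservice"))));
    }
}
```

The point is only that an authorizer which iterates over supplied objects cannot deny a request described by zero objects; the caller must supply the service-level object for the ADMIN check to have anything to evaluate.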
[jira] [Work logged] (HIVE-25532) Fix authorization support for Kill Query Command
[ https://issues.apache.org/jira/browse/HIVE-25532?focusedWorklogId=651991=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651991 ] ASF GitHub Bot logged work on HIVE-25532: - Author: ASF GitHub Bot Created on: 16/Sep/21 20:46 Start Date: 16/Sep/21 20:46 Worklog Time Spent: 10m Work Description: achennagiri opened a new pull request #2649: URL: https://github.com/apache/hive/pull/2649 ### What changes were proposed in this pull request? We added authorization support for Kill Query command a while back. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851 However, we have observed that this hasn't been working as expected. The Ranger service expects Hive to send in a privilege object of the type SERVICE_NAME but we can see below https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131 that it is sending an empty array list. The Ranger service never throws an exception to this and this results in any user being able to kill any query even though they don't have necessary permissions. ### Why are the changes needed? Currently, any user can kill any other query using the query id. Basically, KILL QUERY is an ADMIN level command and a user is supposed to have the necessary permissions to execute it without which it should fail. We need this fix to address that bug. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? This patch was used to create the hive-service jar. This dev jar was replaced on a cluster running Hive and Ranger services. The hiveserver logs were used to confirm that the checkPrivileges() call returns an exception on a user without sufficient permissions(Basically, any user without SERVICE_ADMIN permission is not allowed to execute Kill query). Also, the logs are audited in the Ranger and they are as expected. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651991) Remaining Estimate: 0h Time Spent: 10m
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651976 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 19:39 Start Date: 16/Sep/21 19:39 Worklog Time Spent: 10m Work Description: szehon-ho commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710421984 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -368,33 +368,37 @@ public TxnHandler() { public void setConf(Configuration conf){ this.conf = conf; +int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); +long getConnectionTimeoutMs = 3; synchronized (TxnHandler.class) { if (connPool == null) { -Connection dbConn = null; -// Set up the JDBC connection pool -try { - int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); - long getConnectionTimeoutMs = 3; - connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); - /*the mutex pools should ideally be somewhat larger since some operations require 1 +connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); Review comment: > What I mean is if there is some fatal issue in the DB connection for instance, all the threads will try the same path and fail. It's better to just fail once instead. By the way in our case it would have recovered as db connection becomes available after awhile :) , but yea hard to generalize the case -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651976) Time Spent: 3h (was: 2h 50m) > NullPointerException in TxnHandler > -- > > Key: HIVE-25522 > URL: https://issues.apache.org/jira/browse/HIVE-25522 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg > issues a lot of lock() calls for commits. > We hit randomly a strange NPE that fails Iceberg commits. > {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy27.lock(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at >
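The review threads that follow debate lazy versus eager initialization of the shared JDBC pool. The pattern at issue — synchronized lazy init where a failed setup leaves the field null so later callers retry — can be shown in a simplified, self-contained sketch. This is not TxnHandler's actual code; `Pool` stands in for the real connection pool and `failSetup` simulates a transient database outage:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyPoolSketch {
    static final class Pool {}

    private static volatile Pool connPool;
    static final AtomicInteger attempts = new AtomicInteger();
    static boolean failSetup = false; // simulates an unreachable metastore DB

    static Pool setupJdbcConnectionPool() {
        attempts.incrementAndGet();
        if (failSetup) throw new RuntimeException("cannot reach metastore DB");
        return new Pool();
    }

    // Lazy, synchronized init: if setup throws, connPool stays null and the
    // next caller retries. A transient outage therefore self-heals (as noted
    // in the review), but while the DB is down every request thread takes the
    // same failing path; eager init at startup would instead fail exactly once.
    static Pool getPool() {
        synchronized (LazyPoolSketch.class) {
            if (connPool == null) {
                connPool = setupJdbcConnectionPool();
            }
        }
        return connPool;
    }
}
```

Once one caller succeeds, subsequent callers find `connPool` non-null and skip setup entirely, which is the "after the first succeeds they will skip" behavior mentioned in the discussion.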
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651973=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651973 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 19:33 Start Date: 16/Sep/21 19:33 Worklog Time Spent: 10m Work Description: szehon-ho commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710418471 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -368,33 +368,37 @@ public TxnHandler() { public void setConf(Configuration conf){ this.conf = conf; +int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); +long getConnectionTimeoutMs = 3; synchronized (TxnHandler.class) { if (connPool == null) { -Connection dbConn = null; -// Set up the JDBC connection pool -try { - int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); - long getConnectionTimeoutMs = 3; - connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); - /*the mutex pools should ideally be somewhat larger since some operations require 1 +connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); Review comment: Yea it is maybe better.. Though I don't see anything about turning off txn handler, the only thing is : metastore.txn.store.impl but it always has a default value. @pvary @deniskuzZ any thoughts if we prefer eager initialization? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 651973) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651951=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651951 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 18:59 Start Date: 16/Sep/21 18:59 Worklog Time Spent: 10m Work Description: sunchao commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710396047 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -368,33 +368,37 @@ public TxnHandler() { public void setConf(Configuration conf){ this.conf = conf; +int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); +long getConnectionTimeoutMs = 3; synchronized (TxnHandler.class) { if (connPool == null) { -Connection dbConn = null; -// Set up the JDBC connection pool -try { - int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); - long getConnectionTimeoutMs = 3; - connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); - /*the mutex pools should ideally be somewhat larger since some operations require 1 +connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); Review comment: > potentially add more connections for for HMS user who do not use any Txn functions Yea that's true .. I wonder if there is a way to know that the txn feature will be used beforehand. > Not sure all the threads will inevitably fail though, after the first succeeds they will skip What I mean is if there is some fatal issue in the DB connection for instance, all the threads will try the same path and fail. It's better to just fail once instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 651951) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module
[ https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651945 ] ASF GitHub Bot logged work on HIVE-25317: - Author: ASF GitHub Bot Created on: 16/Sep/21 18:51 Start Date: 16/Sep/21 18:51 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2459: URL: https://github.com/apache/hive/pull/2459#discussion_r710388364 ## File path: llap-server/pom.xml ## @@ -38,6 +38,7 @@ org.apache.hive hive-exec ${project.version} + core Review comment: we could relocate/shade away those deps to make it possible for other projects to use the normal artifact - seems like there is a very good list in the trino project. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651945) Time Spent: 3h 40m (was: 3.5h) > Relocate dependencies in shaded hive-exec module > > > Key: HIVE-25317 > URL: https://issues.apache.org/jira/browse/HIVE-25317 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.8 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Time Spent: 3h 40m > Remaining Estimate: 0h > > When we want to use shaded version of hive-exec (i.e., w/o classifier), more > dependencies conflict with Spark. We need to relocate these dependencies too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
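The "relocate/shade away those deps" approach discussed above is done with the Maven Shade Plugin's `<relocations>` configuration, which rewrites package references in the shaded bytecode. A minimal fragment follows; it is an illustrative sketch, not Hive's actual shade configuration (Hive's real rules live in the ql/hive-exec pom, and the shaded pattern shown is only an example of the convention):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Rewrite com.google.common.* references so the bundled Guava
           cannot clash with the Guava a downstream project (e.g. Spark)
           brings on its own classpath. -->
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.hive.com.google.common</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```

The caveat raised in the thread applies directly: relocation is safe only if no shaded class leaks through a public API that other modules compile against with the unrelocated name.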
[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module
[ https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651928=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651928 ] ASF GitHub Bot logged work on HIVE-25317: - Author: ASF GitHub Bot Created on: 16/Sep/21 18:23 Start Date: 16/Sep/21 18:23 Worklog Time Spent: 10m Work Description: sunchao commented on a change in pull request #2459: URL: https://github.com/apache/hive/pull/2459#discussion_r710369202 ## File path: llap-server/pom.xml ## @@ -38,6 +38,7 @@ org.apache.hive hive-exec ${project.version} + core Review comment: @kgyrtkirk Guava is shaded in branch-2.3 via https://issues.apache.org/jira/browse/HIVE-23980. The issue is, in order for Spark to use shaded `hive-exec`, Hive will need to relocate more classes and at the same time making sure it won't break other modules (for instance, if the shaded class appears in certain API and another module imported the unshaded version of the class by itself). Currently we've abandoned this approach and decided to shade the `hive-exec-core` within Spark itself, following similar approach in Trino (see https://github.com/trinodb/trino-hive-apache). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651928) Time Spent: 3.5h (was: 3h 20m) > Relocate dependencies in shaded hive-exec module > > > Key: HIVE-25317 > URL: https://issues.apache.org/jira/browse/HIVE-25317 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.8 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > When we want to use shaded version of hive-exec (i.e., w/o classifier), more > dependencies conflict with Spark. 
We need to relocate these dependencies too.
[jira] [Updated] (HIVE-25532) Fix authorization support for Kill Query Command
[ https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhay updated HIVE-25532: - Description: We added authorization for Kill Query command some time back with the help of Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851 However, we have observed that this hasn't been working as expected. The Ranger service expects Hive to send in a privilege object of the type SERVICE_NAME but we can see below [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131] that it is sending an empty array list. The Ranger service never throws an exception to this and this results in any user being able to kill any query even though they don't have necessary permissions. was: We added authorization for Kill Query command some time back with the help of Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851 However, we have observed that this hasn't been working as expected. The Ranger service expects Hive to send in a privilege object of the type SERVICE_NAME but we can see below [https://github.com/apache/hive/blob/master/service/src/java/org/apache/hive/service/server/KillQueryImpl.java#L131] that it is sending an empty array list. The Ranger service never throws an exception to this and this results in any user being able to kill any other query even though they don't have necessary permissions. > Fix authorization support for Kill Query Command > > > Key: HIVE-25532 > URL: https://issues.apache.org/jira/browse/HIVE-25532 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Abhay >Assignee: Abhay >Priority: Major > > We added authorization for Kill Query command some time back with the help of > Ranger. Below is the ticket https://issues.apache.org/jira/browse/RANGER-1851 > However, we have observed that this hasn't been working as expected. 
[jira] [Assigned] (HIVE-25532) Fix authorization support for Kill Query Command
[ https://issues.apache.org/jira/browse/HIVE-25532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhay reassigned HIVE-25532: > Fix authorization support for Kill Query Command
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651911=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651911 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 17:37 Start Date: 16/Sep/21 17:37 Worklog Time Spent: 10m Work Description: szehon-ho commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710336872 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -368,33 +368,37 @@ public TxnHandler() { public void setConf(Configuration conf){ this.conf = conf; +int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); +long getConnectionTimeoutMs = 3; synchronized (TxnHandler.class) { if (connPool == null) { -Connection dbConn = null; -// Set up the JDBC connection pool -try { - int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); - long getConnectionTimeoutMs = 3; - connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); - /*the mutex pools should ideally be somewhat larger since some operations require 1 +connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); Review comment: Do you mean eager initialization on HMS startup? I had thought it too and initially did not want to change the behavior and potentially add more connections for for HMS user who do not use any Txn functions. Is it what you mean? 
(Not sure all the threads will inevitably fail though, after the first succeeds they will skip) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
Issue Time Tracking --- Worklog Id: (was: 651911) Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651910=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651910 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 17:36 Start Date: 16/Sep/21 17:36 Worklog Time Spent: 10m Work Description: szehon-ho commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710336872 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -368,33 +368,37 @@ public TxnHandler() { public void setConf(Configuration conf){ this.conf = conf; +int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); +long getConnectionTimeoutMs = 3; synchronized (TxnHandler.class) { if (connPool == null) { -Connection dbConn = null; -// Set up the JDBC connection pool -try { - int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); - long getConnectionTimeoutMs = 3; - connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); - /*the mutex pools should ideally be somewhat larger since some operations require 1 +connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs); Review comment: Do you mean eager initialization on HMS startup? I had thought it too and initially did not want to change the behavior and add more connections for for HMS user who do not use any Txn functions. Is it what you mean? (Not sure all the threads will inevitably fail though, after the first succeeds they will skip) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651910) Time Spent: 2h 20m (was: 2h 10m) > NullPointerException in TxnHandler > -- > > Key: HIVE-25522 > URL: https://issues.apache.org/jira/browse/HIVE-25522 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg > issues a lot of lock() calls for commits. > We hit randomly a strange NPE that fails Iceberg commits. > {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy27.lock(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at >
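The pattern under discussion above is lazy initialization of a shared static connection pool inside each handler's setConf(), guarded by a class-level lock and a null check, so only the first thread through actually builds the pool. A minimal, self-contained Java sketch of that pattern (all names here are illustrative stand-ins, not the actual Hive classes):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazyPoolInit {
    // Stand-in for the shared JDBC connection pool (static across handlers).
    private static Object connPool;
    // Counts how many times the (expensive) pool setup actually runs.
    static final AtomicInteger setupCalls = new AtomicInteger();

    // Mirrors the setConf() pattern: every handler thread calls this, but the
    // class lock plus null check ensure only one thread builds the pool.
    public static void setConf() {
        synchronized (LazyPoolInit.class) {
            if (connPool == null) {
                connPool = setupJdbcConnectionPool();
            }
        }
    }

    private static Object setupJdbcConnectionPool() {
        setupCalls.incrementAndGet();
        return new Object();
    }

    public static void main(String[] args) throws Exception {
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(LazyPoolInit::setConf);
            ts[i].start();
        }
        for (Thread t : ts) {
            t.join();
        }
        // Later threads see the non-null pool and skip setup entirely.
        System.out.println("setup calls: " + setupCalls.get()); // setup calls: 1
    }
}
```

This is the behavior szehon-ho refers to when noting that "after the first succeeds they will skip": the null check inside the lock makes the setup idempotent across handler threads.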
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651909&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651909 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 17:36
            Start Date: 16/Sep/21 17:36
    Worklog Time Spent: 10m

Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710336872

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
     this.conf = conf;
+    int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+    long getConnectionTimeoutMs = 3;
     synchronized (TxnHandler.class) {
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);

Review comment:
   Do you mean eager initialization on HMS startup? I had thought of it too, and initially did not want to change the behavior and add more connections, potentially for nothing, for an HMS user that does not use any Txn functions. Is that what you mean? (Not sure all the threads will inevitably fail though; after the first succeeds they will skip.)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651909) Time Spent: 2h 10m (was: 2h) > NullPointerException in TxnHandler > -- > > Key: HIVE-25522 > URL: https://issues.apache.org/jira/browse/HIVE-25522 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg > issues a lot of lock() calls for commits. > We hit randomly a strange NPE that fails Iceberg commits. > {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy27.lock(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) >
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651893 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 17:16
            Start Date: 16/Sep/21 17:16
    Worklog Time Spent: 10m

Work Description: sunchao commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710321412

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -368,33 +368,37 @@ public TxnHandler() {
   public void setConf(Configuration conf){
     this.conf = conf;
+    int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+    long getConnectionTimeoutMs = 3;
     synchronized (TxnHandler.class) {
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);

Review comment:
   I wonder if we should initialize these static fields separately (perhaps in HiveMetaStore) instead of within each handler thread. Otherwise, all the handler threads will try this code and fail inevitably.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651893) Time Spent: 2h (was: 1h 50m) > NullPointerException in TxnHandler > -- > > Key: HIVE-25522 > URL: https://issues.apache.org/jira/browse/HIVE-25522 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Affects Versions: 3.1.2, 4.0.0 >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg > issues a lot of lock() calls for commits. > We hit randomly a strange NPE that fails Iceberg commits. > {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy27.lock(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > 
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at >
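The alternative sunchao raises above is eager initialization: build the shared pool once at server startup (e.g. from HiveMetaStore) rather than lazily in every handler's setConf(), so a failure surfaces once at startup instead of once per handler thread. A hedged sketch of that idea, with purely illustrative names (this is not Hive's actual API):

```java
public class EagerPoolInit {
    // Built exactly once at startup, before any handler thread exists.
    private static Object connPool;
    // Remembered so later callers fail fast with the original cause.
    private static RuntimeException initFailure;

    // Hypothetical hook called once during server startup.
    public static void startupInit(boolean simulateFailure) {
        try {
            if (simulateFailure) {
                throw new RuntimeException("cannot reach backing DB");
            }
            connPool = new Object();
        } catch (RuntimeException e) {
            // The failure is reported once, at startup, not in every thread.
            initFailure = e;
        }
    }

    // Handler threads only read the already-built pool; no locking needed.
    public static Object getPool() {
        if (initFailure != null) {
            throw initFailure;
        }
        if (connPool == null) {
            throw new IllegalStateException("startupInit() was never called");
        }
        return connPool;
    }

    public static void main(String[] args) {
        startupInit(false);
        System.out.println(getPool() != null); // true
    }
}
```

The trade-off named in the thread applies: eager setup spends connections even for deployments that never call any Txn functions, which is why the lazy variant was kept.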
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651862&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651862 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 16:32
            Start Date: 16/Sep/21 16:32
    Worklog Time Spent: 10m

Work Description: szehon-ho commented on pull request #2647:
URL: https://github.com/apache/hive/pull/2647#issuecomment-921054439

   Thanks for the review. Yea that's another option (or an 'initialized' boolean set only at the end), but I was afraid it would leave behind some connection or things that need cleanup if we partially initialize and re-initialize, though maybe it would not.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 651862)
    Time Spent: 1h 50m (was: 1h 40m)

> NullPointerException in TxnHandler
> ----------------------------------
>
>                 Key: HIVE-25522
>                 URL: https://issues.apache.org/jira/browse/HIVE-25522
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: 3.1.2, 4.0.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) > at com.sun.proxy.$Proxy27.lock(Unknown Source) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111) > at > org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107) > at java.base/java.security.AccessController.doPrivileged(Native Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at > org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer: > Error occurred during processing of message. > java.lang.NullPointerException: null > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > ~[hive-exec-3.1.2.jar:3.1.2] > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > ~[hive-exec-3.1.2.jar:3.1.2] > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > ~[hive-exec-3.1.2.jar:3.1.2] > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown > Source) ~[?:?] > at > jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > ~[?:?] > at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?] >
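The "'initialized' boolean set only at the end" idea mentioned in the comment above can be sketched concretely: each setup step is idempotent, and the flag flips only after every step succeeds, so a partially failed attempt is simply retried on the next call. This is a minimal illustration with hypothetical step names, not the Hive code:

```java
public class FlagGuardedInit {
    static boolean initialized = false;
    static Object pool;
    static Object mutexPool;

    public static synchronized void init(boolean failSecondStep) {
        if (initialized) {
            return; // everything was fully set up on an earlier call
        }
        if (pool == null) {
            pool = new Object(); // step 1 may survive a failed attempt
        }
        if (failSecondStep) {
            // Simulates a mid-initialization failure; "initialized" stays false,
            // which is exactly the partial-state concern raised in the thread.
            throw new RuntimeException("mutex pool setup failed");
        }
        if (mutexPool == null) {
            mutexPool = new Object();
        }
        initialized = true; // flipped only after every step succeeded
    }

    public static void main(String[] args) {
        try {
            init(true); // first attempt fails halfway through
        } catch (RuntimeException expected) {
            // pool exists but the flag is still false, so the next call retries
        }
        init(false);
        System.out.println(initialized); // true
    }
}
```

The objection in the thread is visible in the sketch: between the failed and successful attempts, `pool` is live but unpublished, so any resources it holds would need cleanup or safe reuse.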
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651861&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651861 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 16:31
            Start Date: 16/Sep/21 16:31
    Worklog Time Spent: 10m

Work Description: szehon-ho commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710281336

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 3;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
          connection from each pool and we want to avoid taking a connection from primary pool
          and then blocking because mutex pool is empty. There is only 1 thread in any HMS trying
          to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock. The CheckLock operation gets a
          connection from connPool first, then connPoolMutex. All others, go in the opposite
          order (not very elegant...).
         So number of connection requests for connPoolMutex cannot exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-        dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
           determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
         } catch (SQLException e) {
           String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
   Done

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids)
     }
   }
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {
     DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf);
     if (dsp != null) {
-      return dsp.create(conf);
+      try {
+        return dsp.create(conf);
+      } catch (SQLException e) {
+        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();
+        LOG.error(msg);
+        throw new RuntimeException(e);

Review comment:
   Done

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids)
-  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException {
+  private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) {

Review comment:
   Done

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
## @@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;
     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);

Review comment:
   Done

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
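Two small patterns recur in the diff above: taking the probe connection in try-with-resources so it is always released, and wrapping a checked SQLException from pool creation into a RuntimeException. A self-contained sketch of both, using a stand-in FakeConnection type (not java.sql.Connection) and hypothetical method names:

```java
import java.sql.SQLException;

public class PoolSetupSketch {
    // Stand-in for a JDBC connection; tracks whether close() ran.
    static class FakeConnection implements AutoCloseable {
        static boolean closed = false;
        @Override
        public void close() {
            closed = true;
        }
    }

    // Mirrors the dbProduct probe: try-with-resources guarantees the
    // connection is released whether or not the body throws.
    static void probeDatabase() {
        try (FakeConnection conn = new FakeConnection()) {
            // ... a determineDatabaseProduct(conn)-style inspection would go here ...
        }
    }

    // Mirrors the reworked setupJdbcConnectionPool(): the checked SQLException
    // is logged context and rethrown unchecked, so callers need no throws clause.
    static Object setupPool(boolean fail) {
        try {
            if (fail) {
                throw new SQLException("pool provider unavailable");
            }
            return new Object();
        } catch (SQLException e) {
            throw new RuntimeException(
                "Unable to instantiate JDBC connection pooling, " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        probeDatabase();
        System.out.println("connection closed: " + FakeConnection.closed); // connection closed: true
    }
}
```

Compared with the old code, which held a `Connection dbConn = null` across the whole try block, the try-with-resources form cannot leak the probe connection on an early exception.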
[jira] [Work logged] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files
[ https://issues.apache.org/jira/browse/HIVE-25529?focusedWorklogId=651816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651816 ]

ASF GitHub Bot logged work on HIVE-25529:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 15:34
            Start Date: 16/Sep/21 15:34
    Worklog Time Spent: 10m

Work Description: marton-bod merged pull request #2644:
URL: https://github.com/apache/hive/pull/2644

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 651816)
    Time Spent: 20m (was: 10m)

> Add tests for reading/writing Iceberg V2 tables with delete files
> -----------------------------------------------------------------
>
>                 Key: HIVE-25529
>                 URL: https://issues.apache.org/jira/browse/HIVE-25529
>             Project: Hive
>          Issue Type: Task
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2 tables can be created/read/written by Hive. While Hive has no delete statement yet on Iceberg tables, we can nonetheless use the Iceberg API to create delete files manually and then check if Hive honors those deletes.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan
[ https://issues.apache.org/jira/browse/HIVE-25485?focusedWorklogId=651811=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651811 ] ASF GitHub Bot logged work on HIVE-25485: - Author: ASF GitHub Bot Created on: 16/Sep/21 15:26 Start Date: 16/Sep/21 15:26 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2608: URL: https://github.com/apache/hive/pull/2608#discussion_r710227073 ## File path: ql/src/test/results/clientpositive/llap/union_literals.q.out ## @@ -0,0 +1,397 @@ +PREHOOK: query: explain +SELECT * FROM ( + VALUES(1, '1'), + (2, 'orange'), + (5, 'yellow'), + (10, 'green'), + (11, 'blue'), + (12, 'indigo'), + (20, 'violet')) + AS Colors +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table + A masked pattern was here +POSTHOOK: query: explain +SELECT * FROM ( + VALUES(1, '1'), + (2, 'orange'), + (5, 'yellow'), + (10, 'green'), + (11, 'blue'), + (12, 'indigo'), + (20, 'violet')) + AS Colors +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table + A masked pattern was here +STAGE DEPENDENCIES: + Stage-0 is a root stage + +STAGE PLANS: + Stage: Stage-0 +Fetch Operator + limit: -1 + Processor Tree: +TableScan + alias: _dummy_table + Row Limit Per Split: 1 + Select Operator +expressions: array(const struct(1,'1'),const struct(2,'orange'),const struct(5,'yellow'),const struct(10,'green'),const struct(11,'blue'),const struct(12,'indigo'),const struct(20,'violet')) (type: array>) +outputColumnNames: _col0 +UDTF Operator + function name: inline + Select Operator +expressions: col1 (type: int), col2 (type: string) +outputColumnNames: _col0, _col1 +ListSink + +PREHOOK: query: explain +SELECT * FROM ( + VALUES(1, '1'), + (2, 'orange'), + (5, 'yellow'), + (10, 'green'), + (11, 'blue'), + (12, 'indigo'), + (20, 'violet')) + AS Colors +union all + select 2,'2' +union all + select 2,'2' +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table + A masked pattern was here 
+POSTHOOK: query: explain +SELECT * FROM ( + VALUES(1, '1'), + (2, 'orange'), + (5, 'yellow'), + (10, 'green'), + (11, 'blue'), + (12, 'indigo'), + (20, 'violet')) + AS Colors +union all + select 2,'2' +union all + select 2,'2' +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table + A masked pattern was here +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 +Tez + A masked pattern was here + Edges: +Map 1 <- Union 2 (CONTAINS) +Map 3 <- Union 2 (CONTAINS) + A masked pattern was here + Vertices: +Map 1 +Map Operator Tree: +TableScan + alias: _dummy_table + Row Limit Per Split: 1 + Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE + Select Operator +expressions: array(const struct(2,'2'),const struct(2,'2')) (type: array>) +outputColumnNames: _col0 +Statistics: Num rows: 1 Data size: 56 Basic stats: COMPLETE Column stats: COMPLETE +UDTF Operator + Statistics: Num rows: 1 Data size: 56 Basic stats: COMPLETE Column stats: COMPLETE + function name: inline + Select Operator +expressions: col1 (type: int), col2 (type: string) +outputColumnNames: _col0, _col1 +Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE +File Output Operator + compressed: false + Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe +Execution mode: llap +LLAP IO: no inputs +Map 3 +Map Operator Tree: +TableScan + alias: _dummy_table + Row Limit Per Split: 1 + Statistics: Num rows: 1 Data size: 10 Basic stats:
[jira] [Work logged] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan
[ https://issues.apache.org/jira/browse/HIVE-25485?focusedWorklogId=651787&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651787 ]

ASF GitHub Bot logged work on HIVE-25485:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 15:07
            Start Date: 16/Sep/21 15:07
    Worklog Time Spent: 10m

Work Description: kgyrtkirk commented on a change in pull request #2608:
URL: https://github.com/apache/hive/pull/2608#discussion_r710210766

## File path: ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveTransformSimpleSelectsToInlineTableInUnion.java
## @@ -0,0 +1,214 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.optimizer.calcite.rules;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rel.type.RelDataType;
+import org.apache.calcite.rel.type.RelRecordType;
+import org.apache.calcite.rex.RexBuilder;
+import org.apache.calcite.rex.RexCall;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.fun.SqlStdOperatorTable;
+import org.apache.hadoop.hive.ql.metadata.Table;
+import org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException;
+import org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable;
+import org.apache.hadoop.hive.ql.optimizer.calcite.TraitsUtil;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableFunctionScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan;
+import org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveUnion;
+import org.apache.hadoop.hive.ql.optimizer.calcite.translator.SqlFunctionConverter;
+import org.apache.hadoop.hive.ql.parse.SemanticAnalyzer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.collect.ImmutableList;
+
+/**
+ * Transforms SELECTS of literals under UNION ALL into inline table scans.
+ */
+public class HiveTransformSimpleSelectsToInlineTableInUnion extends RelOptRule {

Review comment:
   I was unable to give it a decent name; renamed the class :+1:

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651787) Time Spent: 50m (was: 40m) > Transform selects of literals under a UNION ALL to inline table scan > > > Key: HIVE-25485 > URL: https://issues.apache.org/jira/browse/HIVE-25485 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {code} > select 1 > union all > select 1 > union all > [...] > union all > select 1 > {code} > results in a very big plan; which will have vertexes proportional to the > number of union all branch - hence it could be slow to execute it -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651785 ]

ASF GitHub Bot logged work on HIVE-25527:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 15:06
            Start Date: 16/Sep/21 15:06
    Worklog Time Spent: 10m

Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710210043

## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
## @@ -1820,6 +1830,8 @@ private static boolean removeFromRunningTaskMap(TreeMap

> LLAP Scheduler task exits with fatal error if the executor node is down
> -----------------------------------------------------------------------
>
>                 Key: HIVE-25527
>                 URL: https://issues.apache.org/jira/browse/HIVE-25527
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HiveServer2
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In case the executor host has gone down, activeInstances will be updated with null. So we need to check for empty/null values before accessing it.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651784&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651784 ]

ASF GitHub Bot logged work on HIVE-25527:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 15:06
            Start Date: 16/Sep/21 15:06
    Worklog Time Spent: 10m

Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710209656

## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
## @@ -1447,23 +1454,26 @@ private SelectHostResult selectHost(TaskInfo request, Map
           if (request.shouldForceLocality()) {
             requestedHostsWillBecomeAvailable = true;
           } else {
-            LlapServiceInstance inst = activeInstances.getByHost(host).stream().findFirst().get();
-            NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
-            if (nodeInfo != null && nodeInfo.getEnableTime() > request.getLocalityDelayTimeout()
-                && nodeInfo.isDisabled() && nodeInfo.hadCommFailure()) {
-              LOG.debug("Host={} will not become available within requested timeout", nodeInfo);
-              // This node will likely be activated after the task timeout expires.
-            } else {
-              // Worth waiting for the timeout.
-              requestedHostsWillBecomeAvailable = true;
+            for (LlapServiceInstance inst : activeInstancesByHost) {
+              NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
+              if (nodeInfo == null) {
+                LOG.warn("Null NodeInfo when attempting to get host {}", host);
+                // Leave requestedHostWillBecomeAvailable as is. If some other host is found - delay,
+                // else ends up allocating to a random host immediately.
+                continue;

Review comment:
   we can avoid continue by changing second if to else if

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651784) Time Spent: 1.5h (was: 1h 20m) > LLAP Scheduler task exits with fatal error if the executor node is down > --- > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
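The review above concerns replacing a `findFirst().get()` on a possibly-empty host list with a loop that skips missing NodeInfo entries, which is the NPE fix, plus pgaref's suggestion to use `else if` rather than `continue`. A simplified, runnable sketch of that null-guarded scan (types are stand-ins, not the LLAP scheduler classes):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullSafeHostScan {
    // Returns true when at least one instance on the host is worth waiting for.
    // A missing map entry means the node record vanished (e.g. the host went
    // down): it is skipped instead of dereferenced, avoiding the NPE.
    static boolean hostWillBecomeAvailable(List<String> instancesOnHost,
                                           Map<String, Boolean> instanceToNode) {
        boolean available = false;
        for (String workerIdentity : instancesOnHost) {
            Boolean nodeUsable = instanceToNode.get(workerIdentity);
            if (nodeUsable == null) {
                // Node info missing: leave the flag as-is and keep scanning.
                // Using else-if below makes an explicit "continue" unnecessary.
            } else if (nodeUsable) {
                available = true; // worth waiting for this host
            }
        }
        return available;
    }

    public static void main(String[] args) {
        Map<String, Boolean> nodes = new HashMap<>();
        nodes.put("worker-2", true); // "worker-1" has no node record at all
        List<String> onHost = Arrays.asList("worker-1", "worker-2");
        System.out.println(hostWillBecomeAvailable(onHost, nodes)); // true
    }
}
```

With the old single-instance lookup, the stale "worker-1" entry alone would have produced either a NoSuchElementException or a null dereference; the loop form degrades gracefully to checking the remaining instances.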
[jira] [Work logged] (HIVE-25516) ITestDbTxnManager is broken after HIVE-24120
[ https://issues.apache.org/jira/browse/HIVE-25516?focusedWorklogId=651783=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651783 ] ASF GitHub Bot logged work on HIVE-25516: - Author: ASF GitHub Bot Created on: 16/Sep/21 15:05 Start Date: 16/Sep/21 15:05 Worklog Time Spent: 10m Work Description: deniskuzZ merged pull request #2637: URL: https://github.com/apache/hive/pull/2637 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651783) Time Spent: 20m (was: 10m) > ITestDbTxnManager is broken after HIVE-24120 > > > Key: HIVE-25516 > URL: https://issues.apache.org/jira/browse/HIVE-25516 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651782 ]

ASF GitHub Bot logged work on HIVE-25527:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 15:05
            Start Date: 16/Sep/21 15:05
    Worklog Time Spent: 10m
      Work Description: pgaref commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710209073

##########
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##########
@@ -1447,23 +1454,26 @@ private SelectHostResult selectHost(TaskInfo request, Map
       if (request.shouldForceLocality()) {
         requestedHostsWillBecomeAvailable = true;
       } else {
-        LlapServiceInstance inst = activeInstances.getByHost(host).stream().findFirst().get();
-        NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
-        if (nodeInfo != null && nodeInfo.getEnableTime() > request.getLocalityDelayTimeout()
-            && nodeInfo.isDisabled() && nodeInfo.hadCommFailure()) {
-          LOG.debug("Host={} will not become available within requested timeout", nodeInfo);
-          // This node will likely be activated after the task timeout expires.
-        } else {
-          // Worth waiting for the timeout.
-          requestedHostsWillBecomeAvailable = true;
+        for (LlapServiceInstance inst : activeInstancesByHost) {
+          NodeInfo nodeInfo = instanceToNodeMap.get(inst.getWorkerIdentity());
+          if (nodeInfo == null) {
+            LOG.warn("Null NodeInfo when attempting to get host {}", host);
+            // Leave requestedHostWillBecomeAvailable as is. If some other host is found - delay,
+            // else ends up allocating to a random host immediately.
+            continue;

Review comment:
       we can avoid continue by changing second if to else if

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651782) Time Spent: 1h 20m (was: 1h 10m) > LLAP Scheduler task exits with fatal error if the executor node is down > --- > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651776=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651776 ] ASF GitHub Bot logged work on HIVE-25527: - Author: ASF GitHub Bot Created on: 16/Sep/21 15:00 Start Date: 16/Sep/21 15:00 Worklog Time Spent: 10m Work Description: maheshk114 commented on a change in pull request #2645: URL: https://github.com/apache/hive/pull/2645#discussion_r710204368 ## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java ## @@ -1820,6 +1830,8 @@ private static boolean removeFromRunningTaskMap(TreeMap LLAP Scheduler task exits with fatal error if the executor node is down > --- > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651774&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651774 ]

ASF GitHub Bot logged work on HIVE-25527:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 14:58
            Start Date: 16/Sep/21 14:58
    Worklog Time Spent: 10m
      Work Description: maheshk114 commented on a change in pull request #2645:
URL: https://github.com/apache/hive/pull/2645#discussion_r710203031

##########
File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java
##########
@@ -1430,6 +1430,13 @@ private SelectHostResult selectHost(TaskInfo request, Map
       boolean requestedHostsWillBecomeAvailable = false;
       for (String host : requestedHosts) {
         prefHostCount++;
+
+        // Check if the host is removed from the registry after availableHostMap is created.
+        Set<LlapServiceInstance> activeInstancesByHost = activeInstances.getByHost(host);
+        if (activeInstancesByHost == null || activeInstancesByHost.isEmpty()) {
+          continue;
+        }
+
         // Pick the first host always. Weak attempt at cache affinity.
         if (availableHostMap.containsKey(host)) {

Review comment:
       i think having this separate check makes the code more readable.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651774) Time Spent: 1h (was: 50m) > LLAP Scheduler task exits with fatal error if the executor node is down > --- > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
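The guard discussed across these review comments boils down to one defensive pattern: never assume that a host snapshotted earlier still has live instances in the registry by the time it is looked up again, and skip it (or return empty) instead of dereferencing the result of `.stream().findFirst().get()`. A minimal, dependency-free sketch of that pattern — class and method names here are illustrative stand-ins, not Hive's actual API:

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Sketch of the null/empty registry guard: a host can disappear from the
// registry between building availableHostMap and iterating requestedHosts.
public class HostGuard {

    // Returns the first live instance for the host, or empty if the host's
    // registry entry is gone or empty (executor node went down in between).
    static Optional<String> firstInstance(Map<String, Set<String>> instancesByHost, String host) {
        Set<String> instances = instancesByHost.get(host);
        if (instances == null || instances.isEmpty()) {
            // Unguarded code would do instances.stream().findFirst().get()
            // here and throw NPE / NoSuchElementException.
            return Optional.empty();
        }
        return instances.stream().findFirst();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> registry = Map.of("node1", Set.of("worker-a"));
        System.out.println(firstInstance(registry, "node1").isPresent()); // true
        System.out.println(firstInstance(registry, "gone").isPresent());  // false, no exception
    }
}
```

The caller then `continue`s past such hosts, which is exactly the behavior the patch adds to `selectHost`.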
[jira] [Assigned] (HIVE-25531) Remove the core classified hive-exec artifact
[ https://issues.apache.org/jira/browse/HIVE-25531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich reassigned HIVE-25531: --- > Remove the core classified hive-exec artifact > - > > Key: HIVE-25531 > URL: https://issues.apache.org/jira/browse/HIVE-25531 > Project: Hive > Issue Type: Improvement >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich >Priority: Major > > * this artifact was introduced in HIVE-7423 > * loading this artifact and the shaded hive-exec (along with the jdbc driver) > could create interesting classpath problems > * if other projects have issues with the shaded hive-exec artifact we must > start fix those problems -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views
[ https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416148#comment-17416148 ] Stamatis Zampetakis commented on HIVE-25530: This seems related to HIVE-23479. I think that by fixing HIVE-23479 the {{AssertionError}} described here may also disappear. > AssertionError when query involves multiple JDBC tables and views > - > > Key: HIVE-25530 > URL: https://issues.apache.org/jira/browse/HIVE-25530 > Project: Hive > Issue Type: Bug > Components: CBO, HiveServer2 >Affects Versions: 4.0.0 >Reporter: Stamatis Zampetakis >Assignee: Soumyakanti Das >Priority: Major > Fix For: 4.0.0 > > Attachments: engesc_6056.q > > > An {{AssertionError}} is thrown during compilation when a query contains > multiple external JDBC tables and there are available materialized views > which can be used to answer the query. > The problem can be reproduced by running the scenario in [^engesc_6056.q]. > {code:bash} > mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q > -Dtest.output.overwrite > {code} > The stacktrace is shown below: > {noformat} > java.lang.AssertionError: Rule's description should be unique; existing > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE) > at > org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406) > at > org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66) > at > org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233) > at > org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > 
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283) > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229) > at > org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1717) > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1589) > at > org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914) > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1341) > at >
[jira] [Commented] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views
[ https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416145#comment-17416145 ]

Stamatis Zampetakis commented on HIVE-25530:
--------------------------------------------

Basically any query with more than one external JDBC table and at least one view can trigger the problem.

From a quick look the culprit seems to be that we are [instantiating|https://github.com/apache/hive/blob/3e861c5f2775cda4821199681e0e2e9d25654371/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L3023] a new {{JdbcConvention}} for every table in the query. Then when the {{VolcanoPlanner}} runs it will [register the rules|https://github.com/apache/calcite/blob/f3baf348598fcc6bc4f97a0abee3f99309e5bf76/core/src/main/java/org/apache/calcite/plan/AbstractRelOptPlanner.java#L239] for every convention that is not registered till now. Since there is a new convention per table it will trigger the registering of the rules as many times as distinct tables in the query.

> AssertionError when query involves multiple JDBC tables and views
> -----------------------------------------------------------------
>
>                 Key: HIVE-25530
>                 URL: https://issues.apache.org/jira/browse/HIVE-25530
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO, HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Soumyakanti Das
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: engesc_6056.q
>
>
> An {{AssertionError}} is thrown during compilation when a query contains
> multiple external JDBC tables and there are available materialized views
> which can be used to answer the query.
> The problem can be reproduced by running the scenario in [^engesc_6056.q].
> {code:bash} > mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q > -Dtest.output.overwrite > {code} > The stacktrace is shown below: > {noformat} > java.lang.AssertionError: Rule's description should be unique; existing > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE) > at > org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406) > at > org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66) > at > org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233) > at > org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > 
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283) > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229) > at > org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027) > at >
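The diagnosis above — a fresh {{JdbcConvention}} instantiated per table, so the planner re-registers identically described rules and trips the "Rule's description should be unique" assertion — points at the usual remedy: intern one convention instance per data source and share it across all tables of that source. A dependency-free sketch of that interning pattern (names are hypothetical; this is not the actual Hive fix):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: one shared convention object per JDBC URL. Rule registration is
// keyed by the convention, so sharing the instance means the planner sees
// each rule description only once, however many tables reference the source.
public class ConventionInterner {

    private static final Map<String, Object> CONVENTIONS = new ConcurrentHashMap<>();

    static Object conventionFor(String jdbcUrl) {
        // computeIfAbsent atomically creates at most one instance per URL.
        return CONVENTIONS.computeIfAbsent(jdbcUrl, url -> new Object());
    }

    public static void main(String[] args) {
        Object first = conventionFor("jdbc:derby:memory:db1");
        Object second = conventionFor("jdbc:derby:memory:db1");
        System.out.println(first == second); // same instance, so rules register once
    }
}
```

With per-table instantiation, each new convention object would carry the same rule descriptions but a different identity, which is precisely the state the Calcite assertion rejects.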
[jira] [Assigned] (HIVE-25530) AssertionError when query involves multiple JDBC tables and views
[ https://issues.apache.org/jira/browse/HIVE-25530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-25530: -- > AssertionError when query involves multiple JDBC tables and views > - > > Key: HIVE-25530 > URL: https://issues.apache.org/jira/browse/HIVE-25530 > Project: Hive > Issue Type: Bug > Components: CBO, HiveServer2 >Affects Versions: 4.0.0 >Reporter: Stamatis Zampetakis >Assignee: Soumyakanti Das >Priority: Major > Fix For: 4.0.0 > > Attachments: engesc_6056.q > > > An {{AssertionError}} is thrown during compilation when a query contains > multiple external JDBC tables and there are available materialized views > which can be used to answer the query. > The problem can be reproduced by running the scenario in [^engesc_6056.q]. > {code:bash} > mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=engesc_6056.q > -Dtest.output.overwrite > {code} > The stacktrace is shown below: > {noformat} > java.lang.AssertionError: Rule's description should be unique; existing > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE); new > rule=JdbcToEnumerableConverterRule(in:JDBC.DERBY,out:ENUMERABLE) > at > org.apache.calcite.plan.AbstractRelOptPlanner.addRule(AbstractRelOptPlanner.java:158) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.addRule(VolcanoPlanner.java:406) > at > org.apache.calcite.adapter.jdbc.JdbcConvention.register(JdbcConvention.java:66) > at > org.apache.calcite.plan.AbstractRelOptPlanner.registerClass(AbstractRelOptPlanner.java:233) > at > org.apache.hadoop.hive.ql.optimizer.calcite.cost.HiveVolcanoPlanner.registerClass(HiveVolcanoPlanner.java:90) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1224) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > 
org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84) > at > org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:148) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268) > at > org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283) > at > org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewBoxing$HiveMaterializedViewUnboxingRule.onMatch(HiveMaterializedViewBoxing.java:210) > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:229) > at > org.apache.calcite.plan.volcano.IterativeRuleDriver.drive(IterativeRuleDriver.java:58) > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:510) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyMaterializedViewRewriting(CalcitePlanner.java:2027) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1717) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1589) > at > 
org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) > at > org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914) > at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) > at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1341) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12549) > at >
[jira] [Updated] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files
[ https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25529: -- Labels: pull-request-available (was: ) > Add tests for reading/writing Iceberg V2 tables with delete files > - > > Key: HIVE-25529 > URL: https://issues.apache.org/jira/browse/HIVE-25529 > Project: Hive > Issue Type: Task >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Since Iceberg V2 tables are now official, we can start testing out whether V2 > tables can be created/read/written by Hive. While Hive has no delete > statement yet on Iceberg tables, we can nonetheless use the Iceberg API to > create delete files manually and then check if Hive honors those deletes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files
[ https://issues.apache.org/jira/browse/HIVE-25529?focusedWorklogId=651644&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651644 ]

ASF GitHub Bot logged work on HIVE-25529:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 13:02
            Start Date: 16/Sep/21 13:02
    Worklog Time Spent: 10m
      Work Description: pvary commented on a change in pull request #2644:
URL: https://github.com/apache/hive/pull/2644#discussion_r710095882

##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/HiveIcebergTestUtils.java
##########
@@ -299,4 +311,68 @@ public static void validateDataWithSQL(TestHiveShell shell, String tableName, Li
     }
   }
 }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param equalityFields List of field names that should play a role in the equality check
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param rowsToDelete The rows that should be deleted. It's enough to fill out the fields that are relevant for the
+   *                     equality check, as listed in equalityFields, the rest of the fields are ignored
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createEqualityDeleteFile(Table table, String deleteFilePath, List<String> equalityFields,
+      FileFormat fileFormat, List<Record> rowsToDelete) throws IOException {
+    List<Integer> equalityFieldIds = equalityFields.stream()
+        .map(id -> table.schema().findField(id).fieldId())
+        .collect(Collectors.toList());
+    Schema eqDeleteRowSchema = table.schema().select(equalityFields.toArray(new String[]{}));
+
+    FileAppenderFactory<Record> appenderFactory = new GenericAppenderFactory(table.schema(), table.spec(),
+        ArrayUtil.toIntArray(equalityFieldIds), eqDeleteRowSchema, null);
+    EncryptedOutputFile outputFile = table.encryption().encrypt(HadoopOutputFile.fromPath(
+        new org.apache.hadoop.fs.Path(table.location(), deleteFilePath), new Configuration()));
+
+    PartitionKey part = new PartitionKey(table.spec(), eqDeleteRowSchema);
+    part.partition(rowsToDelete.get(0));
+    EqualityDeleteWriter<Record> eqWriter = appenderFactory.newEqDeleteWriter(outputFile, fileFormat, part);
+    try (EqualityDeleteWriter<Record> writer = eqWriter) {
+      writer.deleteAll(rowsToDelete);
+    }
+    return eqWriter.toDeleteFile();
+  }
+
+  /**
+   * @param table The table to create the delete file for
+   * @param deleteFilePath The path where the delete file should be created, relative to the table location root
+   * @param fileFormat The file format that should be used for writing out the delete file
+   * @param partitionValues A map of partition values (partitionKey=partitionVal, ...) to be used for the delete file
+   * @param deletes The list of position deletes, each containing the data file path, the position of the row in the
+   *                data file and the row itself that should be deleted
+   * @return The DeleteFile created
+   * @throws IOException If there is an error during DeleteFile write
+   */
+  public static DeleteFile createPositionalDeleteFile(Table table, String deleteFilePath, FileFormat fileFormat,

Review comment:
       Thx for the investigation

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
            Worklog Id: (was: 651644)
    Remaining Estimate: 0h
            Time Spent: 10m

> Add tests for reading/writing Iceberg V2 tables with delete files
> -----------------------------------------------------------------
>
>                 Key: HIVE-25529
>                 URL: https://issues.apache.org/jira/browse/HIVE-25529
>             Project: Hive
>          Issue Type: Task
>            Reporter: Marton Bod
>            Assignee: Marton Bod
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since Iceberg V2 tables are now official, we can start testing out whether V2
> tables can be created/read/written by Hive. While Hive has no delete
> statement yet on Iceberg tables, we can nonetheless use the Iceberg API to
> create delete files manually and then check if Hive honors those deletes.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651641&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651641 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 12:44
            Start Date: 16/Sep/21 12:44
    Worklog Time Spent: 10m
      Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710079309

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##########
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;

     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 3;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
         connection from each pool and we want to avoid taking a connection from primary pool
         and then blocking because mutex pool is empty. There is only 1 thread in any HMS trying
        to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock. The CheckLock operation gets a
        connection from connPool first, then connPoolMutex. All others, go in the opposite
        order (not very elegant...). So number of connection requests for connPoolMutex cannot
        exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-          connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-          dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
           determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
         } catch (SQLException e) {
           String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
       Should we update the exception text here as you are handling pooling exceptions inside of setupJdbcConnectionPool, like Unable to determine database product ?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 651641)
    Time Spent: 1.5h  (was: 1h 20m)

> NullPointerException in TxnHandler
> ----------------------------------
>
>                 Key: HIVE-25522
>                 URL: https://issues.apache.org/jira/browse/HIVE-25522
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: 3.1.2, 4.0.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at >
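The patch under review splits `setConf`'s single try block into independently null-guarded initializations, so a transient SQL failure while probing the database no longer leaves some shared fields set and others permanently `null` — the half-initialized state behind the NPE above. The shape of that refactor, reduced to a dependency-free sketch (field and method names mirror the patch, but the bodies are stand-in stubs):

```java
// Sketch: per-resource lazy initialization under one class-level lock.
// Each field is retried independently on every call, so a failure creating
// one resource earlier does not strand the others uninitialized forever.
public class LazyResources {

    private static Object connPool;
    private static Object connPoolMutex;
    private static String dbProduct;

    static synchronized void setConf() {
        if (connPool == null) {
            connPool = new Object();      // stands in for setupJdbcConnectionPool(...)
        }
        if (connPoolMutex == null) {
            connPoolMutex = new Object(); // slightly larger pool in the real code
        }
        if (dbProduct == null) {
            dbProduct = "derby";          // stands in for determineDatabaseProduct(dbConn)
        }
    }

    static boolean fullyInitialized() {
        return connPool != null && connPoolMutex != null && dbProduct != null;
    }

    public static void main(String[] args) {
        setConf();
        setConf(); // idempotent: a second call finds every field already set
        System.out.println(fullyInitialized()); // true
    }
}
```

The key property is that a caller arriving after a partial failure re-enters `setConf` and completes only the missing pieces, instead of skipping the whole block because the first field happened to be non-null.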
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651639&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651639 ]

ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
                Author: ASF GitHub Bot
            Created on: 16/Sep/21 12:42
            Start Date: 16/Sep/21 12:42
    Worklog Time Spent: 10m
      Work Description: deniskuzZ commented on a change in pull request #2647:
URL: https://github.com/apache/hive/pull/2647#discussion_r710079309

##########
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##########
@@ -369,32 +369,36 @@ public void setConf(Configuration conf){
     this.conf = conf;

     synchronized (TxnHandler.class) {
+      int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
+      long getConnectionTimeoutMs = 3;
       if (connPool == null) {
-        Connection dbConn = null;
-        // Set up the JDBC connection pool
-        try {
-          int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS);
-          long getConnectionTimeoutMs = 3;
-          connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
-          /*the mutex pools should ideally be somewhat larger since some operations require 1
+        connPool = setupJdbcConnectionPool(conf, maxPoolSize, getConnectionTimeoutMs);
+      }
+
+      if (connPoolMutex == null) {
+        /*the mutex pools should ideally be somewhat larger since some operations require 1
        connection from each pool and we want to avoid taking a connection from primary pool
        and then blocking because mutex pool is empty. There is only 1 thread in any HMS trying
        to mutex on each MUTEX_KEY except MUTEX_KEY.CheckLock. The CheckLock operation gets a
        connection from connPool first, then connPoolMutex. All others, go in the opposite
        order (not very elegant...). So number of connection requests for connPoolMutex cannot
        exceed (size of connPool + MUTEX_KEY.values().length - 1).*/
-          connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
-          dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED);
+        connPoolMutex = setupJdbcConnectionPool(conf, maxPoolSize + MUTEX_KEY.values().length, getConnectionTimeoutMs);
+      }
+
+      if (dbProduct == null) {
+        try (Connection dbConn = getDbConn(Connection.TRANSACTION_READ_COMMITTED)) {
           determineDatabaseProduct(dbConn);
-          sqlGenerator = new SQLGenerator(dbProduct, conf);
         } catch (SQLException e) {
           String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

Review comment:
       Should we update the exception text here as you are handling pooling exceptions inside of setupJdbcConnectionPool?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 651639)
    Time Spent: 1h 20m  (was: 1h 10m)

> NullPointerException in TxnHandler
> ----------------------------------
>
>                 Key: HIVE-25522
>                 URL: https://issues.apache.org/jira/browse/HIVE-25522
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: 3.1.2, 4.0.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg
> issues a lot of lock() calls for commits.
> We hit randomly a strange NPE that fails Iceberg commits.
> {noformat} > 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] > metastore.RetryingHMSHandler: java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903) > at > org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217) > at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) >
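The refactor reviewed in the worklog above replaces one try-block that initialized `connPool`, `connPoolMutex`, and `dbProduct` together with an independent null-guard per field, so a failure while initializing one field no longer leaves the others permanently null on later `setConf` calls. A simplified, self-contained sketch of that pattern (the types here are hypothetical stand-ins, not the real Hive classes):

```java
// Guard-per-field lazy initialization, as in the patch under review:
// each field is checked and initialized independently, and a repeated
// call is a no-op for anything already set up.
public class LazyInitSketch {
    private static Object connPool;
    private static Object connPoolMutex;
    private static Object dbProduct;
    static int initCalls = 0; // counts pool creations, for the demo below

    static Object createPool() {
        initCalls++;
        return new Object();
    }

    public static synchronized void setConf() {
        if (connPool == null) {
            connPool = createPool();
        }
        if (connPoolMutex == null) {
            connPoolMutex = createPool();
        }
        if (dbProduct == null) {
            dbProduct = new Object(); // stand-in for determineDatabaseProduct()
        }
    }

    public static void main(String[] args) {
        setConf();
        setConf(); // second call finds every field non-null and does nothing
        System.out.println(initCalls); // 2: one creation per pool, ever
    }
}
```

The point of the split is that if, say, the `dbProduct` lookup throws, the next `setConf` call retries only the missing pieces instead of skipping initialization entirely, which is how `sqlGenerator`/`dbProduct` could previously stay null and trigger the NPE in `enqueueLockWithRetry`.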
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651638=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651638 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 12:40 Start Date: 16/Sep/21 12:40 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r71004 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids) } } - private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException { + private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) { DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf); if (dsp != null) { - return dsp.create(conf); + try { +return dsp.create(conf); + } catch (SQLException e) { +String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage(); +LOG.error(msg); +throw new RuntimeException(e); Review comment: Could we please add msg to the exception, i.e. throw new RuntimeException(msg, e) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651638) Time Spent: 1h 10m (was: 1h)
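The review suggestion above, `throw new RuntimeException(msg, e)` instead of `throw new RuntimeException(e)`, matters because the cause-only constructor derives its message from the cause's `toString()`, discarding the context string that was only logged. A generic demonstration (not the Hive code itself):

```java
import java.sql.SQLException;

// Shows what callers see with and without passing the context message
// to the wrapping RuntimeException.
public class WrapDemo {
    public static void main(String[] args) {
        SQLException e = new SQLException("connection refused");
        String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage();

        RuntimeException withoutMsg = new RuntimeException(e);
        RuntimeException withMsg = new RuntimeException(msg, e);

        // Cause-only constructor: message is just the cause's toString().
        System.out.println(withoutMsg.getMessage()); // java.sql.SQLException: connection refused
        // Message + cause: the context survives and the cause stays attached.
        System.out.println(withMsg.getMessage());    // Unable to instantiate JDBC connection pooling, connection refused
        System.out.println(withMsg.getCause() == e); // true
    }
}
```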
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651634=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651634 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 12:37 Start Date: 16/Sep/21 12:37 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710052878 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids) } } - private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException { + private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) { DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf); if (dsp != null) { - return dsp.create(conf); + try { +return dsp.create(conf); + } catch (SQLException e) { +String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage(); +LOG.error(msg); +throw new RuntimeException(e); Review comment: Why is this change required? There is handling of checked SQLException in setConf. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651634) Time Spent: 1h (was: 50m)
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651621=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651621 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 12:05 Start Date: 16/Sep/21 12:05 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710052878 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids) } } - private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException { + private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) { DataSourceProvider dsp = DataSourceProviderFactory.tryGetDataSourceProviderOrNull(conf); if (dsp != null) { - return dsp.create(conf); + try { +return dsp.create(conf); + } catch (SQLException e) { +String msg = "Unable to instantiate JDBC connection pooling, " + e.getMessage(); +LOG.error(msg); +throw new RuntimeException(e); Review comment: Why is this change required? There is handling of checked SQLException in setConf. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651621) Time Spent: 50m (was: 40m)
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651619=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651619 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 12:01 Start Date: 16/Sep/21 12:01 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710050068 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -5567,10 +5570,16 @@ private void removeTxnsFromMinHistoryLevel(Connection dbConn, List txnids) } } - private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) throws SQLException { + private static synchronized DataSource setupJdbcConnectionPool(Configuration conf, int maxPoolSize, long getConnectionTimeoutMs) { Review comment: I don't think it needs to be static and synchronized. The only place where it's used is the setConf method, under the synchronized block -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651619) Time Spent: 40m (was: 0.5h)
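The reviewer's point above, that `setupJdbcConnectionPool` is only ever called under `synchronized (TxnHandler.class)`, rests on the fact that a `static synchronized` method locks the same class monitor the caller already holds, and intrinsic locks are reentrant. So the modifier adds no extra protection (though it is harmless). A minimal standalone demonstration of the same shape:

```java
// A static synchronized helper invoked from a block that already
// synchronizes on the class object: both use the same monitor.
public class ReentrantMonitorDemo {
    static synchronized boolean helper() {
        // True here: the static synchronized method holds the class monitor.
        return Thread.holdsLock(ReentrantMonitorDemo.class);
    }

    public static void main(String[] args) {
        synchronized (ReentrantMonitorDemo.class) {
            // The nested acquisition is reentrant, so the call completes
            // instead of deadlocking; the modifier on helper() is therefore
            // redundant when every caller already holds this monitor.
            System.out.println(helper()); // true
        }
    }
}
```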
[jira] [Work logged] (HIVE-25522) NullPointerException in TxnHandler
[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=651617=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651617 ] ASF GitHub Bot logged work on HIVE-25522: - Author: ASF GitHub Bot Created on: 16/Sep/21 11:50 Start Date: 16/Sep/21 11:50 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2647: URL: https://github.com/apache/hive/pull/2647#discussion_r710042962 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -369,32 +369,36 @@ public void setConf(Configuration conf){ this.conf = conf; synchronized (TxnHandler.class) { + int maxPoolSize = MetastoreConf.getIntVar(conf, ConfVars.CONNECTION_POOLING_MAX_CONNECTIONS); Review comment: minor: maxPoolSize & connectionTimeoutMs could be moved outside of the synchronized block -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651617) Time Spent: 0.5h (was: 20m)
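The "minor" suggestion above is the usual rule of keeping a critical section as short as possible: values derived purely from the method argument involve no shared state, so they can be computed before entering the lock. A sketch with a hypothetical config reader (not the real `MetastoreConf` API):

```java
import java.util.Map;

// Hoisting pure computations out of the synchronized block so the lock
// only guards the shared-state mutation.
public class HoistDemo {
    private static Object connPool;

    static Object getPool() {
        return connPool;
    }

    static int readMaxPoolSize(Map<String, Integer> conf) {
        return conf.getOrDefault("pool.max", 10);
    }

    public static void setConf(Map<String, Integer> conf) {
        // Hoisted: reads only the argument, so no lock is needed here.
        int maxPoolSize = readMaxPoolSize(conf);
        long getConnectionTimeoutMs = 30000; // illustrative value

        synchronized (HoistDemo.class) {
            // Lock scope is now just the lazy assignment to shared state.
            if (connPool == null) {
                connPool = "pool(size=" + maxPoolSize
                        + ",timeout=" + getConnectionTimeoutMs + ")";
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> conf = new java.util.HashMap<>();
        conf.put("pool.max", 5);
        setConf(conf);
        System.out.println(getPool()); // pool(size=5,timeout=30000)
    }
}
```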
[jira] [Assigned] (HIVE-25529) Add tests for reading/writing Iceberg V2 tables with delete files
[ https://issues.apache.org/jira/browse/HIVE-25529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marton Bod reassigned HIVE-25529: - > Add tests for reading/writing Iceberg V2 tables with delete files > - > > Key: HIVE-25529 > URL: https://issues.apache.org/jira/browse/HIVE-25529 > Project: Hive > Issue Type: Task >Reporter: Marton Bod >Assignee: Marton Bod >Priority: Major > > Since Iceberg V2 tables are now official, we can start testing out whether V2 > tables can be created/read/written by Hive. While Hive has no delete > statement yet on Iceberg tables, we can nonetheless use the Iceberg API to > create delete files manually and then check if Hive honors those deletes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
[ https://issues.apache.org/jira/browse/HIVE-25503?focusedWorklogId=651590=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651590 ] ASF GitHub Bot logged work on HIVE-25503: - Author: ASF GitHub Bot Created on: 16/Sep/21 10:25 Start Date: 16/Sep/21 10:25 Worklog Time Spent: 10m Work Description: deniskuzZ edited a comment on pull request #2612: URL: https://github.com/apache/hive/pull/2612#issuecomment-920777855 > Can we add test cases for this method? > Can we at least manually run these tests with the supported databases - the queries look scary - added unit test - performed manual test on all supported dbs using ITestDbTxnManager -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651590) Time Spent: 40m (was: 0.5h) > Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries > -- > > Key: HIVE-25503 > URL: https://issues.apache.org/jira/browse/HIVE-25503 > Project: Hive > Issue Type: Task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Performance improvement. Accumulated entries in COMPLETED_TXN_COMPONENTS can > lead to query performance degradation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25503) Add cleanup for the duplicate COMPLETED_TXN_COMPONENTS entries
[ https://issues.apache.org/jira/browse/HIVE-25503?focusedWorklogId=651589=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651589 ] ASF GitHub Bot logged work on HIVE-25503: - Author: ASF GitHub Bot Created on: 16/Sep/21 10:24 Start Date: 16/Sep/21 10:24 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #2612: URL: https://github.com/apache/hive/pull/2612#issuecomment-920777855 > Can we add test cases for this method? > Can we at least manually run these tests with the supported databases - the queries look scary added unit test performed manual test on all supported dbs using ITestDbTxnManager -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651589) Time Spent: 0.5h (was: 20m) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down
[ https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis updated HIVE-25527: --- Summary: LLAP Scheduler task exits with fatal error if the executor node is down (was: LLAP Scheduler task exits with fatal error if the executor node is down.) > LLAP Scheduler task exits with fatal error if the executor node is down > --- > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.
[ https://issues.apache.org/jira/browse/HIVE-25527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17416034#comment-17416034 ] Stamatis Zampetakis commented on HIVE-25527: Hey [~maheshk114], can you please also include the error in the summary of this ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path
[ https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=651556=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651556 ] ASF GitHub Bot logged work on HIVE-25303: - Author: ASF GitHub Bot Created on: 16/Sep/21 09:14 Start Date: 16/Sep/21 09:14 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2442: URL: https://github.com/apache/hive/pull/2442#discussion_r709936548 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java ## @@ -472,6 +474,28 @@ private void setLoadFileLocation( loc = cmv.getLocation(); } Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new Path(loc); +if (pCtx.getQueryProperties().isCTAS()) { + boolean isExternal = pCtx.getCreateTable().isExternal(); + boolean isAcid = pCtx.getCreateTable().getTblProps().getOrDefault( + hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true") || + pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES); + if ((HiveConf.getBoolVar(conf, HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || isExternal) && !isAcid) { Review comment: 1. doesn't matter; if that will be a performance bottleneck later we will handle it - but as it is now, it only adds complexity/reduces readability... this stuff works incorrectly because it's too complicated already; it would be better to stop adding twists... 2. add a parameter to the rpc call and pass the value of `hive.create.as.external.legacy` over to the transformer so it will be able to handle that as well. 3. can't we keep a full table object in the `ctd` instead of just some parts of it? could that help overcome that issue? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651556) Time Spent: 3h 10m (was: 3h) > CTAS hive.create.as.external.legacy tries to place data files in managed WH > path > > > Key: HIVE-25303 > URL: https://issues.apache.org/jira/browse/HIVE-25303 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > Under legacy table creation mode (hive.create.as.external.legacy=true), when > a database has been created in a specific LOCATION, in a session where that > database is used, a table created with the following command: > {code:java} > CREATE TABLE AS SELECT {code} > should inherit the HDFS path from the database's location. Instead, Hive > tries to write the table data into > /warehouse/tablespace/managed/hive// > +Design+: > In a CTAS query, the data is first written to the target directory (this > happens in HS2) and then the table is created (this happens in HMS). So two > decisions are being made: i) the target directory location, and ii) how the > table should be created (table type, sd, etc.). > When HS2 needs the target location to be set, it makes a create-table dry-run > call to HMS (where table translation happens); decisions i) and ii) are made > within HMS, which returns the table object. Then HS2 uses this location set > by HMS for placing the data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
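The location decision being debated above can be sketched in isolation. This is an illustrative stand-in, not the actual TaskCompiler/HMS-translator code: the class and parameter names here are hypothetical, and only the condition shape ((legacy-external OR external) AND NOT ACID, per the diff under review) is taken from the thread.

```java
// Hypothetical sketch of the CTAS target-location decision discussed in the
// review above. The real logic spans TaskCompiler#setLoadFileLocation and the
// HMS table translator; names below are illustrative only.
public class CtasLocationPicker {
    /**
     * Under legacy external creation (hive.create.as.external.legacy=true),
     * a non-ACID CTAS table should inherit the database's own location
     * instead of the managed warehouse path.
     */
    static String pickLocation(boolean legacyExternal, boolean isExternal,
                               boolean isAcid, String dbLocation,
                               String managedWarehouse, String tableName) {
        if ((legacyExternal || isExternal) && !isAcid) {
            return dbLocation + "/" + tableName;   // inherit DB location
        }
        return managedWarehouse + "/" + tableName; // managed WH path
    }
}
```

The reviewer's point 2 (passing `hive.create.as.external.legacy` over the RPC to the translator) would move the `legacyExternal` input to the HMS side so both decisions are made in one place.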
[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path
[ https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=651554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651554 ] ASF GitHub Bot logged work on HIVE-25303: - Author: ASF GitHub Bot Created on: 16/Sep/21 09:09 Start Date: 16/Sep/21 09:09 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2442: URL: https://github.com/apache/hive/pull/2442#discussion_r709932488 ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/TaskCompiler.java ## @@ -472,6 +474,32 @@ private void setLoadFileLocation( loc = cmv.getLocation(); } Path location = (loc == null) ? getDefaultCtasLocation(pCtx) : new Path(loc); +boolean isExternal = false; +boolean isAcid = false; +if (pCtx.getQueryProperties().isCTAS()) { + isExternal = pCtx.getCreateTable().isExternal(); + isAcid = pCtx.getCreateTable().getTblProps().getOrDefault( + hive_metastoreConstants.TABLE_IS_TRANSACTIONAL, "false").equalsIgnoreCase("true") || + pCtx.getCreateTable().getTblProps().containsKey(hive_metastoreConstants.TABLE_TRANSACTIONAL_PROPERTIES); + if ((HiveConf.getBoolVar(conf, HiveConf.ConfVars.CREATE_TABLE_AS_EXTERNAL) || (isExternal || !isAcid))) { Review comment: that seems to me premature optimization which may just hit back later...it would be simpler to run everything related to location thru the translator and even move the handling of `CREATE_TABLE_AS_EXTERNAL` to there - so that everything is on the same page. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651554) Time Spent: 3h (was: 2h 50m) > CTAS hive.create.as.external.legacy tries to place data files in managed WH > path > > > Key: HIVE-25303 > URL: https://issues.apache.org/jira/browse/HIVE-25303 > Project: Hive > Issue Type: Bug > Components: HiveServer2, Standalone Metastore >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > Under legacy table creation mode (hive.create.as.external.legacy=true), when > a database has been created in a specific LOCATION, in a session where that > database is Used, tables are created using the following command: > {code:java} > CREATE TABLE AS SELECT {code} > should inherit the HDFS path from the database's location. Instead, Hive is > trying to write the table data into > /warehouse/tablespace/managed/hive// > +Design+: > In the CTAS query, first data is written in the target directory (which > happens in HS2) and then the table is created(This happens in HMS). So here > two decisions are being made i) target directory location ii) how the table > should be created (table type, sd e.t.c). > When HS2 needs a target location that needs to be set, it'll make create > table dry run call to HMS (where table translation happens) and i) and ii) > decisions are made within HMS and returns table object. Then HS2 will use > this location set by HMS for placing the data. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module
[ https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651541=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651541 ] ASF GitHub Bot logged work on HIVE-25317: - Author: ASF GitHub Bot Created on: 16/Sep/21 08:42 Start Date: 16/Sep/21 08:42 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2459: URL: https://github.com/apache/hive/pull/2459#discussion_r709911738 ## File path: llap-server/pom.xml ## @@ -38,6 +38,7 @@ org.apache.hive hive-exec ${project.version} + core Review comment: note: on branch2 guava is most likely not properly shaded away HIVE-22126 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651541) Time Spent: 3h 20m (was: 3h 10m) > Relocate dependencies in shaded hive-exec module > > > Key: HIVE-25317 > URL: https://issues.apache.org/jira/browse/HIVE-25317 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.8 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > When we want to use shaded version of hive-exec (i.e., w/o classifier), more > dependencies conflict with Spark. We need to relocate these dependencies too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25317) Relocate dependencies in shaded hive-exec module
[ https://issues.apache.org/jira/browse/HIVE-25317?focusedWorklogId=651539=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651539 ] ASF GitHub Bot logged work on HIVE-25317: - Author: ASF GitHub Bot Created on: 16/Sep/21 08:38 Start Date: 16/Sep/21 08:38 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2459: URL: https://github.com/apache/hive/pull/2459#discussion_r709908764 ## File path: llap-server/pom.xml ## @@ -38,6 +38,7 @@ org.apache.hive hive-exec ${project.version} + core Review comment: we should fix the issues with using the normal `hive-exec` artifact if there is any - loading the core jar could cause troubles... -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651539) Time Spent: 3h 10m (was: 3h) > Relocate dependencies in shaded hive-exec module > > > Key: HIVE-25317 > URL: https://issues.apache.org/jira/browse/HIVE-25317 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.3.8 >Reporter: L. C. Hsieh >Assignee: L. C. Hsieh >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > When we want to use shaded version of hive-exec (i.e., w/o classifier), more > dependencies conflict with Spark. We need to relocate these dependencies too. -- This message was sent by Atlassian Jira (v8.3.4#803005)
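Relocation of the kind this ticket describes is done with the maven-shade-plugin. A minimal illustrative fragment follows; the pattern chosen (guava, which the comment above notes was likely not properly shaded on branch-2) is an example, not Hive's actual relocation list:

```xml
<!-- Illustrative maven-shade-plugin relocation; not Hive's actual pom. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <!-- Move bundled guava under a Hive-private package so the classes in
           the shaded hive-exec jar cannot conflict with Spark's guava. -->
      <relocation>
        <pattern>com.google.common</pattern>
        <shadedPattern>org.apache.hive.com.google.common</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```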
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651519=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651519 ] ASF GitHub Bot logged work on HIVE-25527: - Author: ASF GitHub Bot Created on: 16/Sep/21 08:03 Start Date: 16/Sep/21 08:03 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #2645: URL: https://github.com/apache/hive/pull/2645#discussion_r709881332 ## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java ## @@ -1820,6 +1830,8 @@ private static boolean removeFromRunningTaskMap(TreeMap LLAP Scheduler task exits with fatal error if the executor node is down. > > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25527) LLAP Scheduler task exits with fatal error if the executor node is down.
[ https://issues.apache.org/jira/browse/HIVE-25527?focusedWorklogId=651517=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651517 ] ASF GitHub Bot logged work on HIVE-25527: - Author: ASF GitHub Bot Created on: 16/Sep/21 08:01 Start Date: 16/Sep/21 08:01 Worklog Time Spent: 10m Work Description: pgaref commented on a change in pull request #2645: URL: https://github.com/apache/hive/pull/2645#discussion_r709880097 ## File path: llap-tez/src/java/org/apache/hadoop/hive/llap/tezplugins/LlapTaskSchedulerService.java ## @@ -1430,6 +1430,13 @@ private SelectHostResult selectHost(TaskInfo request, Map boolean requestedHostsWillBecomeAvailable = false; for (String host : requestedHosts) { prefHostCount++; + + // Check if the host is removed from the registry after availableHostMap is created. + Set activeInstancesByHost = activeInstances.getByHost(host); + if (activeInstancesByHost == null || activeInstancesByHost.isEmpty()) { +continue; + } + // Pick the first host always. Weak attempt at cache affinity. if (availableHostMap.containsKey(host)) { Review comment: I would avoid the continue statement above and modify the condition to: if (availableHostMap.containsKey(host) && activeInstancesByHost != null && !activeInstancesByHost.isEmpty()) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651517) Time Spent: 40m (was: 0.5h) > LLAP Scheduler task exits with fatal error if the executor node is down. 
> > > Key: HIVE-25527 > URL: https://issues.apache.org/jira/browse/HIVE-25527 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: mahesh kumar behera >Assignee: mahesh kumar behera >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > In case the executor host has gone down, activeInstances will be updated with > null. So we need to check for empty/null values before accessing it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
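The reviewer's suggested restructuring (folding the null/empty check into the existing condition rather than an early `continue`) can be sketched as follows. The types are simplified stand-ins: the real code works on `activeInstances.getByHost(host)` and `availableHostMap` inside LlapTaskSchedulerService#selectHost, and the element type of the instance set is not shown in the quoted diff, so `String` is used here purely for illustration.

```java
import java.util.*;

// Simplified stand-in for the selectHost() guard discussed above.
public class HostGuard {
    /**
     * A host is usable only when it is still in the available map AND it
     * still has live registered instances. The instance set can be null or
     * empty if the host was removed from the registry after availableHostMap
     * was built, which is the race the patch defends against.
     */
    static boolean isUsable(String host,
                            Map<String, Set<String>> instancesByHost,
                            Set<String> availableHosts) {
        Set<String> active = instancesByHost.get(host);
        // Combined condition, as suggested in the review, instead of a
        // separate continue statement.
        return availableHosts.contains(host) && active != null && !active.isEmpty();
    }
}
```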
[jira] [Resolved] (HIVE-24263) Create an HMS endpoint to list partition locations
[ https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho resolved HIVE-24263. -- Resolution: Duplicate > Create an HMS endpoint to list partition locations > -- > > Key: HIVE-24263 > URL: https://issues.apache.org/jira/browse/HIVE-24263 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24263.patch > > Time Spent: 40m > Remaining Estimate: 0h > > In our company, we have a use-case to quickly get a list of partition > locations. Currently it is done via listPartitions, which is a very heavy > operation in terms of memory and performance. > This JIRA proposes an API: Map<String, String> listPartitionLocations(String > db, String table, short max) that returns a map of partition names to > locations. > For example, we have an integration from the output of a Hive pipeline to Spark > jobs that consume directly from HDFS. The Spark job scheduler needs to know > the partition paths that are available for consumption (the partition name is > not sufficient, as its input is an HDFS path), and so we have to do a heavy > listPartitions() call for this. > Another use-case is an HDFS data removal tool that does a nightly crawl to > see if there are associated Hive partitions mapped to a given partition path. > The nightly crawling job could be much less resource-intensive if we had a > listPartitionLocations(). > As there is already an internal method in the ObjectStore for this, used by > dropPartitions, it is only a matter of exposing this API through > HiveMetaStoreClient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24263) Create an HMS endpoint to list partition locations
[ https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24263 started by Szehon Ho. > Create an HMS endpoint to list partition locations > -- > > Key: HIVE-24263 > URL: https://issues.apache.org/jira/browse/HIVE-24263 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24263.patch > > Time Spent: 40m > Remaining Estimate: 0h > > In our company, we have a use-case to get quickly a list of partition > locations. Currently it is done via listPartitions, which is a very heavy > operation in terms of memory and performance. > This JIRA proposes an API: Map listPartitionLocations(String > db, String table, short max) that returns a map of partition names to > locations. > For example, we have an integration from output of a Hive pipeline to Spark > jobs that consume directly from HDFS. The Spark job scheduler needs to know > the partition paths that are available for consumption (the partition name is > not sufficient as it's input is HDFS path), and so we have to do heavy > listPartitions() for this. > Another use-case is for a HDFS data removal tool that does a nightly crawl to > see if there are associated hive partitions mapped to a given partition path. > The nightly crawling job could be much less resource-intensive if we had a > listPartitionLocations(). > As there is already an internal method in the ObjectStore for this done for > dropPartitions, it is only a matter of exposing this API to > HiveMetaStoreClient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24263) Create an HMS endpoint to list partition locations
[ https://issues.apache.org/jira/browse/HIVE-24263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-24263: - Status: Open (was: Patch Available) > Create an HMS endpoint to list partition locations > -- > > Key: HIVE-24263 > URL: https://issues.apache.org/jira/browse/HIVE-24263 > Project: Hive > Issue Type: Improvement > Components: Standalone Metastore >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24263.patch > > Time Spent: 40m > Remaining Estimate: 0h > > In our company, we have a use-case to get quickly a list of partition > locations. Currently it is done via listPartitions, which is a very heavy > operation in terms of memory and performance. > This JIRA proposes an API: Map listPartitionLocations(String > db, String table, short max) that returns a map of partition names to > locations. > For example, we have an integration from output of a Hive pipeline to Spark > jobs that consume directly from HDFS. The Spark job scheduler needs to know > the partition paths that are available for consumption (the partition name is > not sufficient as it's input is HDFS path), and so we have to do heavy > listPartitions() for this. > Another use-case is for a HDFS data removal tool that does a nightly crawl to > see if there are associated hive partitions mapped to a given partition path. > The nightly crawling job could be much less resource-intensive if we had a > listPartitionLocations(). > As there is already an internal method in the ObjectStore for this done for > dropPartitions, it is only a matter of exposing this API to > HiveMetaStoreClient. -- This message was sent by Atlassian Jira (v8.3.4#803005)
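The shape of the proposed API (a map of partition names to locations, capped at `max`) can be sketched with an interface and an in-memory stand-in. Note that HiveMetaStoreClient does not actually expose this method (the ticket was resolved as a duplicate); `FakeClient` and the sample paths below are purely illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch of the endpoint proposed in HIVE-24263: partition name -> location.
interface PartitionLocationLister {
    /** Returns partition name -> HDFS location, up to max entries. */
    Map<String, String> listPartitionLocations(String db, String table, short max);
}

// In-memory stand-in used only to illustrate the call shape; a real
// implementation would query the ObjectStore, as the ticket notes
// dropPartitions already does internally.
class FakeClient implements PartitionLocationLister {
    private final Map<String, String> partitions = new TreeMap<>();

    FakeClient() {
        partitions.put("ds=2021-09-15", "hdfs://nn/warehouse/sales/orders/ds=2021-09-15");
        partitions.put("ds=2021-09-16", "hdfs://nn/warehouse/sales/orders/ds=2021-09-16");
    }

    @Override
    public Map<String, String> listPartitionLocations(String db, String table, short max) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : partitions.entrySet()) {
            if (out.size() >= max) break; // honor the max cap
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}
```

The point of the design is visible even in the fake: the caller gets locations directly, without materializing full Partition objects as listPartitions does.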
[jira] [Updated] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-20789: - Status: Open (was: Patch Available) > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.2.patch, HIVE-20789.patch > > > We have had a scenario where health checks sending 0 bytes to HiveServer2 > sockets would DDoS the HiveServer2: if for some reason they hang or otherwise > don't send TCP FIN, all HiveServer2 thrift thread-pool threads will > block reading the socket. > This is the stack (we are running an older version of Hive here): > {noformat} > "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 > java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <23781b74> (a java.io.BufferedInputStream) > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) > at > org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) > at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) > at > org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) > at 
org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at > org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > at > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748){noformat} > Eventually HiveServer2 has no more free threads left. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-20789) HiveServer2 should have Timeouts against clients that never close sockets
[ https://issues.apache.org/jira/browse/HIVE-20789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho resolved HIVE-20789. -- Resolution: Won't Fix > HiveServer2 should have Timeouts against clients that never close sockets > - > > Key: HIVE-20789 > URL: https://issues.apache.org/jira/browse/HIVE-20789 > Project: Hive > Issue Type: Bug >Reporter: Szehon Ho >Assignee: Szehon Ho >Priority: Major > Attachments: HIVE-20789.2.patch, HIVE-20789.patch > > > We have had a scenario that health checks sending 0 bytes to HiveServer2 > sockets would DDOS the HiveServer2, if for some reason they hang or otherwise > don't send TCP FIN, then all HiveServer2 thrift thread-pool threads will > block reading the socket. > This is the stack (we are running an older version of Hive here) > {noformat} > "HiveServer2-Handler-Pool: Thread-2512239" - Thread t@2512239 > java.lang.Thread.State: RUNNABLE > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) > at java.io.BufferedInputStream.read(BufferedInputStream.java:345) > - locked <23781b74> (a java.io.BufferedInputStream) > at > org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127) > at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:346) > at > org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:423) > at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:405) > at > org.apache.thrift.transport.TSaslServerTransport.read(TSaslServerTransport.java:41) > at 
org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) > at > org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at > org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > at > org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:27) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748){noformat} > Eventually HiveServer2 has no more free threads left. -- This message was sent by Atlassian Jira (v8.3.4#803005)
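The defense the ticket title asks for is a server-side read timeout, so a client that connects but never sends bytes (and never sends FIN) cannot pin a worker thread forever. A minimal sketch with raw java.net sockets follows; the real fix would be configured on the Thrift server transport, which is not shown here.

```java
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Illustrative sketch only: a per-connection read timeout frees the worker
// thread instead of letting it block forever in socketRead0 as in the
// stack trace above.
public class TimeoutServerSketch {
    /**
     * Tries to read the first byte from the client. Returns true if a byte
     * arrived, false if the client stayed silent past timeoutMs. The socket
     * is closed either way (try-with-resources), freeing the worker thread.
     */
    static boolean readFirstByte(Socket client, int timeoutMs) throws Exception {
        client.setSoTimeout(timeoutMs); // fail the blocking read after timeoutMs
        try (InputStream in = client.getInputStream()) {
            return in.read() >= 0;
        } catch (SocketTimeoutException e) {
            return false; // idle client; do not let it hold the thread
        }
    }
}
```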
[jira] [Commented] (HIVE-25526) Run create_table Q test from TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-25526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17415936#comment-17415936 ] László Pintér commented on HIVE-25526: -- Submitted to master. Thanks for the patch, [~lvegh]! > Run create_table Q test from TestCliDriver > -- > > Key: HIVE-25526 > URL: https://issues.apache.org/jira/browse/HIVE-25526 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Laszlo Vegh >Assignee: Laszlo Vegh >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > create_table QTest should be picked up by the TestCliDriver. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25526) Run create_table Q test from TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-25526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Pintér resolved HIVE-25526. -- Resolution: Fixed > Run create_table Q test from TestCliDriver > -- > > Key: HIVE-25526 > URL: https://issues.apache.org/jira/browse/HIVE-25526 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Laszlo Vegh >Assignee: Laszlo Vegh >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > create_table QTest should be picked up by the TestCliDriver. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25526) Run create_table Q test from TestCliDriver
[ https://issues.apache.org/jira/browse/HIVE-25526?focusedWorklogId=651501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651501 ] ASF GitHub Bot logged work on HIVE-25526: - Author: ASF GitHub Bot Created on: 16/Sep/21 07:28 Start Date: 16/Sep/21 07:28 Worklog Time Spent: 10m Work Description: lcspinter merged pull request #2643: URL: https://github.com/apache/hive/pull/2643 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 651501) Time Spent: 0.5h (was: 20m) > Run create_table Q test from TestCliDriver > -- > > Key: HIVE-25526 > URL: https://issues.apache.org/jira/browse/HIVE-25526 > Project: Hive > Issue Type: Test > Components: Hive >Reporter: Laszlo Vegh >Assignee: Laszlo Vegh >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > create_table QTest should be picked up by the TestCliDriver. -- This message was sent by Atlassian Jira (v8.3.4#803005)