[ https://issues.apache.org/jira/browse/HIVE-25522?focusedWorklogId=654194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-654194 ]
ASF GitHub Bot logged work on HIVE-25522:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 22/Sep/21 17:59
Start Date: 22/Sep/21 17:59
Worklog Time Spent: 10m
Work Description: szehon-ho opened a new pull request #2647:
URL: https://github.com/apache/hive/pull/2647
### What changes were proposed in this pull request?
* This fixes https://issues.apache.org/jira/browse/HIVE-25522
* There are two options: make the initialization static and kill HMS if it
fails, or keep it lazy. This PR takes the second approach, since
initialization seems to take db connections that are unnecessary if nobody
uses any TxnHandler methods.
* Make the initialization method setConf idempotent: it checks each of the
static variables it sets, so it can resume if a particular variable is not
set.
* Also refactor to push verbose catch blocks down as far as possible.
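
The idempotent, resumable initialization described above can be illustrated with a minimal sketch. It is not the code from this PR: the class `IdempotentInitSketch`, its fields `connPool` and `mutexPool`, and the `Supplier` parameters are hypothetical stand-ins for the static state that `setConf` populates in `TxnHandler`. The point is only that each static field is checked on its own, so a call that fails halfway leaves the completed steps in place and a later call resumes from the first unset field.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names are taken from TxnHandler.
final class IdempotentInitSketch {
    // Stand-ins for static state that a setConf-style method would populate.
    private static volatile String connPool;
    private static volatile String mutexPool;

    // Count how often each resource was actually built.
    static final AtomicInteger poolBuilds = new AtomicInteger();
    static final AtomicInteger mutexBuilds = new AtomicInteger();

    // Lazy and idempotent: each field is initialized only if it is still null.
    // If a factory throws, its field stays null and the next call retries it.
    static synchronized void setConf(Supplier<String> poolFactory,
                                     Supplier<String> mutexFactory) {
        if (connPool == null) {
            connPool = poolFactory.get();
            poolBuilds.incrementAndGet();
        }
        if (mutexPool == null) {
            mutexPool = mutexFactory.get();
            mutexBuilds.incrementAndGet();
        }
    }

    static boolean isReady() {
        return connPool != null && mutexPool != null;
    }

    // Clears all state so the sketch can be re-run from scratch.
    static synchronized void reset() {
        connPool = null;
        mutexPool = null;
        poolBuilds.set(0);
        mutexBuilds.set(0);
    }

    public static void main(String[] args) {
        reset();
        try {
            // First attempt: the second step fails, e.g. the database is unreachable.
            setConf(() -> "pool", () -> { throw new IllegalStateException("db down"); });
        } catch (IllegalStateException e) {
            System.out.println("ready after failed attempt: " + isReady());
        }
        // Second attempt resumes: only the missing field is initialized.
        setConf(() -> "other pool", () -> "mutex");
        System.out.println("ready after retry: " + isReady());
        System.out.println("pool builds: " + poolBuilds.get());
    }
}
```

In this sketch the first call builds the pool and then fails, and the retry builds only the missing piece, so the pool is never created twice. A caller that skips the readiness check after a failed attempt would see a null field, which is the kind of partially initialized state this change is meant to recover from.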
### Why are the changes needed?
* See https://issues.apache.org/jira/browse/HIVE-25522
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Running unit tests
Issue Time Tracking
-------------------
Worklog Id: (was: 654194)
Time Spent: 4h (was: 3h 50m)
> NullPointerException in TxnHandler
> ----------------------------------
>
> Key: HIVE-25522
> URL: https://issues.apache.org/jira/browse/HIVE-25522
> Project: Hive
> Issue Type: Improvement
> Components: Standalone Metastore
> Affects Versions: 3.1.2, 4.0.0
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Priority: Major
> Labels: pull-request-available
> Time Spent: 4h
> Remaining Estimate: 0h
>
> Environment: Using Iceberg on Hive 3.1.2 standalone metastore. Iceberg
> issues a lot of lock() calls for commits.
> We randomly hit a strange NPE that fails Iceberg commits.
> {noformat}
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195]
> metastore.RetryingHMSHandler: java.lang.NullPointerException
> at
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
> at
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
> at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown Source)
> at
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> at com.sun.proxy.$Proxy27.lock(Unknown Source)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> at java.base/java.security.AccessController.doPrivileged(Native Method)
> at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:834)
> 2021-08-21T11:08:05,665 ERROR [pool-6-thread-195] server.TThreadPoolServer:
> Error occurred during processing of message.
> java.lang.NullPointerException: null
> at
> org.apache.hadoop.hive.metastore.txn.TxnHandler.enqueueLockWithRetry(TxnHandler.java:1903)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.txn.TxnHandler.lock(TxnHandler.java:1827)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.lock(HiveMetaStore.java:7217)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at jdk.internal.reflect.GeneratedMethodAccessor52.invoke(Unknown
> Source) ~[?:?]
> at
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:?]
> at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at com.sun.proxy.$Proxy27.lock(Unknown Source) ~[?:?]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18111)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$lock.getResult(ThriftHiveMetastore.java:18095)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:111)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:107)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
> at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> ~[hadoop-common-3.1.4.jar:?]
> at
> org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:119)
> ~[hive-exec-3.1.2.jar:3.1.2]
> at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> [hive-exec-3.1.2.jar:3.1.2]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [?:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [?:?]
> at java.lang.Thread.run(Thread.java:834) [?:?]
> {noformat}
> It seems to be this line, though the root cause is not determined.
> https://github.com/apache/hive/blob/rel/release-3.1.2/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java#L1903