[ https://issues.apache.org/jira/browse/HIVE-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558043#comment-14558043 ]
Nemon Lou commented on HIVE-10817:
----------------------------------
Stacks of two threads that are waiting to acquire the lock:
{code}
"HiveServer2-Handler-Pool: Thread-22945" prio=10 tid=0x00007fa28c30b000 nid=0x6d5b waiting for monitor entry [0x00007fa291f7c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hive.service.AbstractService.getHiveConf(AbstractService.java:145)
    - waiting to lock <0x0000000724944908> (a org.apache.hive.service.cli.CLIService)
{code}
{code}
"HiveServer2-Handler-Pool: Thread-22856" prio=10 tid=0x00007fa28c248800 nid=0x1c9c waiting for monitor entry [0x00007fa29207d000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:432)
    - waiting to lock <0x0000000724944908> (a org.apache.hive.service.cli.CLIService)
{code}
Stack of the thread that is holding the lock:
{code}
"HiveServer2-Handler-Pool: Thread-23178" prio=10 tid=0x00007fa28c374000 nid=0x2761 runnable [0x00007fa289edb000]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    ...
    at org.apache.hive.service.cli.CLIService.getDelegationTokenFromMetaStore(CLIService.java:440)
    - locked <0x0000000724944908> (a org.apache.hive.service.cli.CLIService)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.getDelegationToken(ThriftCLIService.java:449)
{code}
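The stacks above show one thread holding the CLIService monitor while it opens a socket, and other threads (including one that only calls getHiveConf) queued on the same monitor. As a rough illustration of that pattern, and not the change made for this issue, the sketch below keeps a blocking remote call outside the lock so that cheap accessors stay responsive. The class NarrowLockService, its methods, and the URI thrift://ms1:9083 are hypothetical and are not Hive code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.function.Function;

// Hypothetical stand-in for a service like CLIService; not Hive code.
class NarrowLockService {
    private final Object lock = new Object();
    private final String metastoreUri;

    NarrowLockService(String metastoreUri) {
        this.metastoreUri = metastoreUri;
    }

    // Cheap accessor: holds the lock only long enough to read a field.
    String getMetastoreUri() {
        synchronized (lock) {
            return metastoreUri;
        }
    }

    // The remote call may block for a long time, so it runs with no lock held.
    String getDelegationToken(Function<String, String> remoteCall) {
        String uri = getMetastoreUri(); // snapshot taken under the lock
        return remoteCall.apply(uri);   // blocking work happens outside the lock
    }

    // Starts a "stuck" remote call, then checks that the accessor still returns.
    static boolean accessorStaysResponsive() {
        NarrowLockService svc = new NarrowLockService("thrift://ms1:9083");
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread slow = new Thread(() -> svc.getDelegationToken(uri -> {
            entered.countDown(); // signal that the remote call has started
            try {
                release.await(); // simulate a connect that does not return yet
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "token-for-" + uri;
        }));
        slow.setDaemon(true);
        slow.start();
        try {
            entered.await();
            String uri = svc.getMetastoreUri(); // would hang if the lock were held
            release.countDown();
            slow.join();
            return "thrift://ms1:9083".equals(uri);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If getDelegationToken were itself synchronized on the same lock, the call to getMetastoreUri inside accessorStaysResponsive would block until the simulated connect returned, which matches the BLOCKED state seen in the waiting threads.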
> Blacklist For Bad MetaStore
> ---------------------------
>
> Key: HIVE-10817
> URL: https://issues.apache.org/jira/browse/HIVE-10817
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Metastore
> Affects Versions: 1.2.0
> Reporter: Nemon Lou
> Assignee: Nemon Lou
>
> During a reliability test, one of the MetaStore machines was powered down,
> and HiveServer2 never submitted jobs to YARN again.
> There were 100 JDBC clients (Beeline) running concurrently, and all 100 of
> them hung.
> After checking HiveServer2's thread stacks, I found that most of the threads
> were waiting to lock AbstractService, while the thread holding the lock was
> trying to connect to the bad MetaStore that had been powered down. When the
> thread holding the lock finally returned with a SocketTimeoutException and
> released the lock, another thread acquired the lock and was stuck in turn
> until its own socket timed out.
> Adding a new blacklist mechanism finally solved this issue.
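
The description above does not say how the blacklist works, so the following is only a minimal sketch of one possible shape for it, not the change made for HIVE-10817. It assumes that a MetaStore URI is skipped for a fixed time after a failed connection and is retried once that time has passed. The class MetastoreBlacklist, its methods, the TTL parameter, and the URIs in the usage note are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a time-based blacklist; not the HIVE-10817 patch.
class MetastoreBlacklist {
    private final long ttlMillis;
    // Maps a URI to the time (in ms) until which it should be skipped.
    private final Map<String, Long> blockedUntil = new ConcurrentHashMap<>();

    MetastoreBlacklist(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Record a failed connection attempt, e.g. after a socket timeout.
    void markBad(String uri, long nowMillis) {
        blockedUntil.put(uri, nowMillis + ttlMillis);
    }

    boolean isBlacklisted(String uri, long nowMillis) {
        Long until = blockedUntil.get(uri);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {
            blockedUntil.remove(uri, until); // entry expired: allow a retry
            return false;
        }
        return true;
    }

    // Returns the URIs worth trying, in their original order. If every URI
    // is blacklisted, all of them are returned so the caller still tries.
    List<String> candidates(List<String> uris, long nowMillis) {
        List<String> healthy = new ArrayList<>();
        for (String uri : uris) {
            if (!isBlacklisted(uri, nowMillis)) {
                healthy.add(uri);
            }
        }
        return healthy.isEmpty() ? new ArrayList<>(uris) : healthy;
    }
}
```

Under these assumptions, a caller would invoke markBad("thrift://ms1:9083", now) after a SocketTimeoutException and pick its next connection target from candidates(...), so later requests would not each wait for a full socket timeout against the machine that is down. Time is passed in explicitly here only to keep the sketch easy to test.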
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)