[ https://issues.apache.org/jira/browse/HIVE-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nemon Lou updated HIVE-10817:
-----------------------------
Attachment: HIVE-10817
> Blacklist For Bad MetaStore
> ---------------------------
>
> Key: HIVE-10817
> URL: https://issues.apache.org/jira/browse/HIVE-10817
> Project: Hive
> Issue Type: Improvement
> Components: HiveServer2, Metastore
> Affects Versions: 1.2.0
> Reporter: Nemon Lou
> Assignee: Nemon Lou
> Attachments: HIVE-10817
>
>
> During a reliability test, when one of the MetaStore machines was powered
> down, HiveServer2 stopped submitting jobs to YARN.
> There were 100 JDBC clients (Beeline) running concurrently, and all 100
> of them hung.
> After checking HiveServer2's thread stacks, I found that most of the threads
> were waiting to lock AbstractService, while the thread holding the lock was
> trying to connect to the bad MetaStore that had been powered down. When the
> thread holding the lock finally got a SocketTimeoutException and released
> the lock, another thread acquired it and was stuck in turn until its own
> socket timeout expired.
> Adding a new blacklist mechanism solved this issue.
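
The blacklist idea described above can be illustrated with a minimal Java
sketch. It is not taken from the attached patch: the class name
MetaStoreBlacklist, its methods, and the expiry (TTL) policy are all
hypothetical, chosen only to show how a client could skip a MetaStore URI
that recently timed out instead of letting every thread wait for the socket
timeout in turn.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not the HIVE-10817 implementation. */
class MetaStoreBlacklist {
    // Maps a MetaStore URI to the time (in millis) at which its entry expires.
    private final Map<String, Long> expiryByUri = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the expiry logic is testable

    MetaStoreBlacklist(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Record a failed connection attempt (for example, a SocketTimeoutException). */
    void markBad(String uri) {
        expiryByUri.put(uri, clock.getAsLong() + ttlMillis);
    }

    /** True while the URI's blacklist entry has not yet expired. */
    boolean isBlacklisted(String uri) {
        Long expiry = expiryByUri.get(uri);
        if (expiry == null) {
            return false;
        }
        if (clock.getAsLong() >= expiry) {
            // Entry expired: drop it so the MetaStore can be retried.
            expiryByUri.remove(uri, expiry);
            return false;
        }
        return true;
    }

    /**
     * Returns the configured URIs that are not blacklisted, in their original
     * order. If every URI is blacklisted, falls back to the full list so the
     * caller still has something to try.
     */
    List<String> candidates(List<String> configuredUris) {
        List<String> healthy = new ArrayList<>();
        for (String uri : configuredUris) {
            if (!isBlacklisted(uri)) {
                healthy.add(uri);
            }
        }
        return healthy.isEmpty() ? new ArrayList<>(configuredUris) : healthy;
    }
}
```

In this sketch, a connecting thread would call candidates(...) before taking
the lock and markBad(...) when a connection attempt times out, so later
threads skip the dead host until its entry expires.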
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)