[ https://issues.apache.org/jira/browse/HIVE-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nemon Lou updated HIVE-10817:
-----------------------------
    Attachment: HIVE-10817

> Blacklist For Bad MetaStore
> ---------------------------
>
>                 Key: HIVE-10817
>                 URL: https://issues.apache.org/jira/browse/HIVE-10817
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2, Metastore
>    Affects Versions: 1.2.0
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>         Attachments: HIVE-10817
>
>
> During a reliability test, one of the MetaStore machines was powered down,
> and from then on HiveServer2 never submitted another job to YARN.
> There were 100 JDBC clients (Beeline) running concurrently, and all 100
> of them hung.
> After checking HiveServer2's thread stacks, I found that most of the threads
> were waiting to lock AbstractService, while the one thread holding the lock
> was trying to connect to the bad MetaStore that had been powered down. When
> the thread holding the lock finally got a SocketTimeoutException and released
> the lock, another thread acquired it and was stuck in turn until its own
> socket timeout.
> Adding a new blacklist mechanism finally solved this issue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
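
For readers following the thread, here is a minimal sketch of what a blacklist of this kind could look like. It is not the attached patch, whose contents are not shown in this message. The class name MetaStoreBlacklist, its methods, and the expiry (TTL) policy are all hypothetical. The idea is only that a URI which just failed is skipped for a while, so later threads do not each wait out a full socket timeout against the same dead host.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical sketch, not Hive code: remembers MetaStore URIs that recently failed. */
public class MetaStoreBlacklist {
    // URI -> time (in millis) until which the URI should be skipped.
    private final Map<URI, Long> blockedUntilMillis = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so the behavior can be tested

    public MetaStoreBlacklist(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Call after a connection attempt to this URI failed (e.g. a socket timeout). */
    public void markBad(URI uri) {
        blockedUntilMillis.put(uri, clock.getAsLong() + ttlMillis);
    }

    /** True while the URI's blacklist entry has not yet expired. */
    public boolean isBlacklisted(URI uri) {
        Long until = blockedUntilMillis.get(uri);
        if (until == null) {
            return false;
        }
        if (clock.getAsLong() >= until) {
            // Entry expired: drop it so the host gets another chance.
            blockedUntilMillis.remove(uri, until);
            return false;
        }
        return true;
    }

    /**
     * Returns the configured URIs minus blacklisted ones, in the original order.
     * If every URI is blacklisted, falls back to the full list rather than
     * leaving the caller with nothing to try.
     */
    public List<URI> candidates(List<URI> configured) {
        List<URI> healthy = new ArrayList<>();
        for (URI uri : configured) {
            if (!isBlacklisted(uri)) {
                healthy.add(uri);
            }
        }
        return healthy.isEmpty() ? new ArrayList<>(configured) : healthy;
    }
}
```

A caller would pass System::currentTimeMillis as the clock, ask candidates(...) for the URIs to try, and call markBad(...) on any URI whose connection attempt times out. Falling back to the full list when everything is blacklisted is a design choice made for this sketch; the actual patch may behave differently.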