[ 
https://issues.apache.org/jira/browse/HBASE-17704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15887093#comment-15887093
 ] 

Pankaj Kumar commented on HBASE-17704:
--------------------------------------

[~apurtell], could we have a chore service that tries to recover regions 
that have been in transition for a long time (say, more than 10 minutes)? 

I think such a chore would be useful in some situations for reassigning 
regions that are stuck in the FAILED_OPEN/FAILED_CLOSE state indefinitely. 
In the scenario in this JIRA, for example, the DNs came back up after some 
time, but the HM still couldn't reassign the regions.
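
To make the idea concrete, here is a rough sketch of what such a chore could 
look like. It is plain JDK code, not HBase code: StuckRegionChoreSketch, 
RegionInTransition, State, findStuckRegions and start are all hypothetical 
names made up for illustration, and the 10 minute threshold is just the value 
suggested above. It also assumes that only FAILED_OPEN/FAILED_CLOSE regions 
should be retried. A real patch would presumably hook into the master's 
existing chore and assignment code instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch only; none of these types are HBase classes. */
final class StuckRegionChoreSketch {

    /** Simplified, hypothetical subset of region states. */
    enum State { OPEN, PENDING_OPEN, FAILED_OPEN, FAILED_CLOSE }

    /** Hypothetical snapshot of one region in transition. */
    static final class RegionInTransition {
        final String regionName;
        final State state;
        final Instant since; // when the region entered this state

        RegionInTransition(String regionName, State state, Instant since) {
            this.regionName = regionName;
            this.state = state;
            this.since = since;
        }
    }

    /** Returns names of regions that have sat in a FAILED_* state longer than threshold. */
    static List<String> findStuckRegions(List<RegionInTransition> regions,
                                         Instant now, Duration threshold) {
        List<String> stuck = new ArrayList<>();
        for (RegionInTransition rit : regions) {
            boolean failed = rit.state == State.FAILED_OPEN
                    || rit.state == State.FAILED_CLOSE;
            boolean tooLong = Duration.between(rit.since, now).compareTo(threshold) > 0;
            if (failed && tooLong) {
                stuck.add(rit.regionName);
            }
        }
        return stuck;
    }

    /**
     * Periodically scans for stuck regions and hands each one to the
     * caller-supplied reassign callback (an assumption; the real reassignment
     * path is not shown here). The caller shuts the returned scheduler down.
     */
    static ScheduledExecutorService start(Supplier<List<RegionInTransition>> source,
                                          Consumer<String> reassign,
                                          Duration threshold, Duration period) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                for (String name : findStuckRegions(source.get(), Instant.now(), threshold)) {
                    reassign.accept(name);
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so swallow and log.
                System.err.println("stuck-region scan failed: " + e);
            }
        }, period.toMillis(), period.toMillis(), TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

With a threshold of Duration.ofMinutes(10), a region that has been in 
FAILED_OPEN for 15 minutes would be picked up on the next scan, while one 
that failed 5 minutes ago would be left alone until a later run.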

> Regions stuck in FAILED_OPEN when HDFS blocks are missing
> ---------------------------------------------------------
>
>                 Key: HBASE-17704
>                 URL: https://issues.apache.org/jira/browse/HBASE-17704
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 1.1.8
>            Reporter: Mathias Herberts
>
> We recently experienced the loss of a whole rack (6 DNs + RS) in a 120-node 
> cluster. This led to the regions hosted on the 6 unavailable RSs being 
> reassigned to live RSs. When attempting to open some of the 
> reassigned regions, some RS encountered missing blocks and issued "No live 
> nodes contain current block Block locations" putting the regions in state 
> FAILED_OPEN.
> Once the disappeared DNs went back online, the regions were left in 
> FAILED_OPEN, needing a restart of all the affected RSs to solve the problem.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
