nkeywal created HBASE-6290:
------------------------------

             Summary: Add a function to mark a server as dead and start the 
recovery process
                 Key: HBASE-6290
                 URL: https://issues.apache.org/jira/browse/HBASE-6290
             Project: HBase
          Issue Type: Improvement
          Components: monitoring
    Affects Versions: 0.96.0
            Reporter: nkeywal
            Assignee: nkeywal
            Priority: Minor


ZooKeeper is used as a monitoring tool: we use znodes, and we start the recovery 
process when a znode is deleted by ZK because of a timeout. This timeout 
defaults to 90 seconds and is often set to 30s.

However, some HW issues could be detected by specialized HW monitoring tools 
before the ZK timeout. For this reason, it makes sense to offer a very simple 
function to mark a RS as dead. This should not take in


It could be an HBase shell function such as:
considerAsDead ipAddress|serverName

This would delete all the znodes of the server running on this box, starting 
the recovery process.
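
As a rough illustration of the selection step only, the sketch below picks the 
znodes that such a function might delete. It assumes region server znodes are 
children of /hbase/rs and are named host,port,startcode; the helper names 
(parse_server_name, znodes_to_delete) and the sample hosts are hypothetical, 
and the actual deletion through a ZooKeeper client is not shown.

```python
def parse_server_name(server_name):
    """Split 'host,port,startcode' into (host, port, startcode)."""
    host, port, startcode = server_name.rsplit(",", 2)
    return host, int(port), int(startcode)


def znodes_to_delete(children, target, rs_parent="/hbase/rs"):
    """Return the znode paths a hypothetical considerAsDead would remove.

    `target` is either a full server name (exact match on one znode) or a
    bare host name (matches every server running on that box).
    The parent path is an assumption, not taken from the issue.
    """
    selected = []
    for child in children:
        host, _port, _startcode = parse_server_name(child)
        if child == target or host.lower() == target.lower():
            selected.append(f"{rs_parent}/{child}")
    return sorted(selected)


# Hypothetical children of the region server parent znode.
children = [
    "rs1.example.com,60020,1340000000000",
    "rs1.example.com,60021,1340000000500",
    "rs2.example.com,60020,1340000001000",
]

for path in znodes_to_delete(children, "rs1.example.com"):
    print(path)
```

Passing a bare host selects every server on that box, while passing a full 
server name selects a single znode, which mirrors the ipAddress|serverName 
argument proposed above.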


Such a function would be easily callable (at the caller's risk) by any fault 
detection tool. However, we could have issues identifying the right master & 
region servers given IPv4 vs. IPv6 and multi-networked boxes.
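
One possible way to reduce the addressing ambiguity is to normalize addresses 
before comparing them. The sketch below is only an assumption about how that 
could look: the inventory mapping, the host names and the helper names 
(normalize, find_boxes) are hypothetical, and it uses Python's standard 
ipaddress module rather than anything from HBase.

```python
import ipaddress


def normalize(addr):
    """Parse an address; unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) to IPv4."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip


def find_boxes(target, inventory):
    """Return the names of boxes in `inventory` that own the `target` address.

    `inventory` maps a box name to every address known for it, so a
    multi-networked box is found whichever interface the caller reports.
    """
    wanted = normalize(target)
    return sorted(
        name
        for name, addrs in inventory.items()
        if any(normalize(a) == wanted for a in addrs)
    )


# Hypothetical inventory of boxes with several interfaces each.
inventory = {
    "rs1.example.com": ["10.0.0.11", "192.168.1.11", "2001:db8::11"],
    "rs2.example.com": ["10.0.0.12", "2001:db8::12"],
}

print(find_boxes("::ffff:10.0.0.11", inventory))
print(find_boxes("2001:DB8:0:0:0:0:0:12", inventory))
```

If more than one box, or none, matches the reported address, a cautious caller 
would probably refuse to mark anything as dead.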


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
