[ https://issues.apache.org/jira/browse/HBASE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215785#comment-13215785 ]
Jesse Yates commented on HBASE-5075:
------------------------------------
I haven't had a chance to look at the latest patch yet, but I have read through
the docs. I have the same concern as Lars, namely:
bq. a bit worried about maintaining an additional process on every machine
What about doing something a bit simpler, like adding a runtime shutdown hook to
the RS so that the region server updates ZK or the master when it decides
to bail out? Even something as simple as removing your own znode on
failure would be sufficient to cover this use case, correct?
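
To make the suggestion concrete, here is a minimal sketch of the shutdown-hook
idea. It is not taken from the patch or from HBase code: ZnodeStore,
InMemoryZnodeStore, buildZnodeCleanupHook, and the znode path are all
hypothetical names, and the in-memory store only stands in for a real ZooKeeper
client. The only real API used is Runtime.addShutdownHook. Note that a JVM
shutdown hook does not run if the process is killed with SIGKILL or the JVM
crashes hard, so this would only cover the cases where the RS itself decides to
exit.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ShutdownHookSketch {

    /** Hypothetical stand-in for a ZooKeeper client; not a real HBase/ZK API. */
    interface ZnodeStore {
        void create(String path);
        void delete(String path);
        boolean exists(String path);
    }

    /** In-memory implementation so the sketch runs without a ZK ensemble. */
    static class InMemoryZnodeStore implements ZnodeStore {
        private final Set<String> paths = ConcurrentHashMap.newKeySet();

        @Override public void create(String path) { paths.add(path); }
        @Override public void delete(String path) { paths.remove(path); }
        @Override public boolean exists(String path) { return paths.contains(path); }
    }

    /**
     * Builds (but does not register) a thread that removes the region
     * server's own znode. Returning the thread lets the caller register it
     * as a shutdown hook or run it directly.
     */
    static Thread buildZnodeCleanupHook(ZnodeStore store, String znodePath) {
        return new Thread(() -> {
            try {
                store.delete(znodePath);
            } catch (RuntimeException e) {
                // Best effort only: if the delete fails, the master still
                // has to rely on the normal session timeout.
                System.err.println("Could not remove " + znodePath + ": " + e);
            }
        }, "rs-znode-cleanup");
    }

    public static void main(String[] args) {
        ZnodeStore store = new InMemoryZnodeStore();
        String znodePath = "/hbase/rs/example-host"; // assumed path, for illustration

        store.create(znodePath);

        // The hook runs when the JVM shuts down in an orderly way
        // (System.exit, SIGTERM, last non-daemon thread ending).
        Runtime.getRuntime().addShutdownHook(buildZnodeCleanupHook(store, znodePath));

        System.out.println("znode registered: " + store.exists(znodePath));
    }
}
```

In a real region server the store would be backed by the existing ZooKeeper
connection, and the hook would need to be registered early enough that it is in
place before any abort path can be taken.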
> regionserver crashed and failover
> ---------------------------------
>
> Key: HBASE-5075
> URL: https://issues.apache.org/jira/browse/HBASE-5075
> Project: HBase
> Issue Type: Improvement
> Components: monitoring, regionserver, replication, zookeeper
> Affects Versions: 0.92.1
> Reporter: zhiyuan.dai
> Fix For: 0.90.5
>
> Attachments: Degion of Failure Detection.pdf, HBase-5075-shell.patch,
> HBase-5075-src.patch
>
>
> When a regionserver crashes, it takes too long to notify the HMaster. Once the
> HMaster knows about the regionserver's shutdown, it takes a long time to fetch
> the hlog's lease.
> HBase is an online DB, so availability is very important.
> I have an idea to improve availability: a monitor node checks the regionserver's
> pid. If this pid does not exist, I treat the RS as down, delete the znode, and
> force close the hlog file.
> So the period may be 100ms.