[
https://issues.apache.org/jira/browse/HBASE-22410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sergey Shelukhin updated HBASE-22410:
-------------------------------------
Status: Patch Available (was: Open)
[[email protected]] [~busbey] I cloned HBASE-22107 to add a better
metric for the compute/etc. scenarios. Cleaning the dead region server list
after a timeout still wouldn't provide a reliable number, although it could
still be done for maintainability in the original JIRA.
> add the notion of the expected # of servers for non-fixed server sets; report
> an alternative dead server metric
> ---------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-22410
> URL: https://issues.apache.org/jira/browse/HBASE-22410
> Project: HBase
> Issue Type: Improvement
> Components: Operability
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Priority: Major
>
> Dead servers appear to be cleaned up only when a server comes up on the same
> host and port; however, if HBase is running on something like YARN with many
> more hosts than RSes, an RS may come up on a different host and the dead one
> will never be cleaned.
> The metric should be improved to account for that. It will potentially
> require configuring the master with the expected number of region servers, so
> that the metric can be output based on that.
> The dead server list should also be expired based on timestamp in such cases.
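
A minimal sketch of the two ideas in the description, assuming a configured
expected server count and a timestamp-based expiry. The class and method names
(DeadServerSketch, missingServers, expireOld) are hypothetical and are not
HBase's actual API; the caller is assumed to supply the live server count and
the current time.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not taken from the HBase code base. */
final class DeadServerSketch {
    // Assumed to come from master configuration (hypothetical setting).
    private final int expectedServers;
    // Age after which a dead-server entry is dropped.
    private final long expiryMillis;
    // "host:port" of a dead server -> time it was marked dead.
    private final Map<String, Long> deadSince = new ConcurrentHashMap<>();

    DeadServerSketch(int expectedServers, long expiryMillis) {
        this.expectedServers = expectedServers;
        this.expiryMillis = expiryMillis;
    }

    void markDead(String hostAndPort, long nowMillis) {
        deadSince.put(hostAndPort, nowMillis);
    }

    /** Existing behavior as described: a server returning on the same host and port clears its entry. */
    void markAlive(String hostAndPort) {
        deadSince.remove(hostAndPort);
    }

    /** Alternative metric: how many servers are missing relative to the expected count. */
    int missingServers(int liveServers) {
        return Math.max(0, expectedServers - liveServers);
    }

    /** Drops entries at least expiryMillis old; returns the number removed. */
    int expireOld(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = deadSince.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() >= expiryMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int deadListSize() {
        return deadSince.size();
    }
}
```

In this sketch the missing-server count does not depend on the dead list at
all, so it stays meaningful even when a replacement RS starts on a different
host; the expiry only keeps the list from growing without bound.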
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)