[jira] [Resolved] (ACCUMULO-2976) blacklist problematic tservers

Christopher Tubbs (Jira) Wed, 09 Jun 2021 13:39:11 -0700


     [ https://issues.apache.org/jira/browse/ACCUMULO-2976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Christopher Tubbs resolved ACCUMULO-2976.
-----------------------------------------
    Resolution: Not A Problem

Closing this stale issue. If this is still a problem, please create a new issue 
or PR at https://github.com/apache/accumulo

> blacklist problematic tservers
> ------------------------------
>
>                 Key: ACCUMULO-2976
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2976
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master
>            Reporter: Sean Busbey
>            Priority: Minor
>
> It would be nice if the master kept track of tservers that misbehave and 
> eventually blacklisted them, similar to how HDFS handles datanodes and 
> MapReduce/YARN handle trackers.
> Right now the closest we come is having the master kill the zoolock for 
> tservers that are behaving poorly. This causes them to exit if they're not in 
> a zombie state.
> On deployments with a watchdog that relaunches failed processes, this doesn't 
> help much because the tserver comes back. In the case of, e.g., flaky network 
> failures for the node, this just means repeating the process and impacting 
> cluster performance while the master works out that it should kill the node 
> again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
