[ https://issues.apache.org/jira/browse/ACCUMULO-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Keith Turner updated ACCUMULO-1454:
-----------------------------------
    Attachment: ACCUMULO-1454-proposal-01.html
                ACCUMULO-1454-proposal-01.adoc

Attached a design doc based on discussion on issue so far. I can post design doc on RB if anyone has feedback.

> Need good way to perform a rolling restart of all tablet servers
> ----------------------------------------------------------------
>
> Key: ACCUMULO-1454
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1454
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Affects Versions: 1.4.3, 1.5.0
> Reporter: Mike Drob
> Attachments: ACCUMULO-1454-proposal-01.adoc, ACCUMULO-1454-proposal-01.html
>
>
> When needing to change a tserver parameter (e.g. java heap space) across the
> entire cluster, there is not a graceful way to perform a rolling restart.
> The naive approach of just killing tservers one at a time causes a lot of
> churn on the cluster as tablets move around and zookeeper tries to maintain
> current state.
> Potential solutions might be via a fancy fate operation, with coordination by
> the master. Ideally, the master would know which servers are 'safe' to
> restart and could minimize overall impact during the operation.

--
This message was sent by Atlassian JIRA (v6.2#6252)