[
https://issues.apache.org/jira/browse/KUDU-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764448#comment-17764448
]
Alexey Serbin commented on KUDU-2912:
-------------------------------------
Restarting all masters in a Kudu cluster is no longer the preferred way of
achieving the desired result now that the {{kudu tserver unregister}} CLI tool
is available (see KUDU-2915 for details).
So, as of the Kudu 1.16.0 release, the recommended workflow is to run the {{kudu
tserver unregister}} CLI tool to remove/unregister decommissioned and "dead"
tablet servers. It has been documented in [changelist
8cb4b6385|https://github.com/apache/kudu/commit/8cb4b6385f680e65be9702e30d7a709063999d81].
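For clusters with several dead tablet servers, a small helper can generate one unregister command per UUID. The following is only a rough sketch, not part of the documented workflow: it assumes the tool is invoked as {{kudu tserver unregister <master_addresses> <tserver_uuid>}} (check {{kudu tserver unregister --help}} on your version), and the function name, master addresses, and UUIDs are hypothetical placeholders. It prints the commands for review instead of running them.

```python
import shlex


def build_unregister_commands(master_addresses, dead_tserver_uuids):
    """Return one argv list per dead tablet server.

    Assumption: the CLI takes a comma-separated master address list
    followed by a single tablet server UUID as positional arguments.
    """
    masters = ",".join(master_addresses)
    return [
        ["kudu", "tserver", "unregister", masters, uuid]
        for uuid in dead_tserver_uuids
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    masters = ["master-1:7051", "master-2:7051", "master-3:7051"]
    dead_uuids = [
        "00000000000000000000000000000001",
        "00000000000000000000000000000002",
    ]
    # Print the commands so an operator can review them before running.
    for argv in build_unregister_commands(masters, dead_uuids):
        print(shlex.join(argv))
```

Printing rather than executing is deliberate here: unregistering a tablet server changes cluster state, so an operator would presumably want to check each UUID against the ksck output first.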
> Document zero-downtime workflow for 'forgetting' dead tservers
> --------------------------------------------------------------
>
> Key: KUDU-2912
> URL: https://issues.apache.org/jira/browse/KUDU-2912
> Project: Kudu
> Issue Type: Bug
> Components: documentation
> Affects Versions: 1.11.0
> Reporter: Adar Dembo
> Priority: Major
>
> This is a fairly useful workflow when the goal is to rebalance the cluster.
> All it takes is one dead tserver (supposing it's decommissioned and long
> gone) for rebalancing to refuse to run. As of 1.10.0 there's a CLI parameter
> that instructs the rebalancer to ignore certain tservers, but it's annoying
> to put together a UUID list when multiple tservers are dead.
> Anyway, the zero-downtime workflow is:
> # Restart all of the masters in the cluster one by one.
> # After each restart, wait for the restarted master to load its tablet and
> join consensus (ksck should be able to indicate when this was achieved).