[
https://issues.apache.org/jira/browse/KUDU-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17449978#comment-17449978
]
YifanZhang commented on KUDU-2915:
----------------------------------
I think it's good that we could introduce a tool to unregister a dead tablet
server from the master's in-memory state.
On the other hand, I would also like to know whether it is safe and reasonable
to have the master proactively forget a tablet server that has been in the
'dead' state for 'a long time' and has no replicas running on it. If the same
tablet server comes back, the master would re-register it in its in-memory
state. Are there any problems with this approach?
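
To make the question concrete, here is a rough sketch of the policy I have in
mind. It is a toy model in Python, not Kudu code: the class names, the
forget_after_seconds threshold, and the replica_count field are all
hypothetical and only stand in for whatever the master actually tracks.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TServerEntry:
    """Hypothetical in-memory record the master keeps per tablet server."""
    uuid: str
    last_heartbeat: float   # seconds, on whatever clock the caller uses
    replica_count: int = 0  # replicas the master believes live on this server


@dataclass
class MasterRegistry:
    """Toy stand-in for the master's in-memory tablet server state."""
    dead_after_seconds: float     # assumed: no heartbeat for this long => 'dead'
    forget_after_seconds: float   # assumed: 'a long time' in the dead state
    entries: Dict[str, TServerEntry] = field(default_factory=dict)

    def heartbeat(self, uuid: str, now: float, replica_count: int = 0) -> None:
        # A server that was forgotten and comes back is simply registered again.
        self.entries[uuid] = TServerEntry(uuid, now, replica_count)

    def is_dead(self, uuid: str, now: float) -> bool:
        return now - self.entries[uuid].last_heartbeat > self.dead_after_seconds

    def forget_stale(self, now: float) -> List[str]:
        """Drop servers that are long dead and host no replicas."""
        stale = [
            uuid
            for uuid, entry in self.entries.items()
            if now - entry.last_heartbeat > self.forget_after_seconds
            and entry.replica_count == 0
        ]
        for uuid in stale:
            del self.entries[uuid]
        return stale
```

In this model a server that still has replicas assigned is never forgotten,
however long it has been dead, and a forgotten server reappears on its next
heartbeat. Whether the real master can handle both cases this simply is
exactly what I am asking about.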
> Support to delete dead tservers from CLI
> ----------------------------------------
>
> Key: KUDU-2915
> URL: https://issues.apache.org/jira/browse/KUDU-2915
> Project: Kudu
> Issue Type: Improvement
> Components: CLI, ops-tooling
> Affects Versions: 1.10.0
> Reporter: Hexin
> Assignee: Hexin
> Priority: Major
> Labels: supportability
>
> Sometimes nodes in the cluster crash due to machine problems such as
> disk corruption, which can be very common. However, if there are any dead
> tservers, the ksck result will always show an error (e.g. Not all Tablet
> Servers are reachable) even though all tables have recovered and are healthy.
> The only way now to get a healthy ksck status is to restart all masters
> one by one. In some cases, for example when a machine is completely
> corrupted, we would like to get a healthy ksck status without restarting,
> since after the masters restart the cluster takes some time to recover,
> during which scans of and upserts to tables are affected. The recovery
> time can be long and depends mainly on the scale of the cluster. This problem
> can be serious and annoying, especially when tservers crash frequently
> in a large cluster.
> It would be valuable to have an easier way to delete dead tservers from the
> master. I will add a kudu command to do this.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)