[ https://issues.apache.org/jira/browse/KUDU-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
YifanZhang resolved KUDU-3341.
------------------------------
Fix Version/s: 1.16.0
Resolution: Fixed
> Catalog Manager should stop retrying DeleteTablet when receive
> WRONG_SERVER_UUID error
> --------------------------------------------------------------------------------------
>
> Key: KUDU-3341
> URL: https://issues.apache.org/jira/browse/KUDU-3341
> Project: Kudu
> Issue Type: Improvement
> Components: master
> Reporter: YifanZhang
> Assignee: YifanZhang
> Priority: Minor
> Fix For: 1.16.0
>
>
> Sometimes a tablet server is shut down because of detected disk failures,
> and the server is later re-added to the cluster with all of its data
> cleared.
> Its replicas are re-replicated after
> {{--follower_unavailable_considered_failed_sec}} seconds. The master then
> sends DeleteTablet RPCs to this tserver, but receives either an RPC
> failure (the tserver was shut down) or a WRONG_SERVER_UUID error (the
> tserver started with a new UUID), and keeps retrying the deletion until
> {{--unresponsive_ts_rpc_timeout_ms}} (default 1 hour) elapses.
> Retrying is unnecessary when the master receives a WRONG_SERVER_UUID error,
> because the server UUID can only be corrected by restarting the tablet
> server. At that point full tablet reports are sent to the master, and any
> outdated replicas can finally be deleted.
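
The retry decision described above can be sketched roughly as follows. This is an illustrative model only, not Kudu's catalog manager code: the `DeleteTabletOutcome` enum and the `should_retry` function are hypothetical names introduced here, and only the WRONG_SERVER_UUID error and the RPC-failure case come from the description.

```python
from enum import Enum, auto


class DeleteTabletOutcome(Enum):
    """Hypothetical outcomes of one DeleteTablet RPC (illustration only)."""
    OK = auto()
    RPC_FAILURE = auto()        # e.g. the tablet server is shut down
    WRONG_SERVER_UUID = auto()  # the tablet server came back with a new UUID


def should_retry(outcome: DeleteTabletOutcome) -> bool:
    """Return True if another DeleteTablet attempt is worth scheduling."""
    if outcome is DeleteTabletOutcome.OK:
        # The tablet was deleted; nothing left to do.
        return False
    if outcome is DeleteTabletOutcome.WRONG_SERVER_UUID:
        # A retry cannot succeed: the UUID only changes when the tablet
        # server restarts, and a restart already triggers a full tablet
        # report from which outdated replicas can be cleaned up.
        return False
    # Plain RPC failures may be transient, so they stay retriable.
    return True
```

Under this sketch, a plain RPC failure is still retried, while a WRONG_SERVER_UUID response ends the retry loop immediately instead of waiting for the timeout.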
--
This message was sent by Atlassian Jira
(v8.20.1#820001)