[ https://issues.apache.org/jira/browse/KUDU-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adar Dembo updated KUDU-1407:
-----------------------------
Summary: Leader should evict a failed follower stuck in the
TABLET_NOT_RUNNING state (was: Leader should not evict a follower when the
follower is in the process of starting up a tablet)
> Leader should evict a failed follower stuck in the TABLET_NOT_RUNNING state
> ---------------------------------------------------------------------------
>
> Key: KUDU-1407
> URL: https://issues.apache.org/jira/browse/KUDU-1407
> Project: Kudu
> Issue Type: Bug
> Components: consensus
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Assignee: Andrew Wong
> Priority: Critical
>
> It seems that if the leader gets an error from one of its followers because
> the tablet is not running, it considers that replica to be 'unresponsive'. If
> this persists for 5 minutes, the leader evicts the follower so that a new
> replica can be created.
> This can cause problems at cluster startup time when there is a lot of data
> and a cold disk cache: the startup bootstrap process might take more than 5
> minutes, and leaders might end up evicting followers that are perfectly
> healthy (just in the process of coming up).
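
For illustration only, the behavior described above and one possible
state-aware alternative can be sketched as follows. This is a minimal sketch
and not Kudu's code: FollowerStatus, FollowerHealth, both functions, and the
EVICTION_TIMEOUT_SECS constant are hypothetical names, and the split between
"bootstrapping" and "failed" is an assumption drawn from the old and new
summaries of this issue.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical constant: the 5-minute window mentioned in the description.
EVICTION_TIMEOUT_SECS = 5 * 60


class FollowerStatus(Enum):
    """Hypothetical follower states as seen by the leader."""
    RESPONDING = "responding"        # follower answers requests normally
    UNREACHABLE = "unreachable"      # no response at all
    BOOTSTRAPPING = "bootstrapping"  # tablet not running, still starting up
    FAILED = "failed"                # tablet not running, will not recover


@dataclass
class FollowerHealth:
    status: FollowerStatus
    secs_since_last_ok: float  # time since the last successful exchange


def naive_should_evict(health: FollowerHealth) -> bool:
    """Behavior as reported: any error counts as 'unresponsive', so only
    the elapsed time decides whether the follower is evicted."""
    return health.secs_since_last_ok >= EVICTION_TIMEOUT_SECS


def state_aware_should_evict(health: FollowerHealth) -> bool:
    """Assumed alternative: look at why the follower is not serving."""
    if health.status is FollowerStatus.FAILED:
        return True   # a failed replica cannot recover by waiting
    if health.status is FollowerStatus.BOOTSTRAPPING:
        return False  # healthy but slow to start; keep waiting
    return health.secs_since_last_ok >= EVICTION_TIMEOUT_SECS


# A follower that needs 6 minutes to bootstrap on a cold disk cache.
slow_start = FollowerHealth(FollowerStatus.BOOTSTRAPPING, 6 * 60)
# A follower whose tablet failed 10 seconds ago.
failed = FollowerHealth(FollowerStatus.FAILED, 10)
```

With these inputs, the time-only check evicts slow_start even though it is
healthy, and keeps failed for the full window; the state-aware check does the
opposite in both cases.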