[ https://issues.apache.org/jira/browse/KAFKA-15649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin McCabe updated KAFKA-15649:
---------------------------------
        Parent:     (was: KAFKA-14127)
    Issue Type: Bug  (was: Sub-task)

> Handle directory failure timeout
> ---------------------------------
>
>                 Key: KAFKA-15649
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15649
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Igor Soarez
>            Priority: Minor
>
> If a broker with an offline log directory continues to fail to notify the
> controller of either:
>  * the fact that the directory is offline; or
>  * any replica assignment into a failed directory
> then the controller will not check whether a leadership change is required,
> and this may leave partitions offline indefinitely.
>
> KIP-858 proposes that the broker should shut down after a configurable
> timeout to force a leadership change. Alternatively, the broker could
> request to be fenced, as long as there's a path for it to later become
> unfenced.
>
> While this unavailability is possible in theory, in practice it's not easy
> to envision a scenario where a broker continues to appear healthy to the
> controller but fails to send this information. So it's not clear whether
> this is a real problem.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
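
For illustration only, the shutdown-after-timeout idea from the description could be sketched roughly as below. This is not Kafka's actual implementation: the class DirFailureWatchdog, its methods, the Action enum, and the 30-second value used in the usage note are all hypothetical names and numbers chosen for the example. The sketch covers only the decision of when to give up; it assumes the caller supplies the clock and performs the actual shutdown (or a fencing request, which is not modeled here).

```java
import java.util.OptionalLong;

/**
 * Hypothetical sketch: tracks how long a directory failure has gone
 * unreported to the controller and decides when the broker should give up.
 * Not taken from the Kafka code base.
 */
final class DirFailureWatchdog {
    enum Action { NONE, WAIT, SHUT_DOWN }

    private final long timeoutMs;
    // Time at which the oldest still-unacknowledged failure was observed.
    private OptionalLong pendingSinceMs = OptionalLong.empty();

    DirFailureWatchdog(long timeoutMs) {
        if (timeoutMs <= 0) {
            throw new IllegalArgumentException("timeoutMs must be positive");
        }
        this.timeoutMs = timeoutMs;
    }

    /** A log directory went offline and the controller has not been told yet. */
    void onDirectoryOffline(long nowMs) {
        // Keep the earliest timestamp so later failures do not reset the clock.
        if (!pendingSinceMs.isPresent()) {
            pendingSinceMs = OptionalLong.of(nowMs);
        }
    }

    /** The controller has acknowledged the failure notification. */
    void onControllerAcknowledged() {
        pendingSinceMs = OptionalLong.empty();
    }

    /** Decide what to do at time nowMs; the caller acts on the result. */
    Action check(long nowMs) {
        if (!pendingSinceMs.isPresent()) {
            return Action.NONE;
        }
        long waitedMs = nowMs - pendingSinceMs.getAsLong();
        return waitedMs >= timeoutMs ? Action.SHUT_DOWN : Action.WAIT;
    }
}
```

A caller would invoke check() periodically, for example with new DirFailureWatchdog(30_000L), and begin a shutdown once it returns SHUT_DOWN, which forces the controller to reconsider leadership for the affected partitions.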