[ 
https://issues.apache.org/jira/browse/KAFKA-16297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Soarez updated KAFKA-16297:
--------------------------------
    Description: 
KIP-858 proposed that when a directory failure occurs after changing the 
assignment of a replica that's moved between two directories in the same 
broker, but before the future replica promotion completes, the broker should 
reassign the replica to inform the controller of its correct status. But this 
hasn't yet been implemented, and without it this failure may lead to indefinite 
partition unavailability.

Example scenario:
 # A broker which leads partition P receives a request to alter the replica 
from directory A to directory B.
 # The broker creates a future replica in directory B and starts a replica 
fetcher.
 # Once the future replica first catches up, the broker queues a reassignment 
to inform the controller of the directory change.
 # The next time the replica catches up, the broker briefly blocks appends and 
promotes the replica. However, before the promotion is attempted, directory A 
fails.
 # The controller was informed that P is now in directory B before it received 
the notification that directory A has failed, so it does not elect a new 
leader, and as long as the broker is online, partition P remains unavailable.
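
The ordering problem in the scenario above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class, method and constant names below are hypothetical and are not Kafka APIs, and the controller is reduced to a map from partition to directory. It shows why the controller ignores the failure of directory A once it has been told that P lives in B, and how a corrective reassignment to a "lost" directory marker, as KIP-858 proposes, would let it react.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the race; every name here is hypothetical, not a Kafka API. */
final class FutureReplicaRaceSketch {
    // Placeholder for the "lost" directory ID that KIP-858 refers to.
    static final String LOST = "LOST";

    /** The controller's view: partition name -> directory it believes hosts the replica. */
    static final class ControllerView {
        private final Map<String, String> assignment = new HashMap<>();

        void assign(String partition, String directory) {
            assignment.put(partition, directory);
        }

        /** A new leader is needed only if the replica is believed lost or in the failed directory. */
        boolean needsNewLeader(String partition, String failedDirectory) {
            String dir = assignment.get(partition);
            return LOST.equals(dir) || failedDirectory.equals(dir);
        }
    }

    /**
     * Replays the scenario and reports whether the controller would elect a new leader.
     * brokerCorrectsAssignment models the (assumed) fix: re-reporting P as lost.
     */
    static boolean controllerElectsNewLeader(boolean brokerCorrectsAssignment) {
        ControllerView controller = new ControllerView();
        controller.assign("P", "A");   // step 1: P is led from directory A
        controller.assign("P", "B");   // step 3: queued reassignment reaches the controller
        boolean promoted = false;      // step 4: directory A fails before promotion happens
        if (brokerCorrectsAssignment && !promoted) {
            controller.assign("P", LOST); // corrective reassignment to the lost marker
        }
        // step 5: the failure notification for directory A arrives last
        return controller.needsNewLeader("P", "A");
    }

    public static void main(String[] args) {
        System.out.println(controllerElectsNewLeader(false)); // prints "false": P stays unavailable
        System.out.println(controllerElectsNewLeader(true));  // prints "true": a new leader can be elected
    }
}
```

In the model, the only difference between the two runs is whether the broker re-reports the assignment after the failure; the real broker and controller logic is considerably more involved.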

 

 

  was:
KIP-858 proposed that when a directory failure occurs after changing the 
assignment of a replica that's moved between two directories in the same 
broker, but before the future replica promotion completes, the broker should 
reassign the replica to inform the controller of its correct status. But this 
hasn't yet been implemented, and without it this failure may lead to indefinite 
partition unavailability.

Example scenario:
 # A broker which leads partition P receives a request to alter the replica 
from directory A to directory B.
 # The broker creates a future replica in directory B and starts a replica 
fetcher.
 # Once the future replica first catches up, the broker queues a reassignment 
to inform the controller of the directory change.
 # The next time the replica catches up, the broker briefly blocks appends and 
promotes the replica. However, before the promotion is attempted, directory A 
fails.
 # The controller was informed that P is now in directory B before it received 
the notification that directory A has failed, so it does not elect a new 
leader, and as long as the broker is online, partition P remains unavailable.

As per KIP-858, the broker should detect this scenario and queue a reassignment 
of P into directory ID {{DirectoryId.LOST}}.

 


> Race condition while promoting future replica can lead to partition 
> unavailability.
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-16297
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16297
>             Project: Kafka
>          Issue Type: Sub-task
>            Reporter: Igor Soarez
>            Assignee: Igor Soarez
>            Priority: Major
>
> KIP-858 proposed that when a directory failure occurs after changing the 
> assignment of a replica that's moved between two directories in the same 
> broker, but before the future replica promotion completes, the broker should 
> reassign the replica to inform the controller of its correct status. But this 
> hasn't yet been implemented, and without it this failure may lead to 
> indefinite partition unavailability.
> Example scenario:
>  # A broker which leads partition P receives a request to alter the replica 
> from directory A to directory B.
>  # The broker creates a future replica in directory B and starts a replica 
> fetcher.
>  # Once the future replica first catches up, the broker queues a reassignment 
> to inform the controller of the directory change.
>  # The next time the replica catches up, the broker briefly blocks appends 
> and promotes the replica. However, before the promotion is attempted, 
> directory A fails.
>  # The controller was informed that P is now in directory B before it 
> received the notification that directory A has failed, so it does not elect a 
> new leader, and as long as the broker is online, partition P remains 
> unavailable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
