[ 
https://issues.apache.org/jira/browse/KAFKA-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15694773#comment-15694773
 ] 

ASF GitHub Bot commented on KAFKA-4442:
---------------------------------------

GitHub user lindong28 opened a pull request:

    https://github.com/apache/kafka/pull/2167

    KAFKA-4442; Controller should grab lock when it is being initialized to avoid race condition

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lindong28/kafka KAFKA-4442

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2167.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2167
    
----
commit 16825e60963844ab0729bf290cfc9e6cee79932f
Author: Dong Lin <lindon...@gmail.com>
Date:   2016-11-25T04:07:09Z

    KAFKA-4442; Controller should grab lock when it is being initialized to avoid race condition

----


> Controller should grab lock when it is being initialized to avoid race 
> condition
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-4442
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4442
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>
> Currently the controller registers the broker change listener before sending 
> LeaderAndIsrRequests to live replicas. The call path looks like this:
> - onControllerFailover()
>   - partitionStateMachine.startup()
>     - triggerOnlinePartitionStateChange()
>       - handleStateChange(partition, OnlinePartition)
>         - electLeaderForPartition(partition)
>           - determines live replicas for this partition (step a)
>           - add partition to controllerContext.partitionLeadershipInfo (step b)
>           - send LeaderAndIsrRequest to those live replicas for this partition
> However, if a broker registers itself in ZooKeeper between step (a) and 
> step (b), onBrokerStartup() will not send a LeaderAndIsrRequest to this 
> broker for this partition, because the partition is not yet in 
> controllerContext.partitionLeadershipInfo. onControllerFailover() will 
> not send a LeaderAndIsrRequest to this broker for this partition either, 
> because the broker was not considered live in step (a).
> The root cause is that onBrokerStartup() should only be executed after 
> controller has finished onControllerFailover() and initialized its state. 
> Therefore controller should grab the lock controllerContext.controllerLock 
> during onControllerFailover().
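
The proposed fix can be sketched as follows. This is a minimal illustration in Java (the Kafka controller itself is written in Scala), not the actual patch. The class ControllerLockSketch, its fields, and the string-based request log are hypothetical stand-ins; only the names onControllerFailover, onBrokerStartup, controllerLock and partitionLeadershipInfo are taken from the description above. The sketch forces a broker to register between step (a) and step (b) and shows that, with the lock held during failover, the broker callback runs only after initialization has finished.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the controller; not Kafka's actual classes.
class ControllerLockSketch {
    private final ReentrantLock controllerLock = new ReentrantLock();
    private final Set<Integer> liveBrokers = new HashSet<>();
    private final Set<String> partitionLeadershipInfo = new HashSet<>();
    private final List<String> sentRequests = new ArrayList<>();

    ControllerLockSketch() {
        liveBrokers.add(1); // assume broker 1 is already live before failover
    }

    // Broker-change callback: touches controller state only under the lock.
    void onBrokerStartup(int brokerId) {
        controllerLock.lock();
        try {
            liveBrokers.add(brokerId);
            for (String partition : partitionLeadershipInfo) {
                sentRequests.add("LeaderAndIsr(" + partition + ")->" + brokerId);
            }
        } finally {
            controllerLock.unlock();
        }
    }

    // Failover holds the lock across steps (a) and (b). lateBrokerId simulates
    // a broker that registers in ZooKeeper between the two steps.
    void onControllerFailover(String partition, int lateBrokerId) {
        Thread brokerChange = new Thread(() -> onBrokerStartup(lateBrokerId));
        controllerLock.lock();
        try {
            List<Integer> live = new ArrayList<>(liveBrokers); // step (a)
            brokerChange.start();                               // broker registers now
            while (!controllerLock.hasQueuedThreads()) {
                Thread.yield(); // wait until the callback is blocked on the lock
            }
            partitionLeadershipInfo.add(partition);             // step (b)
            for (int brokerId : live) {
                sentRequests.add("LeaderAndIsr(" + partition + ")->" + brokerId);
            }
        } finally {
            controllerLock.unlock();
        }
        try {
            brokerChange.join(); // the callback runs once the lock is released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    List<String> sentRequests() {
        return new ArrayList<>(sentRequests);
    }

    public static void main(String[] args) {
        ControllerLockSketch controller = new ControllerLockSketch();
        controller.onControllerFailover("topic-0", 2);
        System.out.println(controller.sentRequests());
    }
}
```

Because the callback waits for the lock, it sees the partition in partitionLeadershipInfo and sends the request to the late broker. Without the lock in onControllerFailover(), the same interleaving would leave broker 2 without a request for topic-0.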



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
