[ 
https://issues.apache.org/jira/browse/KAFKA-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753257#comment-13753257
 ] 

Swapnil Ghike commented on KAFKA-1032:
--------------------------------------

The problem is that the leader that went into a long GC pause did not receive 
the become-follower request from the controller soon enough, so it kept acting 
as a leader for some time after the GC and appended new messages. These 
messages were lost when the affected broker became a follower.

Another approach to fixing this could be to change 
OfflinePartitionLeaderSelector to send LeaderAndIsrRequest to dead brokers. 
This would ensure that the old leader (if still alive) stops acting like a 
leader much sooner. 
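
To make the idea concrete, here is a rough sketch as a toy model in Python. 
It is not Kafka code: Broker, Controller, on_broker_failure, 
handle_leader_and_isr and the notify_dead_brokers flag are all made-up names 
for illustration, and the model assumes the request reaches the old leader 
as soon as it is sent, which a real GC-paused broker would only process 
after it resumes.

```python
class Broker:
    """Toy broker that tracks leadership of a single partition."""

    def __init__(self, broker_id, is_leader=False):
        self.broker_id = broker_id
        self.is_leader = is_leader
        self.log = []

    def handle_leader_and_isr(self, leader_id):
        # Hypothetical stand-in for processing a LeaderAndIsrRequest:
        # the broker is leader only if the request names it as leader.
        self.is_leader = (leader_id == self.broker_id)

    def produce(self, message):
        # A broker that still believes it is the leader accepts the write.
        if not self.is_leader:
            return False
        self.log.append(message)
        return True


class Controller:
    """Toy controller; notify_dead_brokers models the proposed change."""

    def __init__(self, brokers, notify_dead_brokers):
        self.brokers = brokers
        self.notify_dead_brokers = notify_dead_brokers

    def on_broker_failure(self, failed_id):
        # Pick any surviving broker as the new leader.
        new_leader = next(b for b in self.brokers if b.broker_id != failed_id)
        for b in self.brokers:
            if b.broker_id == failed_id and not self.notify_dead_brokers:
                continue  # behaviour described in the issue: dead broker is skipped
            b.handle_leader_and_isr(new_leader.broker_id)
        return new_leader


def run(notify_dead_brokers):
    old_leader, follower = Broker(1, is_leader=True), Broker(2)
    controller = Controller([old_leader, follower], notify_dead_brokers)
    controller.on_broker_failure(failed_id=1)  # broker 1 is in a long GC
    # Broker 1 resumes; a producer with stale metadata still writes to it.
    accepted = old_leader.produce("m1")
    return accepted, old_leader.log
```

With notify_dead_brokers=False the old leader accepts "m1" after the new 
leader has been elected, which is the message that would later be lost. With 
notify_dead_brokers=True the old leader has already stepped down and rejects 
the write, so the producer would have to refresh its metadata.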
                
> Messages sent to the old leader will be lost on broker GC resulted failure
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-1032
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1032
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>
> As pointed out by Swapnil, today when a broker is in a long GC, it will be 
> marked by the controller as failed, which triggers the onBrokerFailure 
> function to migrate leadership to other brokers. However, the controller 
> does not notify the broker with a stopReplica request even after a new 
> leader has been elected for its partitions. The new leader will hence stop 
> fetching from the old leader while the old leader is not aware that it is no 
> longer the leader. And since the old leader is not really dead, producers 
> will not refresh their metadata immediately and will continue sending 
> messages to the old leader. The old leader will only learn that it is no 
> longer the leader when it gets notified by the controller in the 
> onBrokerStartup function, and messages sent between the time the new leader 
> is elected and the time the old leader realizes it is no longer the leader 
> will be lost.
