[ 
https://issues.apache.org/jira/browse/KAFKA-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626995#comment-13626995
 ] 

Jun Rao commented on KAFKA-858:
-------------------------------

Thanks for the patch. Great catch of this bug.

I'm not sure about the change in ReplicaManager, though. A simpler fix seems to 
be for stopReplica() to remove a partition from allPartitions only if 
deletePartition is true. A partition still exists after its fetcher is stopped; 
it ceases to exist only when the controller decides to delete it (currently, 
only because of partition reassignment). Only at that point can it be taken out 
of allPartitions.
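
To illustrate the suggestion, here is a minimal sketch in Java. It is not the actual Kafka code: ReplicaManagerSketch, addPartition, activeFetchers and checkpointHighWatermarks are hypothetical stand-ins, and allPartitions is reduced to a map from a partition name to its last known high watermark. It only shows the proposed rule: stopping a replica always stops its fetcher, but drops the partition from allPartitions only when deletePartition is true.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the broker's ReplicaManager (assumption, not Kafka's code).
class ReplicaManagerSketch {
    // Stand-in for allPartitions: "topic-partition" -> last known high watermark.
    private final Map<String, Long> allPartitions = new ConcurrentHashMap<>();
    // Partitions whose fetcher is currently running (hypothetical bookkeeping).
    private final Set<String> activeFetchers = ConcurrentHashMap.newKeySet();

    void addPartition(String topicPartition, long highWatermark) {
        allPartitions.put(topicPartition, highWatermark);
        activeFetchers.add(topicPartition);
    }

    void stopReplica(String topicPartition, boolean deletePartition) {
        // The fetcher is stopped in either case.
        activeFetchers.remove(topicPartition);
        // The partition is forgotten only when the controller asked for deletion.
        if (deletePartition) {
            allPartitions.remove(topicPartition);
        }
    }

    // Snapshot of what a checkpoint thread would write: every partition still tracked.
    Map<String, Long> checkpointHighWatermarks() {
        return new TreeMap<>(allPartitions);
    }
}
```

With this rule, a partition stopped with deletePartition set to false still appears in the checkpoint snapshot with its high watermark, so a later checkpoint does not lose its value.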
                
> High watermark values can be overwritten during controlled shutdown
> -------------------------------------------------------------------
>
>                 Key: KAFKA-858
>                 URL: https://issues.apache.org/jira/browse/KAFKA-858
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Blocker
>              Labels: kafka-0.8, p1
>         Attachments: kafka-858.patch
>
>
> A race condition between controlled shutdown, the actual process shutdown, and 
> the high watermark checkpoint thread frequency can cause the high watermark 
> values for a subset of partitions to be overwritten. So even if the controller 
> sends a complete list of partitions to the broker, the high watermark for some 
> partitions is still 0. This causes the follower to fetch from the leader's 
> start offset.

