[ https://issues.apache.org/jira/browse/KAFKA-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13584521#comment-13584521 ]

Neha Narkhede commented on KAFKA-769:
-------------------------------------

An easy way of resolving this would be to start the highwatermark thread only after the first leader and isr request is completed on a newly restarted broker. This is easy to keep track of since the initial leader and isr request has a special init flag turned on. This will ensure that there is no inconsistent state checkpointed to disk since we will wait until the replica manager has finished initializing the highwatermark for all its replicas from disk.

Also, this logic will become trickier when we add the features to change the number of replicas online or change the number of partitions online, but we don't have to worry about that right now.

> On startup, a brokers highwatermark for every topic partition gets reset to zero
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-769
>                 URL: https://issues.apache.org/jira/browse/KAFKA-769
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Sriram Subramanian
>            Assignee: Sriram Subramanian
>            Priority: Blocker
>              Labels: p1
>             Fix For: 0.8
>
>
> There is a race condition between the highwatermark thread and the
> handleLeaderAndIsrRequest call of the request handler thread. When a broker
> starts, the highwatermark thread tries to persist all the checkpoints of the
> partitions in ReplicaManager. This partition map in ReplicaManager is
> initially empty. When the leaderAndIsrRequest runs, it updates each partition
> and if the highwatermark thread runs during this interval, it is essentially
> going to overwrite the highwatermark file to an inconsistent state. The read
> of the highwatermark reads from the file each time and hence would return the
> inconsistent state.
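
For illustration, here is a minimal sketch of the race and of the fix proposed in the comment above. It is a toy model in Python, not Kafka's code: the class ReplicaManagerSketch, its methods, and the dict standing in for the highwatermark file are all hypothetical names, and the "first leader and isr request completed" condition is modeled with a plain event rather than the init flag on the request. For simplicity the checkpoint fires before the request instead of in the middle of it; the overwrite behaves the same way.

```python
import threading


class ReplicaManagerSketch:
    """Toy model of the startup race; every name here is hypothetical."""

    def __init__(self, checkpoint_file):
        # checkpoint_file stands in for the on-disk highwatermark file:
        # a dict mapping (topic, partition) -> highwatermark.
        self.checkpoint_file = checkpoint_file
        self.partitions = {}  # the partition map is empty right after a restart
        self.initialized = threading.Event()  # set once the first request is done

    def read_highwatermark(self, tp):
        # Reads always go to the file; a missing entry reads as 0.
        return self.checkpoint_file.get(tp, 0)

    def handle_leader_and_isr(self, topic_partitions):
        # Load each partition's highwatermark from the file into memory.
        for tp in topic_partitions:
            self.partitions[tp] = self.read_highwatermark(tp)
        # Only now is the in-memory map safe to checkpoint.
        self.initialized.set()

    def checkpoint_tick(self, gated):
        # One run of the highwatermark thread body.
        if gated and not self.initialized.is_set():
            return False  # proposed fix: do nothing until initialization is done
        # Overwrite the whole file with the current in-memory state.
        self.checkpoint_file.clear()
        self.checkpoint_file.update(self.partitions)
        return True


def restart(gated):
    tps = [("t", 0), ("t", 1)]
    disk = {tps[0]: 42, tps[1]: 7}  # state checkpointed before the restart
    rm = ReplicaManagerSketch(disk)
    rm.checkpoint_tick(gated)       # highwatermark thread fires first
    rm.handle_leader_and_isr(tps)   # then the first request arrives
    rm.checkpoint_tick(gated)       # a later, normal checkpoint
    return {tp: rm.read_highwatermark(tp) for tp in tps}


if __name__ == "__main__":
    print("ungated:", restart(gated=False))
    print("gated:  ", restart(gated=True))
```

In the ungated run the early checkpoint writes the empty partition map over the file, so both partitions come back with a highwatermark of 0. In the gated run the early checkpoint is skipped and the values from before the restart survive.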