[ 
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073449#comment-14073449
 ] 

Jun Rao commented on KAFKA-1211:
--------------------------------

The issue is that this problem affects not only ack>1 but also ack=-1. 
Suppose that you have 3 replicas A, B, and C, and A is the leader initially. If 
A fails and B takes over as the new leader, C will first truncate its log, and 
the truncated portion could include committed data. Now, if B fails immediately 
after the truncation, C has to become the new leader. At that point we may have 
lost previously committed messages, even though we had only 2 failures.
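
The sequence above can be walked through with a small toy model. This is only 
a sketch, not Kafka code: the Replica class, the simulate() function, and the 
concrete offsets are hypothetical, and it assumes that a follower truncates 
to its own HW and that the followers' HW lags behind the leader's.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    log: list = field(default_factory=list)
    hw: int = 0  # high watermark: offsets below this are known committed

    def truncate_to_hw(self):
        # Assumption: a follower drops everything at or above its own HW
        # before it starts fetching from the new leader.
        self.log = self.log[: self.hw]


def simulate():
    messages = ["m0", "m1", "m2"]
    # All three replicas hold all three messages, so leader A has advanced
    # its HW to 3. B and C have not yet learned the new HW (assumed to
    # still be 1 here).
    a = Replica("A", list(messages), hw=3)
    b = Replica("B", list(messages), hw=1)  # B keeps its full log as leader
    c = Replica("C", list(messages), hw=1)

    committed = a.log[: a.hw]

    # Failure 1: A fails, B becomes leader, C truncates to its own HW.
    c.truncate_to_hw()

    # Failure 2: B fails before C has re-fetched anything, so C must lead.
    new_leader = c

    lost = [m for m in committed if m not in new_leader.log]
    return committed, new_leader.log, lost
```

Under these assumptions, simulate() reports m1 and m2 as committed on A but 
missing from C's log once C becomes the leader.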

> Hold the produce request with ack > 1 in purgatory until replicas' HW has 
> larger than the produce offset
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1211
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1211
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>             Fix For: 0.9.0
>
>
> Today, during leader failover, there is a window of weakness in which the 
> followers truncate their data before fetching from the new leader, i.e., the 
> number of in-sync replicas is just 1. If the leader also fails during this 
> window, produce requests with ack > 1 that have already been responded to 
> will still be lost. To avoid this scenario, we would prefer to hold the 
> produce request in purgatory until the replicas' HW is larger than the 
> produce offset, instead of just their end-of-log offsets.
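
The difference between the two completion criteria in the description can be 
illustrated with a minimal sketch. The function names and the sample offsets 
are hypothetical and do not correspond to Kafka's purgatory implementation; 
the only point is that the same check gives a different answer depending on 
whether it is fed end-of-log offsets or HWs.

```python
def acked_replicas(produce_offset, positions):
    """Count replicas whose position (LEO or HW) is past the produce offset."""
    return sum(1 for p in positions if p > produce_offset)


def can_respond(produce_offset, positions, required_acks):
    """Return True once enough replicas are past the produce offset."""
    return acked_replicas(produce_offset, positions) >= required_acks


# Hypothetical state for replicas A, B, C right after a message at offset 2
# has been replicated but before the followers have learned the new HW.
end_of_log_offsets = [3, 3, 3]
high_watermarks = [3, 1, 1]

respond_by_leo = can_respond(2, end_of_log_offsets, required_acks=2)
respond_by_hw = can_respond(2, high_watermarks, required_acks=2)
```

With these sample values the end-of-log check would already release the 
request, while the HW check would keep holding it until at least one 
follower's HW has moved past offset 2.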



--
This message was sent by Atlassian JIRA
(v6.2#6252)
