[
https://issues.apache.org/jira/browse/KAFKA-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073449#comment-14073449
]
Jun Rao commented on KAFKA-1211:
--------------------------------
The issue is that this problem affects not only ack>1 but also ack=-1.
Suppose that you have 3 replicas A, B, and C and A is the leader initially. If
A fails and B takes over as the new leader, C will first truncate its log,
which could discard committed data. Now, if immediately after the truncation, B
fails, C has to be the new leader. Now, we may have lost previously committed
messages, even though we had only 2 failures.
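
The scenario above can be walked through with a small toy model. This is only
a sketch, not Kafka code: the Replica class, the offsets, and the assumption
that a follower truncates to its own (possibly stale) high watermark are all
illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica (hypothetical, not a Kafka class)."""
    name: str
    log: list = field(default_factory=list)
    hw: int = 0  # local high watermark; may lag behind the leader's

    def truncate_to_hw(self):
        # Assumption: on leader change a follower drops every entry
        # at or above its own high watermark.
        del self.log[self.hw:]


def simulate():
    msgs = ["m0", "m1", "m2"]
    # All replicas hold all three messages, so leader A has advanced
    # its HW to 3, but B and C have not yet learned the new HW.
    a = Replica("A", list(msgs), hw=3)
    b = Replica("B", list(msgs), hw=1)
    c = Replica("C", list(msgs), hw=1)
    committed = a.log[:a.hw]

    # Failure 1: A fails and B becomes leader.
    # C truncates before fetching from B.
    c.truncate_to_hw()

    # Failure 2: B fails right after C's truncation, so C must lead.
    lost = [m for m in committed if m not in c.log]
    return committed, c.log, lost
```

With these illustrative values, C becomes leader holding only m0, so m1 and m2
are lost after just two failures even though all three replicas had them.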
> Hold the produce request with ack > 1 in purgatory until replicas' HW has
> larger than the produce offset
> --------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-1211
> URL: https://issues.apache.org/jira/browse/KAFKA-1211
> Project: Kafka
> Issue Type: Bug
> Reporter: Guozhang Wang
> Assignee: Guozhang Wang
> Fix For: 0.9.0
>
>
> Today, during leader failover, there is a window of weakness in which the
> followers truncate their data before fetching from the new leader, i.e., the
> number of in-sync replicas is just 1. If the leader also fails during this
> window, produce requests with ack > 1 that have already been acknowledged will
> still be lost. To avoid this scenario, we would prefer to hold the produce
> request in purgatory until the replicas' HW is larger than the produce offset,
> instead of just their end-of-log offsets.
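
The difference between the two completion conditions in the description can be
sketched as follows. The names ReplicaState, satisfied_by_leo, and
satisfied_by_hw are hypothetical and chosen for illustration; they are not
Kafka's purgatory API.

```python
from dataclasses import dataclass


@dataclass
class ReplicaState:
    """Per-replica offsets as seen by the leader (hypothetical type)."""
    log_end_offset: int
    high_watermark: int


def satisfied_by_leo(replicas, produce_offset):
    # Existing behavior per the description: respond once every
    # replica's end-of-log offset has passed the produce offset.
    return all(r.log_end_offset > produce_offset for r in replicas)


def satisfied_by_hw(replicas, produce_offset):
    # Proposed behavior: respond only once every replica's high
    # watermark is larger than the produce offset.
    return all(r.high_watermark > produce_offset for r in replicas)


# Followers have fetched up to offset 3 but their HW is still 1.
followers = [ReplicaState(3, 1), ReplicaState(3, 1)]
leo_ok = satisfied_by_leo(followers, produce_offset=2)
hw_ok = satisfied_by_hw(followers, produce_offset=2)
```

Under the end-of-log check the request at offset 2 would already be answered,
while under the HW check it would stay in purgatory until the followers'
high watermarks advance past 2.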