[ https://issues.apache.org/jira/browse/KAFKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073326#comment-14073326 ]
Sudarshan Kadambi commented on KAFKA-1555: ------------------------------------------ Neha and Jun: I want to make sure we have agreement on the issue (both, that there is an issue and what the issue is :) ). We plan to use a replication factor of 3 and acks=2. We'd like this to mean that we are tolerant to the loss of 1 machine, without loss of published messages or the producer being blocked. Let's focus on the following scenario: Let's say, L, F1 and F2 are the leader and 2 followers in the ISR. With acks=2, let's say L and F1 have committed all published messages and F2 is up to replica.max.lag.messages behind. When L goes down, F2 is made the new leader and not F1, even though F1 is up to date with the leader. We need to be able to take into account how caught up a given broker is in the ISR, when electing a new leader. This is also unclean leader election, but of a different type than what we've been discussing. > provide strong consistency with reasonable availability > ------------------------------------------------------- > > Key: KAFKA-1555 > URL: https://issues.apache.org/jira/browse/KAFKA-1555 > Project: Kafka > Issue Type: Improvement > Components: controller > Affects Versions: 0.8.1.1 > Reporter: Jiang Wu > Assignee: Neha Narkhede > > In a mission critical application, we expect a kafka cluster with 3 brokers > can satisfy two requirements: > 1. When 1 broker is down, no message loss or service blocking happens. > 2. In worse cases such as two brokers are down, service can be blocked, but > no message loss happens. > We found that current kafka versoin (0.8.1.1) cannot achieve the requirements > due to its three behaviors: > 1. when choosing a new leader from 2 followers in ISR, the one with less > messages may be chosen as the leader. > 2. even when replica.lag.max.messages=0, a follower can stay in ISR when it > has less messages than the leader. > 3. ISR can contains only 1 broker, therefore acknowledged messages may be > stored in only 1 broker. > The following is an analytical proof. > We consider a cluster with 3 brokers and a topic with 3 replicas, and assume > that at the beginning, all 3 replicas, leader A, followers B and C, are in > sync, i.e., they have the same messages and are all in ISR. > According to the value of request.required.acks (acks for short), there are > the following cases. > 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement. > 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this > time, although C hasn't received m, C is still in ISR. If A is killed, C can > be elected as the new leader, and consumers will miss m. > 3. acks=-1. B and C restart and are removed from ISR. Producer sends a > message m to A, and receives an acknowledgement. Disk failure happens in A > before B and C replicate m. Message m is lost. > In summary, any existing configuration cannot satisfy the requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)