With acks=-1, the producer is guaranteed to receive an error when publishing m2 in the above case.
Thanks,

Jun

On Thu, Jul 24, 2014 at 9:46 AM, Scott Clasen <[email protected]> wrote:

> Thanks for the Jira info.
>
> Just to clarify, in the case we are outlining above, would the producer have received an ack on m2 (with acks = -1) or not? If not, then I have no concerns; if so, then how would the producer know to re-publish?
>
> On Thu, Jul 24, 2014 at 9:38 AM, Jun Rao <[email protected]> wrote:
>
> > About re-publishing m2, it seems it's better to let the producer choose whether to do this or not.
> >
> > There is another known bug, KAFKA-1211, that's not fixed yet. The situation when this can happen is relatively rare and the fix is slightly involved. So, it may not be addressed in 0.8.2.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jul 22, 2014 at 8:36 PM, scott@heroku <[email protected]> wrote:
> >
> > > Thanks so much for the detailed explanation, Jun; it pretty much lines up with my understanding.
> > >
> > > In the case below, if we didn't particularly care about ordering and re-produced m2, it would then become m5, and in many use cases this would be ok?
> > >
> > > Perhaps a more direct question would be: once 0.8.2 is out and I have a topic with unclean leader election disabled, and produce with acks = -1, are there any known series of events (other than disk failures on all brokers) that would cause the loss of messages that a producer has received an ack for?
> > >
> > > On Jul 22, 2014, at 8:17 PM, Jun Rao <[email protected]> wrote:
> > >
> > > > The key point is that we have to keep all replicas consistent with each other, such that no matter which replica a consumer reads from, it always reads the same data at a given offset. The following is an example.
> > > >
> > > > Suppose we have 3 brokers A, B and C. Let's assume A is the leader and at some point, we have the following offsets and messages in each replica.
> > > >
> > > > offset   A    B    C
> > > > 1        m1   m1   m1
> > > > 2        m2
> > > >
> > > > Let's assume that message m1 is committed and message m2 is not. At exactly this moment, replica A dies. After a new leader is elected, say B, new messages can be committed with just replicas B and C. Some point later, if we commit two more messages m3 and m4, we will have the following.
> > > >
> > > > offset   A    B    C
> > > > 1        m1   m1   m1
> > > > 2        m2   m3   m3
> > > > 3             m4   m4
> > > >
> > > > Now A comes back. For consistency, it's important for A's log to be identical to B's and C's. So, we have to remove m2 from A's log and add m3 and m4. As you can see, whether you want to republish m2 or not, m2 cannot stay at its current offset, since in the other replicas that offset is already taken by other messages. Therefore, a truncation of replica A's log is needed to keep the replicas consistent. Currently, we don't republish messages like m2 because (1) it's not necessary, since m2 was never considered committed, and (2) it would make our protocol more complicated.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
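To make the offset bookkeeping above concrete, the following is a small self-contained toy model of Jun's example (it is not Kafka's replica code, and the names are made up): broker A first truncates back to the last point it knows is committed, which drops the uncommitted m2, and then re-fetches the rest of the log from the new leader B.

import java.util.ArrayList;
import java.util.List;

// Toy model of the replica logs in the example above. This is NOT Kafka's
// replication code; it only illustrates why A must truncate its uncommitted
// message (m2) before catching up from the new leader (B).
public class TruncationSketch {

    public static void main(String[] args) {
        // Logs right before A dies: m1 is committed everywhere, m2 exists
        // only on A and is not committed.
        List<String> a = new ArrayList<>(List.of("m1", "m2"));
        List<String> b = new ArrayList<>(List.of("m1"));
        List<String> c = new ArrayList<>(List.of("m1"));
        int committed = 1;   // number of messages known to be committed

        // A dies; B becomes leader and commits m3 and m4 with just B and C.
        b.add("m3"); c.add("m3");
        b.add("m4"); c.add("m4");

        // A comes back. Step 1: truncate to the last committed point.
        while (a.size() > committed) {
            a.remove(a.size() - 1);
        }
        // Step 2: fetch everything past that point from the current leader.
        for (int offset = a.size(); offset < b.size(); offset++) {
            a.add(b.get(offset));
        }

        // All replicas now hold [m1, m3, m4]; m2 is gone, which is acceptable
        // because it was never acknowledged as committed.
        System.out.println("A=" + a + " B=" + b + " C=" + c);
    }
}

Real brokers track the committed point as the high watermark and catch up through the regular replica fetcher, but the recovery has the same two steps: truncate, then re-sync from the leader.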
> > > > On Tue, Jul 22, 2014 at 3:40 PM, scott@heroku <[email protected]> wrote:
> > > >
> > > > > Thanks Jun.
> > > > >
> > > > > Can you explain a little more about what an uncommitted message means? The messages are in the log, so presumably they have been acked at least by the local broker.
> > > > >
> > > > > I guess I am hoping for some intuition around why 'replaying' the messages in question would cause bad things.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > On Jul 22, 2014, at 3:06 PM, Jun Rao <[email protected]> wrote:
> > > > >
> > > > > > Scott,
> > > > > >
> > > > > > The reason for truncation is that the broker that comes back may have some uncommitted messages. Those messages shouldn't be exposed to the consumer and therefore need to be removed from the log. So, on broker startup, we first truncate the log to a safe point before which we know all messages are committed. This broker will then sync up with the current leader to get the remaining messages.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Tue, Jul 22, 2014 at 9:42 AM, Scott Clasen <[email protected]> wrote:
> > > > > >
> > > > > > > Ahh, yes, that message loss case. I've wondered about that myself.
> > > > > > >
> > > > > > > I guess I don't really understand why truncating messages is ever the right thing to do. As Kafka is an 'at least once' system (send a message, get no ack, it still might be on the topic), consumers that care will have to de-dupe anyhow.
> > > > > > >
> > > > > > > To the Kafka designers: is there anything preventing implementation of alternatives to truncation? When a broker comes back online and needs to truncate, can't it fire up a producer and take the extra messages and send them back to the original topic, or alternatively to an error topic?
> > > > > > >
> > > > > > > Would love to understand the rationale for the current design, as my perspective is doubtfully as clear as the designers'.
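The answer the thread converges on is at the top: with request.required.acks=-1, an uncommitted message like m2 surfaces as a send error on the producer, so re-publishing it is the producer's decision rather than the broker's. A minimal sketch of that pattern against the 0.8.x Java producer API follows; the broker list and topic name are placeholders.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

// Sync producer that waits for all in-sync replicas. If the send is not
// committed, it throws, and the application decides whether to re-send.
public class AckAllProducerSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("producer.type", "sync");         // block until the broker responds
        props.put("request.required.acks", "-1");   // wait for all replicas in the ISR

        Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
        try {
            producer.send(new KeyedMessage<String, String>("my-topic", "key", "m2"));
            // Reached only if the message was committed by the leader and its ISR.
        } catch (Exception e) {
            // Not committed as far as the producer knows: re-send (possibly
            // creating a duplicate) or give up, i.e. "let the producer choose".
            System.err.println("m2 was not committed: " + e);
        } finally {
            producer.close();
        }
    }
}

Note that a failed send may still have landed in a log somewhere (the at-least-once point made above), so re-sending can produce duplicates and consumers that care still need to de-dupe.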
> > > > > > > On Tue, Jul 22, 2014 at 6:21 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) <[email protected]> wrote:
> > > > > > >
> > > > > > > > kafka-1028 addressed another unclean leader election problem. It prevents a broker that is not in the ISR from becoming a leader. The problem we are facing is that a broker in the ISR but without complete messages may become a leader. It's also a kind of unclean leader election, but not the one that kafka-1028 addressed.
> > > > > > > >
> > > > > > > > Here I'm trying to give a proof that current Kafka doesn't achieve the requirement (no message loss, no blocking when 1 broker is down) due to two of its behaviors:
> > > > > > > > 1. when choosing a new leader from 2 followers in the ISR, the one with fewer messages may be chosen as the leader
> > > > > > > > 2. even when replica.lag.max.messages=0, a follower can stay in the ISR when it has fewer messages than the leader.
> > > > > > > >
> > > > > > > > We consider a cluster with 3 brokers and a topic with 3 replicas. We analyze different cases according to the value of request.required.acks (acks for short). For each case and its subcases, we find situations in which either message loss or service blocking happens. We assume that at the beginning, all 3 replicas, leader A and followers B and C, are in sync, i.e., they have the same messages and are all in the ISR.
> > > > > > > >
> > > > > > > > 1. acks=0, 1, 3. Obviously these settings do not satisfy the requirement: 0 and 1 allow message loss, and 3 blocks producers as soon as one broker is down.
> > > > > > > > 2. acks=2. Producer sends a message m. It's acknowledged by A and B. At this time, although C hasn't received m, it's still in the ISR. If A is killed, C can be elected as the new leader, and consumers will miss m.
> > > > > > > > 3. acks=-1. Suppose replica.lag.max.messages=M. There are two sub-cases:
> > > > > > > > 3.1 M>0. Suppose C is killed. C will be out of the ISR after replica.lag.time.max.ms. Then the producer publishes M messages to A and B. C restarts. C will rejoin the ISR, since it is only M messages behind A and B. Before C replicates all messages, A is killed and C becomes leader; message loss happens.
> > > > > > > > 3.2 M=0. In this case, when the producer publishes at a high speed, B and C will fall out of the ISR and only A keeps receiving messages. Then A is killed. Either message loss or service blocking will happen, depending on whether unclean leader election is enabled.
> > > > > > > >
> > > > > > > > From: [email protected] At: Jul 21 2014 22:28:18
> > > > > > > > To: JIANG WU (PRICEHISTORY) (BLOOMBERG/ 731 LEX -), [email protected]
> > > > > > > > Subject: Re: how to ensure strong consistency with reasonable availability
> > > > > > > >
> > > > > > > > You will probably need 0.8.2 which gives
> > > > > > > > https://issues.apache.org/jira/browse/KAFKA-1028
> > > > > > > >
> > > > > > > > On Mon, Jul 21, 2014 at 6:37 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > With a cluster of 3 brokers and a topic of 3 replicas, we want to achieve the following two properties:
> > > > > > > > > 1. when only one broker is down, there's no message loss, and producers/consumers are not blocked.
> > > > > > > > > 2. in other more serious problems, for example, one broker is restarted twice in a short period or two brokers are down at the same time, producers/consumers can be blocked, but no message loss is allowed.
> > > > > > > > >
> > > > > > > > > We haven't found any producer/broker parameter combinations that achieve this. If you know or think some configurations will work, please post details. We have a test bed to verify any given configurations.
> > > > > > > > >
> > > > > > > > > In addition, I'm wondering if it's necessary to open a jira to require the above feature?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Jiang
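Case 3.1 above is the subtle one, so here is a standalone toy replay of that sequence (again not Kafka code; M and the message names are made up) showing which acknowledged messages disappear when C, still within the lag window, is elected leader.

import java.util.ArrayList;
import java.util.List;

// Toy replay of case 3.1: acks=-1 with replica.lag.max.messages = M > 0.
// It only walks through the sequence of events described in the thread.
public class LagWindowLossSketch {

    public static void main(String[] args) {
        final int M = 4;                      // replica.lag.max.messages
        List<String> a = new ArrayList<>();   // leader
        List<String> b = new ArrayList<>();   // follower that stays up
        List<String> c = new ArrayList<>();   // follower that was restarted

        // C is down and has dropped out of the ISR. The producer publishes M
        // messages with acks=-1; A and B are the whole ISR, so every send is
        // acknowledged back to the producer.
        List<String> acked = new ArrayList<>();
        for (int i = 1; i <= M; i++) {
            String msg = "m" + i;
            a.add(msg);
            b.add(msg);
            acked.add(msg);
        }

        // C restarts. It is only M messages behind the leader, so with
        // replica.lag.max.messages=M it is let back into the ISR before it
        // has actually caught up.
        boolean cInIsr = (a.size() - c.size()) <= M;

        // A is killed before C replicates anything. C is in the ISR, so it is
        // a legal choice for the new leader even with an empty log.
        List<String> newLeader = cInIsr ? c : b;

        // Every acknowledged message missing from the new leader is lost.
        List<String> lost = new ArrayList<>(acked);
        lost.removeAll(newLeader);
        System.out.println("acknowledged=" + acked + " lost=" + lost);
    }
}

With M = 4 the sketch reports all four acknowledged messages as lost, which is exactly the gap between requirement 1 above and what acks=-1 alone can guarantee.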
