Apurva, unclean leader election is disabled by default since 0.11.0.0 (KIP-106).


On 12 Aug 2017 8:06 pm, "Apurva Mehta" <apu...@confluent.io> wrote:

> I think the question of the default broker-level configs is a good one. I
> don't think we need to touch the min.isr config or the replication factor
> to satisfy 'exactly-once', going by the definition laid out earlier. On the
> broker side, I think the only thing we should change is to disable unclean
> leader election.
>
> As I mentioned earlier, the guarantee I am proposing is that 'every
> acknowledged message appears in the log exactly once, in order per
> partition'.
>
> Now, if we have replication-factor=1, the acks setting makes no difference.
> In this case, there is no chance of leader failover and you are signing up
> for low availability. However, in the absence of losing the disk (and hence
> the actual log), every acknowledged message will appear in the log exactly
> once and in order whenever the log is available.
>
> Now, assume we have disabled unclean leader election and have a replication
> factor > 1.
> With an acks=1 setting, you can lose acknowledged messages during leader
> failover (which is not altogether rare). In particular, the leader can fail
> after acknowledging a message but before it has been replicated. The other
> replicas would still be in the ISR but without the acknowledged message,
> which is then lost when one of them becomes the new leader.
>
> With acks=all, the above scenario is not possible.
>
> The min.isr setting only really helps protect your data in the event of
> hard failures where you actually lose a disk. If you have min.isr=1, you
> can keep producing even when only the leader is up. During this period, you
> lose availability if the leader fails, and you lose data if the leader's
> disk fails. However, once the replicas come back online, they can only
> become leaders after they are back in sync, so you can't lose data to
> leader failover or bad truncation.
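>
> As a rough sketch (topic name, counts, and servers are placeholders, and
> this assumes the Java AdminClient), setting min.isr at topic creation
> looks like:
>
> import java.util.Collections;
> import java.util.Properties;
> import org.apache.kafka.clients.admin.AdminClient;
> import org.apache.kafka.clients.admin.NewTopic;
>
> public class MinIsrExample {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092");
>         try (AdminClient admin = AdminClient.create(props)) {
>             // 3 partitions, replication factor 3; writes with acks=all
>             // then need at least 2 in-sync replicas to be accepted.
>             NewTopic topic = new NewTopic("payments", 3, (short) 3)
>                 .configs(Collections.singletonMap("min.insync.replicas", "2"));
>             admin.createTopics(Collections.singleton(topic)).all().get();
>         }
>     }
> }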
> So enable.idempotence=true, acks=all, retries=MAX_INT, and
> unclean.leader.election.enable=false are sufficient defaults to give
> strong delivery guarantees, though not strong durability guarantees. I
> think it is important to distinguish between the two.
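>
> On the producer side, those defaults look roughly like this (topic,
> servers, and serializers are placeholders; unclean.leader.election.enable
> is a broker/topic-level setting, so it is not shown here):
>
> import java.util.Properties;
> import org.apache.kafka.clients.producer.KafkaProducer;
> import org.apache.kafka.clients.producer.ProducerRecord;
> import org.apache.kafka.common.serialization.StringSerializer;
>
> public class StrongDeliveryProducer {
>     public static void main(String[] args) {
>         Properties props = new Properties();
>         props.put("bootstrap.servers", "localhost:9092");
>         props.put("key.serializer", StringSerializer.class.getName());
>         props.put("value.serializer", StringSerializer.class.getName());
>         props.put("enable.idempotence", "true");
>         props.put("acks", "all");
>         props.put("retries", Integer.toString(Integer.MAX_VALUE));
>         // Idempotence bounds how many in-flight batches the broker can
>         // de-duplicate (see KAFKA-5494 below), so keep this small.
>         props.put("max.in.flight.requests.per.connection", "1");
>         try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
>             producer.send(new ProducerRecord<>("payments", "key", "value"));
>         }
>     }
> }
>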
> Some specific responses in line:
>
> > From users' perspective, when idempotence=true and
> > max.in.flight.requests.per.connection > 0, ideally what acks=1 should
> > really mean is that "as long as there is no hardware failure, my message
> > is sent exactly once". Do you think this semantic is good enough as a
> > default configuration to ship? It is unfortunate this statement is not
> > true today, as even when we do leader migration without any broker
> > failure, the leader will naively truncate the data that has not been
> > replicated. It is a long existing issue and we should try to fix that.
>
> I don't see what's wrong here. This is exactly what you should expect with
> acks=1, and having stronger behavior for acks=1 would be extremely hard (if
> not impossible) to achieve. Making sure acks=all and disabling unclean
> leader election is the right fix if you want stronger semantics where data
> integrity is guaranteed to be maintained across leader failover.
>
> > For the max.in.flight.requests.per.connection, can you elaborate a little
> > on "Given the nature of the idempotence feature, we have to bound it."?
> > What is the concern here? It seems that when nothing wrong happens,
> > pipelining should just work. And the memory is bounded by the memory
> > buffer pool anyways. Sure, one has to resend all the subsequent batches
> > if one batch is out of sequence, but that should be rare and we probably
> > should not optimize for that.
>
> The problem is described in
> https://issues.apache.org/jira/browse/KAFKA-5494.
> Specifically, when you have max.in.flight=N, the broker needs to retain the
> offset/sequence/timestamp of the last N appended batches, because it could
> receive duplicates of any of those. That's why we have to bound the value
> of max.in.flight when you enable idempotence; otherwise there is no way to
> handle duplicates correctly and efficiently on the broker.
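>
> To illustrate why (all names here are invented for the sketch; this is the
> shape of the problem, not the actual broker code): with max.in.flight=N
> the broker can keep dedup metadata for only the last N batches per
> producer and partition, so a duplicate of anything older could no longer
> be re-acknowledged with its original offset.
>
> import java.util.ArrayDeque;
> import java.util.Deque;
>
> class ProducerStateSketch {
>     private static final int MAX_IN_FLIGHT = 5; // the bound on max.in.flight
>     // Metadata of the last MAX_IN_FLIGHT appended batches:
>     // [0] = first sequence number of the batch, [1] = its base offset.
>     private final Deque<long[]> lastBatches = new ArrayDeque<>();
>
>     // Returns the cached offset for a duplicate batch, or -1 to append it.
>     long checkDuplicate(long firstSeq) {
>         for (long[] batch : lastBatches) {
>             if (batch[0] == firstSeq) return batch[1]; // re-ack, don't re-append
>         }
>         return -1L; // unseen (or evicted) sequence: append normally
>     }
>
>     void recordAppend(long firstSeq, long baseOffset) {
>         if (lastBatches.size() == MAX_IN_FLIGHT) lastBatches.removeFirst();
>         lastBatches.addLast(new long[] {firstSeq, baseOffset});
>     }
> }
>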
> Thanks,
> Apurva
> On Sat, Aug 12, 2017 at 2:32 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> > Hi all,
> >
> > I will send a more detailed email later; some quick comments:
> >
> > 1. It's unlikely that defaults will suit everyone. I think the question
> > is: what is the most likely configuration for a typical Kafka user
> > _today_? Kafka's usage is growing well beyond its original use cases, and
> > correctness is often more important than achieving the maximum possible
> > performance. Large scale users have the time, knowledge and resources to
> > tweak settings so that they can achieve the latter. 1.0.0 is a good time
> > to be thinking about this.
> >
> > 2. This KIP focuses on producer configs. A few people (in the discussion
> > thread and offline) have asked whether the default min.insync.replicas
> > should be changed as well. I think that's a fair question and we should
> > address it in the KIP.
> >
> > 3. It is true that the current configs are generally too complicated. The
> > notion of "profiles" has been raised previously, where one could state
> > their intent and the configs would be set to match. Becket suggested a
> > semantics config (exactly_once, at_most_once, at_least_once), which is
> > similar to an existing Kafka Streams config. The user could also specify
> > whether they would like to optimise for latency or throughput, for
> > example (this has been raised before). I think we should pursue these
> > ideas in a separate KIP.
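> >
> > For reference, a rough sketch of setting that existing Streams config
> > (application id and servers are placeholders):
> >
> > import java.util.Properties;
> > import org.apache.kafka.streams.StreamsConfig;
> >
> > public class GuaranteeExample {
> >     public static void main(String[] args) {
> >         Properties props = new Properties();
> >         props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
> >         props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
> >         props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
> >             StreamsConfig.EXACTLY_ONCE);
> >     }
> > }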
> >
> > One comment inline.
> >
> > On Sat, Aug 12, 2017 at 1:26 AM, Becket Qin <becket....@gmail.com>
> > wrote:
> >
> > > From users' perspective, when idempotence=true and
> > > max.in.flight.requests.per.connection > 0, ideally what acks=1 should
> > > really mean is that "as long as there is no hardware failure, my
> > > message is sent exactly once".
> >
> >
> > I don't understand this statement. Hardware failures are common and we
> > should be striving for defaults that work correctly under failure.
> >
> > Ismael
> >
