Re: [DISCUSS] KIP-112: Handle disk failure for JBOD

Dong Lin Tue, 14 Mar 2017 15:30:22 -0700

Hey Ismael,

Thanks for the comment. Please see my reply below.

On Tue, Mar 14, 2017 at 10:31 AM, Ismael Juma <[email protected]> wrote:

> Thanks Dong. Comments inline.
>
> On Fri, Mar 10, 2017 at 6:25 PM, Dong Lin <[email protected]> wrote:
> >
> > I get your point. But I am not sure we should recommend that users simply
> > remove the disk from the broker config. If a user does this without
> > checking the utilization of the good disks, replicas on the bad disk will
> > be re-created on the good disks and may overload them, causing
> > cascading failure.
> >
>
> Good point.
>
>
> >
> > I agree with you and Colin that a slow disk may cause problems. However,
> > performance degradation due to a slow disk is an existing problem that
> > is not detected or handled by Kafka or KIP-112.
>
>
> I think an important difference is that a number of disk errors are
> currently fatal and won't be after KIP-112. So it introduces new scenarios
> (for example, bouncing a broker that is working fine although some disks
> have been marked bad).
>

Hmm, I am still trying to understand why KIP-112 creates new scenarios.
A slow disk is not considered a fatal error and won't be caught by either
the existing Kafka design or this KIP. If a disk is marked bad, it means
the broker encountered an IOException when accessing it. Most likely the
broker will encounter an IOException again when accessing this disk after
the bounce and will mark it as bad again. I guess you are talking about the
case where a disk is marked bad, the broker is bounced, and then the disk
provides degraded performance without being marked bad, right? But this
seems to be the problem we already have today with slow disks.

Here are the possible scenarios for a bad disk after a broker bounce:

1) bad disk -> broker bounce -> good disk. This would be great.
2) bad disk -> broker bounce -> slow disk. A slow disk is an existing
problem that is not addressed by Kafka today.
3) bad disk -> broker bounce -> bad disk. This is handled by this KIP such
that only replicas on the bad disk become offline.
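
As a rough illustration of scenarios 1) and 3), here is a minimal Python
sketch. It is not Kafka code: the function names probe_log_dirs and
offline_replicas and the write-probe approach are assumptions made for this
example. It marks a log directory offline when a small write raises an
OSError (Python's counterpart of the IOException above) and reports only the
replicas on those directories as offline. A slow disk, as in scenario 2),
would still pass this probe.

```python
import os
import tempfile


def probe_log_dirs(log_dirs):
    """Split log directories into (online, offline) by attempting a small write.

    Hypothetical helper: any OSError while creating a temporary file marks
    the directory offline. A slow but working disk still counts as online.
    """
    online, offline = [], []
    for log_dir in log_dirs:
        try:
            # Create and immediately delete a temporary file in the directory.
            with tempfile.NamedTemporaryFile(dir=log_dir):
                pass
            online.append(log_dir)
        except OSError:
            offline.append(log_dir)
    return online, offline


def offline_replicas(replicas_by_dir, offline_dirs):
    """Return the replicas hosted on offline directories, sorted by name."""
    return sorted(
        replica
        for log_dir in offline_dirs
        for replica in replicas_by_dir.get(log_dir, [])
    )
```

For example, probing one existing directory and one missing path puts the
first in the online list and the second in the offline list, and only the
replicas mapped to the missing path are reported as offline.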

>
> > Detection and handling of slow disks is a separate problem that needs
> > to be addressed in a future KIP. It is currently listed in the future
> > work. Does this sound OK?
> >
>
> I'm OK with it being handled in the future. In the meantime, I was just
> hoping that we can make users aware of the potential issue of a disk
> marked as bad becoming good again after a bounce (which can be
> dangerous).
>
> > The main benefit of creating the second topic after the log directory
> > goes offline is that we can make sure the second topic is created on the
> > good log directory. I am not sure we can simply assume that the first
> > topic will always be created on the first log directory in the broker
> > config and the second topic will be created on the second log directory
> > in the broker config.
>
>
>
> > However, I can add this test in KIP-113, which allows users to
> > reassign replicas to a specific log directory of a broker. Is this OK?
> >
>
> OK. Please add a note to KIP-112 about this as well (so that it's clear why
> we only do it in KIP-113).
>

Sure. Instead of adding a note to KIP-112, I have added a test in KIP-113 to
verify that bad log directories discovered at runtime do not affect
replicas on the good log directories. Does this address the problem?

> Ismael
>
