Hi Dong,

Thanks for the KIP, a few more comments:

1. In the KIP wiki section "A log directory stops working on a broker
during runtime", the controller deletes the notification node right after
it reads the znode. It seems safer to delete it last, so that even if the
controller fails, the new controller will still see the notification.
2. In the notification znode we have the Event field as an integer. Can we
document what the value of LogDirFailure is? And are there any other
possible values?
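
To make the ordering in point 1 concrete, here is a minimal sketch. It is
not Kafka source: a plain dict stands in for ZooKeeper, and the znode path
and JSON payload are assumed for illustration only (the actual Event value
is exactly what point 2 asks to have documented).

```python
# Sketch of the delete-last ordering from point 1 above.
# The znode path and payload are assumptions, not taken from the KIP.

znodes = {"/log_dir_event_notification/event_1": '{"Version": 1, "Event": 1}'}
processed = []

def process(payload):
    # Stand-in for the controller's real handling (e.g. moving leadership
    # off the replicas on the failed log directories).
    processed.append(payload)

def handle_notification(path, crash_before_delete=False):
    payload = znodes[path]        # 1. read the notification znode
    process(payload)              # 2. act on it
    if crash_before_delete:
        raise RuntimeError("controller failed")  # simulated controller crash
    del znodes[path]              # 3. delete only after processing

# The old controller crashes after processing but before deleting:
try:
    handle_notification("/log_dir_event_notification/event_1",
                        crash_before_delete=True)
except RuntimeError:
    pass

# Because deletion happens last, the znode survives the failover, so the
# new controller still sees the notification and can re-handle it:
assert "/log_dir_event_notification/event_1" in znodes
handle_notification("/log_dir_event_notification/event_1")
assert not znodes
```

One consequence of this ordering is that the same notification may be
handled twice across a failover, so the controller's handling of it needs
to be idempotent.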

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 7, 2017 at 11:30 AM, Dong Lin <lindon...@gmail.com> wrote:

> Hey Ismael,
>
> Thanks much for taking time to review the KIP and read through all the
> discussion!
>
> Please see my reply inline.
>
> On Tue, Mar 7, 2017 at 9:47 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Hi Dong,
> >
> > It took me a while, but I finally went through the whole thread. I have a
> > few minor comments:
> >
> > 1. Regarding the metrics, can we include the full name (e.g.
> > kafka.cluster:type=Partition,name=InSyncReplicasCount,
> > topic={topic},partition={partition}, as was defined in KIP-96)?
> >
> Certainly. I have updated the KIP to specify the full name.
>
>
> > 2. We talk about changes in operational procedures for people switching
> > from RAID to JBOD, but what about people who are already using JBOD?
> > Since disk failures won't necessarily cause broker failures, some
> > adjustments may be needed.
> >
>
> Good point. I indeed missed an operational procedure for both existing
> RAID and JBOD users. I have updated the KIP to specify the following:
>
> The administrator will need to detect log directory failures by monitoring
> OfflineLogDirectoriesCount. After a log directory failure is detected, the
> administrator needs to fix the disks and reboot the broker.
>
>
> >
> > 3. Another point regarding operational procedures: with a large enough
> > cluster, disk failures may not be that uncommon. It may be worth
> > explaining the recommended procedure if someone needs to do a rolling
> > bounce of a cluster with some bad disks. One option is to simply do the
> > bounce and hope that the bad disks are detected during restart, but we
> > know that this is not guaranteed to happen immediately. A better option
> > may be to remove the bad log dirs from the broker config until the disk
> > is replaced.
> >
>
> I am not sure I understand your suggestion here. I think users don't need
> to differentiate between a log directory failure during a rolling bounce
> and one during runtime. All they need to do is detect and handle log
> directory failures as specified above. And they don't have to remove the
> bad log directory from the broker config immediately. The only drawback
> of keeping the log directory there is that a new replica may not be
> created on the broker. But the chance of that happening is really low,
> since the controller has to fail in a small window after the user
> initiates topic creation but before it sends a LeaderAndIsrRequest with
> is_new_replica=true to the broker. In practice this shouldn't matter.
>
>
> >
> > 4. The test plan doesn't mention the number of log directories per
> > broker. It could be good to specify this. Also, we seem to create one
> > topic with one partition, which means that only one log directory will
> > be populated. It seems like we should have partitions in more than one
> > log directory to verify that the failed log directory doesn't affect
> > the ones that are still good.
> >
>
> Sure. I have updated the test description to specify that each broker will
> have two log directories.
>
> The existing test case will actually create 2 topics to validate that a
> failed log directory won't affect the good ones. You can find them after
> "Now validate that the previous leader can still serve replicas on the good
> log directories" and "Now validate that the follower can still serve
> replicas on the good log directories".
>
>
> >
> > 5. In the protocol definition, we have isNewReplica, but it should
> > probably be is_new_replica.
> >
>
> Good point. My bad. It is fixed now.
>
>
> >
> > Thanks,
> > Ismael
> >
> >
> > On Thu, Jan 12, 2017 at 6:46 PM, Dong Lin <lindon...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > We created KIP-112: Handle disk failure for JBOD. Please find the KIP
> > > wiki in the link https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 112%3A+Handle+disk+failure+for+JBOD.
> > >
> > > This KIP is related to KIP-113
> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 113%3A+Support+replicas+movement+between+log+directories>:
> > > Support replicas movement between log directories. They are needed in
> > > order to support JBOD in Kafka. Please help review the KIP. Your
> > > feedback is appreciated!
> > >
> > > Thanks,
> > > Dong
> > >
> >
>
