Hey Jun, Ismael, Thanks for all the review! Can you vote for KIP-112 if you are OK with the latest design doc?
Thanks, Dong On Thu, Mar 30, 2017 at 3:29 PM, Dong Lin <lindon...@gmail.com> wrote: > Hi all, > > Thanks for all the comments. I am going to open the voting thread if > there is no further concern with the KIP. > > Dong > > On Wed, Mar 15, 2017 at 5:25 PM, Ismael Juma <ism...@juma.me.uk> wrote: > >> Thanks for the updates Dong, they look good to me. >> >> Ismael >> >> On Wed, Mar 15, 2017 at 5:50 PM, Dong Lin <lindon...@gmail.com> wrote: >> >> > Hey Ismael, >> > >> > Sure, I have updated "Changes in Operational Procedures" section in >> KIP-113 >> > to specify the problem and solution with known disk failure. And I >> updated >> > the "Test Plan" section to note that we have test in KIP-113 to verify >> that >> > replicas already created on the good log directories will not be >> affected >> > by failure of other log directories. >> > >> > Please let me know if there is any other improvement I can make. Thanks >> for >> > your comment. >> > >> > Dong >> > >> > >> > On Wed, Mar 15, 2017 at 3:18 AM, Ismael Juma <ism...@juma.me.uk> wrote: >> > >> > > Hi Dong, >> > > >> > > Yes, that sounds good to me. I'd list option 2 first since that is >> safe >> > > and, as you said, no worse than what happens today. The file approach >> is >> > a >> > > bit hacky as you said, so it may be a bit fragile. Not sure if we >> really >> > > want to mention that. :) >> > > >> > > About the note in KIP-112 versus adding the test in KIP-113, I think >> it >> > > would make sense to add a short sentence stating that this scenario is >> > > covered in KIP-113. People won't necessarily read both KIPs at the >> same >> > > time and it's helpful to cross-reference when it makes sense. >> > > >> > > Thanks for your work on this. >> > > >> > > Ismael >> > > >> > > On Tue, Mar 14, 2017 at 11:00 PM, Dong Lin <lindon...@gmail.com> >> wrote: >> > > >> > > > Hey Ismael, >> > > > >> > > > I get your concern that it is more likely for a disk to be slow, or >> > > exhibit >> > > > other forms of non-fatal symptom, after some known fatal error. >> Then it >> > > is >> > > > weird for user to start broker with the likely-problematic disk in >> the >> > > > broker config. In that case, I think there are two things user can >> do: >> > > > >> > > > 1) Intentionally change the log directory in the config to point to >> a >> > > file. >> > > > This is a bit hacky but it works well before we make >> more-appropriate >> > > > long-term change in Kafka to handle this case. >> > > > 2) Just don't start broker with bad log directories. Always fix disk >> > > before >> > > > restarting the broker. This is a safe approach that is no worse than >> > > > current practice. >> > > > >> > > > Would this address your concern if I specify the problem and the two >> > > > solutions in the KIP? >> > > > >> > > > Thanks, >> > > > Dong >> > > > >> > > > On Tue, Mar 14, 2017 at 3:29 PM, Dong Lin <lindon...@gmail.com> >> wrote: >> > > > >> > > > > Hey Ismael, >> > > > > >> > > > > Thanks for the comment. Please see my reply below. >> > > > > >> > > > > On Tue, Mar 14, 2017 at 10:31 AM, Ismael Juma <ism...@juma.me.uk> >> > > wrote: >> > > > > >> > > > >> Thanks Dong. Comments inline. >> > > > >> >> > > > >> On Fri, Mar 10, 2017 at 6:25 PM, Dong Lin <lindon...@gmail.com> >> > > wrote: >> > > > >> > >> > > > >> > I get your point. But I am not sure we should recommend user to >> > > simply >> > > > >> > remove disk from the broker config. If user simply does this >> > without >> > > > >> > checking the utilization of good disks, replica on the bad disk >> > will >> > > > be >> > > > >> > re-created on the good disk and may overload the good disks, >> > causing >> > > > >> > cascading failure. >> > > > >> > >> > > > >> >> > > > >> Good point. >> > > > >> >> > > > >> >> > > > >> > >> > > > >> > I agree with you and Colin that slow disk may cause problem. >> > > However, >> > > > >> > performance degradation due to slow disk this is an existing >> > problem >> > > > >> that >> > > > >> > is not detected or handled by Kafka or KIP-112. >> > > > >> >> > > > >> >> > > > >> I think an important difference is that a number of disk errors >> are >> > > > >> currently fatal and won't be after KIP-112. So it introduces new >> > > > scenarios >> > > > >> (for example, bouncing a broker that is working fine although >> some >> > > disks >> > > > >> have been marked bad). >> > > > >> >> > > > > >> > > > > Hmm.. I am still trying to understand why KIP-112 creates new >> > > scenarios. >> > > > > Slow disk is not considered fatal error and won't be caught by >> either >> > > > > existing Kafka design or this KIP. If any disk is marked bad, it >> > means >> > > > > broker encounters IOException when accessing disk, most likely the >> > > broker >> > > > > will encounter IOException again when accessing this disk and mark >> > this >> > > > > disk as bad after bounce. I guess you are talking about the case >> > that a >> > > > > disk is marked bad, broker is bounced, then the disk provides >> > degraded >> > > > > performance without being marked bad, right? But this seems to be >> an >> > > > > existing problem we already have today with slow disk. >> > > > > >> > > > > Here are the possible scenarios with bad disk after broker bounce: >> > > > > >> > > > > 1) bad disk -> broker bounce -> good disk. This would be great. >> > > > > 2) bad disk -> broker bounce -> slow disk. Slow disk is an >> existing >> > > > > problem that is not addressed by Kafka today. >> > > > > 3) bad disk -> broker bounce -> bad disk. This is handled by this >> KIP >> > > > such >> > > > > that only replicas on the bad disk become offline. >> > > > > >> > > > > >> > > > >> >> > > > >> > Detection and handling of >> > > > >> > slow disk is a separate problem that needs to be addressed in a >> > > future >> > > > >> KIP. >> > > > >> > It is currently listed in the future work. Does this sound OK? >> > > > >> > >> > > > >> >> > > > >> I'm OK with it being handled in the future. In the meantime, I >> was >> > > just >> > > > >> hoping that we can make it clear to users about the potential >> issue >> > > of a >> > > > >> disk marked as bad becoming good again after a bounce (which can >> be >> > > > >> dangerous). >> > > > >> >> > > > >> The main benefit of creating the second topic after log directory >> > goes >> > > > >> > offline is that we can make sure the second topic is created on >> > the >> > > > good >> > > > >> > log directory. I am not sure we can simply assume that the >> first >> > > topic >> > > > >> will >> > > > >> > always be created on the first log directory in the broker >> config >> > > and >> > > > >> the >> > > > >> > second topic will be created on the second log directory in the >> > > broker >> > > > >> > config. >> > > > >> >> > > > >> >> > > > >> >> > > > >> > However, I can add this test in KIP-113 which allows user to >> > > > >> > re-assign replica to specific log directory of a broker. Is >> this >> > OK? >> > > > >> > >> > > > >> >> > > > >> OK. Please add a note to KIP-112 about this as well (so that it's >> > > clear >> > > > >> why >> > > > >> we only do it in KIP-113). >> > > > >> >> > > > > >> > > > > Sure. Instead of adding note to KIP-112, I have added test in >> KIP-113 >> > > to >> > > > > verify that bad log directories discovered during runtime would >> not >> > > > affect >> > > > > replicas on the good log directories. Does this address the >> problem? >> > > > > >> > > > > >> > > > >> Ismael >> > > > >> >> > > > > >> > > > > >> > > > >> > > >> > >> > >