Thanks for the vote! I discussed this with Joel offline and have updated the KIP to specify that the controller will consider a replica to be offline if KafkaStorageException is specified for that replica in the LeaderAndIsrResponse. The other two improvements may be done in a future KIP.
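
For illustration only, here is a minimal Python sketch of the difference between a blanket (Error != None) check and a storage-exception-specific check. It is not the controller code: the Error enum, both function names, and the sample response are hypothetical stand-ins, and the response is modelled as a plain mapping from partition to error.

```python
from enum import Enum, auto


class Error(Enum):
    """Hypothetical per-partition error codes; not Kafka's real enum."""
    NONE = auto()
    KAFKA_STORAGE = auto()  # stands in for KafkaStorageException
    OTHER = auto()          # any error unrelated to storage


def offline_replicas_blanket(response):
    """Blanket rule: any error at all marks the replica offline."""
    return {tp for tp, err in response.items() if err is not Error.NONE}


def offline_replicas_storage_only(response):
    """Specific rule: only a storage error marks the replica offline."""
    return {tp for tp, err in response.items() if err is Error.KAFKA_STORAGE}


# Hypothetical per-partition errors reported by one broker.
response = {
    "topicA-0": Error.NONE,
    "topicA-1": Error.KAFKA_STORAGE,
    "topicB-0": Error.OTHER,
}

print(sorted(offline_replicas_blanket(response)))
print(sorted(offline_replicas_storage_only(response)))
```

Under the blanket rule the replica for topicB-0 would also be treated as offline even though its error has nothing to do with the disk; the specific rule flags only topicA-1.
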

On Wed, Apr 26, 2017 at 10:30 AM, Joel Koshy <jjkosh...@gmail.com> wrote:

> +1
>
> Discussed a few edits/improvements with Dong.
>
> - Rather than a blanket (Error != None) condition for detecting offline
> replicas you probably want a storage exception-specific error code.
>
> - Definitely in favor of improvement #7 and it shouldn’t be too hard to do.
> When bouncing with a log directory on a faulty disk, the condition may be
> detected while loading logs and you may not have the full list of local
> replicas. So a subsequent L&ISR request would recreate the replica on the
> good disks (which may or may not be what the user wants).
>
> - Another improvement worth investigating is how best to support partition
> reassignments even with a bad disk. The wiki hints that this is unnecessary
> because reassignments being disallowed with an offline replica is similar
> to the current state of handling an offline broker. With JBOD though the
> broker with a bad disk does not have to be offline anymore so it should be
> possible to support reassignments even with offline replicas. I'm not
> suggesting this is trivial, but would better leverage JBOD.
>
> On Wed, Apr 5, 2017 at 5:46 PM, Becket Qin <becket....@gmail.com> wrote:
>
> > +1
> >
> > Thanks for the KIP. Made a pass and had some minor change.
> >
> > On Mon, Apr 3, 2017 at 3:16 PM, radai <radai.rosenbl...@gmail.com> wrote:
> >
> > > +1, LGTM
> > >
> > > On Mon, Apr 3, 2017 at 9:49 AM, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > It seems that there is no further concern with the KIP-112. We would
> > > > like to start the voting process. The KIP can be found at
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-112%3A+Handle+disk+failure+for+JBOD
> > > >
> > > > Thanks,
> > > > Dong