Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-08 Thread Aishwarya Gune
Hi Jun! Yes, we should exclude it. When a replica is deleted with a StopReplicaRequest, the partition is removed from the set of failed partitions. I will update the KIP to mention it. On Wed, May 8, 2019 at 1:59 PM Jun Rao wrote: > Hi, Aishwarya, > > Thanks for the KIP. Looks good to me. Just one min
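
A minimal sketch of the bookkeeping discussed in this exchange, assuming the failed partitions are kept in a simple set and FailedPartitionsCount is reported as the size of that set. The class name FailedPartitionsSketch and its methods are hypothetical and chosen for illustration; they are not taken from the KIP or from Kafka's code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not Kafka's actual implementation.
final class FailedPartitionsSketch {
    // Partitions (named "topic-partition" here for simplicity) that have been marked failed.
    private final Set<String> failed = new HashSet<>();

    // Record a partition whose fetch failed.
    synchronized void markFailed(String topicPartition) {
        failed.add(topicPartition);
    }

    // Assumed hook for a replica deletion (for example via a StopReplicaRequest):
    // the partition leaves the failed set, so it no longer contributes to the count.
    synchronized void onReplicaDeleted(String topicPartition) {
        failed.remove(topicPartition);
    }

    synchronized boolean isFailed(String topicPartition) {
        return failed.contains(topicPartition);
    }

    // Value a FailedPartitionsCount metric would report under this assumption.
    synchronized int failedPartitionsCount() {
        return failed.size();
    }
}
```

Deriving the count from the set, rather than keeping a separate counter, means a deletion cannot leave the metric out of step with the set.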

Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-08 Thread Jun Rao
Hi, Aishwarya, Thanks for the KIP. Looks good to me. Just one minor comment. If a replica is deleted on a broker (through a StopReplicaRequest) while it's in the failed partition set, should we exclude that partition from the set and the FailedPartitionsCount? Jun On Mon, May 6, 2019 at 1:21 PM

Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-08 Thread Jason Gustafson
Hey Aishwarya, Thanks for the KIP. I'd suggest we move to a vote since this is a straightforward improvement with a large impact. -Jason On Tue, May 7, 2019 at 3:02 PM Aishwarya Gune wrote: > Hi Colin! > > Whenever the thread has all of its partitions marked as failed (i.e. thread > is idle),

Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-08 Thread Colin McCabe
Thanks-- that makes sense. cheers, Colin On Tue, May 7, 2019, at 15:02, Aishwarya Gune wrote: > Hi Colin! > > Whenever the thread has all of its partitions marked as failed (i.e. thread > is idle), the thread would be shut down. > The errors that are not per-partition would probably retry or beh

Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-07 Thread Aishwarya Gune
Hi Colin! Whenever the thread has all of its partitions marked as failed (i.e. thread is idle), the thread would be shut down. The errors that are not per-partition would probably retry or behave just as before. On Tue, May 7, 2019 at 9:57 AM Colin McCabe wrote: > Hi Aishwarya, > > This looks
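
The shutdown rule described in this reply can be sketched as follows, assuming a fetcher tracks its assigned partitions and the subset marked failed. FetcherSketch and its methods are hypothetical names used only for illustration; no real thread is started, and errors that are not per-partition are not modeled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not Kafka's actual fetcher thread.
final class FetcherSketch {
    private final Set<String> assigned = new HashSet<>();
    private final Set<String> failed = new HashSet<>();
    private boolean running = true;

    void addPartition(String topicPartition) {
        assigned.add(topicPartition);
    }

    // Mark one partition as failed; the fetcher keeps serving the others.
    void markFailed(String topicPartition) {
        if (assigned.contains(topicPartition)) {
            failed.add(topicPartition);
        }
        // Once every assigned partition has failed, the fetcher is idle and stops.
        if (!assigned.isEmpty() && failed.containsAll(assigned)) {
            running = false;
        }
    }

    boolean isRunning() {
        return running;
    }
}
```

With two partitions assigned, the first failure leaves the fetcher running, and only the second failure stops it.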

Re: [DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-07 Thread Colin McCabe
Hi Aishwarya, This looks like a great improvement! Will a fetcher thread exit if all of its partitions have been marked failed? Or will it continue to run? After this KIP is adopted, are there any remaining situations where we would exit a fetcher thread? I guess some errors are not per-part

[DISCUSS] KIP-461 - Improving replica fetcher behavior in case of partition failures

2019-05-06 Thread Aishwarya Gune
Hey All! I have created a KIP to improve the behavior of the replica fetcher when a partition failure occurs. Please have a look at it and let me know what you think. KIP 461 - https://cwiki.apache.org/confluence/display/KAFKA/KIP-461+-+Improve+Replica+Fetcher+behavior+at+handling+partition+failure