Thanks for the KIP, this is very helpful I had an offline discussion with Jason and we discussed the semantics of the underMinIsr/atMinIsr metrics. The current proposal would expose a gap where we could report URP but no MinIsr. A brief example: original replica set = [0,1,2] new replica set = [3,4,5] isr = [0, 3, 4] config.minIsr = 3
As the KIP said > In other words, we will subtract the AddingReplica from both the total replicas and the current ISR when determining URP satisfaction. We would report URP=2 (1 and 2 are not in ISR) but not underMinIsr, as we have an ISR of 3. Technically, any produce requests with acks=all would succeed, so it would be false to report `underMinIsr`. We thought it'd be good to keep both metrics consistent, so a new proposal is to use the following algorithm: ``` isUrp == size(original replicas) - size(isr) > 0 ``` Taking that into account, +1 from me! (non-binding) On Fri, Aug 23, 2019 at 7:00 PM Colin McCabe <cmcc...@apache.org> wrote: > +1 (binding). > > cheers, > Colin > > On Tue, Aug 20, 2019, at 10:55, Jason Gustafson wrote: > > Hi All, > > > > I'd like to start a vote on KIP-352, which is a follow-up to KIP-455 to > > fix > > a long-known shortcoming of URP reporting and to improve reassignment > > monitoring: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-352%3A+Distinguish+URPs+caused+by+reassignment > > . > > > > Note that I have added one new metric following the discussion. It seemed > > useful to have a lag metric for reassigning partitions. > > > > Thanks, > > Jason > > > -- Best, Stanislav