Just wondering, for what particular Kafka version is this applicable?

On Thu, Feb 23, 2017 at 2:38 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hmm, that is a very good question. It seems to me that we did not add the
> corresponding metric when we changed the mechanism. Your observation is
> to be expected: lag in messages is not useful enough to predict or explain
> why a follower has been kicked out of the ISR.
>
> Could you file a JIRA for this? I think we can create a new metric
> recording (time.milliseconds - r.lastCaughtUpTimeMs) and deprecate the old
> one.
>
> Guozhang
>
>
> On Tue, Feb 21, 2017 at 5:47 PM, Jun MA <mj.saber1...@gmail.com> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for pointing this out. I was actually looking at this before, and
> > that’s why I’m asking the question. This metric is 'lag in messages', and
> > since the ISR logic now relies on lag in seconds, not lag in messages, I’m
> > not sure how useful this metric is. In fact, we saw the value of this
> > metric stay at 0 all the time, even when there was an ISR shrink/expand.
> > I’d expect to see an increase in lag when a shrink/expand happens. Is
> > there a metric that correctly represents the lag between followers and
> > the leader?
> >
> > Thanks,
> > Jun
> >
> > > On Feb 21, 2017, at 10:19 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > You can find them in https://kafka.apache.org/documentation/#monitoring
> > >
> > > I think this is the one you are looking for:
> > >
> > > Lag in messages per follower replica
> > > kafka.server:type=FetcherLagMetrics,name=ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+)
> > > lag should be proportional to the maximum batch size of a produce request.
> > >
> > > On Mon, Feb 20, 2017 at 5:43 PM, Jun Ma <mj.saber1...@gmail.com> wrote:
> > >
> > >> Hi Guozhang,
> > >>
> > >> Thanks for your reply. Could you tell me which one indicates the lag
> > >> between follower and leader for a specific partition?
> > >>
> > >> Thanks,
> > >> Jun
> > >>
> > >> On Mon, Feb 20, 2017 at 4:57 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >>
> > >>> I don't think the metrics have been changed in 0.9.0.1; in fact, even
> > >>> in 0.10.x they are still the same as stated in:
> > >>>
> > >>> https://kafka.apache.org/documentation/#monitoring
> > >>>
> > >>> The mechanism for determining which followers have been dropped out
> > >>> of the ISR has changed, but the metrics have not.
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>>
> > >>> On Sun, Feb 19, 2017 at 7:56 PM, Jun MA <mj.saber1...@gmail.com> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I’m looking for the JMX metrics to represent replica lag time for
> > >>>> 0.9.0.1. Based on the documentation, I can only find
> > >>>> kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica,
> > >>>> which is the max lag in messages between follower and leader
> > >>>> replicas. But since in 0.9.0.1 lag in messages is deprecated and
> > >>>> replaced with lag time, I’m wondering what the corresponding metric
> > >>>> for this is?
> > >>>>
> > >>>> Thanks,
> > >>>> Jun
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >
> > > --
> > > -- Guozhang
>
> --
> -- Guozhang
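
For reference, the time-based lag Guozhang suggests recording, (time.milliseconds - r.lastCaughtUpTimeMs), can be sketched roughly as below. This is only an illustration under stated assumptions, not Kafka's implementation: the function names are hypothetical, and the threshold is just a parameter standing in for the broker's configured maximum lag time. Only the subtraction itself comes from the thread.

```python
def time_lag_ms(now_ms: int, last_caught_up_time_ms: int) -> int:
    """Milliseconds since the follower was last fully caught up.

    Mirrors the expression from the thread:
        time.milliseconds - r.lastCaughtUpTimeMs
    Clamped at 0 so clock skew cannot yield a negative lag
    (the clamp is an assumption of this sketch).
    """
    return max(0, now_ms - last_caught_up_time_ms)


def exceeds_max_lag(now_ms: int, last_caught_up_time_ms: int, max_lag_ms: int) -> bool:
    """True if the follower has lagged longer than the allowed time.

    max_lag_ms is a hypothetical stand-in for the broker's
    time-based ISR threshold.
    """
    return time_lag_ms(now_ms, last_caught_up_time_ms) > max_lag_ms
```

A value like this keeps growing while a follower is behind, regardless of how many messages it is missing, which is why it would track ISR shrink/expand events more closely than a lag counted in messages.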