Re: JMX metrics for replica lag time

Mahendra Kariya Wed, 22 Feb 2017 20:10:00 -0800

Just wondering, for what particular Kafka version is this applicable?

On Thu, Feb 23, 2017 at 2:38 AM, Guozhang Wang <wangg...@gmail.com> wrote:


> Hmm, that is a very good question. It seems to me that we did not add the
> corresponding metric when we changed the mechanism. And your observation
> is to be expected: lag-in-messages will not be useful enough to predict or
> explain why a follower has been kicked out of the ISR.
>
> Could you file a JIRA for this? I think we can create a new metric
> recording (time.milliseconds - r.lastCaughtUpTimeMs) and deprecate the old
> metric.
>
> Guozhang
>
>
> On Tue, Feb 21, 2017 at 5:47 PM, Jun MA <mj.saber1...@gmail.com> wrote:
>
> > Hi Guozhang,
> >
> > Thanks for pointing this out. I was actually looking at this before, and
> > that’s why I’m asking the question. This metric is 'lag in messages', and
> > since the ISR logic now relies on lag in seconds, not lag in messages, I’m
> > not sure how useful this metric is. In fact, we saw the value of this
> > metric stay at 0 all the time, even when there was an ISR shrink/expand.
> > I’d expect to see an increase in lag when a shrink/expand happens. Is
> > there a metric that correctly represents the lag between followers and
> > the leader?
> >
> > Thanks,
> > Jun
> >
> > > On Feb 21, 2017, at 10:19 AM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > You can find them in https://kafka.apache.org/documentation/#monitoring
> > >
> > > I think this is the one you are looking for:
> > >
> > > Lag in messages per follower replica
> > > kafka.server:type=FetcherLagMetrics,name=ConsumerLag,clientId=([-.\w]+),topic=([-.\w]+),partition=([0-9]+)
> > > Lag should be proportional to the maximum batch size of a produce request.
> > >
> > > On Mon, Feb 20, 2017 at 5:43 PM, Jun Ma <mj.saber1...@gmail.com> wrote:
> > >
> > >> Hi Guozhang,
> > >>
> > >> Thanks for your reply. Could you tell me which one indicates the lag
> > >> between follower and leader for a specific partition?
> > >>
> > >> Thanks,
> > >> Jun
> > >>
> > >> On Mon, Feb 20, 2017 at 4:57 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >>
> > >>> I don't think the metrics have been changed in 0.9.0.1; in fact, even
> > >>> in 0.10.x they are still the same as stated in:
> > >>>
> > >>> https://kafka.apache.org/documentation/#monitoring
> > >>>
> > >>> The mechanism for determining which followers have been dropped out of
> > >>> the ISR has changed, but the metrics have not.
> > >>>
> > >>>
> > >>> Guozhang
> > >>>
> > >>>
> > >>> On Sun, Feb 19, 2017 at 7:56 PM, Jun MA <mj.saber1...@gmail.com> wrote:
> > >>>
> > >>>> Hi,
> > >>>>
> > >>>> I’m looking for the JMX metric that represents replica lag time for
> > >>>> 0.9.0.1. Based on the documentation, I can only find
> > >>>> kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica,
> > >>>> which is the max lag in messages between follower and leader replicas.
> > >>>> But since in 0.9.0.1 lag in messages is deprecated and replaced with
> > >>>> lag time, I’m wondering what the corresponding metric for this is.
> > >>>>
> > >>>> Thanks,
> > >>>> Jun
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> >
> >
>
>
> --
> -- Guozhang
>
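
A minimal sketch of the time-based lag proposed in the thread, (time.milliseconds - r.lastCaughtUpTimeMs), written as plain Java and not as broker code. The class and method names are hypothetical, the clamp at zero is an added assumption, and comparing the result against a threshold such as replica.lag.time.max.ms is only one plausible way to use the value, not a description of how the broker does it.

```java
// Hypothetical helper for illustration only; not part of Kafka.
final class ReplicaLagTime {

    // Lag time as described in the thread: current time minus the time the
    // follower was last caught up. Clamped at 0 (an assumption) so that a
    // skewed clock cannot produce a negative lag.
    static long lagTimeMs(long nowMs, long lastCaughtUpTimeMs) {
        return Math.max(0L, nowMs - lastCaughtUpTimeMs);
    }

    // True when the lag strictly exceeds the configured maximum
    // (for example the value of replica.lag.time.max.ms).
    static boolean exceedsMaxLag(long lagTimeMs, long maxLagMs) {
        return lagTimeMs > maxLagMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long lastCaughtUp = now - 12_000L; // pretend the follower caught up 12 s ago
        long lag = lagTimeMs(now, lastCaughtUp);
        System.out.println("lag time ms: " + lag);
        System.out.println("exceeds 10000 ms: " + exceedsMaxLag(lag, 10_000L));
    }
}
```

Unlike lag in messages, this value keeps growing while a follower is stalled even if no new messages arrive, which is why it would line up better with ISR shrink events.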

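For the lag-in-messages metric quoted in the thread, here is a sketch of a JMX ObjectName pattern that would match the per-partition FetcherLagMetrics beans. It uses only javax.management from the JDK. The class name and the sample clientId and topic values are made up for illustration, and connecting to a real broker is left out because the thread does not give those details.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper for illustration only; not part of Kafka.
final class FetcherLagPattern {

    // Property-list pattern: fixes type and name, and the trailing "*"
    // accepts any further keys (clientId, topic, partition).
    static final String PATTERN =
        "kafka.server:type=FetcherLagMetrics,name=ConsumerLag,*";

    // Returns true when the given MBean name matches the pattern above.
    static boolean matches(String mbeanName) {
        try {
            return new ObjectName(PATTERN).apply(new ObjectName(mbeanName));
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Sample name with made-up clientId and topic values.
        String sample = "kafka.server:type=FetcherLagMetrics,name=ConsumerLag,"
            + "clientId=ReplicaFetcherThread-0-1,topic=test,partition=0";
        System.out.println(matches(sample));
    }
}
```

The same pattern could be passed to MBeanServerConnection.queryNames(pattern, null) to list the matching beans on a broker whose JMX port is reachable.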