Re: [DISCUSS] KIP-237: More Controller Health Metrics

Dong Lin Thu, 04 Jan 2018 11:41:49 -0800

Thanks for the explanation Ismael. Yeah I think it makes sense that we
shouldn't have to rely on class names for metrics.


It seems that the current metric framework in Kafka will use the the class
name in the metric name -- this logic is hard-coded in KafkaMetricsGroup
class. This is probably the reason why our existing metrics mostly rely on
the class name. It is possible to change it so that we can specify
arbitrary metric name when we instantiate the metrics. If the difference in
the metric names in this KIP is not that big, maybe we can do it in the
future when it is more useful? If you think it is necessary, I am open to
make the change this time as well.



On Thu, Jan 4, 2018 at 10:13 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Sorry, I thought I had replied to this. Relying on class names for metrics
> is an anti-pattern in my opinion. It couples metrics too tightly with the
> implementation and makes it harder to refactor. That was the reason for the
> suggestion.
>
> Ismael
>
> On Wed, Dec 13, 2017 at 1:51 AM, Dong Lin <lindon...@gmail.com> wrote:
>
> > Hey Ismael,
> >
> > Thanks for your comments. Yeah the set of metrics in the KIP includes the
> > two metrics discussed in the future work in KIP-143. This KIP was also
> > named after KIP-143.
> >
> > I think it may be better to use to use ControllerEventManager instead
> > of ControllerStats.
> > The metrics 1 and 2 are directly related to queue in
> ControllerEventManager
> > whereas the existing ControllerStats metrics are event-specific metrics.
> It
> > seems that for most requests which are not per-request or per-event,
> their
> > JMX type attribute is equivalent to the name of the Java class, e.g. the
> > TotalQueueSize in ControllerChannelManager.
> >
> > But I am open to change the type to ControllerStats if we decide that all
> > future metrics in Controller will have type ControllerStats. Do you think
> > this would be better?
> >
> > Thanks,
> > Dong
> >
> >
> >
> >
> > On Tue, Dec 12, 2017 at 1:20 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Thanks for the KIP, Dong. The general idea is good. In fact, two of the
> > > three metrics had been listed under future work for KIP-143:
> > >
> > > "KAFKA-5028 introduced a queue for Controller events. It would be
> useful
> > to
> > > have a gauge for the queue size and a histogram for how long an event
> > waits
> > > in the queue before being processed. However, we are in the process of
> > > making additional changes to improve the handling of soft failures and
> > > there's a possibility that the controller queue could be replaced by a
> > > broker queue for all ZK communication. We will see how that develops
> > before
> > > deciding which metrics should be exposed. In the meantime, the
> > > ControllerState and other metrics should provide enough information to
> > > issue an alert if the Controller is not healthy."
> > >
> > > It seems like the conclusion is that we won't have a global broker
> queue,
> > > but it would be good for Jun and Onur to confirm this.
> > >
> > > One minor comment:
> > >
> > > 1. For metric 1 and 2 in the KIP, do we want the type to be
> > > ControllerEventManager or should it be ControllerStats like many other
> > > Controller metrics?
> > >
> > > Ismael
> > >
> > > On Thu, Dec 7, 2017 at 4:21 AM, Dong Lin <lindon...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I have created KIP-237: More Controller Health Metrics
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 237%3A+More+Controller+Health+Metrics
> > > > .
> > > >
> > > > The KIP proposes to add a few more metrics to help monitor Kafka
> > > Controller
> > > > health. Feedback and suggestions are welcome!
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-237: More Controller Health Metrics

Reply via email to