Jeff Klukas created KAFKA-3714:
----------------------------------
Summary: Allow users greater access to register custom streams
metrics
Key: KAFKA-3714
URL: https://issues.apache.org/jira/browse/KAFKA-3714
Project: Kafka
Issue Type: Bug
Components: streams
Reporter: Jeff Klukas
Assignee: Guozhang Wang
Priority: Minor
Fix For: 0.10.1.0
Copying in some discussion that originally appeared in
https://github.com/apache/kafka/pull/1362#issuecomment-219064302
Kafka Streams is largely a higher-level abstraction on top of producers and
consumers, and it seems sensible to match the KafkaStreams interface to that of
KafkaProducer and KafkaConsumer where possible. For producers and consumers,
the metric registry is internal and metrics are only exposed as an unmodifiable
map. This allows users to access client metric values for use in application
health checks, etc., but doesn't allow them to register new metrics.
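
To make the read-only pattern concrete, here is a minimal sketch using only the Java standard library. It is a model of the idea, not the actual Kafka classes: ReadOnlyMetricsClient, its "records-sent-total" metric, and the send method are hypothetical names invented for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleSupplier;

// Hypothetical stand-in for a client such as KafkaProducer (not Kafka code):
// the registry is private, and callers only ever see a read-only view of it.
class ReadOnlyMetricsClient {
    private final Map<String, DoubleSupplier> registry = new LinkedHashMap<>();
    private long recordsSent = 0;

    ReadOnlyMetricsClient() {
        // Only the client itself registers metrics.
        registry.put("records-sent-total", () -> (double) recordsSent);
    }

    // Pretend to send a record; this only updates the internal counter.
    void send(String record) {
        recordsSent++;
    }

    // Read-only view: values can be polled, but put() throws
    // UnsupportedOperationException.
    Map<String, DoubleSupplier> metrics() {
        return Collections.unmodifiableMap(registry);
    }
}
```

A health check can poll client.metrics().get("records-sent-total").getAsDouble(), while any attempt to add an entry to the returned map is rejected.
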
That approach seems reasonable if we assume that a user interested in defining
custom metrics is already going to be using a separate metrics library. In such
a case, users will likely find it easier to define metrics using whatever
library they're familiar with rather than learning the API for Kafka's Metrics
class. Is this a reasonable assumption?
If we want to expose the Metrics instance so that users can define arbitrary
metrics, I'd argue that there's a need for documentation updates. In particular,
I find the notion of metric tags confusing. Tags can be defined in a
MetricConfig when the Metrics instance is constructed, StreamsMetricsImpl is
maintaining its own set of tags, and users can set tag overrides.
If a user were to get access to the Metrics instance, they would be missing the
tags defined in StreamsMetricsImpl. I'm imagining that users would want their
custom metrics to sit alongside the predefined metrics with the same tags, and
users shouldn't be expected to manage those additional tags themselves.
So, why are we allowing users to define their own metrics via the
StreamsMetrics interface in the first place? Is it that we'd like to be able to
provide a built-in latency metric, but the definition depends on the details of
the use case so there's no generic solution? That would be sufficient
motivation for this special case of addLatencySensor. If we want to continue
down that path and give users access to define a wider range of custom metrics,
I'd prefer to extend the StreamsMetrics interface so that users can call
methods on that object, automatically getting the tags appropriate for that
instance rather than interacting with the raw Metrics instance.
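
A rough sketch of what such a tag-aware facade might look like, written against the Java standard library only. TaggedMetricsFacade, addCounter, and tagsFor are hypothetical names and do not exist in Kafka. The choice to let user-supplied tags override instance tags on a key clash is also an assumption.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical facade, loosely analogous to an extended StreamsMetrics
// (not Kafka code): it owns the instance-level tags and applies them to
// every custom metric, so users never manage those tags themselves.
class TaggedMetricsFacade {
    private final Map<String, String> instanceTags;
    private final Map<String, LongAdder> counters = new LinkedHashMap<>();
    private final Map<String, Map<String, String>> tagsByMetric = new LinkedHashMap<>();

    TaggedMetricsFacade(Map<String, String> instanceTags) {
        this.instanceTags = new LinkedHashMap<>(instanceTags);
    }

    // Registers a counter. Instance tags are merged in automatically;
    // user-supplied tags win on key clashes (an assumption of this sketch).
    LongAdder addCounter(String name, Map<String, String> userTags) {
        Map<String, String> merged = new LinkedHashMap<>(instanceTags);
        merged.putAll(userTags);
        tagsByMetric.put(name, Collections.unmodifiableMap(merged));
        return counters.computeIfAbsent(name, n -> new LongAdder());
    }

    // Returns the full tag set recorded for a metric, or null if unknown.
    Map<String, String> tagsFor(String name) {
        return tagsByMetric.get(name);
    }
}
```

With this shape, a user who registers "emails-sent" with a single tag of their own still ends up with a metric that carries the instance's tags, which is the alignment with predefined metrics argued for above.
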
---
Guozhang had the following comments:
1) For the producer/consumer cases, all internal metrics are provided and
abstracted from users, who just need to read the documentation and poll
whichever provided metrics they are interested in. If they want to define
more metrics, those are likely to live outside the clients themselves, so users
can use whatever methods they like and Metrics does not need to be exposed.
2) For streams, things are a bit different: users define the computational
logic, which becomes part of the "Streams Client" processing and which users
may want to monitor themselves. Think of a customized processor
that sends an email to some address based on a condition, where users want to
monitor the average rate of emails sent. Hence it is worth considering whether
or not they should be able to access the Metrics instance to define their own
metrics alongside the predefined metrics provided by the library.
3) Now, since the Metrics class was not previously designed for public usage,
it is not very user-friendly for defining sensors, especially given
the semantic differences between name / scope / tags. StreamsMetrics tries to
hide some of this semantic confusion from users, but it still exposes tags and
hence is not perfect in doing so. We need to think of a better approach so
that: a) user-defined metrics will be "aligned" (i.e. with the same name prefix
within a single application, with a similar scope hierarchy definition, etc.) with
library-provided metrics, and b) the APIs for doing so feel natural.
I do not have concrete ideas about 3) off the top of my head; comments are
more than welcome.
---
I'm not sure that I agree that 1) and 2) are truly different situations. A user
might choose to send email messages within a bare consumer rather than a
streams application, and still want to maintain a metric of sent emails. In
this bare consumer case, we'd expect the user to define that email-sent metric
outside of Kafka's metrics machinery.
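
As a minimal sketch of that bare consumer case, the email-sent metric could be an ordinary application-side object with no dependency on Kafka at all. EmailSentMetric and its methods are hypothetical names for illustration. Timestamps are passed in explicitly (for example from System.nanoTime()) so the rate calculation is easy to check.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical application-side metric, kept entirely outside Kafka's
// Metrics class. It tracks a total and an average rate since start.
class EmailSentMetric {
    private final LongAdder sent = new LongAdder();
    private final long startNanos;

    EmailSentMetric(long startNanos) {
        this.startNanos = startNanos;
    }

    // Call once per email sent, e.g. from inside the consumer poll loop.
    void recordSent() {
        sent.increment();
    }

    long total() {
        return sent.sum();
    }

    // Average emails per second between startNanos and nowNanos;
    // returns 0.0 if no time has elapsed.
    double ratePerSecond(long nowNanos) {
        double elapsedSeconds = (nowNanos - startNanos) / 1_000_000_000.0;
        return elapsedSeconds <= 0 ? 0.0 : sent.sum() / elapsedSeconds;
    }
}
```

The same object would work unchanged inside a streams processor, which is why the two situations look similar to me from the user's side.
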