Re: [PR] [improve] PIP 342: Support OpenTelemetry metrics in Pulsar client [pulsar]

via GitHub Sun, 10 Mar 2024 06:29:19 -0700


asafm commented on code in PR #22178:
URL: https://github.com/apache/pulsar/pull/22178#discussion_r1518852220



##########
pip/pip-342 OTel client metrics support.md:
##########
@@ -0,0 +1,201 @@
+# PIP 342: Support OpenTelemetry metrics in Pulsar client
+
+## Motivation
+
+Current support for metric instrumentation in Pulsar client is very limited and poses a lot of
+issues for integrating the metrics into any telemetry system.
+
+We have 2 ways that metrics are exposed today:
+
+1. Printing logs every 1 minute: While this is ok as it comes out of the box, it's very hard for
+   any application to get the data or use it in any meaningful way.
+2. `producer.getStats()` or `consumer.getStats()`: Calling these methods will get access to
+   the rate of events in the last 1-minute interval. This is problematic because out of the
+   box the metrics are not collected anywhere. One would have to start its own thread to
+   periodically check these values and export them to some other system.
+
+Neither of these mechanism that we have today are sufficient to enable application to easily
+export the telemetry data of Pulsar client SDK.
+
+## Goal
+
+Provide a good way for applications to retrieve and analyze the usage of Pulsar client operation,
+in particular with respect to:
+
+1. Maximizing compatibility with existing telemetry systems
+2. Minimizing the effort required to export these metrics
+
+## Why OpenTelemetry?
+
+[OpenTelemetry](https://opentelemetry.io/) is quickly becoming the de-facto standard API for metric and
+tracing instrumentation. In fact, as part of [PIP-264](https://github.com/apache/pulsar/blob/master/pip/pip-264.md),
+we are already migrating the Pulsar server side metrics to use OpenTelemetry.
+
+For Pulsar client SDK, we need to provide a similar way for application builder to quickly integrate and
+export Pulsar metrics.
+
+### Why exposing OpenTelemetry directly in Pulsar API
+
+When deciding how to expose the metrics exporter configuration there are multiple options:
+
+ 1. Accept an `OpenTelemetry` object directly in Pulsar API
+ 2. Build a pluggable interface that describe all the Pulsar client SDK events and allow application to
+    provide an implementation, perhaps providing an OpenTelemetry included option.
+
+For this proposal, we are following the (1) option. Here are the reasons:
+
+ 1. In a way, OpenTelemetry can be compared to [SLF4J](https://www.slf4j.org/), in the sense that it provides an API
+    on top of which different vendor can build multiple implementations. Therefore, there is no need to create a new
+    Pulsar-specific interface
+ 2. OpenTelemetry has 2 main artifacts: API and SDK. For the context of Pulsar client, we will only depend on its
+    API. Applications that are going to use OpenTelemetry, will include the OTel SDK
+ 3. Providing a custom interface has several drawbacks:
+     1. Applications need to update their implementations every time a new metric is added in Pulsar SDK
+     2. The surface of this plugin API can become quite big when there are several metrics
+     3. If we imagine an application that uses multiple libraries, like Pulsar SDK, and each of these has its own
+        custom way to expose metrics, we can see the level of integration burden that is pushed to application
+        developers
+ 4. It will always be easy to use OpenTelemetry to collect the metrics and export them using a custom metrics API. There
+    are several examples of this in OpenTelemetry documentation.
+
+## Public API changes
+
+### Enabling OpenTelemetry
+
+When building a `PulsarClient` instance, it will be possible to pass an `OpenTelemetry` object:
+
+```java
+interface ClientBuilder {
+    // ...
+    ClientBuilder openTelemetry(io.opentelemetry.api.OpenTelemetry openTelemetry);
+
+    ClientBuilder openTelemetryMetricsCardinality(MetricsCardinality metricsCardinality);

Review Comment:
   They would define a view for that instrument, and in it they would override the attributes that would be recorded.
   They have two ways to do this. One is programmatic, if they build the SDK entirely themselves:
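   ```
   import io.opentelemetry.sdk.metrics.InstrumentSelector;
   import io.opentelemetry.sdk.metrics.SdkMeterProvider;
   import io.opentelemetry.sdk.metrics.View;

   SdkMeterProvider meterProvider = SdkMeterProvider.builder()
           .registerView(
                   // Select the instrument the view applies to.
                   InstrumentSelector.builder()
                           .setMeterName("hikari")
                           .setName("http.request.latency")
                           .build(),
                   // Record only the "statusCode" attribute; drop all others.
                   View.builder()
                           .setAttributeFilter(attrName -> attrName.equals("statusCode"))
                           .build())
           .build();
   ```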
   
   The attribute filter is a `Predicate` that dictates which attributes to record.
   Say you call `record(13, attr)` and `attr` is `(tenant, namespace, topic)`. When the filter is applied for this instrument, each `attr` is passed through the filter, which produces a new `Attributes` containing only the attribute keys that passed the filter.
   I have never tested this from a performance perspective. I presume that, since it is on the client side, the overhead is negligible. In theory, this area of the OTel SDK could be improved by introducing a caching mechanism.
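
To make the effect of the filter concrete, here is a minimal plain-Java sketch of the idea. It is not the OTel SDK's implementation: the class `AttributeFilterSketch`, its methods, and the sample tenant, namespace, and topic values are all hypothetical, and a `Map<String, String>` stands in for `Attributes`. It only illustrates that dropping the `topic` key makes recordings that differ by topic alone land in the same series.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical model of an instrument with an attribute-key filter (not OTel code).
public class AttributeFilterSketch {

    // Running sum per filtered attribute set; each entry stands in for one time series.
    private final Map<Map<String, String>, Long> sums = new LinkedHashMap<>();
    private final Predicate<String> keyFilter;

    public AttributeFilterSketch(Predicate<String> keyFilter) {
        this.keyFilter = keyFilter;
    }

    // Builds a new map containing only the entries whose key passes the filter.
    public static Map<String, String> applyFilter(Map<String, String> attributes,
                                                  Predicate<String> keyFilter) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            if (keyFilter.test(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    // Filters the attributes first, then adds the value to the matching series.
    public void record(long value, Map<String, String> attributes) {
        sums.merge(applyFilter(attributes, keyFilter), value, Long::sum);
    }

    public Map<Map<String, String>, Long> series() {
        return sums;
    }

    // Convenience helper that builds a (tenant, namespace, topic) attribute set.
    public static Map<String, String> attrs(String tenant, String namespace, String topic) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("tenant", tenant);
        m.put("namespace", namespace);
        m.put("topic", topic);
        return m;
    }

    public static void main(String[] args) {
        // Assumed filter for illustration: keep every key except "topic".
        AttributeFilterSketch instrument = new AttributeFilterSketch(key -> !key.equals("topic"));
        instrument.record(13, attrs("acme", "orders", "topic-a"));
        instrument.record(7, attrs("acme", "orders", "topic-b"));
        // Both recordings fall into a single series because only "topic" differed.
        System.out.println(instrument.series()); // prints {{tenant=acme, namespace=orders}=20}
    }
}
```

In this sketch the filter runs on every `record` call and allocates a new map each time, which is the kind of per-recording cost that a caching mechanism could reduce.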
   
   A second option, which I believe is more likely to be used, is a configuration file.
   The AutoConfigured SDK builder can read a configuration file and configure itself according to it. This file is meant to replace configuration through ENV variables. The JSON schema for this file is defined in a shared repository that all SDKs use. Here's an example in which you can see how a view is defined:
   https://github.com/open-telemetry/opentelemetry-configuration/blob/main/examples/kitchen-sink.yaml
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
