Hi Rion,

Regarding the question about adding Prometheus labels out of the box: this is a 
common ask of exporters, but the Prometheus philosophy treats it as an 
"anti-pattern", since the metrics source is often agnostic about its deployment 
context. See [0] for an example of such a discussion.

Instead, we can establish context during service discovery. If, for example, we 
run clusters for tenants on Kubernetes, then within the kubernetes_sd_config 
[1] relabelling rules we can instruct Prometheus to copy Kubernetes labels 
from the pods, such as "tenant-id: foo" and "environment: staging", onto each 
incoming metric it processes.
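As a rough sketch, such a relabelling rule might look like the following. The 
job name and the pod label names (tenant-id, environment) are illustrative 
assumptions, not something prescribed by Prometheus:

```yaml
scrape_configs:
  - job_name: flink-pods        # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # A pod label "tenant-id: foo" surfaces during discovery as
      # __meta_kubernetes_pod_label_tenant_id (hyphens become underscores).
      - source_labels: [__meta_kubernetes_pod_label_tenant_id]
        target_label: tenant_id
      - source_labels: [__meta_kubernetes_pod_label_environment]
        target_label: environment
```

Every metric scraped from a matching pod then carries tenant_id and 
environment labels without the exporter knowing anything about tenancy.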

This isn't limited to Kubernetes; each of the service discovery configs is 
designed to accommodate translating metadata from context into metric labels.

If this doesn't work for you, then consider encoding the tenant identifier 
into job names and extracting it with a metric_relabel_config [2].
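For instance, assuming a hypothetical naming convention where job names take 
the form "<tenant>-<pipeline>", the extraction could be sketched as:

```yaml
# Fragment of a scrape job; the "<tenant>-" prefix convention is an assumption.
metric_relabel_configs:
  - source_labels: [job]
    regex: '([^-]+)-.*'     # capture everything before the first hyphen
    target_label: tenant_id
    replacement: '$1'
```

metric_relabel_configs run after the scrape, so this applies to every sample 
before ingestion.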

[0]: https://github.com/prometheus/node_exporter/issues/319
[1]: 
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
[2]: 
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs


________________________________
From: Rion Williams <[email protected]>
Sent: Sunday, February 28, 2021 12:46 AM
To: Prasanna kumar <[email protected]>
Cc: user <[email protected]>
Subject: Re: Using Prometheus Client Metrics in Flink

Hi Prasanna,

Thanks for that. It’s what I was doing previously as a workaround however I was 
just curious if there was any Flink-specific functionality to handle this prior 
to Prometheus.

Additionally from the docs on metrics [0], it seems that there’s a pattern in 
place to use supported third-party metrics such as those from 
CodeHale/DropWizard via a Maven package (flink-metrics-dropwizard). I do see a 
similarly named package for Prometheus which may be what I’m looking for as 
it’s similarly named (flink-metrics-prometheus), so I may give that a try.

Thanks,

Rion

[0]: https://ci.apache.org/projects/flink/flink-docs-stable/ops/metrics.html

On Feb 28, 2021, at 12:20 AM, Prasanna kumar <[email protected]> 
wrote:


Rion,

Regarding the second question, you can aggregate by using the sum function: 
sum(metric_name{job_name="JOBNAME"}). This works if you are using a counter 
metric.

Prasanna.

On Sat, Feb 27, 2021 at 9:01 PM Rion Williams 
<[email protected]> wrote:
Hi folks,

I’ve just recently started working with Flink and I was in the process of 
adding some metrics through my existing pipeline with the hopes of building 
some Grafana dashboards with them to help with observability.

Initially I looked at the built-in Flink metrics that were available, but I 
didn’t see an easy mechanism for setting/using labels with them. Essentially, I 
have two properties for my messages coming through the pipeline that I’d like 
to be able to keep track of (tenant/source) across several metrics (e.g. 
total_messages with tenant / source labels, etc.). I didn’t see an easy way to 
adjust this out of the box, or wasn’t aware of a good pattern for handling 
these.

I had previously used the Prometheus Client metrics [0] to accomplish this in 
the past but I wasn’t entirely sure how it would/could mesh with Flink. Does 
anyone have experience in working with these or know if they are supported?

Secondly, when using the Flink metrics, I noticed I was receiving a separate 
metric for each task that was being spun up. Is there an “easy button” to 
handle aggregating these to ensure that a single metric (e.g. total_messages) 
reflects the total processed across all of the tasks instead of each individual 
one?

Any recommendations / resources / advice would be greatly appreciated!

Thanks,

Rion

[0] : https://prometheus.io/docs/instrumenting/clientlibs/
