Tom-Goong commented on a change in pull request #7820:
[FLINK-11742][Metrics]Push metrics to Pushgateway without "instance"
URL: https://github.com/apache/flink/pull/7820#discussion_r261221674
##########
File path:
flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusPushGatewayReporter.java
##########
@@ -73,7 +77,7 @@ public void open(MetricConfig config) {
@Override
public void report() {
try {
- pushGateway.push(CollectorRegistry.defaultRegistry, jobName);
+ pushGateway.push(CollectorRegistry.defaultRegistry, jobName, instance);
Review comment:
> In Prometheus terms, an endpoint you can scrape is called an instance,
usually corresponding to a single process. A collection of instances with the
same purpose, a process replicated for scalability or reliability for example,
is called a job.
> For example, an API server job with four replicated instances:
job: api-server
-- instance 1: 1.2.3.4:5670
-- instance 2: 1.2.3.4:5671
-- instance 3: 5.6.7.8:5670
-- instance 4: 5.6.7.8:5671
https://prometheus.io/docs/concepts/jobs_instances/#jobs-and-instances
I think a Flink job corresponds to a Prometheus job, and each TaskManager and
JobManager corresponds to a separate instance. If only the jobName is used to
identify the pushed metrics, with no separate instance label, the same metrics
from different TaskManagers will conflict, and aggregations such as sum will
give wrong results.
I tried to build the instance label from ip:port but did not get it working,
because I am not familiar enough with the Flink source, so I went with the
simplest approach here.
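To illustrate the conflict described above, here is a minimal sketch in plain Java. It is a toy in-memory model, not the real Pushgateway or the Prometheus client API: the class `PushGroupingSketch`, the metric name `numRecordsIn`, the job name `myFlinkJob`, and the instance values are all made up for illustration. The model assumes that a push replaces every metric previously stored under the same grouping key, which is my understanding of how a plain push behaves.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of Pushgateway grouping (hypothetical, for illustration only). */
public class PushGroupingSketch {

    // Grouping key (label name -> value) mapped to the last metric set pushed under it.
    private final Map<Map<String, String>, Map<String, Double>> groups = new LinkedHashMap<>();

    /** Models a push: replaces all metrics stored under the same grouping key. */
    public void push(String job, String instance, Map<String, Double> metrics) {
        Map<String, String> key = new TreeMap<>();
        key.put("job", job);
        if (instance != null) {
            // The extra label makes the grouping key unique per process.
            key.put("instance", instance);
        }
        groups.put(key, new TreeMap<>(metrics));
    }

    /** Adds up one metric across all stored groups, similar in spirit to a sum query. */
    public double sum(String metric) {
        double total = 0.0;
        for (Map<String, Double> metrics : groups.values()) {
            total += metrics.getOrDefault(metric, 0.0);
        }
        return total;
    }

    /** Number of distinct grouping keys currently stored. */
    public int groupCount() {
        return groups.size();
    }

    public static void main(String[] args) {
        // Two TaskManagers push under the job label only: the second push overwrites the first.
        PushGroupingSketch jobOnly = new PushGroupingSketch();
        jobOnly.push("myFlinkJob", null, Collections.singletonMap("numRecordsIn", 100.0));
        jobOnly.push("myFlinkJob", null, Collections.singletonMap("numRecordsIn", 250.0));
        System.out.println(jobOnly.groupCount() + " group(s), sum = " + jobOnly.sum("numRecordsIn"));
        // prints "1 group(s), sum = 250.0"

        // The same pushes with a distinct instance label each: both values are kept.
        PushGroupingSketch withInstance = new PushGroupingSketch();
        withInstance.push("myFlinkJob", "tm-1", Collections.singletonMap("numRecordsIn", 100.0));
        withInstance.push("myFlinkJob", "tm-2", Collections.singletonMap("numRecordsIn", 250.0));
        System.out.println(withInstance.groupCount() + " group(s), sum = " + withInstance.sum("numRecordsIn"));
        // prints "2 group(s), sum = 350.0"
    }
}
```

In this model the job-only case ends up with a single group holding only the last pushed value, so the sum is wrong, while the instance-labelled case keeps one group per process. Any value that is unique per TaskManager or JobManager would work as the instance label here; ip:port is just one option.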
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services