Mohsen Rezaei created FLINK-38704:
-------------------------------------
Summary: Metrics reporter setup does not load Prometheus with correct configs/port
Key: FLINK-38704
URL: https://issues.apache.org/jira/browse/FLINK-38704
Project: Flink
Issue Type: Bug
Components: Runtime / Metrics
Affects Versions: 2.1.1, 2.0.1
Reporter: Mohsen Rezaei
The Prometheus reporter configuration below worked in 1.x releases, but in 2.x
the reporter does not pick up the configured port.
Runtime Flink configurations loaded:
{code}
2025-11-20 04:33:51.737 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.port, 9999
2025-11-20 04:33:51.738 [main] INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: metrics.reporter.prom.factory.class, org.apache.flink.metrics.prometheus.PrometheusReporterFactory
{code}
But the reporter setup [loads the default
port](https://github.com/apache/flink/blob/45ab6c816465e717d0eef2ad6672cbb0c1a73a7e/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporterFactory.java#L33):
{code}
2025-11-20 04:33:55.520 [main] INFO org.apache.flink.metrics.prometheus.PrometheusReporter - Started PrometheusReporter HTTP server on port 9249.
{code}
As a result, metrics are served only on port 9249, not on the configured port 9999:
{code}
flink@jm-0:~$ curl localhost:9999/metrics
curl: (7) Failed to connect to localhost port 9999 after 0 ms: Couldn't connect to server
flink@jm-0:~$ curl localhost:9249/metrics
# HELP flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond TimeMsPerSecond (scope: jobmanager_Status_JVM_GarbageCollector_Copy)
# TYPE flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond gauge
flink_jobmanager_Status_JVM_GarbageCollector_Copy_TimeMsPerSecond{host="10_155_60_8",} 0.0
...
{code}
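The manual curl check above can also be scripted when several hosts need to be verified. The following is a minimal sketch, not part of Flink: the helper names `port_is_open` and `find_serving_ports` are hypothetical, and it only tests whether a TCP connection succeeds, not whether the endpoint actually returns metrics.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def find_serving_ports(host: str, candidates, timeout: float = 1.0):
    """Return the subset of candidate ports that accept TCP connections."""
    return [p for p in candidates if port_is_open(host, p, timeout)]


if __name__ == "__main__":
    # 9999 is the configured port, 9249 the default seen in the reporter log.
    print(find_serving_ports("localhost", [9999, 9249]))
```

On a JobManager showing this bug, the list would be expected to contain 9249 but not 9999.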
This potentially affects all reporters that are loaded through their factory in
{{ReporterSetup}}.
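To make the expected behavior concrete, here is a simplified model of how the `metrics.reporter.<name>.` prefix would be stripped so that the reporter sees its own `port` key. This is an illustrative sketch only and is not Flink's implementation: the functions `reporter_properties` and `resolve_port` are hypothetical, and the default of 9249 is taken from the reporter log above.

```python
DEFAULT_PORT = "9249"  # default observed in the PrometheusReporter log line


def reporter_properties(flat_config: dict, reporter_name: str) -> dict:
    """Collect the keys under metrics.reporter.<name>. with the prefix removed."""
    prefix = f"metrics.reporter.{reporter_name}."
    return {
        key[len(prefix):]: value
        for key, value in flat_config.items()
        if key.startswith(prefix)
    }


def resolve_port(props: dict, default: str = DEFAULT_PORT) -> str:
    """Return the configured port, falling back to the default if absent."""
    return props.get("port", default)


# The two properties reported by GlobalConfiguration in this issue.
flat = {
    "metrics.reporter.prom.port": "9999",
    "metrics.reporter.prom.factory.class":
        "org.apache.flink.metrics.prometheus.PrometheusReporterFactory",
}

props = reporter_properties(flat, "prom")
print(resolve_port(props))  # prints 9999 under this model
```

Under this model the reporter would start on 9999. The observed fallback to 9249 is what one would see if the `port` key never reached the reporter, although the actual cause in 2.x is not established here.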