Thank you. This follows https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#prometheus-orgapacheflinkmetricsprometheusprometheusreporter . What might I be doing wrong?
The Flink configuration is

    metrics.reporters: prom
    metrics.reporter.prom.port: 9610
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter

and the scrape uses a ServiceMonitor:

    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    endpoints:
    - port: metrics   // *named port exposed in the k8s service and is 9610*
      scheme: http
      path: /metrics
      interval: 60s
      scrapeTimeout: 59s
    selector:....

Regards.

On Fri, Mar 22, 2019 at 3:05 PM Chesnay Schepler <ches...@apache.org> wrote:

> Since you're using Prometheus I would recommend setting up a
> PrometheusReporter as described in the metrics documentation and scraping
> each JM/TM individually. Scraping through the REST API is more expensive
> and you lose out on a lot of features.
> The REST API calls are primarily aimed at the WebUI.
>
> Regardless, as of right now I would doubt that this is a Flink issue, and
> would recommend heading to the Prometheus mailing lists.
>
> On 22/03/2019 17:55, Vishal Santoshi wrote:
>
> > A simple query: does the route to /metrics read from an in-memory
> > registry of collected stats, or does it contend with access from the JM
> > or do expensive access or computation? I occasionally see our Prometheus
> > scrape fail with the error pasted below. We have had the scraper do much
> > more elaborate scrapes on other systems we maintain, so I was curious. The
> > server did not have any logs related to the exception, and the scraper is
> > a ServiceMonitor from k8s; of course these TMs are hosted on k8s as well.
> >
> > Get http://10.246.254.84:9610/metrics: EOF
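
To narrow down whether the EOF is intermittent or tied to slow responses, a rough probe like the one below could be run from a pod in the same cluster. This is only a sketch: `probe` and its parameters are names made up for illustration, it uses nothing beyond the Python standard library, and it says nothing about how Flink serves the endpoint internally.

```python
import time
import urllib.request


def probe(url, attempts=5, timeout=5.0, pause=1.0):
    """Fetch `url` several times and return a list of (ok, seconds, detail).

    `ok` is True when the full body was read; `detail` is the body size on
    success or the repr of the exception on failure.
    """
    results = []
    for i in range(attempts):
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                body = resp.read()
            results.append((True, time.monotonic() - start, f"{len(body)} bytes"))
        except Exception as exc:  # refused, reset, timed out, truncated body, ...
            results.append((False, time.monotonic() - start, repr(exc)))
        if i < attempts - 1:
            time.sleep(pause)
    return results


# Example call, using the address from the failing scrape in this thread:
#   for ok, secs, detail in probe("http://10.246.254.84:9610/metrics"):
#       print("ok" if ok else "FAIL", f"{secs:.3f}s", detail)
```

If failures cluster around long response times, the scrape timeout or the size of the metrics payload would be the first things to look at; if they are fast failures, a connection-level problem between the scraper and the pod seems more likely.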