Hello everyone,

We have integrated Ceph with Prometheus. The Ceph MGR service exports metrics on port 9283 (see the Prometheus config below):
*********************
rule_files:
- /etc/prometheus/alerting/*
scrape_configs:
- job_name: prometheus
  static_configs:
  - targets:
    - localhost:9092
- honor_labels: true
  job_name: ceph
  static_configs:
  - labels:
      instance: ceph_cluster
    targets:
    - storagenode1:9283
  - labels:
      instance: ceph_cluster
    targets:
    - storagenode2:9283
  - labels:
      instance: ceph_cluster
    targets:
    - storagenode3:9283
*****************************

We have three ceph-mgr nodes, of which one is active at a time and two are on standby. We can verify this from the Ceph status output:

[ansible@storagenode1 ~]$ sudo ceph -s
  cluster:
    id:     78dbd380-03e0-48e9-a8c6-d560be215788
    health: HEALTH_OK

  services:
    mgr: storagenode2(active, since 3h)
*************************

The output above shows that ceph-mgr is active on storagenode2, so that is the node Prometheus should be able to scrape successfully. But when I look at the Prometheus dashboard, it shows all three nodes as down, including the active one, which should show as up.

Issue: the ceph-mgr target status on the Prometheus dashboard should be in sync with what Ceph itself reports, but it is not.

Please suggest any reason or possible cause.

Prometheus Version: v2.7.2

Best Regards,
Lokendra
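P.S. To narrow this down, here is a minimal sketch of a check that could be run from the Prometheus host. It requests /metrics from each of the three targets in the scrape config above and prints whether the endpoint answers. The function name `probe`, the 5-second timeout, and the assumption that the hostnames resolve from the Prometheus host are illustrative, not part of our actual setup. It only tests plain HTTP reachability, nothing Ceph-specific.

```python
import urllib.error
import urllib.request

# Targets copied from the "ceph" job in the scrape config above.
TARGETS = ["storagenode1:9283", "storagenode2:9283", "storagenode3:9283"]


def probe(target, timeout=5.0, opener=urllib.request.urlopen):
    """GET http://<target>/metrics and return (http_status_or_None, detail).

    `opener` is injectable only so the function can be exercised
    without a network; by default it is urllib's urlopen.
    """
    url = f"http://{target}/metrics"
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read()
            return resp.status, f"{len(body)} bytes"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status code.
        return exc.code, str(exc.reason)
    except OSError as exc:
        # Covers URLError, DNS failures, refused connections, timeouts.
        return None, str(exc)


if __name__ == "__main__":
    for target in TARGETS:
        status, detail = probe(target)
        state = "no response" if status is None else f"HTTP {status}"
        print(f"{target}: {state} ({detail})")
```

If the active node (storagenode2 here) answers this check but still shows as down in Prometheus, the problem is more likely between Prometheus and the target than in ceph-mgr itself.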