[prometheus-users] Calculating Cluster uptime % for two node cluster

Shubham Shrivastav Fri, 29 Jul 2022 20:21:17 -0700

Hey guys, 

I have custom metrics enabled for the individual nodes of a cluster.


# HELP platform_uptime_state Overall platform status is 1 when up, 0 otherwise
# TYPE platform_uptime_state gauge
platform_uptime_state{platform_version="6.4", node_id="101", cluster_id="1"} 1
platform_uptime_state{platform_version="6.4", node_id="102", cluster_id="1"} 0

Each cluster has two nodes.

Using a range vector (via a subquery), I came up with something like this to
calculate uptime for an individual node:

sum_over_time((platform_uptime_state{node_id="101"})[1h:15s])
  / count_over_time((platform_uptime_state{node_id="101"})[1h:15s])
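
To spell out what this ratio computes: because the gauge is only ever 0 or 1, the sum of the samples divided by their count is the fraction of samples in which the node was up. A minimal Python sketch of that arithmetic, using made-up sample values (the list and the function name are purely illustrative, not real data or a Prometheus API):

```python
# Hypothetical 0/1 samples for one node, one per 15s step (made-up values).
samples = [1, 1, 0, 1]


def uptime_fraction(values):
    """Fraction of samples equal to 1, i.e. sum(values) / count(values)."""
    return sum(values) / len(values)


node_uptime = uptime_fraction(samples)
print(node_uptime)
```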

But here's the formula I'm trying to implement:

(count of samples over 1d where the cluster has at least 1 node up, e.g.
sum by (cluster_id) (platform_uptime_state) > 0) / (count of total cluster
samples over 1d)

But this doesn't work.
Is there a better way to do this?
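
To make the intended calculation concrete, here is a small sketch of the arithmetic in Python rather than PromQL. It assumes the two nodes' samples are aligned on the same time steps. The values, node keys and function name are made up for illustration only:

```python
# Hypothetical aligned 0/1 samples per node of one cluster (made-up values).
node_samples = {
    "101": [1, 1, 0, 0, 1],
    "102": [0, 1, 1, 0, 0],
}


def cluster_uptime_fraction(samples_by_node):
    """Fraction of time steps at which at least one node reports 1."""
    # zip(*...) groups the samples of all nodes by time step.
    steps = list(zip(*samples_by_node.values()))
    up_steps = sum(1 for step in steps if sum(step) > 0)
    return up_steps / len(steps)


cluster_uptime = cluster_uptime_fraction(node_samples)
print(cluster_uptime)
```

In this made-up data the cluster counts as down only at the fourth step, where both nodes report 0.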

TIA,
Shubham

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/63c34897-a6f0-4d93-be49-b61400abb1cbn%40googlegroups.com.
