The query syntax would not change drastically (though I understand it is 
still a change). The main benefit would be a 75% reduction in the number of 
series these metrics generate, since each check goes from four series to 
one. That means less to store and less to compute.
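
As a rough illustration of where the 75% figure comes from, the sketch below 
counts series under both layouts. It assumes four series per check today (one 
per status) and exactly one per check in the proposed layout; the function 
names and the figure of 1000 checks are hypothetical, chosen only for the 
example.

```python
# The four statuses the exporter currently exposes as label values.
STATUSES = ("maintenance", "warning", "critical", "passing")


def series_count(checks: int, one_series_per_status: bool) -> int:
    """Number of series produced for a given number of health checks."""
    per_check = len(STATUSES) if one_series_per_status else 1
    return checks * per_check


def reduction(checks: int) -> float:
    """Fraction of series saved by moving the status from a label to the value."""
    current = series_count(checks, one_series_per_status=True)
    proposed = series_count(checks, one_series_per_status=False)
    return 1 - proposed / current


# Hypothetical environment with 1000 health checks.
print(series_count(1000, True), series_count(1000, False), reduction(1000))
```

The ratio does not depend on the number of checks, only on the number of 
status values being folded into one series.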

*> "what fraction of nodes is down"*

Current:
    count (consul_health_node_status{status!="passing"} == 1)
    /
    count (consul_health_node_status == 1)

Proposed:
    count (consul_health_node_status != 1)
    /
    count (consul_health_node_status)

*> "which nodes have multiple services down?"*

Current:
    count by (node) (consul_health_service_status{status!="passing"} == 1) > 1

Proposed:
    count by (node) (consul_health_service_status != 1) > 1

*> "What service checks are critical?"*

Current:
    consul_health_service_status{status="critical"} == 1

Proposed:
    consul_health_service_status == 0
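
To make the proposed encoding concrete, here is a minimal Python sketch that 
collapses the present four-series layout into one value per check, using the 
mapping from the original proposal (-2=maintenance, -1=warning, 0=critical, 
1=passing). It is not the exporter's code; `STATUS_VALUE` and `collapse` are 
hypothetical names, and the values themselves are still open for discussion.

```python
# Mapping taken from the proposal; the exact values are not final.
STATUS_VALUE = {"maintenance": -2, "warning": -1, "critical": 0, "passing": 1}


def collapse(samples):
    """Collapse present-format samples into one value per check.

    `samples` is an iterable of (labels, value) pairs, where `labels` is a
    dict containing a "status" label and `value` is 1 for the active status
    and 0 otherwise. Returns a dict keyed by the remaining labels.
    """
    collapsed = {}
    for labels, value in samples:
        if value != 1:
            continue  # only the series for the active status carries information
        key = frozenset((k, v) for k, v in labels.items() if k != "status")
        collapsed[key] = STATUS_VALUE[labels["status"]]
    return collapsed


# The four present-format series for one passing node check.
present = [
    ({"check": "serfHealth", "node": "example_node", "status": s},
     1 if s == "passing" else 0)
    for s in STATUS_VALUE
]
proposed = collapse(present)
print(proposed)
```

With this encoding, a label match such as `{status="critical"} == 1` turns 
into a comparison on the value, which is what the proposed queries above rely 
on.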


On Tuesday, August 17, 2021 at 5:01:27 AM UTC-4 matt...@prometheus.io wrote:

> What would some common queries be that this affects, and how would they 
> look in the future? For example, "what fraction of nodes is down" "which 
> nodes have multiple services down?"
>
> /MR
>
> On Mon, Aug 16, 2021, 22:31 Matt Russi <mru...@gmail.com> wrote:
>
>> Currently, the consul_exporter exposes 4 series per health_node and 
>> health_service status check, each with a label indicating the status 
>> (maintenance, warning, critical, or passing). In larger environments, this 
>> creates quite a few extra series. 
>>
>> As somewhat of a precedent, the status is already being mapped to a value 
>> for the consul_serf_lan_member_status metric (as Consul's API provides this 
>> mapping).
>> # HELP consul_serf_lan_member_status Status of member in the cluster. 1=Alive, 2=Leaving, 3=Left, 4=Failed.
>>
>> I wanted to get some thoughts around this before pursuing a PR.
>>
>> In my example, I used -2=maintenance, -1=warning, 0=critical, and 
>> 1=passing to fall in line with the Prometheus paradigm of up=0 (down) and 
>> up=1 (up). Since we have two additional values, the negative numbers play 
>> more nicely when trying to do a value mapping in Grafana. Not married to 
>> the values themselves though. :) 
>>
>> Present Example:
>> consul_health_node_status{check="serfHealth",node="example_node",status="critical"} 0
>> consul_health_node_status{check="serfHealth",node="example_node",status="maintenance"} 0
>> consul_health_node_status{check="serfHealth",node="example_node",status="passing"} 1
>> consul_health_node_status{check="serfHealth",node="example_node",status="warning"} 0
>>
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="critical"} 0
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="maintenance"} 0
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="passing"} 1
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="warning"} 0
>>
>> Proposed Example:
>> # HELP consul_health_node_status Status of health checks associated with a node. -2=maintenance, -1=warning, 0=critical, 1=passing
>> consul_health_node_status{check="serfHealth",node="example_node"} 1
>>
>> # HELP consul_health_service_status Status of health checks associated with a service. -2=maintenance, -1=warning, 0=critical, 1=passing
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service"} 1
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-devel...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-developers/9bb6b446-728d-47d9-8a08-355dec88d572n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-developers/9bb6b446-728d-47d9-8a08-355dec88d572n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

