On 01/03/2021 06:44, Manjula Amunugama wrote:
Hi all,
In our environment we use Prometheus & Grafana to monitor about 200
micro-services. From one application to another, developers used
different strings as the namespace component.
For example, we have used Prometheus keys like
"booking_engine_driver_eta_location_service_outboundcall_latency_microseconds_count"
to count the latency from "BookingEngine.Driver-ETA" to
"Location-Service". Here "Booking Engine" is the service group,
"Driver-ETA" is the service, and "Location-Service" is the outbound
service.
For API-based requests it is essential to monitor "Inbound Request
Rates by Endpoint", "Inbound Request Error Rates by Endpoint",
"Processing Latency by Endpoint", "Outbound Request Rates by Endpoint"
and "Outbound Request Error Rates by Endpoint".
We can monitor all the services with about three dashboards ("Inbound
Service Monitor Rates", "Outbound Service Monitor Rates", "Processing
Latencies"), provided we know the Prometheus keys used.
So we wanted to standardize the Prometheus keys as follows:
- The namespace identifies the "Development Team"
- The application name becomes a label on the key, named "app"
- The endpoint also becomes a label on the key
- The error becomes a label on the key
So the previous key, with labels, becomes
"outboundcall_latency_microseconds_count{app="booking_engine_driver_eta_location_service"}"
Doing this, we can automate most of the dashboarding and alerting.
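With a single metric name, each dashboard panel can be driven by one
templated query instead of one query per service. A sketch in PromQL,
assuming the proposed "app" and "endpoint" labels are in place (they
are part of the proposal above, not the existing keys):

```promql
# Outbound call rate per application and endpoint
sum by (app, endpoint) (rate(outboundcall_latency_microseconds_count[5m]))
```

In Grafana the "app" label can then back a template variable, so one
dashboard covers every application.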
By doing this, about 200 time series would be grouped into about 4
groups, and hence roughly 200 time series become 4.
Doing so, will there be a big hit to Prometheus performance?
A time series is different from a metric.
A metric has a name and an optional set of labels.
A time series is one specific combination of metric name and label values.
So, for example, a metric could be called "requests_count", while two
distinct time series could be "requests_count{response_code='200'}" and
"requests_count{system='frontend',authenticated='false'}".
As a result, in terms of the number of time series there is no
difference between 100 metrics with no labels and a single metric with
a label that takes 100 values.
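This equivalence can be checked directly in PromQL; a sketch, using the
"requests_count" metric from above and the key prefix from the original
question:

```promql
# Series behind one labelled metric
count(requests_count)

# Series across many unlabelled metrics sharing a prefix
count({__name__=~"booking_engine_.*_count"})
```

Both return the number of underlying time series, regardless of how
those series are split between metric names and labels.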
How the difference affects performance will depend on how things are
being used. There is likely to be little difference in performance
during scraping, but query patterns could make a bigger difference. A
metric with labels is expected to be aggregatable, so it only makes
sense to arrange the data that way if that holds. If you were to sum
together all the different label combinations of a particular metric,
would the result make sense? For example, a metric which counts
requests and has a label for the error code would still make sense if
you summed everything together (rather than requests per code you would
have the total number of requests).
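In PromQL terms, using the "requests_count" example with its
"response_code" label:

```promql
# Request rate per response code
sum by (response_code) (rate(requests_count[5m]))

# Total request rate - summing away the label still makes sense
sum without (response_code) (rate(requests_count[5m]))
```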
Would it make sense in your case to use labels within a single metric?
If the different systems are completely unrelated it might not - a sum
wouldn't mean anything, and an average would be equally useless, as the
systems do totally different kinds of work. However, if you are looking
at latencies end-to-end across multiple systems in a flow, or have
multiple instances of a system, then labels would make more sense - a
sum would give you the overall end-to-end latency, or you could produce
averages for a particular system across instances.
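As a sketch of that last case, assuming the latency metric is a summary
that also exposes a "_sum" counter alongside the "_count" shown
earlier, and carries the proposed "app" label:

```promql
# Average outbound call latency per application, across all instances
  sum by (app) (rate(outboundcall_latency_microseconds_sum[5m]))
/
  sum by (app) (rate(outboundcall_latency_microseconds_count[5m]))
```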
--
Stuart Clark
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/10c3f2ab-170c-eec5-5449-56ba6c84e340%40Jahingo.com.