Hi,

After reading [1] and [2], I have the feeling that a reader (or maybe only me) can have some trouble understanding when this rule applies, under what circumstances it should be applied, and how.

From my understanding (correct me if I'm wrong), Prometheus encourages the use of labels for slicing your metrics [2], for example to identify which service owns a time series. Considering an HTTP metric http_api_requests, it would be fine to have different time series for the same metric name, identified by the following label values:

  http_api_requests service_name=foo, status_code=200
  http_api_requests service_name=foo, status_code=500
  http_api_requests service_name=bar, status_code=200
  http_api_requests service_name=bar, status_code=500

And with not 2 services but 1K different services, this would still be fine, since the total number of time series would still be manageable.

From what can be read in [1], this could be misunderstood:

> As a general guideline, try to keep the cardinality of your metrics below 10,
> and for metrics that exceed that, aim to limit them to a handful across your
> whole system. The vast majority of your metrics should have no labels.

Looking at the previous example and the general guideline, someone could conclude that adding service_name as a label breaks that rule.

From my understanding (correct me if I'm wrong), this general guideline should be read as being about the side effect of adding a label with a large cardinality, or of adding one that, though not of large cardinality by itself, causes an explosion in the number of time series once it is combined with another label. For example, taking the previous http_api_requests example, what would happen if we added the resource path as a label? We would have something like this:

  http_api_requests service_name=foo, status_code=200, resource_path="/a"
  http_api_requests service_name=foo, status_code=500, resource_path="/b"
  http_api_requests service_name=bar, status_code=200, resource_path="/c"
  http_api_requests service_name=bar, status_code=500, resource_path="/d"

Would this become an issue?
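
To make the growth concrete, here is a rough sketch of how I picture the series count for one metric name: an upper bound given by the product of the number of distinct values of each label. The cardinalities used (2 status codes, 20 resource paths) are made-up numbers for illustration, and series_count is a hypothetical helper, not anything Prometheus provides.

```python
from math import prod


def series_count(label_cardinalities):
    """Upper bound on time series for one metric name: the product of
    the number of distinct values of each label."""
    return prod(label_cardinalities.values())


# Hypothetical cardinalities, for illustration only.
base = {"service_name": 1000, "status_code": 2}
with_path = {**base, "resource_path": 20}

print(series_count(base))       # 2000
print(series_count(with_path))  # 40000
```

In practice the real count is usually lower than this bound, since not every combination of label values actually occurs.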

I have the feeling that it depends on how the query is done. If the query also narrows by service name, this should not be a problem, since the total number of time series involved should still be manageable, whereas without a service name filter it would most likely be unmanageable. If this is true, and since the second query most likely wouldn't make any sense anyway, why not prefix the metric name with the service name, to avoid future queries that could break the system by mistake?

Another example: let's consider that we add the pod ID as a label, which can have thousands of different values, although they are somewhat stable within a time window. The metric would look like this:

  http_api_requests service_name=foo, status_code=200, resource_path="/a", pod_name="1ef"
  http_api_requests service_name=foo, status_code=500, resource_path="/b", pod_name="2ef"
  http_api_requests service_name=bar, status_code=200, resource_path="/c", pod_name="3ef"
  http_api_requests service_name=bar, status_code=500, resource_path="/d", pod_name="4ef"

The queries we typically run won't slice by pod, but they will still narrow by service name. Consider a scenario with a more or less stable number of 500 pods within a time window: would the query still be manageable for Prometheus?

Looking at the node_exporter example that you provide, this seems fine to me, since we would always narrow the query to one specific service, which dramatically reduces the number of time series involved in the query.

Am I missing something in my rationale? If not, would it make sense to reword the following message a bit:

>> As a general guideline, try to keep the cardinality of your metrics below
>> 10, and for metrics that exceed that, aim to limit them to a handful across
>> your whole system. The vast majority of your metrics should have no labels.
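
For the pod scenario, this is the back-of-the-envelope arithmetic I have in mind for what a query touches with and without the service name filter. All numbers are assumptions on my side: 2 status codes, 20 resource paths, 500 pods for the one service being queried, and 1K services of similar size. The function names are hypothetical.

```python
def series_per_service(status_codes, paths, pods):
    """Upper bound on series matched when service_name is pinned to one value."""
    return status_codes * paths * pods


def series_all_services(services, per_service):
    """Upper bound on series matched when the query has no service_name filter."""
    return services * per_service


# Hypothetical numbers, for illustration only.
per_service = series_per_service(status_codes=2, paths=20, pods=500)

print(per_service)                             # 20000
print(series_all_services(1000, per_service))  # 20000000
```

The filtered query stays three orders of magnitude smaller than the unfiltered one, which is the difference I am asking about.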

Should the rule of thumb instead be the number of time series involved in a query, where this number should be < X?

Thanks!

[1] https://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels
[2] https://www.robustperception.io/target-labels-not-metric-name-prefixes

--
--pau

