Hello
*Using the cAdvisor metrics scraped from a Kubelet's Prometheus metrics
endpoint*
I originally wrote the following query to compute the memory usage of the
20 most memory-consuming Kubernetes pods in a cluster:
topk(20, sum(container_memory_working_set_bytes{cluster_id="$region",
  kubernetes_node_name=~"$node",namespace!="",container!="",image!=""})
  by (pod))
However, I noticed that during some time intervals, the reported memory of
some pods jumped up in frequent bumps.
I finally correlated that with restarts of those pods, and understood
that the series *container_memory_working_set_bytes* has an "id" label
that changes when a pod restarts. Each restart creates an entirely new
series, so the values from before and after the restart were summed together.
I later rewrote the query like this:
topk(20, avg(container_memory_working_set_bytes{cluster_id="$region",
  kubernetes_node_name=~"$node",namespace!="",container!="",image!=""})
  by (pod))
The results were close to the output of the *docker stats* command, but
they are still averaged values.
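To make the difference between the two aggregations concrete, here is a
minimal Python sketch with made-up numbers. The pod names, "id" values, and
byte counts are hypothetical, not taken from my cluster. It assumes the old
and new series of a restarted pod both report a sample at the same instant,
and it groups samples by pod the way the two queries above do:

```python
from collections import defaultdict

# Hypothetical samples at one instant: (pod, container, id, working_set_bytes).
# Assumption: pod "web" restarted, so a series with the old "id" label and a
# series with the new "id" label are both present for the same pod.
samples = [
    ("web", "app", "/kubepods/old-id", 100),
    ("web", "app", "/kubepods/new-id", 110),
    ("db", "postgres", "/kubepods/db-id", 300),
]


def aggregate_by_pod(samples, how):
    """Group sample values by pod and reduce each group with sum or avg."""
    groups = defaultdict(list)
    for pod, _container, _series_id, value in samples:
        groups[pod].append(value)
    if how == "sum":
        return {pod: sum(values) for pod, values in groups.items()}
    if how == "avg":
        return {pod: sum(values) / len(values) for pod, values in groups.items()}
    raise ValueError(f"unknown aggregation: {how}")


summed = aggregate_by_pod(samples, "sum")
averaged = aggregate_by_pod(samples, "avg")
print("sum by pod:", summed)
print("avg by pod:", averaged)
```

With these numbers, the sum for "web" is 210 (old plus new series), which
is the kind of bump I saw, while the average is 105, which lies between the
two values instead of matching either one.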
I am wondering if I'm doing it right, or if there is a better/more accurate
solution.
Thanks
William