Hi everyone, I am new here.

I would like some advice on the design approach we should take.
Given the problem below, how can we set up Prometheus cost-effectively
for a large cluster?

*Variables:*
*Installation:* kube-prometheus-stack Helm chart
*Autoscale:* yes
*No. of nodes:* 1000 to 1300
*Mesh:* Istio
*Memory usage:* 50GB (still gets OOM)
*Installed:* 1 Prometheus, 1 Kiali, 1 Grafana and 1 Jaeger

*Issue:*
1. We cannot move Prometheus to a larger node because 60GB of memory is
already expensive (the cost was not approved by management).
2. Removing unnecessary metrics is not advisable yet because we do not know
which Istio, Jaeger and Kiali metrics are needed.

*Attempted solution:*
We have federated the single Prometheus instance with Thanos Receivers.
However, the issue remains because Kiali queries its data directly
from Prometheus, which eventually runs out of memory.

*Question:*
We are thinking of running a separate Prometheus instance for each
namespace, each with a Thanos sidecar and the same scrape config, since
Thanos will deduplicate any duplicated metrics. This approach would solve
the issue for Grafana queries but not for Kiali.

How can we set up multiple low-cost Prometheus instances while still giving
Kiali a single Prometheus instance that covers the whole cluster?

I would appreciate any help. Thank you.

