Please find replies inline. 

On Friday, 7 October, 2022 at 1:25:27 pm UTC+5:30 Stuart Clark wrote:

> On 07/10/2022 04:09, Muthuveerappan Periyakaruppan wrote: 
> > We have a situation where we have 8 to 15 million head series in 
> > each Prometheus, and we have 7 instances of them (federated). Our 
> > Prometheus instances are constantly flooded handling the incoming 
> > metrics and back-end recording rules. 
>
> 8-15 million time series on a single Prometheus instance is pretty high. 
> What spec machine/pod are these? 
>
90 GB RAM, 5000 millicores.
 

> When you say "flooded", what do you mean? 
>
 
RAM usage is always high. There are no OOM kills, but some metrics go 
missing, and the average scrape duration is around 35 seconds (possibly 
due to the number of targets/metrics). CPU demand/usage is not that high.


> > One thought which came to mind was: do we have something similar to a 
> > log level for Prometheus metrics? If it's there, then we can benefit 
> > from it by configuring all targets to run at error level in 
> > production and at debug/info level in development. This will help 
> > control the flooding of metrics. 
> > 
> I'm not sure I understand what you are suggesting. What would be 
> the difference between setting this hypothetical "error" and "debug" 
> levels? Do you mean some metrics would only be exposed in some 
> environments? 
>
Let's say every pod has close to 100 metrics; we may not need all of them 
in production. 
Before adding a metric, a developer can assess how useful it will be in 
production and which indicators it covers: Utilization, Saturation, and 
Errors (USE) or Rate, Errors, and Duration (RED). Based on this, they can 
choose the metric level. 
Based on the metric level, only a few metrics (ERROR / SEVERE level) would 
be enabled in production; the rest (INFO / DEBUG level) would be enabled in 
development / testing / staging environments. 
A few metrics should be enough to troubleshoot, and on demand we should 
have the option to change the metric level at runtime, like a log level, to 
get more metrics. 
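
To make the idea concrete, here is a rough sketch in Python of what I have 
in mind. It is purely illustrative: MetricLevel and LeveledRegistry are 
made-up names for this example, not an existing Prometheus or client 
library API, and the level values are arbitrary. 

```python
from enum import IntEnum


class MetricLevel(IntEnum):
    """Hypothetical metric levels, ordered like log levels."""
    DEBUG = 10
    INFO = 20
    ERROR = 30
    SEVERE = 40


class LeveledRegistry:
    """Hypothetical registry: tracks every metric, but exposes only
    those at or above the currently active level."""

    def __init__(self, level):
        self.level = level
        self._metrics = {}  # metric name -> (level, current value)

    def register(self, name, level):
        # The developer picks the level when defining the metric.
        self._metrics[name] = (level, 0.0)

    def inc(self, name, amount=1.0):
        level, value = self._metrics[name]
        self._metrics[name] = (level, value + amount)

    def set_level(self, level):
        # Runtime switch, analogous to changing a log level on demand.
        self.level = level

    def exposed(self):
        # Only these metrics would be offered to the scraper.
        return {
            name: value
            for name, (level, value) in self._metrics.items()
            if level >= self.level
        }


# Production starts at ERROR level, so the DEBUG metric is hidden.
registry = LeveledRegistry(MetricLevel.ERROR)
registry.register("http_request_errors_total", MetricLevel.ERROR)
registry.register("cache_lookups_total", MetricLevel.DEBUG)
registry.inc("http_request_errors_total")
registry.inc("cache_lookups_total", 5)
print(registry.exposed())

# While troubleshooting, lower the level to see everything.
registry.set_level(MetricLevel.DEBUG)
print(registry.exposed())
```

With something like this, production would expose only the ERROR / SEVERE 
metrics by default, and the level could be lowered temporarily when more 
detail is needed. 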

-- 
> Stuart Clark 
>
>
