Re: [prometheus-users] Custom Threshold for a particular instance.

[email protected] Fri, 03 Jul 2020 00:13:06 -0700

This seems like an interesting approach. If possible, could you please give
some more insight into it?


On Friday, July 3, 2020 at 10:56:01 AM UTC+5:30 [email protected] 
wrote:

> The other proper way is to generate the alert rules dynamically, with the
> thresholds hardcoded per label value.
> For example, use a combination of YAML and Jinja to store the thresholds in a
> maintainable format, and have one command that regenerates everything.
> Every time you want to change a value, you just regenerate the alerts.
>
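
As far as I understand it, that would look roughly like the sketch below. The file names, the keys in thresholds.yml, and the alert name are placeholders I made up for illustration, not an existing tool or convention; the memory expression is the one quoted further down the thread.

```yaml
# thresholds.yml: hypothetical hand-edited source of truth
default: 90
components:
  Comp-A: 99.9
  Comp-B: 95
instances:
  'x.x.x.x:9100': 97
```

A template (Jinja or similar) would then be rendered over that file into a plain rule file, for example:

```yaml
# memory_alerts.yml: generated output, never edited by hand
groups:
  - name: memory_alerts_generated
    rules:
      # Instance-level override, threshold hardcoded by the generator.
      - alert: MemoryUsageCritical
        expr: |
          (
              node_memory_MemTotal_bytes{instance="x.x.x.x:9100"}
            - node_memory_MemFree_bytes{instance="x.x.x.x:9100"}
            - node_memory_Cached_bytes{instance="x.x.x.x:9100"}
          ) / node_memory_MemTotal_bytes{instance="x.x.x.x:9100"} * 100 > 97
      # Component-level threshold, excluding the overridden instance.
      - alert: MemoryUsageCritical
        expr: |
          (
              node_memory_MemTotal_bytes{component="Comp-A", instance!="x.x.x.x:9100"}
            - node_memory_MemFree_bytes{component="Comp-A", instance!="x.x.x.x:9100"}
            - node_memory_Cached_bytes{component="Comp-A", instance!="x.x.x.x:9100"}
          ) / node_memory_MemTotal_bytes{component="Comp-A", instance!="x.x.x.x:9100"} * 100 > 99.9
```

Changing a threshold would then mean editing thresholds.yml and rerunning the generator, so the queries stay simple at the cost of a build step.
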
> On Thu, Jul 2, 2020 at 8:38 PM Yagyansh S. Kumar <[email protected]> 
> wrote:
>
>> Also, so far I have only tried one way of setting a custom threshold,
>> i.e. based on the component name. For example, all the targets under
>> Comp-A have a threshold of 99.9 and all the targets under Comp-B have a
>> threshold of 95.
>> But now I have to give a common custom threshold, let's say 98, to 5
>> different targets that belong to 5 different components. Each of those
>> components has more than one target, but I want the custom threshold to
>> apply to only a single target from each component.
>>
>> On Fri, Jul 3, 2020 at 12:02 AM Yagyansh S. Kumar <[email protected]> 
>> wrote:
>>
>>> Hi Christian,
>>>
>>> Actually, I want to know if there is any better way to define the
>>> threshold for my 5 new servers that belong to 5 different components. Is
>>> writing 5 different recording rules with the same name but different
>>> instance and component labels the only way to proceed here? Won't that be a
>>> little too messy to maintain? What if it were 20 servers, each belonging to a
>>> different component?
>>>
>>> On Tue, Jun 30, 2020 at 11:43 AM Christian Hoffmann <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> On 6/24/20 8:09 PM, [email protected] wrote:
>>>> > Hi. Currently I am using a custom threshold for my memory alerts.
>>>> > I have 2 main labels on every node exporter target: cluster and
>>>> > component.
>>>> > Until now my custom threshold has been based on the component, as I had
>>>> > to define that particular threshold for all the servers of the
>>>> > component. But now I have 5 instances, all from different components,
>>>> > and I have to set their threshold to 97. How do I approach this?
>>>> > 
>>>> > My typical node exporter job.
>>>> >   - job_name: 'node_exporter_JOB-A'
>>>> >     static_configs:
>>>> >     - targets: [ 'x.x.x.x:9100' , 'x.x.x.x:9100']
>>>> >       labels:
>>>> >         cluster: 'Cluster-A'
>>>> >         env: 'PROD'
>>>> >         component: 'Comp-A'
>>>> >     scrape_interval: 10s
>>>> > 
>>>> > Recording rule for custom thresholds.
>>>> >   - record: abcd_critical
>>>> >     expr: 99.9
>>>> >     labels:
>>>> >       component: 'Comp-A'
>>>> > 
>>>> >   - record: xyz_critical
>>>> >     expr: 95
>>>> >     labels:
>>>> >       component: 'Comp-B'
>>>> > 
>>>> > The expression for Memory Alert.
>>>> > ((node_memory_MemTotal_bytes - node_memory_MemFree_bytes -
>>>> > node_memory_Cached_bytes) / node_memory_MemTotal_bytes * 100) *
>>>> > on(instance) group_left(nodename) node_uname_info > on(component)
>>>> > group_left() (abcd_critical or xyz_critical or on(component) count by
>>>> > (component)((node_memory_MemTotal_bytes - node_memory_MemFree_bytes -
>>>> > node_memory_Cached_bytes) / node_memory_MemTotal_bytes * 100) * 0 + 90)
>>>> > 
>>>> > Now, I have 5 servers with different components. How do I include them
>>>> > in the most optimized manner?
>>>>
>>>> This looks almost like the pattern described here:
>>>> https://www.robustperception.io/using-time-series-as-alert-thresholds
>>>>
>>>> It looks like you already tried to integrate the two different ways to
>>>> specify thresholds, right? Is there any specific problem with it?
>>>>
>>>> Sadly, this pattern quickly becomes complex, especially if nested (like
>>>> you would need to do) and if combined with an already longer query (like
>>>> in your case).
>>>>
>>>> I can only suggest trying to move some of the complexity out of the
>>>> query (e.g. by moving the memory calculation to a recording rule
>>>> instead).
>>>>
>>>> You can also split the rule into multiple rules (with the same name).
>>>> You will just have to ensure that each one only ever fires for a subset of
>>>> your instances (e.g. the first variant would only fire for
>>>> component-based thresholds, the second only for instance-based
>>>> thresholds).
>>>>
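
To make the two suggestions above concrete, here is a rough, untested sketch of what I think they amount to. The names instance:node_memory_used:percent, memory_critical_instance, memory_critical_component and MemoryUsageCritical are hypothetical, chosen only for this example; the values 97, 99.9, 95 and 90 and the x.x.x.x:9100 placeholder come from the thread.

```yaml
groups:
  - name: memory_thresholds
    rules:
      # Move the memory calculation out of the alert expression.
      - record: instance:node_memory_used:percent
        expr: |
          (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Cached_bytes)
            / node_memory_MemTotal_bytes * 100

      # Instance-level override: one rule per special server.
      - record: memory_critical_instance
        expr: 97
        labels:
          instance: 'x.x.x.x:9100'

      # Component-level thresholds, as before.
      - record: memory_critical_component
        expr: 99.9
        labels:
          component: 'Comp-A'
      - record: memory_critical_component
        expr: 95
        labels:
          component: 'Comp-B'

      # Precedence: instance override, then component threshold, then default 90.
      - alert: MemoryUsageCritical
        expr: |
          instance:node_memory_used:percent
            > on (instance) group_left ()
          (
              memory_critical_instance
            or on (instance)
              (instance:node_memory_used:percent * 0 + on (component) group_left () memory_critical_component)
            or on (instance)
              (instance:node_memory_used:percent * 0 + 90)
          )
```

Each `or on (instance)` only adds instances that no earlier branch has already covered, so the first matching threshold wins. With 20 special servers this would still be 20 small memory_critical_instance rules, but the alert expression itself would not change.
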
>>>> Hope this helps.
>>>>
>>>> Kind regards,
>>>> Christian
>>>>
>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/CAFGi5vB8S0_Gi03HSS%2BUFnQ%3DmWrWVwoBSAxJDhS3ed9r4QcTEA%40mail.gmail.com
>>
>

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/a0526d89-d0f9-4f39-b319-741323e65885n%40googlegroups.com.
