Re: [prometheus-users] Re: Alert Query

'Brian Candler' via Prometheus Users Wed, 14 Feb 2024 01:20:37 -0800

(max_over_time(kube_pod_status_ready{condition="true"}[10m]) == 0)


That won't solve your problem, because if there were only one initial 
sample with value zero, it would trigger immediately, just as it does 
today.  You would need to make this more complex, for example by also 
checking count_over_time.
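
A minimal sketch of that idea, assuming a 30s scrape interval (so a full 
10m window holds roughly 20 samples). The threshold of 18 is only an 
illustrative guess and would need tuning to your own scrape interval:

```promql
# Assumption: 30s scrape interval, so about 20 samples per 10m window.
# The threshold 18 is hypothetical; adjust it to your setup.
max_over_time(kube_pod_status_ready{condition="true"}[10m]) == 0
  and
count_over_time(kube_pod_status_ready{condition="true"}[10m]) >= 18
```

With this, a new pod that has produced only a handful of zero samples 
should not match, because the count_over_time side filters it out.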

Be very careful with "and" and "or" operators. They do not work like 
booleans in normal languages. They are combining vectors, matching the 
label sets of each element in the vector.
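
To illustrate with a sketch (assuming kube-state-metrics also exports a 
condition="false" series for this metric with otherwise identical labels): 
"and" keeps a left-hand element only if the right-hand side contains an 
element with exactly the same label set, and "or" returns all left-hand 
elements plus any right-hand elements whose label sets have no match on 
the left.

```promql
# Returns nothing: the two sides differ in the "condition" label,
# so no left-hand element has a matching label set on the right.
kube_pod_status_ready{condition="true"} == 0
  and
kube_pod_status_ready{condition="false"} == 1

# Can match: "condition" is excluded when comparing label sets.
kube_pod_status_ready{condition="true"} == 0
  and ignoring (condition)
kube_pod_status_ready{condition="false"} == 1
```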

I recommend you go for the simplest expression which meets your needs 
sufficiently well.

On Wednesday 14 February 2024 at 03:46:33 UTC sri L wrote:

> Thanks Brian Candler.
>
> I am thinking of combining two conditions.
>
> ((kube_pod_status_ready{condition="true"} == 0 and 
> max_over_time(kube_pod_status_ready{condition="true"}[10m]) == 1) or 
> (max_over_time(kube_pod_status_ready{condition="true"}[10m]) == 0))
>
> I expect this expression to alert if the pod was up in the last 10 minutes 
> and is currently unreachable, or if the pod has been unreachable for 10 
> minutes or more.
>
> Please correct me if there is a better way.
>
>
> On Wed, Feb 14, 2024 at 12:32 AM 'Brian Candler' via Prometheus Users <
> promethe...@googlegroups.com> wrote:
>
>> I guess it goes through non-ready states while it's starting up.
>>
>> A simple approach is to put "for: 3m" on the alert so that it doesn't 
>> fire an alert until it has been in the down state for 3 minutes.
>>
>> Another approach would be:
>>
>> kube_pod_status_ready{condition="true"} == 0 and 
>> max_over_time(kube_pod_status_ready{condition="true"}[10m]) == 1
>>
>> This will fire if the pod was ready at any time in the last 10 minutes, 
>> but is not ready now. This does mean that the alert will clear after 10 
>> minutes of error condition, though.
>>
>> On Tuesday 13 February 2024 at 17:18:37 UTC sri L wrote:
>>
>>> Hi all,
>>>
>>> I am trying to create an alert rule for a pod-unreachable condition. I 
>>> used the expression below, but the alert triggers whenever a new pod is 
>>> created. We want an alert only when a pod was previously in the ready 
>>> state and then went to an unreachable/terminating/pending state.
>>>
>>> kube_pod_status_ready{condition="true"} == 0
>>>
>>> Please suggest a suitable alert expression for the above 
>>> requirement.
>>>
>>> Thanks
>>>
>>>
>>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to prometheus-use...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-users/7728c736-797f-4771-b809-24e5f6b3931dn%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-users/7728c736-797f-4771-b809-24e5f6b3931dn%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>
