Just be aware that you can end up with very noisy data. Something which looks 
like a failure could easily be due to transient issues, such as failed scrapes. 
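One way to damp that noise is to alert on an averaged `up` series rather than a single sample. A sketch (the 1m window and 0.5 threshold here are arbitrary illustration values, not a recommendation from this thread):

```yaml
groups:
  - name: Instances
    rules:
      - alert: InstanceDown
        # avg_over_time smooths over single failed scrapes: with a 15s
        # scrape interval there are 4 samples per minute, so one
        # transient failure leaves the average at 0.75 and nothing fires.
        expr: avg_over_time(up[1m]) < 0.5
        for: 30s
        labels:
          severity: page
```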

On 22 June 2020 20:46:28 BST, "Sébastien Dionne" <[email protected]> 
wrote:
>thanks
>
>In my case, the alerts will be sent to our healthManager, which will
>update the state of our application in the database. No human
>interaction.
>
>I thought of using a script with a liveness probe, and the script could
>send a POST to our healthManager, but in the end it's the same thing,
>because the liveness probe would run roughly every 5 seconds. So I
>prefer to use the metrics that Prometheus will scrape anyway.
>
>On Monday, June 22, 2020 at 3:42:14 PM UTC-4, Stuart Clark wrote:
>>
>> While it is definitely possible to have very low scrape intervals and
>> very sensitive alerts, that often results in poor outcomes.
>>
>> The reality is that reaction times to alerts are generally fairly
>> long: an alert outside of office hours could easily take 30 minutes or
>> longer to respond to. I'd suggest being very careful about such short
>> "for" intervals. You can very easily end up with a lot of false
>> positives, with alerts which fire then resolve, fire then resolve.
>>
>> But technically you can have scrape intervals of a second or less, and
>> "for"s of a few seconds.
>>
>> On 22 June 2020 20:08:08 BST, "Sébastien Dionne"
>> <[email protected]> wrote:
>>>
>>> I want to use Prometheus + Alertmanager as a health manager. I want
>>> to know the lowest value I can use for scraping metrics (I hope I can
>>> have a config for particular rules) and send an alert as soon as one
>>> fires. I need almost real time. Is that possible with Prometheus +
>>> Alertmanager?
>>>
>>> I have a sample config that works now, but is it possible to use 1s,
>>> or something such that Prometheus sends an alert as soon as the
>>> metric is read?
>>>
>>> serverFiles:
>>>   alerts:
>>>     groups:
>>>       - name: Instances
>>>         rules:
>>>           - alert: InstanceDown
>>>             expr: up == 0
>>>             for: 10s
>>>             labels:
>>>               severity: page
>>>             annotations:
>>>               description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 10 seconds.'
>>>               summary: 'Instance {{ $labels.instance }} down'
>>>
>>> alertmanagerFiles:
>>>   alertmanager.yml:
>>>     route:
>>>       receiver: default-receiver
>>>       group_wait: 5s
>>>       group_interval: 10s
>>>
>>>     receivers:
>>>       - name: default-receiver
>>>         webhook_configs:
>>>           - url: "https://webhook.site/815a0b0b-f40c-4fc2-984d-e29cb9606840"
>>>               
>>>
>>>
>> -- 
>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>
>
>-- 
>You received this message because you are subscribed to the Google
>Groups "Prometheus Users" group.
>To unsubscribe from this group and stop receiving emails from it, send
>an email to [email protected].
>To view this discussion on the web visit
>https://groups.google.com/d/msgid/prometheus-users/f1787055-a91f-491b-8eaf-0a8fec9aca00o%40googlegroups.com.
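One practical note for the healthManager integration: Alertmanager's webhook_configs POST a standard JSON body, so the receiver will need to parse roughly this shape (the values below are invented for illustration, and some fields such as groupKey are omitted):

```json
{
  "version": "4",
  "status": "firing",
  "receiver": "default-receiver",
  "groupLabels": { "alertname": "InstanceDown" },
  "commonLabels": { "alertname": "InstanceDown", "severity": "page" },
  "commonAnnotations": {},
  "externalURL": "http://alertmanager.example:9093",
  "alerts": [
    {
      "status": "firing",
      "labels": { "alertname": "InstanceDown", "instance": "app-1:8080", "job": "app", "severity": "page" },
      "annotations": { "summary": "Instance app-1:8080 down" },
      "startsAt": "2020-06-22T19:45:00Z",
      "endsAt": "0001-01-01T00:00:00Z"
    }
  ]
}
```

Each element of "alerts" is one alert in the group, and a later notification with "status": "resolved" is how the healthManager would learn the instance is back.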

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/C88F5644-BCDD-47F5-86BD-38D7B9CEC263%40Jahingo.com.
