I don't think your use case is something that AM or Prometheus is looking
to solve.

The way I see it:

   - Prometheus has metrics and alerting rules.
   - When a rule's condition is met, Prometheus fires an alert and sends it
     to AM.
   - AM receives the alert and does some basic routing based on labels (see
     the config sketch after this list).
   - Once the rule's condition stops being true, a resolved notification is
     sent.
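
For illustration, that label-based routing is just AM's routing tree. A
minimal sketch, where the receiver names, label values, and addresses are
all placeholders:

    route:
      receiver: default-email
      routes:
        - match:
            team: payments            # hypothetical label on the alert
          receiver: payments-email

    receivers:
      - name: default-email
        email_configs:
          - to: ops@example.com
      - name: payments-email
        email_configs:
          - to: payments-oncall@example.com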

That's pretty much it. There is no concept of escalation, end-to-end
service recovery, or service mapping inside AM or Prometheus.

In theory, you could have a "fake" alert, where you send some JSON to AM
with specific labels, and that triggers a specific route to send the
SMS/email to the appropriate recipients. But I don't think it's really part
of the core purpose of AM.
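
If someone did want to go that route, the "fake" alert is nothing more
than a POST to AM's alerts API. A rough sketch (the label names, values,
and summary text are made up):

    curl -XPOST http://alertmanager:9093/api/v2/alerts \
      -H "Content-Type: application/json" \
      -d '[{
            "labels": {
              "alertname": "ManualEscalation",
              "team": "payments",
              "severity": "page"
            },
            "annotations": {
              "summary": "Escalated manually by support"
            }
          }]'

The labels are then matched by the routing tree exactly as for a
Prometheus-generated alert.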

Escalation is one of the value-adds of a service like PagerDuty. But that
still relies on the same chain: Prometheus metric --> Prometheus alert
fires --> AM receives the alert --> AM sends the alert somewhere.
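
The last hop of that chain is just another AM receiver, roughly like this
(the integration key is obviously a placeholder):

    receivers:
      - name: pagerduty
        pagerduty_configs:
          - routing_key: <your-pagerduty-integration-key>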

Just my 2 cents :)

On Wed, Dec 16, 2020 at 11:21 AM Al <[email protected]> wrote:

> Thanks for the quick response Stuart.  One of our specific use cases
> (although there will be more over time) would be something where a first
> or second level support team escalates an issue they can't solve to the
> engineers responsible for the product.  In this case, there would be no
> metric, as this is an event that could happen at any time and for which we
> don't really want a metric.  Triggering the alert via Alertmanager seemed
> a logical choice as it already handles the logic of routing to the
> necessary destinations (email, webhook, VictorOps, etc.).  All the user
> would have to do is run the amtool command with the necessary labels, and
> wouldn't have to worry about any other specifics.
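>
> For what it's worth, a minimal sketch of such an invocation might look
> like this (the label names, annotation text, and URL are placeholders):
>
>     amtool alert add alertname=SupportEscalation team=payments severity=page \
>       --annotation=summary="Escalated by first-level support" \
>       --alertmanager.url=http://alertmanager:9093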
>
> Based on your explanation, I now understand Alertmanager can't really be
> used that way.  Could you show me where in the AM source code it will
> close an alert unless it is continuously notified by Prometheus?  I'd like
> to know for my own personal knowledge.
>
> Now, having considered these facts, do you have any suggestions based on
> this example?  Is this just something we should handle separately with
> another custom application?  If that's the case, it's a bit discouraging,
> as it means we would have to handle the logic of alert routing in more
> than one location.
>
>
>
> Al
>
> On Monday, December 14, 2020 at 12:52:53 PM UTC-5 Stuart Clark wrote:
>
>> On 2020-12-14 17:05, Al wrote:
>> > Hi
>> >
>> > I realize alert conditions in a Prometheus ecosystem should be
>> > triggered from a Prometheus instance itself, although there is the
>> > "amtool alert add" command that can be used to manually trigger an
>> > alert. Is this something which is commonly used in production
>> > use-cases? I can see a benefit to using this command as I could still
>> > allow users to trigger alerts in a standardized way, but without
>> > having to have specific pre-defined alerting conditions. There may
>> > also be situations where no metric is collected but an alert should
>> > be triggered when a specific event occurs.
>> >
>> > From my understanding, when Prometheus fires an alert, it will send
>> > the payload to all Alertmanager instances within the cluster, and
>> > they will then handle which instance actually routes the alert to
>> > the final destination (e.g. VictorOps, email, webhook, etc.). If this
>> > is in fact correct, does this mean that amtool should also send the
>> > alert to all Alertmanager instances within the cluster?
>> >
>> > I appreciate any clarification you can provide me with.
>> >
>>
>> That command is only intended for testing. Alerts aren't a one-off API
>> call from Prometheus to Alertmanager. Instead, Prometheus will
>> periodically re-send the alert to every single Alertmanager until the
>> alert is cleared. If Alertmanager stops receiving these updates it will
>> mark the alert as resolved.
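>>
>> As an aside, for alerts pushed without an explicit end time (e.g. via
>> amtool or the API) the relevant knob is Alertmanager's resolve_timeout.
>> A sketch of the default:
>>
>>     # alertmanager.yml (global section)
>>     global:
>>       resolve_timeout: 5m   # alerts with no end time are marked resolved
>>                             # roughly this long after they were last posted
>>
>> Prometheus-originated alerts carry their own end time, which Prometheus
>> keeps extending for as long as the rule is still firing.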
>>
>> Alerts in the Prometheus world are triggered based on the evaluation of
>> alerting rules, which are themselves queries that interrogate metrics.
>> Therefore every alert would be based on some sort of source metric
>> (there are a few exceptions, such as an alert which always fires, used
>> to check the alerting pipeline end to end).
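>>
>> That "always firing" exception, as a Prometheus alerting rule, is roughly
>> this (the names are placeholders):
>>
>>     groups:
>>       - name: meta
>>         rules:
>>           - alert: Watchdog
>>             expr: vector(1)   # always returns a sample, so it never resolves
>>             annotations:
>>               summary: Always-firing alert used to verify the alerting pipeline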
>>
>> For one of the example use cases you gave you said an alert should be
>> triggered if an event happens. Prometheus itself isn't an event system,
>> but you can create metrics from events. So in that case you'd have a
>> metric that could be a counter of the number of events that have
>> happened. Then your alert would fire when that value increases (for
>> example).
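>>
>> For example, something along these lines (my_events_total is a
>> hypothetical counter exposed by your application or an exporter):
>>
>>     groups:
>>       - name: events
>>         rules:
>>           - alert: EventOccurred
>>             expr: increase(my_events_total[5m]) > 0
>>             annotations:
>>               summary: At least one event was observed in the last 5 minutes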
>>
>> Are you able to give some more information on what use cases you are
>> trying to handle?
>>
>> --
>> Stuart Clark
>>
