Hi Brian,
Thanks for your reply.
1) Below is the log of the error. I have severity set up in the
Alertmanager config.yml:
- name: 'pagerduty_prod_default'
  pagerduty_configs:
    - send_resolved: true
      routing_key: ${PAGERDUTY_PROD_DEFAULT_KEY}
      description: '{{ template "pagerduty.default.description" .}}'
      severity: '{{ .CommonLabels.severity }}'
      details:
        summary: |-
          {{ range .Alerts }}{{ .Annotations.summary }}
          {{ end }}
        severity: '{{ .CommonLabels.severity }}'
        status: '{{ .Status }}'
level=error ts=2021-10-01T12:52:34.264Z caller=dispatch.go:309
component=dispatcher msg="Notify for alerts failed" num_alerts=31
err="pagerduty_prod_default/pagerduty[0]: notify retry canceled due to
unrecoverable error after 1 attempts: unexpected status code 400: Event
object is invalid: 'payload.severity' is missing or blank"
The severity comes from the alert, but I would like to know if there is a
global way of setting the severity and then overriding it per specific
alert; this would remove a lot of redundant lines in my config YAML.
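(One untested idea I am considering: instead of taking the severity
straight from the alert label, use a Go-template fallback in the
pagerduty_config so that alerts without a severity label default to
"error" and only the alerts that need something else carry the label.
A sketch, not verified against Alertmanager:

```yaml
# Hypothetical fallback: use the alert's severity label if present,
# otherwise default to "error" so PagerDuty never sees a blank value.
severity: '{{ if .CommonLabels.severity }}{{ .CommonLabels.severity }}{{ else }}error{{ end }}'
```

Would something like this work, or is there a cleaner global setting?)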
2) You wrote:

> Yes it will. The thing which triggers the alert is the presence of any
> timeseries with any value, i.e. a non-empty instant vector. Even if there
> are no labels, the timeseries still exists.
>
> I'm not sure why you want to sum() over count() though. Unless you're
> doing "count by" then you'll only get a single count, and summing a single
> value just gives that value. The expression
>     count(up{job="node"})
> already returns a timeseries with no labels.
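Understood; so if I follow correctly, my original expression and the
simplified version (dropping the redundant outer sum()) should both return
the same single, label-less series:

```
sum(count(up{job=~"traefikv2",origin="k3s"}))
count(up{job=~"traefikv2",origin="k3s"})
```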
When the timeseries doesn't have any labels, how is the grouping of these
alerts handled?
Thanks
Eswar
On Monday, 4 October 2021 at 15:12:05 UTC+2 Brian Candler wrote:
> On Monday, 4 October 2021 at 12:17:20 UTC+1 [email protected] wrote:
>
>> I have prom with alertmanager set up working fine. But some alerts though
>> being fired are not being ending up in PagerDuty. A look at the logs say it
>> is to do with the severity not being set. I want to know if there is a
>> global way of setting of severity at global level in alert.rules.yml and
>> then overwrite it wherever it is required?
>>
>
> I don't use pagerduty, but as far as I can see, you set the severity
> statically in the pagerduty_config
> <https://prometheus.io/docs/alerting/latest/configuration/#pagerduty_config>
> in alertmanager; if you don't set it, it defaults to "error". So I can't
> see how it's possible to send an alert to pagerduty without a severity.
> Perhaps you could show the actual logs? Maybe your alertmanager
> configuration is using label 'severity' as a routing key, which would be
> something else (not related to pagerduty).
>
>
>>
>> Also, when my prom query only outputs a number without any labels (for
>> example sum(count(up{job=~"traefikv2",origin="k3s"}))) , the query returns
>> only the value without any labels. does this still trigger an alert?
>>
>>
> Yes it will. The thing which triggers the alert is the presence of any
> timeseries with any value, i.e. a non-empty instant vector. Even if there
> are no labels, the timeseries still exists.
>
> I'm not sure why you want to sum() over count() though. Unless you're
> doing "count by" then you'll only get a single count, and summing a single
> value just gives that value. The expression
> count(up{job="node"})
> already returns a timeseries with no labels.
>
>
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/c4d4a36c-3986-4000-9add-264b6963ddffn%40googlegroups.com.