I'm having some trouble setting up Alertmanager. I have set up a rules file in Prometheus (see below) and a configuration file for Alertmanager (also below). In Alertmanager I can see the active alert for the java swap usage:
instance="lsrv0008", 1 alert, since 06:49:37, 2020-04-07 (UTC)
Source: http://lsrv2289.linux.rabobank.nl:9090/graph?g0.expr=swapusage_stats%7Bapplication%3D%22java%22%7D+%3E+500000&g0.tab=1
Silence: http://lsrv2289.linux.rabobank.nl:9093/#/silences/new?filter=%7Balertname%3D%22swap_usage_java_high%22%2C%20application%3D%22java%22%2C%20exportertype%3D%22node_exporter%22%2C%20host%3D%22lsrv0008%22%2C%20instance%3D%22lsrv0008%22%2C%20job%3D%22PROD%22%2C%20monitor%3D%22codelab-monitor%22%2C%20quantity%3D%22kB%22%2C%20severity%3D%22warning%22%7D
Labels: alertname="swap_usage_java_high", application="java", exportertype="node_exporter", host="lsrv0008", instance="lsrv0008", job="PROD", monitor="codelab-monitor", quantity="kB", severity="warning"

But the mail is never sent by Alertmanager. What am I missing?

Prometheus rules file:

groups:
  - name: targets
    rules:
      - alert: monitor_service_down
        expr: up == 0
        for: 40s
        labels:
          severity: critical
        annotations:
          summary: "Monitor service non-operational"
          description: "Service {{ $labels.instance }} is down."
      - alert: server_down
        expr: probe_success == 0
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "Server is down (no probes are up)"
          description: "Server {{ $labels.instance }} is down."
      - alert: loadbalancer_down
        expr: loadbalancer_stats < 1
        for: 30s
        labels:
          severity: critical
        annotations:
          summary: "A loadbalancer is down"
          description: "Loadbalancer for {{ $labels.instance }} is down."
  - name: host
    rules:
      - alert: high_cpu_load1
        expr: node_load1 > 8.0
        for: 300s
        labels:
          severity: warning
        annotations:
          summary: "Server under high load (load 1m) for 5 minutes"
          description: "Host is under high load, the avg load 1m is at {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
      - alert: high_cpu_load5
        expr: node_load5 > 5.0
        for: 600s
        labels:
          severity: warning
        annotations:
          summary: "Server under high load (load 5m) for 10 minutes."
          description: "Host is under high load, the avg load 5m is at {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
      - alert: high_cpu_load15
        expr: node_load15 > 4.5
        for: 900s
        labels:
          severity: critical
        annotations:
          summary: "Server under high load (load 15m) for 15 minutes."
          description: "Host is under high load, the avg load 15m is at {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
      - alert: high_volume_workers_prod
        expr: sum(apache_workers{job="Apache PROD"}) by (instance) > 325
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Number of workers above 325 for 30s"
          description: "The Apache workers are over 325 for 30s. Current value is {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
      - alert: medium_volume_workers_prod
        expr: sum(apache_workers{job="Apache PROD"}) by (instance) > 300
        for: 30s
        labels:
          severity: warning
        annotations:
          summary: "Number of workers above 300 for 30s"
          description: "The Apache workers are over 300 for 30s. Current value is {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
      - alert: swap_usage_java_high
        expr: swapusage_stats{application="java"} > 500000
        for: 300s
        labels:
          severity: warning
        annotations:
          summary: "Swap usage for Java is high for the last 5 minutes"
          description: "The swap usage for the java process is high. Current value is {{ $value }}. Reported by instance {{ $labels.instance }} of job {{ $labels.job }}."
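As a first sanity check on the rules file itself, promtool (shipped with Prometheus) can lint it before Prometheus loads it. A minimal sketch, assuming promtool is on PATH; the file path is a placeholder, substitute your real rules file location:

```shell
#!/bin/sh
# Sketch: validate the alerting rules file before pointing Prometheus at it.
# RULES is a placeholder path, not the poster's actual location.
RULES="${RULES:-/etc/prometheus/rules.yml}"

if command -v promtool >/dev/null 2>&1; then
  # Prints the rule count per group on success, or a parse error on failure.
  promtool check rules "$RULES"
else
  echo "promtool not found on PATH; it ships with the Prometheus release tarball" >&2
fi
```

Since the alert is visibly firing in the Alertmanager UI, the rules side is presumably fine, but this rules out a silent reload failure after edits.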
Alertmanager configuration file:

global:
  resolve_timeout: 5m
  http_config: {}
  smtp_from: [email protected]
  smtp_hello: localhost
  smtp_smarthost: localhost:25
  smtp_require_tls: true
  pagerduty_url: https://events.pagerduty.com/v2/enqueue
  hipchat_api_url: https://api.hipchat.com/
  opsgenie_api_url: https://api.opsgenie.com/
  wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
  victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
route:
  receiver: default
  group_by:
    - instance
  routes:
    - receiver: mail
      match:
        severity: warning
    - receiver: all
      match:
        severity: critical
      group_wait: 1s
      group_interval: 1s
receivers:
  - name: default
  - name: mail
    email_configs:
      - send_resolved: true
        to: [email protected]
        from: [email protected]
        hello: localhost
        smarthost: localhost:25
        headers:
          From: [email protected]
          Subject: '{{ template "email.default.subject" . }}'
          To: [email protected]
        html: '{{ template "email.default.html" . }}'
        require_tls: false
  - name: all
    email_configs:
      - send_resolved: true
        to: [email protected]
        from: [email protected]
        hello: localhost
        smarthost: localhost:25
        headers:
          From: [email protected]
          Subject: '{{ template "email.default.subject" . }}'
          To: [email protected]
        html: '{{ template "email.default.html" . }}'
        require_tls: false
      - send_resolved: true
        to: [email protected]
        from: [email protected]
        hello: localhost
        smarthost: localhost:25
        headers:
          From: [email protected]
          Subject: '{{ template "email.default.subject" . }}'
          To: [email protected]
        html: '{{ template "email.default.html" . }}'
        require_tls: false
  - name: webhook
    webhook_configs:
      - send_resolved: true
        http_config: {}
        url: http://127.0.0.1:9000
templates: []
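Two amtool commands (amtool ships with Alertmanager) can narrow down whether this is a routing problem or an SMTP delivery problem: a config check, and a dry run of the routing tree against the firing alert's labels. A sketch, assuming amtool is on PATH; the config path is a placeholder:

```shell
#!/bin/sh
# Sketch: validate the Alertmanager config and dry-run its routing tree.
# AMCFG is a placeholder path, not the poster's actual location.
AMCFG="${AMCFG:-/etc/alertmanager/alertmanager.yml}"

if command -v amtool >/dev/null 2>&1; then
  # Syntax and semantic check of the whole file (global, route, receivers).
  amtool check-config "$AMCFG"
  # Which receiver would an alert with severity=warning be routed to?
  amtool config routes test --config.file="$AMCFG" severity=warning
else
  echo "amtool not found on PATH; it ships with the Alertmanager release tarball" >&2
fi
```

If the routing test prints the expected receiver, the next place to look would be Alertmanager's own log for SMTP errors at the moment the notification is attempted.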
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/7cbb3a17-bf66-4530-9d2c-344549c5cbb3%40googlegroups.com.

