Okay, I think I got some logs. I'm just not sure what they mean...

level=debug ts=2020-04-08T05:08:37.628Z caller=dispatch.go:104 
component=dispatcher msg="Received alert" 
alert=swap_usage_java_high[d346adb][active]
level=debug ts=2020-04-08T05:08:37.628Z caller=dispatch.go:432 
component=dispatcher 
aggrGroup="{}/{severity=\"warning\"}:{instance=\"lsrv0008\"}" msg=flushing 
alerts=[swap_usage_java_high[d346adb][active]]
level=debug ts=2020-04-08T05:08:38.630Z caller=dispatch.go:432 
component=dispatcher 
aggrGroup="{}/{severity=\"warning\"}:{instance=\"lsrv0008\"}" msg=flushing 
alerts=[swap_usage_java_high[d346adb][active]]
level=debug ts=2020-04-08T05:08:39.630Z caller=dispatch.go:432 
component=dispatcher 
aggrGroup="{}/{severity=\"warning\"}:{instance=\"lsrv0008\"}" msg=flushing 
alerts=[swap_usage_java_high[d346adb][active]]
level=debug ts=2020-04-08T05:08:40.630Z caller=dispatch.go:432 
component=dispatcher 
aggrGroup="{}/{severity=\"warning\"}:{instance=\"lsrv0008\"}" msg=flushing 
alerts=[swap_usage_java_high[d346adb][active]]
and this last line keeps coming.
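The dispatcher lines only show the alert being received and flushed to its aggregation group; an actual delivery failure would show up after the flush. One thing I can do to rule out the MTA is check whether the smarthost the config points at (localhost:25) answers SMTP at all. A minimal sketch, not Alertmanager's own code — the function name is mine:

```python
import smtplib

def check_smarthost(host="localhost", port=25, timeout=5.0):
    """Open an SMTP session against the smarthost and return the EHLO reply.

    Returns (code, message) on success, or (None, error string) if the
    connection fails -- the same connection Alertmanager's email notifier
    would have to make.
    """
    try:
        with smtplib.SMTP(host, port, timeout=timeout) as smtp:
            code, message = smtp.ehlo()
            return code, message.decode(errors="replace")
    except OSError as exc:
        return None, str(exc)
```

Running `check_smarthost()` on the Alertmanager host should return a 250 reply; `(None, 'Connection refused')` or a timeout would explain missing mail without Alertmanager being at fault.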

On Tuesday, April 7, 2020 at 18:00:59 UTC+2, Matthias Rampke wrote:
>
> What do the alertmanager logs say? If you don't see anything, increase 
> verbosity until you can see Alertmanager receiving the alert and trying to 
> send the notification. At sufficient verbosity, you should be able to trace 
> exactly what it is trying and/or failing to do.
>
> /MR
>
> On Tue, Apr 7, 2020 at 8:52 AM Danny de Waard <[email protected]> wrote:
>
>> I'm having some trouble setting up Alertmanager.
>>
>> I have set up a rules file in Prometheus (see below) and a config file 
>> for Alertmanager (also below).
>> In Alertmanager I see the active alert for swap usage (java):
>>
>> instance="lsrv0008"
>> 1 alert
>>
>>    - 06:49:37, 2020-04-07 (UTC)
>>      Source: 
>> <http://lsrv2289.linux.rabobank.nl:9090/graph?g0.expr=swapusage_stats%7Bapplication%3D%22java%22%7D+%3E+500000&g0.tab=1>
>>      Silence: 
>> <http://lsrv2289.linux.rabobank.nl:9093/#/silences/new?filter=%7Balertname%3D%22swap_usage_java_high%22%2C%20application%3D%22java%22%2C%20exportertype%3D%22node_exporter%22%2C%20host%3D%22lsrv0008%22%2C%20instance%3D%22lsrv0008%22%2C%20job%3D%22PROD%22%2C%20monitor%3D%22codelab-monitor%22%2C%20quantity%3D%22kB%22%2C%20severity%3D%22warning%22%7D>
>>    alertname="swap_usage_java_high"
>>    application="java"
>>    exportertype="node_exporter"
>>    host="lsrv0008"
>>    job="PROD"
>>    monitor="codelab-monitor"
>>    quantity="kB"
>>    severity="warning"
>>    
>> But the mail is not sent by Alertmanager… what am I missing?
>>
>> Prometheus rules file
>> groups:
>> - name: targets
>>   rules:
>>   - alert: monitor_service_down
>>     expr: up == 0
>>     for: 40s
>>     labels:
>>       severity: critical
>>     annotations:
>>       summary: "Monitor service non-operational"
>>       description: "Service {{ $labels.instance }} is down."
>>   - alert: server_down
>>     expr: probe_success == 0
>>     for: 30s
>>     labels:
>>       severity: critical
>>     annotations:
>>       summary: "Server is down (no probes are up)"
>>       description: "Server {{ $labels.instance }} is down."
>>   - alert: loadbalancer_down
>>     expr: loadbalancer_stats < 1
>>     for: 30s
>>     labels:
>>       severity: critical
>>     annotations:
>>       summary: "A loadbalancer is down"
>>       description: "Loadbalancer for {{ $labels.instance }} is down."
>> - name: host
>>   rules:
>>   - alert: high_cpu_load1
>>     expr: node_load1 > 8.0
>>     for: 300s
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: "Server under high load (load 1m) for 5 minutes"
>>       description: "Host is under high load, the avg load 1m is at {{ 
>> $value}}. Reported by instance {{ $labels.instance }} of job {{ $labels.job 
>> }}."
>>   - alert: high_cpu_load5
>>     expr: node_load5 > 5.0
>>     for: 600s
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: "Server under high load (load 5m) for 10 minutes."
>>       description: "Host is under high load, the avg load 5m is at {{ 
>> $value}}. Reported by instance {{ $labels.instance }} of job {{ $labels.job 
>> }}."
>>   - alert: high_cpu_load15
>>     expr: node_load15 > 4.5
>>     for: 900s
>>     labels:
>>       severity: critical
>>     annotations:
>>       summary: "Server under high load (load 15m) for 15 minutes."
>>       description: "Host is under high load, the avg load 15m is at {{ 
>> $value}}. Reported by instance {{ $labels.instance }} of job {{ $labels.job 
>> }}."
>>   - alert: high_volume_workers_prod
>>     expr: sum(apache_workers{job="Apache PROD"}) by (instance) > 325
>>     for: 30s
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: "Number of workers above 325 for 30s"
>>       description: "The Apache workers are over 325 for 30s. Current 
>> value is {{ $value}}. Reported by instance {{ $labels.instance }} of job {{ 
>> $labels.job }}."
>>   - alert: medium_volume_workers_prod
>>     expr: sum(apache_workers{job="Apache PROD"}) by (instance) > 300
>>     for: 30s
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: "Number of workers above 300 for 30s"
>>       description: "The Apache workers are over 300 for 30s. Current 
>> value is {{ $value}}. Reported by instance {{ $labels.instance }} of job {{ 
>> $labels.job }}."
>>   - alert: swap_usage_java_high
>>     expr: swapusage_stats{application="java"} > 500000
>>     for: 300s
>>     labels:
>>       severity: warning
>>     annotations:
>>       summary: "Swap usage for Java is high for the last 5 minutes"
>>       description: "The swap usage for the java process is high. Current 
>> value is {{ $value}}. Reported by instance {{ $labels.instance }} of job {{ 
>> $labels.job }}."
>>
>>
>>
>> Alertmanager config file
>> global:
>>   resolve_timeout: 5m
>>   http_config: {}
>>   smtp_from: [email protected]
>>   smtp_hello: localhost
>>   smtp_smarthost: localhost:25
>>   smtp_require_tls: true
>>   pagerduty_url: https://events.pagerduty.com/v2/enqueue
>>   hipchat_api_url: https://api.hipchat.com/
>>   opsgenie_api_url: https://api.opsgenie.com/
>>   wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
>>   victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
>> route:
>>   receiver: default
>>   group_by:
>>   - instance
>>   routes:
>>   - receiver: mail
>>     match:
>>       severity: warning
>>   - receiver: all
>>     match:
>>       severity: critical
>>   group_wait: 1s
>>   group_interval: 1s
>> receivers:
>> - name: default
>> - name: mail
>>   email_configs:
>>   - send_resolved: true
>>     to: [email protected]
>>     from: [email protected]
>>     hello: localhost
>>     smarthost: localhost:25
>>     headers:
>>       From: [email protected]
>>       Subject: '{{ template "email.default.subject" . }}'
>>       To: [email protected]
>>     html: '{{ template "email.default.html" . }}'
>>     require_tls: false
>> - name: all
>>   email_configs:
>>   - send_resolved: true
>>     to: [email protected]
>>     from: [email protected]
>>     hello: localhost
>>     smarthost: localhost:25
>>     headers:
>>       From: [email protected]
>>       Subject: '{{ template "email.default.subject" . }}'
>>       To: [email protected]
>>     html: '{{ template "email.default.html" . }}'
>>     require_tls: false
>>   - send_resolved: true
>>     to: [email protected]
>>     from: [email protected]
>>     hello: localhost
>>     smarthost: localhost:25
>>     headers:
>>       From: [email protected]
>>       Subject: '{{ template "email.default.subject" . }}'
>>       To: [email protected]
>>     html: '{{ template "email.default.html" . }}'
>>     require_tls: false
>> - name: webhook
>>   webhook_configs:
>>   - send_resolved: true
>>     http_config: {}
>>     url: http://127.0.0.1:9000
>> templates: []
>>
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/890fe817-5cdb-44ef-8446-70b9a0e93e76%40googlegroups.com.
