Hey all. Pretty new to TICK but I have a problem that I can't wrap my head
around.
I am monitoring multiple servers, all sending data to one InfluxDB database,
and using the 'host' tag to separate the servers in the DB.
My 'disk' measurement takes in multiple disk paths from each server, and each
path carries a respective 'path' tag.
So basically each server is assigned a 'host' tag and each host has multiple
'path' tags.
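In line protocol terms, the incoming points look roughly like this (the field
name is taken from my alert message below; the values are made up for
illustration):

```
disk,host=host1,path=/path1 used_percent=42.0
disk,host=host1,path=/path2 used_percent=85.3
disk,host=host2,path=/path1 used_percent=91.1
```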
EXPECTED FUNCTIONALITY: Kapacitor should alert upon a state change of a
host's path if that path matches the alerting lambda.
PROBLEM: When I start the Kapacitor service, it looks like it senses a
state change any time it sees another host/path with an opposite status.
This is a simplified example of the alerts I am getting:
Host: host1 Path: /path1 Status: UP
Host: host1 Path: /path2 Status: DOWN
Host: host1 Path: /path3 Status: UP
Host: host2 Path: /path1 Status: DOWN
Host: host2 Path: /path2 Status: UP
These alerts happen once for each host/path combination, and then the
service performs as expected, alerting properly when the lambda condition is
met.
The result is that I receive a slew of up/down alerts every time I restart
the Kapacitor service.
Here is my current TICKscript:

var data = stream
    |from()
        .measurement('disk')
        .groupBy('host', 'path')
    |alert()
        .message('{{ .ID }} Server:{{ index .Tags "host" }} Path: {{ index .Tags "path" }} USED PERCENT: {{ index .Fields "used_percent" }}')
        .warn(lambda: "used_percent" >= 80)
        .id('DISK SPACE WARNING')
        .email($DISK_WARN_GRP)
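For reference, the state-change behavior I'm expecting is what the AlertNode
.stateChangesOnly() property describes; a variant using it would look like
this (just a sketch, not what I'm currently running):

```
var data = stream
    |from()
        .measurement('disk')
        .groupBy('host', 'path')
    |alert()
        // only emit an event when a group's alert level changes
        // (e.g. OK -> WARNING), not on every matching point
        .stateChangesOnly()
        .warn(lambda: "used_percent" >= 80)
        .id('DISK SPACE WARNING')
        .email($DISK_WARN_GRP)
```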
And here is the corresponding 'kapacitor show' output, including the DOT:
ID: disk_alert_warn
Error:
Template:
Type: stream
Status: enabled
Executing: true
Created: 17 Feb 17 22:27 UTC
Modified: 17 Feb 17 22:27 UTC
LastEnabled: 17 Feb 17 22:27 UTC
Databases Retention Policies: ["main"."autogen"]
TICKscript:
var data = stream
    |from()
        .measurement('disk')
        .groupBy('host', 'path')
    |alert()
        .message('{{ .ID }} Server:{{ index .Tags "host" }} Path: {{ index .Tags "path" }} USED PERCENT: {{ index .Fields "used_percent" }}')
        .warn(lambda: "used_percent" >= 80)
        .id('DISK SPACE WARNING')
        .email()
DOT:
digraph disk_alert_warn {
    graph [throughput="38.00 points/s"];
    stream0 [avg_exec_time_ns="0s" ];
    stream0 -> from1 [processed="284"];
    from1 [avg_exec_time_ns="3.9µs" ];
    from1 -> alert2 [processed="284"];
    alert2 [alerts_triggered="14" avg_exec_time_ns="72.33µs" crits_triggered="0" infos_triggered="0" oks_triggered="7" warns_triggered="7" ];
}
As you can see, on startup I get 7 oks triggered (for the host/path groups
that are not in the alert range) and 7 warns triggered (for the 7 host/path
groups that are within the alert range).
Then it behaves as normal.
I understand that it should alert for the 7 host/path groups that are over
80%, but why follow that with alerts about the ok groups?
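As a sanity check on the grouping, a query along these lines (a sketch; the
database and retention policy names are taken from the show output above)
lists the latest value per host/path group:

```
SELECT last("used_percent") FROM "main"."autogen"."disk" GROUP BY "host", "path"
```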
MORE INFO: When I raise the lambda threshold to 90% (out of range for all
host/paths), I get no alerts at all, which is expected.
Thanks to anyone who can help me understand this