I'm trying to create an alert when a specific field has been in a certain state
for too long.
Currently my TICKscript looks like this:

    var data = batch
        |query('select active from telegraf.autogen.postgresql_replication_slots')
            .period(4h)
            .every(10s)
            .groupBy('host', 'slot_name')

    var data_last = data
        |last('active')
            .as('active')

    var data_last_active = data
        |where(lambda: "active" == TRUE)
        |last('active')
            .as('active')

    var data_union = data_last
        |union(data_last_active)

    var data_elapsed = data_union
        |elapsed('active', 1s)
        |log()

    var data_count = data_union
        |count('active')
            .as('count')
        |log()

    data_elapsed
        |join(data_count)
            .as('elapsed', 'count')
            .tolerance(10s)
            .fill('none')
        |log()

The idea is that `data_last` is the last data point and `data_last_active` is
the last data point where `active == true`. We then calculate the time
difference between these two points; if the difference is greater than X, we
generate an alert.
We also want to handle the case where `data_last_active` is empty (no match
within the time period), so we take the count of points, which in that case
would be 1.
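
To pin down the intended logic independently of Kapacitor, here is a minimal Python sketch of the computation for a single `host`/`slot_name` group. It is not TICKscript and not a proposed fix. The names `time_since_last_active` and `should_alert` are hypothetical, and treating "no active point in the window" as an alert condition is an assumption about the desired behavior.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple

# One point per sample: (timestamp, value of the "active" field).
Point = Tuple[datetime, bool]


def time_since_last_active(points: Sequence[Point]) -> Optional[timedelta]:
    """Time between the newest point and the newest point with active == True.

    Returns None if the window is empty or holds no active point.
    """
    if not points:
        return None
    ordered = sorted(points, key=lambda p: p[0])
    last_time = ordered[-1][0]
    # Walk backwards to find the most recent active point.
    for ts, active in reversed(ordered):
        if active:
            return last_time - ts
    return None


def should_alert(points: Sequence[Point], threshold: timedelta) -> bool:
    """Alert if the slot has been inactive for longer than `threshold`.

    Assumption: a non-empty window with no active point counts as an alert,
    and an empty window (no data at all) does not.
    """
    if not points:
        return False
    gap = time_since_last_active(points)
    return gap is None or gap > threshold
```

For example, with points at t, t+10s and t+20s where only the first is active, the gap is 20s, so a 15s threshold would alert and a 30s threshold would not.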
However, there are several problems with this:
1. `elapsed()` is including the data points from the previous batch period,
instead of just within the batch. So if batch.period is 60s, then one of the
elapsed values is going to be 60s.
2. `elapsed()` won't emit anything at all if there is no previous data point,
thus breaking the case where `data_last_active` is empty.
3. `count()` is buffering, and doesn't release the data points until the next
batch comes in.
Here's an example of what the above generates:

    [test:log10] 2016/11/11 12:31:09 I! {"Name":"postgresql_replication_slots","Database":"","RetentionPolicy":"","Group":"host=fll2gdbs01qa,slot_name=fll2gbar01stg","Dimensions":{"ByName":false,"TagNames":["host","slot_name"]},"Tags":{"host":"fll2gdbs01qa","slot_name":"fll2gbar01stg"},"Fields":{"count":2},"Time":"2016-11-11T12:30:59.785563762-05:00"}
    [test:log10] 2016/11/11 12:31:09 I! {"Name":"postgresql_replication_slots","Database":"","RetentionPolicy":"","Group":"host=fll2gdbs01qa,slot_name=fll2gdbs01qa","Dimensions":{"ByName":false,"TagNames":["host","slot_name"]},"Tags":{"host":"fll2gdbs01qa","slot_name":"fll2gdbs01qa"},"Fields":{"count":1},"Time":"2016-11-11T12:30:59.785563762-05:00"}
    [test:log8] 2016/11/11 12:31:09 I! {"Name":"postgresql_replication_slots","Database":"","RetentionPolicy":"","Group":"host=fll2gdbs01qa,slot_name=fll2gbar01stg","Dimensions":{"ByName":false,"TagNames":["host","slot_name"]},"Tags":{"host":"fll2gdbs01qa","slot_name":"fll2gbar01stg"},"Fields":{"elapsed":10},"Time":"2016-11-11T17:31:09.785572403Z"}
    [test:log8] 2016/11/11 12:31:09 I! {"Name":"postgresql_replication_slots","Database":"","RetentionPolicy":"","Group":"host=fll2gdbs01qa,slot_name=fll2gbar01stg","Dimensions":{"ByName":false,"TagNames":["host","slot_name"]},"Tags":{"host":"fll2gdbs01qa","slot_name":"fll2gbar01stg"},"Fields":{"elapsed":0},"Time":"2016-11-11T17:31:09.785572403Z"}
    [test:log8] 2016/11/11 12:31:09 I! {"Name":"postgresql_replication_slots","Database":"","RetentionPolicy":"","Group":"host=fll2gdbs01qa,slot_name=fll2gdbs01qa","Dimensions":{"ByName":false,"TagNames":["host","slot_name"]},"Tags":{"host":"fll2gdbs01qa","slot_name":"fll2gdbs01qa"},"Fields":{"elapsed":10},"Time":"2016-11-11T17:31:09.785572403Z"}

Any suggestions to get this working?
-Patrick