Kapacitor process crashes after sending email:
*Following is the Kapacitor log:*
panic: 421 Timeout waiting for data from client.
goroutine 19 [running]:
panic(0xc69800, 0xc8204f6ac0)
/usr/local/go/src/runtime/panic.go:481 +0x3e6
github.com/influxdata/kapacitor/services/smtp.(*Service).runMailer(0xc8201732c0)
        /root/go/src/github.com/influxdata/kapacitor/services/smtp/service.go:91 +0x7ad
created by github.com/influxdata/kapacitor/services/smtp.(*Service).Open
        /root/go/src/github.com/influxdata/kapacitor/services/smtp/service.go:36 +0x238
*Following is the task configured to send out emails:*
ID: cpu_alert
Error:
Type: stream
Status: enabled
Executing: true
Created: 11 Aug 16 11:45 UTC
Modified: 14 Aug 16 10:42 UTC
LastEnabled: 14 Aug 16 10:42 UTC
Databases Retention Policies: ["telegraf"."default"]
TICKscript:
stream
|from()
.measurement('cpu')
.where(lambda: "host" == 'testubuntu')
.groupBy('service')
|window()
.period(1m)
.every(1m)
|default()
.field('usage_user', 0.0)
.tag('host', '')
|alert()
.message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu usage: {{ index .Fields "usage_user" }}')
.warn(lambda: "usage_user" > 60.0)
.crit(lambda: "usage_user" > 85.0)
.log('/tmp/high_cpu.log')
.id('{{ .Name }}')
// Email subject
// .message('{{ .ID }}:{{ .Level }}')
// Email body as HTML
.details('''
<h1>{{ .ID }}</h1>
<b>{{ .Message }}</b>
Value: {{ index .Fields "usage_user" }}
''')
.email()
I am using the Amazon SES service to send out the emails. Sending and
receiving email works as expected, but the Kapacitor process fails and stops
after sending out an email. Is this a problem with the Kapacitor agent or with
my configuration? Please let me know.
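
For reference, the [smtp] section of my kapacitor.conf looks roughly like the
following (host and credentials are placeholders, not my real SES values). I am
wondering whether the idle-timeout setting is relevant here, since the 421
reply in the panic looks like the SMTP server closing a connection that has
been idle too long:

[smtp]
  enabled = true
  # placeholder SES endpoint and credentials -- not the real values
  host = "email-smtp.us-east-1.amazonaws.com"
  port = 587
  username = "SES_SMTP_USERNAME"
  password = "SES_SMTP_PASSWORD"
  from = "[email protected]"
  # assumption: keeping this shorter than the server's own idle timeout may
  # avoid the "421 Timeout waiting for data from client" reply seen above
  idle-timeout = "30s"
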
On Sun, Aug 14, 2016 at 11:01 AM, Unni Sathyarajan <[email protected]>
wrote:
> Sorry Ross, it worked this time. I forgot to reload the kapacitor tasks
> and restart the process.
>
> Also, is it possible to configure more advanced alerting options, such as
> flap detection, muting alerts for a certain period, escalation, etc.?
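>
> For example, something along these lines on the alert node is the kind of
> thing I have in mind (a rough, untested sketch; I believe .stateChangesOnly()
> and .flapping() are alert node properties, but I have not tried them yet):
>
>     |alert()
>         // only alert when the level changes, instead of on every evaluation
>         .stateChangesOnly()
>         // basic flap detection: suppress alerts while the level is flapping
>         .flapping(0.25, 0.5)
>         .email()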
>
> Thanks
> Unni
>
> On Sun, Aug 14, 2016 at 10:49 AM, Unni Sathyarajan <[email protected]>
> wrote:
>
>> Thank you very much for your response, Ross; however, that did not solve my
>> problem. I still get the same error.
>>
>> Pasting the new script and error log
>>
>> *The TICKscript:*
>>
>> stream
>> |from()
>> .measurement('cpu')
>> .where(lambda: "host" == 'testubuntu')
>> .groupBy('service')
>> |window()
>> .period(1m)
>> .every(1m)
>> |default()
>> .field('usage_user', 0.0)
>> .tag('host', '')
>> |alert()
>> .message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu usage: {{ index .Fields "usage_user" }}')
>> .info(lambda: TRUE)
>> .warn(lambda: "usage_user" > 60.0)
>> .crit(lambda: "usage_user" > 85.0)
>> .log('/tmp/high_cpu.log')
>>
>> *The Kapacitor Error Log:*
>>
>> [cpu_alert:mean3] 2016/08/14 06:46:21 E! invalid field type: <nil>
>> [task_master] 2016/08/14 06:46:21 E! Stopped task: cpu_alert invalid
>> field type: <nil>
>> [task_store] 2016/08/14 06:46:21 E! task cpu_alert finished with error:
>> invalid field type: <nil>
>>
>> I am new to the TICK stack; my intention is to set up a monitoring
>> solution with it.
>>
>> Thanks
>> Unni
>>
>> On Thu, Aug 11, 2016 at 7:19 PM, Ross McDonald <[email protected]> wrote:
>>
>>> The error here:
>>>
>>> > [cpu_alert:mean3] 2016/08/11 13:34:01 E! invalid field type: <nil>
>>>
>>> is referring to the `|mean('value')` call, which is receiving a null
>>> value for the "value" field. To fix that, you'll need to either:
>>>
>>> * Remove it, since it doesn't look like you're using it anywhere anyway
>>> (and Telegraf no longer uses "value" as a field, to my knowledge)
>>>
>>> * Add a `|default().field('value', 0.0)` call before taking the mean of
>>> value so that it defaults to 0 when it is not provided in the stream (a
>>> rough sketch of this follows below)
>>>
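>>> For example, option two might look roughly like this in your script (an
>>> untested sketch; the new |default() for "value" is the only addition, the
>>> rest of the pipeline stays as it is):
>>>
>>> stream
>>>     |from()
>>>         .measurement('cpu')
>>>         .groupBy('host')
>>>     // assumed placement: default "value" before |mean('value') runs,
>>>     // so the mean never receives a nil field
>>>     |default()
>>>         .field('value', 0.0)
>>>     |window()
>>>         .period(1m)
>>>         .every(1m)
>>>     |mean('value')
>>>     // ... rest of the original pipeline unchanged
>>>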
>>> I hope that helps!
>>>
>>> Thanks,
>>> Ross
>>>
>>> On Thu, Aug 11, 2016 at 8:39 AM, <[email protected]> wrote:
>>>
>>>> [edge:cpu_alert|mean3->default4] 2016/08/11 13:34:01 D! closing c: 0
>>>> e: 0
>>>> [edge:cpu_alert|window2->mean3] 2016/08/11 13:34:01 I! aborting c: 1
>>>> e: 1
>>>> [cpu_alert:mean3] 2016/08/11 13:34:01 E! invalid field type: <nil>
>>>> [edge:cpu_alert|default4->eval5] 2016/08/11 13:34:01 D! closing c: 0
>>>> e: 0
>>>> [edge:cpu_alert|eval5->alert6] 2016/08/11 13:34:01 D! closing c: 0 e: 0
>>>> [edge:cpu_alert|stream->stream0] 2016/08/11 13:34:01 D! closing c: 129
>>>> e: 129
>>>> [edge:cpu_alert|stream0->from1] 2016/08/11 13:34:01 D! closing c: 129
>>>> e: 129
>>>> [edge:cpu_alert|from1->window2] 2016/08/11 13:34:01 D! closing c: 129
>>>> e: 129
>>>> [edge:cpu_alert|window2->mean3] 2016/08/11 13:34:01 D! closing c: 1 e:
>>>> 1
>>>> [task_master] 2016/08/11 13:34:01 E! Stopped task: cpu_alert invalid
>>>> field type: <nil>
>>>> [task_store] 2016/08/11 13:34:01 E! task cpu_alert finished with error:
>>>> invalid field type: <nil>
>>>>
>>>> I am getting this error; what should I do?
>>>>
>>>>
>>>> Following is the output of my show task:
>>>>
>>>> kapacitor show cpu_alert
>>>> ID: cpu_alert
>>>> Error:
>>>> Type: stream
>>>> Status: enabled
>>>> Executing: true
>>>> Created: 11 Aug 16 11:45 UTC
>>>> Modified: 11 Aug 16 13:32 UTC
>>>> LastEnabled: 11 Aug 16 13:32 UTC
>>>> Databases Retention Policies: ["telegraf"."default"]
>>>> TICKscript:
>>>> stream
>>>> |from()
>>>> .measurement('cpu')
>>>> .groupBy('host')
>>>> |window()
>>>> .period(1m)
>>>> .every(1m)
>>>> |mean('value')
>>>> |default()
>>>> .field('usage_user', 0.0)
>>>> .tag('host', '')
>>>> |eval(lambda: 100.0 - "mean")
>>>> .as('used')
>>>> |alert()
>>>> .message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu usage: {{ index .Fields "usage_user" }}')
>>>> .info(lambda: TRUE)
>>>> .warn(lambda: "usage_user" > 60.0)
>>>> .crit(lambda: "usage_user" > 85.0)
>>>> .log('/tmp/high_cpu.log')
>>>>
>>>>
>>>> DOT:
>>>> digraph cpu_alert {
>>>> graph [throughput="0.00 points/s"];
>>>>
>>>> stream0 [avg_exec_time_ns="0" ];
>>>> stream0 -> from1 [processed="20"];
>>>>
>>>> from1 [avg_exec_time_ns="1.156µs" ];
>>>> from1 -> window2 [processed="20"];
>>>>
>>>> window2 [avg_exec_time_ns="0" ];
>>>> window2 -> mean3 [processed="0"];
>>>>
>>>> mean3 [avg_exec_time_ns="0" ];
>>>> mean3 -> default4 [processed="0"];
>>>>
>>>> default4 [avg_exec_time_ns="0" fields_defaulted="0" tags_defaulted="0"
>>>> ];
>>>> default4 -> eval5 [processed="0"];
>>>>
>>>> eval5 [avg_exec_time_ns="0" eval_errors="0" ];
>>>> eval5 -> alert6 [processed="0"];
>>>>
>>>> alert6 [alerts_triggered="0" avg_exec_time_ns="0" crits_triggered="0"
>>>> infos_triggered="0" oks_triggered="0" warns_triggered="0" ];
>>>> }
>>>>
>>>