Hello Martin,
This is my conf for the process:
check process Process1-M-aaa matching "/home/user/binn/GK/Process1 -f/home/user/binn/GK/Process1.aaa.properties --dispatch=M --daeMn"
  start program "/home/user/binn/GK/Process1 -f/home/user/binn/GK/Process1.aaa.properties --dispatch=M --pidfile /tmp/Process1-M.aaa.pid" as uid user and gid user
  stop program "/bin/bash -c '/bin/kill -9 `/bin/cat /tmp/Process1-M.aaa.pid`'" as uid user and gid user
  if does not exist then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_crit'" as uid user and gid user
  else if succeeded then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_ok'"
  if uptime > 15 seconds then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_ok'"
  group gateway
My idea is:
If the process goes down, Monit executes the nagios_cmd.sh script, which
triggers a critical alarm in Nagios and also starts the process. I couldn't
get Monit to both start the process and execute the script, so I had to put
both actions together in the script.
If the process is running, it sends an OK signal to Nagios.
To keep the OK signal active in Nagios, I added the uptime test, because I
need to keep sending OK to Nagios.
Here is the status:
Process 'Process1-M-aaa'
  status                    Uptime failed
  monitoring status         Monitored
  pid                       10601
  parent pid                1
  uid                       1000
  effective uid             1000
  gid                       1000
  uptime                    14h 51m
  children                  0
  memory kilobytes          20064
  memory kilobytes total    20064
  memory percent            0.2%
  memory percent total      0.2%
  cpu percent               0.1%
  cpu percent total         0.1%
  data collected            Tue, 03 Jun 2014 11:43:02
The service is up and meets the uptime condition (> 15 seconds), but I still
get the status "Uptime failed".
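If I read your earlier note correctly (test rules define a failure condition,
and the action runs whenever the rule matches), my rule may be behaving as
annotated in the sketch below. The comments are only my interpretation, not
confirmed Monit behaviour; the rule itself is the one from my config.

```
# Sketch only: comments are my reading of the earlier explanation,
# not confirmed Monit behaviour.
check process Process1-M-aaa matching "/home/user/binn/GK/Process1 -f/home/user/binn/GK/Process1.aaa.properties --dispatch=M --daeMn"
  # Assumption: a test that matches is treated as a failure of the service.
  # Uptime is 14h 51m, so "uptime > 15 seconds" keeps matching:
  # the script keeps being executed and the status shows "Uptime failed".
  if uptime > 15 seconds then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_ok'"
```

If that reading is right, the rule would be doing what it was written to do,
and the failed status would just be a side effect of using a test rule to
send the OK signal.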
Regards,
Stephan Gomes Higuti
On 3 June 2014 11:37, Martin Pala <[email protected]> wrote:
> Hello,
>
> can you please send full Monit configuration for the given service?
>
> Is there only one uptime test or two?
>
> Regards,
> Martin
>
>
> On 02 Jun 2014, at 15:09, Stephan Gomes Higuti <[email protected]>
> wrote:
>
> Hello Martin.
>
> Thank you.
> Yes, I understand the part about the action, but I still have no clue why
> the process has a failed status, since the uptime is greater than 15 seconds.
>
> [BRT Jun 2 09:39:25] debug : 'Process1' uptime check succeeded
> [current uptime=247571 seconds]
>
> 247571 seconds fits my condition, so it shouldn't get an "uptime failed"
> status.
>
> Regards,
>
> Stephan Gomes Higuti
>
>
> On 2 June 2014 09:59, Martin Pala <[email protected]> wrote:
>
>> Hello,
>>
>> testing rules define a failure condition and the given action is executed
>> if matched ... in your case the script will be executed ca. every 15
>> seconds.
>>
>> Regards,
>> Martin
>>
>>
>> On 02 Jun 2014, at 14:55, Stephan Gomes Higuti <[email protected]>
>> wrote:
>>
>> Hello,
>>
>> I'm having trouble with the uptime test.
>> I'm getting a fail status, and I just don't understand why.
>> My configuration is:
>>
>> if uptime > 15 seconds then exec "/bin/bash -c
>> '/etc/monit.d/scripts/script.sh'"
>>
>> However, looking into my log, I get:
>>
>> [BRT Jun 2 09:39:25] debug : 'Process1' zombie check succeeded
>> [status_flag=0000]
>> [BRT Jun 2 09:39:25] debug : 'Process1' uptime check succeeded
>> [current uptime=247571 seconds]
>> [BRT Jun 2 09:39:25] error : Process1' uptime test failed for
>> /home/user/binn/GW/Process1 -f/home/user/Process1.properties -- current
>> uptime is 247571 seconds
>> [BRT Jun 2 09:39:25] debug :
>> -------------------------------------------------------------------------------
>> [BRT Jun 2 09:39:25] debug : monit() [0x41bc93]
>> [BRT Jun 2 09:39:25] debug : monit(LogError+0x9f) [0x41c3ef]
>> [BRT Jun 2 09:39:25] debug : monit(Event_post+0x206) [0x418826]
>> [BRT Jun 2 09:39:25] debug : monit(check_process+0x2d4) [0x42aeb4]
>> [BRT Jun 2 09:39:25] debug : monit(validate+0x22e) [0x42ab1e]
>> [BRT Jun 2 09:39:25] debug : monit(main+0x527) [0x415fb7]
>> [BRT Jun 2 09:39:25] debug :
>> /lib64/libc.so.6(__libc_start_main+0xfd) [0x7fcf020e1b7d]
>> [BRT Jun 2 09:39:25] debug : monit() [0x40c539]
>> [BRT Jun 2 09:39:25] debug :
>> -------------------------------------------------------------------------------
>> [BRT Jun 2 09:39:25] debug : M/Monit: event message sent to
>> http://172.xxx.yyy.zzz:8080/collector
>> [BRT Jun 2 09:39:25] info : 'Process1 exec: /bin/bash
>>
>> This seems odd, because the uptime is greater than 15 seconds, which fits
>> my condition; however, I get a fail status in M/Monit.
>> Any idea why this is happening?
>>
>>
>>
>> Regards,
>>
>> Stephan Gomes Higuti
>> --
>> To unsubscribe:
>> https://lists.nongnu.org/mailman/listinfo/monit-general