Hello Martin,

This is my configuration for the process:

check process Process1-M-aaa matching "/home/user/binn/GK/Process1 -f/home/user/binn/GK/Process1.aaa.properties --dispatch=M --daeMn"
    start program "/home/user/binn/GK/Process1 -f/home/user/binn/GK/Process1.aaa.properties --dispatch=M --pidfile /tmp/Process1-M.aaa.pid" as uid user and gid user
    stop program "/bin/bash -c '/bin/kill -9 `/bin/cat /tmp/Process1-M.aaa.pid`'" as uid user and gid user
    if does not exist then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_crit'" as uid user and gid user
    else if succeeded then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_ok'"
    if uptime > 15 seconds then exec "/bin/bash -c '/etc/monit.d/scripts/nagios_cmd.sh proc_m_aaa_ok'"
    group gateway

My idea is:
If the process goes down, Monit executes the nagios_cmd.sh script, which
triggers a critical alarm in Nagios and also starts the process. I couldn't
get Monit to both start the process and execute the script, so I had to put
both actions together in the script.
If the process is running, the script sends an OK signal to Nagios.
To keep the OK state in Nagios, I added the uptime test, because I need to
keep sending OK to Nagios.
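
To make the idea concrete, here is a minimal sketch of the kind of line a
script like nagios_cmd.sh could produce, assuming it reports to Nagios as a
passive service check through the external command file. The real script is
not shown in this thread, so the function name, host name and service name
below are hypothetical.

```shell
# Hypothetical helper: format one Nagios passive service check result.
# Arguments: host, service description, return code (0=OK, 2=CRITICAL), output text.
format_passive_result() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Example: report CRITICAL for a made-up service on a made-up host.
format_passive_result "gateway01" "proc_m_aaa" 2 "Process1-M-aaa is not running"
```

In a real setup the line would be appended to the Nagios command file, whose
path depends on the installation.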


Here is the status:

Process 'Process1-M-aaa'
  status                            Uptime failed
  monitoring status                 Monitored
  pid                               10601
  parent pid                        1
  uid                               1000
  effective uid                     1000
  gid                               1000
  uptime                            14h 51m
  children                          0
  memory kilobytes                  20064
  memory kilobytes total            20064
  memory percent                    0.2%
  memory percent total              0.2%
  cpu percent                       0.1%
  cpu percent total                 0.1%
  data collected                    Tue, 03 Jun 2014 11:43:02


The service is up and meets the uptime condition (> 15 seconds), but I still
get the status "Uptime failed".
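
A toy shell model of the rule semantics Martin describes in the quoted reply
further down (a test rule defines a failure condition, and the action runs
when it matches) may help to read the status above. It is only a sketch, not
Monit's actual code: the function name is hypothetical, and "Running" as the
non-failed label is an assumption.

```shell
# Toy model (not Monit's implementation): the rule names a FAILURE condition,
# so a match is reported as a failure of that test.
# Arguments: current uptime in seconds, rule threshold in seconds.
uptime_rule_status() {
    if [ "$1" -gt "$2" ]; then
        echo "Uptime failed"   # condition matched: failure event, exec action runs
    else
        echo "Running"         # condition not matched: no failure event
    fi
}

uptime_rule_status 53460 15    # 14h 51m, the uptime shown in the status above
uptime_rule_status 10 15
```

Under this reading, "if uptime > 15 seconds" matches for any process that has
been up longer than 15 seconds, which would explain the "Uptime failed" status
on a healthy process.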

Regards,

Stephan Gomes Higuti


On 3 June 2014 11:37, Martin Pala <[email protected]> wrote:

> Hello,
>
> can you please send full Monit configuration for the given service?
>
> Is there only one uptime test or two?
>
> Regards,
> Martin
>
>
> On 02 Jun 2014, at 15:09, Stephan Gomes Higuti <[email protected]>
> wrote:
>
> Hello Martin.
>
> Thank you.
> Yes, I understand the part about the action, but I still have no clue why
> the process has a failed status, since the uptime is greater than 15 seconds.
>
> [BRT Jun  2 09:39:25] debug    : 'Process1' uptime check succeeded [current uptime=247571 seconds]
>
> 247571 seconds fits my condition, so it shouldn't get an "uptime failed"
> status.
>
> Regards,
>
> Stephan Gomes Higuti
>
>
> On 2 June 2014 09:59, Martin Pala <[email protected]> wrote:
>
>> Hello,
>>
>> Test rules define a failure condition, and the given action is executed
>> when the condition matches. In your case the script will be executed about
>> every 15 seconds.
>>
>> Regards,
>> Martin
>>
>>
>> On 02 Jun 2014, at 14:55, Stephan Gomes Higuti <[email protected]>
>> wrote:
>>
>> Hello,
>>
>> I'm having trouble with the uptime test for a process.
>> I'm getting a failed status, and I just don't understand why.
>> My configuration is:
>>
>> if uptime > 15 seconds then exec "/bin/bash -c '/etc/monit.d/scripts/script.sh'"
>>
>> However, looking into my log, I get:
>>
>> [BRT Jun  2 09:39:25] debug    : 'Process1' zombie check succeeded [status_flag=0000]
>> [BRT Jun  2 09:39:25] debug    : 'Process1' uptime check succeeded [current uptime=247571 seconds]
>> [BRT Jun  2 09:39:25] error    : 'Process1' uptime test failed for /home/user/binn/GW/Process1 -f/home/user/Process1.properties -- current uptime is 247571 seconds
>> [BRT Jun  2 09:39:25] debug    : -------------------------------------------------------------------------------
>> [BRT Jun  2 09:39:25] debug    :     monit() [0x41bc93]
>> [BRT Jun  2 09:39:25] debug    :     monit(LogError+0x9f) [0x41c3ef]
>> [BRT Jun  2 09:39:25] debug    :     monit(Event_post+0x206) [0x418826]
>> [BRT Jun  2 09:39:25] debug    :     monit(check_process+0x2d4) [0x42aeb4]
>> [BRT Jun  2 09:39:25] debug    :     monit(validate+0x22e) [0x42ab1e]
>> [BRT Jun  2 09:39:25] debug    :     monit(main+0x527) [0x415fb7]
>> [BRT Jun  2 09:39:25] debug    :     /lib64/libc.so.6(__libc_start_main+0xfd) [0x7fcf020e1b7d]
>> [BRT Jun  2 09:39:25] debug    :     monit() [0x40c539]
>> [BRT Jun  2 09:39:25] debug    : -------------------------------------------------------------------------------
>> [BRT Jun  2 09:39:25] debug    : M/Monit: event message sent to http://172.xxx.yyy.zzz:8080/collector
>> [BRT Jun  2 09:39:25] info     : 'Process1 exec: /bin/bash
>>
>> This seems weird: the uptime is greater than 15 seconds, which fits my
>> condition, yet I get a failed status in M/Monit.
>> Any idea why this is happening?
>>
>>
>>
>> Regards,
>>
>> Stephan Gomes Higuti
>>  --
>> To unsubscribe:
>> https://lists.nongnu.org/mailman/listinfo/monit-general
