Hello again,

Since we've upgraded to Mon v1.1.0Pre1, dependency checks continually fail, and all alerts for services that have any dependencies whatsoever are suppressed.

Here's a slice of my config:
----------------------------------------------------
### global options
cfbasedir = /etc/mon
pidfile = /var/run/mon.pid
statedir = /var/lib/mon/state.d
logdir = /var/lib/mon/log.d
dtlogfile = /var/lib/mon/log.d/downtime.log
dtlogging = 1
alertdir = /usr/lib/mon/alert.d
mondir = /usr/lib/mon/mon.d
maxprocs = 50
histlength = 1000
randstart = 60s
authtype = userfile
userfile = /etc/mon/userfile
syslog_facility = local1
trapbind = 127.0.0.1
serverbind = 127.0.0.1
dep_behavior= a
dep_memory = 1s
watch DTNIQ-EOC-Mon
service ping
description ICMP Checker
interval 60s
monitor ping.monitor -m 10.2.12.229:6000 -i 60
period wd {Mon-Sun}
alertafter 2
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service snmp
description SNMP Service Monitor
interval 60s
randskew 10s
monitor process.monitor -c Ops-Inet -p snmp.exe
depend SELF:ping
period wd {Mon-Sun}
alertafter 2
alertevery 3m
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service if_err
description Network Interface Error Monitor via SNMP
interval 300s
randskew 10s
monitor iferror.monitor -w 5 -a 10 -c Ops-Inet
depend SELF:ping && SELF:snmp
period wd {Mon-Sun}
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service vnc
description VNC Port Monitor
interval 600s
randskew 10s
monitor tcpch.monitor -p 5900 -t 5 -e '^RFB'
depend SELF:ping
period wd {Mon-Sun}
alertevery 30m
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service memory
description Memory Check via SNMP
interval 60s
randskew 10s
monitor memory.monitor -t 95 -c Ops-Inet
depend SELF:snmp && SELF:ping
period wd {Mon-Sun}
alertafter 2
alertevery 5m
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service disks
description Disk Used Check via SNMP
interval 1800s
randskew 10s
monitor diskused.monitor -t 90 -c Ops-Inet
depend SELF:snmp && SELF:ping
period wd {Mon-Sun}
alertevery 2h
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service cpu_load
description CPU Load Monitor via SNMP
interval 300s
randskew 10s
monitor cpuidleload.monitor -i 3 -c
depend SELF:snmp && SELF:ping
period wd {Mon-Sun}
alertafter 2
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
service Event_Log
description Event Log Checker
randskew 10s
interval 60s
monitor eventlog.monitor
period wd {Mon-Sun}
alert ipnotify.alert -m 10.2.12.229:6000
alert db.alert -m 10.2.12.229:6000
upalert db.alert -m 10.2.12.229:6000
----------------------------------------------------
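To make the depend lines in the config easier to compare with what mon reports, here is a minimal Python sketch (not part of mon). It assumes that SELF: is expanded to the enclosing watch group, which is what the debug output for vnc suggests (SELF:ping is logged as DTNIQ-EOC-Mon:ping). The function name expand_depends is my own, and the parser only understands the watch, service, and depend keywords.

```python
def expand_depends(config_text):
    """Map (group, service) to its depend expression with SELF: expanded.

    Assumption: SELF: stands for the enclosing watch group. This is a
    rough line-based reader, not a full mon.cf parser.
    """
    group = None
    service = None
    result = {}
    for raw in config_text.splitlines():
        words = raw.split()
        if not words or words[0].startswith("#"):
            continue  # skip blank lines and comments
        if words[0] == "watch" and len(words) > 1:
            group = words[1]
            service = None
        elif words[0] == "service" and len(words) > 1:
            service = words[1]
        elif words[0] == "depend" and group and service:
            expr = " ".join(words[1:]).replace("SELF:", group + ":")
            result[(group, service)] = expr
    return result


if __name__ == "__main__":
    sample = """watch DTNIQ-EOC-Mon
service vnc
depend SELF:ping
service memory
depend SELF:snmp && SELF:ping
"""
    for key, expr in expand_depends(sample).items():
        print(key, "->", expr)
```

Every expanded expression here should only name services inside the same watch group, so any dependency failure would have to come from ping or snmp on this host.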
...and some pertinent syslogs from mon -d:
----------------------------------------------------
Nov 10 15:33:42 ops-inet-mon mon[23679]: PID 2713 (DTNIQ-EOC-Mon/ping) exited with [0]
Nov 10 15:33:42 ops-inet-mon mon[23679]: process_event type=m group=DTNIQ-EOC-Mon service=ping exitval=0 output=[]
Nov 10 15:33:42 ops-inet-mon mon[23679]: depend = ; dep_behavior = a;
Nov 10 15:33:42 ops-inet-mon mon[23679]: After Dependencies Check
[...clip...]
Nov 10 15:33:44 ops-inet-mon mon[23679]: PID 2873 (DTNIQ-EOC-Mon/vnc) exited with [1]
Nov 10 15:33:44 ops-inet-mon mon[23679]: process_event type=m group=DTNIQ-EOC-Mon service=vnc exitval=1 output=[[DTNIQ-EOC-Mon] Errors retrieving data on port 5900 DTNIQ-EOC-Mon critical TCP_PORT: could not connect to port :5900: Connection refused ]
Nov 10 15:33:44 ops-inet-mon mon[23679]: depend = DTNIQ-EOC-Mon:ping; dep_behavior = a;
Nov 10 15:33:44 ops-inet-mon mon[23679]: Made it inside
Nov 10 15:33:44 ops-inet-mon mon[23679]: After Dependencies Check
Nov 10 15:33:44 ops-inet-mon mon[23679]: failure for DTNIQ-EOC-Mon vnc 1131654824 [DTNIQ-EOC-Mon] Errors retrieving data on port 5900
Nov 10 15:33:44 ops-inet-mon mon[23679]: do_alert flags=1
Nov 10 15:33:44 ops-inet-mon mon[23679]: alert for DTNIQ-EOC-Mon,vnc supressed because of dep fail
----------------------------------------------------
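The odd part of the log above is that ping exits with [0] and the vnc alert is still suppressed "because of dep fail". Here is a minimal Python sketch (not part of mon) that flags such cases in a larger log. The regular expressions are assumptions modelled only on the lines quoted above, and the function name find_suspect_suppressions is my own. It only knows about exit codes that appear in the lines it is given.

```python
import re

# Patterns modelled on the "mon -d" syslog lines quoted above (assumed format).
EXIT_RE = re.compile(
    r"PID \d+ \((?P<group>[^/]+)/(?P<service>[^)]+)\) exited with \[(?P<code>\d+)\]"
)
DEPEND_RE = re.compile(r"depend = (?P<expr>[^;]*); dep_behavior")
# "supressed" is spelled as it appears in the log output.
SUPPRESSED_RE = re.compile(
    r"alert for (?P<group>[^,]+),(?P<service>\S+) supressed because of dep fail"
)


def find_suspect_suppressions(lines):
    """Return (group, service, depend) for alerts suppressed even though
    every dependency named in the last depend line last exited with 0."""
    last_exit = {}    # (group, service) -> last seen exit code
    last_depend = ""  # most recent "depend = ..." expression
    suspects = []
    for line in lines:
        m = EXIT_RE.search(line)
        if m:
            last_exit[(m["group"], m["service"])] = int(m["code"])
            continue
        m = DEPEND_RE.search(line)
        if m:
            last_depend = m["expr"].strip()
            continue
        m = SUPPRESSED_RE.search(line)
        if m:
            deps = re.findall(r"([\w-]+):([\w-]+)", last_depend)
            if deps and all(last_exit.get(d) == 0 for d in deps):
                suspects.append((m["group"], m["service"], last_depend))
    return suspects
```

Fed the excerpt above, this would report the vnc alert, since its only dependency (DTNIQ-EOC-Mon:ping) last exited with 0.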
...Just to verify:
----------------------------------------------------
[EMAIL PROTECTED] log]# mon -v
$Id: mon,v 1.10 2004/11/15 14:45:16 vitroth Exp $
$Name: mon-1-1-0pre1 $
----------------------------------------------------
As far as I can tell, I'm doing what I should be doing. The dependencies worked before I upgraded. Any suggestions or ideas?
_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon