If a check is not OK, then it is down; that is indeed the way it works. The
word "down" should be interpreted as "no confirmation of it being UP".
Using the %e parameter in the alert will show the reason for the down
status. Then you could see (for example) that an NT service is seen as down
because of "Access Denied" (meaning that at the time of the check we got an
access denied error back from the OS and as such can't confirm that the
service is running).


Dirk. 

-----Original Message-----
From: Stephen Ryan [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 11, 2006 8:45 AM
To: Servers Alive Discussion List
Subject: [SA-list] False positives


One frustration I have is that checks report a DOWN status if they
experience any error whatsoever, a timeout for example. One case is the
"CountFiles" external COM check, which checks for the existence of certain
files, and in my case should alarm if any files are found (count > 0).

What it does is alarm if the files are "not not found", i.e. if for any
reason it can't count the files it thinks "Aha! DOWN" and also sends an
alarm. NTProcess checks seem to behave the same way: if they fail to
disprove the negative, they interpret this as a positive and send an alarm
rather than handle the error (timeout, logon failure, etc.).

Sometimes I don't have a couple of weeks to run a new check in the test
environment before it is needed in production - any ideas on how to avoid
false positives with CountFiles, or even generally?

To pick up on yesterday's thread of ideas around alarm management etc in
future versions, it might be useful to make communicating via the same alarm
mechanism possible. False positives have a large "crying wolf"
impact on the credibility of alarms, which reduces the reaction of the team
and the effectiveness of Servers Alive. Sometimes I rely more on external
scripts returning errorlevels, which I can tune more finely, even if the
check type is built in. 
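
As a rough illustration of the external-script approach, here is a minimal
Python sketch of a file-count check that keeps "files were found" separate
from "the check itself failed". The function names and the three exit codes
are hypothetical, and how Servers Alive should map each code to UP or DOWN
is an assumption that would need to be configured on the check itself.

```python
import fnmatch
import os

# Hypothetical exit codes; the mapping to UP/DOWN is left to the monitoring tool.
EXIT_NO_FILES = 0      # confirmed: no matching files present
EXIT_FILES_FOUND = 1   # confirmed: at least one matching file present
EXIT_UNKNOWN = 2       # could not check (access denied, path missing, ...)


def count_files(directory, pattern="*"):
    """Count regular files in `directory` whose names match `pattern`.

    Raises OSError (e.g. PermissionError, FileNotFoundError) if the
    directory cannot be read, instead of silently returning 0.
    """
    with os.scandir(directory) as entries:
        return sum(
            1
            for entry in entries
            if entry.is_file() and fnmatch.fnmatch(entry.name, pattern)
        )


def check(directory, pattern="*"):
    """Return an exit code that separates 'files found' from 'check failed'."""
    try:
        count = count_files(directory, pattern)
    except OSError:
        # The count could not be confirmed either way, so report "unknown"
        # rather than treating the error as a positive result.
        return EXIT_UNKNOWN
    return EXIT_FILES_FOUND if count > 0 else EXIT_NO_FILES
```

A wrapper script could end with sys.exit(check(path, pattern)) so that the
result arrives as an errorlevel. Only code 1 would then raise the "files
found" alarm, while code 2 could be ignored, retried, or routed to a
lower-priority alert.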

//Steve
To unsubscribe send a message with UNSUBSCRIBE as subject to
[email protected]