Hi,

On Tue, 18 Nov 2003 22:02:01 +0100 "Dirk Bulinckx" <[EMAIL PROTECTED]> wrote:

> What do you mean with "About the only thing I could ask for is a
> %-uptime metric over some interval"?

Basically, I'm looking for some measure of how reliable my systems are, something like:

"System #i was available p% during the last month/quarter/year"

which is almost (but not quite):

"Check #i was successful m of the last n trials (p% = 100 * m / n)"

It gets much more complicated when you factor in scheduled maintenance, downtime of
the monitoring system itself, and expected hours of availability.
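
Before any of those complications, the simple m-of-n figure could be sketched roughly
like this. It is only an illustration in Python: the function name and the list of
pass/fail results are made up for the example, and nothing here is part of Servers Alive.

```python
def percent_successful(results, n):
    """Return p% = 100 * m / n over the last n trials.

    results is a hypothetical list of booleans (True = check succeeded),
    oldest first. If fewer than n trials exist, all of them are used.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    window = results[-n:]          # the last n trials
    if not window:
        raise ValueError("no trials recorded")
    m = sum(1 for ok in window if ok)   # successful trials in the window
    return 100.0 * m / len(window)


# Illustrative data only: ten check results, oldest first.
history = [True, True, False, True, True, True, True, False, True, True]
print(percent_successful(history, 5))   # 4 of the last 5 succeeded -> 80.0
```

Note that this counts every recorded trial equally; it knows nothing about maintenance
windows or checks that never ran.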

Using completely unrelated notation (m and n below are not the m and n used above):

s = successful checks
f = failed checks
m = skipped checks due to maintenance (scheduled outage)
d = skipped checks due to other reasons (unknown results)
N = total possible checks
n = total checks accounted for (performed, or skipped for maintenance)
T = analysis interval
t = interval between checks

where

n = s + f + m
N = T/t = n + d
n <= N

At this point you get into definitions of availability (s/n) and reliability
((s+m)/n), how to handle missing data (m and d), and how to keep track of expected hours
of operation (ignoring system status when nobody should be using the system; the
analysis interval T might cover only 9am-5pm M-F excluding holidays rather than 24x7).
A bigger issue is how you define 'system' given check dependencies (e.g. how are
webserver availability and reliability calculated if external networking is broken but
the webserver is still capable of serving pages?).
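
To make the bookkeeping concrete, here is a minimal sketch in Python using the symbols
defined above (s, f, m, d, n, N, T, t). The class and function names are hypothetical,
the numbers are invented, and it deliberately ignores the harder questions (expected
hours of operation, check dependencies).

```python
from dataclasses import dataclass


@dataclass
class CheckCounts:
    s: int  # successful checks
    f: int  # failed checks
    m: int  # checks skipped for scheduled maintenance
    d: int  # checks skipped for other reasons (unknown results)

    @property
    def n(self) -> int:
        # n = s + f + m
        return self.s + self.f + self.m

    @property
    def N(self) -> int:
        # N = n + d (should also equal T / t)
        return self.n + self.d

    def availability(self) -> float:
        # availability = s / n
        if self.n == 0:
            raise ValueError("no checks to analyse")
        return self.s / self.n

    def reliability(self) -> float:
        # reliability = (s + m) / n: maintenance does not count against the system
        if self.n == 0:
            raise ValueError("no checks to analyse")
        return (self.s + self.m) / self.n


def expected_checks(T_seconds: int, t_seconds: int) -> int:
    """N = T / t: how many checks fit in the analysis interval."""
    return T_seconds // t_seconds


# Invented example: a 30-day interval with one check every 5 minutes.
counts = CheckCounts(s=8500, f=40, m=60, d=40)
assert counts.N == expected_checks(30 * 24 * 3600, 5 * 60)  # 8640 possible checks
print(f"availability = {100 * counts.availability():.2f}%")
print(f"reliability  = {100 * counts.reliability():.2f}%")
```

The d checks are left out of both ratios here, which is only one possible way of
handling unknown results; counting them as failures would be the pessimistic choice.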

You can have as much fun as you want with this kind of bookkeeping. It's a fair 
question whether this function ought to be incorporated into the monitoring system 
rather than being done externally via log analysis or a database query (MySQL + ODBC 
drivers + SA leads to some interesting ideas.) It doesn't really matter whether this 
is done within Servers Alive or not; the big win is that SA can acquire all this raw 
data and it's just a matter of recording it over time and massaging it into something 
useful. Ultimately, I'm lazy - I'd rather not code it myself. :)

Again, thanks for a great piece of software.

-- Bob