RE: [ActiveDir] Server up/downtime

Depp, Dennis M. Sun, 28 Mar 2004 17:56:41 -0800

We have several tests for determining whether a system is functioning. The first is a simple ping test. The second level is to determine whether the necessary services are running on the server. The services we monitor depend on the role of the server; for example, Exchange servers are checked to verify that all the Exchange services are running, and SQL servers are checked similarly. This is still not 100% reliable, as a service can be running but not processing commands properly. For mail and web servers, we also do client checks. For mail, we send an email to an account on the mail server. This account is set to autorespond to a given mail account. If the reply is not received, an error is generated. In addition to these checks, we also check disk space and CPU usage to ensure the server is functioning within given limits. For most of our checks we are using Big Brother.
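As a rough illustration of the layered checks described above (not the Big Brother configuration itself), here is a minimal Python sketch. It swaps in a plain TCP connection attempt as a stand-in for "is the service running", and checks free disk space on the machine running the script. The function names, the 10% free-space limit, and the port list are all assumptions made for the example.

```python
import shutil
import socket


def tcp_service_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the service as down.
        return False


def disk_within_limit(path, min_free_fraction=0.10):
    """Return True if the volume holding path has at least min_free_fraction free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return False
    return usage.free / usage.total >= min_free_fraction


def check_server(host, ports, path="."):
    """Run each check and return a dict of check name -> bool."""
    results = {f"tcp/{port}": tcp_service_up(host, port) for port in ports}
    # Assumption: the disk check is local to the machine running this script.
    results["disk"] = disk_within_limit(path)
    return results
```

For a mail server one might call `check_server("mail.example.com", [25, 110])`, where the host name is hypothetical. As noted above, an open port still does not prove the service is processing commands, which is why the round-trip mail test is a separate check.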

Denny

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of joe
Sent: Saturday, March 27, 2004 9:47 PM
To: [EMAIL PROTECTED]
Subject: RE: [ActiveDir] Server up/downtime

I have a general comment on this and am kind of curious what people are doing in this area...

Most products check server availability via pinging and via agents that scrape events, and they report availability purely in terms of whether the server returns a ping or not. This is obviously a good start but, quite honestly, pretty worthless on its own. It doesn't say anything about whether the server is truly functioning and able to respond to client calls (ditto for agents running on the servers themselves).

What are people doing to get realistic uptime/availability numbers out of their systems? Do you have monitoring that pretends to be a client and uses the normal client hooks?

For example, we monitor all our DCs centrally via Perl scripts that act like users...

For instance, one test is a WINS name resolution test. It basically does Domain 1C record lookups via NMBLOOKUP/NBLOOKUP type calls which emulate clients. Another test tries to read the netlogon shares (which also tests authentication on the DC). Another test does some NET API calls against a DC. An LDAP lookup test also exists. Another checks time on DCs. Again, all of these work remotely; they do not run on the DCs themselves. In this way we have a pretty good idea of what is truly available versus just up. I would like to go a step further and actually do this central monitoring from several points around the globe and then centralize the results.
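To make the idea concrete, here is a small Python sketch of a central runner for client-style checks. It is not the Perl scripts described above: `resolve_name` and `tcp_connect` are simplified stand-ins (ordinary name resolution and a bare TCP connect) for the WINS, netlogon, and LDAP tests, and every name in it is an assumption made for illustration.

```python
import socket
import time


def resolve_name(name):
    """Stand-in for a name resolution test: raises if the name does not resolve."""
    socket.getaddrinfo(name, None)
    return True


def tcp_connect(host, port, timeout=3.0):
    """Stand-in for a protocol test: raises OSError if the port is unreachable."""
    with socket.create_connection((host, port), timeout=timeout):
        return True


def run_checks(checks):
    """Run named zero-argument checks; return {name: (ok, seconds, detail)}."""
    results = {}
    for name, check in checks.items():
        start = time.monotonic()
        try:
            ok, detail = bool(check()), ""
        except Exception as exc:  # one failed probe must not stop the others
            ok, detail = False, f"{type(exc).__name__}: {exc}"
        results[name] = (ok, time.monotonic() - start, detail)
    return results


def truly_available(results):
    """A server counts as available only if every client-style check passed."""
    return all(ok for ok, _, _ in results.values())
```

A check set for one DC might be `{"name resolution": lambda: resolve_name("dc1.example.com"), "ldap port": lambda: tcp_connect("dc1.example.com", 389)}`, with a hypothetical host name. Recording the elapsed time per check also gives a rough handle on degradation, not just hard failures.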

Our company's main measure of server uptime is ping response, with no consideration for application-level availability or degradation, mainly because that is so hard to measure accurately. So, say, an Exchange server that is responding to pings but isn't handling mail at all, or not very well, is considered UP for availability numbers. I obviously don't agree with that approach and did something different. I am curious as to how many others are doing things that way.

joe

-------------

http://www.joeware.net (download joeware)

http://www.cafeshops.com/joewarenet (wear joeware)

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philadelphia, Lynden - Revios Toronto
Sent: Monday, March 22, 2004 10:42 AM
To: '[EMAIL PROTECTED]'
Subject: [ActiveDir] Server up/downtime

This might not be the right forum, but I will ask anyway. Does anyone have a spreadsheet or database that tracks server down/uptime?

Need to produce a report for the management on a monthly basis.
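One possible way to produce such a report, sketched in Python under the assumption that some monitor already logs a pass/fail sample per server at a fixed interval. The record layout `(timestamp, server, ok)` and the function names are hypothetical; the output is CSV text that a spreadsheet can open.

```python
from collections import defaultdict
from datetime import datetime


def monthly_availability(samples):
    """samples: iterable of (timestamp, server, ok).

    Returns {(server, "YYYY-MM"): percent of samples that passed}.
    Assumes samples are evenly spaced, so the sample ratio approximates uptime.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for timestamp, server, ok in samples:
        key = (server, timestamp.strftime("%Y-%m"))
        total[key] += 1
        if ok:
            passed[key] += 1
    return {key: 100.0 * passed[key] / total[key] for key in total}


def format_report(availability):
    """Render the availability table as CSV text, sorted by server and month."""
    lines = ["server,month,availability_percent"]
    for (server, month), percent in sorted(availability.items()):
        lines.append(f"{server},{month},{percent:.2f}")
    return "\n".join(lines)
```

The percentages are only as meaningful as the check behind each sample: samples from a ping test and samples from a client-level test can give very different monthly numbers for the same server.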

Lynden
