Hi Magnus,

Unfortunately, it is not easy to give an all-encompassing view of what has to be monitored and how to find out about it.
First of all, you need to look at your business. Group your infrastructure along the lines of business. Talk to the line managers about what exactly they regard as absolutely essential in terms of service. Agree on an SLA with them (not easy, and certainly not fast to get), or at least derive from their answers a baseline of which parts of the infrastructure are business critical.

Once you have this information, map it onto your infrastructure. This will give you an idea of which systems you need to look at (routers, switches, servers, etc.). The next step is to define which parameters to look at for any given device or device group, and how to monitor them. A very important part is to define what sort of reporting is necessary to verify whether an SLA is being adhered to. Another important area is notification (escalation). Again, there is no one-size-fits-all approach. It depends heavily on the SLAs and the requirements of your business. And yes, the size and depth of your IT organisation have an impact as well.

Only once the above has been sufficiently defined and agreed upon can you go on to define the technical aspects of the monitoring and reporting. Let's take a server as an example. You will need to look at CPU usage, memory usage, the disk subsystem, NICs, etc. How exactly depends on the role of the server. Most likely you would like to know about any exception as soon as possible. That gets you into monitoring events (traps, syslog, Windows events, etc.).

One piece of advice: monitor only those items that are necessary to adhere to an SLA. One could monitor everything, but then one would need a lot of manpower and/or systems to make sense of all that information.

Now for the bad news. You cannot tell in general which MIBs and MIB variables to use. It depends on the make and model of the device. Every vendor does it differently.
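To make the idea of "monitor only what an SLA requires" a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the service names, device names, parameter names and limits are all hypothetical, and it does not talk to SNMP or to WhatsUp Gold at all. It only shows the mapping from business service to device parameters, and how exceptions could be picked out of the latest measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """One parameter to watch on a device, with the limit agreed for it."""
    device: str
    parameter: str      # hypothetical names, e.g. "cpu_pct", "disk_used_pct"
    max_value: float    # a value above this counts as an exception


# Hypothetical mapping: business service -> checks needed to verify its SLA.
SLA_CHECKS = {
    "order-entry": [
        Check("srv-db01", "cpu_pct", 90.0),
        Check("srv-db01", "disk_used_pct", 85.0),
        Check("sw-core1", "uplink_util_pct", 80.0),
    ],
    "intranet": [
        Check("srv-web01", "cpu_pct", 95.0),
    ],
}


def checks_for(services):
    """Return the checks needed for the given services, without duplicates."""
    seen = []
    for service in services:
        for check in SLA_CHECKS.get(service, []):
            if check not in seen:
                seen.append(check)
    return seen


def exceptions(checks, samples):
    """Return the checks whose latest sample exceeds the agreed limit.

    `samples` maps (device, parameter) to the most recent measured value;
    checks without a sample are skipped rather than reported.
    """
    breached = []
    for check in checks:
        value = samples.get((check.device, check.parameter))
        if value is not None and value > check.max_value:
            breached.append(check)
    return breached


if __name__ == "__main__":
    plan = checks_for(["order-entry"])
    latest = {("srv-db01", "cpu_pct"): 97.0, ("srv-db01", "disk_used_pct"): 40.0}
    for check in exceptions(plan, latest):
        print(f"{check.device}: {check.parameter} above {check.max_value}")
```

In practice the values in `latest` would come from whatever polls the devices, and the limits would come out of the SLA discussion with the line managers. The point of the structure is that anything not listed under a service is simply not monitored.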
If you use equipment from the big vendors such as HP, IBM, Dell, Cisco and the like, things are reasonably easy. All of them have a lot of information on their web sites and in their documentation. It is not always easy to find, but it is usually there.

A good place to start looking for general SNMP knowledge is http://www.wtcs.org/snmp4tpc/default.htm.

An excellent set of documentation on general system management and monitoring is here:

Network management
1. Introduction - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap1.pdf
2. Network monitoring - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap2.pdf
3. Network control - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap3.pdf
4. SNMP Network Management Concepts - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap4.pdf
5. ASN.1 notation - only in French
6. SNMP Management Information - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap6.pdf
   SMI: Structure of Management Information
   MIB: Management Information Base
7. SNMP protocol principles - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap7.pdf
8. RMON basic principles - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap8.pdf
9. RMONv2, SNMPv2, SNMPv3 improvements - http://www.montefiore.ulg.ac.be/~leduc/cours/ISIR/ISIR-chap9.pdf

This may not be exactly what you asked for, but it will certainly get you going in the right direction.

I hope this helps

Luz Berger
Berger Network Consult
http://www.bergerl.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Finbom Magnus
Sent: Friday, July 30, 2004 10:05 AM
To: '[EMAIL PROTECTED]'
Subject: [WhatsUp Forum] How to know what to look for?

Hi!

The world of SNMP is quite nice to work with and I'm learning more every day. One thing that feels like the heavy part of SNMP is knowing what to look for. A server has many parts that can break, both software and hardware. There is the CPU (and maybe MPU), the drives in a RAID, the RAID card, several NICs, memory, power and more... On the switches there are surely many things that can be monitored as well.

What is the easiest way to find out which things can be monitored on a device? The only way I know so far is to download a MIB, compile it, browse through it and, with the help of mibdepot.com, find out what every OID is useful for.

I don't want to miss anything. It would be unfortunate if I thought I had a good WUG config and then a server broke down because I had missed that there was that special OID to monitor.

Best regards
Magnus Finbom
IT-Engineer (Microsoft MCP, MCP+I, MCSE-NT4)
Lansforsakringar Skaraborg
Bank and Insurance
Radhusgatan 8
54129 Skovde
Sweden
phone 0500 77 70 65, gsm 0708 71 70 60, fax 0500 77 70 30
[EMAIL PROTECTED]
http://www.lansforsakringar.se/skaraborg

Please visit http://www.ipswitch.com/support/mailing-lists.html to be removed from this list.
An Archive of this list is available at: http://www.mail-archive.com/whatsup_forum%40list.ipswitch.com/
