Team,

Based on our downtime yesterday evening, I took a look at our current monitoring solution (if you could call it that). The details I found are listed below, and I think they clear up some misconceptions I (we) have had about this box.

signal.gnome.org is, as we know, hosted at OSUDL. It is a 2-CPU VM (QEMU Virtual CPU version 0.11.1) with 256M RAM and about 7.5G of storage. It currently runs nagios3 on apache 1.3, plus a mysql server (a requirement of nagios3?).

The current monitoring configuration is poor and looks like it has been for some time. It only monitors a handful of services, and the key services are not even configured properly. As an example, the window.gnome.org HTTP service shows as down for 246d 16h 33m 12s. Most configured services are like this. It's mostly red across the board, and I'm sure it's simply misconfiguration. It can be cleaned up to provide rudimentary monitoring without a lot of work.

This is what I'd like to do:

1) Update to apache2 (why is it even on apache 1.3??).
2) Define as a group the critical services we want monitored (I'd suggest HTTP for bugzilla and the wiki for starters).
3) Configure SSL for the signal webserver. Auth is done by htpasswd, and we all know plain text is bad.
4) Configure the nagios3 path as the default DocumentRoot. Currently / shows a generic message and the wiki points to /nagios/, but the actual monitoring is at /nagios3/.
5) As an extra, perhaps add a DNS CNAME/alias for 'nagios.gnome.org' that points to signal.
6) Change the notification recipients. /etc/aliases only defines specific admins as email recipients; I think these should be sent team-wide.

All of this would take me maybe a couple of hours tomorrow. I'm interested in any other feedback on the services monitored and the notification methods (emails to specific sysadmins per host? emails to -sysadmin? emails to -infrastructure?).

In the meantime I'll get started on some basic maintenance, such as fixing the monitoring that is already there.

Thanks,
Christer
_______________________________________________
gnome-infrastructure mailing list
[email protected]
http://mail.gnome.org/mailman/listinfo/gnome-infrastructure
