Graham made an offer on the hcoop-discuss list that I wanted to follow up on here. I don't want to disrupt the current efforts towards user migration, but I think there are some things that we could start working on while we finish the other critical issues. Specifically, I would like to start monitoring the new systems using nagios on Graham's machine in London, if no one objects. I'll work with him to configure things to email the admins when one of our critical services goes down.
Graham, how do you propose that we configure this nagios instance in your datacenter? Would you want us to email you nagios configuration files, or could we access the machine ourselves, perhaps through a non-root account with permission to restart nagios and to write to a nagios file describing the HCoop servers and notification levels?

Does anyone on the HCoop board or among the current admins object to doing this immediately?

Also, I think that off-site backup DNS is something that we should implement as soon as possible. It seems that this should be relatively easy to implement using domtool, but perhaps Adam should comment further on what would be required to make your bind instance respond as a slave server for our zones by default. I also understand if he wants to wait on this until the migration starts. ;)

Finally, is there anything that we can help you out with, e.g., backup MX services or DNS? It seems that we might be able to have a mutually beneficial relationship here that strengthens both of our services. I assume that the board would have to approve this, and perhaps write up some sort of formal agreement, but I can't see many objections to doing this sort of thing, and I see a lot of benefits in terms of increased service reliability.

Best,
Justin

_______________________________________________
HCoop-SysAdmin mailing list
[email protected]
http://hcoop.net/cgi-bin/mailman/listinfo/hcoop-sysadmin
