On Thu, 28 Oct 2010 10:31:14 -0700, Christofer Hardy <[email protected]> wrote:
> What size shop are you? Small, Medium, Large?
> Do you need to get every page?
> Do you use SMS or Paging?

By LOPSA standards we're a low/mid shop: ~220 servers, just pushing past 100TB of storage. Since we're a largeish university (24K accounts) with a single campus, we don't have to worry about getting remote sites working together. We're HA-to-a-point, in that we don't have a formal on-call schedule (we have an informal one), and we pay more attention to service during crunch times (the run-up to and during finals) than fallow times (breaks).

We started out using Big Brother, but maintaining that infrastructure started getting tedious, so we invested in Dartware Intermapper. The alerting logic leaves something to be desired (we've yet to be able to trigger an alert for "2 of 3 servers down" without resorting to custom-written probes), but it does support decent alerting options. Recent versions have started supporting Windows WMI probes, which will be handy for creating probes that monitor specific service processes.

Am I happy with it? Not terribly. Because of its limitations, it isn't easy to build truly intelligent alerts (http://blog.serverfault.com/post/1264376462/intelligent-alerts). Because of the complexity in doing so, such alerts go stale way faster than the motivation to fix them shows up, so it doesn't get done.

- Greg Riedesel
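
For illustration, the "2 of 3 servers down" condition mentioned above could be sketched as a small custom probe along these lines. This is only a rough sketch, not Intermapper's probe interface: the function names (is_up, should_alert, check), the plain TCP connect test, and the Nagios-plugin-style return codes (0 for OK, 2 for CRITICAL) are all assumptions made for the example.

```python
import socket


def is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def should_alert(statuses, min_down=2):
    """Return True when at least min_down of the up/down flags are down."""
    down = sum(1 for up in statuses if not up)
    return down >= min_down


def check(hosts, port, min_down=2, probe=is_up):
    """Probe every host and return (exit_code, message).

    Exit codes follow the Nagios plugin convention (assumed here):
    0 = OK, 2 = CRITICAL. `probe` is injectable so the logic can be
    exercised without touching the network.
    """
    statuses = [probe(host, port) for host in hosts]
    down_hosts = [h for h, up in zip(hosts, statuses) if not up]
    summary = "%d of %d down" % (len(down_hosts), len(hosts))
    if should_alert(statuses, min_down):
        return 2, "CRITICAL: %s (%s)" % (summary, ", ".join(down_hosts))
    return 0, "OK: %s" % summary
```

A wrapper script would call check() with the three server names and a service port, print the message, and exit with the returned code. A single server being down stays quiet, and the alert only fires once the second one drops.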
