So, I've recently been reading up on the #monitoringsucks tag, the responses to it, and some of the various things that have come out of it. I'm in a new shop, AWS based, so many of the old standbys aren't quite as obvious a call anymore.
What I'm now trying to figure out is what I'm missing, or would lose, by going with a newer paradigm for monitoring. Anyone using Riemann yet? Do you still use Nagios / Sensu / etc?

Basically, Riemann operates on a stream of metrics, vs relying on a check every X min. I'm trying to determine what I've lost by not implementing a Nagios-style system, which basically amounts to cron'd checks. (The alerting & state stuff I'm pretty confident I'm not losing.)

For example: I had initially thought I'd lose a check of the web site every X min, but the load balancer does that anyway, and that generates logs and metrics about page response speed. I think that as you scale, you start getting even more data & metrics, and the need for manual injection of jobs becomes smaller.

I'm curious about people's thoughts on this…

Matthew
[email protected]

_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
