On 15 December 2013 23:50, Charlie Boisseau <[email protected]> wrote:
> Gavin,

Hi Charlie,

> We’ve been working on this recently, so I can provide some insight here.

Only recently? You're a good few years ahead of us :-)

> We
> use Pingdom and BGPmon at the moment for basic reachability tests but it’s
> very basic so we are in the process of beefing it up, and have taken a
> pretty belt and braces approach:

OK, cool. I set up BGPmon and noticed we're seeing AS-TRANS/AS23456 alerts.

> We’re replicating our internal monitoring (Nagios + Xymon) to an external VM
> hosted with Digital Ocean (but could easily be Amazon or similar).  This
> will mean our monitoring servers will be monitored which gives us some extra
> peace of mind.  It also means that we can host a status webpage for
> customers to access if we have a problem.

Yep, same plan here.

> The status page is on a completely separate domain name hosted on external
> DNS servers, just in case we have a problem that affects DNS.  We have
> integrated this to our broadcaster platform so that our admins can post an
> alert or maintenance window and it uses the Wordpress API to post it on the
> status page as well as emailing the affected customers as normal.

I think that's the only sane way to present things to customers. "Oh,
your website is down so can't check the service status" :-)

> We also have the monitoring servers using an SMS API (I’m sure you know the
> one) to send text alerts in case emails can’t reach us.  The system sends me
> a ‘sanity’ text message every day at 6pm so we know even at quiet times that
> all is working correctly.

Yep, same here. We also have a daily job to dial in to our OoB modem
if all else fails :-)
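
A minimal sketch of how the receiving side of a daily 'sanity' message
could be checked, assuming one heartbeat is expected every 24 hours. The
function name, the interval and the one-hour grace period are
hypothetical, not what either setup actually runs:

```python
from datetime import datetime, timedelta

# Assumed values for illustration: one heartbeat a day, one hour of slack.
HEARTBEAT_INTERVAL = timedelta(hours=24)
GRACE = timedelta(hours=1)


def heartbeat_overdue(last_seen, now,
                      interval=HEARTBEAT_INTERVAL, grace=GRACE):
    """Return True if the last heartbeat is older than interval + grace."""
    return now - last_seen > interval + grace


if __name__ == "__main__":
    last = datetime(2013, 12, 14, 18, 0)   # last 6pm sanity message seen
    now = datetime(2013, 12, 15, 23, 50)
    if heartbeat_overdue(last, now):
        print("No sanity message since", last, "- check the alert path")
```

The point is that silence becomes the alarm: a missing 6pm message means
the alerting path itself needs checking.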

> Of course the on-net monitoring server will look out for the one hosted
> externally and vice versa, so theoretically there would have to be multiple
> failures with both us and external parties for there to be a problem that we
> don’t know about.

We need an NLNOG Ring for community monitoring. I don't know if the
NLNOG Ring has alerts but will check as I just created the VM last night.
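
The cross-check between the on-net and the external monitoring server
discussed above can be sketched as a plain TCP reachability probe that
each side runs against the other. This is only an illustration: the
function name, the host and the port are placeholders, and a real setup
would more likely use the existing Nagios/Xymon checks:

```python
import socket


def peer_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


if __name__ == "__main__":
    # Placeholder peer; each monitor would point this at the other one.
    if not peer_reachable("monitor.example.net", 443):
        print("Peer monitor unreachable, raise an out-of-band alert")
```

Run from both sides, a failure of either monitor is noticed by the
other, so only a simultaneous failure goes unseen.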

-- 
Kind Regards,

Gavin Henry.
