If communication between the two servers was fine at layer 3 but the name couldn't be resolved (a layer 7 problem), then the issue was that the slave didn't know what IP the master was.
You could raise the TTL to 4 hours, or 8 hours, etc., and it could have worked in that last scenario. For DNS on something like this I suggest you keep a long TTL on the record, say a week. If you know you're going to change it, lower the TTL to half an hour or an hour a week in advance of the change. Then change the record to the new IP and put the TTL back to a week.

Josh Luthman
Office: 937-552-2340
Direct: 937-552-2343
1100 Wayne St
Suite 1337
Troy, OH 45373

"When you have eliminated the impossible, that which remains, however
improbable, must be the truth."
--- Sir Arthur Conan Doyle

On Thu, Sep 17, 2009 at 7:29 PM, David Rees <[email protected]> wrote:
> On Thu, Sep 17, 2009 at 4:13 PM, Josh Luthman <[email protected]> wrote:
> > To rule out DNS - are the boxes using a DNS cache server on themselves or
> > using a secondary server? What's the TTL on those A/CNAME records and how
> > long was your outage?
>
> All the boxes use a caching DNS server - the TTL on the host that went
> down that the affected slaves were monitoring was 5 minutes - it was
> down for close to 3 hours.
>
> I've since changed my config to use IP addresses for the host config,
> but it'd be nice to not have to and for the slaves to cache the last
> lookup in case there is a DNS failure...
>
> -Dave
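
For what it's worth, the "cache the last lookup in case there is a DNS failure" idea quoted above can be sketched in a few lines of Python. This is not something SmokePing does, as far as this thread shows; the class name LastKnownGoodResolver, the max_stale_seconds parameter, and the injectable lookup and clock arguments are all made up for illustration. It only shows the general technique: remember the last successful answer and fall back to it when the resolver fails.

```python
import socket
import time


class LastKnownGoodResolver:
    """Hypothetical helper: resolve a hostname, falling back to the last
    successful answer if the lookup fails. Not part of SmokePing."""

    def __init__(self, lookup=socket.gethostbyname,
                 max_stale_seconds=7 * 24 * 3600, clock=time.monotonic):
        self._lookup = lookup            # function: hostname -> IPv4 string
        self._max_stale = max_stale_seconds
        self._clock = clock
        self._cache = {}                 # hostname -> (ip, time of last success)

    def resolve(self, hostname):
        try:
            ip = self._lookup(hostname)
        except OSError:                  # socket.gaierror is a subclass of OSError
            cached = self._cache.get(hostname)
            if cached is None:
                raise                    # never resolved before: nothing to fall back on
            ip, stored_at = cached
            if self._clock() - stored_at > self._max_stale:
                raise                    # cached answer is too old to trust
            return ip
        # Lookup succeeded: remember the answer for future failures.
        self._cache[hostname] = (ip, self._clock())
        return ip
```

With something like this, a 3 hour resolver outage would not matter as long as the name had been resolved once beforehand and max_stale_seconds was longer than the outage. The trade-off is the same as with a long TTL: if the IP really does change while DNS is down, the stale address keeps being used.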
_______________________________________________
smokeping-users mailing list
[email protected]
https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users
