On Mon, Jun 09, 2003 at 10:51:42AM -0400, Dave Belfer-Shevett wrote:
>
> Hi all - good question for BBLISA folks.
>
> We have a large cluster of machines that have 3 DNS servers available to
> them. (sdm[123]). We run applications on the cluster that are 'startup &
> run' type of services.
>
> The problem is that these applications seem to start, check
> /etc/resolv.conf, pick a nameserver, and stick to it. If the nameserver
> it attaches to (the first one on the list probably) goes down (either
> named dying or the machine rebooting), all the applications wedge with
> failed lookups.
>
> We need a stronger failover mechanism for these clusters - how do folks
> handle this sort of situation?
>
> One thing that has been proposed is using a localized 'caching only' named
> configuration on each of the servers, with a list of 'upstream' servers
> (sdm[123]) to consult if the cache doesn't have the answer.

Caching nameservers on each critical production host is the way to go.
If you're really paranoid and it's a small number of hosts, you could
set them up as hidden slaves of your zone and let them zone-transfer the
domain from your master. If you do that, I'd suggest setting up the
primary to notify your production machines about changes by using the
also-notify option.

-j
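
For illustration, here is a minimal sketch of how the setup described above
might look in BIND 9 named.conf terms. Everything specific in it is an
assumption rather than a value from this thread: the zone name example.com,
the file paths, and all addresses (192.0.2.1-3 standing in for sdm[123],
192.0.2.10 for the master, 192.0.2.11-12 for two production hosts).

```
// --- named.conf on each production host (hypothetical values) ---
options {
    directory "/var/named";           // assumed working directory
    listen-on { 127.0.0.1; };         // serve only local applications
    allow-query { localhost; };
    recursion yes;
    // Upstream servers to ask when the local cache has no answer;
    // placeholder addresses standing in for sdm1, sdm2, sdm3.
    forwarders { 192.0.2.1; 192.0.2.2; 192.0.2.3; };
    forward first;                    // fall back to normal resolution if all forwarders fail
};

// Optional "hidden slave": keep a local copy of the zone via zone transfer.
zone "example.com" {
    type slave;
    file "slaves/example.com.zone";
    masters { 192.0.2.10; };          // placeholder address of the master
};

// --- named.conf on the master (hypothetical values) ---
zone "example.com" {
    type master;
    file "example.com.zone";
    notify yes;
    // Hidden slaves are not listed in the zone's NS records, so they
    // have to be named explicitly to receive NOTIFY messages.
    also-notify { 192.0.2.11; 192.0.2.12; };
    allow-transfer { 192.0.2.11; 192.0.2.12; };
};
```

With something like this in place, /etc/resolv.conf on each production host
would list "nameserver 127.0.0.1" first, so applications that latch onto
the first nameserver end up talking to the local cache instead of a single
remote server.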
