Hi all - good question for BBLISA folks.

We have a large cluster of machines that have 3 DNS servers available to
them (sdm[123]).  We run applications on the cluster that are 'startup &
run' type services.

The problem is that these applications seem to start, check
/etc/resolv.conf, pick a nameserver, and stick to it.  If the nameserver
it attaches to (the first one on the list probably) goes down (either
named dying or the machine rebooting), all the applications wedge with
failed lookups.

We need a stronger failover mechanism for these clusters - how do folks
handle this sort of situation?
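
To make the behaviour we're after concrete, here is a rough Python sketch
(not something we run today): read the nameserver list on every lookup and
walk it in order, rather than pinning to one server at startup.  The names
parse_nameservers, lookup_with_failover and the query callable are all
hypothetical; query stands in for whatever actually sends the DNS request
to a given server and raises OSError on a timeout or refused connection.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses from resolv.conf-style text, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter here; comments and
        # other directives (domain, search, ...) are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def lookup_with_failover(name, servers, query):
    """Try each server in turn and return the first successful answer.

    `query(name, server)` is a caller-supplied (hypothetical) function that
    performs the real lookup and raises OSError if the server is unreachable.
    """
    errors = []
    for server in servers:
        try:
            return query(name, server)
        except OSError as exc:  # socket timeouts are OSError subclasses
            errors.append((server, exc))
    raise LookupError("all nameservers failed for %s: %r" % (name, errors))
```

The point of the sketch is only that a dead first server costs one timeout
per lookup instead of wedging the application for good.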

One thing that has been proposed is using a localized 'caching only' named
configuration on each of the servers, with a list of 'upstream' servers
(sdm[123]) to consult if the cache doesn't have the answer.
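
For concreteness, the proposal would look roughly like the fragment below.
This is an untested sketch that assumes BIND 8-style named.conf syntax; the
192.0.2.x addresses are placeholders standing in for sdm1-3, and the
directory path is an assumption.  Each machine's /etc/resolv.conf would
then list 127.0.0.1 as its nameserver.

```
options {
    directory "/var/named";          // assumed working directory
    listen-on { 127.0.0.1; };        // serve only local applications
    forwarders {                     // placeholders for sdm1, sdm2, sdm3
        192.0.2.1;
        192.0.2.2;
        192.0.2.3;
    };
    forward only;                    // never recurse past the forwarders
};
```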

I've thought about round-robin DNS for the nameservers - but does
libresolv actually handle that appropriately?  E.g., if I call libresolv to
do a name-to-address lookup, and we say 'use sdm for resolution', will it
keep querying whichever single IP 'sdm' first resolved to?

How do folks handle fault-tolerant nameservice inside high availability
clusters?

(btw - these are all Sun machines, running Solaris 7 or later)

-- 
------------------.--------.
Dave Belfer-Shevett\ KB1FWR \
www.homeport.org    >--------`------------------------------------
[EMAIL PROTECTED]  /     "Life imitates art far more than art     \
------------------<         imitates life." (Oscar Wilde -         |
                  |               www.chiasmus.com)                |
                   \______________________________________________/


---
Send mail for the `bblisa' mailing list to [EMAIL PROTECTED]'.
Mail administrative requests to [EMAIL PROTECTED]'.
