On Fri, 11 Oct 2002, Jerry Asher wrote:

> Now there are various ways we might make this more efficient, but one
> way that seems to stick out is if we can use some form of broadcast or
> multicast so that one system can speak to everyone at a time. This would
> be very nice for heartbeats (as spec'd by the client). It would also
> seem to make sense for passing state information around as long as we
> can somehow ensure those multicasts are reliable.

I'm not sure this really makes things more efficient, if I understood your example properly. If you have a network that includes 300 hosts, and each host is required to maintain state on every other host, as well as to update every other host on its own status, then each host still has to handle the roughly 300 incoming updates whether they arrive via multicast, broadcast or unicast. The only place you obtain savings is in how many outbound messages each host has to generate: with broadcast or multicast, each host would only have to send one update, but would still have to receive status from every other host.

> >You could also consider using DNS to
> >propagate the data, with caching servers at various points in the network,
> >assuming you feel the traffic would overwhelm a single server.
>
> Can you explain a bit more on how to get DNS to do this?

A simplistic solution would be to introduce TXT records for each host's status, and have a DNS hierarchy that delegates to each host its own status record; then hosts can ask the nearest caching server for the current status of J. Random Host, and the caching server would return the latest status available. You'd have to set the TTL on the TXT records as low as possible -- I'm sure that subsecond TTLs aren't possible, and I'm not really sure how well 1-second TTLs would work, but 2-second TTLs might work.

Conventional BIND could handle this: each host would run a copy of BIND, update the zone file for its status, and then reload the server (assuming the server is only authoritative for that single zone). This process is likely to take a second or two on its own. You could hack BIND to obtain the data from some other source; there is a database interface available in BIND for this type of work. Note that your DNS info does not have to be "real" DNS info; you're just using the protocol for data distribution.

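To make the TTL trade-off concrete, here is a rough Python sketch of what the caching server does with a status record. It is a toy model only, not BIND and not a real resolver: the class name StatusCache, the fetch callback (standing in for a query to the authoritative host) and the record name used below are all made up for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class StatusCache:
    """Toy model of a caching server holding TXT-style status records."""

    def __init__(self, fetch: Callable[[str], str], ttl: float = 2.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # hypothetical: stands in for asking the authoritative host
        self._ttl = ttl        # seconds a cached answer is considered valid
        self._clock = clock    # injectable so expiry can be exercised without sleeping
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> str:
        """Return the cached status, refetching only once the TTL has expired."""
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]
        self.upstream_queries += 1
        status = self._fetch(name)
        self._entries[name] = (status, now + self._ttl)
        return status
```

With a 2-second TTL, a client can be told a host is fine for up to 2 seconds after it has changed state (plus however long the host takes to publish the change), but the authoritative host sees at most one query per TTL from each caching server no matter how many clients are asking.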
My underlying idea is to have a group of hosts whose job is to maintain status information on other hosts, and have enough of them that failure of some number won't prevent the network from functioning. The case you described is one in which the group of hosts includes all hosts on the network; I'm suggesting that job be restricted to a subset. As for data distribution protocols, there are a number of them; DNS and LDAP are only two. But I think if you stratify the host status information, similar to the way DNS and NTP are stratified, you will make better use of bandwidth without compromising the availability of the data; you'll just get more latency.
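
For a rough sense of the numbers, here is a back-of-the-envelope count in Python of status messages per update round. The function names are mine, and the second model rests on an assumption that is not spelled out above: every host pushes its status to each of k monitor hosts, and ordinary hosts ask a monitor only when they need to know something (those on-demand queries are not counted).

```python
def full_mesh(n: int, multicast: bool = False) -> dict:
    """Every host tracks every other host, as in the 300-host example."""
    sends = 1 if multicast else n - 1   # multicast only reduces outbound messages
    receives = n - 1                    # each host still hears from all the others
    return {
        "sends_per_host": sends,
        "receives_per_host": receives,
        "total_receives": n * receives,
    }


def monitored(n: int, k: int) -> dict:
    """Assumed model: k of the n hosts are monitors; all hosts report to them."""
    return {
        "sends_per_ordinary_host": k,
        "receives_per_ordinary_host": 0,
        "receives_per_monitor": n - 1,
        "total_receives": k * (n - 1),
    }
```

For 300 hosts, the full mesh works out to 299 receives per host and 89700 in total per round, with or without multicast. With 5 monitors under the assumption above, the total drops to 1495, and only the monitors carry the receive load. The price is the extra hop when a host wants someone else's status.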
