On Fri, 11 Oct 2002, Jerry Asher wrote:

> Now there are various ways we might make this more efficient, but one
> way that seems to stick out is if we can use some form of broadcast or
> multicast so that one system can speak to everyone at a time. This would
> be very nice for heartbeats (as spec'd by the client). It would also
> seem to make sense for passing state information around as long as we
> can somehow ensure those multicasts are reliable.

I'm not sure this really makes things more efficient, if I understood your
example properly.  If you have a network that includes 300 hosts, and each
host is required to maintain state on every other host, as well as to
update every other host on its own status, then each host has to handle
about 300 incoming updates whether they arrive via multicast, broadcast or
unicast.  The only place you obtain savings is in how many outbound
messages each host has to generate; in the case of broadcast or multicast,
each host would only have to send one update, but would still have to
receive status from every other host.
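
To put rough numbers on that, here is a small sketch.  It assumes one
status update per host per round and ignores retransmissions; the function
name per_host_load is made up for illustration.

```python
def per_host_load(n_hosts: int, multicast: bool) -> dict:
    """Messages a single host sends and receives in one update round."""
    peers = n_hosts - 1
    return {
        # With multicast/broadcast one transmission reaches every peer;
        # with unicast the host sends a separate copy to each peer.
        "sent": 1 if multicast else peers,
        # Either way, one update arrives from every other host.
        "received": peers,
    }


for use_multicast in (False, True):
    label = "multicast" if use_multicast else "unicast"
    print(label, per_host_load(300, use_multicast))
```

The "received" figure is the same in both cases, which is the point: only
the outbound side shrinks.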

> >You could also consider using DNS to
> >propagate the data, with caching servers at various points in the network,
> >assuming you feel the traffic would overwhelm a single server.
>
> Can you explain a bit more on how to get DNS to do this?

A simplistic solution would be to introduce TXT records for each host's
status, and have a DNS hierarchy that delegates to each host its own
status record; then hosts can ask the nearest caching server for the
current status of J. Random Host, and the caching server would return the
latest status available.  You'd have to set the TTL on the TXT records as
low as possible -- I'm sure that subsecond TTLs aren't possible, and I'm
not really sure how well 1 second TTLs would work, but 2-second TTLs might
work.  Conventional BIND could handle this, where each host would run a
copy of BIND, update the zone file for its status, and then reload the
server (assuming the server is only authoritative for that single zone);
this process is likely to take a second or two on its own.  You could
hack BIND to obtain the data from some other source; there is a database
interface available in BIND for this type of work. Note that your DNS info
does not have to be "real" DNS info; you're just using the protocol for
data distribution.
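
As a rough sketch of the "update the zone file" step, something like the
following could regenerate a per-host zone containing a single TXT status
record.  The zone name, the status string format, the SOA timer values and
the 192.0.2.1 address are all placeholders I made up; only the zone file
syntax itself is standard.  Reloading the server afterwards is left out.

```python
def render_status_zone(zone: str, status: str, serial: int,
                       ttl: int = 2, ns_addr: str = "192.0.2.1") -> str:
    """Render a minimal zone file whose only payload is one TXT record."""
    if len(status.encode()) > 255:
        # A single TXT character-string is limited to 255 bytes.
        raise ValueError("status too long for one TXT string")
    # Escape backslashes and double quotes for the quoted TXT string.
    escaped = status.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"$TTL {ttl}",
        # Serial must increase on every rewrite; the timers are placeholders.
        f"@ IN SOA ns.{zone}. hostmaster.{zone}. ({serial} 60 30 3600 {ttl})",
        f"@ IN NS ns.{zone}.",
        f"ns IN A {ns_addr}",
        f'@ IN TXT "{escaped}"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical zone delegated to one host for its own status.
print(render_status_zone("host1.status.example", "state=up load=0.42",
                         serial=2002101101))
```

A client would then ask its nearest caching server for the TXT record of
host1.status.example and get an answer at most a couple of seconds old.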

My underlying idea is to have a group of hosts whose job is to maintain
status information on other hosts, and have enough of them that failure of
some number won't prevent the network from functioning.  The case you
described is one in which the group of hosts includes all hosts on the
network; I'm suggesting that job be restricted to a subset.  As for data
distribution protocols, there are a number of them; DNS and LDAP are only
two.  But I think if you stratify the host status information, similar to
the way DNS and NTP are stratified, you will make better use of bandwidth
without compromising the availability of the data; you'll just get more
latency.
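
Here is a toy simulation of that trade-off, with one caching tier sitting
between 100 polling clients and the host that owns the status.  The class
names, the 2-second TTL and the polling interval are assumptions for
illustration, and this is much simpler than a real DNS cache.

```python
class StatusSource:
    """Stands in for a host that is authoritative for its own status."""

    def __init__(self, status):
        self.status = status
        self.queries = 0  # how many lookups actually reached this host

    def lookup(self, now):
        self.queries += 1
        return self.status


class CachingTier:
    """Answers from cache until the entry is ttl seconds old, then refetches."""

    def __init__(self, upstream, ttl):
        self.upstream = upstream
        self.ttl = ttl
        self.cached = None
        self.fetched_at = None

    def lookup(self, now):
        if self.fetched_at is None or now - self.fetched_at >= self.ttl:
            self.cached = self.upstream.lookup(now)
            self.fetched_at = now
        return self.cached


source = StatusSource("up")
cache = CachingTier(source, ttl=2.0)

# 100 clients each poll every 0.5 s for 10 s of simulated time.
for tick in range(20):
    now = tick * 0.5
    for _ in range(100):
        cache.lookup(now)

print("client lookups: 2000, lookups reaching the host:", source.queries)
```

The price is staleness: a client may see a status up to one TTL old per
tier it goes through, which is the extra latency mentioned above.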
