>Some cells (transarc!) have AFSDB DNS records in place. Are there any
>mechanisms in AFS in place which make use of these DNS records ?
>Are there any plans to incorporate such mechanisms ?
>
>It would be a good thing to be able to get rid of the administration of
>CellServDB.

I definitely agree that CellServDB should die an evil death.  It wouldn't
be so bad if the cache manager did hostname resolution of all the database
servers (at least that would put the cache manager and the userland
utilities in sync), but that's really the root of the problem, isn't it?

I've even sketched out in my head a solution to the problem:

- First, trash the idea of having a list of all possible cells in your
  filesystem.  DNS should be the ultimate authority when it comes to
  cell information.  A CellServDB file should be kept around for older
  sites that haven't upgraded their nameservers to support AFSDB
  records, but by no means should it be an exhaustive list.  This
  would let anyone add a new cell just by creating an AFSDB
  record; instantly everyone on the Internet would know about it!

- In afs_resource.c, change the afs_GetCell(), afs_GetCellByName(),
  and afs_GetCellByIndex() routines to look in the local kernel database
  for cell information, but if it isn't found, call the user-land
  lookup process (see below).

- Modify the DNS resolver routines so they are asynchronous, instead of
  blocking for each DNS call.

- Write a user-level daemon that receives requests for cell name lookups
  from the cache manager and does AFSDB queries to get database server
  information.  I'm not sure what the best way is to notify the userland
  process that the cache manager would like to do a DNS request - I suppose
  sending a SIGUSR1 to the userland process and having it pass the
  information via a pioctl() would be okay.

- Use the TTL from the DNS query to have the cache manager time out the
  entries for cell database server information.  No doubt there are plenty
  of hidden gotchas with this part :-)

- Do some clever things like looking up cell db info a few minutes before
  the timeout so things don't lock up during a DNS lookup when the TTL
  expires.

- Figure out how to deal with DNS errors in a sensible manner.

- Package up the whole thing and put a nice context-diff in the Transarc
  source-contrib area :-)
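
To make the TTL and early-refresh items above a bit more concrete, here is
a rough user-space sketch in Python.  It is only an illustration of the
bookkeeping, not cache manager code: CellCache, REFRESH_AHEAD, refresh_due()
and the lookup callback are names made up for this example, the five-minute
window is an arbitrary guess, and the error policy (keep a stale-but-valid
entry if a refresh fails) is just one possible answer to the "DNS errors"
item.  The lookup callback stands in for whatever would really do the AFSDB
query.

```python
import time

# Assumed value: how long before expiry we try to re-query (seconds).
REFRESH_AHEAD = 300


class CellCache:
    """Hypothetical stand-in for the local cell database, keyed by cell name."""

    def __init__(self, lookup, clock=time.monotonic):
        # lookup(cell) -> (ttl_seconds, [server names]); raises LookupError
        # on a DNS failure.  Both conventions are assumptions of this sketch.
        self.lookup = lookup
        self.clock = clock
        self.entries = {}  # cell name -> (expires_at, servers)

    def _refresh(self, cell):
        ttl, servers = self.lookup(cell)
        self.entries[cell] = (self.clock() + ttl, servers)
        return servers

    def get(self, cell):
        """Return cached servers, falling back to a blocking lookup."""
        entry = self.entries.get(cell)
        if entry is not None and entry[0] > self.clock():
            return entry[1]
        # Miss or expired entry: this is the case that can stall the caller.
        return self._refresh(cell)

    def refresh_due(self):
        """Re-query entries close to expiry; meant to run periodically."""
        now = self.clock()
        for cell, (expires_at, _servers) in list(self.entries.items()):
            if expires_at - now <= REFRESH_AHEAD:
                try:
                    self._refresh(cell)
                except LookupError:
                    # Keep the old entry; it stays usable until it expires,
                    # after which get() will retry the lookup itself.
                    pass
```

The idea is that a daemon calls refresh_due() every minute or so, so that
get() almost never has to wait on DNS for a cell that is already in use.
One obvious hole: a TTL shorter than REFRESH_AHEAD makes every pass
re-query that cell, so a real version would want to scale the window to
the TTL.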

Simple, eh? :-)

Every once in a while I think about actually _implementing_ this nasty
thing, but I'm torn between doing that and getting an AFS server running
on NetBSD :-)

Comments and/or suggestions on the above scheme are welcome.

--Ken
