>Some cells (transarc!) have AFSDB DNS records in place. Are there any
>mechanisms in AFS in place which make use of these DNS records ?
>Are there any plans to incorporate such mechanisms ?
>
>It would be a good thing to be able to get rid of the administration of
>CellServDB.
I definitely agree that CellServDB should die an evil death. It wouldn't
be so bad if the cache manager did hostname resolution of all the database
servers (at least that would put the cache manager and the userland
utilities in sync), but that's really the root of the problem, isn't it?
I've even sketched out in my head a solution to the problem:
- First, trash the idea of having a list of all possible cells in your
filesystem. DNS should be the ultimate authority when it comes to
cell information. A CellServDB file should be kept around for older
sites that haven't upgraded their nameservers to support AFSDB
records, but by no means should it be an exhaustive list. This
would let anyone be able to add a new cell just by creating an AFSDB
record; instantly everyone on the Internet would know about it!
- In afs_resource.c, change the afs_GetCell(), afs_GetCellByName(),
and afs_GetCellByIndex() routines to look in the local kernel database
for cell information, but if it isn't found, call the user-land
lookup process (see below).
- Modify the DNS resolver routines so they are asynchronous, instead of
blocking for each DNS call.
- Write a user-level daemon that receives requests for cell name lookups
from the cache manager and does AFSDB queries to get database server
information. I'm not sure what the best way is to notify the userland
process that the cache manager would like to do a DNS request - I suppose
sending a SIGUSR1 to the userland process and having it pass the
information via a pioctl() would be okay.
- Use the TTL from the DNS query to have the cache manager time out the
entries for cell database server information. No doubt there are plenty
of hidden gotchas with this part :-)
- Do some clever things like looking up cell db info a few minutes before
the timeout so things don't lock up during a DNS lookup when TTL expiration
time comes.
- Figure out how to deal with DNS errors in a sensible manner.
- Package up the whole thing and put a nice context-diff in the Transarc
source-contrib area :-)
Simple, eh? :-)
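To make the lookup order, the TTL expiry and the "refresh a few minutes early" trick a bit more concrete, here is a rough user-space sketch in Python. It is only an illustration, not cache manager code: CellCache, CellEntry, the Resolver signature, static_db and refresh_margin are all made-up names, and the resolver is passed in as a plain function so that no particular DNS library is assumed. Falling back to the static table when the resolver fails is just one possible way to handle DNS errors.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical resolver: cell name -> (database server hostnames, TTL in seconds),
# or None when the cell has no AFSDB record. A real one would do an AFSDB query.
Resolver = Callable[[str], Optional[Tuple[List[str], int]]]


@dataclass
class CellEntry:
    servers: List[str]
    expires_at: float  # time after which the entry must not be used


class CellCache:
    def __init__(self, resolver: Resolver, static_db: Dict[str, List[str]],
                 refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self.resolver = resolver
        self.static_db = static_db          # stands in for a small CellServDB
        self.refresh_margin = refresh_margin
        self.clock = clock
        self.entries: Dict[str, CellEntry] = {}

    def _query(self, cell: str) -> Optional[Tuple[List[str], int]]:
        try:
            return self.resolver(cell)
        except OSError:
            return None  # assumption: treat a resolver failure like "no record"

    def lookup(self, cell: str) -> Optional[List[str]]:
        now = self.clock()
        entry = self.entries.get(cell)
        if entry is not None and now < entry.expires_at:
            return entry.servers            # fresh cached answer, no DNS traffic
        answer = self._query(cell)
        if answer is not None:
            servers, ttl = answer
            self.entries[cell] = CellEntry(list(servers), now + ttl)
            return self.entries[cell].servers
        # No usable DNS answer: fall back to the static table for older sites.
        return self.static_db.get(cell)

    def refresh_due(self) -> List[str]:
        """Re-resolve entries that expire within refresh_margin seconds."""
        now = self.clock()
        refreshed = []
        for cell, entry in list(self.entries.items()):
            if entry.expires_at - now <= self.refresh_margin:
                answer = self._query(cell)
                if answer is not None:      # on failure keep the old entry
                    servers, ttl = answer
                    self.entries[cell] = CellEntry(list(servers), now + ttl)
                    refreshed.append(cell)
        return refreshed
```

The idea would be to call refresh_due() from a periodic timer in the user-level daemon, so that lookup() rarely has to wait on DNS for a cell that is already in use.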
Every once in a while I think about actually _implementing_ this nasty
thing, but I'm torn between doing that and getting an AFS server running
on NetBSD :-)
Comments and/or suggestions to the above scheme are welcome.
--Ken