At the University of Pittsburgh, we currently have ~55,000 user AFS
accounts. When we originally set up AFS, we picked a "balanced
hash" scheme for locating users' home directories: we
created incrementally-named subdirectories under /afs/pitt.edu called
usr0, usr1, usr2, and so forth. Home volumes for new accounts were
mounted in the subdirectory that contained the fewest mountpoints.
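To make the placement rule concrete, here is a rough Python sketch of
"mount the new volume in whichever usrNN directory has the fewest
mountpoints". The function names, the in-memory count table, and the
assumption that each mountpoint is named after the username are all
hypothetical illustrations, not our actual account-creation tooling.

```python
def pick_bucket(counts):
    """Return the usrNN directory that currently has the fewest mountpoints.

    counts maps a directory name (e.g. "usr0") to its mountpoint count.
    Ties are broken by directory name (an arbitrary choice for this sketch).
    """
    return min(counts, key=lambda name: (counts[name], name))


def place_user(username, counts):
    """Choose a directory for a new account and return the home path.

    Assumes (hypothetically) that the mountpoint is named after the user.
    """
    bucket = pick_bucket(counts)
    counts[bucket] += 1  # the new mountpoint now lives in this directory
    return f"/afs/pitt.edu/{bucket}/{username}"


counts = {"usr0": 2, "usr1": 1, "usr2": 1}
first = place_user("alice", counts)
second = place_user("bob", counts)
```

Note that the result depends on the state of the directories at the
moment the account is created, so the path cannot be recomputed later
from the username alone.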
As a convenience for processes that needed to get a user's home
directory but couldn't easily perform getpwnam(), we also created a
single /afs/pitt.edu/usr directory that contained every mountpoint
found in the individual /afs/pitt.edu/usrNN directories.
After several years of experience with this scheme, we've mostly
concluded that the "balanced hash" approach was the wrong one to
take.
One of the things we've found is that it's very valuable to be able to
determine the home directory solely from the username. Expecting
processes to use a lookup scheme (such as getpwnam(), or perhaps a
Hesiod or LDAP lookup) might seem reasonable in theory, but at least for
us, it breaks down in practice. For starters, our WinNT AFS clients
(which we're just starting to bring online) neither know nor care about
our unix passwd files. All of our shell scripts are /bin/sh scripts (as
good shell scripts should be, for portability reasons), and doing passwd
file lookups in /bin/sh means running grep on the passwd file. For
people in other AFS cells (e.g., cs.pitt.edu), getpwnam() lookups
consult the wrong passwd file. The /afs/pitt.edu/usr structure was
supposed to provide an easy way to access a person's home directory
given only their username, but we've found that dealing with a directory
that contains ~55,000 entries is painful at best, and a wedge-o-matic
feature at worst.
(I could probably think of other aspects of the "balanced hash" scheme
that have bitten us in the past, but these are the ones that spring most
easily to mind.)
At any rate, we're currently investigating how easy it would be to move
to a username tree approach (e.g., /afs/pitt.edu/usr/q/r/qralston, or
/afs/pitt.edu/usr/q/r/a/qralston). I don't know if it's something we'll
actually do, as conversion would be a pain. But definitely, if we had
to do it all over again, we would *not* go with the balanced hash
approach.
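As an illustration of why the username tree is attractive, here is a
minimal Python sketch that derives the home directory from nothing but
the username. The function name, the depth parameter, and the handling
of usernames shorter than the tree depth are assumptions made for the
example; we haven't settled on any of those details.

```python
def tree_home(username, depth=2, root="/afs/pitt.edu/usr"):
    """Build a username-tree path such as <root>/q/r/qralston.

    depth is the number of leading characters used as directory levels.
    Usernames shorter than depth simply get fewer levels (an assumption
    for this sketch, not a decided policy).
    """
    if not username:
        raise ValueError("username must not be empty")
    levels = list(username[:depth])  # one directory level per character
    return "/".join([root, *levels, username])


two_level = tree_home("qralston")
three_level = tree_home("qralston", depth=3)
```

No passwd file, Hesiod, or LDAP lookup is involved, so the same rule
could be applied from a /bin/sh script, a WinNT client, or a machine in
another cell.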
--
James Crawford Ralston \ [EMAIL PROTECTED] \ Systems and Networks [CIS]
University of Pittsburgh \ 600 Epsilon Drive \ Pittsburgh PA 15238-2887
"Computer, you and I need to have a little talk." - O'Brien, ST:DS9