Timothy:

As Andrew pointed out, the latency of the file server connection is what
will matter most when performing large numbers of small RPCs such as
directory and file creation.  You mention 23,000+ directories but do not
mention the number of files involved.
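
To put rough numbers on the latency point, here is a back-of-the-envelope
sketch in Python.  The round-trip times and the one-RPC-per-object figure
are assumptions chosen for illustration, not measurements of your
environment, and the function name is made up for this example:

```python
def estimate_rpc_seconds(num_objects, rtt_seconds, rpcs_per_object=1):
    """Lower bound on wall-clock time if every RPC waits on one round trip.

    Assumes the RPCs are issued strictly one after another, which is a
    simplification; rpcs_per_object is a guess, not a measured value.
    """
    return num_objects * rpcs_per_object * rtt_seconds


# 23,000 is the directory count from the original post.
# The RTT values below are hypothetical.
for rtt_ms in (0.5, 5, 50):
    secs = estimate_rpc_seconds(23000, rtt_ms / 1000.0)
    print(f"RTT {rtt_ms} ms: at least {secs:.1f} s for directory creation alone")
```

The file count would be added the same way, which is why it matters that
it is missing from your description.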

For rsync to work it must read and stat every single file in every
directory in the tree.  Therefore, either the cache must have enough
room to hold the stat info for every directory entry, or that info must
be fetched from the file server.
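
If you do not know how many entries are involved, a small script can
count them.  This is only a sketch: the function name is invented for
this example, and the walk itself stats the tree, so it will be subject
to the same latency if you run it against AFS rather than the source copy:

```python
import os
import sys


def count_tree_entries(root):
    """Return (directories, files) found below root, excluding root itself.

    os.walk does not follow symbolic links to directories by default;
    such links are counted as directories but not descended into.
    """
    dirs = 0
    files = 0
    for _path, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_entries.py /path/to/source/tree
    d, f = count_tree_entries(sys.argv[1])
    print(f"{d} directories, {f} files, {d + f} entries to stat")
```

The total can then be compared with the number of stat entries your
cache is configured to hold.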

You also do not mention where you are performing the rsync.  If you are
doing so locally, you may find that you get better performance by
running an rsync daemon on the AFS file server containing the volumes
you are syncing to.

Since you are syncing and do not care about reading the data back from
AFS, you should be using the cache bypass option.

You mention a desire for OSD.  I'm not sure that OSD is going to help
you if most of your time is spent performing metadata operations rather
than actual data transfer.  Based upon what I have heard so far, my guess
is that the primary expense is in the metadata operations.

Jeffrey Altman

