Timothy:

As Andrew pointed out, the latency of the file server connection is what will matter most when performing large numbers of small RPCs such as directory and file creation. You mention 23,000+ directories but do not mention the number of files involved.
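To put a rough number on that, here is a minimal sketch that counts the directories and files under a tree and multiplies by a round-trip time. The function names, the `rpcs_per_entry` parameter, and the assumption that each entry costs a fixed number of strictly serial round trips are mine for illustration; real clients batch and cache some of these calls, so treat the result as an order-of-magnitude guide only.

```python
import os


def count_entries(root):
    """Walk the tree under root and return (directories, files) found below it."""
    dirs = 0
    files = 0
    for _path, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return dirs, files


def estimate_metadata_seconds(entries, round_trip_ms, rpcs_per_entry=1):
    """Rough estimate assuming each entry costs rpcs_per_entry serial round trips.

    Both rpcs_per_entry and the serial-RPC model are illustrative assumptions,
    not measured OpenAFS behavior.
    """
    return entries * rpcs_per_entry * round_trip_ms / 1000.0
```

For example, `estimate_metadata_seconds(23000, 10)` gives 230 seconds for the directories alone at an assumed 10 ms round trip and one RPC each, before a single file is counted. Run `count_entries` on the source tree to get the real number of entries.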
For rsync to work it must read and stat every single file in every directory in the tree. Therefore, the cache must either have enough room to hold the stat info for every directory entry, or that info must be fetched from the file server.

You also do not mention where you are performing the rsync. If you are doing so locally, you may find that you get better performance by running an rsync daemon on the AFS file server containing the volumes you are syncing to. Since you are syncing and do not care about reading the data back from AFS, you should be using the cache bypass option.

You mention a desire for OSD. I'm not sure that OSD is going to help you if most of your time is spent performing metadata operations and not actual data transfer. Based upon what I have heard so far, my guess is that the primary expense is in the metadata operations.

Jeffrey Altman
