How "cp" buffers will be operating system dependent
(in the good old days, it would have been 512 byte blocks),
but that it will buffer is more or less a given. Chances
are that it will be something like 4K, and that may well
be close to optimal when reading from an NFS file server.
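
To make the buffering point concrete, here is a minimal sketch of the
kind of read/write loop a copy utility runs. It is not the actual "cp"
source; the function name copy_buffered and the 4096-byte default are
just assumptions for illustration. It returns the number of writes
issued, so you can see how the buffer size changes the count.

```python
import io

def copy_buffered(src, dst, bufsize=4096):
    """Copy src to dst in bufsize-byte pieces; return the number of writes issued."""
    writes = 0
    while True:
        block = src.read(bufsize)
        if not block:           # an empty read means end of file
            break
        dst.write(block)
        writes += 1
    return writes

if __name__ == "__main__":
    data = b"x" * 10000         # stand-in for a small source file
    for size in (512, 4096, 65536):
        out = io.BytesIO()
        print(size, copy_buffered(io.BytesIO(data), out, size))
```

A 10000-byte file takes 3 writes at 4K; whether each one turns into a
network round trip depends on the file system underneath.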

For "AFS", there are a couple of constraints: the first
is that of the cache manager. You will want a large
enough cache to hold all of a file. This probably
won't be at all hard for a tree of mostly small source
files. Assuming the cache is large enough, most of the AFS
time and overhead will be in close, flushing the file to the
file server. You can realize some modest gains here by making
the chunk size large (but the default of 64K is probably plenty
big), and making sure there are enough afsd daemons. (The
default config of "2 + 2" only gives you 2 daemons;
if you can bump that up a bit, you may be able to put
most of the file system write activity into the background.)
If there are enough files over 64K, but not way over,
it would be worth increasing the chunk size to hold
most of those files, since only the last chunk
of the file is written asynchronously to the file server.
You'll need to edit /etc/rc (or the equivalent), find
the run-line for "afsd", append something like "-daemons 8",
and reboot the machine. Of course, if NFS can't
supply data faster than AFS can write it, then there
will be at most one write pending to AFS and even
the default number of daemons is fine.
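
If you want to check whether a bigger chunk size is worth it for your
tree, something like the following sketch would do. It walks the tree,
finds the size below which most files fall, and rounds up to a power of
two. The names tree_file_sizes, chunk_exponent and afsd_args are made
up for this example, and so is the 90% coverage figure. I am assuming
afsd's "-chunksize" option takes the base-2 log of the chunk size
(16 for 64K); check the afsd documentation for your release before
relying on that.

```python
import math
import os

def tree_file_sizes(root):
    """Return the sizes in bytes of all regular files under root."""
    sizes = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                sizes.append(os.path.getsize(path))
    return sizes

def chunk_exponent(file_sizes, coverage=0.9, floor=16):
    """Smallest exponent e >= floor such that a 2**e byte chunk holds
    the requested fraction of the files whole (floor=16 is 64K)."""
    if not file_sizes:
        return floor
    ordered = sorted(file_sizes)
    idx = min(len(ordered) - 1, max(0, math.ceil(coverage * len(ordered)) - 1))
    target = max(ordered[idx], 1)
    # (target - 1).bit_length() is ceil(log2(target)) for target >= 1
    return max(floor, (target - 1).bit_length())

def afsd_args(file_sizes, daemons=8):
    """Build a suggested argument string; the flag meanings are assumptions."""
    return "-chunksize %d -daemons %d" % (chunk_exponent(file_sizes), daemons)

if __name__ == "__main__":
    print(afsd_args(tree_file_sizes(".")))
```

For a tree of mostly small source files this should just come back
with 16, which is another way of saying the default is fine.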

I am not so familiar with NFS. About all I can suggest
is placing the client close to the NFS file server,
and doing the copy when the NFS server is otherwise unoccupied.

Perhaps the best way to speed it all up would be to
eliminate the overhead of NFS entirely -- like using
"tar" directly on the NFS file server to dump the
data to tape. Then an AFS client with a tape drive can extract
from the tape to AFS, and bumping up the # of "afsd"s will
be a big win.

Besides "cpio" and "tar", the other utility to try might
be "up". But, actually, "up" isn't all that smart either;
it just buffers at 4K boundaries, so it won't be any
better than "cp -r". (The big win with "up" is that
it also copies AFS file permissions -- not an issue here.)

The only other improvement I can think of would be to
divide up the tree, and use several clients to copy
different parts of the tree from NFS to AFS. If
the network and file servers can keep up, that may
well be worth it for at least a few clients.
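
One way to divide up the tree, sketched below, is to total up each
top-level subdirectory and hand them out largest first, always to the
client with the least work so far. This is only an illustration: the
name split_tree and the sample sizes are hypothetical, and it balances
bytes only, not file counts or server load.

```python
import heapq

def split_tree(subtree_sizes, clients):
    """Assign subtrees to clients, largest first, always to the
    least-loaded client. Returns one list of subtree names per client."""
    heap = [(0, i) for i in range(clients)]    # (bytes assigned, client index)
    plan = [[] for _ in range(clients)]
    biggest_first = sorted(subtree_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, size in biggest_first:
        load, i = heapq.heappop(heap)          # client with the least work so far
        plan[i].append(name)
        heapq.heappush(heap, (load + size, i))
    return plan

if __name__ == "__main__":
    # made-up subtree sizes, in megabytes
    sizes = {"src": 50, "lib": 30, "doc": 20, "bin": 10}
    for client, names in enumerate(split_tree(sizes, 2)):
        print(client, names)
```

Each client would then run its own "cp -r" (or whatever you settle on)
over just the subtrees in its list.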
-Marcus Watts
UM ITD RS Umich Systems Group