Benchmarking is one of those relatively thankless jobs;
it takes lots of time, and if you're brave enough to post
your results, somebody else always comes along with some
clever question you hadn't thought of and for which you
don't have a good answer. So, before I say anything
else, let me say I appreciate the time & effort you
put into making your tests, and thank you for the courtesy
of reporting your results here. It may well be that
for the particular application you were interested in,
the numbers you got were just what you were looking for.
But if so, I fear your application is probably not really
representative of the sort of application load most
people would expect to see on their clients & servers.
So, the remainder of this message is meant not to discourage you
from doing more of the same, but rather to alert you to some
additional possibilities and avenues to explore.
About machine configuration
The RS/6K has a relatively clever scheme to manage
network interfaces - "ifconfig en0 detach" will actually
make a network interface disappear, & the networking
configuration is stored in /etc/objrepos and so can be
readily changed merely by modifying the files & rebooting.
So, it should be pretty easy to run with both interfaces
but without any concerns about all those messy "multi-homed" issues.
The RS/6K disk configuration also gets squirreled away into
/etc/objrepos. In fact, "odmget CuVPD" will tell you more than you
ever wanted to know about your RS/6K's innards, albeit
in a form that's almost totally incomprehensible.
About AFS writes
When doing AFS writes, the major part of the overhead of
writing the file is not in the write call, but when the file
is closed. For a large file (many megabytes), close can easily
take many many seconds. If bonnie only measures the effect
of the "write" calls, and doesn't measure "close", you may
mostly be measuring the performance of the cache, not of
ATM. Various recent versions of AFS have dallied with
various degrees of "asynchronous" file flushing, hence,
even if close isn't instantaneous, it may not necessarily
indicate that the entire file has reached the file server.
One way to force a complete flush is to use "ioctl(fd, VIOCCLOSEWAIT, &blob)",
which causes the cache manager to block in close until the last
chunk is acknowledged by the file server. There's also a
"-waitclose" option to afsd that makes the same thing
happen on a per-client basis.
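For what it's worth, here's roughly what the close-wait approach looks
like in code. Treat it as an untested sketch: the header names and the
precise form of the VIOCCLOSEWAIT ioctl (and whether it wants a
ViceIoctl blob at all) vary from one AFS release to the next, so those
details are assumptions on my part.

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <afs/vice.h>      /* assumed home of struct ViceIoctl */
    #include <afs/venus.h>     /* assumed home of VIOCCLOSEWAIT */

    /* write a buffer and don't return from close until the file
       server has acknowledged the last chunk */
    int
    write_and_wait(char *path, char *buf, int len)
    {
        struct ViceIoctl blob;
        int fd;

        if ((fd = open(path, O_WRONLY|O_CREAT|O_TRUNC, 0644)) < 0)
            return -1;
        (void) write(fd, buf, len);        /* error checking omitted */

        blob.in = blob.out = 0;
        blob.in_size = blob.out_size = 0;
        (void) ioctl(fd, VIOCCLOSEWAIT, &blob);

        return close(fd);   /* now this includes the store to the server */
    }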
Another way to measure "write" performance would be to
do an "fsync" on the file every so often. This will
cause any dirty chunks to be written to the file server
and so guarantees that all the network overhead will
be measured. In recent versions of AFS, it will also
cause the server to do an "fsync", which isn't quite so
convenient.
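A rough sketch of that style of test follows - nothing bonnie-specific
here, and the sizes and the fsync interval below are arbitrary numbers
I made up:

    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/time.h>

    #define CHUNK      (64*1024)    /* bytes per write() */
    #define NCHUNKS    1024         /* 64 MB total */
    #define SYNC_EVERY 16           /* fsync() after this many writes */

    int
    main(int argc, char **argv)
    {
        static char buf[CHUNK];
        struct timeval t0, t1;
        double secs;
        int fd, i;

        memset(buf, 'x', sizeof buf);
        fd = open(argc > 1 ? argv[1] : "testfile",
                  O_WRONLY|O_CREAT|O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        gettimeofday(&t0, 0);
        for (i = 0; i < NCHUNKS; i++) {
            if (write(fd, buf, CHUNK) != CHUNK) { perror("write"); return 1; }
            if ((i+1) % SYNC_EVERY == 0)
                fsync(fd);          /* push dirty chunks to the file server */
        }
        fsync(fd);
        close(fd);
        gettimeofday(&t1, 0);

        secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec)/1.0e6;
        printf("%d KB in %.2f sec = %.0f KB/sec\n",
               NCHUNKS*CHUNK/1024, secs, NCHUNKS*CHUNK/1024/secs);
        return 0;
    }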
The "per-character" tests (read & write) are almost certainly
mostly measuring the performance of the AIX compiler and
RS/6K CPU. The high percentage of CPU utilization
there is almost certainly the RS/6K slogging its way
through many many iterations of the macro expansion of putc,
doing, in effect, a very slow version of memset or memcpy
into the user space stdio buffer, and only occasionally doing
an "flsbuf", which will even more occasionally result in
doing anything more than adding another few K of data to
an already existing partially filled chunk.
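To make that concrete, the per-character write test boils down to a
loop like the one below; nearly all of the time goes into the inline
buffer test and store that putc expands to, and only one iteration per
stdio bufferful ever gets anywhere near the file system. (The file
name and size here are arbitrary.)

    #include <stdio.h>

    int
    main(void)
    {
        FILE *fp = fopen("putc-test", "w");
        long i, nbytes = 8L*1024*1024;     /* 8 MB, one byte at a time */

        if (fp == NULL) { perror("fopen"); return 1; }
        for (i = 0; i < nbytes; i++)
            putc('x', fp);   /* typically a macro: decrement a count, store a
                                byte, and only occasionally call _flsbuf() to
                                hand a whole bufferful to write() */
        fclose(fp);
        return 0;
    }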
About AFS caching
If the client really has 512M of RAM, then a 10 M memory cache
would seem to be a viable concept. (But not with a
1 M chunk size!)
On the other hand, the performance with much smaller chunk
sizes would be interesting to know. While there are
applications where large file size matters, there
are plenty more which deal with smaller files. I forget
the exact statistic - but something like 80% of all
files are less than 4K in size. In one arbitrary directory
I picked (the last program I compiled), I have 208 files -
exactly one of which is over 1 M in size. 56 of them are
over 4K in size. The remaining 152 are not more than 4K in size.
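If you want the same census for your own directories, something like
this quick and dirty (and untested) program will do it; the 4K and 1M
thresholds are just the ones used above:

    #include <stdio.h>
    #include <dirent.h>
    #include <sys/stat.h>

    int
    main(int argc, char **argv)
    {
        DIR *d;
        struct dirent *de;
        struct stat st;
        char path[1024];
        long small = 0, medium = 0, huge = 0;

        if (argc < 2 || (d = opendir(argv[1])) == NULL) {
            perror("opendir");
            return 1;
        }
        while ((de = readdir(d)) != NULL) {
            sprintf(path, "%s/%s", argv[1], de->d_name);
            if (stat(path, &st) < 0 || !S_ISREG(st.st_mode))
                continue;                   /* skip directories, etc. */
            if (st.st_size > 1024*1024)
                huge++;
            else if (st.st_size > 4096)
                medium++;
            else
                small++;
        }
        closedir(d);
        printf("<=4K: %ld   4K-1M: %ld   >1M: %ld\n", small, medium, huge);
        return 0;
    }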
For IFS, with Macintoshes, I found that a very small chunk size
produces BETTER performance - the way netatalk works is that it
stores an .AppleDouble file with essential pieces of directory
information in it. When the typical Mac user sniffs around things
with the Finder, that results in reading about the first 4K chunk
of many hundreds of files -- with the default 64K chunk size,
that means about 80% of the information fetched was never used.
When you do an "fs flushv" - do you do a "system("fs flush X")",
or do you do a "pioctl("X", VIOC_FLUSHVOLUME, &blob, 0)"?
With the former, there may be a significant additional
amount of overhead in loading the shell, loading fs, and perhaps
in terms of probing the path to find "fs".
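The difference in mechanism looks roughly like this. The pioctl
prototype and the VIOC_FLUSHVOLUME name are the ones used above, but
the headers they live in (and whether pioctl is declared for you at
all) depend on your AFS release, so consider those details assumptions:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <afs/vice.h>      /* assumed: struct ViceIoctl */
    #include <afs/venus.h>     /* assumed: VIOC_FLUSHVOLUME */

    extern int pioctl(char *path, int cmd, struct ViceIoctl *blob, int follow);

    void
    flush_with_shell(char *path)
    {
        char cmd[1100];

        /* forks a shell, which searches $PATH for "fs", loads and execs
           it, parses arguments, ... all before any flushing happens */
        sprintf(cmd, "fs flushv %s", path);
        system(cmd);
    }

    void
    flush_with_pioctl(char *path)
    {
        struct ViceIoctl blob;

        /* one trip into the cache manager, no fork/exec at all */
        memset(&blob, 0, sizeof blob);
        pioctl(path, VIOC_FLUSHVOLUME, &blob, 0);
    }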
With a "warm" cache, it's certainly the case that if the
file is already in the cache, little or no network I/O happens;
with the large client memory size, it's probably already
in memory, hence, the high CPU rates are not much of
a surprise - clearly the only thing that could be
happening is one of several sorts of memory copy on
the client.
I'm not really at all sure what your "lseek" tests have measured.
With Unix, "lseek" per se is actually only a good measure of system
call handling. Does "bonnie" read anything after it seeks? If so,
it might be a good measure of random I/O. If not, then perhaps what
is being measured here is the overhead of fetching the volume header,
or of directory lookups, or something else. The low CPU utilization
you report certainly suggests whatever is being measured is not
system call overhead!
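In other words, the distinction is roughly the one between these two
loops (offsets and sizes arbitrary) - the first is little more than a
system-call benchmark, while the second forces real, possibly remote,
I/O on every iteration:

    #include <sys/types.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* just measures how fast the kernel can field lseek() calls */
    void
    seek_only(int fd, long filesize, int iters)
    {
        while (iters-- > 0)
            lseek(fd, (off_t)(random() % filesize), SEEK_SET);
    }

    /* each iteration actually pulls data - on AFS, possibly a whole
       chunk from the file server if it isn't cached yet */
    void
    seek_and_read(int fd, long filesize, int iters)
    {
        char buf[512];

        while (iters-- > 0) {
            lseek(fd, (off_t)(random() % filesize), SEEK_SET);
            (void) read(fd, buf, sizeof buf);
        }
    }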
About network types and loading
For most people, switching to 100% ATM is not a viable
near term option. A much more likely scenario is one where
the DB servers & file servers are on ATM, but most of the
workstation clients are on several ethernet segments.
Some of the questions I'd want to ask are:
what is the optimal chunksize & MTU?
how many active ethernet clients can one ATM server handle?
how many active ATM servers does it take to saturate ATM?
It would also be fascinating to compare ATM and fddi
performance. Here at the UofM, while most clients are
still ethernet'd, the DB servers & file servers are all
on fddi. So, for us, the interesting comparison is
not with ethernet, but rather with fddi, and the interesting
metric is not "how fast is one client" but rather "how
many users, and how many clients, vs. how fast."
It would be interesting to know the rate of CPU utilization
on the file server, as well as the amount of disk I/O
on the server, and the amount of network traffic going
on between the two machines, during the tests. A crude
way to get this information would be to run back-to-back
tests on the client machine and, while those are running,
use "iostat" and "netstat" on the server.
In Conclusion
I hope I haven't scared you away from benchmarking. It can be quite
tricky to come up with meaningful results, and it's positively
dangerous to associate meanings with the results. Nevertheless, it
can still be quite educational, and there's certainly no substitute
when it comes to questions like "how fast is it?"
-Marcus Watts
UM ITD RS Umich Systems Group