The answers to your questions depend a lot on the type of
use you actually make of your file server. And for a lot
of that, you will want to make careful performance measurements
of your setup. For instance:
there is little advantage to multiple I/O paths simply for
pure "file server" use. The file server runs as a user
level process, so all reads and writes are "synchronous".
Unless the read is in the buffer pool (and it probably
isn't, because it wasn't in your client's cache), it
has to block for the read, and nothing else is going to
happen until that read returns. Writes could in theory
proceed asynchronously. However, the file server does
"fsync"s after writing each chunk, so it's still going to block
until the data reaches the disk. So, so far as the file
server goes, the only thing that counts is how fast a single
read or write is, not how many reads or writes it can have
pending.
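As a very rough illustration of that, you can compare ordinary
buffered writes against writes that are forced out to disk one
chunk at a time (this assumes a dd that understands GNU-style
oflag=dsync; /vicepa/ddtest is just an example scratch path on one
of the server's local partitions, nothing AFS-specific):

    # ordinary buffered writes
    time dd if=/dev/zero of=/vicepa/ddtest bs=64k count=256
    # force each 64k chunk to disk before the next write begins,
    # roughly what the per-chunk fsync behavior costs you
    time dd if=/dev/zero of=/vicepa/ddtest bs=64k count=256 oflag=dsync
    rm /vicepa/ddtest

The difference between those two numbers is dominated by exactly
that per-operation latency.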
So far as fast/slow SCSI goes, if you want to maximize
performance and care about nothing else, then it would
be worth running benchmarks on your proposed configurations
and measuring performance from a Unix read/write call
through the local filesystem on the server. SCSI bus transfer
overhead is just one stage in a complex sequence of events;
other factors, such as interrupt latency, controller card command
processing, seek time, rotational latency, and software block
allocation algorithms, all have to be tuned just right
for peak performance. For any other use, I believe it
would make more sense to base fast/slow SCSI decisions on other
factors, such as cost, availability, and convenience. It
may be far more valuable to have all of your servers using
the same kind of SCSI interface (so that you can swap drives &
cabling) than to try to fine-tune the hardware configuration
for each new model of file server.
If all your clients are concentrated on a relatively small
number of Ethernets, FDDI may not be a huge advantage.
(Unless you frequently shuffle volumes between file servers.)
Network performance can be just as important as
disk performance - and network cards can have their
weird internal controller delays & throughput problems
just like disk controllers and drives. Routers, too,
are almost never as fast as the networks they connect.
If your users tend to concentrate their work on a small number of files,
and mostly do reads, file server performance may be almost
completely irrelevant. Once the data is in the client's cache, the file
server is
pretty much out of the loop, and it's mostly a question of how fast
the client machine can shuffle data between the cache & programs.
Adding that "last little bit of cache" may make a larger
difference to what your users see than any amount of effort
on all the other pieces.
backups probably consume more of your file server than daily use. In
fact, for most cases, the greatest advantage to having a fast
file server is probably in terms of backups, not in terms
of actual file server response. Backups go through a separate
Unix process from the file server - so its I/O should in fact be
separate - and so there are some advantages to parallelism here.
The "dump" format for volumes is basically unaligned byte-stream
data; and there are plenty of other inefficiencies in the whole
backup process that all cost CPU. Probably the largest win would be
in making sure each machine has its own tape drive (to ensure the
least amount of network traffic) & it may be an advantage to put the
tape drive on its own SCSI channel (the exact performance hit would
be worth measuring). Scheduling can also make a difference;
if you can do your tape backups at night, you may not care
at all how much of the server it eats.
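One crude way to see how much of that is the dump itself rather
than the tape is to time a full dump of a representative volume to
the null device (the volume name here is just an example, and
you'll need the appropriate AFS admin privileges):

    # full dump (-time 0) of one volume, discarding the output, so
    # you see the dump & network cost with no tape drive involved
    time vos dump -id user.example -time 0 -file /dev/null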
database machine configuration is a key performance factor. The latest AFS
releases are a lot more memory hungry than older releases; and even if
your cell has not grown much, chances are your AFS backup database
is a lot larger than it used to be. Slow database machines
will slow down even the speediest clients & file servers
(and *will* even slow down backups), so it's worth making sure your
database servers aren't overloaded, are fast, have more
than enough memory to avoid paging, don't do anything but DB service,
and have plenty of fast disk for that.
Client machine configuration makes a difference too, as do network
paths and client subnet congestion. For best performance, you have to
look at the whole picture and identify the bottlenecks, instead of just
concentrating on one piece.
To get the numbers on performance bottlenecks:
the server & cache manager both come with extensive
instrumentation; with a bit of effort you can learn
far more about what each is doing than you ever
wanted to know. If you don't have source, the numbers may not
make as much sense - presumably Transarc can help there.
A good network sniffer can help you figure
out where the packets are going and what's being slow.
Even if you can't decipher the rx packets, knowing
which machines & ports things are flying between can
still help you identify which places to look at.
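If you don't have a real sniffer handy, something like tcpdump will
do for the "which machines & ports" part; AFS file server traffic
normally runs over UDP port 7000, so for example:

    # watch who is talking to the file server port, no name lookups
    tcpdump -n udp port 7000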
Standard Unix tools, such as vmstat, iostat, netstat,
and even ps should not be underestimated. "time" and "dd"
can be used as a very quick & dirty "disk performance"
benchmark, one you can even do in the offices of your not
quite favorite local sales sleaze when his back is turned.
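For instance, something along these lines gives you a rough
sequential read number in a minute or so (the device name is made
up - use whatever the raw disk device is called on the machine in
question, and you'll generally need root to read it):

    # read 64MB straight off the raw disk & see how long it takes
    time dd if=/dev/rsd0c of=/dev/null bs=64k count=1024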
You will probably get the best overall performance by having a relatively
large number of relatively small, fairly fast file servers, each with
its own tape drive, & with sufficient overall network capacity. Every Unix
workstation company makes a number of really nifty low-cost workstations
that perform almost as well as their big fat servers at a fraction of the
cost; for AFS use, these workstations make an attractive bargain, and
you'll probably even save enough buying these to do something crazy like
buying "hot spare"s. So for most purposes, you will probably save yourself
the most grief by standardizing on a relatively small number of configurations
and emphasizing interchangeability and flexibility first. For a more definitive
answer than that, you really need to study the bottlenecks, needs, and
constraints of your particular site.
I hope this helps, even if it wasn't a cookbook
"2X + 3Y = # of servers to buy".
-Marcus Watts
UM ITD RS Umich Systems Group