Hi Phil,
Thanks for running these tests.
I think this buffer size will be dependent on the machine
configuration, right?
If we work out a simple formula for the buffer size based on, say,
memory bandwidth (and/or latency) and network bandwidth (and/or
latency), we could plug that in as a sane default.
I did not realize that this setting would have such a noticeable
effect on performance.
I can work on a patch to change these settings at runtime.
thanks,
Murali
> >> - single client
> >> - 16 servers
> >> - gigabit ethernet
> >> - read/write tests, with 40 GB files
> >> - using reads and writes of 100 MB each in size
> >> - varying number of processes running concurrently on the client
> >>
> >> This test application can be configured to be run with multiple
> >> processes and/or multiple client nodes. In this case we kept
> >> everything on a single client to focus on bottlenecks on that side.
> >>
> >> What we were looking at was the kernel buffer settings controlled in
> >> pint-dev-shared.h. By default PVFS2 uses 5 buffers of 4 MB each.
> >> After experimenting for a while, we made a few observations:
> >>
> >> - increasing the buffer size helped performance
> >> - using only 2 buffers (rather than 5) was sufficient to saturate the
> >> client when we were running multiple processes; adding more made only
> >> a marginal difference
> >>
> >> We found good results using 2 x 32MB buffers. Here are some
> >> comparisons between the standard settings and the 2 x 32MB
> >> configuration:
> >>
> >> results for RHEL4 (2.6 kernel):
> >> ------------------------------
> >> 5 x 4MB, 1 process: 83.6 MB/s
> >> 2 x 32MB, 1 process: 95.5 MB/s
> >>
> >> 5 x 4MB, 5 processes: 107.4 MB/s
> >> 2 x 32MB, 5 processes: 111.2 MB/s
> >>
> >> results for RHEL3 (2.4 kernel):
> >> -------------------------------
> >> 5 x 4MB, 1 process: 80.5 MB/s
> >> 2 x 32MB, 1 process: 90.7 MB/s
> >>
> >> 5 x 4MB, 5 processes: 91 MB/s
> >> 2 x 32MB, 5 processes: 103.5 MB/s
> >>
> >>
> >> A few comments based on those numbers:
> >>
> >> - on 3 out of 4 tests, we saw a 13-15% performance improvement by
> >> going to 2 x 32 MB buffers
> >> - the remaining test (5 process RHEL4) probably did not see as much
> >> improvement because we maxed out the network. In the past, netpipe
> >> has shown that we can get around 112 MB/s out of these nodes.
> >> - the RHEL3 nodes are on a different switch, so it is hard to say how
> >> much of the difference from RHEL3 to RHEL4 is due to network topology
> >> and how much is due to the kernel version
> >>
> >> It is also worth noting that even with this tuning, the single
> >> process tests are about 14% slower than the 5 process tests. I am
> >> guessing that this is due to a lack of pipelining, probably caused by
> >> two things:
> >> - the application only submitting one read/write at a time
> >> - the kernel module itself serializing when it breaks reads/writes
> >> into buffer-sized chunks
> >>
> >> The latter could be addressed by either pipelining the I/O through
> >> the bufmap interface (so that a single read or write could keep
> >> multiple buffers busy) or by going to a system like Murali came up
> >> with for memory transfers a while back that isn't limited by buffer
> >> size.
> >>
> >> It would also be nice to have a way to set these buffer settings
> >> without recompiling - either via module options or via
> >> pvfs2-client-core command line options. For the time being we are
> >> going to hard-code our tree to run with the 32 MB buffers. The 64 MB
> >> of RAM that this uses up (vs. 20 MB with the old settings) doesn't
> >> really matter for our standard node footprint.
> >>
> >> -Phil
> >> _______________________________________________
> >> Pvfs2-developers mailing list
> >> [email protected]
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >>
> >
>
>
<kmod-bufsizes.patch>