Murali, Phil,

I've gone ahead and committed this patch.  Thanks Murali!

-sam

On Nov 29, 2006, at 10:21 PM, Murali Vilayannur wrote:

Hi Phil,
Attached patch fixes the read buffer bug that you had mentioned and
also implements the variable sized buffer counts and lengths that we
can pass as command line options to pvfs2-client-core.
I did not implement module load-time options for the buffer size
settings, since that is fairly complicated and not intuitive (the
client core driving the buffer size and count settings seems to make
more sense to me).

So now we can do
pvfs2-client --desc-count=<NUM1> --desc-size=<NUM2>
in addition to the usual options.
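As a sketch of the option handling described above, the following Python model shows how --desc-count and --desc-size might be parsed and validated. The option names come from this thread; the defaults (5 buffers of 4 MB) match the stock settings Phil describes below. pvfs2-client-core itself is C, so this is only an illustrative model, not its actual code.

```python
import argparse

def parse_desc_opts(argv):
    """Parse the (hypothetical) buffer descriptor options.

    Defaults mirror the stock kernel buffer settings mentioned in
    this thread: 5 buffers of 4 MB each.
    """
    p = argparse.ArgumentParser(prog="pvfs2-client")
    p.add_argument("--desc-count", type=int, default=5,
                   help="number of kernel transfer buffers")
    p.add_argument("--desc-size", type=int, default=4 * 1024 * 1024,
                   help="size of each transfer buffer in bytes")
    # ignore "the usual options" the real client also accepts
    args, _rest = p.parse_known_args(argv)
    if args.desc_count < 1 or args.desc_size < 1:
        raise ValueError("buffer count and size must be positive")
    return args.desc_count, args.desc_size
```

For example, the 2 x 32 MB configuration from the tests below would be requested as `--desc-count=2 --desc-size=33554432`.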
With regard to the changes themselves, this involved modifying the
parameters of an existing ioctl, and so we break binary compatibility,
but I don't think we have a policy of maintaining backward binary
compatibility, do we?
I have updated the compat ioctl code as well, so hopefully we won't
break in mixed 32-64 bit environments.
I have tested this out with various buffer sizes and counts on 32-bit
platforms only!
That said, I haven't done comprehensive testing, so there may still be
bugs.
Please review it and let me know if this looks ok.
BTW: the patch is against pvfs-2.6.0, sorry about that.
cvs ports are firewalled off at work and my internet at home is
temporarily not working.
thanks,
Murali


On 11/29/06, Murali Vilayannur <[EMAIL PROTECTED]> wrote:
Hi Phil,
Thanks for running these tests.
I think this buffer size will be dependent on the machine configuration, right?
If we work out a simple formula for the buffer size based on, say,
memory bandwidth (and/or latency) and network bandwidth (and/or
latency), we could plug that in as a sane default.
I did not realize that this setting would have such a noticeable effect
on performance.
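One way to sketch the "sane default" idea above is a bandwidth-delay style heuristic: size each buffer to the amount of data in flight over some target interval, rounded to a power of two and clamped to a sane range. Every number and name here is an illustrative assumption, not a measured value or real PVFS2 logic.

```python
def default_buffer_size(net_bw_bytes_per_s, drain_interval_s,
                        floor=4 * 1024 * 1024, ceil=64 * 1024 * 1024):
    """Toy heuristic for a default kernel buffer size.

    Picks the smallest power of two covering bandwidth * interval,
    clamped between a 4 MB floor and a 64 MB ceiling.  Purely a
    sketch - a real default would need measurement on the target
    machine, as the email suggests.
    """
    target = net_bw_bytes_per_s * drain_interval_s
    size = 1
    while size < target:
        size *= 2  # keep sizes power-of-two / page-friendly
    return max(floor, min(size, ceil))
```

With the ~112 MB/s gigabit figure from Phil's tests and an assumed quarter-second interval, this lands on the 32 MB size that worked well empirically; with a tiny interval it falls back to the 4 MB floor.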
I can work on a patch to change these settings at runtime.
thanks,
Murali

> >> - single client
> >> - 16 servers
> >> - gigabit ethernet
> >> - read/write tests, with 40 GB files
> >> - using reads and writes of 100 MB each in size
> >> - varying number of processes running concurrently on the client
> >>
> >> This test application can be configured to be run with multiple
> >> processes and/or multiple client nodes.  In this case we kept
> >> everything on a single client to focus on bottlenecks on that side.
> >>
> >> What we were looking at was the kernel buffer settings controlled in
> >> pint-dev-shared.h.  By default PVFS2 uses 5 buffers of 4 MB each.
> >> After experimenting for a while, we made a few observations:
> >>
> >> - increasing the buffer size helped performance
> >> - using only 2 buffers (rather than 5) was sufficient to saturate the
> >> client when we were running multiple processes; adding more made only
> >> a marginal difference
> >>
> >> We found good results using 2 32MB buffers.  Here are some
> >> comparisons between the standard settings and the 2 x 32MB
> >> configuration:
> >>
> >> results for RHEL4 (2.6 kernel):
> >> ------------------------------
> >> 5 x 4MB, 1 process: 83.6 MB/s
> >> 2 x 32MB, 1 process: 95.5 MB/s
> >>
> >> 5 x 4MB, 5 processes: 107.4 MB/s
> >> 2 x 32MB, 5 processes: 111.2 MB/s
> >>
> >> results for RHEL3 (2.4 kernel):
> >> -------------------------------
> >> 5 x 4MB, 1 process: 80.5 MB/s
> >> 2 x 32MB, 1 process: 90.7 MB/s
> >>
> >> 5 x 4MB, 5 processes: 91 MB/s
> >> 2 x 32MB, 5 processes: 103.5 MB/s
> >>
> >>
> >> A few comments based on those numbers:
> >>
> >> - on 3 out of 4 tests, we saw a 13-15% performance improvement by
> >> going to 2 32 MB buffers
> >> - the remaining test (5 process RHEL4) probably did not see as much
> >> improvement because we maxed out the network.  In the past, netpipe
> >> has shown that we can get around 112 MB/s out of these nodes.
> >> - the RHEL3 nodes are on a different switch, so it is hard to say how
> >> much of the difference from RHEL3 to RHEL4 is due to network topology
> >> and how much is due to the kernel version
> >>
> >> It is also worth noting that even with this tuning, the single
> >> process tests are about 14% slower than the 5 process tests.  I am
> >> guessing that this is due to a lack of pipelining, probably caused by
> >> two things:
> >> - the application only submitting one read/write at a time
> >> - the kernel module itself serializing when it breaks reads/writes
> >> into buffer sized chunks
> >>
> >> The latter could be addressed by either pipelining the I/O through
> >> the bufmap interface (so that a single read or write could keep
> >> multiple buffers busy) or by going to a system like Murali came up > >> with for memory transfers a while back that isn't limited by buffer
> >> size.
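The pipelining gain described above can be sketched with a toy timing model: each buffer-sized chunk must be copied into a kernel buffer and then drained over the network, and with only one buffer in play those two phases serialize, while with two or more they overlap. The time units and model are illustrative assumptions, not measurements of PVFS2.

```python
def toy_transfer_time(n_chunks, copy_t, net_t, n_buffers):
    """Toy model of the kernel-buffer pipeline.

    copy_t: time to copy one chunk into a buffer
    net_t:  time to drain one buffer over the network
    With one buffer the phases serialize; with two or more, copies
    overlap drains and the slower phase dominates the steady state.
    """
    if n_buffers < 2:
        return n_chunks * (copy_t + net_t)
    # pipelined: the first copy fills the pipe, then one chunk
    # completes every max(copy_t, net_t) units
    return copy_t + n_chunks * max(copy_t, net_t)
```

For example, 10 chunks with copy_t=1 and net_t=3 take 40 units serialized but 31 units once double-buffered, which is the shape of the single-process vs. multi-process gap Phil describes.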
> >>
> >> It would also be nice to have a way to set these buffer settings
> >> without recompiling - either via module options or via
> >> pvfs2-client-core command line options.  For the time being we are
> >> going to hard-code our tree to run with the 32 MB buffers.  The
> >> 64 MB of RAM that this uses up (vs. 20 MB with the old settings)
> >> doesn't really matter for our standard node footprint.
> >>
> >> -Phil
> >> _______________________________________________
> >> Pvfs2-developers mailing list
> >> [email protected]
> >> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >>
> >
>

<kmod-bufsizes.patch>

