Hi Phil,
Thanks for running these tests.
I think this buffer size will be dependent on the machine configuration, right?
If we work out a simple formula for the buffer size based on, say,
memory bandwidth (and/or latency) and network bandwidth (and/or
latency), we could plug that in as a sane default.
I did not realize that this setting would have such a noticeable effect
on performance.
I can work on a patch to change these settings at runtime.
thanks,
Murali

>> - single client
>> - 16 servers
>> - gigabit ethernet
>> - read/write tests, with 40 GB files
>> - using reads and writes of 100 MB each in size
>> - varying number of processes running concurrently on the client
>>
>> This test application can be configured to be run with multiple
>> processes and/or multiple client nodes.  In this case we kept
>> everything on a single client to focus on bottlenecks on that side.
>>
>> What we were looking at was the kernel buffer settings controlled in
>> pint-dev-shared.h.  By default PVFS2 uses 5 buffers of 4 MB each.
>> After experimenting for a while, we made a few observations:
>>
>> - increasing the buffer size helped performance
>> - using only 2 buffers (rather than 5) was sufficient to saturate the
>> client when we were running multiple processes; adding more made only
>> a marginal difference
>>
>> We found good results using two 32 MB buffers.  Here are some
>> comparisons between the standard settings and the 2 x 32 MB
>> configuration:
>>
>> results for RHEL4 (2.6 kernel):
>> ------------------------------
>> 5 x 4MB, 1 process: 83.6 MB/s
>> 2 x 32MB, 1 process: 95.5 MB/s
>>
>> 5 x 4MB, 5 processes: 107.4 MB/s
>> 2 x 32MB, 5 processes: 111.2 MB/s
>>
>> results for RHEL3 (2.4 kernel):
>> -------------------------------
>> 5 x 4MB, 1 process: 80.5 MB/s
>> 2 x 32MB, 1 process: 90.7 MB/s
>>
>> 5 x 4MB, 5 processes: 91 MB/s
>> 2 x 32MB, 5 processes: 103.5 MB/s
>>
>>
>> A few comments based on those numbers:
>>
>> - on 3 out of 4 tests, we saw a 13-15% performance improvement by
>> going to two 32 MB buffers
>> - the remaining test (5 process RHEL4) probably did not see as much
>> improvement because we maxed out the network.  In the past, netpipe
>> has shown that we can get around 112 MB/s out of these nodes.
>> - the RHEL3 nodes are on a different switch, so it is hard to say how
>> much of the difference from RHEL3 to RHEL4 is due to network topology
>> and how much is due to the kernel version
>>
>> It is also worth noting that even with this tuning, the single
>> process tests are about 14% slower than the 5 process tests.  I am
>> guessing that this is due to a lack of pipelining, probably caused by
>> two things:
>> - the application only submitting one read/write at a time
>> - the kernel module itself serializing when it breaks reads/writes
>> into buffer-sized chunks
>>
>> The latter could be addressed by either pipelining the I/O through
>> the bufmap interface (so that a single read or write could keep
>> multiple buffers busy) or by going to a system like Murali came up
>> with for memory transfers a while back that isn't limited by buffer
>> size.
>>
>> It would also be nice to have a way to set these buffer settings
>> without recompiling, either via module options or via pvfs2-client-core
>> command line options.  For the time being we are going to hard code
>> our tree to run with the 32 MB buffers.  The 64 MB of RAM that this
>> uses up (vs. 20 MB with the old settings) doesn't really matter for
>> our standard node footprint.
>>
>> -Phil
>> _______________________________________________
>> Pvfs2-developers mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
>>
>
