Hi Phil,

Thanks for running these tests. I think this buffer size will be dependent on the machine configuration, right? If we work out a simple formula for the buffer size based on, say, memory bandwidth (and/or latency) and network bandwidth (and/or latency), we could plug that in as a sane default. I did not realize that this setting would have such a noticeable effect on performance. I can work on a patch to change these settings at runtime.

thanks,
Murali
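To make the "simple formula" idea concrete, here is a hypothetical sketch (not from the PVFS2 tree; the function name, the headroom multiplier, and the 64 MB cap are all assumptions): size the buffer from the network bandwidth-delay product, round up to a power of two, and clamp between the current 4 MB default and a cap.

```c
#include <assert.h>

/* Hypothetical sizing heuristic -- not actual PVFS2 code. Derive a
 * default kernel buffer size from measured network bandwidth (MB/s)
 * and latency (us). The buffer is sized to a multiple of the
 * bandwidth-delay product, rounded up to a power of two, and clamped
 * between the current 4 MB default and a 64 MB cap. The headroom
 * factor of 1024 is a placeholder that would need tuning against
 * results like the ones above. */
static unsigned long long
default_buffer_size(unsigned long long net_bw_mbps,
                    unsigned long long net_lat_us)
{
    /* MB/s * us = bytes, since the 10^6 factors cancel */
    unsigned long long bdp_bytes = net_bw_mbps * net_lat_us;
    unsigned long long target = bdp_bytes * 1024;  /* placeholder headroom */
    unsigned long long size = 4ULL << 20;          /* 4 MB floor (default) */

    while (size < target && size < (64ULL << 20))  /* cap at 64 MB */
        size <<= 1;
    return size;
}
```

For a gigabit-class link (~112 MB/s) at ~100 us latency this lands in the tens-of-megabytes range, in the same ballpark as the 32 MB setting that worked well in the tests, but the constants are guesses and would need to be validated.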
>> - single client
>> - 16 servers
>> - gigabit ethernet
>> - read/write tests, with 40 GB files
>> - using reads and writes of 100 MB each in size
>> - varying number of processes running concurrently on the client
>>
>> This test application can be configured to be run with multiple
>> processes and/or multiple client nodes. In this case we kept
>> everything on a single client to focus on bottlenecks on that side.
>>
>> What we were looking at was the kernel buffer settings controlled in
>> pint-dev-shared.h. By default PVFS2 uses 5 buffers of 4 MB each.
>> After experimenting for a while, we made a few observations:
>>
>> - increasing the buffer size helped performance
>> - using only 2 buffers (rather than 5) was sufficient to saturate the
>>   client when we were running multiple processes; adding more made
>>   only a marginal difference
>>
>> We found good results using 2 32MB buffers. Here are some
>> comparisons between the standard settings and the 2 x 32MB
>> configuration:
>>
>> results for RHEL4 (2.6 kernel):
>> -------------------------------
>> 5 x 4MB, 1 process:    83.6 MB/s
>> 2 x 32MB, 1 process:   95.5 MB/s
>>
>> 5 x 4MB, 5 processes:  107.4 MB/s
>> 2 x 32MB, 5 processes: 111.2 MB/s
>>
>> results for RHEL3 (2.4 kernel):
>> -------------------------------
>> 5 x 4MB, 1 process:    80.5 MB/s
>> 2 x 32MB, 1 process:   90.7 MB/s
>>
>> 5 x 4MB, 5 processes:  91 MB/s
>> 2 x 32MB, 5 processes: 103.5 MB/s
>>
>> A few comments based on those numbers:
>>
>> - on 3 out of 4 tests, we saw a 13-15% performance improvement by
>>   going to 2 32 MB buffers
>> - the remaining test (5 process RHEL4) probably did not see as much
>>   improvement because we maxed out the network. In the past, netpipe
>>   has shown that we can get around 112 MB/s out of these nodes.
>> - the RHEL3 nodes are on a different switch, so it is hard to say how
>>   much of the difference from RHEL3 to RHEL4 is due to network
>>   topology and how much is due to the kernel version
>>
>> It is also worth noting that even with this tuning, the single
>> process tests are about 14% slower than the 5 process tests. I am
>> guessing that this is due to a lack of pipelining, probably caused by
>> two things:
>> - the application only submitting one read/write at a time
>> - the kernel module itself serializing when it breaks reads/writes
>>   into buffer sized chunks
>>
>> The latter could be addressed by either pipelining the I/O through
>> the bufmap interface (so that a single read or write could keep
>> multiple buffers busy) or by going to a system like Murali came up
>> with for memory transfers a while back that isn't limited by buffer
>> size.
>>
>> It would also be nice to have a way to set these buffer settings
>> without recompiling - either via module options or via
>> pvfs2-client-core command line options. For the time being we are
>> going to hard code our tree to run with the 32 MB buffers. The 64 MB
>> of RAM that this uses up (vs. 20 MB with the old settings) doesn't
>> really matter for our standard node footprint.
>>
>> -Phil
>> _______________________________________________
>> Pvfs2-developers mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
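On the "module options" suggestion: a minimal sketch of what the runtime-settable buffers could look like using Linux module parameters. This is illustrative only; the variable names here are made up and are not the actual identifiers from pint-dev-shared.h, and the real patch would also need to size the bufmap allocation from these values.

```c
/* Illustrative kernel-module fragment (not the actual PVFS2 code; the
 * parameter names are hypothetical). With module_param() the buffer
 * count and size become settable at load time, e.g.
 *   insmod pvfs2.ko bufmap_desc_count=2 bufmap_desc_size=33554432
 * instead of recompiling pint-dev-shared.h. */
#include <linux/module.h>
#include <linux/moduleparam.h>

static unsigned int bufmap_desc_count = 5;               /* current default */
static unsigned int bufmap_desc_size  = 4 * 1024 * 1024; /* 4 MB each */

module_param(bufmap_desc_count, uint, 0444);
MODULE_PARM_DESC(bufmap_desc_count, "number of kernel I/O buffers");
module_param(bufmap_desc_size, uint, 0444);
MODULE_PARM_DESC(bufmap_desc_size, "size in bytes of each kernel I/O buffer");
```

The 0444 permissions make the current values readable under /sys/module but not changeable after load, which matches buffers that are allocated once at startup; covering the pvfs2-client-core side would be a separate command line option.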
