On Feb 18, 2008, at 4:39 AM, Font Bella wrote:
> I tried TCP and async options, but I get poor performance in my
> benchmarks (a dbench run with 10 clients). Below I have tabulated the
> outcome of my tests, which show that in my setting there is a huge
> difference between sync and async, and between UDP and TCP. Any
> comments/suggestions are warmly welcome.
>
> I also tried setting 128 server threads as Chuck suggested, but this
> doesn't seem to affect performance. This makes sense, since we only
> have a dozen clients.

Each Linux client mount point can generate up to 16 server requests by default. A dozen clients each with a single mount point can generate 192 concurrent requests. So 128 server threads is not as outlandish as you might think.

In this case, you are likely hitting some other bottleneck before the clients can utilize all the server threads.
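
For reference, raising and checking the thread count on a typical Linux NFS server looks something like this (the exact config file that makes the setting persistent varies by distribution, e.g. RPCNFSDCOUNT in /etc/sysconfig/nfs on Red Hat-style systems):

  # raise the count on the running server
  rpc.nfsd 128

  # the "th" line reports the thread count; the second number counts how
  # many times all threads were busy at once (if it keeps climbing, add threads)
  cat /proc/net/rpc/nfsd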

> About sync/async: I am not very concerned about corrupt data if the
> cluster goes down. We do mostly computing, no crucial database
> transactions or anything like that. Our users wouldn't mind some
> degree of data corruption in case of a power failure, but speed is
> crucial.

The data corruption is silent. If it weren't, you could simply restore from a backup as soon as you recover from a server crash. Silent corruption spreads into your backed up data, and starts causing strange application errors, sometimes a long time after the corruption first occurred.
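
Just so we're talking about the same knob: sync/async is a per-export option in /etc/exports, and "sync" makes the server commit data to stable storage before telling the client a write (or commit) has completed, while "async" lets it reply first. A sketch, with made-up paths and a made-up client subnet:

  /export/data     192.168.0.0/24(rw,sync,no_subtree_check)     # safe default
  /export/scratch  192.168.0.0/24(rw,async,no_subtree_check)    # faster, data at risk on a crash

  # re-export after editing
  exportfs -ra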

> Our network setup is just a dozen servers connected to a switch.
> Everything (adapters/cables/switch) is 1 gigabit. We use ethernet
> bonding to double the network bandwidth.
>
> Here are the test results. I didn't measure SYNC+UDP, since SYNC+TCP
> already gives me very poor performance. Admittedly, my test is very
> simple, and I should probably try something more complete, like
> IOzone. But the dbench run seems to reproduce the bottleneck we've
> been observing in our cluster.

I assume the dbench test is read and write only (little or no metadata activity like file creation and deletion). How closely does dbench reflect your production workload?

I see from your initial e-mail that your server file system is:

> SAS 10k disks.
>
> Filesystem: ext3 over LVM.

Have you tried testing over NFS with a file system that resides on a single physical disk? If you have done a read-only test versus a write-only test, how do the numbers compare? Have you tested a range of write sizes, from small file writes up to streaming writes of files larger than the server's memory?
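
A crude but useful way to get separate streaming write and read numbers is dd against the mount; the mount point and file size below are placeholders, and the file should exceed the server's RAM so the read-back can't be served from its page cache:

  # streaming write (16 GB here; size it to exceed server memory)
  dd if=/dev/zero of=/mnt/nfs/bigfile bs=1M count=16384

  # unmount and remount the client first so the read isn't satisfied
  # from the client's own cache, then:
  dd if=/mnt/nfs/bigfile of=/dev/null bs=1M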

> ********************** ASYNC option in server ******************************
>
> rsize/wsize (bytes)    TCP (MB/s)    UDP (MB/s)
>
> 1024                   24            34
> 2048                   35            49
> 4096                   37            75
> 8192                   40.4          35
> 16386                  40.2          19

As the size of the read and write requests increases, your UDP throughput drops markedly. Each NFS-over-UDP request larger than the ethernet MTU is carried as several IP fragments, and losing any one fragment forces the whole RPC to be retransmitted, so this pattern usually indicates packet loss. TCP recovers from loss far more gracefully, so it is going to provide consistent performance and much lower risk to data integrity as your network and client workloads increase.

You might try this test again and watch your clients' ethernet bandwidth and RPC retransmit rate to see what I mean. At the 16386 setting, the UDP test may be pumping significantly more packets onto the network, but is getting only about 20MB/s through. This will certainly have some effect on other traffic on the network.
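
On the clients, something as simple as this during a run will show it (interface names and exact output format vary):

  nfsstat -c         # "retrans" in the Client rpc stats should stay near zero
  cat /proc/net/dev  # per-interface byte, packet, and drop counters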

The first thing I check in these instances is that gigabit ethernet flow control is enabled in both directions on all interfaces (both host and switch).
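
On Linux you can check and set pause frames with ethtool; note that with bonding you run this against the physical slave interfaces (eth0 here is just an example), and the switch ports have to agree:

  ethtool -a eth0                  # show current pause (flow control) parameters
  ethtool -A eth0 rx on tx on      # enable flow control in both directions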

In addition, using larger r/wsize settings on your clients means the server can perform disk reads and writes more efficiently, which will help your server scale with increasing client workloads.
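
On the clients that usually just means bumping rsize/wsize in the mount options, for example (server name and export path are placeholders, and the server will cap the values it actually grants):

  mount -t nfs -o rsize=32768,wsize=32768,tcp,hard,intr server:/export /mnt/nfs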

By examining your current network carefully, you might be able to boost the performance of NFS over both UDP and TCP. With bonded gigabit, you should be able to push network throughput past 200 MB/s using a test like iperf, which doesn't touch the disks at all. Thus, at least NFS reads from files already in the server's page cache ought to fly in this configuration.
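
If you want to check the raw network path, something like iperf between a client and the server works (30-second run, four parallel streams; depending on your bonding mode a single TCP stream may be limited to one physical link, which is why the parallel streams matter):

  iperf -s                           # on the server
  iperf -c <servername> -t 30 -P 4   # on a client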

> ********************** SYNC option in server ******************************
>
> rsize/wsize (bytes)    TCP (MB/s)    UDP (MB/s)
>
> 1024                   6             not measured
> 2048                   7.44          not measured
> 4096                   7.33          not measured
> 8192                   7             not measured
> 16386                  7             not measured

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com