[Dai Ngo] wrote:
> Henrik Johansen wrote:
>> Henrik Johansen wrote:
>>> Piyush Shivam wrote:
>>>> On 08/05/09 15:53, Henrik Johansen wrote:
>>>>> Hi list,
>>>>>
>>>>> I have 2 servers which are directly connected via ixgbe-based NICs,
>>>>> both running OpenSolaris 2009.06.
>>>>>
>>>>> The actual network connection seems fine: iperf reports ~6.3
>>>>> Gbits/sec of throughput, and nicstat agrees that the NICs are ~63%
>>>>> utilized.
>>>>>
>>>>> Iperf:
>>>>> henrik@opensolaris:~# ./iperf-2.0.4/src/iperf -c 10.10.10.2 -N -t 40
>>>>> ------------------------------------------------------------
>>>>> Client connecting to 10.10.10.2, TCP port 5001
>>>>> TCP window size: 391 KByte (default)
>>>>> ------------------------------------------------------------
>
> Can you verify the TCP window size on both the client and the server
> system with these commands:
>
> # ndd -get /dev/tcp tcp_xmit_hiwat
>
> # ndd -get /dev/tcp tcp_recv_hiwat

Both client and server have both values set to 400000.
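For reference, these parameters can also be changed at runtime with
ndd -set, though such a change does not survive a reboot. An
illustrative invocation, using 400000 bytes purely as an example
target:

# ndd -set /dev/tcp tcp_xmit_hiwat 400000
# ndd -set /dev/tcp tcp_recv_hiwat 400000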
> -Dai

>>>>> [  3] local 10.10.10.3 port 56583 connected with 10.10.10.2 port 5001
>>>>> [ ID] Interval       Transfer     Bandwidth
>>>>> [  3]  0.0-40.0 sec  29.3 GBytes  6.29 Gbits/sec
>>>>>
>>>>> Nicstat:
>>>>> henrik@naz01:/tmpfs# /export/home/henrik/nicstat -i ixgbe0 2
>>>>>     Time     Int   rKB/s   wKB/s   rPk/s   wPk/s   rAvs  wAvs %Util     Sat
>>>>> 21:13:02  ixgbe0  776175  1222.1 96592.9 18961.7 8228.4 66.00  63.7 83018.3
>>>>> 21:13:04  ixgbe0  773081  1217.2 96221.2 18885.3 8227.2 66.00  63.4 82717.5
>>>>>
>>>>> To measure the NFS throughput over this link I have created a tmpfs
>>>>> filesystem on the server to avoid the synchronous write issue as
>>>>> much as possible.
>>>>>
>>>>> Client:
>>>>> henrik@opensolaris:~# mount | grep /nfs
>>>>> /nfs on 10.10.10.2:/tmpfs
>>>>> remote/read/write/setuid/devices/forcedirectio/xattr/dev=4dc0007
>>>>> on Wed Aug  5 20:06:25 2009
>>>>>
>>>>> Server:
>>>>> henrik@naz01:/tmpfs# share | grep tmpfs
>>>>> -    /tmpfs   sec=sys,root=10.10.10.3   ""
>>>>> henrik@naz01:/tmpfs# mount | grep tmpfs
>>>>> /tmpfs on swap read/write/setuid/devices/xattr/dev=4b80006 on
>>>>> Wed Aug  5 21:59:31 2009
>>>>>
>>>>> I have set the 'forcedirectio' option on the client mount to ensure
>>>>> that the client's cache gets circumvented.
>>>>>
>>>>> Using the randomwrite microbenchmark in filebench ($filesize set to
>>>>> 1gb) I get:
>>>>>
>>>>> Local on tmpfs:
>>>>> IO Summary: 5013937 ops, 82738.5 ops/s, (0/82738 r/w) 646.4mb/s,
>>>>> 71us cpu/op, 0.0ms latency
>>>>>
>>>>> Tmpfs over NFS:
>>>>> IO Summary: 383488 ops, 6328.2 ops/s, (0/6328 r/w) 49.4mb/s,
>>>>> 65us cpu/op, 0.2ms latency
>>>>>
>>>>> These are 2 fully populated 4-socket machines - why the extremely
>>>>> low transfer speed?
>>>>
>>>> randomwrite.f is a single-threaded workload (assuming you are using
>>>> the randomwrite.f filebench workload), which may not be sending
>>>> enough work for the server to begin with. If you drive the number
>>>> of threads in the workload higher (modify the nthreads variable in
>>>> randomwrite.f), you should see better numbers, unless there is some
>>>> other limit in the system. You can examine the CPU utilization of
>>>> the client (and the server) machine to make sure that the client is
>>>> busy sending work to the server.
>>>
>>> It indeed is the randomwrite.f workload.
>>>
>>> Now, using 256 threads I can actually push the numbers:
>>> IO Summary: 2429950 ops, 40099.1 ops/s, (0/40099 r/w) 313.2mb/s,
>>> 75us cpu/op, 5.9ms latency
>>>
>>> CPU utilisation on the client is about 25 percent - the server
>>> hovers around 50%.
>>>
>>> Sadly this is not what I wanted to do - I need to test and measure
>>> the maximum randomwrite / randomread throughput over very few NFS
>>> connections, since this will be the production workload for these
>>> machines.
>>>
>>> If I understand you correctly, then filebench is the culprit and is
>>> simply not pushing the server hard enough?
>>>
>>> Any ideas about how I can measure such a low-thread-count scenario?
>>
>> Well, I have now tested NFS throughput with everything I can think
>> of.
>>
>> I have tried cp, mv, dd and tar from or to a tmpfs filesystem over
>> NFS, and I can get nowhere near the speed of a local operation.
>>
>> Using NFSv3 does not make a difference either.
>>
>> All of my tests were repeated several times and they all show the
>> same: CPU utilization is very low, NIC utilization is very low, and
>> throughput over NFS is very low.
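A representative dd comparison of the server's local tmpfs against the
NFS mount would look like the following; the block size, count, and
file name here are illustrative, not the exact figures from my runs:

# locally, on the server's tmpfs
henrik@naz01:/tmpfs# dd if=/dev/zero of=/tmpfs/ddtest bs=1024k count=1024

# from the client, over the NFS mount
henrik@opensolaris:~# dd if=/dev/zero of=/nfs/ddtest bs=1024k count=1024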
>> An FTP upload gives me ~620 mb/s, which is about as fast as local
>> speed - the most I have been able to write via NFS is 170 mb/s.
>>
>> Playing around with different NFS-related tunables and mount options
>> has yielded nothing so far.
>>
>> I have opened a case with Sun support - let's hope that they can
>> shed some light on this.
>>
>>>> -Piyush
>>>
>>> --
>>> Med venlig hilsen / Best Regards
>>>
>>> Henrik Johansen
>>>
>>> _______________________________________________
>>> nfs-discuss mailing list
>>> nfs-discuss at opensolaris.org

--
Med venlig hilsen / Best Regards

Henrik Johansen