On Wed, Mar 30, 2011 at 03:07:13PM +0200, Walter Haidinger wrote:
> On 29.03.2011 22:42, Claudio Jeker wrote:
> > Here is a possible fix. The problem was that, because of the way NFS uses
> > the socket API, it did not turn off the send buffer scaling, which reset the
> > size of the socket buffer back to 17376 bytes. That is a no-go when NFS
> > generates a buffer of more than 17k. It is better to initialize sb_wat
> > in soreserve(), which is called by NFS and all attach functions.
> > 
> > Please test and report back.
> 
> Thanks for the patch. Glad to test it.
> 
> Well, the good news: No more lockups, neither in a VM nor on real hardware.
> 
> Everything is also pretty fine with larger buffers, but with small buffers
> (e.g. -o tcp,-r=512,-w=512), the system doesn't respond sometimes, like
> short freezes of a couple of seconds as if there is a pause while some
> buffers are emptied.

Buffers below 8k are stupid. For TCP just use 32k or even 64k. 512-byte
buffers are silly. They get rounded up internally, since the smallest
packet seems to be 512 bytes of data plus header. This will give you TCP
send and recv buffers of around 1200 bytes. No wonder it is slow as hell.
 
> This is also visible in the "runtime", i.e. time stats to put a 16 MiB file
> (dd if=/dev/urandom of=/nfs/foo bs=4096 count=4096), first column is the
> buffer size used for the nfs mount:
> 
>   512:    1m9.07s real     0m0.01s user     0m1.13s system
>  1024:   0m20.13s real     0m0.00s user     0m1.23s system
>  2048:    0m5.59s real     0m0.00s user     0m1.13s system
>  4096:    0m2.07s real     0m0.00s user     0m0.86s system
>  8192:    0m1.41s real     0m0.00s user     0m0.91s system
> 16384:    0m1.19s real     0m0.00s user     0m0.82s system
> 32768:    0m1.11s real     0m0.00s user     0m0.76s system
> 
> Writing a 64 MiB file:
> 
>   512:    6m2.83s real     0m0.03s user     0m5.78s system
>  1024:   2m58.62s real     0m0.03s user     0m5.45s system
>  2048:   1m12.66s real     0m0.07s user     0m4.66s system
>  4096:   0m27.60s real     0m0.05s user     0m4.47s system
>  8192:   0m11.68s real     0m0.01s user     0m3.85s system
> 16384:    0m6.50s real     0m0.00s user     0m3.64s system
> 32768:    0m6.15s real     0m0.00s user     0m3.22s system
> 
> ktrace dumps for all dd runs are available, I can put 
> them somewhere for download if required.
> 

The default block size is 8k. For smaller buffers the overhead of headers
and round-trip time is just too big.
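
To put rough numbers on that, here is a small sketch that turns the "real"
times of the 64 MiB runs quoted above into throughput and time per request.
The timings are copied from the quoted table. The one-write-request-per-buffer
model and the names (real_seconds, summarize) are assumptions made for
illustration, not something measured on the wire.

```python
# Wall-clock ("real") seconds for the 64 MiB dd runs, copied from the
# quoted table; keys are the NFS read/write buffer sizes in bytes.
FILE_BYTES = 64 * 1024 * 1024

real_seconds = {
    512: 6 * 60 + 2.83,
    1024: 2 * 60 + 58.62,
    2048: 1 * 60 + 12.66,
    4096: 27.60,
    8192: 11.68,
    16384: 6.50,
    32768: 6.15,
}


def summarize(file_bytes, timings):
    """Return (bufsize, requests, KiB/s, ms per request) per buffer size."""
    rows = []
    for bufsize, secs in sorted(timings.items()):
        # Assumption: every buffer-sized chunk costs one write request.
        requests = file_bytes // bufsize
        kib_per_s = file_bytes / 1024 / secs
        ms_per_request = secs / requests * 1000
        rows.append((bufsize, requests, kib_per_s, ms_per_request))
    return rows


rows = summarize(FILE_BYTES, real_seconds)
for bufsize, requests, kib_per_s, ms_per_request in rows:
    print(f"{bufsize:6d}: {requests:7d} requests "
          f"{kib_per_s:9.1f} KiB/s {ms_per_request:6.2f} ms/request")
```

If that assumption is roughly right, the time per request moves far less
between rows than the throughput does, which is what you would expect when
the per-request overhead dominates.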

-- 
:wq Claudio
