Richard "Doc" Kinne wrote:
> 1) I can't seem to copy anything with regard to this machine without
> its load average going through the roof. An scp, even a cp will drive
> the computer's load average to between 12 and 18. Copying a large
> file, or doing a mysqlhotcopy, will make the load average slowly
> climb, with some spikes up to that level. I can't think that's right.
> Not for something with 4 processors.
I saw similar things on a RHEL 5.2 system I used as an NFS file server. 
My fix was to increase the number of available NFS threads.  The default of 8
threads wasn't cutting it for a file server -- the threads would spend
all of their time in I/O wait, thus starving other NFS clients when the
disk RAID controller could have fulfilled requests through the magic of
NCQ and caching.

Check your nfsstat output to see whether you're spending a lot of time
with all threads fully occupied -- e.g. every thread waiting on I/O.  If
so, increase the thread count.
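For reference, here's roughly how to check and raise the count on a
RHEL 5-era box (a sketch -- 128 is just what worked for my load, tune
for yours):

```shell
# The "th" line in /proc/net/rpc/nfsd is a histogram of thread busyness;
# big numbers in the last buckets mean all threads were saturated.
grep ^th /proc/net/rpc/nfsd

# Raise the running server to 128 threads (takes effect immediately):
rpc.nfsd 128

# Make it persistent across reboots, RHEL-style, in /etc/sysconfig/nfs:
#   RPCNFSDCOUNT=128
```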

In my case, my server was doing nothing else other than NFS service so I
increased it to 128 available threads and things worked well until we
started hitting bugs in the LVM2 layer (note:  I've not hit those
problems on other distros, but I've not pushed other distros quite as
hard).  "Load" (i.e. the number of active processes) went up a huge
amount, but CPU consumption stayed quite reasonable.

Note that my failure mode, other than LVM2 occasionally hanging an
entire block device until reboot, was that performance suddenly "fell
off a cliff" -- i.e. a small increment in load caused a huge performance
problem, creating a nasty feedback loop.  However, said "cliff" was at
least 1200 IOPS.

We were much more on the "more IOPS" than the "more throughput" side. 
If you're actually pushing high throughput (as opposed to operations
per second) you could also look at NFS block size and, ultimately,
network jumbo frames.
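For the throughput case, the knobs look something like this (the
rsize/wsize values and device/export names are just examples, not
recommendations -- measure before and after):

```shell
# Larger NFS transfer sizes on the client mount:
mount -o rsize=32768,wsize=32768 server:/export /mnt/export

# Jumbo frames on the client NIC -- the server NIC and every switch in
# between must also be configured for 9000-byte frames:
ifconfig eth0 mtu 9000
```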

Another thing to consider is that RAID5 performance is poor when write
sizes are below the RAID stripe size.  For large block writes RAID5
performance is N/(N-1), whereas for small block writes it's something
like N/3 (NOTE:  This is performance relative to a RAID0 of the same
disks.  2nd NOTE:  I may be misremembering the exact equations, but
they're in the right ballpark:  large-block performance loses roughly
one disk's worth; small writes lose a fraction determined by the RAID
algorithm.)  I saw my 1200 IOPS *after* I moved to RAID 1/0 on 6x750GB
SATA and 6x1TB SAS.  Moving from RAID5 to RAID 1/0 was a very
noticeable performance increase.
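To make the ballpark concrete: the usual small-write argument is that a
RAID5 read-modify-write costs about four physical I/Os per logical
write (read old data, read old parity, write data, write parity).  A
back-of-envelope sketch, where the disk count and per-disk IOPS are
made-up round numbers:

```shell
#!/bin/sh
# RAID5 small-write penalty, relative to RAID0 of the same disks.
N=8                                      # disks in the array (assumed)
per_disk=100                             # small-write IOPS per disk (assumed)
raid0=$((N * per_disk))                  # RAID0: one physical op per logical write
raid5=$((N * per_disk / 4))              # RAID5 RMW: ~4 physical ops per logical write
echo "RAID0: $raid0 IOPS, RAID5 small writes: $raid5 IOPS"
```

So even generously, small random writes on RAID5 land at roughly a
quarter of the RAID0 figure, which lines up with the "N/3-ish" ballpark
above.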

--
Dewey
