On Tue, September 30, 2008 16:43, Andrew Jorgensen wrote:
> Okay folks, I'm going out on a limb and admitting to some ignorance
> here. Suppose I have a high load average on a server, let's say 20,
> how do I tell what's really going on? I understand that load means
> that there are processes waiting for some resource but how do I see
> what resources they are waiting for? We don't want to go buy more RAM
> and then find out that we had plenty of RAM, for instance.

First, understand how loadavg is calculated:

"The load average numbers give the number of jobs in the run queue
(state R) or waiting for disk I/O (state D) averaged over 1, 5, and 15
minutes."

In a nutshell, a high load average usually comes down to CPU or IO
contention.

First, open up 'top' and look for a large number of jobs eating up CPU.
That can indicate the issue. If there aren't 20+ jobs all trying to eat
up the CPU (as is often the case), then you are probably having IO
issues.

Keep in mind that IO issues can be caused by LOTS of things. Here is
what to ask:

1) Do I have any network attached storage? If so, are there a lot of
   processes trying to access these shares?
2) Is my disk subsystem behaving correctly (use 'iostat -x 10' to find
   out)?
3) Do I have enough IOPS in my disk subsystem to satisfy demand? A good
   way to test this is using iostat, and also looking at the iowait CPU
   value.

Depending on your answers, the solutions vary.

-Ryan
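
One way to see how a load of 20 splits between the two states in the loadavg definition is to tally task states straight from /proc. What follows is only a minimal sketch, assuming a Linux /proc filesystem. The function names (parse_state, count_states, read_stat_lines) are made up for illustration and do not come from top, iostat, or any other tool.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the LAST ')' before reading the state.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(stat_lines):
    """Tally how many tasks are in each state (R, D, S, ...)."""
    return Counter(parse_state(line) for line in stat_lines)


def read_stat_lines():
    """Read the stat file of every thread currently visible in /proc."""
    lines = []
    # Assumption: Linux layout, one stat file per thread under task/.
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            continue  # the task exited between the glob and the open
    return lines


if __name__ == "__main__":
    counts = count_states(read_stat_lines())
    print("runnable (R):            ", counts["R"])
    print("uninterruptible I/O (D): ", counts["D"])
```

This is a single snapshot, while the load average is smoothed over minutes, so it is worth running a few times. If R dominates, look at CPU. If D dominates, move on to the IO questions above.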
