Look below: load over 7 and no processes take much CPU.

Yuri

7.2-PRERELEASE, 32-bit on i7-920.

------------------------------------------------------------
last pid: 93192;  load averages:  7.68,  6.27,  4.61    up 2+03:11:29  20:25:24
204 processes: 9 running, 193 sleeping, 1 stopped, 1 zombie
CPU:  5.3% user,  0.0% nice,  0.0% system,  0.0% interrupt, 94.7% idle
Mem: 867M Active, 1684M Inact, 279M Wired, 65M Cache, 112M Buf, 92M Free
Swap: 16G Total, 142M Used, 16G Free

  PID USERNAME   THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
60032 yuri         1  46    0   285M   183M select 0  41:15  0.59% Xorg
60400 yuri         1   4    0 12576K  9144K kqread 4  29:44  0.00% wineserver
92982 yuri         1  44    0 53012K 16800K CPU3   3  18:50  0.00% kdeinit4
92986 yuri         1  44    0 53012K 16800K CPU7   7  18:48  0.00% kdeinit4
92988 yuri         1 107    0 53012K 16840K CPU6   6  17:22  0.00% kdeinit4
60104 yuri         1  44    0   132M 45860K select 0  16:58  0.00% kwin
92984 yuri         1 117    0 53012K 16800K RUN    5  14:56  0.00% kdeinit4
60096 yuri         1  44    0 89732K 30040K select 4  10:10  0.00% kded4
93141 yuri         1  53    0 53012K 16800K CPU5   5   3:52  0.00% kdeinit4
93139 yuri         1  44    0 53012K 16800K CPU1   1   3:30  0.00% kdeinit4
60174 yuri         1  44    0  3168K  1400K select 0   1:28  0.00% ksysguardd
  450 root         1   4    0  3128K   800K select 4   0:44  0.00% dhclient
 1131 messagebus   1   4    0  3344K  1384K select 4   0:40  0.00% dbus-daemon
Sure. This is not an uncommon occurrence, really. The load average is the number of processes in the queue for a CPU time slice, averaged over 1, 5 and 15 minutes. On FreeBSD the LA is not scaled by the number of cores, so on a machine with N logical CPUs a LA of N means all cores have active processes pretty much continually.

Now, you might think that an active process will take the CPU utilisation to 100%, but that is not necessarily so. Some numerical applications can do that, but purely CPU bound processes are relatively uncommon in everyday usage. In actuality what happens is that the processor will need to retrieve data from somewhere to operate on. There's a hierarchy of data stores of various speeds (latency, rather than bandwidth), fastest first:

    L1 Cache > L2 Cache > L3 Cache > Main RAM > Disk > Network

The L1 Cache is accessible in a few clock ticks (nanoseconds), Main RAM takes a fraction of a microsecond to access, disk can take milliseconds to access, and Network can take 10s to 1000s of milliseconds. In other words, about 9 orders of magnitude difference.

So when the data you need to process is too big to fit in the fastest caches, or when it comes from a particularly slow location, or when you have a lot of active processes causing context switches, then the CPU core will be making frequent memory and IO requests and spending time waiting for them to be fulfilled.
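
To put rough numbers on the latency hierarchy above, here is a small Python sketch. The latency figures are assumed round numbers chosen for illustration, not measurements from the machine in this thread, and the function names are made up for the example. The 2.66 GHz clock is the nominal speed of the i7-920.

```python
import math

# Rough, assumed access latencies in nanoseconds. These are illustrative
# round numbers only, not measurements from the machine in this thread.
LATENCY_NS = {
    "L1 cache": 1,
    "main RAM": 100,
    "disk": 10_000_000,        # 10 ms
    "network": 1_000_000_000,  # 1 s, the slow end of the range
}


def orders_of_magnitude(fast_ns, slow_ns):
    """Return log10 of the ratio between a slow and a fast latency."""
    return math.log10(slow_ns / fast_ns)


def cycles_waiting(latency_ns, clock_ghz):
    """Clock cycles that elapse during one access at the given clock rate."""
    # 1 GHz is one cycle per nanosecond, so cycles = ns * GHz.
    return latency_ns * clock_ghz


if __name__ == "__main__":
    base = LATENCY_NS["L1 cache"]
    for name, ns in LATENCY_NS.items():
        exponent = orders_of_magnitude(base, ns)
        print(f"{name:10s} {ns:>13,d} ns  ~10^{exponent:.0f} x L1")
    # Cycles a 2.66 GHz core sits idle during one assumed 100 ns RAM access.
    print(cycles_waiting(LATENCY_NS["main RAM"], 2.66))
```

With these assumed figures the gap between L1 and the slow end of the network range comes out at 9 orders of magnitude, and a single trip to main RAM costs a few hundred clock cycles.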

Now, for sources like disks and network, where the retrieval is much slower than the typical timescale of events on the CPU, the process will yield the CPU to something else and only get a new timeslice once the IO request has been fulfilled. For an access to main RAM, however, that form of yielding does not normally happen. Consequently the CPU can end up waiting for 100s of clock cycles until it gets some bytes to process. In the meantime, other processes are also sitting in the queue wanting CPU time slices -- hence the high LA with low CPU utilization.

Scheduling CPU timeslices to make maximum use of available resources is the difference between a really performant OS and a disaster. A good scheduler is the critical central piece of code around which the rest of an OS can be constructed. Combine that with the complexity of having multiple cores, where threads of execution sometimes have to be moved to different cores and on other occasions need to stick to the same core in order to make best use of resources, and you will start to appreciate quite how hard it is to write a good scheduler.

Unsurprisingly, the design of such things is a matter of fairly impassioned debate amongst the rarefied circle of people capable of writing them. That sort of argument was the genesis of the FreeBSD / DragonFly BSD fork a few years back. You can rest assured though that FreeBSD certainly does have one of the very best schedulers currently available, and it is specifically targeted at getting the best out of the sort of multicore CPUs available nowadays.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard, Flat 3, Ramsgate, Kent, CT11 9PW
PGP: http://www.infracaninophile.co.uk/pgpkey