http://wwww.davygoat.com/software/meter/CPU_Stats.html

CPU Stats

BSD - vmstat

FreeBSD, OpenBSD and NetBSD give CPU statistics in the last three columns of the vmstat(8) command's output; the types of CPU activity given are user (us), system (sy) and idle (id). The -w wait option is used to output the figures every wait seconds.

The following example illustrates the use of vmstat; the last three columns give the percentage breakdown of CPU time :

$ vmstat -w 1
 procs      memory     page                    disks     faults      cpu
 r b w     avm   fre  flt  re  pi  po  fr  sr ad0 md0   in   sy  cs us sy id
 0 0 0   45944 69824   53   0   0   0  32   0   0   0  251 2204 371  4  2 94
 1 0 0   45944 69820   34   0   0   0  12   0   0   0  231  519 166  5  0 95
 0 0 0   45948 69816   31   0   0   0  12   0   0   0  230  517 180  4  1 95
 1 0 0   45948 69816   30   0   0   0  12   0   0   0  235  528 179  4  0 96
^C
$

Graphing the data is a fairly simple matter of extracting the last three columns and displaying the user time in blue and the system time in red; the remaining idle time is left blank. The three percentages will not necessarily add up to 100, but the difference is small (presumably due to interrupt time, as nice time is included in the user time).
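The extraction step can be sketched in a few lines of Python. This is only an illustration, not the meter's actual code: the function name parse_vmstat_cpu is made up for this example, and it assumes the us, sy and id figures are always the last three whitespace-separated fields, as in the output shown above.

```python
def parse_vmstat_cpu(line):
    """Return (user, system, idle) percentages from one vmstat data line.

    Returns None for header lines or anything else whose last three
    fields are not integers. Assumes us/sy/id are the last three columns.
    """
    fields = line.split()
    if len(fields) < 3:
        return None
    try:
        user, system, idle = (int(f) for f in fields[-3:])
    except ValueError:
        # Header lines end in words such as "us sy id", not numbers.
        return None
    return user, system, idle
```

Feeding it each line of vmstat -w 1 output in turn would yield one (user, system, idle) tuple per sample, with the two header lines skipped.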

The iostat(8) command gives the same information, along with the additional nice (ni) and interrupt (in) times, in its last five columns. While this would be more complete, vmstat has been chosen in preference: it is already running for the memory stats, and leaving CPU and TTY stats off iostat creates extra room for a fourth device.

Linux - /proc/stat

Linux provides cumulative counts for the number of jiffies spent in user, nice, system, and idle CPU mode since the system was booted, or since the jiffy count last wrapped round. A jiffy is a proverbial short amount of time, which is 1/100 second on most CPUs (1/1024 on Alphas).

The four jiffy counts are given in the /proc/stat file, on a line beginning with the word cpu. On multiprocessor systems, the counts are given for each cpu, each on a line beginning with the word cpun, where n is the zero-based CPU number, and the line starting with cpu contains the sums for all processors. The following example shows the results on a single-processor system :

$ grep '^cpu' /proc/stat
cpu  4525 5 2810 139863
$

The duration of a jiffy, and the cumulative number of jiffies, are not much use to us; what is important is the number of jiffies spent in each mode since the last time we had a look. We calculate the differences for each of the modes, and express each difference as a percentage of the total difference. Graphing the data is then again a fairly simple matter of displaying user time in blue, system time in red, and nice time in yellow, the remaining idle time being left blank.
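The difference calculation might look something like the Python sketch below. The function names parse_cpu_line and cpu_percentages are invented for this illustration, and the sketch makes two simplifying assumptions: it reads only the first four counts on the cpu line (ignoring any further fields a kernel may append), and it does not try to correct for the jiffy count wrapping round.

```python
def parse_cpu_line(line):
    """Return (user, nice, system, idle) jiffy counts from the 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line")
    # Assumption: only the first four counts are used; extras are ignored.
    return tuple(int(f) for f in fields[1:5])


def cpu_percentages(previous, current):
    """Express the change in each mode as a percentage of the total change."""
    deltas = [c - p for p, c in zip(previous, current)]
    total = sum(deltas)
    if total <= 0:
        # No time has passed (or the counters wrapped); report nothing.
        return (0.0, 0.0, 0.0, 0.0)
    return tuple(100.0 * d / total for d in deltas)
```

For example, taking the sample line above as the previous reading and a hypothetical later reading of cpu 4535 5 2815 139948, the differences are 10, 0, 5 and 85 jiffies, giving 10% user, 0% nice, 5% system and 85% idle.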

What to conclude from the CPU meter

The CPU meter cannot be any more accurate than the figures given by the underlying vmstat command or /proc/stat file, and no guarantees are made for the accuracy of the graphing, given that rounding errors are likely to occur.

Even so, the CPU meter does give a good indication of what your machine is up to, and how busy it is while it's running programs. Here are some observations I've made in my work environment with six people VNC'ing on a Linux box :

  1. User CPU (blue) goes way up during number crunching, or if a process is in a tight loop.

  2. A larger proportion of system time (red) is seen when invoking lots of system calls, or using memory-mapped files, for example when grepping through all files in the system.

  3. Massive MySQL database imports are often characterised by conspicuous niceness (yellow), indicating that the MySQL daemon is running at a low priority, which keeps the system nice and responsive as a whole.

  4. If the user and system CPU go up for longer than a fraction of a second because of database queries, chances are you're not using indexes properly, or your joins are too complicated.

  5. If large database jobs are taking a long time, but CPU usage is low, then you're probably being held up by an I/O bottleneck. If the CPU doesn't seem to be doing anything at all, you might have lock contention of some sort.

  6. Grabbing a window by its corner, and stretching it repeatedly like an elastic band, is a really good way of hammering the processor. Executing a two million line .csv file really b*gg*rs the system...


