On Thursday, 9 January 2003 17:02, Brian Tinsley wrote:
>> I've been seeing the exact same thing on the same type of system in the
>> same situations. This has been causing all kinds of problems on our
>> clusters: the system live-locks for a minute or two, cluster heartbeats
>> go unreceived, and the cluster falsely fails over once the system
>> recovers from the live-lock. The only thing I can find after the
>> live-lock is that the runtime for kswapd is abnormally high.
>> We started running sar (60 second collection interval) and were able to
>> capture some stats during this live-lock period. I've snipped some I
>> believe may be of interest. Note the missing stats between 03:59:43 and
>> 04:02:03.
>> Oh BTW, this is on a stock 2.4.20 kernel (dual P3, 4GB), but I have seen
>> the same behavior on 2.4.19 and 2.4.17.

On Thu, Jan 09, 2003 at 05:42:51PM +0100, Dieter Nützel wrote:
> I think you should have cc'ed Andrea Arcangeli <[EMAIL PROTECTED]> and
> LKML, and tried 2.4.20-aa1. Are you sure it is a ReiserFS problem and
> not a kernel one?

There simply aren't enough scenarios for this to be a mystery. Both
-aa and 2.5.x should already have something in there for it: the
memclass-related buffer_head stuff in -aa, and in 2.5.x bh-stripping
plus "bh-less" operation (for ext2), with fewer (if any) bh's kept
outside of actual dirty data.

Bloat monitoring scripts attached, which might provide somewhat more
useful output to capture, though they certainly don't eliminate the
need for /proc/meminfo logging. I'll also see if some of the accounting
patches can be backported and send those to Marcelo and Andrea.


Bill
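#!/usr/bin/awk -f
# bloatmon: per-cache memory summary of /proc/slabinfo input.
# Fields used: $1 = cache name, $2 = active objects,
# $3 = allocated objects, $4 = object size in bytes.
BEGIN {
        printf "%18s    %8s %8s %8s\n", "cache", "active", "alloc", "%util";
}

{
        # "+ 0" forces a numeric test, so a non-numeric header line
        # can never reach the division below.
        if ($3 + 0 != 0) {
                pct  = 100.0 * $2 / $3;
                frac = int(10000.0 * $2 / $3) % 100;
        } else {
                pct  = 100.0;
                frac = 0;
        }
        active = ($2 * $4)/1024;
        alloc  = ($3 * $4)/1024;
        # Less than 1KB of slack: report the cache as fully utilized.
        if ((alloc - active) < 1.0) {
                pct  = 100.0;
                frac = 0;
        }
        # %02d zero-pads the hundredths, so 12.05 is not shown as 12.5.
        printf "%18s: %8dKB %8dKB  %3d.%02d\n", $1, active, alloc, pct, frac;
}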
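
To make the %util arithmetic concrete, here is a small sketch that
applies the same active/alloc/utilization formulas to a single line.
It is a simplified illustration, not the bloatmon script itself: the
helper name slab_util is hypothetical, the sample line is made up
rather than captured from a real system, and the under-1KB slack
adjustment is left out.

```shell
#!/bin/sh
# Hypothetical helper: compute KB in use, KB allocated and utilization
# for one slabinfo-style line "name active_objs total_objs objsize".
slab_util() {
        echo "$1" | awk '{
                active = ($2 * $4) / 1024      # KB held by live objects
                alloc  = ($3 * $4) / 1024      # KB held by all allocated objects
                util   = ($3 + 0 != 0) ? 100.0 * $2 / $3 : 100.0
                printf "%s active=%dKB alloc=%dKB util=%.2f%%\n", $1, active, alloc, util
        }'
}

# Made-up sample: 150000 of 600000 objects in use, 96 bytes each.
slab_util "buffer_head 150000 600000 96"
```

With these numbers only a quarter of the allocated objects are live,
which is the kind of low-utilization cache the monitoring loops are
meant to surface.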
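#!/bin/sh
# Every 5 seconds, print the 22 lines of bloatmon output with the
# lowest %util (column 4). The slabinfo version header is dropped
# before the data reaches bloatmon.
while : ; do
        grep -v '^slabinfo' /proc/slabinfo      \
                | bloatmon                      \
                | sort -n -k 4,4                \
                | head -22
        sleep 5
        echo
done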
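#!/bin/sh
# Every 60 seconds, print the 22 slab caches holding the most
# allocated memory (column 3 of bloatmon's output, in KB).

while true
do
        bloatmon < /proc/slabinfo \
                | sort -rn -k 3,3 \
                | head -22
        sleep 60
        echo
done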
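
The /proc/meminfo logging mentioned in the message is not covered by
the attached scripts. A minimal sketch of one way to do it follows;
the function name log_meminfo, its arguments and the default log path
are assumptions made for illustration, not something from the
original post.

```shell
#!/bin/sh
# Hypothetical helper: append timestamped snapshots of a meminfo-style
# file to a log.
#   $1 = file to sample           (default /proc/meminfo)
#   $2 = log file                 (default ./meminfo.log)
#   $3 = number of samples        (default 1)
#   $4 = seconds between samples  (default 60)
log_meminfo() {
        src=${1:-/proc/meminfo}
        log=${2:-./meminfo.log}
        count=${3:-1}
        interval=${4:-60}
        i=0
        while [ "$i" -lt "$count" ]; do
                {
                        # Marker line so snapshots can be told apart later.
                        echo "=== $(date '+%Y-%m-%d %H:%M:%S')"
                        cat "$src"
                        echo
                } >> "$log"
                i=$((i + 1))
                # No sleep after the final sample.
                if [ "$i" -lt "$count" ]; then
                        sleep "$interval"
                fi
        done
        return 0
}
```

For example, log_meminfo /proc/meminfo /var/tmp/meminfo.log 1440 60
would take one snapshot a minute for a day, so the log could be lined
up against the sar data around a live-lock.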
