I've set up a simple script (well, I copied it from someone else and modified
it) to monitor stale NFS mounts.  Some preliminary testing seemed to go okay,
but this problem crept up on me this weekend.  The script is as follows:

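#!/usr/bin/perl
use strict;
use warnings;

if (@ARGV < 1) {
        print "Usage:\n";
        print "$0 <file to check with absolute path>\n";
        exit 1;
}

my $test;
eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm 2;                # give ls two seconds to return
        $test = `ls $ARGV[0]`;  # blocks if the NFS mount is stale
        alarm 0;                # cancel the alarm on success
};

if ($@) {
        die unless $@ eq "alarm\n";     # propagate unexpected errors
        # Timed out - error
        exit 1;
} else {
        # Okay
        exit 0;
}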


However, on the machine that experienced the problem, `ps aux` showed several
hung `ls` processes (state D).  Shouldn't the alarm have made them exit?  I
call the script as follows:

# new_nfs_check.pl /opt/auctions/config/.DoNotDelete

The `ps aux` output follows:

202      22634  0.0  0.0   2752   732 pts/0    D    Nov02   0:00 ls /opt/auctions/config/.DoNotDelete
202      23728  0.0  0.0   2752   732 pts/0    D    Nov02   0:00 ls /opt/auctions/config/.DoNotDelete
202      24745  0.0  0.0   2752   732 pts/0    D    Nov02   0:00 ls /opt/auctions/config/.DoNotDelete
202      26054  0.0  0.0   2748   728 pts/0    D    00:08   0:00 ls /opt/auctions/config/.DoNotDelete
202      26959  0.0  0.0   2748   728 pts/0    D    00:18   0:00 ls /opt/auctions/config/.DoNotDelete
202      27742  0.0  0.0   2752   732 pts/0    D    00:28   0:00 ls /opt/auctions/config/.DoNotDelete
202      28748  0.0  0.0   2748   728 pts/0    D    00:38   0:00 ls /opt/auctions/config/.DoNotDelete
202      29767  0.0  0.0   2748   728 pts/0    D    00:48   0:00 ls /opt/auctions/config/.DoNotDelete
202      30410  0.0  0.0   2748   728 pts/0    D    00:58   0:00 ls /opt/auctions/config/.DoNotDelete
202      31508  0.0  0.0   2748   728 pts/0    D    01:08   0:00 ls /opt/auctions/config/.DoNotDelete
202      31635  0.0  0.0   2748   728 pts/0    D    01:10   0:00 ls /opt/auctions/config/.DoNotDelete
202      31648  0.0  0.0   2752   732 pts/0    D    01:10   0:00 ls /opt/auctions/config/.DoNotDelete


Anyone have ideas?  This node is a SLES9 box (which, as I understand it, has
issues with either the kernel or nfs-utils).
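
For comparison, here is a minimal sketch of a variant that forks the check
itself, so the parent knows the child's PID and can at least try to signal it
when the alarm fires.  The helper name check_with_timeout and its return
convention are made up for illustration; this is untested against a stale
mount.  One caveat: a process in state D is in uninterruptible sleep, so even
SIGKILL stays pending until the blocked I/O returns, and this may not clear
the stuck `ls` processes either.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper (not from the original script): run @cmd in a forked
# child, wait at most $timeout seconds, and send SIGKILL on timeout.
# Returns 1 if the command exited with status 0 in time, 0 otherwise.
sub check_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: discard output, then replace this process with the command.
        open STDOUT, '>', '/dev/null';
        open STDERR, '>', '/dev/null';
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }

    # Parent: wait for the child, but let SIGALRM interrupt the wait.
    my $ok = eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm $timeout;
        waitpid($pid, 0);
        alarm 0;
        $? == 0;
    };
    alarm 0;    # make sure no alarm is left pending

    if (!defined $ok) {
        die $@ unless $@ eq "alarm\n";
        # Timed out: signal the child we started.  A child stuck in state D
        # will not act on this until its blocked I/O returns.
        kill 'KILL', $pid;
        waitpid($pid, WNOHANG);    # non-blocking attempt to reap it
        return 0;
    }
    return $ok ? 1 : 0;
}
```

With this helper the end of the script would become something like
`exit(check_with_timeout(2, 'ls', $ARGV[0]) ? 0 : 1);`.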

Matt
