I've set up a simple script (well, I copied it from someone else and modified it) to monitor stale NFS mounts. Some preliminary testing seemed to go okay, but this problem crept up on me this weekend. The script is as follows:

    #!/usr/bin/perl

    if (@ARGV < 1) {
        print "Usage:\n";
        print "$0 <file to check with absolute path>\n";
        exit 1;
    }

    eval {
        local $SIG{ALRM} = sub {die "alarm\n"};
        alarm 2;
        $test = `ls @ARGV[0]`;
        alarm 0;
    };

    if ($@) {
        die unless $@ eq "alarm\n";
        # Timed out - error
        exit 1;
    } else {
        # Okay
        exit 0;
    }

However, on the machine that experienced the problem, `ps aux` showed several dead `ls` processes. Shouldn't the alarm() function have exited?

I call the script as follows:

    # new_nfs_check.pl /opt/auctions/config/.DoNotDelete

The `ps aux` output follows:

    202 22634 0.0 0.0 2752 732 pts/0 D Nov02 0:00 ls /opt/auctions/config/.DoNotDelete
    202 23728 0.0 0.0 2752 732 pts/0 D Nov02 0:00 ls /opt/auctions/config/.DoNotDelete
    202 24745 0.0 0.0 2752 732 pts/0 D Nov02 0:00 ls /opt/auctions/config/.DoNotDelete
    202 26054 0.0 0.0 2748 728 pts/0 D 00:08 0:00 ls /opt/auctions/config/.DoNotDelete
    202 26959 0.0 0.0 2748 728 pts/0 D 00:18 0:00 ls /opt/auctions/config/.DoNotDelete
    202 27742 0.0 0.0 2752 732 pts/0 D 00:28 0:00 ls /opt/auctions/config/.DoNotDelete
    202 28748 0.0 0.0 2748 728 pts/0 D 00:38 0:00 ls /opt/auctions/config/.DoNotDelete
    202 29767 0.0 0.0 2748 728 pts/0 D 00:48 0:00 ls /opt/auctions/config/.DoNotDelete
    202 30410 0.0 0.0 2748 728 pts/0 D 00:58 0:00 ls /opt/auctions/config/.DoNotDelete
    202 31508 0.0 0.0 2748 728 pts/0 D 01:08 0:00 ls /opt/auctions/config/.DoNotDelete
    202 31635 0.0 0.0 2748 728 pts/0 D 01:10 0:00 ls /opt/auctions/config/.DoNotDelete
    202 31648 0.0 0.0 2752 732 pts/0 D 01:10 0:00 ls /opt/auctions/config/.DoNotDelete

Does anyone have ideas? This node is a SLES9 box (which, as I understand it, has issues with either the kernel or nfs-utils).

Matt