On Thu, 7 Nov 2002, Nuno Silva wrote:

> My advice is: "upgrade" to 2.4.18 :)
> My current record with 2.4.19 with moderate load but high I/O is 12 days

Both my production vserver boxes suffer from the `hangs' Cathy has
described.

PIV 1.6GHz, 1GB [no highmem], ext3, LVM, IDE Soft-"RAID"
2.4.19-pre7-ctx10-ide20020510 is my ``more reliable'' box so far...

  75 days, 09:25:41 | Linux 2.4.19-pre7-ctx1  Sun Jul 14
   0 days, 09:21:45 | Linux 2.4.19-pre7-ctx1  Sat Sep 28
  38 days, 22:20:20 | Linux 2.4.19-pre7-ctx1  Sun Sep 29

On one occasion I found ``out of file handles'' in the terminal-server
scroll--now I monitor that.  As Cathy points out, monitoring/logwriting (and
presumably the processes trying to do it) stops completely when the box ends
up in this state.
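
A minimal sketch of that kind of file-handle check, assuming the three
fields of /proc/sys/fs/file-nr are (allocated, allocated-but-unused,
maximum) as the kernel sysctl documentation describes them.  The function
names and the 0.9 threshold are hypothetical, chosen only for illustration:

```python
def parse_file_nr(text):
    """Split the contents of /proc/sys/fs/file-nr into three integers.

    Assumed field order: allocated, allocated-but-unused, maximum.
    """
    allocated, unused, maximum = (int(field) for field in text.split()[:3])
    return allocated, unused, maximum


def handle_usage(text):
    """Return the fraction of the system-wide file handle limit in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


def nearly_out_of_handles(path="/proc/sys/fs/file-nr", threshold=0.9):
    """True when usage has reached the (hypothetical) warning threshold."""
    with open(path) as f:
        return handle_usage(f.read()) >= threshold
```

Run from cron or a monitoring loop, a check like this only helps while
userspace is still being scheduled; once the box is in the hung state
described above, nothing gets logged.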

PIII 700MHz, 192MB [no highmem ;-)], ext2, md+SCSI, IDE
2.4.18ctx-10 is my ``less reliable'' box, viz:

   9 days, 22:42:04 | Linux 2.4.18ctx-10      Tue May 14
  56 days, 22:51:47 | Linux 2.4.18ctx-10      Fri May 24
  32 days, 21:26:16 | Linux 2.4.18ctx-10      Sat Jul 20
  13 days, 13:29:46 | Linux 2.4.18ctx-10      Thu Aug 22
  10 days, 05:49:08 | Linux 2.4.18ctx-10      Tue Sep 17
   3 days, 06:01:25 | Linux 2.4.18ctx-10      Thu Sep  5
   5 days, 20:31:13 | Linux 2.4.18ctx-10      Sun Sep  8
   4 days, 05:47:06 | Linux 2.4.18ctx-10      Mon Sep 30
  22 days, 17:39:13 | Linux 2.4.18ctx-10      Sat Oct  5
   9 days, 21:37:55 | Linux 2.4.18ctx-10      Mon Oct 28

This last one was a genuine Oops that spewed (and was not rebootable with
[break] on the serial console);  most [all?] of the rest (sadly too many to
count...) have been hangs where the box still answers ICMP echo requests and
half-opens TCP connections, and can be rebooted with SysRq from the serial
console.

The softdog (kernel/userspace watchdog) cannot be persuaded to reboot the
machines when they end up in this state, although the kernel--not having
received an update from userspace--should reboot!  And that is with
*everything* turned on (completely paranoid state) in the watchdog program.
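
For reference, the userspace half of that arrangement can be sketched
roughly as below.  This is not the watchdog program itself, only an
illustration of the keepalive protocol: any write to /dev/watchdog resets
the kernel timer, and if the writes stop the kernel is expected to reboot.
The function names and the 10-second interval are hypothetical, and whether
a given softdog build honours the "V" magic-close character is an
assumption:

```python
import time


def pet(dev):
    """Reset the watchdog timer by writing a single byte to the device."""
    dev.write(b"\0")
    dev.flush()


def run(dev, interval=10, rounds=None, sleep=time.sleep):
    """Pet the watchdog every `interval` seconds.

    Loops forever when `rounds` is None; returns the number of pets made.
    """
    count = 0
    while rounds is None or count < rounds:
        pet(dev)
        count += 1
        sleep(interval)
    return count


def disarm(dev):
    """Write the magic-close character so closing does not trigger a reboot.

    Assumption: the driver supports magic close and was not built 'nowayout'.
    """
    dev.write(b"V")
    dev.flush()
```

Usage would be along the lines of
`with open("/dev/watchdog", "wb", buffering=0) as dev: run(dev)`.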

        -Paul
-- 
Nottingham, GB


