I managed to get into the office early today to boot into single user mode and run fsck.
Sure enough, the /var file system had some serious errors that are now fixed. All the other file systems seemed fine. Hopefully the problem is resolved; I should only have to wait a couple of days before I can confirm.

On 18 October 2013 13:19, Peter Green <[email protected]> wrote:

> Thanks Otto,
>
> When it's appropriate, I will do as suggested. :)
>
> Pete
>
>
> On 18 October 2013 12:19, Otto Moerbeek <[email protected]> wrote:
>
>> On Fri, Oct 18, 2013 at 10:57:56AM +0100, Peter Green wrote:
>>
>> > Hi,
>> >
>> > I have a Dell R200 running OpenBSD 4.9, which operates as the edge router
>> > for our office. It's been working for a long time without any intervention
>> > required until recently, when it began exhibiting kernel panics.
>> >
>> > At first, I thought it was a random occurrence, but I dutifully took screen
>> > shots of trace and ps outputs via ddb and rebooted the box. Since that
>> > time, it's happened on two or three further occasions, but unfortunately, I
>> > wasn't the one in the office and so no screen caps were taken.
>> >
>> > Today, I arrived at the office to find the system panicked again, so I took
>> > screen caps and compared them to the first time it happened. I'm not
>> > experienced in debugging BSD kernel panics, but it appears that the same
>> > function is causing the problem: ffs_blkfree()
>> >
>> > My initial searches online seem to suggest this is potentially a problem
>> > with the disk(s); perhaps a bad block. The machine runs a Symbios Logic
>> > SAS1068E hardware RAID controller, which appears to the OS as a device
>> > mpi0. Running bioctl mpi0 shows the following:
>> >
>> > # bioctl mpi0
>> > Volume  Status               Size Device
>> >  mpi0 0 Online       249376538112 sd0     RAID1
>> >       0 Online       249999999488 0:8.0   noencl <ATA ST3250310NS MA08>
>> >       1 Online       249999999488 0:1.0   noencl <ATA ST3250310NS MA08>
>> >
>> >
>> > So, the RAID controller seems to think the underlying disks are ok.
>> >
>> > Here are the links for the ddb output I grabbed on both occasions:
>> >
>> > https://www.dropbox.com/s/vmvuzn3qg2af85l/2013-10-10%2008.53.35.jpg
>> >
>> > https://www.dropbox.com/s/r9jaofaotvjr6gx/2013-10-10%2008.53.41.jpg
>> >
>> > https://www.dropbox.com/s/creu48dcb48yirh/2013-10-10%2008.53.49.jpg
>> >
>> > https://www.dropbox.com/s/w0h4sjkkfe5ns1j/2013-10-10%2008.56.17.jpg
>> >
>> > https://www.dropbox.com/s/5ol10lmaznii3yp/2013-10-10%2008.56.30.jpg
>> >
>> > https://www.dropbox.com/s/154er8pans2dph5/2013-10-18%2009.29.51.jpg
>> >
>> > https://www.dropbox.com/s/aqte9poi8p4ezcp/2013-10-18%2009.30.21.jpg
>> >
>> > https://www.dropbox.com/s/lxl5l8vylavo64o/2013-10-18%2009.30.30.jpg
>> >
>> > https://www.dropbox.com/s/g2zf1fnk2zrvqml/2013-10-18%2009.30.34.jpg
>> >
>> > https://www.dropbox.com/s/wnpx6mh7uyrlht2/2013-10-18%2009.30.53.jpg
>> >
>> > https://www.dropbox.com/s/amf8z1s73g8ovxi/2013-10-18%2009.31.00.jpg
>> >
>> > https://www.dropbox.com/s/q0yf37n6wbr98cl/2013-10-18%2009.31.06.jpg
>> >
>> > I hope this helps. As I've stated, I suspect a hardware issue, but I'd just
>> > like some further analysis from people more experienced than I.
>> >
>> > Thanks,
>> >
>> > Pete
>>
>> Could indeed be hardware, but first make sure your filesystems are
>> not corrupt:
>>
>> Boot into single user mode and force a check of all filesystems: fsck -f
>> This exercises disk and memory as well.
>>
>> -Otto
