I managed to get into the office early today to boot into single user mode
and run fsck.

Sure enough, the /var file system had some serious errors that are now
fixed. All the other file systems seemed fine. Hopefully the problem is
resolved; I should only have to wait a couple of days before I can confirm.


On 18 October 2013 13:19, Peter Green <[email protected]> wrote:

> Thanks Otto,
>
> When it's appropriate, I will do as suggested. :)
>
> Pete
>
>
> On 18 October 2013 12:19, Otto Moerbeek <[email protected]> wrote:
>
>> On Fri, Oct 18, 2013 at 10:57:56AM +0100, Peter Green wrote:
>>
>> > Hi,
>> >
>> > I have a Dell R200 running OpenBSD 4.9, which operates as the edge
>> > router for our office. It's been working for a long time without any
>> > intervention required until recently, when it began exhibiting kernel
>> > panics.
>> >
>> > At first, I thought it was a random occurrence, but I dutifully took
>> > screenshots of the trace and ps output via ddb and rebooted the box.
>> > Since that time, it's happened on two or three further occasions, but
>> > unfortunately I wasn't the one in the office, so no screen caps were
>> > taken.
>> >
>> > Today, I arrived at the office to find the system panicked again, so I
>> > took screen caps and compared them to the first time it happened. I'm
>> > not experienced in debugging BSD kernel panics, but it appears that
>> > the same function is causing the problem: ffs_blkfree()
>> >
>> > My initial searches online seem to suggest this is potentially a problem
>> > with the disk(s); perhaps a bad block. The machine runs a Symbios Logic
>> > SAS1068E hardware RAID controller, which appears to the OS as a device
>> > mpi0. Running bioctl mpi0 shows the following:
>> >
>> > # bioctl mpi0
>> > Volume  Status               Size Device
>> >  mpi0 0 Online       249376538112 sd0     RAID1
>> >       0 Online       249999999488 0:8.0   noencl <ATA     ST3250310NS MA08>
>> >       1 Online       249999999488 0:1.0   noencl <ATA     ST3250310NS MA08>
>> >
>> >
>> > So, the RAID controller seems to think the underlying disks are ok.
>> >
>> > Here are the links for the ddb output I grabbed on both occasions:
>> >
>> > https://www.dropbox.com/s/vmvuzn3qg2af85l/2013-10-10%2008.53.35.jpg
>> >
>> > https://www.dropbox.com/s/r9jaofaotvjr6gx/2013-10-10%2008.53.41.jpg
>> >
>> > https://www.dropbox.com/s/creu48dcb48yirh/2013-10-10%2008.53.49.jpg
>> >
>> > https://www.dropbox.com/s/w0h4sjkkfe5ns1j/2013-10-10%2008.56.17.jpg
>> >
>> > https://www.dropbox.com/s/5ol10lmaznii3yp/2013-10-10%2008.56.30.jpg
>> >
>> > https://www.dropbox.com/s/154er8pans2dph5/2013-10-18%2009.29.51.jpg
>> >
>> > https://www.dropbox.com/s/aqte9poi8p4ezcp/2013-10-18%2009.30.21.jpg
>> >
>> > https://www.dropbox.com/s/lxl5l8vylavo64o/2013-10-18%2009.30.30.jpg
>> >
>> > https://www.dropbox.com/s/g2zf1fnk2zrvqml/2013-10-18%2009.30.34.jpg
>> >
>> > https://www.dropbox.com/s/wnpx6mh7uyrlht2/2013-10-18%2009.30.53.jpg
>> >
>> > https://www.dropbox.com/s/amf8z1s73g8ovxi/2013-10-18%2009.31.00.jpg
>> >
>> > https://www.dropbox.com/s/q0yf37n6wbr98cl/2013-10-18%2009.31.06.jpg
>> >
>> > I hope this helps. As I've stated, I suspect a hardware issue, but I'd
>> > just like some further analysis from people more experienced than I.
>> >
>> > Thanks,
>> >
>> > Pete
>>
>> Could indeed be hardware, but first make sure your filesystems are
>> not corrupt:
>>
>> Boot into single-user mode and force a check of all filesystems: fsck -f
>> This exercises disk and memory as well.
>>
>>         -Otto
