> Date: Mon, 12 Mar 2012 22:39:31 +0100 (CET)
> From: Mark Kettenis <mark.kette...@xs4all.nl>
>
> > Date: Sat, 25 Feb 2012 09:55:57 +0100
> > From: Paul de Weerd <we...@weirdnet.nl>
> >
> > I recently got a v215 from a friend and have installed OpenBSD on it.
> > Occasionally, it will panic during boot.  This happened during
> > install and I see it now during regular reboots.  I can pretty much
> > reproduce this at will with a couple of reboots.
> >
> > Could this be faulty hardware?  To reset the ALOM password, I
> > installed Solaris 10 (took an eternity) and that never showed any
> > problems, but I guess that doesn't prove much.
> >
> > First the panic and then full dmesg (from a successful boot) are
> > included below.
>
> I doubt this is faulty hardware.  I've seen similar reports for a
> v445, which has the same crappy Acer Labs pciide(4) controller.  I
> fear that the wdc.c changes made in April 2011 introduced this
> behaviour.

So thanks to Paul giving me access to the machine in question, I've
been able to figure out what's going wrong here.  The data error
always happens when running wdcintr() for channel 1.  Now on these
machines we have the following line in dmesg

  ...
  pciide0: channel 1 disabled (no drives)
  ...

indicating that there is no actual hardware connected to channel 1.
As a result of this we skip further initialization of the channel.
Therefore it shouldn't be a terrible surprise that the chip doesn't
like it when we try to read registers associated with this channel.
On crappy PC hardware this won't be noticed, but on sparc64 this
results in an unrecoverable fault.

The solution is easy.  We shouldn't be calling wdcintr() for a channel
that isn't properly initialized.

ok?


Index: pciide.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/pciide.c,v
retrieving revision 1.337
diff -u -p -r1.337 pciide.c
--- pciide.c	15 Jan 2012 15:16:23 -0000	1.337
+++ pciide.c	13 Mar 2012 18:54:50 -0000
@@ -1838,6 +1838,9 @@ pciide_pci_intr(void *arg)
 		if (cp->compat)
 			continue;
 
+		if (cp->hw_ok == 0)
+			continue;
+
 		if (pciide_intr_flag(cp) == 0)
 			continue;
 
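
For anyone who wants to see the effect of the check in isolation, here
is a rough userland sketch of the patched loop.  It is not the kernel
code: struct chan, fake_wdcintr() and fake_pci_intr() are made-up
stand-ins, and the intr_flag field only imitates whatever
pciide_intr_flag() would report.  The only thing it is meant to show
is the order of the tests, with the hw_ok test placed before anything
that would touch the channel.

```c
/* Hypothetical stand-in for the driver's per-channel state. */
struct chan {
	int compat;	/* channel runs in compatibility mode */
	int hw_ok;	/* nonzero once the channel has been initialized */
	int intr_flag;	/* imitates the result of pciide_intr_flag() */
	int handled;	/* how often the fake handler touched this channel */
};

/* Stand-in for wdcintr(); the real one reads channel registers. */
static int
fake_wdcintr(struct chan *cp)
{
	cp->handled++;
	return 1;
}

/* Same order of checks as the patched loop in pciide_pci_intr(). */
static int
fake_pci_intr(struct chan *chans, int nchans)
{
	int i, rv = 0;

	for (i = 0; i < nchans; i++) {
		struct chan *cp = &chans[i];

		if (cp->compat)
			continue;

		/* The new test: never touch an uninitialized channel. */
		if (cp->hw_ok == 0)
			continue;

		if (cp->intr_flag == 0)
			continue;

		if (fake_wdcintr(cp))
			rv = 1;
	}
	return rv;
}
```

With two channels that both claim a pending interrupt, where only
channel 0 has hw_ok set, fake_pci_intr() returns 1 and only channel 0
is ever handed to fake_wdcintr().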