Re: armv7 panic

2016-10-17 Thread Jonathan Gray
On Sun, Oct 16, 2016 at 07:02:55PM +1100, Jonathan Gray wrote:
> On Sun, Oct 16, 2016 at 12:45:36AM -0700, Philip Guenther wrote:
> > On Sun, 16 Oct 2016, Matthieu Herrb wrote:
> > > my Sabre Lite board panicked during the night for the first time in several 
> > > months. Here is the information I collected:
> > > 
> > > kernel diagnostic assertion "p->p_wchan == NULL" failed: file 
> > > "/usr/src/sys/kern/kern_sched.c", line 333
> > 
> > This means a thread was somehow in the CPU's run-queue...but had a wait 
> > channel as if it was waiting to be woken.
> > 
> > ...
> > > ddb> ps
> > >    TID   PPID   PGRP    UID  S       FLAGS  WAIT          COMMAND
> > ...
> > >  86009      0      0      0  2  0x40014200  miiaut        idle0
> > 
> > Bingo: S=2 --> SRUN!  Or in this case, WTF!?  idle threads must *NEVER* 
> > have a wait channel.  That smells like someone called tsleep() from an 
> > interrupt and arm doesn't have the low-level machinery to detect and panic 
> > at the call.
> > 
> > mii_phy_auto() is the source of the "miiaut" wait channel; are the mii 
> > flags not set right that it's taking the tsleep() path instead of 
> > timeout_* path?
> 
> Looks like.  I don't see a reason why anything using interrupts should
> set that flag.  So fec/sxie seem to get this wrong.
> 
> arch/armv7/imx/if_fec.c:mii->mii_flags = MIIF_AUTOTSLEEP;
> arch/armv7/sunxi/sxie.c:mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_aue.c:   mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_axe.c:   mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_mos.c:   mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_udav.c:  mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_url.c:   mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_smsc.c:  mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_axen.c:  mii->mii_flags = MIIF_AUTOTSLEEP;
> dev/usb/if_ure.c:   mii->mii_flags = MIIF_AUTOTSLEEP;
> 

I have no hardware with sxie, so here is just the fec diff.

Index: if_fec.c
===================================================================
RCS file: /cvs/src/sys/arch/armv7/imx/if_fec.c,v
retrieving revision 1.18
diff -u -p -r1.18 if_fec.c
--- if_fec.c	22 Sep 2016 12:43:22 -0000	1.18
+++ if_fec.c	17 Oct 2016 00:46:49 -0000
@@ -413,7 +413,6 @@ fec_attach(struct device *parent, struct
 	mii->mii_readreg = fec_miibus_readreg;
 	mii->mii_writereg = fec_miibus_writereg;
 	mii->mii_statchg = fec_miibus_statchg;
-	mii->mii_flags = MIIF_AUTOTSLEEP;
 
 	ifmedia_init(&mii->mii_media, 0, fec_ifmedia_upd, fec_ifmedia_sts);
 	mii_attach(self, mii, 0xffffffff, MII_PHY_ANY, MII_OFFSET_ANY, 0);
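
For readers following the reasoning in this thread, here is a minimal userland sketch of why the flag matters. It is not the kernel's code: every identifier prefixed with sk_ or SK_ is a hypothetical stand-in, and the model only assumes what the thread states, namely that the flag selects the tsleep() path over the timeout_* path, that sleeping from an interrupt is the suspected bug, and that a runnable thread must have no wait channel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Every name here is a hypothetical stand-in, not a kernel definition. */
#define SK_AUTOTSLEEP 0x1u	/* plays the role of MIIF_AUTOTSLEEP */

enum sk_path { SK_PATH_TIMEOUT, SK_PATH_SLEEP };
enum sk_state { SK_SLEEPING, SK_RUNNABLE };

struct sk_thread {
	enum sk_state state;
	const char *wchan;	/* NULL when the thread is not waiting */
};

/* With the flag set, autonegotiation sleeps; otherwise it arms a timeout. */
enum sk_path
sk_autoneg_path(unsigned int flags)
{
	return (flags & SK_AUTOTSLEEP) ? SK_PATH_SLEEP : SK_PATH_TIMEOUT;
}

/* Sleeping needs a thread context; arming a timeout does not sleep. */
bool
sk_path_is_safe(enum sk_path path, bool in_interrupt)
{
	return !(path == SK_PATH_SLEEP && in_interrupt);
}

/* The condition the failed assertion expresses: runnable implies no wchan. */
bool
sk_runnable_is_sane(const struct sk_thread *t)
{
	return t->state != SK_RUNNABLE || t->wchan == NULL;
}

#ifdef SK_DEMO_MAIN
int
main(void)
{
	/* Mirrors the ps line above: runnable, yet waiting on "miiaut". */
	struct sk_thread idle0 = { SK_RUNNABLE, "miiaut" };

	assert(!sk_path_is_safe(sk_autoneg_path(SK_AUTOTSLEEP), true));
	assert(sk_path_is_safe(sk_autoneg_path(0), true));
	assert(!sk_runnable_is_sane(&idle0));
	return 0;
}
#endif
```

Building with -DSK_DEMO_MAIN runs the three checks. In this model, dropping the flag, as the diff above does for fec, is what moves an interrupt-driven caller onto the path that never sleeps.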



Re: 6.0 i386 MP kernel hang (6.0 SP and 5.9 MP kernels work)

2016-10-17 Thread Mike Larkin
On Mon, Oct 17, 2016 at 01:31:36PM -0700, Mike Larkin wrote:
> On Sun, Oct 16, 2016 at 10:44:56PM -0400, Jim Faulkner wrote:
> > Hi folks, I own a (fairly old) Fit-PC2i.  It has a 32-bit only Intel Atom
> > dual core processor.  I ran the 5.9 i386 multiprocessor kernel without
> > problem.  However, the 6.0 MP kernel hangs at:
> > booting hd0a:/bsd: 7663236+2035096+189444+0+1085440 [72+518464+510159|
> > 
> > The 6.0 SP kernel works fine.  I patched the kernel today (Oct 16) and
> > rebuilt GENERIC.MP but the patched MP kernel still hangs.
> > 
> > Please see the attached dumps.tar.gz for dmesg, usbdevs, pcidump, and
> > acpidump output.  Let me know what other information I can provide.
> > 
> 
> Can you help by bisecting diffs? There wasn't much changed in that time
> frame that I could envision causing a hang here.
> 
> Thanks.
> 
> -ml
> 

PS, all you need to bisect is the kernel stuff (obviously). If it gets past
the point of hanging, you've found the offending commit.

I thought it may have been related to W^X, but you said 5.9 worked, and 
most of that was already in for 5.9. You could try to validate that part
by commenting out the "detect PAE" code in locore.s and see if it properly
falls back to no-PAE on your MP configuration.

-ml
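
The bisection suggested above can be sketched as a binary search. This is only an illustration under stated assumptions: the kernel changes between the working and hanging builds are ordered oldest to newest, the hang persists once it is introduced, and the predicate (here the hypothetical sk_boots_before) stands in for building and booting a kernel by hand. None of these names come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical predicate type: nonzero if a kernel at change idx boots. */
typedef int (*sk_boots_fn)(size_t idx, void *ctx);

/*
 * Return the index of the first change whose kernel hangs, or n if every
 * change boots. Assumes changes 0..n-1 are ordered oldest to newest and
 * that all changes after the first hanging one hang as well.
 */
size_t
sk_first_bad(size_t n, sk_boots_fn boots, void *ctx)
{
	size_t lo = 0, hi = n;	/* all below lo boot, all from hi on hang */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (boots(mid, ctx))
			lo = mid + 1;	/* culprit is newer than mid */
		else
			hi = mid;	/* mid hangs; culprit is mid or older */
	}
	return lo;
}

/* Demo predicate: ctx points at the index of the first hanging change. */
int
sk_boots_before(size_t idx, void *ctx)
{
	return idx < *(size_t *)ctx;
}

#ifdef SK_DEMO_MAIN
int
main(void)
{
	size_t culprit = 5;

	assert(sk_first_bad(8, sk_boots_before, &culprit) == 5);
	return 0;
}
#endif
```

Each probe halves the remaining range, so even a few hundred kernel commits need only a handful of build-and-boot cycles.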



Re: 6.0 i386 MP kernel hang (6.0 SP and 5.9 MP kernels work)

2016-10-17 Thread Mike Larkin
On Sun, Oct 16, 2016 at 10:44:56PM -0400, Jim Faulkner wrote:
> Hi folks, I own a (fairly old) Fit-PC2i.  It has a 32-bit only Intel Atom
> dual core processor.  I ran the 5.9 i386 multiprocessor kernel without
> problem.  However, the 6.0 MP kernel hangs at:
> booting hd0a:/bsd: 7663236+2035096+189444+0+1085440 [72+518464+510159|
> 
> The 6.0 SP kernel works fine.  I patched the kernel today (Oct 16) and
> rebuilt GENERIC.MP but the patched MP kernel still hangs.
> 
> Please see the attached dumps.tar.gz for dmesg, usbdevs, pcidump, and
> acpidump output.  Let me know what other information I can provide.
> 

Can you help by bisecting diffs? There wasn't much changed in that time
frame that I could envision causing a hang here.

Thanks.

-ml