On Mon, Feb 13, 2023 at 01:31:30PM +0100, Stefan Sperling wrote:
> On Sat, Feb 11, 2023 at 06:32:31PM +0300, Vitaliy Makkoveev wrote:
> > Unfortunately the diff doesn't help. I attached photos with debug
> > information and panic report.
>
> Based on the command code in the firmware trace the error seems to be
> caused by the driver sending this command:
>
> #define IWX_ADD_STA_KEY 0x17
>
> It seems the set_key task is running while the driver is in INIT state?
> That is certainly not intended, and it is unclear why this is happening.
>
> What is the effect of this diff? Do you see the printfs added here?
>
Still panics. The only printf from your diff I see is "iwx_setkey_task:
in state RUN".
I suspect I found the problem. I made the diff below. It introduced some
debug output to catch thread which sets dram->paging concurrently with
iwx_init_task(). Please note, during this race we have IFF_FLAG set, but
IFF_RUNNING flag not set, so the iwx_init_task() will not call
iwx_stop(), but will call iwx_init().
----[ cut ]----
Index: sys/dev/pci/if_iwx.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_iwx.c,v
retrieving revision 1.152
diff -u -p -r1.152 if_iwx.c
--- sys/dev/pci/if_iwx.c 24 Jan 2023 16:18:22 -0000 1.152
+++ sys/dev/pci/if_iwx.c 13 Feb 2023 23:37:19 -0000
@@ -8767,6 +8767,7 @@ iwx_init(struct ifnet *ifp)
struct ieee80211com *ic = &sc->sc_ic;
int err, generation;
+ printf("%s\n", __func__);
rw_assert_wrlock(&sc->ioctl_rwl);
generation = ++sc->sc_generation;
@@ -8901,6 +8902,7 @@ iwx_stop(struct ifnet *ifp)
struct iwx_node *in = (void *)ic->ic_bss;
int i, s = splnet();
+ printf("%s\n", __func__);
rw_assert_wrlock(&sc->ioctl_rwl);
sc->sc_flags |= IWX_FLAG_SHUTDOWN; /* Disallow new tasks. */
@@ -10444,6 +10446,7 @@ iwx_attach_hook(struct device *self)
struct iwx_softc *sc = (void *)self;
KASSERT(!cold);
+ printf("%s\n", __func__);
iwx_preinit(sc);
}
@@ -10944,6 +10947,7 @@ iwx_init_task(void *arg1)
int fatal = (sc->sc_flags & (IWX_FLAG_HW_ERR | IWX_FLAG_RFKILL));
rw_enter_write(&sc->ioctl_rwl);
+ printf("%s\n", __func__);
if (generation != sc->sc_generation) {
rw_exit(&sc->ioctl_rwl);
splx(s);
@@ -11030,10 +11034,12 @@ iwx_activate(struct device *self, int ac
}
break;
case DVACT_RESUME:
+ printf("%s - resume\n", __func__);
iwx_resume(sc);
break;
case DVACT_WAKEUP:
if ((ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP) {
+ printf("%s - wakeup\n", __func__);
err = iwx_wakeup(sc);
if (err)
printf("%s: could not initialize hardware\n",
----[ cut ]----
The output on resume was:
iwx_stop
iwx_activate - resume
iwx_activate - wakeup
iwx0: fatal firmware error
iwx_init_task
iwx_init
panic: kernel diagnostic assertion "dram->....
So iwx_activate():
iwx_activate(struct device *self, int act)
{
/* ... */
switch (act) {
/* ... */
case DVACT_RESUME:
printf("%s - resume\n", __func__);
iwx_resume(sc);
break;
case DVACT_WAKEUP:
if ((ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP) {
printf("%s - wakeup\n", __func__);
err = iwx_wakeup(sc);
iwx_resume() doesn't do any software context allocation, but
iwx_wakeup() does. It calls iwx_init_hw(), ..., iwx_init_fw_sec() which
allocates "dram->paging". So, when iwx_init_task() scheduled by
interrupt started to run, iwx_init_fw_sec() will be entered again, but
with "dram->paging" set.
I hope this information will be useful to you. I'm not familiar with iwm
code, but existing `if_flags' checks looks wrong to me. May be firmware
errors should be handled with special IWX_FLAG_SW_ERR instead.
I could provide panic screenshot if required.