Hi, On 18/01/2017, Stefan Sperling <s...@stsp.name> wrote: > I managed to trigger the following uvm fault by continously switching an > iwm(4) client between two APs on different channels, i.e. a loop that runs: > ifconfig iwm0 chan X; sleep 10; ifconfig iwm chan Y; sleep 10; > > uvm_fault(0xffffffff819614a0, 0x7, 0, 2) -> e > kernel: page fault trap, code=0 > Stopped at softclock+0x32: movq %rdx,0x8(%rax) > > This diff seems to fix the problem. It cancels mira timeouts from > interrupt context where a state change is scheduled, instead of > cancelling mira timeouts from the process context which performs > the actual state change.
If I understand right, the crash is due to calling timeout_set() via ieee80211_mira_node_init() while mira timeouts are active. Now, these timeouts are cancelled on transition from the RUN state (and cannot be cancelled unconditionally). However the guard to detect this transition is unreliable -- the root cause -- as it races against ic->ic_state. This would imply there is a more general problem with identifying state transitions within iwm_newstate_task? That said, I was unable to further simplify your patch, it addresses the immediate issue, and I've tested it on my iwm(4) without problem. ok procter@ best, Richard. iwm0: hw rev 0x210, fw ver 16.242414.0, address xx:xx:xx:xx:xx:xx > Also, cancel mira timeouts when the interface is brought down in > iwm_stop() because this code path does not go through iwm_newstate(). > > ok? > > Note that unless we are in RUN state the timeouts have not been initialized > so timeout_del() inside mira_cancel_timeouts() would panic if we called it. > This can't be done in a different way since mira state can only be > initialized once the node for our AP (ic->ic_bss) is available. > > Index: if_iwm.c > =================================================================== > RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v > retrieving revision 1.156 > diff -u -p -r1.156 if_iwm.c > --- if_iwm.c 12 Jan 2017 18:06:57 -0000 1.156 > +++ if_iwm.c 17 Jan 2017 18:59:20 -0000 > @@ -5496,10 +5496,8 @@ iwm_newstate_task(void *psc) > if (ostate == IEEE80211_S_SCAN && nstate != ostate) > iwm_led_blink_stop(sc); > > - if (ostate == IEEE80211_S_RUN && nstate != ostate) { > + if (ostate == IEEE80211_S_RUN && nstate != ostate) > iwm_disable_beacon_filter(sc); > - ieee80211_mira_cancel_timeouts(&in->in_mn); > - } > > /* Reset the device if moving out of AUTH, ASSOC, or RUN. */ > /* XXX Is there a way to switch states without a full reset? */ > @@ -5632,8 +5630,11 @@ iwm_newstate(struct ieee80211com *ic, en > { > struct ifnet *ifp = IC2IFP(ic); > struct iwm_softc *sc = ifp->if_softc; > + struct iwm_node *in = (void *)ic->ic_bss; > > timeout_del(&sc->sc_calib_to); > + if (ic->ic_state == IEEE80211_S_RUN) > + ieee80211_mira_cancel_timeouts(&in->in_mn); > > sc->ns_nstate = nstate; > sc->ns_arg = arg; > @@ -6116,6 +6117,8 @@ iwm_stop(struct ifnet *ifp, int disable) > ifq_clr_oactive(&ifp->if_snd); > > in->in_phyctxt = NULL; > + if (ic->ic_state == IEEE80211_S_RUN) > + ieee80211_mira_cancel_timeouts(&in->in_mn); > > task_del(systq, &sc->init_task); > task_del(sc->sc_nswq, &sc->newstate_task); > >