On Thu, Apr 23 2020, Stefan Sperling <s...@stsp.name> wrote:
> I have observed a uvm fault in ieee80211_mira_probe_timeout_up() while
> testing with iwm(4) and tcpbench:
>
> void
> ieee80211_mira_probe_timeout_up(void *arg)
> {
> 	struct ieee80211_mira_node *mn = arg;
> 	int s;
>
> 	s = splnet();
> 	mn->probe_timer_expired[IEEE80211_MIRA_PROBE_TO_UP] = 1;
> 	DPRINTFN(3, ("probe up timeout fired\n"));
> 	splx(s);
> }
>
> One obvious possibility is that the 'mn' pointer became invalid before the
> timeout was executed. But I am not certain what happened exactly; the info
> in ddb was inconclusive since the console switching ran into splassert
> failures and I didn't see a good backtrace. But r12 in 'show regs' contained
> the address of ieee80211_mira_probe_timeout_up() and it looked like the
> kernel was in softclock context.
>
> In any case, it looks like cancelling timeouts before scheduling the
> iwm_newstate_task can lead to a race:
>
> - Timeouts are cancelled and iwm_newstate_task is scheduled
> - Tx done interrupts feed frames to MiRA which adds a new timeout
> - iwm_newstate_task runs and switches state without cancelling this timeout
>
> So cancel timeouts when we are actually switching state in the task.
>
> While here, initialize MiRA timeouts and other rate scaling state earlier,
> when the node is allocated.
>
> ok?
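
The ordering described in the quoted list can be replayed with a small
userland sketch in C. This is a toy model only, not the iwm(4) or net80211
code: struct toy_timeout and all toy_* functions are hypothetical stand-ins
invented for illustration, and the real timeout(9) API is not used.
toy_run() walks through the three steps with the cancel done either before
the task is scheduled or inside the task, and reports whether a timeout is
still armed after the state switch.

```c
#include <stdbool.h>

/* Toy stand-in for a kernel timeout; hypothetical, not timeout(9). */
struct toy_timeout {
	bool pending;
};

static void
toy_timeout_add(struct toy_timeout *to)
{
	to->pending = true;
}

static void
toy_timeout_del(struct toy_timeout *to)
{
	to->pending = false;
}

/*
 * Hypothetical Tx-done interrupt: feeding a frame to rate scaling
 * re-arms the probe timeout.
 */
static void
toy_txdone_intr(struct toy_timeout *to)
{
	toy_timeout_add(to);
}

/*
 * Replay the sequence from the list above.  Returns true if a timeout
 * is still pending once the (simulated) newstate task has finished.
 */
static bool
toy_run(bool cancel_in_task)
{
	struct toy_timeout probe = { .pending = true };

	/* Step 1: the task gets scheduled; old code cancels here. */
	if (!cancel_in_task)
		toy_timeout_del(&probe);

	/* Step 2: an interrupt sneaks in before the task runs. */
	toy_txdone_intr(&probe);

	/* Step 3: the task runs and switches state; new code cancels here. */
	if (cancel_in_task)
		toy_timeout_del(&probe);

	return probe.pending;
}
```

In this model toy_run(false) returns true, meaning a timeout is left armed
across the state switch, while toy_run(true) returns false.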

Works fine so far on

iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
iwm0: hw rev 0x230, fw ver 34.0.1, address f8:59:71:xx:xx:xx

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE