On Sat, Jan 21, 2017 at 08:07:53PM +1300, Richard Procter wrote:
> Hi,
> 
> On 18/01/2017, Stefan Sperling <s...@stsp.name> wrote:
> > I managed to trigger the following uvm fault by continously switching an
> > iwm(4) client between two APs on different channels, i.e. a loop that runs:
> > ifconfig iwm0 chan X; sleep 10; ifconfig iwm chan Y; sleep 10;
> >
> >  uvm_fault(0xffffffff819614a0, 0x7, 0, 2) -> e
> >  kernel: page fault trap, code=0
> >  Stopped at        softclock+0x32: movq %rdx,0x8(%rax)
> >
> > This diff seems to fix the problem. It cancels mira timeouts from
> > interrupt context where a state change is scheduled, instead of
> > cancelling mira timeouts from the process context which performs
> > the actual state change.
> 
> If I understand right, the crash is due to calling timeout_set() via
> ieee80211_mira_node_init() while mira timeouts are active.

Yes. It is not legal to call timeout_set() while the timeout is pending.

> Now, these timeouts are cancelled on transition from the RUN state
> (and cannot be cancelled unconditionally). However the guard to detect
> this transition is unreliable -- the root cause -- as it races against
> ic->ic_state.

One plausible explanation I could come up with goes like this:

We are in RUN state. The driver sends a packet.
The Tx completion interrupt handler fires. The handler calls mira_choose()
which schedules a timeout to "probe" a different rate at a certain amount
of time in the future. Mira probes rates with a bad score less frequently
to avoid excessive packet loss due to probing, hence an adjustable timeout.
We are probing a "bad" rate, and the timeout gets scheduled far ahead in
the future, up to 2048 milliseconds.

An ioctl wants to change the channel. ieee80211_ioctl() returns ENETRESET
which causes the driver to run iwm_stop() followed by iwm_init().
Note that iwm_stop() calls sc->sc_newstate(ic, IEEE80211_S_INIT, -1);
which will set ic->ic_state to INIT without going through iwm_newstate().
Without my patch, the timeout is not cancelled here.

The ioctl then calls iwm_init(). This calls ieee80211_begin_scan() which
ends up scheduling iwm_newstate_task() to move into SCAN state.
The state change from INIT to SCAN happens, and timeouts are not cancelled
because the "old" state is INIT instead of RUN. Eventually we choose an AP,
move back into RUN state, and call mira_node_init() which calls timeout_set().

Now the timeout fires, after timeout_set() was called.

> This would imply there is a more general problem with identifying
> state transitions within iwm_newstate_task?

Yes, there is a more general problem. See this discussion:
https://marc.info/?t=148362466200003&r=1&w=2

There are interrupts, ioctl, tasks, and timeouts.
The driver can schedule and cancel tasks and timeouts, but has no
control over scheduling of interrupts and ioctls. There are still
known races with ioctls.

iwm(4) is a bit special. It is the only wireless driver which waits for
responses from the firmware after sending commands (other drivers just
ignore these responses). Hence iwm(4) uses tasks because it must sleep
while the firmware is busy. The driver code base we inherited from genua
was already written like this, and I did not change this before the driver
got imported to CVS.

For a while I thought I had made a mistake, and that iwm(4) should be
doing things like iwn(4) does (never wait for the firmware).
However, for now I think it is good to have both approaches side by side,
and to make sure iwm(4) works correctly in the way it does.

Eventually, as MP processing in the kernel advances, we may want to move
parts of the net80211 code from interrupt context into a task. Getting this
right in one driver is a good prerequisite for that (and a great and much
needed learning experience for me, to be honest).

If you grab a drink with mpi@ I suspect he can tell you more about this :)

> That said, I was unable to further simplify your patch, it addresses
> the immediate issue, and I've tested it on my iwm(4) without problem.
> ok procter@

Thanks. I'll commit it now.

Reply via email to