This diff works around FIFO_UNDERRUN (0x84) Tx errors being reported
by iwn(4) firmware. When this error occurs, it tends to occur multiple
times in a row. Affected frames are lost and never get transmitted, and
traffic stalls for a while. This affects tcpbench very visibly.

I don't understand what is causing this. I have found that it only
occurs when we ask the firmware to use its multi-rate retry table.
If we send frames at a fixed rate, it does not happen.

This error is particularly problematic with block ack, because the
failed frames disappear and leave a hole in the receiver's block ack
window. The receiver will then have to wait a while for the lost frames
until it eventually decides to skip them.
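
To illustrate why a single lost frame stalls the receiver, here is a
minimal sketch of an in-order reorder buffer. It is not taken from any
driver or from net80211: the names (struct reorder, rx_frame, WIN) are
hypothetical, the window is tiny, sequence number wraparound is ignored,
and the timeout after which a real receiver skips the hole is not modelled.

```c
#include <stdbool.h>

#define WIN 8	/* hypothetical window size, far smaller than a real one */

/* Hypothetical receiver-side reorder state. */
struct reorder {
	int head;		/* next sequence number to deliver */
	bool have[WIN];		/* have[i] is set if frame head+i is buffered */
};

/*
 * Buffer the frame with sequence number seq and return how many frames
 * can now be delivered in order. Frames outside the window are ignored.
 */
static int
rx_frame(struct reorder *r, int seq)
{
	int i, n = 0, off = seq - r->head;

	if (off < 0 || off >= WIN)
		return 0;
	r->have[off] = true;

	/* Deliver frames only while there is no hole at the window start. */
	while (r->have[0]) {
		for (i = 0; i < WIN - 1; i++)
			r->have[i] = r->have[i + 1];
		r->have[WIN - 1] = false;
		r->head++;
		n++;
	}
	return n;
}
```

If frames 1 and 2 arrive but frame 0 never does, rx_frame() returns 0 for
both and nothing is delivered until the hole is filled or skipped.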

This problem effectively makes Tx aggregation unusable on iwn(4).

My workaround is to always use a fixed Tx rate for aggregation queues.
This is not ideal for single frames, which also get sent from such queues
when traffic is low and may now be more likely to fail.
But if the firmware decides to aggregate frames during traffic bursts,
all frames contained in an aggregate are always sent together at the
same rate anyway, so in this case we don't lose anything.
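
As a standalone illustration of the decision described above, the
following sketch mirrors the condition used in the diff. The struct and
function names (struct tx_ctx, use_fixed_rate) are hypothetical
simplifications and do not exist in if_iwn.c; only the shape of the test
is meant to match.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the state iwn_tx() consults. */
struct tx_ctx {
	int qid;		/* Tx queue the frame is placed on */
	int first_agg_txq;	/* first queue id used for aggregation */
	bool broadcast_node;	/* frame goes to the broadcast node */
	bool probing;		/* rate control is currently probing */
	int fixed_mcs;		/* user-fixed MCS, or -1 if none */
	int fixed_rate;		/* user-fixed legacy rate, or -1 if none */
};

/*
 * Return true if the frame should be sent at a fixed rate, i.e. without
 * the firmware's multi-rate retry table.
 */
static bool
use_fixed_rate(const struct tx_ctx *c)
{
	return c->broadcast_node || c->probing ||
	    c->qid >= c->first_agg_txq ||	/* the new workaround */
	    c->fixed_mcs != -1 || c->fixed_rate != -1;
}
```

With this condition, every frame queued on an aggregation queue takes
the fixed-rate path, whether or not the firmware ends up aggregating it.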

I would like to find a better fix, but this allows me to proceed with
additional fixes for aggregation support, and together with those fixes
this seems better than the lossy behaviour we have now.

diff 0eca04344da7ad4deb76485dcef00cdf88803be4 6f793971788fd7061f66330336cbeb5103b717c3
blob - 14c2d9a35e2973feb1ec2347eeef3d6041864291
blob + 110bbe97b980b338acd8aa61fbabd33926c207be
--- sys/dev/pci/if_iwn.c
+++ sys/dev/pci/if_iwn.c
@@ -3513,10 +3513,12 @@ iwn_tx(struct iwn_softc *sc, struct mbuf *m, struct ie
        else
                tx->rflags = rinfo->flags;
        /*
-        * Skip rate control if our Tx rate is fixed.
-        * Keep the Tx rate constant while mira is probing.
+        * Keep the Tx rate constant while mira is probing, or if this is
+        * an aggregation queue in which case a fixed Tx rate works around
+        * FIFO_UNDERRUN Tx errors.
         */
        if (tx->id == sc->broadcast_id || ieee80211_mira_is_probing(&wn->mn) ||
+           qid >= sc->first_agg_txq ||
            ic->ic_fixed_mcs != -1 || ic->ic_fixed_rate != -1) {
                /* Group or management frame, or probing, or fixed Tx rate. */
                tx->linkq = 0;
