Re: iwn: improve processing of block ack notifications

2020-04-23 Thread f.holop
Stefan Sperling - Fri, 17 April 2020 at 16:03:30
> This applies on top of a previous diff I sent, the one to make use of
> the entire firmware retry table.

I have contacted Stefan privately about a seemingly different issue (and
i wasn't sure it was a bug to begin with) and after he recommended
trying this patch (and the previous one) here are some preliminary
findings:

after upgrading to current, and before the patches, my iwn0 was having
intermittent outages.  I havent used this notebook since 6.4, but back
then i used it daily and i didn't remember so many outages.

after applying the patches and testing for a couple of days, there have
been no outages so far.  but not only that.  booted up the non-patched
kernel to do some non-scientific speed-tests as well:

$ speedtest-cli
Retrieving speedtest.net configuration...
Testing from Ziggo (213.127.117.151)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by fdcservers.net (Amsterdam) [4.51 km]: 38.643 ms
Testing download 
speed
Download: 19.49 Mbit/s
Testing upload 
speed..
Upload: 4.94 Mbit/s

Download: 12.20 Mbit/s
Upload: 3.58 Mbit/s

Download: 24.27 Mbit/s
Upload: 2.42 Mbit/s

Download: 7.85 Mbit/s
Upload: 2.94 Mbit/s

Download: 12.63 Mbit/s
Upload: 2.97 Mbit/s


after patching:

Download: 28.31 Mbit/s
Upload: 4.41 Mbit/s

Download: 25.99 Mbit/s
Upload: 4.02 Mbit/s

Download: 24.11 Mbit/s
Upload: 4.58 Mbit/s

Download: 29.09 Mbit/s
Upload: 4.31 Mbit/s

Download: 25.77 Mbit/s
Upload: 4.38 Mbit/s

it's not just faster, it's more consistent.

Thank you Stefan for putting this much work into the rate setting, it
seems to be one of the main keys to higher performance...



OpenBSD 6.7-beta (GENERIC.MP) #0: Tue Apr 21 16:21:48 CEST 2020
f...@atm.obiit.org:/home/f/src/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4166717440 (3973MB)
avail mem = 4027813888 (3841MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (80 entries)
bios0: vendor LENOVO version "6EET56WW (3.16 )" date 10/26/2012
bios0: LENOVO 2777CTO
acpi0 at bios0: ACPI 3.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET SLIC BOOT ASF! SSDT TCPA DMAR 
SSDT SSDT SSDT
acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) 
PCI1(S4) USB0(S3) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3) EHC0(S3) 
EHC1(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU U9400 @ 1.40GHz, 1396.70 MHz, 06-17-06
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu0: 3MB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
cpu0: apic clock running at 199MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU U9400 @ 1.40GHz, 1396.51 MHz, 06-17-06
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu1: 3MB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins, remapped
acpimcfg0 at acpi0
acpimcfg0: addr 0xe000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (AGP_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus 3 (EXP1)
acpiprt4 at acpi0: bus -1 (EXP2)
acpiprt5 at acpi0: bus 21 (PCI1)
acpicpu0 at acpi0: !C3(100@57 mwait.3@0x30), !C2(500@1 mwait.1@0x10), C1(1000@1 
mwait.1), PSS
acpicpu1 at acpi0: !C3(100@57 mwait.3@0x30), !C2(500@1 mwait.1@0x10), C1(1000@1 
mwait.1), PSS
acpipwrres0 at acpi0: PUBS, resource for USB0, USB3, USB5, EHC0, EHC1
acpitz0 at acpi0: critical temperature is 127 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
acpibat0 at acpi0: BAT0 model "93P5030" serial  1268 type LION oem "SANYO"
acpiac0 at acpi0: AC unit online
acpithinkpad0 at acpi0: version 1.0
"PNP0C14" at acpi0 not configured
acpivideo0 at acpi0: VID_
acpivout0 at acpivideo0: LCD0
acpivideo1 at acpi0: VID_
cpu0: Enhanced SpeedStep 1396 MHz: speeds: 1401, 1400, 1200, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel GM45 Host" rev 0x07
inteldrm0 at pci0 dev 2 function 0 "Intel GM45 Video" rev 

Re: iwn: improve processing of block ack notifications

2020-04-17 Thread Stefan Sperling
On Fri, Apr 17, 2020 at 06:30:22PM +0200, Stefan Sperling wrote:
> On Fri, Apr 17, 2020 at 04:03:30PM +0200, Stefan Sperling wrote:
> > In particular, this fixes wrong assumptions about what the data in the
> > firmware's "compressed block ack" notifications is supposed to represent,
> > and actually pieces information about individual subframes of aggregated
> > frames (A-MPDUs) back together when reporting to MiRA, rather than reporting
> > unrelated subframes to MiRA individually. (Recall that MiRA was primarily
> > designed to operate on units of entire A-MDPUs.)
> 
> Unfortunately, this diff can trigger a page fault. I am looking for a fix.

This new version has been tested by cwen and Josh Grosse and seems to
fix the problems they found with the previous diff. Thanks!

diff 24c5886569df7f9428f24e6e74c71e02cba8c34c 
719fff35cc167c0814926c547e92b317aad32553
blob - 8476916e4337cb6e681c11f26fc1fca5f4a48334
blob + b8beb56aa1112e65c2907b2955694b6a8907d9ae
--- sys/dev/pci/if_iwn.c
+++ sys/dev/pci/if_iwn.c
@@ -2282,7 +2282,11 @@ iwn_mira_choose(struct iwn_softc *sc, struct ieee80211
iwn_set_link_quality(sc, ni);
 }
 
-/* Process an incoming Compressed BlockAck. */
+/*
+ * Process an incoming Compressed BlockAck.
+ * Note that these block ack notifications are generated by firmware and do
+ * not necessarily correspond to contents of block ack frames seen on the air.
+ */
 void
 iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_rx_desc *desc,
 struct iwn_rx_data *data)
@@ -2293,7 +2297,8 @@ iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_
struct ieee80211_tx_ba *ba;
struct iwn_node *wn;
struct iwn_tx_ring *txq;
-   uint16_t ssn, idx;
+   uint16_t seq, ssn, idx, end_idx;
+   int min_ampdu_id, max_ampdu_id, id;
int qid;
 
if (ic->ic_state != IEEE80211_S_RUN)
@@ -2322,66 +2327,103 @@ iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_
if (ba->ba_state != IEEE80211_BA_AGREED)
return;
 
-   ssn = le16toh(cba->ssn); /* BA window starting sequence number */
-   if (!SEQ_LT(ssn, ba->ba_winstart)) {
-   ieee80211_output_ba_move_window(ic, ni, cba->tid, ssn);
+   /*
+* The first bit in cba->bitmap corresponds to the sequence number
+* stored in the sequence control field cba->seq.
+* Any frames older than this can now be discarded; they should
+* already have been reported as failures or been acknowledged.
+*
+* Multiple BA notifications in a row may be using this number, with
+* additional bits being set in cba->bitmap. It is unclear how the
+* firmware decides to shift this window forward.
+*/
+   seq = le16toh(cba->seq) >> IEEE80211_SEQ_SEQ_SHIFT;
+   if (!SEQ_LT(seq, ba->ba_winstart)) {
+   ieee80211_output_ba_move_window(ic, ni, cba->tid, seq);
iwn_ampdu_txq_advance(sc, txq, qid,
-   IWN_AGG_SSN_TO_TXQ_IDX(ssn));
+   IWN_AGG_SSN_TO_TXQ_IDX(seq));
iwn_clear_oactive(sc, txq);
}
+   /* Our BA window should now correspond to the bitmap. */
+   if (ba->ba_winstart != seq)
+   return;
 
-   /* ba->ba_winstart should now correspond to cba->ssn */
-   if (ba->ba_winstart != cba->ssn)
+   /* Skip rate control if our Tx rate is fixed. */
+   if (ic->ic_fixed_mcs != -1)
return;
 
/*
-* Update Tx rate statistics.
-* Skip rate control if our Tx rate is fixed.
+* The firmware's new BA window starting sequence number
+* corresponds to the first hole in cba->bitmap, implying
+* that all frames between 'seq' and 'ssn' have been acked.
 */
-   if (ic->ic_fixed_mcs == -1 && cba->nframes_sent > 0) {
-   int end_idx = IWN_AGG_SSN_TO_TXQ_IDX(ba->ba_winend);
-   int bit = 0, nsent = cba->nframes_sent;
+   ssn = le16toh(cba->ssn);
 
+   /* Determine the min/max IDs we assigned to AMPDUs in this range. */
+   idx = IWN_AGG_SSN_TO_TXQ_IDX(seq);
+   end_idx = IWN_AGG_SSN_TO_TXQ_IDX(ssn);
+   min_ampdu_id = txq->data[idx].ampdu_id;
+   max_ampdu_id = min_ampdu_id;
+   while (idx != end_idx) {
+   struct iwn_tx_data *txdata = >data[idx];
+
+   if (txdata->m != NULL) {
+   if (min_ampdu_id > txdata->ampdu_id)
+   min_ampdu_id = txdata->ampdu_id;
+   if (max_ampdu_id < txdata->ampdu_id)
+   max_ampdu_id = txdata->ampdu_id;
+   }
+
+   idx = (idx + 1) % IWN_TX_RING_COUNT;
+   }
+
+   /*
+* Update Tx rate statistics for A-MPDUs before firmware's BA window.
+*/
+   for (id = min_ampdu_id; id <= max_ampdu_id; id++) {
+   int have_ack = 0, bit = 0;
+   idx = IWN_AGG_SSN_TO_TXQ_IDX(seq);
+   

Re: iwn: improve processing of block ack notifications

2020-04-17 Thread Stefan Sperling
On Fri, Apr 17, 2020 at 04:03:30PM +0200, Stefan Sperling wrote:
> In particular, this fixes wrong assumptions about what the data in the
> firmware's "compressed block ack" notifications is supposed to represent,
> and actually pieces information about individual subframes of aggregated
> frames (A-MPDUs) back together when reporting to MiRA, rather than reporting
> unrelated subframes to MiRA individually. (Recall that MiRA was primarily
> designed to operate on units of entire A-MDPUs.)

Unfortunately, this diff can trigger a page fault. I am looking for a fix.



iwn: improve processing of block ack notifications

2020-04-17 Thread Stefan Sperling
This applies on top of a previous diff I sent, the one to make use of
the entire firmware retry table.

Recently I have spent a good amount of time trying to reverse-engineer
the sparsely documented behaviour of both iwn(4) and iwm(4) firmware with
respect to Tx aggregation. For details, see my comments in the diff.

I have also put some thought into how we can reconcile the firmware's
representation of relevant information with what MiRA expects to see.

This diff is the best I could come up with so far. There were at lot of
earlier attempts and I'm still not sure if this is the best possible
solution, but it seems more correct than what we have now.

In particular, this fixes wrong assumptions about what the data in the
firmware's "compressed block ack" notifications is supposed to represent,
and actually pieces information about individual subframes of aggregated
frames (A-MPDUs) back together when reporting to MiRA, rather than reporting
unrelated subframes to MiRA individually. (Recall that MiRA was primarily
designed to operate on units of entire A-MDPUs.)
 
diff 24c5886569df7f9428f24e6e74c71e02cba8c34c 
636d6d139ade79b967dc9eb8b553e48dcd939227
blob - 8476916e4337cb6e681c11f26fc1fca5f4a48334
blob + f1a30eea896c8502e8f15bb3d1133b74120253f4
--- sys/dev/pci/if_iwn.c
+++ sys/dev/pci/if_iwn.c
@@ -2282,7 +2282,11 @@ iwn_mira_choose(struct iwn_softc *sc, struct ieee80211
iwn_set_link_quality(sc, ni);
 }
 
-/* Process an incoming Compressed BlockAck. */
+/*
+ * Process an incoming Compressed BlockAck.
+ * Note that these block ack notifications are generated by firmware and do
+ * not necessarily correspond to contents of block ack frames seen on the air.
+ */
 void
 iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_rx_desc *desc,
 struct iwn_rx_data *data)
@@ -2293,7 +2297,8 @@ iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_
struct ieee80211_tx_ba *ba;
struct iwn_node *wn;
struct iwn_tx_ring *txq;
-   uint16_t ssn, idx;
+   uint16_t seq, ssn, idx, end_idx;
+   int min_ampdu_id, max_ampdu_id, id;
int qid;
 
if (ic->ic_state != IEEE80211_S_RUN)
@@ -2322,66 +2327,98 @@ iwn_rx_compressed_ba(struct iwn_softc *sc, struct iwn_
if (ba->ba_state != IEEE80211_BA_AGREED)
return;
 
-   ssn = le16toh(cba->ssn); /* BA window starting sequence number */
-   if (!SEQ_LT(ssn, ba->ba_winstart)) {
-   ieee80211_output_ba_move_window(ic, ni, cba->tid, ssn);
+   /*
+* The first bit in cba->bitmap corresponds to the sequence number
+* stored in the sequence control field cba->seq.
+* Any frames older than this can now be discarded; they should
+* already have been reported as failures or been acknowledged.
+*
+* Multiple BA notifications in a row may be using this number, with
+* additional bits being set in cba->bitmap. It is unclear how the
+* firmware decides to shift this window forward.
+*/
+   seq = le16toh(cba->seq) >> IEEE80211_SEQ_SEQ_SHIFT;
+   if (!SEQ_LT(seq, ba->ba_winstart)) {
+   ieee80211_output_ba_move_window(ic, ni, cba->tid, seq);
iwn_ampdu_txq_advance(sc, txq, qid,
-   IWN_AGG_SSN_TO_TXQ_IDX(ssn));
+   IWN_AGG_SSN_TO_TXQ_IDX(seq));
iwn_clear_oactive(sc, txq);
}
+   /* Our BA window should now correspond to the bitmap. */
+   if (ba->ba_winstart != seq)
+   return;
 
-   /* ba->ba_winstart should now correspond to cba->ssn */
-   if (ba->ba_winstart != cba->ssn)
+   /* Skip rate control if our Tx rate is fixed. */
+   if (ic->ic_fixed_mcs != -1)
return;
 
/*
-* Update Tx rate statistics.
-* Skip rate control if our Tx rate is fixed.
+* The firmware's new BA window starting sequence number
+* corresponds to the first hole in cba->bitmap, implying
+* that all frames between 'seq' and 'ssn' have been acked.
 */
-   if (ic->ic_fixed_mcs == -1 && cba->nframes_sent > 0) {
-   int end_idx = IWN_AGG_SSN_TO_TXQ_IDX(ba->ba_winend);
-   int bit = 0, nsent = cba->nframes_sent;
+   ssn = le16toh(cba->ssn);
 
+   /* Determine the min/max IDs we assigned to AMPDUs in this range. */
+   idx = IWN_AGG_SSN_TO_TXQ_IDX(seq);
+   end_idx = IWN_AGG_SSN_TO_TXQ_IDX(ssn);
+   min_ampdu_id = txq->data[idx].ampdu_id;
+   max_ampdu_id = min_ampdu_id;
+   while (idx != end_idx) {
+   struct iwn_tx_data *txdata = >data[idx];
+
+   if (min_ampdu_id > txdata->ampdu_id)
+   min_ampdu_id = txdata->ampdu_id;
+   if (max_ampdu_id < txdata->ampdu_id)
+   max_ampdu_id = txdata->ampdu_id;
+
+   idx = (idx + 1) % IWN_TX_RING_COUNT;
+   }
+
+   /*
+* Update Tx rate statistics for A-MPDUs before