Thomas,

On 3/13/18, 5:53 PM, "Thomas Pedersen" <tho...@eero.com> wrote:

    Javier,
    
    On Tue, Mar 13, 2018 at 4:42 PM, Javier Cardona <jcard...@fb.com> wrote:
    > Hi,
    >
    > We have resolved this issue.  I'm sharing the details in case that might 
help others.
    >
    > The description of the problem was accurate EXCEPT that the acks that we 
observed on the sniffer were not being sent by the failed station (MAP2, in the 
context of my original e-mail) but by a third station MAP3.  Those Acks were 
sent by MAP3 but with MAP2's address in the Transmitter Address field.
    >
    > These anomalous Block Acks were sent because the MAC-ADDRESS-FILTER was 
misconfigured at MAP3, which caused that station to respond to addresses 
different than its own.  The reasons for this misconfiguration were:
    >   (1) in mesh (and other) mode(s), the driver creates a hidden monitor 
vif alongside the mesh vif
    
    At least since some 10.4 firmware, the hidden monitor vdev is no longer
    required. See 
https://www.spinics.net/lists/linux-wireless/msg156475.html
    
Oh yes, you are right.  That monitor vif is created only when using ath10k-ct 
firmware.  The original stock firmware does not create the additional vif.  
However, the original issue (MAC-ADDRESS-FILTER getting corrupted, which 
triggers anomalous Block Acks) still occurs with the stock driver and 10.4 
firmware.
It seems the firmware still uses the original MAC address provided by the 
driver on load, even if the mesh interface uses a different address.

If the driver uses the bogus address in the pre-cal file (12:34:56:78:90:12), 
the MAC filters are misconfigured:

MAC FILTERS ==
phy0:0x00032018:0xb93efa8d
phy0:0x0003201c:0x00009bec

Once the pre-cal file is patched to match the address that will be assigned to 
the mesh vif, the mac filter is now correct:

MAC FILTERS ==
phy0:0x00032018:0xffffffff
phy0:0x0003201c:0x0000ffff
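
To make the two dumps easier to compare, here is a minimal sketch of how such a filter could be derived. It assumes the filter is the bitwise NOT of the XOR of the two addresses, packed little-endian into a 32-bit low word and a 16-bit high word. That rule, the packing, and all helper names are assumptions for illustration; none of them are taken from the driver or firmware source.

```python
def pack(mac):
    """Split 'aa:bb:cc:dd:ee:ff' into (low 32 bits, high 16 bits), little-endian."""
    raw = bytes.fromhex(mac.replace(":", ""))
    return int.from_bytes(raw[:4], "little"), int.from_bytes(raw[4:], "little")


def unpack(l32, u16):
    """Inverse of pack(): rebuild the colon-separated address string."""
    raw = l32.to_bytes(4, "little") + u16.to_bytes(2, "little")
    return ":".join(f"{byte:02x}" for byte in raw)


def filter_mask(addr_a, addr_b):
    """Assumed rule: a mask bit is set only where the two addresses agree."""
    (a_lo, a_hi), (b_lo, b_hi) = pack(addr_a), pack(addr_b)
    return ~(a_lo ^ b_lo) & 0xFFFFFFFF, ~(a_hi ^ b_hi) & 0xFFFF


def other_address(known, l32, u16):
    """Invert the assumed rule: recover the second address from the mask."""
    k_lo, k_hi = pack(known)
    return unpack(k_lo ^ (~l32 & 0xFFFFFFFF), k_hi ^ (~u16 & 0xFFFF))


precal = "12:34:56:78:90:12"                 # bogus pre-cal address from above
same = filter_mask(precal, precal)           # identical addresses
recovered = other_address(precal, 0xB93EFA8D, 0x00009BEC)  # mismatched dump

print(f"{same[0]:#010x} {same[1]:#010x}")
print(recovered)
```

With identical addresses the sketch yields 0xffffffff and 0x0000ffff, matching the corrected dump. Inverting the mismatched dump yields an address beginning with 60:31:97:3e, the same prefix as the peer address in the rx logs of the original report, which is at least consistent with the assumed rule.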

Thanks!

Javier

    >   (2) this second monitor vif is assigned the default mac address 
reported by the firmware (arvif->mac_addr)
    >   (3) this mac address (which the driver was getting from the pre-cal 
files) is XOR'd with the mesh vif address to configure MAC-ADDRESS-FILTER
    >   (4) once that happens, the hardware will ack all addresses that pass 
the MAC-ADDRESS-FILTER.  If the two mac addresses (vif->addr and 
arvif->mac_addr) are very dissimilar, that will result in a storm of invalid 
Block Acks
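
Step (4) can be made concrete with a rough count. This is a sketch that assumes a cleared bit in the filter means "don't care" for that address bit, so each cleared bit doubles the number of receiver addresses the hardware would acknowledge. That interpretation is an assumption about the hardware, and the function name is hypothetical. The two inputs are the mismatched and corrected register values quoted in this thread.

```python
def accepted_addresses(l32, u16):
    """Count the 48-bit addresses that pass a filter with these mask words.

    Assumption: set bits must match exactly, cleared bits are ignored.
    """
    mask = ((u16 & 0xFFFF) << 32) | (l32 & 0xFFFFFFFF)
    dont_care_bits = 48 - bin(mask).count("1")
    return 2 ** dont_care_bits


exact = accepted_addresses(0xFFFFFFFF, 0x0000FFFF)   # corrected filter
loose = accepted_addresses(0xB93EFA8D, 0x00009BEC)   # misconfigured filter

print(exact, loose)
```

Under that assumption the corrected filter accepts exactly one address, while the misconfigured one leaves 18 bits unconstrained, which is why very dissimilar addresses are so much worse than addresses that differ in a bit or two.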
    >
    > We resolved the issue by patching the pre-cal data with the same address 
as the mesh interface, so that vif->addr == arvif->mac_addr.  This is more of a 
workaround than a real fix, because a misconfigured 
MAC-ADDRESS-FILTER can easily go unnoticed.  In fact, what unblocked us on this 
issue was switching to Candela Tech's custom firmware and driver (ath10k-ct).  
This provides a nice interface to the hardware registers:
    >
    > cat /debug/ieee80211/wiphy1/ath10k/fw_regs
    >
    >         ath10k Target Register Dump
    >         =================
    >         MAC-FILTER-ADDR-L32 0xffffffff
    >         MAC-FILTER-ADDR-U16 0x0000ffff
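
Since this misconfiguration is easy to miss, a dump like the one above could be checked automatically. The following is a minimal sketch that assumes the fw_regs output has exactly the line format shown here; the function names are hypothetical, and the sample text is copied from the dump above rather than read from debugfs.

```python
import re


def parse_mac_filter(dump):
    """Return {'L32': int, 'U16': int} from MAC-FILTER-ADDR-* lines in a dump."""
    regs = {}
    pattern = r"MAC-FILTER-ADDR-(L32|U16)\s+(0x[0-9a-fA-F]+)"
    for name, value in re.findall(pattern, dump):
        regs[name] = int(value, 16)
    return regs


def is_exact_match(regs):
    """True when every filter bit is set, i.e. only one address is accepted."""
    return regs.get("L32") == 0xFFFFFFFF and regs.get("U16") == 0xFFFF


sample = """\
ath10k Target Register Dump
=================
MAC-FILTER-ADDR-L32 0xffffffff
MAC-FILTER-ADDR-U16 0x0000ffff
"""

regs = parse_mac_filter(sample)
print(is_exact_match(regs))
```

Anything other than all-ones in both registers would be worth a closer look before chasing rate control or firmware bugs.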
    >
    > We would probably still be trying to solve this if Ben Greear had not 
pointed us in the right direction.
    >
    > Cheers,
    >
    > Javier
    >
    >
    > On 2/7/18, 2:15 PM, "Javier Cardona" <jcard...@fb.com> wrote:
    >
    >     Hi,
    >
    >     We have observed a problem where, under certain conditions, the 
ath10k firmware will acknowledge frames but not send them up to the driver.
    >     Frames are sent by a mesh access point (MAP1) to a second mesh AP 
(MAP2) at MCS 9/NSS-3, which at that distance is probably marginal.  Since 
frames get acknowledged by MAP2, MAP1 will not try a lower rate.  But the 
driver at MAP2 does not receive the frames.
    >
    >     We have captures of this exchange for both the unsuccessful as well 
as the successful case, which happens when we move MAP2 closer to MAP1.   They 
can be found here:
    >     
https://www.dropbox.com/sh/0or86c8vxotygdc/AABI7RmQ2nztcOBF3UDXbPUma
    >
    >     In both scenarios, the frames from MAP1 to MAP2 are acknowledged, as 
observed in sniffer captures.
    >
    >     In the successful scenario, the driver logs show the frames being 
received by the driver.  Sequence numbers in the debug logs match those in the 
sniffer captures.
    >
    >     root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | 
grep ucast | grep 'len 374'
    >     Feb 05 05:43:31 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb 
d734f6c0 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1027 vht sgi 
rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 
amsdu-more 0
    >     Feb 05 05:43:42 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb 
d6aaa480 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1031 vht sgi 
rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 
amsdu-more 0
    >     Feb 05 05:43:53 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb 
d53a8d80 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1037 vht sgi 
rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 
amsdu-more 0
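
Matching sequence numbers against a sniffer capture is easier with the fields pulled out of the log. This is a minimal sketch that assumes the rx debug lines have exactly the format shown above; the regular expression and the function name are illustrative only and may not match other driver versions.

```python
import re

# Fields of interest from an ath10k "rx skb" debug line (format assumed
# from the log excerpt above).
RX_LINE = re.compile(
    r"rx skb \S+ len (?P<len>\d+) peer (?P<peer>[0-9a-f:]{17}) .*?"
    r"\bsn (?P<sn>\d+) .*?rate_idx (?P<rate_idx>\d+) vht_nss (?P<nss>\d+)"
)


def rx_records(lines):
    """Yield one dict per rx line; lines that do not match are skipped."""
    for line in lines:
        m = RX_LINE.search(line)
        if m:
            yield {
                "peer": m["peer"],
                "len": int(m["len"]),
                "sn": int(m["sn"]),
                "mcs": int(m["rate_idx"]),
                "nss": int(m["nss"]),
            }


log = [
    "Feb 05 05:43:31 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb "
    "d734f6c0 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1027 vht sgi "
    "rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 "
    "amsdu-more 0",
    "Feb 05 05:43:42 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb "
    "d6aaa480 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1031 vht sgi "
    "rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 "
    "amsdu-more 0",
    "Feb 05 05:43:53 nbg-3ed9da kernel: ath10k_pci 0000:01:00.0: rx skb "
    "d53a8d80 len 374 peer 60:31:97:3e:82:e6 tid 0 (BE) ucast sn 1037 vht sgi "
    "rate_idx 9 vht_nss 3 freq 5825 band 1 flag 0x600800 fcs-err 0 mic-err 0 "
    "amsdu-more 0",
]

records = list(rx_records(log))
print([r["sn"] for r in records])
```

The resulting list of sequence numbers can then be compared against the sequence numbers of the acknowledged frames in the capture.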
    >
    >     In the failure scenario, the driver logs show no frames, even though the 
capture shows that the frames are acknowledged.
    >
    >     root@nbg-3ed9da:~# journalctl -kf | grep "peer 60:31:97:3e:82:e6" | 
grep ucast | grep 'len 374'
    >     <nothing here>
    >
    >     If we force MAP1 to use a single stream, the frames are received 
successfully.
    >
    >       # iw mesh0 set bitrates legacy-5 ht-mcs-5 vht-mcs-5 1:0-9
    >
    >     It seems as if the firmware is acknowledging but silently discarding 
frames… is that possible?
    >     Can anyone provide some pointers on how to troubleshoot this?
    >
    >     We are using this firmware: 
https://github.com/kvalo/ath10k-firmware/blob/master/QCA9984/hw1.0/3.4/firmware-5.bin_10.4-3.4-00104
 and kernel 4.9.31 with a few cherry-picked patches from the ath10k branch.
    >     The hardware is QCA9984.
    >
    >     Best,
    >
    >     Javier
    >
    >
    >
    >
    >
    >
    > _______________________________________________
    > ath10k mailing list
    > ath10k@lists.infradead.org
    > http://lists.infradead.org/mailman/listinfo/ath10k
    
    
    
    -- 
    thomas
    
