On Tue, Aug 20, 2019 at 12:31:56PM +0200, Felix Fietkau wrote:
> >> >> void mt76x0e_wake_tx_queue(struct ieee80211_hw *hw,
> >> >>                            struct ieee80211_txq *txq)
> >> >> {
> >> >>         if (is_mt7630(dev)) {
> >> >>             mt76_txq_schedule(dev, txq->ac);
> >> >>         } else {
> >> >>             tasklet_schedule(&dev->tx_tasklet);
> >> >>         }
> >> >> }
> >> > 
> >> > I'm not sure about the reduction of lock contention for which the
> >> > tx_tasklet was introduced here, but it looks OK to me as a fix.
> >> I think if we work around the bug like this, it can easily come back to
> >> bite us again later. 
> > 
> > > I'm not into workarounds of any kind, but this is a really strange
> > > issue, maybe a FW bug that is triggered just by slightly different
> > > driver behaviour.
> > 
> >> I don't see any logical explanation as to how this
> >> makes a difference with hardware encryption.
> >> Also, I think it would be helpful to figure out what key operation (if
> >> any) triggers this, adding or removing keys.
> > 
> > > It seems not to be related to the set_key operation at all. We set 2 HW
> > > keys at the beginning, and the hang happens after some tx/rx traffic
> > > without any re-keying.
> > 
> > > I'm not sure why disabling HW encryption helps. Maybe it is due to
> > > ordering or timing. With SW encryption we spend more time in mac80211
> > > before passing skbs to the driver. Or maybe we just mix some HW keys
> > > and SW (group) keys in a way that FW does not like.
> > 
> >> Maybe it could also help if we change the order in which the WCID table
> >> entries are updated, i.e. changing MT_WCID_ATTR first when removing keys.
> >> 
> >> Maybe temporarily clearing MT_MAC_SYS_CTRL_ENABLE_TX before the key
> >> update and setting it again afterwards could also help.
> > 
> > > I tested the patch below and it did not help.
> Can you test if disabling hw encryption only for shared or only for
> pairwise keys makes any difference?

Disabling HW encryption only for pairwise keys helps. Disabling it only
for shared keys does not help.

I'm not sure whether this is helpful or just makes things more
confusing, but it seems the difference between mt76_txq_schedule()
and tasklet_schedule() in mt76_wake_tx_queue() is that with
mt76_txq_schedule() some tx packets are serialized by dev->rx_lock
(because some ARP and TCP packets are sent via the network stack in
response to an incoming packet within the ieee80211_rx_napi() call).
Removing spin_lock(&dev->rx_lock) in mt76_rx_complete() makes the
problem reproducible again with mt76_txq_schedule() & HW encryption.
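
To illustrate the effect described above, here is a toy single-threaded
user-space model. It is only a sketch, not driver code: every model_*
name is made up for illustration, the lock is reduced to a boolean flag,
and the tasklet is reduced to a "pending" bit that runs once after rx
completion returns.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: all names are hypothetical, not the real mt76 API. */
struct model_dev {
	bool rx_lock_held;         /* stands in for dev->rx_lock */
	bool tasklet_pending;      /* stands in for dev->tx_tasklet */
	int sched_under_rx_lock;   /* tx scheduler runs with rx_lock held */
	int sched_outside_rx_lock; /* tx scheduler runs without rx_lock */
};

/* Stand-in for mt76_txq_schedule(): record the locking context. */
static void model_txq_schedule(struct model_dev *dev)
{
	if (dev->rx_lock_held)
		dev->sched_under_rx_lock++;
	else
		dev->sched_outside_rx_lock++;
}

/* Stand-in for the wake_tx_queue callback in its two variants. */
static void model_wake_tx_queue(struct model_dev *dev, bool direct)
{
	if (direct)
		model_txq_schedule(dev);     /* runs in the caller's context */
	else
		dev->tasklet_pending = true; /* deferred, wakes coalesce */
}

/* Deferred tx work: runs once if any wake happened. */
static void model_run_tasklet(struct model_dev *dev)
{
	if (dev->tasklet_pending) {
		dev->tasklet_pending = false;
		model_txq_schedule(dev);
	}
}

/*
 * Stand-in for rx completion: each received frame is assumed to trigger
 * one response from the stack (e.g. an ARP reply or a TCP ACK), which
 * wakes the tx queue before rx completion has returned.
 */
static void model_rx_complete(struct model_dev *dev, int frames,
			      bool direct, bool take_rx_lock)
{
	assert(frames >= 0);

	dev->rx_lock_held = take_rx_lock;
	for (int i = 0; i < frames; i++)
		model_wake_tx_queue(dev, direct);
	dev->rx_lock_held = false;

	model_run_tasklet(dev); /* tasklet runs after rx completion */
}
```

With direct scheduling and the lock taken, every scheduler run triggered
from the rx path happens under the rx lock. With the tasklet variant, or
with the lock removed, none of them do. Those last two are the cases
where I see the hang with HW encryption.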

Stanislaw
