On Tue, Feb 25, 2020 at 01:06:31PM +0100, Tobias Heider wrote:
> Hi,
>
> I tried to figure out the reason for the lost Tx receive interrupt
> by comparing iwm with iwlwifi and I think our handling of the
> periodic RX interrupt is a bit off.

Very nice find. Not sure if the lost Tx interrupt is related, though.

> In linux on receive of any of the possible RX interrupts the periodic
> interrupt is disabled and then reenabled. As far as I understand this
> is to make sure that no matter what kind of RX interrupt was received,
> we make sure there will be another one in 8 ms when the newly activated
> periodic interrupt will trigger.

So, as far as I understand:
The interrupt is called "periodic" but that is in fact misleading.
The firmware offers a periodic interrupt mechanism which the Linux
driver uses as a one-shot. The driver's goal is to generate just *one*
interrupt in another 8 ms to address a producer / consumer race
between firmware and driver during Rx.

See Linux commit 74ba67edfcb235c0415a62d37493866c8380dc1d.
Could we copy some of the elaborate comments from there?

> iwm on the other hand disables the periodic interrupt only when a
> periodic interrupt was received AND at the same time SW_RX or FH_RX is
> set, and then reenables it only on a SW_RX or FH_RX interrupt that was
> not periodic.

Do you see a lower interrupt rate under Rx load with this diff?

Currently, if the device is loaded with as much Rx as we can take then
the system spends a lot of time in interrupt context. There have been
reports of systems becoming noticeably less responsive in such situations.
Is the periodic interrupt perhaps firing too often because the bug you
identified leaves it enabled for too long?

> A discrepancy between iwm and iwlwifi is that in iwm
> IWM_CSR_INT_BIT_RX_PERIODIC
> alone can lead to a write of IWM_CSR_FH_INT_RX_MASK to IWM_CSR_FH_INT_STATUS
> (which might be the receive confirmation), which in iwlwifi only happens if
> SW_RX or FH_RX are set.
>
> The attached diff makes iwm behave more like iwlwifi in the cases described
> above.
>
> ok?

Which devices have you tested this with?
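
To make sure we are talking about the same thing, here is a rough
standalone model of the one-shot handling as I understand it. This is
not driver code: the struct, the function name, and the bit values are
made up for illustration, and real register writes are replaced by
fields in a struct. I am also assuming that the timer is re-armed only
after a real Rx interrupt (FH_RX or SW_RX), since re-arming it on the
periodic interrupt itself would not make it a one-shot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; NOT the real CSR register layout. */
#define INT_BIT_FH_RX		(1u << 0)
#define INT_BIT_SW_RX		(1u << 1)
#define INT_BIT_RX_PERIODIC	(1u << 2)

/* Hypothetical stand-in for the device state touched by the handler. */
struct model_regs {
	bool periodic_armed;	/* periodic interrupt currently enabled */
	unsigned int rx_acks;	/* Rx mask writes to the FH status register */
};

/* Returns true if any kind of Rx interrupt was handled. */
static bool
model_rx_intr(struct model_regs *r, uint32_t cause)
{
	const uint32_t real_rx = INT_BIT_FH_RX | INT_BIT_SW_RX;

	if ((cause & (real_rx | INT_BIT_RX_PERIODIC)) == 0)
		return false;

	/* Confirm reception only for a real Rx interrupt. */
	if (cause & real_rx)
		r->rx_acks++;

	/* Any Rx interrupt disarms the timer: it is used as a one-shot. */
	r->periodic_armed = false;

	/* Assumption: re-arm (8 ms) only after a real Rx interrupt. */
	if (cause & real_rx)
		r->periodic_armed = true;

	return true;
}
```

In this model a periodic interrupt on its own never confirms reception
and never re-arms itself, so an idle device settles with the timer off.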
