> Thank you for the extensive debugging. We are looking into this. Arend wrote
> yesterday to ask for detailed timing on when eapol is inserted. We want this
> so we can increase the timeout. This is not a "nice" way to solve the
> problem, and it should be solved in firmware, but in the meanwhile we do
> want to increase timer, because we think that ampdu issues can rise at any
> given moment and even with changes/updates in firmware it might be necessary
> to increase timeout.

I'm kindly asking to keep replies in the related threads :) I'm pretty sure the
above is about the problem described in "AMPDU stalls with
brcmfmac4366b-pcie.bin triggering

> Second problem is harder, it is good to see that the frame gets returned to
> driver at some point. Our biggest worry is that a frame remains indefinitely
> in the firmware, but that appears not to be the case. Now why could this
> fail. There is one possible reason I found, and that is when a flowring is
> deleted while it holds the eapol, see flowring.c. It does not call the
> brcmf_txfinalize, but frees the packet directly. I think this is wrong but
> need to investigate this in more detail. In the meanwhile, if you keep doing
> tests I would like to ask you to add a WARN_ON() call to the function
> __brcmu_pkt_buf_free_skb where you print ***BUG*** so we know where the
> packet got freed from.

Please take a look at my e-mail & log (& maybe diff) once again. You really
missed the point.

The function brcmf_txfinalize *was* called. I described it in my e-mail and
there is a log:
[ 1440.414653] brcmfmac: [__brcmf_txfinalize -> __brcmu_pkt_buf_free_skb] [ifp:c72e7c80] 
***BUG*** skb:c70ddc00 skb->dev:c72e7800 skb->dev->name:wlan1-1
Above means that brcmf_txfinalize was called for skb c70ddc00 and it called

My debugging code noticed that something was wrong, as this packet was still
pending and pend_8021x_cnt wasn't decreased for it. Please note it was
brcmf_txfinalize's fault (it was definitely called). For some reason it didn't
pass the if (type == ETH_P_PAE) condition. I already described it and shared my
guess that the firmware is corrupting skb data. I'm now using a debugging patch
which prints the copied and current content of skb data in case of a fault.
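
To make the idea concrete, here is a rough userspace C sketch of that kind of
check. It is not the actual debugging patch: dbg_tx_record, dbg_snapshot and
dbg_check_at_finalize are hypothetical names, and it assumes that the packet
data starts with a plain 14-byte Ethernet header, so the EtherType sits at
bytes 12..13.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ETH_HLEN   14      /* dst MAC + src MAC + EtherType */
#define ETH_P_PAE  0x888E  /* EtherType of EAPOL (802.1X) frames */

/* Hypothetical per-packet debug record, filled in when the frame is queued. */
struct dbg_tx_record {
	uint8_t hdr_at_tx[ETH_HLEN];
};

/* Read the big-endian EtherType stored at bytes 12..13 of the header. */
static uint16_t dbg_ethertype(const uint8_t *data)
{
	return (uint16_t)((data[12] << 8) | data[13]);
}

/* Call at transmit time: remember what the header looked like. */
static void dbg_snapshot(struct dbg_tx_record *rec, const uint8_t *data)
{
	memcpy(rec->hdr_at_tx, data, ETH_HLEN);
}

/* Print one header as hex bytes, prefixed with a label. */
static void dbg_dump(const char *label, const uint8_t *data)
{
	int i;

	fprintf(stderr, "%s:", label);
	for (i = 0; i < ETH_HLEN; i++)
		fprintf(stderr, " %02x", data[i]);
	fprintf(stderr, "\n");
}

/*
 * Call at finalize time. Returns 1 if the frame was EAPOL when it was queued
 * but its EtherType no longer says so, which is the case where a check like
 * if (type == ETH_P_PAE) would be skipped. Returns 0 otherwise.
 */
static int dbg_check_at_finalize(const struct dbg_tx_record *rec,
				 const uint8_t *data)
{
	if (dbg_ethertype(rec->hdr_at_tx) != ETH_P_PAE)
		return 0;	/* was never EAPOL, nothing to track */
	if (dbg_ethertype(data) == ETH_P_PAE)
		return 0;	/* still EAPOL, counter would be decremented */

	fprintf(stderr, "***BUG*** EAPOL header changed before finalize\n");
	dbg_dump("copied ", rec->hdr_at_tx);
	dbg_dump("current", data);
	return 1;
}
```

If the two dumps differ only in the EtherType bytes, that would point at the
header being overwritten between transmit and finalize rather than at the
frame being misclassified in the first place.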

You're right, I should have used WARN instead of my ***BUG*** print. It's a
stupid habit from MIPS devices, where backtraces aren't reliable. I printed a
mini call chain on my own instead. I mean this part:
[__brcmf_txfinalize -> __brcmu_pkt_buf_free_skb]

So please take a look at my e-mail again and let me know if it makes more
sense. What do you think about my guess that the firmware is corrupting skb
data?

