Re: [ath9k-devel] ath5k tx completion

Adrian Chadd Wed, 06 Nov 2013 09:19:05 -0800

On 6 November 2013 05:41, Bob Copeland <m...@bobcopeland.com> wrote:
> Hi Adrian (and anyone else who wants to chime in),
>
> I'm looking to fix some accounting issues in the ath5k tx path, which
> arise because we always try to keep one descriptor available at the
> end for the hardware's sake.
>
> I believe you have some concrete, hard-won knowledge about when the
> hardware marks a descriptor complete versus when the 'next' pointer
> in the descriptor is accessed -- do you mind sharing?


Sure!

The TL;DR version - you need to keep the the final descriptor around
for each hardware queue. The legacy madwifi/freebsd way of doing it
was to keep the last descriptor that was handled. This wasn't enough.
It needs to be per TX queue. You also must never program TxDP after
you've programmed it once - you can only program it again after you've
completely torn down TX DMA for the particular queue.

This is relevant for ath5k _and_ ath9k chips!


The long version:

The DMA engine follows the TX descriptor link list to process things.
Once it finishes a descriptor, it'll read the link descriptor pointer
to see what the next descriptor is. If it's NULL, it'll keep the
address of the link pointer around so it can re-read it when it needs
to start DMA again.

The reason for this is because TX DMA can restart at any time for any
reason. It can be gated by timers, by beacon reception, by burst /
ready time, etc; it can be pre-empted by traffic from a higher
priority queue. It stores this per QCU, so you need to store the last
descriptor per QCU, rather than just the last it handled.

Now, since the intended use case is that software will just write to
the link pointer if there's a new packet to transmit, it needs to be
handled in an atomic way. So, here are the conditions:

* no transmit, no past transmit. TxDP[n] is 0x0. It's never
transmitted anything before. You queue a frame to TxDP[n], set TxE[n]
to 1, and it starts.
* transmit is going on, TxDP[n] is non-NULL. You add something to the
link pointer and then you set TxE[n] to 1 to kick things along. The
DMA engine was NOT already sending the final frame in the queue to the
air, so it'll just keep reading the list as normal.
* transmit is going on, TxDP[n] is non-NULL and it points to the last
frame. You sneak in an update to the link pointer before the hardware
completes the frame, tickle TxE[n] and the hardware will keep going.
* transmit is going on, TxDP[n] is non-NULL and it points to the last
frame. It fnishes it, reads the NULL pointer, sees its NULL, stops
transmitting. You update the link pointer, tickle TxE[n], but .. and
here's the problem.

So to make the last option work, the only sane way is for the hardware
to re-read the link pointer to see if it's updated. It'll then always
do the right thing - if it's just read a NULL link pointer it'll
re-read it to see the updated value and continue transmitting.

The _other_ option (which i think freebsd/madwifi did) was to just
update link pointer and write a new TxDP before transmitting. The TDMA
code in FreeBSD did this. Now this would also get around the race,
right? If you hit the end of the list and it hit NULL, then you write
a new TxDP[n] and set TxE to 1, it should start the new list, right?

Then, comes the "weird crap' I found when digging through the RTL
(chip source.)  There are erm, "situations" where the chip won't
actually process a TxDP update. Thus, even if you write a new TxDP
out, even if the hardware is not transmitting, it may "ignore" the
write.

This is why I took the time in FreeBSD to do this:

* staging descriptor is now per-QCU;
* if i've never transmitted before, I will update TxDP
* otherwise, if I've transmitted before, I will always append a frame
to the transmit queue by adding it to the last descriptor (and if this
is a done staging descriptor, so be it), and poke TxE[n]

This works really, really well.

Now for the ath9k relevant bits!

All the ath9k chips behave the same.

Even the EDMA chips.

So, for EDMA, you _can_ just queue one frame per TX FIFO slot. You
don't have the above behaviour.

However, if you queue multiple frames to a TX FIFO slot (_not_ A-MPDU
frames, but say, you append 15 frames to the cabq or 15 non-aggregate
frames to a normal data queue by appending 15 frames to one TX FIFO
entry, rather than one frame per FIFO entry) then the same behaviour
as above may occur. You still need to keep a staging descriptor around
whilst you're processing TX completions, as TX completions are
per-frame, _not_ per TX FIFO slot. Once you've processed the whole
slot, you don't need to keep the staging descriptor around for the
next slot.

FreeBSD does this for TX EDMA descriptor handling.

I hope this clears things up!

Next - ask me about:

* read-and-clear bugs
* PHY register read problems, and why ANI may occasionally look whacked




-adrian
_______________________________________________
ath9k-devel mailing list
ath9k-devel@lists.ath9k.org
https://lists.ath9k.org/mailman/listinfo/ath9k-devel

Re: [ath9k-devel] ath5k tx completion

Reply via email to