On Wed, Mar 07, 2018 at 10:32:26AM -0500, David Miller wrote:
> From: Niklas Cassel <niklas.cas...@axis.com>
> Date: Sat, 3 Mar 2018 00:28:53 +0100
> > However, the last write we do is "DMA start transmission",
> > this is a register in the IP, i.e. it is a write to the cache
> > incoherent MMIO region (rather than a write to cache coherent memory).
> > To ensure that all writes to cache coherent memory have
> > completed before we start the DMA, we have to use the barrier
> > wmb() (which performs a more extensive flush compared to
> > dma_wmb()).
> There is an implicit memory barrier between physical memory writes
> and those to MMIO register space.
> So as long as you place the dma_wmb() to ensure the correct
> ordering within the descriptor words, you need nothing else
> after the last descriptor word write.

Hello David,

Looking at writel() in e.g. arch/arm/include/asm/io.h:
#define writel(v,c)             ({ __iowmb(); writel_relaxed(v,c); })
it indeed has a __iowmb() (which is defined as a wmb()) in its definition.

Is it safe to assume that this is true for all archs?

If so, perhaps the example at:
Should be updated.

Considering this, you can drop/revert:
95eb930a40a0 ("net: stmmac: use correct barrier between coherent memory and 
or perhaps you want me to send a revert?

After reverting 95eb930a40a0, we will still have a dma_wmb() _after_ the
last descriptor word write. You just explained that nothing else is needed
after the last descriptor word write, so I actually think that this last
barrier is superfluous.
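
To make the ordering I have in mind concrete, here is a rough user-space
sketch. It is not the stmmac code: every demo_* name, the descriptor layout
and the OWN bit value are made up for illustration, and the C11 fences only
stand in for the roles of dma_wmb() and __iowmb(). The trace just records
which barrier runs where, assuming writel() behaves like the arch/arm
definition quoted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_OWN 0x80000000u            /* hypothetical "owned by device" bit */

struct demo_desc {                      /* hypothetical descriptor layout */
	uint32_t des0;                  /* status word carrying the OWN bit */
	uint32_t des1;                  /* buffer length */
	uint32_t des2;                  /* buffer address */
};

enum demo_event { EV_DMA_WMB = 1, EV_IOWMB, EV_MMIO_WRITE };

static enum demo_event demo_trace[8];   /* order in which barriers/MMIO ran */
static int demo_trace_len;

static void demo_record(enum demo_event ev)
{
	assert(demo_trace_len < 8);
	demo_trace[demo_trace_len++] = ev;
}

/* Stand-in for dma_wmb(): orders writes to coherent memory. */
static void demo_dma_wmb(void)
{
	atomic_thread_fence(memory_order_release);
	demo_record(EV_DMA_WMB);
}

/* Stand-in for writel_relaxed(): a plain register store, no barrier. */
static void demo_writel_relaxed(uint32_t v, volatile uint32_t *reg)
{
	*reg = v;
	demo_record(EV_MMIO_WRITE);
}

/* Stand-in for writel(): barrier first, then the relaxed write. */
static void demo_writel(uint32_t v, volatile uint32_t *reg)
{
	atomic_thread_fence(memory_order_seq_cst); /* plays the role of __iowmb() */
	demo_record(EV_IOWMB);
	demo_writel_relaxed(v, reg);
}

/* Fill one descriptor and kick the (pretend) DMA engine. */
static void demo_xmit(struct demo_desc *d, volatile uint32_t *start_reg,
		      uint32_t buf, uint32_t len)
{
	d->des2 = buf;
	d->des1 = len;
	demo_dma_wmb();                 /* descriptor body before the OWN bit */
	d->des0 = DEMO_OWN;
	/* No explicit barrier here: the one inside demo_writel() orders the
	 * OWN bit write before the "DMA start transmission" register write. */
	demo_writel(1, start_reg);
}
```

Calling demo_xmit() once leaves exactly three entries in demo_trace: the
dma_wmb() stand-in, then the barrier inside writel(), then the MMIO write.
There is no separate barrier between the OWN bit write and the writel() call.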

Best regards,

