On Tue, 2018-03-27 at 16:10 +0100, Will Deacon wrote:
> To clarify: are you saying that on x86 you need a wmb() prior to a writel
> if you want that writel to be ordered after prior writes to memory? Is this
> specific to WC memory or some other non-standard attribute?
>
> The only reason we have wmb() inside writel() on arm, arm64 and power is for
> parity with x86 because Linus (CC'd) wanted architectures to order I/O vs
> memory by default so that it was easier to write portable drivers. The
> performance impact of that implicit barrier is non-trivial, but we want the
> driver portability and I went as far as adding generic _relaxed versions for
> the cases where ordering isn't required. You seem to be suggesting that none
> of this is necessary and drivers would already run into problems on x86 if
> they didn't use wmb() explicitly in conjunction with writel, which I find
> hard to believe and is in direct contradiction with the current Linux I/O
> memory model (modulo the broken example in the dma_*mb section of
> memory-barriers.txt).

Another clarification while we are at it...

All of this only applies to concurrent access by the CPU and the device to
memory allocated with dma_alloc_coherent().

For memory "mapped" into the DMA domain via dma_map_*, an extra
dma_sync_*_for_* is needed. In most useful server cases these are NOPs, but
on architectures without full DMA cache coherency, or when using swiotlb,
dma_map_* might maintain bounce buffers or play additional cache flushing
tricks.

Cheers,
Ben.
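
To illustrate the bounce-buffer case for dma_map_* mentioned above, here is a minimal userspace sketch. It is not kernel code: every mock_* name is hypothetical and only loosely models what dma_map_single() and dma_sync_single_for_device() might do when swiotlb is in play. The assumption is that the "device" only ever sees the bounce copy, so a CPU write made after mapping stays invisible to it until a sync is done.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a swiotlb-style bounce buffer.
 * None of the mock_* names below are real kernel APIs. */
#define MOCK_BOUNCE_SIZE 64

struct mock_dma_mapping {
	unsigned char *cpu_buf;                 /* the driver's own buffer */
	size_t len;                             /* mapped length in bytes */
	unsigned char bounce[MOCK_BOUNCE_SIZE]; /* the only memory the "device" sees */
};

/* Loosely models dma_map_single(..., DMA_TO_DEVICE):
 * snapshot the CPU buffer into the bounce buffer.
 * Returns 0 on success, -1 if the buffer does not fit. */
static int mock_dma_map(struct mock_dma_mapping *m, unsigned char *buf,
			size_t len)
{
	if (len > MOCK_BOUNCE_SIZE)
		return -1;
	m->cpu_buf = buf;
	m->len = len;
	memcpy(m->bounce, buf, len);
	return 0;
}

/* Loosely models dma_sync_single_for_device():
 * copy the CPU buffer again so later CPU writes reach the "device". */
static void mock_dma_sync_for_device(struct mock_dma_mapping *m)
{
	memcpy(m->bounce, m->cpu_buf, m->len);
}

/* Models the device fetching one byte: it reads the bounce copy,
 * never the CPU buffer itself. */
static unsigned char mock_device_read(const struct mock_dma_mapping *m,
				      size_t off)
{
	assert(off < m->len);
	return m->bounce[off];
}
```

In this model, if the CPU modifies its buffer after mock_dma_map() and the device then reads, the device still gets the old byte; only after mock_dma_sync_for_device() does it see the new one. On a fully coherent machine without bounce buffers the sync would have nothing to do, which matches the NOP case above.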