If a device is DMA-capable and cache-coherent, it can be considerably
faster to keep memory shared with it cached, instead of mapping it
uncached unconditionally as we currently do.

This was very noticeable when using virtio with KVM acceleration, as
described in commit 3ebd05809a49 ("virtio: don't use DMA API unless
required").

In preparation for simplifying the code added in the aforementioned
commit, consult dev_is_dma_coherent() before doing any cache
maintenance or remapping.
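
For illustration, this is what the fast path looks like from a
driver's point of view after this patch. The sketch below is not part
of the change; the buffer size and the mechanism marking the device
coherent (e.g. a dma-coherent device tree property) are assumptions:

	dma_addr_t dma_handle;
	void *buf;

	/* dev_is_dma_coherent(dev) returns true here, so neither
	 * dma_flush_range() nor remap_range() is called and the
	 * buffer stays cached throughout its lifetime */
	buf = dma_alloc_coherent(dev, SZ_4K, &dma_handle);
	...
	dma_free_coherent(dev, buf, dma_handle, SZ_4K);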

Signed-off-by: Ahmad Fatoum <[email protected]>
---
 arch/arm/cpu/mmu-common.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/cpu/mmu-common.c b/arch/arm/cpu/mmu-common.c
index a1431c0ff461..2b22ab47cac8 100644
--- a/arch/arm/cpu/mmu-common.c
+++ b/arch/arm/cpu/mmu-common.c
@@ -50,9 +50,11 @@ void *dma_alloc_map(struct device *dev,
                *dma_handle = (dma_addr_t)ret;
 
        memset(ret, 0, size);
-       dma_flush_range(ret, size);
 
-       remap_range(ret, size, map_type);
+       if (!dev_is_dma_coherent(dev)) {
+               dma_flush_range(ret, size);
+               remap_range(ret, size, map_type);
+       }
 
        return ret;
 }
@@ -70,8 +72,8 @@ void *dma_alloc_coherent(struct device *dev,
 void dma_free_coherent(struct device *dev,
                       void *mem, dma_addr_t dma_handle, size_t size)
 {
-       size = PAGE_ALIGN(size);
-       remap_range(mem, size, MAP_CACHED);
+       if (!dev_is_dma_coherent(dev))
+               remap_range(mem, PAGE_ALIGN(size), MAP_CACHED);
 
        free(mem);
 }
-- 
2.47.3

