On Wed, Aug 12, 2015 at 10:14:25AM +0200, Gregory CLEMENT wrote:
> On 11/08/2015 19:26, Catalin Marinas wrote:
> > On Tue, Aug 11, 2015 at 07:05:43PM +0200, Gregory CLEMENT wrote:
> >> From: Nadav Haklai <[email protected]>
> >>
> >> When a PL310 cache is used in a system that provides hardware
> >> coherency, the entire outer cache operations are useless, and can be
> >> skipped.  Moreover, on some systems, it is harmful as it causes
> >> deadlocks between the Marvell coherency mechanism, the Marvell PCIe
> >> controller and the Cortex-A9.
> >>
> >> This commit extends a previous commit:
> >> 98ea2dba65932ffc456b6d7b11b8a0624e2f7b95 which added the io-coherent
> >> support for the PL310 cache by also disabling the outer cache flush
> >> range operation.
> >>
> >> In the current kernel implementation, the outer cache flush range
> >> operation is triggered by the dma_alloc function.
> >> This operation can be take place during runtime and in some
> >> circumstances may lead to the PCIe/PL310 deadlock on Armada 375/38x
> >> SoCs.
> > 
> > While this may work around the issue for your specific SoC, I think a
> > better fix is in DMA alloc code to avoid flushing caches for coherent
> > devices. This would be the __dma_clear_buffer() implementation which
> > isn't aware of whether the device is coherent or not.
> 
> Indeed, the other use of the outer cache flush range is done pretty
> early in the boot and should not be a problem.
> 
> What do you think of the following patch?
[...]
>  arch/arm/include/asm/outercache.h | 9 +++++++++
>  arch/arm/mm/cache-l2x0.c          | 1 +
>  arch/arm/mm/dma-mapping.c         | 6 ++++--
>  3 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/outercache.h 
> b/arch/arm/include/asm/outercache.h
> index 563b92fc2f41..7f7bbfcf1d32 100644
> --- a/arch/arm/include/asm/outercache.h
> +++ b/arch/arm/include/asm/outercache.h
> @@ -35,6 +35,7 @@ struct outer_cache_fns {
>       void (*sync)(void);
>  #endif
>       void (*resume)(void);
> +     bool is_coherent;
> 
>       /* This is an ARM L2C thing */
>       void (*write_sec)(unsigned long, unsigned);
> @@ -56,6 +57,14 @@ static inline void outer_inv_range(phys_addr_t start, 
> phys_addr_t end)
>  }
> 
>  /**
> + * outer_is_coherent - tell if the outer cache is io coherent
> + */
> +static inline bool outer_is_coherent(void)
> +{
> +     return outer_cache.is_coherent;
> +}
> +
> +/**
>   * outer_clean_range - clean dirty outer cache lines
>   * @start: starting physical address, inclusive
>   * @end: end physical address, exclusive
> diff --git a/arch/arm/mm/cache-l2x0.c b/arch/arm/mm/cache-l2x0.c
> index 71b3d3309024..0071e276adc0 100644
> --- a/arch/arm/mm/cache-l2x0.c
> +++ b/arch/arm/mm/cache-l2x0.c
> @@ -1291,6 +1291,7 @@ static const struct l2c_init_data 
> of_l2c310_coherent_data __initconst = {
>               .flush_all   = l2c210_flush_all,
>               .disable     = l2c310_disable,
>               .resume      = l2c310_resume,
> +             .is_coherent = true,
>       },
>  };

I don't really think we need these. If you need to check a device, we
already have is_device_dma_coherent(). Just pass bool argument to
__dma_clear_buffer() and fix the calling places. The __dma_alloc()
function also takes a "bool coherent" but in this case
__alloc_simple_buffer() is only ever used on coherent devices, so you
may not even need to check is_device_dma_coherent() (however, there are
some patches to add CMA support for coherent devices, I don't know
whether they are queued yet).

> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index 1ced8a0f7a52..4d87df2c16f9 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -241,12 +241,14 @@ static void __dma_clear_buffer(struct page *page, 
> size_t size)
>                       page++;
>                       size -= PAGE_SIZE;
>               }
> -             outer_flush_range(base, end);
> +             if (!outer_is_coherent())
> +                     outer_flush_range(base, end);
>       } else {
>               void *ptr = page_address(page);
>               memset(ptr, 0, size);
>               dmac_flush_range(ptr, ptr + size);
> -             outer_flush_range(__pa(ptr), __pa(ptr) + size);
> +             if (!outer_is_coherent())
> +                     outer_flush_range(__pa(ptr), __pa(ptr) + size);
>       }
>  }

This would make sense if we have a device which is outer coherent but
inner non-coherent, though I doubt this is the case you are trying to
fix.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to