On 10/18/2016 03:28 PM, york sun wrote:
On 10/18/2016 02:01 PM, Stephen Warren wrote:
On 10/18/2016 12:40 PM, york sun wrote:
On 10/18/2016 11:14 AM, Stephen Warren wrote:
On 10/18/2016 09:28 AM, york sun wrote:
On 10/17/2016 04:35 PM, Stephen Warren wrote:
From: Stephen Warren <swar...@nvidia.com>

SoC-specific logic may be required for all forms of cache-wide
operations; invalidate and flush of both dcache and icache (note that
only 3 of the 4 possible combinations make sense, since the icache never
contains dirty lines). This patch adds an optional hook for all
implemented cache-wide operations, and renames the one existing hook to
better represent exactly which operation it is implementing. A dummy
no-op implementation of each hook is provided. These dummy
implementations are moved into C code, since there's no need to
implement them in assembly.


Moving this function to C may pose an issue. I had a debug a couple of
years ago that calling a C function put the stack into cache after
flushing L1/L2. That's why I used asm function to flush L3.

Assuming the stack is located in cachable memory, the CPU is free (per
the definition of the ARM architecture) to pull it into the cache at any
time the cache is enabled (and perhaps even when it isn't enabled, at
the very least for the icache on ARMv8 if not other cases too).
Implementation in C vs. assembly has absolutely no effect here. I guess
your statement assumes that C functions will write data to the stack and
assembly functions never will. There's no strict 1:1 correlation between
those two things; assembly code can touch the stack just like C code. If
there's an assumption it won't, it needs to be documented in the header
defining these hook functions.

I assume you're specifically talking about dirtying the dcache between
the point when dcache flushing starts and the point when the dcache is
disabled? If so, flush_dcache_all() itself would have to be manually
coded in assembly to avoid using the stack, as would dcache_disable()
and set_sctlr(). I think this is why dcache_disable() currently disables
the dcache first (thus preventing it acquiring new dirty data) and then
flushes the dcache afterwards (thus guaranteeing that all dirty data is
flushed with no race condition). This implies that your change to swap
the order of those two functions isn't correct. I'm pretty sure I'm

I wonder if David can shed some light on the original order of calls to
disable dcache.

correct in saying that the dcache can hit even if it's disabled, hence
disabling the dcache while it contains dirty data won't lead to issues?

My earlier debug was based on the original order of calls. I found I had
to avoid using the stack before flushing L3. Now with the changed order,
I haven't tested. But I can image the stack will be dirty and flushing
L3 may or may not push the data into main memory (depending on the L3
implementation whether inclusive or not).

You said you are sure dcache can hit even if it is disabled. Can you
explain more? My test shows as soon as the d-cache is disabled, the core
cannot get the data in dirty cache.

By "hit" here, I mean that even with the dcache disabled, when the CPU
performs a read access, if the dcache contains a copy of that data, it
can return it rather than requiring it to be fetched from DRAM.

Yes, with the dcache disabled, I would not expect any writes to allocate
new lines in the cache (although presumably writes would update any
lines already there, in a write-though sense).

At least, I'm pretty sure this is all true. It seems the only way to
allow switching from cache-on to cache-off state without losing dirty data.

I believe my test showed otherwise. As soon as the dcache is disabled,
the core cannot get the dirty cached data if the dcache was flushed by
way/set for L1 and L2 (L3 wasn't flushed). That's why I proposed to
change the order to flush first.

It looks like what I was saying is true for ARMv7 but not for ARMv8 (or perhaps it's not architectural but implementation-defined). For example, the Cortex A15 TRM says:


When you disable the cache, all Write-Back Cacheable requests still look up the L1 cache. If there is a cache hit, the cache is read or updated in the same way as if the cache is enabled. This enables Cacheable memory to remain fully coherent while the cache is disabled.

However, the Cortex A72 TRM says:


When you disable the cache, all Write-Back Cacheable requests do not look up the L1 data cache. L1 cache still services the snoops from the L2 cache.

So yes, we should flush the entire cache first, then disable it.

This does indeed imply that we shouldn't write data between starting to flush the dcache and disabling the cache. However, that seems too restrictive, especially in the face of caches beyond L1/L2 which might need quite some code to flush out. I'll try to investigate that aspect some more.
U-Boot mailing list

Reply via email to