Thanks Antonio,

So my assumptions about how OpenOCD accesses the memory were wrong, and this 
gives me a better understanding of what you already explained.

The patch you pointed to is interesting; I tried to solve my issue with a 
custom command as well.
But my approach was very simple: call the function that clears and 
invalidates both caches (this function was introduced by the patches we 
already submitted). I enclose my modifications for information.

Giving the issue some thought, we might have (at least) two different use 
cases:

  * User loads a first ELF that activates the cache in crt0 code. We need to 
    clear and invalidate to properly load a second ELF. My attempt is focused 
    on this case.
  * User loads a first ELF. Then a second binary that does not need the cache 
    may be loaded. In this case it's important to have finer access to the 
    SCTLR register, as proposed in patch 6055.
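
To make the second case more concrete, here is a minimal sketch of the SCTLR 
bits involved. The bit positions (M = bit 0, C = bit 2, I = bit 12) are the 
architectural ones, and 0x4 matches the mask tested in the enclosed patch. The 
macro and helper names are hypothetical; they are not OpenOCD API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions shared by AArch32 SCTLR and AArch64 SCTLR_ELx. */
#define SCTLR_M (UINT64_C(1) << 0)   /* MMU (or MPU on armv8r) enable */
#define SCTLR_C (UINT64_C(1) << 2)   /* data/unified cache enable */
#define SCTLR_I (UINT64_C(1) << 12)  /* instruction cache enable */

/* Hypothetical helper: true if either the D-cache or the I-cache is enabled. */
static inline bool sctlr_caches_enabled(uint64_t sctlr)
{
	return (sctlr & (SCTLR_C | SCTLR_I)) != 0;
}

/* Hypothetical helper: SCTLR value with both cache enables cleared,
 * leaving every other bit (including M) untouched. */
static inline uint64_t sctlr_disable_caches(uint64_t sctlr)
{
	return sctlr & ~(SCTLR_C | SCTLR_I);
}
```

Clearing the enable bits alone does not write dirty lines back, so in the first 
use case a clean and invalidate would still be needed before the bits are 
cleared.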

I'm also wondering if there might be an issue with L1 and L2 caches. Are they 
both controlled with the same instructions?
Note that I haven't read the ARM TRM on this topic yet. I'll catch up and get 
feedback from my team as well.

Best regards,

Adrien Charruel
________________________________
From: Antonio Borneo <borneo.anto...@gmail.com>
Sent: Monday, May 5, 2025 16:40
To: Adrien CHARRUEL <acharr...@nanoxplore.com>
Cc: OpenOCD <openocd-devel@lists.sourceforge.net>
Subject: Re: ARMv8r: Unable to Load ELF With Cache Enabled

Thinking over your use case ...
what I already wrote is not fully relevant.

When a previous FW has enabled the cache, to load and execute a new FW you need 
to either:
- sync I-Cache and D-Cache after any write, but this slows down everything;
- or disable caching before loading the 2nd FW. But this also means that the 
2nd FW should enable caches if it needs them.

Today for armv8a/r we do not have an OpenOCD command to disable cache.
But there is change 6055 that was targeting this.

Do you think this could work in your case?

Antonio

On Mon, May 5, 2025, 14:55 Antonio Borneo <borneo.anto...@gmail.com> wrote:
Hi Adrien,

OpenOCD creates a link between the CPU and GDB.
Neither GDB nor OpenOCD has the knowledge required to access the RAM directly 
(apart from the cortex-m case, where the RAM is mapped in the same address 
space as the debug registers). By the way, not relevant here, there are cases 
where having GDB access the RAM is required, but this is work in progress in 
https://review.openocd.org/c/openocd/+/8815


Today GDB memory requests are managed through the CPU. For armv8r, OpenOCD 
injects instructions in the pipeline of the CPU to R/W memory through CPU 
registers.
Memory is read and written by the CPU, possibly passing through the D-Cache.
I-Cache is not impacted, so after a memory write it is required to flush 
D-Cache and invalidate I-Cache before executing new code.
The D-Cache flush also requires a flush of any write buffer present in the CPU.
The I-Cache invalidate requires invalidating the instruction pre-fetch 
buffer in the CPU.
The sequence provided by ARM should guarantee all is properly synchronized.
The SW sequence in OpenOCD is incorrect; it misses at least the invalidate of 
the pre-fetch buffer.
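
A toy model may help picture why both steps are needed. This is only an 
illustrative sketch with one memory word, one write-back D-Cache line and one 
I-Cache line. All names are made up for the example, and it does not model 
write buffers or the pre-fetch buffer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU: a single word of RAM seen through two independent caches. */
struct toy_cpu {
	uint32_t ram;          /* backing memory */
	uint32_t dcache;       /* D-Cache copy of the word */
	bool dcache_dirty;     /* D-Cache holds data newer than RAM */
	uint32_t icache;       /* I-Cache copy of the word */
	bool icache_valid;     /* I-Cache copy may be used for fetches */
};

/* Memory write executed by the CPU (e.g. on behalf of the debugger):
 * with a write-back cache it lands in the D-Cache only. */
static void toy_data_write(struct toy_cpu *cpu, uint32_t value)
{
	cpu->dcache = value;
	cpu->dcache_dirty = true;
}

/* Instruction fetch: served from the I-Cache when valid, otherwise
 * refilled from RAM. It never looks into the D-Cache. */
static uint32_t toy_fetch(struct toy_cpu *cpu)
{
	if (!cpu->icache_valid) {
		cpu->icache = cpu->ram;
		cpu->icache_valid = true;
	}
	return cpu->icache;
}

/* Flush (clean) the D-Cache: push dirty data out to RAM. */
static void toy_dcache_clean(struct toy_cpu *cpu)
{
	if (cpu->dcache_dirty) {
		cpu->ram = cpu->dcache;
		cpu->dcache_dirty = false;
	}
}

/* Invalidate the I-Cache: the next fetch must refill from RAM. */
static void toy_icache_invalidate(struct toy_cpu *cpu)
{
	cpu->icache_valid = false;
}
```

In this model, a fetch after toy_data_write() keeps returning the old word. 
Invalidating the I-Cache alone does not help either, because RAM still holds the 
old value. Only toy_dcache_clean() followed by toy_icache_invalidate() makes the 
new word visible to the fetch, which is the ordering described above.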

Regards
Antonio




On Mon, May 5, 2025, 14:16 Adrien CHARRUEL <acharr...@nanoxplore.com> wrote:
Hi Antonio,

Sorry the word "sideloading" may not be appropriate in my case.

What I meant is that, to my understanding, OpenOCD is loading the ELF file 
through the DAP of the SoC.
The DAP is linked to the interconnect and has access to the embedded RAM to 
write the binary.

The loading of the second ELF file is done without resetting the board.
And doing so, it bypasses the caches of the CPUs, thus compromising the 
coherency. So caches have to be invalidated before running the CPUs.

Here is a schematic of our architecture.
I hope this makes sense.


  OPENOCD
     │
     │
     │                          ┌─────────────┐
     │                          │             │
     │                          │    eRAM     │
     │                          │             │
     │                          └──────┬──────┘
┌────┴─────┐     ┌─────────────────────┴─────────────────────────────────┐
│          │     │                                                       │
│   DAP    ├─────┤           ARM Network Interconnect                    │
│          │     │                                                       │
└──────────┘     └────────────────────────┬──────────────────────────────┘
                                          │
                     ┌────────────┐ ┌─┌───┴───┐─┐
                     │            │ │ │ CACHE │ │
                     │    CORE0   │ │ └───────┘ │
                     │    TCMA    ├─┤           │
                     │            │ │           │
                     └────────────┘ │           │
                     ┌────────────┐ │    CPU0   │
                     │            │ │           │
                     │    CORE0   │ │           │
                     │    TCMB    ├─┤           │
                     │            │ │           │
                     └────────────┘ └───────────┘

Thanks for your time.
Best regards,

Adrien Charruel
________________________________
From: Antonio Borneo <borneo.anto...@gmail.com>
Sent: Sunday, May 4, 2025 13:03
To: Adrien CHARRUEL <acharr...@nanoxplore.com>
Cc: openocd-devel@lists.sourceforge.net
Subject: Re: ARMv8r: Unable to Load ELF With Cache Enabled

On Tue, Apr 29, 2025 at 4:12 PM Adrien CHARRUEL
<acharr...@nanoxplore.com> wrote:
>
> Hi,
>
> I'd like to raise an issue we are having in my team with OpenOCD. I'm not 
> able to load an ELF file from GDB once some code is already running and 
> caches have been enabled.
>
> Here is a more complete description of my use case:
>
> 1. Start OpenOCD and connect with GDB.
> 2. Load a first ELF file and run it through GDB. This first program enables 
>    caches.
> 3. Halt the target.
> 4. Load a second ELF file through GDB with the "load" command.
> 5. Run it => fails... Sometimes I get a crash. For a very small program the 
>    old one still resides in cache and is executed instead of the new one.
>
> The issue is that I'm sideloading the binary through the DAP and memory 
> coherency is not preserved.


Hi,
what do you mean by "sideloading the binary through the DAP" ?
The GDB command "load" in your email uses the current target for the
load. No sideloading through other APs is implemented.

In the latest ARM reference manual for armv8a
https://developer.arm.com/documentation/ddi0487/latest/
in the chapter for 64 bits, B2.7.4.2 - "Synchronization and coherency
issues between data and instruction accesses",
and for 32 bits, E2.5.3.2 - "Synchronization and coherency issues
between data and instruction accesses",
there are the sequences of instructions that guarantee coherency.
I see that the OpenOCD code does not respect them.
On Cortex-A35 during step-by-step execution, changing the next
instruction with a SW breakpoint fails because the CPU has already
pre-fetched the next instructions.
I cannot find information on such a prefetch, but it's reasonable to
assume it exists and that the correct sequence from ARM is required
to invalidate it.
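
As a rough illustration, those sequences operate by address, one cache line at 
a time, so a modified range has to be walked line by line. The sketch below 
shows only that walk, assuming a power-of-two line size (on real hardware it 
would be derived from CTR/CTR_EL0). The helper name and the callback are 
hypothetical, and the AArch64 mnemonics in the comment are a reading of the 
cited chapter that should be checked against the manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback invoked once per cache line; line_addr is line-aligned. */
typedef void (*line_op_fn)(uint64_t line_addr, void *ctx);

/*
 * Hypothetical helper: call op() for every cache line overlapping
 * [addr, addr + len) and return the number of lines visited.
 * Assumes addr + len does not wrap around the address space.
 *
 * On AArch64 the walk would typically be done twice:
 *   1st pass, per line:  DC CVAU, Xn   (clean D-Cache line)
 *   then:                DSB ISH
 *   2nd pass, per line:  IC IVAU, Xn   (invalidate I-Cache line)
 *   then:                DSB ISH ; ISB
 */
static size_t for_each_cache_line(uint64_t addr, uint64_t len,
				  uint64_t line_size, line_op_fn op, void *ctx)
{
	/* Reject empty ranges and line sizes that are not a power of two. */
	if (len == 0 || line_size == 0 || (line_size & (line_size - 1)) != 0)
		return 0;

	uint64_t first = addr & ~(line_size - 1);
	uint64_t last = (addr + len - 1) & ~(line_size - 1);
	size_t count = 0;

	for (uint64_t line = first; ; line += line_size) {
		if (op)
			op(line, ctx);
		count++;
		if (line == last)
			break;
	}
	return count;
}
```

For example, a 0x80-byte write starting at 0x1004 with 64-byte lines touches 
the three lines at 0x1000, 0x1040 and 0x1080, because the range is not 
line-aligned.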

Checking the extra documents for armv8r, DDI0568 and DDI0600, I don't
find any specific info.

Updating the sequence in OpenOCD for my SW breakpoint issue
will probably fix your issue too.

Antonio

> Here are some more details on my setup:
>
> - Target: quad cortex-r52 from our in-house chip. It's a standard armv8r 
>   core.
> - We use our in-house JTAG probe, Angie, which is upstreamed in openocd.
> - Configuration script is "ngultra.cfg", attached to this email.
> - Command line:
>     ./src/openocd -f tcl/interface/angie.cfg -f tcl/target/ngultra.cfg
> - List of GDB commands:
>     target remote :3333
>     load apps/test_led/out/test_led.elf
>     file apps/test_led/out/test_led.elf
>     continue
>     CTRL+C
>     load apps/test_led2/out/test_led2.elf
>     file apps/test_led2/out/test_led2.elf
>     continue
> - Expected: the second led test should run fine, but it's not the case.
> - OpenOCD logs attached as "openocd.log".
>
>
> Version of OpenOCD is SHA1 "169d463a3d3c91f62c980aba287b5e110b310ad0" with 
> extra patches available here:
>
> https://review.openocd.org/q/topic:nx-armv8r
> https://review.openocd.org/q/topic:nx-angie
>
>
>
> Potential solution and discussion:
>
> As there is already a discussion on this topic with Antonio 
> (https://review.openocd.org/c/openocd/+/8656), I removed this part from my 
> patchset.
> Indeed Antonio's point is good, it's too slow to attempt a clear cache at 
> every memory write. It's not an option.
>
> I prepared another patch that adds a "clear_caches" command for aarch64.
> The user will have to manually call this before loading another binary. It 
> might be a better solution.
>
> I'm still wondering if the clear cache could be called from another function, 
> maybe in the aarch64_halt() callback, but I can't tell if there will be side 
> effects.
>
> Moreover, this behaviour is not only relevant for armv8r but for armv8a as 
> well. I reproduced it by using aarch64 instead of armv8r in the target 
> configuration.
>
> I hope that my question is clear enough. I'll remain available to discuss 
> this point.
> Maybe it's an expected behaviour and I'd like to hear from the community to 
> know how to deal with this issue and for me to have a better understanding of 
> it.
>
> Thanks a lot.
> Best regards,
>
>
>
>
> Adrien Charruel
From 3ddf2500fd2fec372e61c8ae9cb11ff3be7b418c Mon Sep 17 00:00:00 2001
From: Adrien Charruel <acharr...@nanoxplore.com>
Date: Tue, 29 Apr 2025 16:44:15 +0200
Subject: [PATCH] target/aarch64: Replace clear cache on memwrite by user
 command

Change-Id: Ia846ee7e53f8a37d3b638abb68dbb02c5f833590
---
 src/target/aarch64.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/target/aarch64.c b/src/target/aarch64.c
index 2c8c975f5..723dc9cdb 100644
--- a/src/target/aarch64.c
+++ b/src/target/aarch64.c
@@ -130,9 +130,13 @@ static int aarch64_flush_and_deactivate_caches(struct target *target)
 	struct aarch64_common *aarch64 = target_to_aarch64(target);
 	struct armv8_common *armv8 = &aarch64->armv8_common;
 	int retval = ERROR_OK;
-
 	uint64_t system_control_reg_curr_updated = aarch64->system_control_reg_curr;
 
+	if (target->state != TARGET_HALTED) {
+		LOG_TARGET_ERROR(target, "not halted");
+		return ERROR_TARGET_NOT_HALTED;
+	}
+
 	if (system_control_reg_curr_updated & 0x4U) {
 		/*  data cache is active */
 		system_control_reg_curr_updated &= ~0x4U;
@@ -2509,10 +2513,6 @@ static int aarch64_read_phys_memory(struct target *target,
 	int retval = ERROR_COMMAND_SYNTAX_ERROR;
 
 	if (count && buffer) {
-		/* Flush any CPU cache */
-		retval = aarch64_flush_and_deactivate_caches(target);
-		if (retval != ERROR_OK)
-			return retval;
 		/* read memory through APB-AP */
 		retval = aarch64_mmu_modify(target, 0);
 		if (retval != ERROR_OK)
@@ -2549,11 +2549,6 @@ static int aarch64_write_phys_memory(struct target *target,
 	int retval = ERROR_COMMAND_SYNTAX_ERROR;
 
 	if (count && buffer) {
-		/* Flush any CPU cache */
-		retval = aarch64_flush_and_deactivate_caches(target);
-		if (retval != ERROR_OK)
-			return retval;
-
 		/* write memory through APB-AP */
 		retval = aarch64_mmu_modify(target, 0);
 		if (retval != ERROR_OK)
@@ -3172,6 +3167,12 @@ COMMAND_HANDLER(aarch64_mcrmrc_command)
 	return ERROR_OK;
 }
 
+COMMAND_HANDLER(aarch64_handle_clear_caches)
+{
+	struct target *target = get_current_target(CMD_CTX);
+	return aarch64_flush_and_deactivate_caches(target);
+}
+
 static const struct command_registration aarch64_exec_command_handlers[] = {
 	{
 		.name = "cache_info",
@@ -3215,6 +3216,13 @@ static const struct command_registration aarch64_exec_command_handlers[] = {
 		.help = "read coprocessor register",
 		.usage = "cpnum op1 CRn CRm op2",
 	},
+	{
+		.name = "clear_caches",
+		.handler = aarch64_handle_clear_caches,
+		.mode = COMMAND_EXEC,
+		.help = "clear and invalidate instruction and data caches",
+		.usage = "",
+	},
 	{
 		.chain = smp_command_handlers,
 	},
-- 
2.47.2


