Hi,

Oleksij is working on making OpenOCD properly usable for debugging the kernel and userspace on what seems to be the toughest ARM configuration right now: a dual-core SMP Cortex-A9 (i.MX6) with the MMU, L1 and L2 caches enabled and an AHB-AP available. We spent some time today experimenting and discussing the issues we faced, and I'd like to sum them up here.
My idea is that we should first agree on how it should be done properly, then try to implement it; the current patches we have on Gerrit are not cutting it at all, AFAICT. I have never used OpenOCD with Cortex-A myself, so I might be wrong here and there; please correct me if you see a mistake.

Some basic terminology (without strict definitions) first:

* AHB-AP access: fast access to main memory, bypassing the CPU caches entirely, available on some targets;
* APB-AP access: slower memory access going essentially through the CPU core, equivalent to what the target firmware itself would perform, available on all targets;
* L1 cache: per-core instruction and data caches;
* L2 cache: cache common to all SMP cores;
* Cache invalidation/flush: drops the current cache contents by marking them invalid;
* Cache cleaning: pushes the current cache contents (if they were written to) to the next cache level (or RAM) without invalidating anything;
* Software breakpoint: a breakpoint implemented by temporarily replacing a target firmware instruction in RAM with a bkpt instruction.

Current issues: cache handling in general is a mess and inconsistent. One specific problem worth mentioning is that L2 cache management requires APB access, so when AHB is available it ends up doing nothing.

Now I'll try to describe what the desired behaviour is, in my opinion. The main rationale is that, for the end user to be least surprised, we should ensure that OpenOCD manipulates exactly the same data as the target firmware on the currently active core. The three major use cases I describe here are:

1. Regular memory access (mdw/mww);
2. Loading and dumping big chunks of data;
3. Software breakpoints.

I'll discuss them one by one, in order.

1. Regular memory access

To make this correct with APB, no cache operations should be performed at all. Rationale: regular data reads and writes are expected to be done per-core. Implicit cache operations are not needed here, as the user shouldn't expect other cores to be fully in sync at arbitrary points in time.
When reading via AHB: clean the affected dcache, clean the affected L2 cache, then read via AHB.
When writing via AHB: write the data, then invalidate the affected dcache and icache on the current core and the affected L2 cache.

An open question: is using AHB here ever worth it? Should at least the "phys" operations always use APB?

2. Loading and dumping big chunks of data

Here the user likely expects all cores to see the same picture, so:

APB read: clean the affected dcache on all cores, clean the affected L2 cache, perform the read;
AHB read: same;
APB write: do the write; clean the affected dcache, clean the affected L2 cache, invalidate the affected icache and dcache on all cores;
AHB write: do the write; invalidate the affected icache and dcache on all cores, invalidate the affected L2 cache.

Oleksij also says that for the AHB operations here he'd like an option to omit the cache maintenance due to potential performance issues, but I have the impression there are none.

3. Software breakpoints

Breakpoints are special in an SMP configuration because the end user expects all cores to be affected by the same set of breakpoints.

Setting and clearing a breakpoint via APB: write memory, clean the affected dcache, clean the affected L2 cache, invalidate the affected icache on all cores.
Setting and clearing a breakpoint via AHB: write memory, invalidate the affected icache on all cores, invalidate the affected L2 cache.

Is it ever worth doing this via AHB?

There's a possible optimisation opportunity for when a step or resume operation is performed after the target was stopped on a breakpoint: currently OpenOCD uses a generic unset (to restore the original instruction), then single-steps the current core, and then sets the breakpoint again. When using APB these operations can spare the cache maintenance, as only the current core should be affected anyway.

It's getting late here, so I might be messing something up again. Please correct, discuss, and let's develop a consolidated sane approach to finally make OpenOCD do the right thing on those powerful ARM cores.
HTH
--
Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software!
mailto:[email protected]
_______________________________________________
OpenOCD-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/openocd-devel
