On Fri, Feb 27, 2026 at 11:39:44AM +0100, Emanuele Ghidoli wrote: > > > On 2/27/26 11:13, Francis, Neha wrote: > > > > > > On 2/26/2026 10:01 PM, Tom Rini wrote: > >> On Thu, Feb 26, 2026 at 05:30:06PM +0100, Stefan Eichenberger wrote: > >>> Hi Francesco and Tom, > >>> > >>> On Thu, Feb 26, 2026 at 05:11:49PM +0100, Francesco Dolcini wrote: > >>>> +Emanuele > >>>> > >>>> Hello Tom, > >>>> > >>>> On Thu, Feb 26, 2026 at 08:23:45AM -0600, Tom Rini wrote: > >>>>> On Thu, Feb 26, 2026 at 08:05:02AM +0100, Francesco Dolcini wrote: > >>>>>> Hello Tom, > >>>>>> > >>>>>> On Fri, Mar 14, 2025 at 11:06:49AM +0100, Stefan Eichenberger wrote: > >>>>>>> From: Stefan Eichenberger <[email protected]> > >>>>>>> > >>>>>>> The get_ram_size() function fails to restore the original RAM data > >>>>>>> when > >>>>>>> the data cache is enabled. This issue was observed on an AM625 R5 SPL > >>>>>>> with 512MB of RAM and is a regression that became visible with > >>>>>>> commit bc07851897bd ("board: ti: Pull redundant DDR functions to a > >>>>>>> common > >>>>>>> location and Fixup DDR size when ECC is enabled"). > >>>>>>> > >>>>>>> Observed boot failure messages: > >>>>>>> Warning: Did not detect image signing certificate. Skipping > >>>>>>> authentication to prevent boot failure. This will fail on Security > >>>>>>> Enforcing(HS-SE) devices > >>>>>>> Authentication passed > >>>>>>> Starting ATF on ARM64 core... > >>>>>>> > >>>>>>> The system then hangs. This indicates that without a data cache flush, > >>>>>>> data in the cache is not coherent with RAM, preventing the system from > >>>>>>> booting. This was verified by printing the content of this address > >>>>>>> when > >>>>>>> the issue occurs. > >>>>>>> > >>>>>>> Add a data cache flush after each restore operation to resolve this > >>>>>>> issue. > >>>>>>> > >>>>>>> Fixes: bc07851897bd ("board: ti: Pull redundant DDR functions to a > >>>>>>> common location and Fixup DDR size when ECC is enabled") > >>>>>>> Fixes: 1c64b98c1ec4 ("common/memsize.c: Fix get_ram_size() when cache > >>>>>>> is enabled") > >>>>>>> Signed-off-by: Stefan Eichenberger <[email protected]> > >>>>>> > >>>>>> Tom, can we merge this? > >>>>>> This is the last bit to solve the regression reported here, > >>>>>> https://lore.kernel.org/all/20260224152405.GD340942@francesco-nb/ > >>>>> > >>>>> I wasn't happy with this at the time, and Stefan's last email in the > >>>>> thread left me with the impression more investigation was needed and > >>>>> likely something else was the root cause. > >>>> > >>>> I believe that this patch is needed. > >>>> > >>>> On AM62 what is happening is the following. > >>>> > >>>> We have a cortex-R5 that is the first core booting (there is also a > >>>> cortex-m4, but it's not relevant for this discussion). > >>>> > >>>> It runs from internal memory and it configures the DDR ram > >>>> > >>>> We load to DDR memory various pieces of firmware (TFA, U-Boot for the > >>>> cortex A53, ...) > >>>> > >>>> We do execute get_ram_size(), that read/write the memory, and it is > >>>> supposed to restore it back the original content > >>>> > >>>> However when we have the cache enabled, we might miss to write back the > >>>> original memory content, where the other pieces of firmware are. > >>>> > >>>> And after that we start the cortex A53, running in DDR, and there the > >>>> memory content might not be correct, because there is no cache coherency > >>>> between the cortex-A and the cortex-R. And because of that we have > >>>> crashes. > >>>> > >>>> Stefan: any comment here? Can you help? > >>> > >>> I think what you wrote summarises the issue well. If I recall correctly, > >>> I "fixed" the issue last time by simply calling get_ram_size() once > >>> before enabling the cache. This was in commit 4164289db882e. The SPL > >>> then informs U-Boot of the memory size via fdt fixup. However, something > >>> has probably changed now (possibly in the R5 SPL), meaning the cache is > >>> enabled earlier, so the cache is enabled again when get_ram_size() is > >>> called. > >>> > >>> For the AMP use case, either "get_ram_size" should not be called once > >>> the cache is enabled, or a similar patch to the one I proposed is > >>> required. > >> > >> I would lean towards the former if at all possible. > >> > > > > Just trying to understand, what is the reasoning behind ensuring > > get_ram_size is > > not called if cache is not enabled? Wasn't get_ram_size written with the > > possibility of cache being enabled (existence of dcache_en logic); then this > > patch is a valid fix right? > > > > In parallel, I do agree we need to have a code analysis w.r.t dram_init, we > > are > > making certain cache and dram calls spuriously making this confusing. > > > > Hello Tom, > I agree with Francis. > > When I proposed commit 1c64b98c1ec4 ("common/memsize.c: Fix get_ram_size() > when cache is enabled"), I was not considering the presence of other actors > (other cores, DMA engines, etc.). > > That patch fixes what I had overlooked at the time. We need to restore the > actual RAM contents, not only what is perceived by the core executing > get_ram_size(). > > To me this patch sounds intrinsically correct.
Alright. Can I please get some Reviewed / Tested by tags? Thanks. -- Tom
signature.asc
Description: PGP signature

