On Mon, 16 Oct 2023 02:42:08 +0100,
Chris Packham <judge.pack...@gmail.com> wrote:
>
> On Sun, Oct 15, 2023 at 10:29 AM Chris Packham <judge.pack...@gmail.com>
> wrote:
>
> >
> > On Sat, 14 Oct 2023, 11:04 am Marc Zyngier, <m...@kernel.org> wrote:
> >
> >> On 2023-10-13 03:40, Chris Packham wrote:
> >> > Hi Marc, Paul,
> >> >
> >> > On Sat, Mar 18, 2023 at 5:23 AM Ying-Chun Liu (PaulLiu)
> >> > <paul....@linaro.org> wrote:
> >> >>
> >> >> From: Marc Zyngier <m...@kernel.org>
> >> >>
> >> >> Some recent arm64 cores have a facility that allows the page
> >> >> table walker to track the dirty state of a page. This makes it
> >> >> really efficient to perform CMOs by VA as we only need to look
> >> >> at dirty pages.
> >> >>
> >> >> Signed-off-by: Marc Zyngier <m...@kernel.org>
> >> >> [ Paul: pick from the Android tree. Rebase to the upstream ]
> >> >> Signed-off-by: Ying-Chun Liu (PaulLiu) <paul....@linaro.org>
> >> >> Cc: Tom Rini <tr...@konsulko.com>
> >> >> Link:
> >> >> https://android.googlesource.com/platform/external/u-boot/+/3c433724e6f830a6b2edd5ec3d4a504794887263
> >> >
> >> > I think this may have caused a regression for the Marvell AC5X
> >> > board(s). I found that v2023.07 locked up at boot but v2023.01 was
> >> > fine. The lockup seemed to be in the 'Net:' init probably just as the
> >> > mvneta driver was being initialised.
> >> >
> >> > A git bisect led me to this change although for this specific change
> >> > instead of the lockup I get a crash so maybe I'm actually hitting a
> >> > different issue.
> >> >
> >> > Any thoughts as to why this may have caused problems?
> >>
> >> Not really. What CPUs does this platform have? What is the offending
> >> driver doing to trigger the issue? Can you provide some level of
> >> tracing?
> >
> >
> > The Marvell AC5X is a network switch ASIC with an integrated ARMv8 CPU (8.1
> > specifically I think).
> >
> > I think there is something that the mvneta driver is doing triggering the
> > issue. I have another AC5X based board without an Ethernet port that boots
> > just fine (this is also why I didn't notice earlier).
> >
> > I'll try and get some more debug out when I'm back in the office
> >
>
> The thing the mvneta driver does that upsets things appears to be
>
> mmu_set_region_dcache_behaviour((phys_addr_t)bd_space, BD_SPACE,
> DCACHE_OFF);
>
> I can comment that line out and everything works.

This leads to two questions:

- is the device cache coherent, in which case it doesn't need the
  memory being non-cacheable? If everything is OK, then why the switch
  to device memory?

- what goes wrong when these attributes are applied? do we have to
  split a block mapping?

Instrumenting the MMU code would certainly help understanding what
goes wrong here.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
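
As a rough illustration of the block-splitting question above, the check could be sketched as follows. This is not U-Boot code: it assumes a 4 KiB translation granule (so 2 MiB and 1 GiB block mappings), and the helper names `region_is_block_aligned`, `largest_unsplit_block` and `needs_split` are hypothetical, made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB granule, so level-2 blocks are 2 MiB and level-1 blocks 1 GiB. */
#define GRANULE_4K 0x1000ULL
#define BLOCK_2M   0x200000ULL
#define BLOCK_1G   0x40000000ULL

/* True when [start, start + size) begins and ends on `block` boundaries. */
static bool region_is_block_aligned(uint64_t start, uint64_t size, uint64_t block)
{
	return size != 0 && (start % block) == 0 && (size % block) == 0;
}

/*
 * Hypothetical helper: largest mapping size that can describe the region
 * exactly. Returns 0 if the region is not even page aligned.
 */
static uint64_t largest_unsplit_block(uint64_t start, uint64_t size)
{
	if (region_is_block_aligned(start, size, BLOCK_1G))
		return BLOCK_1G;
	if (region_is_block_aligned(start, size, BLOCK_2M))
		return BLOCK_2M;
	if (region_is_block_aligned(start, size, GRANULE_4K))
		return GRANULE_4K;
	return 0;
}

/*
 * Hypothetical helper: if the region is currently covered by a mapping of
 * size `existing_block`, changing the attributes of only this region means
 * that mapping has to be split unless the region is aligned to it.
 */
static bool needs_split(uint64_t start, uint64_t size, uint64_t existing_block)
{
	return !region_is_block_aligned(start, size, existing_block);
}
```

Printing the result of such a check, together with the address and size passed to mmu_set_region_dcache_behaviour(), would be one way to see whether the failing call is one that forces a split.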