Hi Casey,
On Tue, 12 May 2026 at 14:13, Casey Connolly <[email protected]> wrote:
>
> As more platforms start ensuring they explicitly unmap reserved-memory
> regions a few issues have appeared with how the existing dynamic mapping
> code works. Fix these and get a small optimisation as well.
>
> 1. Teach pte_type() to actually respect the PTE_TYPE_VALID bit
> 2. Don't walk the TLB a second time if we call mmu_change_region_attr()
> with PTE_TYPE_FAULT (since it would just be a slow nop)
> 3. Fix how set_one_region() decides to split blocks.
>
> Today set_one_region() will always split blocks until it reaches the
> smallest granule size (4k) and then update all of these pages. This
> appears to be due to a bug in how is_aligned() is implemented, since
> it only evaluates to true if addr and size are both multiples of the
> current granule size, so a mapping aligned to 2M which is 4M in size
> will cleanly result in 2 blocks being set, but a mapping aligned to
> 2M which is 4M + 8k in size will result in blocks being split and 1026
> individual pages being set.
>
> While for the address it is correct to enforce that it is aligned to
> the current granule size, we only need to check if the region size is
> greater than the current granule size. This allows us to simplify our
> second example above to only 4 entries being updated (assuming no blocks
> have to be split) since we only need to update 2 blocks to map the first
> 4M, drastically improving the best-case performance.
>
> In the case where the address is 4k aligned rather than 2M aligned we
> will still be restricted to mapping 4k pages until we reach 2M alignment
> where we could then map a larger 2M granule which previously would never
> happen.
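The entry counts quoted above can be checked with a small standalone
sketch. This is only a model, not the U-Boot code: count_entries() and the
is_aligned_old()/is_aligned_new() names are made up for illustration, it
assumes a 4k granule with 1G/2M/4k levels, and it assumes no block ever
has to be split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

typedef bool (*align_fn)(uint64_t addr, uint64_t size, uint64_t align);

/* Pre-patch check: both addr and size must be multiples of align */
static bool is_aligned_old(uint64_t addr, uint64_t size, uint64_t align)
{
	return !(addr & (align - 1)) && !(size & (align - 1));
}

/* Post-patch check: addr must be aligned, size only has to cover align */
static bool is_aligned_new(uint64_t addr, uint64_t size, uint64_t align)
{
	return !(addr & (align - 1)) && size >= align;
}

/*
 * Hypothetical model of the region walk: at each step take the largest
 * level (1G, 2M, 4k) that the check accepts, count one updated entry and
 * advance by that level's size.
 */
static unsigned long count_entries(uint64_t start, uint64_t size,
				   align_fn aligned)
{
	static const uint64_t granules[] = { SZ_1G, SZ_2M, SZ_4K };
	unsigned long entries = 0;

	/* The model only handles 4k-aligned regions */
	assert(!(start & (SZ_4K - 1)) && !(size & (SZ_4K - 1)));

	while (size) {
		for (unsigned int i = 0; i < 3; i++) {
			if (aligned(start, size, granules[i])) {
				start += granules[i];
				size -= granules[i];
				entries++;
				break;
			}
		}
	}

	return entries;
}
```

With a 2M-aligned start such as 0x40200000, a 4M region gives 2 entries
with either check, while a 4M + 8k region gives 1026 entries with the old
check and 4 with the new one.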
>
> Signed-off-by: Casey Connolly <[email protected]>
> ---
> arch/arm/cpu/armv8/cache_v8.c | 18 ++++++++++++++----
> 1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
> index e9ce6335ee9e..bcfc36603adb 100644
> --- a/arch/arm/cpu/armv8/cache_v8.c
> +++ b/arch/arm/cpu/armv8/cache_v8.c
> @@ -162,9 +162,9 @@ u64 get_tcr(u64 *pips, u64 *pva_bits)
> #define MAX_PTE_ENTRIES 512
>
> static int pte_type(u64 *pte)
> {
> - return *pte & PTE_TYPE_MASK;
> + return *pte & PTE_TYPE_VALID ? *pte & PTE_TYPE_MASK : PTE_TYPE_FAULT;
> }
>
> /* Returns the LSB number for a PTE on level <level> */
> static int level2shift(int level)
> @@ -980,11 +980,12 @@ u64 *__weak arch_get_page_table(void) {
>
> return NULL;
> }
>
> +/* Checks if the current PTE is an aligned subset of the region */
> static bool is_aligned(u64 addr, u64 size, u64 align)
> {
> - return !(addr & (align - 1)) && !(size & (align - 1));
> + return !(addr & (align - 1)) && size >= align;
> }
>
> /* Use flag to indicate if attrs has more than d-cache attributes */
> static u64 set_one_region(u64 start, u64 size, u64 attrs, bool flag, int
> level)
> @@ -992,11 +993,16 @@ static u64 set_one_region(u64 start, u64 size, u64
> attrs, bool flag, int level)
> int levelshift = level2shift(level);
> u64 levelsize = 1ULL << levelshift;
> u64 *pte = find_pte(start, level);
>
> - /* Can we can just modify the current level block PTE? */
> + /* Can we can just modify the current level block/page? */
> if (is_aligned(start, size, levelsize)) {
> - if (flag) {
> + if (attrs == PTE_TYPE_FAULT) {
> + if (pte_type(pte) == PTE_TYPE_TABLE)
The code will still unmap the wrong regions here. One easy way to
reproduce this is to apply the following for QEMU, enable meminfo and
have a look at the output.
index 38f0ec5f2fbb..c001ccca878d 100644
--- a/board/emulation/qemu-arm/qemu-arm.c
+++ b/board/emulation/qemu-arm/qemu-arm.c
@@ -114,6 +114,7 @@ int board_late_init(void)
if (CONFIG_IS_ENABLED(USB_KEYBOARD))
usb_init();
+ mmu_change_region_attr(0x50100000, 0x200000, PTE_TYPE_FAULT);
return 0;
}
So, what I think is happening is that pte_type() doesn't take the
level into account. For L3 entries, the PTE_TYPE_TABLE check (which is
really PTE_TYPE_PAGE at that level) will always be true for any valid
entry.
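To illustrate, here is a minimal standalone sketch, not a tested fix. The
defines are reproduced from what I believe asm/armv8/mmu.h contains, and
pte_is_table() is a made-up helper name, so treat both as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match arch/arm/include/asm/armv8/mmu.h */
#define PTE_TYPE_MASK	(3 << 0)
#define PTE_TYPE_FAULT	(0 << 0)
#define PTE_TYPE_TABLE	(3 << 0)
#define PTE_TYPE_PAGE	(3 << 0)
#define PTE_TYPE_BLOCK	(1 << 0)
#define PTE_TYPE_VALID	(1 << 0)

/* A level 3 page and a level 0-2 table share the same low-bit encoding */
static_assert(PTE_TYPE_TABLE == PTE_TYPE_PAGE, "encodings are identical");

/* pte_type() as proposed in the patch: it has no notion of level */
static int pte_type(const uint64_t *pte)
{
	return *pte & PTE_TYPE_VALID ? *pte & PTE_TYPE_MASK : PTE_TYPE_FAULT;
}

/*
 * Hypothetical level-aware check: only levels 0-2 can hold a table
 * descriptor, at level 3 the same bits describe a page.
 */
static bool pte_is_table(const uint64_t *pte, int level)
{
	return level < 3 && pte_type(pte) == PTE_TYPE_TABLE;
}
```

With something like this, a valid level 3 page entry still compares equal
to PTE_TYPE_TABLE through pte_type(), but pte_is_table(pte, 3) returns
false, so the caller can tell the two cases apart.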
Cheers
/Ilias
> + *pte &= ~0;
> + else
> + *pte &= ~(PMD_ATTRMASK | PTE_TYPE_MASK |
> PTE_BLOCK_INNER_SHARE);
> + } else if (flag) {
> *pte &= ~PMD_ATTRMASK;
> *pte |= attrs & PMD_ATTRMASK;
> } else {
> *pte &= ~PMD_ATTRINDX_MASK;
> @@ -1097,8 +1103,12 @@ void mmu_change_region_attr(phys_addr_t addr, size_t
> size, u64 attrs)
> flush_dcache_range(gd->arch.tlb_addr,
> gd->arch.tlb_addr + gd->arch.tlb_size);
> __asm_invalidate_tlb_all();
>
> + /* If we were unmapping a region then we have nothing to make and can
> return. */
> + if (attrs == PTE_TYPE_FAULT)
> + return;
> +
> mmu_change_region_attr_nobreak(addr, size, attrs);
> }
>
> int pgprot_set_attrs(phys_addr_t addr, size_t size, enum pgprot_attrs perm)
>
> --
> 2.53.0
>