On 06/03/2026 14:17, David Hildenbrand (Arm) wrote:
On 3/6/26 13:48, Nikita Kalyazin wrote:


On 05/03/2026 17:34, David Hildenbrand (Arm) wrote:
On 1/26/26 17:47, Kalyazin, Nikita wrote:
From: Nikita Kalyazin <[email protected]>

These allow guest_memfd to remove its memory from the direct map.
Only implement them for architectures that have a direct map.
In folio_zap_direct_map(), flush TLB on architectures where
set_direct_map_valid_noflush() does not flush it internally.

"Let's provide folio_{zap,restore}_direct_map helpers as preparation for
supporting removal of the direct map for guest_memfd folios. ...

Will update, thanks.



The new helpers need to be accessible to KVM on architectures that
support guest_memfd (x86 and arm64).  Since arm64 does not support
building KVM as a module, only export them on x86.

Direct map removal gives guest_memfd the same protection that
memfd_secret does, such as hardening against Spectre-like attacks
through in-kernel gadgets.

Would it be possible to convert mm/secretmem.c as well?

There, we use

          set_direct_map_invalid_noflush(folio_page(folio, 0));

and

          set_direct_map_default_noflush(folio_page(folio, 0));

which is a bit different from the code below. At least looking at the x86
variants, I wonder why we don't simply use
set_direct_map_valid_noflush().


If so, can you add a patch to do the conversion, pleeeeassse? :)

Absolutely!



Reviewed-by: Ackerley Tng <[email protected]>
Signed-off-by: Nikita Kalyazin <[email protected]>
---
   arch/arm64/include/asm/set_memory.h     |  2 ++
   arch/arm64/mm/pageattr.c                | 12 ++++++++++++
   arch/loongarch/include/asm/set_memory.h |  2 ++
   arch/loongarch/mm/pageattr.c            | 12 ++++++++++++
   arch/riscv/include/asm/set_memory.h     |  2 ++
   arch/riscv/mm/pageattr.c                | 12 ++++++++++++
   arch/s390/include/asm/set_memory.h      |  2 ++
   arch/s390/mm/pageattr.c                 | 12 ++++++++++++
   arch/x86/include/asm/set_memory.h       |  2 ++
   arch/x86/mm/pat/set_memory.c            | 20 ++++++++++++++++++++
   include/linux/set_memory.h              | 10 ++++++++++
   11 files changed, 88 insertions(+)

diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/include/asm/set_memory.h
index c71a2a6812c4..49fd54f3c265 100644
--- a/arch/arm64/include/asm/set_memory.h
+++ b/arch/arm64/include/asm/set_memory.h
@@ -15,6 +15,8 @@ int set_direct_map_invalid_noflush(const void *addr);
   int set_direct_map_default_noflush(const void *addr);
   int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
                                  bool valid);
+int folio_zap_direct_map(struct folio *folio);
+int folio_restore_direct_map(struct folio *folio);
   bool kernel_page_present(struct page *page);

   int set_memory_encrypted(unsigned long addr, int numpages);
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index e2bdc3c1f992..0b88b0344499 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -356,6 +356,18 @@ int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
        return set_memory_valid((unsigned long)addr, numpages, valid);
   }

+int folio_zap_direct_map(struct folio *folio)
+{
+     return set_direct_map_valid_noflush(folio_address(folio),
+                                         folio_nr_pages(folio), false);
+}
+
+int folio_restore_direct_map(struct folio *folio)
+{
+     return set_direct_map_valid_noflush(folio_address(folio),
+                                         folio_nr_pages(folio), true);
+}

Is there a good reason why we cannot have two generic inline functions
that simply call set_direct_map_valid_noflush() ?

Is it because of some flushing behavior? (which we could figure out)

Yes, on x86 we need an explicit flush.  Other architectures deal with it
internally.

So, we call a _noflush function and it performs a ... flush. What.

Yeah, that's unfortunately the status quo, as pointed out by Aneesh [1].

[1] https://lore.kernel.org/kvm/[email protected]/


Take a look at secretmem_fault(), where we do an unconditional
flush_tlb_kernel_range().

Do we end up double-flushing in that case?

Yes, it looks that way. I'll remove the explicit flush from secretmem_fault() and rely on folio_zap_direct_map().


Do you propose a bespoke implementation for x86 and a
"generic" one for others?

We have to find a way to have a single set of functions for all archs
that support direct map removal.

I believe Dave meant to address that with folio_{zap,restore}_direct_map() [2].

[2] https://lore.kernel.org/kvm/[email protected]/


One option might be to have some indication from the architecture that
no flush_tlb_kernel_range() is required.

Could be a config option or some simple helper function.

I'd like to hear what the arch maintainers think, because I don't have a strong opinion on that.


--
Cheers,

David

