On 06/03/2026 14:17, David Hildenbrand (Arm) wrote:
On 3/6/26 13:48, Nikita Kalyazin wrote:
On 05/03/2026 17:34, David Hildenbrand (Arm) wrote:
On 1/26/26 17:47, Kalyazin, Nikita wrote:
From: Nikita Kalyazin <[email protected]>
These allow guest_memfd to remove its memory from the direct map.
Only implement them for architectures that have a direct map.
In folio_zap_direct_map(), flush the TLB on architectures where
set_direct_map_valid_noflush() does not flush it internally.
"Let's provide folio_{zap,restore}_direct_map helpers as preparation for
supporting removal of the direct map for guest_memfd folios. ...
Will update, thanks.
The new helpers need to be accessible to KVM on architectures that
support guest_memfd (x86 and arm64). Since arm64 does not support
building KVM as a module, only export them on x86.
Direct map removal gives guest_memfd the same protection that
memfd_secret does, such as hardening against Spectre-like attacks
through in-kernel gadgets.
Would it be possible to convert mm/secretmem.c as well?
There, we use
set_direct_map_invalid_noflush(folio_page(folio, 0));
and
set_direct_map_default_noflush(folio_page(folio, 0));
Which is a bit different from the code below. At least looking at the x86
variants, I wonder why we don't simply use
set_direct_map_valid_noflush().
If so, can you add a patch to do the conversion, pleeeeassse ? :)
Absolutely!
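(For reference, a rough sketch of what the mm/secretmem.c conversion could
look like, assuming the folio_{zap,restore}_direct_map() helpers from this
patch land as proposed; untested, just to show the shape:)

```c
/* mm/secretmem.c conversion sketch -- helper names from this series */

/* before: */
set_direct_map_invalid_noflush(folio_page(folio, 0));
/* ... */
set_direct_map_default_noflush(folio_page(folio, 0));

/* after: */
folio_zap_direct_map(folio);
/* ... */
folio_restore_direct_map(folio);
```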
Reviewed-by: Ackerley Tng <[email protected]>
Signed-off-by: Nikita Kalyazin <[email protected]>
---
arch/arm64/include/asm/set_memory.h | 2 ++
arch/arm64/mm/pageattr.c | 12 ++++++++++++
arch/loongarch/include/asm/set_memory.h | 2 ++
arch/loongarch/mm/pageattr.c | 12 ++++++++++++
arch/riscv/include/asm/set_memory.h | 2 ++
arch/riscv/mm/pageattr.c | 12 ++++++++++++
arch/s390/include/asm/set_memory.h | 2 ++
arch/s390/mm/pageattr.c | 12 ++++++++++++
arch/x86/include/asm/set_memory.h | 2 ++
arch/x86/mm/pat/set_memory.c | 20 ++++++++++++++++++++
include/linux/set_memory.h | 10 ++++++++++
11 files changed, 88 insertions(+)
diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/include/asm/set_memory.h
index c71a2a6812c4..49fd54f3c265 100644
--- a/arch/arm64/include/asm/set_memory.h
+++ b/arch/arm64/include/asm/set_memory.h
@@ -15,6 +15,8 @@ int set_direct_map_invalid_noflush(const void *addr);
int set_direct_map_default_noflush(const void *addr);
int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
				 bool valid);
+int folio_zap_direct_map(struct folio *folio);
+int folio_restore_direct_map(struct folio *folio);
bool kernel_page_present(struct page *page);
int set_memory_encrypted(unsigned long addr, int numpages);
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index e2bdc3c1f992..0b88b0344499 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -356,6 +356,18 @@ int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
return set_memory_valid((unsigned long)addr, numpages, valid);
}
+int folio_zap_direct_map(struct folio *folio)
+{
+ return set_direct_map_valid_noflush(folio_address(folio),
+ folio_nr_pages(folio), false);
+}
+
+int folio_restore_direct_map(struct folio *folio)
+{
+ return set_direct_map_valid_noflush(folio_address(folio),
+ folio_nr_pages(folio), true);
+}
Is there a good reason why we cannot have two generic inline functions
that simply call set_direct_map_valid_noflush() ?
Is it because of some flushing behavior? (which we could figure out)
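(For illustration, the two generic inlines being suggested would presumably
look something like this in include/linux/set_memory.h; a sketch only,
viable if no architecture needs an extra TLB flush on top of
set_direct_map_valid_noflush():)

```c
/* Sketch: generic variants, usable by every arch that implements
 * set_direct_map_valid_noflush(), if no extra flush is required. */
static inline int folio_zap_direct_map(struct folio *folio)
{
	return set_direct_map_valid_noflush(folio_address(folio),
					    folio_nr_pages(folio), false);
}

static inline int folio_restore_direct_map(struct folio *folio)
{
	return set_direct_map_valid_noflush(folio_address(folio),
					    folio_nr_pages(folio), true);
}
```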
Yes, on x86 we need an explicit flush. Other architectures deal with it
internally.
So, we call a _noflush function and it performs a ... flush. What.
Yeah, that's unfortunately the status quo, as pointed out by Aneesh [1]
[1] https://lore.kernel.org/kvm/[email protected]/
Take a look at secretmem_fault(), where we do an unconditional
flush_tlb_kernel_range().
Do we end up double-flushing in that case?
Yes, it looks like it. I'll remove the explicit flush and rely on
folio_zap_direct_map().
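(Sketch of what that would mean in secretmem_fault(), assuming
folio_zap_direct_map() flushes internally where the arch requires it;
the error handling here is illustrative, not the actual code:)

```c
	/* Zap the direct map entry; the helper flushes the TLB on
	 * architectures where the _noflush primitive does not. */
	err = folio_zap_direct_map(folio);
	if (err) {
		folio_put(folio);
		ret = vmf_error(err);
		goto out;
	}
	/* flush_tlb_kernel_range(addr, addr + PAGE_SIZE);  <- now redundant */
```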
Do you propose a bespoke implementation for x86 and a
"generic" one for others?
We have to find a way to have a single set of functions for all archs
that support direct map removal.
I believe Dave meant to address that with
folio_{zap,restore}_direct_map() [2].
[2]
https://lore.kernel.org/kvm/[email protected]/
One option might be to have some indication from the architecture that
no flush_tlb_kernel_range() is required.
Could be a config option or some simple helper function.
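(For instance, the helper-function variant might take this shape; the name
and default are purely hypothetical:)

```c
/* Hypothetical per-arch opt-out: an arch overrides this if its
 * set_direct_map_valid_noflush() already leaves no stale TLB
 * entries behind, so callers can skip flush_tlb_kernel_range(). */
#ifndef arch_direct_map_needs_flush
static inline bool arch_direct_map_needs_flush(void)
{
	return true;	/* conservative default: caller must flush */
}
#endif
```

A generic folio_zap_direct_map() could then call flush_tlb_kernel_range()
only when arch_direct_map_needs_flush() returns true.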
I'd be inclined to know what arch maintainers think because I don't have
a strong opinion on that.
--
Cheers,
David