On 3/6/26 13:48, Nikita Kalyazin wrote:
> 
> 
> On 05/03/2026 17:34, David Hildenbrand (Arm) wrote:
>> On 1/26/26 17:47, Kalyazin, Nikita wrote:
>>> From: Nikita Kalyazin <[email protected]>
>>>
>>> These allow guest_memfd to remove its memory from the direct map.
>>> Only implement them for architectures that have direct map.
>>> In folio_zap_direct_map(), flush TLB on architectures where
>>> set_direct_map_valid_noflush() does not flush it internally.
>>
>> "Let's provide folio_{zap,restore}_direct_map helpers as preparation for
>> supporting removal of the direct map for guest_memfd folios. ...
> 
> Will update, thanks.
> 
>>
>>>
>>> The new helpers need to be accessible to KVM on architectures that
>>> support guest_memfd (x86 and arm64).  Since arm64 does not support
>>> building KVM as a module, only export them on x86.
>>>
>>> Direct map removal gives guest_memfd the same protection that
>>> memfd_secret does, such as hardening against Spectre-like attacks
>>> through in-kernel gadgets.
>>
>> Would it be possible to convert mm/secretmem.c as well?
>>
>> There, we use
>>
>>          set_direct_map_invalid_noflush(folio_page(folio, 0));
>>
>> and
>>
>>          set_direct_map_default_noflush(folio_page(folio, 0));
>>
>> Which is a bit different to below code. At least looking at the x86
>> variants, I wonder why we don't simply use
>> set_direct_map_valid_noflush().
>>
>>
>> If so, can you add a patch to do the conversion, pleeeeassse ? :)
> 
> Absolutely!
> 
>>
>>>
>>> Reviewed-by: Ackerley Tng <[email protected]>
>>> Signed-off-by: Nikita Kalyazin <[email protected]>
>>> ---
>>>   arch/arm64/include/asm/set_memory.h     |  2 ++
>>>   arch/arm64/mm/pageattr.c                | 12 ++++++++++++
>>>   arch/loongarch/include/asm/set_memory.h |  2 ++
>>>   arch/loongarch/mm/pageattr.c            | 12 ++++++++++++
>>>   arch/riscv/include/asm/set_memory.h     |  2 ++
>>>   arch/riscv/mm/pageattr.c                | 12 ++++++++++++
>>>   arch/s390/include/asm/set_memory.h      |  2 ++
>>>   arch/s390/mm/pageattr.c                 | 12 ++++++++++++
>>>   arch/x86/include/asm/set_memory.h       |  2 ++
>>>   arch/x86/mm/pat/set_memory.c            | 20 ++++++++++++++++++++
>>>   include/linux/set_memory.h              | 10 ++++++++++
>>>   11 files changed, 88 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/include/asm/set_memory.h
>>> index c71a2a6812c4..49fd54f3c265 100644
>>> --- a/arch/arm64/include/asm/set_memory.h
>>> +++ b/arch/arm64/include/asm/set_memory.h
>>> @@ -15,6 +15,8 @@ int set_direct_map_invalid_noflush(const void *addr);
>>>   int set_direct_map_default_noflush(const void *addr);
>>>   int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
>>>                                 bool valid);
>>> +int folio_zap_direct_map(struct folio *folio);
>>> +int folio_restore_direct_map(struct folio *folio);
>>>   bool kernel_page_present(struct page *page);
>>>
>>>   int set_memory_encrypted(unsigned long addr, int numpages);
>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>> index e2bdc3c1f992..0b88b0344499 100644
>>> --- a/arch/arm64/mm/pageattr.c
>>> +++ b/arch/arm64/mm/pageattr.c
>>> @@ -356,6 +356,18 @@ int set_direct_map_valid_noflush(const void *addr, unsigned long numpages,
>>>        return set_memory_valid((unsigned long)addr, numpages, valid);
>>>   }
>>>
>>> +int folio_zap_direct_map(struct folio *folio)
>>> +{
>>> +     return set_direct_map_valid_noflush(folio_address(folio),
>>> +                                         folio_nr_pages(folio), false);
>>> +}
>>> +
>>> +int folio_restore_direct_map(struct folio *folio)
>>> +{
>>> +     return set_direct_map_valid_noflush(folio_address(folio),
>>> +                                         folio_nr_pages(folio), true);
>>> +}
>>
>> Is there a good reason why we cannot have two generic inline functions
>> that simply call set_direct_map_valid_noflush() ?
>>
>> Is it because of some flushing behavior? (which we could figure out)
> 
> Yes, on x86 we need an explicit flush.  Other architectures deal with it
> internally.  

So, we call a _noflush function and it performs a ... flush. What.

Take a look at secretmem_fault(), where we do an unconditional
flush_tlb_kernel_range().

Do we end up double-flushing in that case?

> Do you propose a bespoke implementation for x86 and a
> "generic" one for others?

We have to find a way to have a single set of functions for all archs
that support direct map removal.

One option might be to have some indication from the architecture that
no flush_tlb_kernel_range() is required.

Could be a config option or some simple helper function.

-- 
Cheers,

David
