[...]

>>> +     /*
>>> +      * Direct map restoration cannot fail, as the only error condition
>>> +      * for direct map manipulation is failure to allocate page tables
>>> +      * when splitting huge pages, but this split would have already
>>> +      * happened in folio_zap_direct_map() in
>>> kvm_gmem_folio_zap_direct_map().
>>> +      * Note that the splitting occurs always because guest_memfd
>>> +      * currently supports only base pages.
>>> +      * Thus folio_restore_direct_map() here only updates prot bits.
>>> +      */
>>> +     WARN_ON_ONCE(folio_restore_direct_map(folio));
>>
>> Which raised the question: why should this function then even return an
>> error?
> 
> Dave pointed earlier that the failures were possible [1].  Do you think
> we can document it better?

I'm fine with checking that somewhere (to catch any future problems).

Why not do the WARN_ON_ONCE() in folio_restore_direct_map()?

Then, carefully document (in the new kerneldoc for
folio_restore_direct_map() etc) that folio_restore_direct_map() is only
allowed after a prior successful folio_zap_direct_map(), and add a
helpful comment above the WARN_ON_ONCE() in folio_restore_direct_map()
that we don't expect errors etc.

[...]

>>> -     if (!is_prepared)
>>> +     if (!is_prepared) {
>>>                r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
>>> +             if (r)
>>> +                     goto out_unlock;
>>> +     }
>>> +
>>> +     if (kvm_gmem_no_direct_map(folio_inode(folio))) {
>>> +             r = kvm_gmem_folio_zap_direct_map(folio);
>>> +             if (r)
>>> +                     goto out_unlock;
>>> +     }
>>
>>
>> It's a bit nasty that we have two different places where we have to call
>> this. Smells error prone.
> 
> We will actually have 2 more: for the write() syscall and UFFDIO_COPY,
> and 0 once we have [2]
> 
> [2] https://lore.kernel.org/linux-mm/20260225-page_alloc-unmapped-v1-0-
> [email protected]/
> 
>>
>> I was wondering why kvm_gmem_get_folio() cannot handle that?
> 
> Most of the call sites follow the pattern alloc -> write -> zap so
> they'll need direct map for some time after the allocation.
> 

Okay. Nasty. :)

-- 
Cheers,

David

Reply via email to