Vlastimil Babka <[email protected]> writes:

> On 10/17/25 22:11, Ackerley Tng wrote:
>> filemap_add_folio(), called from filemap_grab_folio(), adds folios to
>> an LRU list. This is unnecessary for guest_memfd, which does not
>> participate in swapping.
>
> IIRC guest_memfd mappings are unevictable. That should mean they are not
> ultimately added to a list (see lruvec_add_folio()).
>
>> In addition, the LRU list takes a reference count on the folio. With
>
> IIUC the refcount is temporary while being on the percpu
> &cpu_fbatches.lru_add, added by __folio_batch_add_and_move().

Thanks for pointing this out. You're right; I misunderstood the
refcounting earlier.

> When flushed
> via folio_batch_move_lru(), the refcount is removed and there's only the LRU
> folio flag that remains. The fbatch flushing can be triggered if you see an
> unexpected refcount increase.

The new plan is to update kvm_gmem_is_safe_for_conversion() to drain
the fbatch if an elevated refcount is found:

static bool kvm_gmem_is_safe_for_conversion(struct inode *inode,
                                            pgoff_t start, size_t nr_pages,
                                            pgoff_t *err_index)
{
        struct address_space *mapping = inode->i_mapping;
        const int filemap_get_folios_refcount = 1;
        pgoff_t last = start + nr_pages - 1;
        struct folio_batch fbatch;
        bool lru_drained = false;
        bool safe = true;
        int i;

        folio_batch_init(&fbatch);
        while (safe && filemap_get_folios(mapping, &start, last, &fbatch)) {
                for (i = 0; i < folio_batch_count(&fbatch);) {
                        struct folio *folio = fbatch.folios[i];

                        safe = (folio_ref_count(folio) ==
                                folio_nr_pages(folio) +
                                filemap_get_folios_refcount);

                        if (safe) {
                                ++i;
                        } else if (!lru_drained) {
                                /* Flush percpu LRU fbatches, then recheck. */
                                lru_add_drain_all();
                                lru_drained = true;
                        } else {
                                *err_index = folio->index;
                                break;
                        }
                }

                folio_batch_release(&fbatch);
        }

        return safe;
}

I hope this is what you meant!
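To double-check the retry structure, here is a small userspace model of the
check-drain-retry logic (hypothetical types, not kernel code): each simulated
folio carries a refcount, an entry still sitting on a percpu LRU fbatch holds
one extra reference, and a simulated drain releases it. The drain is attempted
at most once, mirroring the lru_drained flag above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not a kernel structure. */
struct fake_folio {
	int refcount;   /* current reference count */
	int expected;   /* refs held by the filemap plus our lookup */
	bool on_fbatch; /* still on a (simulated) percpu LRU fbatch */
};

/* Simulated lru_add_drain_all(): flush batches, dropping their refs. */
static void fake_drain(struct fake_folio *folios, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (folios[i].on_fbatch) {
			folios[i].on_fbatch = false;
			folios[i].refcount--;
		}
	}
}

/* Mirrors the retry structure of kvm_gmem_is_safe_for_conversion(). */
static bool check_safe(struct fake_folio *folios, size_t n,
		       size_t *err_index)
{
	bool lru_drained = false;
	bool safe = true;

	for (size_t i = 0; safe && i < n;) {
		safe = folios[i].refcount == folios[i].expected;

		if (safe) {
			i++;
		} else if (!lru_drained) {
			/* Drain once, then recheck the same folio. */
			fake_drain(folios, n);
			lru_drained = true;
			safe = true;
		} else {
			*err_index = i;
			break;
		}
	}
	return safe;
}
```

In this model, a folio whose only extra reference comes from a pending fbatch
passes after the one-shot drain, while any other elevated refcount makes the
check fail with the offending index reported.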

> So it might be feasible to do without this
> patch (maybe it was already tried and there were substantial issues, in
> which case should be mentioned).
>

The patch "KVM: guest_memfd: Skip LRU for guest_memfd folios" will be
dropped from the next revision, and "KVM: guest_memfd: Don't set
FGP_ACCESSED when getting folios" is no longer a requirement for this
patch series.

>> shared-to-private memory conversions for KVM guests dependent on folio
>> refcounts, this extra reference can cause conversions to fail due to
>> unexpected refcounts.
>>
>>
>> [...snip...]
>>