On 2026-02-02 at 21:51 +1100, Thomas Hellström 
<[email protected]> wrote...
> On Mon, 2026-02-02 at 21:34 +1100, Alistair Popple wrote:
> > On 2026-02-02 at 20:13 +1100, Thomas Hellström
> > <[email protected]> wrote...
> > > On Sat, 2026-01-31 at 13:42 -0800, John Hubbard wrote:
> > > > On 1/31/26 11:00 AM, Matthew Brost wrote:
> > > > > On Sat, Jan 31, 2026 at 01:57:21PM +0100, Thomas Hellström
> > > > > wrote:
> > > > > > On Fri, 2026-01-30 at 19:01 -0800, John Hubbard wrote:
> > > > > > > On 1/30/26 10:00 AM, Andrew Morton wrote:
> > > > > > > > On Fri, 30 Jan 2026 15:45:29 +0100 Thomas Hellström
> > > > > > > > <[email protected]> wrote:
> > > > > > > ...
> > > > 
> > > > > 
> > > > > > I'm also not sure a folio refcount should block migration after
> > > > > > the introduction of pinned (like in pin_user_pages) pages.
> > > > > > Rather perhaps a folio pin-count should block migration and in
> > > > > > that case do_swap_page() can definitely do a sleeping folio lock
> > > > > > and the problem is gone.
> > > > 
> > > > A problem for that specific point is that pincount and refcount both
> > > > mean, "the page is pinned" (which in turn literally means "not allowed
> > > > to migrate/move").
> > > 
> > > Yeah this is what I actually want to challenge since this is what
> > > blocks us from doing a clean robust solution here. From brief reading
> > > of the docs around the pin-count implementation, I understand it as
> > > "If you want to access the struct page metadata, get a refcount. If
> > > you want to access the actual memory of a page, take a pin-count"
> > > 
> > > I guess that might still not be true for all old instances in the
> > > kernel using get_user_pages() instead of pin_user_pages() for things
> > > like DMA, but perhaps we can set that in stone and document it at
> > > least for device-private pages for now which would be sufficient for
> > > the do_swap_page() refcount not to block migration.
> > 
> > Having just spent a long time cleaning up a bunch of special rules/cases
> > for ZONE_DEVICE page refcounting I'm rather against reintroducing them
> > just for some ZONE_DEVICE pages. So whatever arguments are applied or
> > introduced here would need to be made to work for all pages, not just
> > some ZONE_DEVICE pages.
> 
> That's completely understandable. I would like to be able to say if we
> apply the argument that when checking the pin-count pages are locked,
> lru-isolated and with zero map-count then that would hold for all
> pages, but my knowledge of the mm internals isn't sufficient
> unfortunately.

We don't actually have a good model for pinning device-private pages anyway
so I'm open to discussion, but I don't think we need to do that to solve this
problem. I would appreciate it if you could look at the proposed solution in the
other thread a little bit more closely - AFAICT it should address your problem
by doing the same thing as replacing the trylock_page() with lock_page() without
requiring getting a page reference, etc.

 - Alistair

> So even if that would be an ultimate goal, we would probably have to be
> prepared to have to revert (at least temporarily) such a solution for
> !ZONE_DEVICE pages and have a plan for handling that.
> 
> Thanks,
> Thomas
> 
> 
> > 
> > > > 
> > > > (In fact, pincount is implemented in terms of refcount, in most
> > > > configurations still.)
> > > 
> > > Yes but that's only a space optimization never intended to conflict,
> > > right? Meaning a pin-count will imply a refcount but a refcount will
> > > never imply a pin-count?
> > > 
> > > Thanks,
> > > Thomas
> > > 
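
On the quoted point that pincount is implemented in terms of refcount: the
following is a minimal user-space sketch of that scheme for small (non-large)
folios, not kernel code. The struct and the model_* helpers are hypothetical
names made up for illustration; only the bias value is meant to match the
kernel's GUP_PIN_COUNTING_BIAS (1U << 10). It suggests that a pin always shows
up as references, while the reverse check is only a heuristic: enough plain
references can make an unpinned folio look pinned.

```c
/* User-space model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define PIN_BIAS 1024

/* Stand-in for a small folio that has no separate pin counter. */
struct model_folio {
	int refcount;
};

/* Plain reference, as a get_user_pages()-style user would take. */
static void model_get(struct model_folio *f)
{
	f->refcount += 1;
}

static void model_put(struct model_folio *f)
{
	assert(f->refcount >= 1);
	f->refcount -= 1;
}

/* Pin, as a pin_user_pages()-style user would take: a biased reference. */
static void model_pin(struct model_folio *f)
{
	f->refcount += PIN_BIAS;
}

static void model_unpin(struct model_folio *f)
{
	assert(f->refcount >= PIN_BIAS);
	f->refcount -= PIN_BIAS;
}

/*
 * Heuristic check: true whenever the refcount reaches the bias, so it can
 * report a false positive if plain references pile up.
 */
static bool model_maybe_pinned(const struct model_folio *f)
{
	return f->refcount >= PIN_BIAS;
}
```

In the kernel, large folios carry a separate pin counter, so this model only
covers the small-folio case where the two counts share one field.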
