On 2026-02-02 at 21:51 +1100, Thomas Hellström <[email protected]> wrote...
> On Mon, 2026-02-02 at 21:34 +1100, Alistair Popple wrote:
> > On 2026-02-02 at 20:13 +1100, Thomas Hellström
> > <[email protected]> wrote...
> > > On Sat, 2026-01-31 at 13:42 -0800, John Hubbard wrote:
> > > > On 1/31/26 11:00 AM, Matthew Brost wrote:
> > > > > On Sat, Jan 31, 2026 at 01:57:21PM +0100, Thomas Hellström
> > > > > wrote:
> > > > > > On Fri, 2026-01-30 at 19:01 -0800, John Hubbard wrote:
> > > > > > > On 1/30/26 10:00 AM, Andrew Morton wrote:
> > > > > > > > On Fri, 30 Jan 2026 15:45:29 +0100 Thomas Hellström
> > > > > > > > <[email protected]> wrote:
> > > > > > > ...
> > > > > >
> > > > > > I'm also not sure a folio refcount should block migration
> > > > > > after the introduction of pinned (like in pin_user_pages)
> > > > > > pages. Rather perhaps a folio pin-count should block
> > > > > > migration and in that case do_swap_page() can definitely
> > > > > > do a sleeping folio lock and the problem is gone.
> > > >
> > > > A problem for that specific point is that pincount and refcount
> > > > both mean, "the page is pinned" (which in turn literally means
> > > > "not allowed to migrate/move").
> > >
> > > Yeah this is what I actually want to challenge since this is what
> > > blocks us from doing a clean robust solution here. From brief
> > > reading of the docs around the pin-count implementation, I
> > > understand it as "If you want to access the struct page metadata,
> > > get a refcount, If you want to access the actual memory of a page,
> > > take a pin-count"
> > >
> > > I guess that might still not be true for all old instances in the
> > > kernel using get_user_pages() instead of pin_user_pages() for
> > > things like DMA, but perhaps we can set that in stone and document
> > > it at least for device-private pages for now which would be
> > > sufficient for the do_swap_pages() refcount not to block migration.
> >
> > Having just spent a long time cleaning up a bunch of special
> > rules/cases for ZONE_DEVICE page refcounting I'm rather against
> > reintroducing them just for some ZONE_DEVICE pages. So whatever
> > arguments are applied or introduced here would need to be made to
> > work for all pages, not just some ZONE_DEVICE pages.
>
> That's completely understandable. I would like to be able to say if we
> apply the argument that when checking the pin-count pages are locked,
> lru-isolated and with zero map-count then that would hold for all
> pages, but my knowledge of the mm internals isn't sufficient
> unfortunately.

We don't actually have a good model for pinning device-private pages anyway
so I'm open to discussion, but I don't think we need to do that to solve this
problem. I would appreciate it if you could look at the proposed solution in
the other thread a little bit more closely - AFAICT it should address your
problem by doing the same thing as replacing the trylock_page() with
lock_page() without requiring getting a page reference, etc.

 - Alistair

> So even if that would be an ultimate goal, we would probably have to be
> prepared to have to revert (at least temporarily) such a solution for
> !ZONE_DEVICE pages and have a plan for handling that.
>
> Thanks,
> Thomas
>
> > > >
> > > > (In fact, pincount is implemented in terms of refcount, in most
> > > > configurations still.)
> > >
> > > Yes but that's only a space optimization never intended to
> > > conflict, right? Meaning a pin-count will imply a refcount but a
> > > refcount will never imply a pin-count?
> > >
> > > Thanks,
> > > Thomas
> > >
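
For anyone following the quoted pincount/refcount exchange, here is a minimal
userspace sketch of what "pincount is implemented in terms of refcount" can
look like for a small page. It is a toy model, not kernel code: struct
toy_page and the toy_*() helpers are hypothetical names invented for
illustration. The only value taken from the kernel is the bias, which matches
GUP_PIN_COUNTING_BIAS (1U << 10). Large folios, which track pins in a
separate counter, are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel's GUP_PIN_COUNTING_BIAS (1U << 10). */
#define TOY_PIN_BIAS (1U << 10)

/* Hypothetical stand-in for a small page: refs and pins share one counter. */
struct toy_page {
	unsigned int refcount;
};

/* A plain reference adds 1. */
static inline void toy_get(struct toy_page *p)
{
	p->refcount += 1;
}

static inline void toy_put(struct toy_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* A pin adds the whole bias, so every pin also shows up as references. */
static inline void toy_pin(struct toy_page *p)
{
	p->refcount += TOY_PIN_BIAS;
}

static inline void toy_unpin(struct toy_page *p)
{
	assert(p->refcount >= TOY_PIN_BIAS);
	p->refcount -= TOY_PIN_BIAS;
}

/* Heuristic check: true if pinned, or if at least TOY_PIN_BIAS plain refs exist. */
static inline bool toy_maybe_pinned(const struct toy_page *p)
{
	return p->refcount >= TOY_PIN_BIAS;
}
```

In this model a pin always raises the refcount, while a handful of plain
references never trips the check. Only when TOY_PIN_BIAS plain references
pile up does the check report a false positive, which is presumably why the
kernel helper is named folio_maybe_dma_pinned() rather than something
definite.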
