On 5/13/26 10:37, David Hildenbrand (Arm) wrote:
> On 5/13/26 09:47, Christian König wrote:
>> Hi David & Thomas,
>>
>> On 5/12/26 22:03, David Hildenbrand (Arm) wrote:
>>> On 5/12/26 13:31, Thomas Hellström wrote:
>> ...
>>>>
>>>> OK, can eliminate those. Is VM_WARN_ON_FOLIO() preferred,
>>>> or any other type of assert?
>>>
>>> VM_WARN_ON_FOLIO() is usually what you want, or VM_WARN_ON_ONCE().
>>>
>>>>
>>>>
>>>> OK, let me understand the concern. The pages are allocated as
>>>> multi-page folios using alloc_pages(gfp, order), but typically not
>>>> promoted to compound pages, until inserted here. Is it that promotion
>>>> that is of concern or inserting pages of unknown origin into shmem?
>>>> Anything we can do to alleviate that concern?
>>>
>>> It's all rather questionable.
>>>
>>> A couple of points:
>>>
>>> a) The pages are allocated to be unmovable, but adding them to shmem
>>> effectively turns them movable. Now you interfere with the page
>>> allocator logic of placing movable and unmovable pages a reasonable
>>> way into pageblocks that group allocations of similar types.
>>>
>>> b) A driver is not supposed to decide which folio size will be
>>> allocated for shmem.
>>
>> Exactly that is one of the major reasons why we aren't using a shmem as
>> backing store for TTM buffers in the first place.
>
> What was the problem with that the last time this was considered?
>
> shmem nowadays supports THP (e.g., 2M) and even mTHP (e.g., 64K).
>
> For internal mounts, it must be enabled accordingly
> (/sys/kernel/mm/transparent_hugepage/.../shmem_enabled).
>
> Some distributions still default to "never". I guess if an admin enables
> it, you would just get THPs.

Yeah, exactly that is not acceptable.

We have some customers who already use that approach through udmabuf, so
we already have some experience with it.

And I can't count how often I had to explain that it's a configuration
issue and that the admin has to enable THP to get decent performance.

> If "distro default" is the only problem, I guess we could think about
> how to improve that. For example, just let internal GPU DRM objects
> allocate any folio size available and supported etc.

Mhm, that sounds not so bad.

I think what drivers really need is to be able to pass the order to
shmem_read_folio_gfp() and get back either a folio of that order or
-ENOMEM.

In other words, we need to enforce it, and if the desired page size
doesn't work we can then still decide whether we want a fallback or not,
based on the use case the driver tries to implement.

> Would that make it possible to just use shmem natively? (e.g., how would
> this interact with shmem features like folio migration, would that be
> workable with DRM objects?).

Mostly. I mean, there is still the use case for UC and USWC memory, but at
least for AMD GPUs that is mostly negligible (we need it for a handful of
workarounds for HW bugs etc.).

Thanks,
Christian.
