On Wed, 2026-05-13 at 12:03 +0200, David Hildenbrand (Arm) wrote:
> On 5/13/26 10:51, Thomas Hellström wrote:
> > On Wed, 2026-05-13 at 10:37 +0200, David Hildenbrand (Arm) wrote:
> > > On 5/13/26 09:47, Christian König wrote:
> > > > Hi David & Thomas,
> > > >
> > > > ...
> > > >
> > > > Exactly that is one of the major reasons why we aren't using a
> > > > shmem as backing store for TTM buffers in the first place.
> > >
> > > What was the problem with that the last time this was considered?
> > >
> > > shmem nowadays supports THP (e.g., 2M) and even mTHP (e.g., 64K).
> > >
> > > For internal mounts, it must be enabled accordingly
> > > (/sys/kernel/mm/transparent_hugepage/.../shmem_enabled).
> > >
> > > Some distributions still default to "never". I guess if an admin
> > > enables it, you would just get THPs.
> >
> > FWIW, the i915 driver which uses shmem "natively" uses a special
> > mount here that gives back THPs.
> >
> > >
> > > If "distro default" is the only problem, I guess we could think
> > > about how to improve that. For example, just let internal GPU DRM
> > > objects allocate any folio size available and supported etc.
> > >
> > > Would that make it possible to just use shmem natively? (e.g.,
> > > how would this interact with shmem features like folio migration,
> > > would that be workable with DRM objects?).
> >
> > Currently the drivers that use shmem in this way use
> > "mapping_set_unevictable()" as long as the object is bound to the
> > GPU. Then shrinkers can unbind from GPU and revert that setting.
>
> Right, but mapping_set_unevictable() only affects folio_evictable()
> -- reclaim behavior. Not other properties (such as folio migration).

Interesting. Does that imply that a shmem folio can be replaced
underneath without additional measures? It looks like most DRM call
sites assume that mapping_set_unevictable() pins the underlying shmem
folios.

> >
> > The problem, (as also stated in the cover letter of this series) is
> > for drivers that need to change caching of the pages to WC or UC.
>
> I assume you mean "To be able to easily maintain pools of pages
> mapped uncached or write-combined".
>

Exactly.

> Can you point me at the code that changes the caching of the pages?

x86 implementation is here:
https://elixir.bootlin.com/linux/v7.1-rc3/source/arch/x86/mm/pat/set_memory.c#L2556

TTM calls it here:
https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/ttm/ttm_pool.c#L249

And there are actually shmem helpers that do this as well, without
pooling:
https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L212

>
> > That's an
> > extremely costly operation so TTM needs to pool such allocations.
> > That's where using shmem natively becomes very ugly, because you
> > can't really use a 1:1 mapping between shmem objects and DRM
> > objects anymore.
>
> So you would require different caching attributes within a DRM
> object?

The way the TTM pools work is that there is a separate pool for each
combination of allocation order and caching mode. That would
essentially mean allocations from a single shmem object would be
spread out across different pools, and we'd lose the 1:1 mapping
between DRM objects and shmem objects.

One alternative would be a single large sparse shmem object common to
all DRM objects, with a range allocator, but that also got pretty ugly
when I tried to implement it.

Finally (and I think that might be what Christian was getting at as
well), without CONFIG_TRANSPARENT_HUGEPAGE we'd only see order-0 shmem
folios, right?

Thanks,
Thomas
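
For illustration only, the per-order, per-caching-mode pool layout
described above can be modelled roughly as below. This is a simplified
userspace sketch, not the TTM code: every name in it (caching_mode,
page_pool, pool_set, pool_take, pool_give, POOL_NUM_ORDERS) is
hypothetical, and it only counts entries rather than allocating pages
or changing any caching attributes.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are hypothetical; this models the idea, not TTM. */
enum caching_mode { CACHING_CACHED, CACHING_WC, CACHING_UC, CACHING_NUM };

#define POOL_NUM_ORDERS 11 /* assumed range: orders 0..10 */

/* One pool holds free allocations of exactly one order and one mode. */
struct page_pool {
	enum caching_mode caching;
	unsigned int order;
	size_t nr_free; /* allocations currently parked in this pool */
};

/* One pool per (caching mode, order) combination. */
struct pool_set {
	struct page_pool pools[CACHING_NUM][POOL_NUM_ORDERS];
};

static void pool_set_init(struct pool_set *ps)
{
	for (int c = 0; c < CACHING_NUM; c++) {
		for (unsigned int o = 0; o < POOL_NUM_ORDERS; o++) {
			ps->pools[c][o].caching = (enum caching_mode)c;
			ps->pools[c][o].order = o;
			ps->pools[c][o].nr_free = 0;
		}
	}
}

static struct page_pool *pool_select(struct pool_set *ps,
				     enum caching_mode caching,
				     unsigned int order)
{
	assert(caching < CACHING_NUM && order < POOL_NUM_ORDERS);
	return &ps->pools[caching][order];
}

/*
 * Returns 1 if an allocation was recycled from the matching pool, so no
 * caching change is needed. Returns 0 if the pool was empty, in which
 * case a real implementation would allocate fresh pages and pay for the
 * costly caching-attribute change.
 */
static int pool_take(struct pool_set *ps, enum caching_mode caching,
		     unsigned int order)
{
	struct page_pool *pool = pool_select(ps, caching, order);

	if (pool->nr_free == 0)
		return 0;
	pool->nr_free--;
	return 1;
}

/* Park an allocation in its pool, keeping its caching mode as is. */
static void pool_give(struct pool_set *ps, enum caching_mode caching,
		      unsigned int order)
{
	pool_select(ps, caching, order)->nr_free++;
}
```

In this model, a DRM object whose backing is built from allocations of
several orders draws from several pools at once, which is why one shmem
object per DRM object stops fitting.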
