On Wed, 2026-05-13 at 12:03 +0200, David Hildenbrand (Arm) wrote:
> On 5/13/26 10:51, Thomas Hellström wrote:
> > On Wed, 2026-05-13 at 10:37 +0200, David Hildenbrand (Arm) wrote:
> > > On 5/13/26 09:47, Christian König wrote:
> > > > Hi David & Thomas,
> > > > 
> > > > ...
> > > > 
> > > > Exactly that is one of the major reasons why we aren't using a
> > > > shmem as backing store for TTM buffers in the first place.
> > > 
> > > What was the problem with that the last time this was considered?
> > > 
> > > shmem nowadays supports THP (e.g., 2M) and even mTHP (e.g., 64K).
> > > 
> > > For internal mounts, it must be enabled accordingly
> > > (/sys/kernel/mm/transparent_hugepage/.../shmem_enabled).
> > > 
> > > Some distributions still default to "never". I guess if an admin
> > > enables it, you would just get THPs.
> > 
> > FWIW, the i915 driver which uses shmem "natively" uses a special mount
> > here that gives back THPs.
> > 
> > > 
> > > If "distro default" is the only problem, I guess we could think about
> > > how to improve that. For example, just let internal GPU DRM objects
> > > allocate any folio size available and supported etc.
> > > 
> > > Would that make it possible to just use shmem natively? (e.g., how
> > > would this interact with shmem features like folio migration, would
> > > that be workable with DRM objects?).
> > 
> > Currently the drivers that use shmem in this way use
> > "mapping_set_unevictable()" as long as the object is bound to the GPU.
> > Then shrinkers can unbind from GPU and revert that setting.
> 
> Right, but mapping_set_unevictable() only affects folio_evictable() --
> reclaim behavior. Not other properties (such as folio migration).

Interesting. Does that imply that a shmem folio can be replaced
underneath us without additional measures? It looks like most DRM call
sites assume that mapping_set_unevictable() pins the underlying shmem
folios.

> 
> > 
> > The problem, (as also stated in the cover letter of this series) is for
> > drivers that need to change caching of the pages to WC or UC.
> 
> I assume you mean "To be able to easily maintain pools of pages mapped
> uncached or write-combined".
> 

Exactly.


> Can you point me at the code that changes the caching of the pages?



x86 implementation is here:
https://elixir.bootlin.com/linux/v7.1-rc3/source/arch/x86/mm/pat/set_memory.c#L2556

TTM calls it here:
https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/ttm/ttm_pool.c#L249

And there are actually shmem helpers that do this as well, without
pooling.
https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L212


> 
> > That's an extremely costly operation so TTM needs to pool such
> > allocations. That's where using shmem natively becomes very ugly,
> > because you can't really use a 1:1 mapping between shmem objects and
> > DRM objects anymore.
> 
> So you would require different caching attributes within a DRM object?

The way the TTM pools work is that there is a separate pool for each
combination of allocation order and caching mode. That would essentially
mean that allocations from a single shmem object would be spread out
across different pools, and we'd lose the 1:1 mapping between DRM objects
and shmem objects.
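
As a rough userspace illustration of that layout (a sketch only, not the
actual ttm_pool.c code; the enum, the struct, NUM_ORDERS and both helper
names are hypothetical), one pool per (caching mode, order) pair could be
modelled like this. A buffer whose size is not a single power of two ends
up drawing from several pools:

```c
#include <assert.h>

/* Hypothetical names throughout; this only models the indexing scheme. */
enum caching_mode { MODE_CACHED, MODE_WC, MODE_UC, NUM_CACHING_MODES };

#define NUM_ORDERS 10	/* assumed: orders 0..9, i.e. up to 512 pages */

struct page_pool {
	unsigned long handed_out;	/* chunks taken from this pool */
};

/* One pool per combination of caching mode and allocation order. */
static struct page_pool pools[NUM_CACHING_MODES][NUM_ORDERS];

static struct page_pool *select_pool(enum caching_mode mode, unsigned int order)
{
	assert(mode < NUM_CACHING_MODES && order < NUM_ORDERS);
	return &pools[mode][order];
}

/*
 * Split a request of npages into power-of-two chunks, largest order
 * first, and return how many distinct pools the request drew from.
 */
static unsigned int pools_touched(enum caching_mode mode, unsigned long npages)
{
	unsigned int touched = 0;

	for (int order = NUM_ORDERS - 1; order >= 0; order--) {
		unsigned long chunk = 1UL << order;

		if (npages < chunk)
			continue;
		touched++;
		while (npages >= chunk) {
			select_pool(mode, (unsigned int)order)->handed_out++;
			npages -= chunk;
		}
	}
	return touched;
}
```

In this model a 515-page write-combined buffer splits into one order-9,
one order-1 and one order-0 chunk, so it draws from three different
pools.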

One alternative would be a single large sparse shmem object common for
all DRM objects, with a range allocator, but that also got pretty ugly
when I tried to implement that.
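
To make the idea concrete, here is a deliberately minimal userspace sketch
of handing out page ranges from one shared object. It is not the
implementation that was tried; SPACE_PAGES, struct range and range_alloc()
are hypothetical names, and freeing and reuse of ranges are left out
entirely:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of the single shared sparse object, in pages. */
#define SPACE_PAGES (1ULL << 20)

/* Sub-range of the shared object backing one DRM object. */
struct range {
	uint64_t start;		/* first page offset in the shared object */
	uint64_t npages;	/* length of the range in pages */
};

static uint64_t next_free;	/* simple bump pointer, no freeing */

/* Returns 0 on success, -1 if the request is empty or does not fit. */
static int range_alloc(uint64_t npages, struct range *out)
{
	if (npages == 0 || npages > SPACE_PAGES - next_free)
		return -1;

	out->start = next_free;
	out->npages = npages;
	next_free += npages;
	return 0;
}
```

Each DRM object would then refer to a (start, npages) window of the shared
object instead of owning a shmem object of its own.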

Finally (and I think that might be what Christian was getting at as
well), without CONFIG_TRANSPARENT_HUGEPAGE we'd only see order-0 shmem
folios, right?

Thanks,
Thomas

