On 5/13/26 12:37, Thomas Hellström wrote:
> On Wed, 2026-05-13 at 12:03 +0200, David Hildenbrand (Arm) wrote:
>> On 5/13/26 10:51, Thomas Hellström wrote:
>>>
>>> FWIW, the i915 driver which uses shmem "natively" uses a special mount
>>> here that gives back THPs.
>>>
>>>
>>> Currently the drivers that use shmem in this way use
>>> "mapping_set_unevictable()" as long as the object is bound to the GPU.
>>> Then shrinkers can unbind from GPU and revert that setting.
>>
>> Right, but mapping_set_unevictable() only affects folio_evictable() --
>> reclaim behavior. Not other properties (such as folio migration).
> 
> Interesting. Does that imply that a shmem folio can be replaced
> underneath without additional measures? It looks like most DRM
> call sites imply that mapping_set_unevictable() pins underlying shmem
> folios.

I don't think there is anything preventing folio migration. shmem implements the
migrate_folio() callback simply by wiring up the generic migrate_folio().

That said, any raised reference on a shmem folio would prevent migration. But
taking longterm references on folios that are allocated as being movable (shmem
default) breaks CMA, memory hotunplug, compaction...
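
To illustrate the point about references, here is a userspace toy model. It is
not kernel code: struct toy_folio, toy_folio_get(), toy_folio_put() and
toy_migrate() are all made-up names, and the "expected reference count" check
is a deliberate simplification of what migration really does. It only shows the
principle that an extra reference makes the move fail until it is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio; hypothetical, not the kernel's struct folio. */
struct toy_folio {
	int refcount;      /* all references currently held */
	int expected_refs; /* references the owner (e.g. page cache) accounts for */
	int location;      /* pretend physical location */
};

/* Take an additional reference, as a driver holding on to the folio would. */
static void toy_folio_get(struct toy_folio *f)
{
	f->refcount++;
}

/* Drop a reference taken earlier. */
static void toy_folio_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/*
 * Try to move the folio. The move is refused while anybody holds a
 * reference beyond the expected ones, because that holder might still
 * access the old location.
 */
static bool toy_migrate(struct toy_folio *f, int new_location)
{
	if (f->refcount != f->expected_refs)
		return false;
	f->location = new_location;
	return true;
}
```

In this model, a holder that never calls toy_folio_put() keeps toy_migrate()
failing forever, which is the toy equivalent of a longterm reference on movable
memory.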

drm_gem_get_pages()/drm_gem_put_pages() seem to handle some part of that ... by
grabbing/putting references.

So if DRM actually might hold these references for a longer time, I suspect this
breaks CMA etc.

We have the memfd_pin_folios() interface that takes care of handling that
properly by only taking longterm references on folios that reside in memory
where longterm pinning is actually allowed -- and otherwise first migrating the
folios to physical memory areas where longterm pinning is allowed.

[...]

> x86 implementation is here:
> https://elixir.bootlin.com/linux/v7.1-rc3/source/arch/x86/mm/pat/set_memory.c#L2556
> 
> TTM calls it here:
> https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/ttm/ttm_pool.c#L249
> 

Ah, the calls to set_pages_array_wc/set_pages_array_uc.

> And there are actually shmem helpers that do this as well, without
> pooling.
> https://elixir.bootlin.com/linux/v7.1-rc3/source/drivers/gpu/drm/drm_gem_shmem_helper.c#L212

Thanks for pointing me at them.

Right, after grabbing a reference, the folio is unmovable and we can modify the
directmap.

> 
> 
>>
>>> That's an extremely costly operation so TTM needs to pool such
>>> allocations. That's where using shmem natively becomes very ugly,
>>> because you can't really use a 1:1 mapping between shmem objects and
>>> DRM objects anymore.
>>
>> So you would require different caching attributes within a DRM object?
> 
> The way the TTM pools work is that there are separate pools for each
> allocation order and caching mode. That would essentially mean
> allocations from a single shmem object would be spread out across
> different pools, and we'd lose the 1:1 mapping between DRM objects and
> shmem objects.

Right.
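
As a rough userspace sketch of that layout (not the TTM code: every toy_* name
and the order limit of 11 are assumptions made up for illustration), a pool set
keyed by caching mode and allocation order could look like this. Pages with
different orders or caching modes end up in different pools, whichever object
they came from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical caching modes; the names are illustrative only. */
enum toy_caching {
	TOY_UNCACHED,
	TOY_WRITE_COMBINED,
	TOY_CACHED,
	TOY_NUM_CACHING
};

/* Assumed number of allocation orders for this sketch. */
#define TOY_NUM_ORDERS 11

/* One pool of free pages that share the same order and caching mode. */
struct toy_pool {
	unsigned long nr_free;
};

/* One pool per (caching mode, order) combination. */
struct toy_pools {
	struct toy_pool type[TOY_NUM_CACHING][TOY_NUM_ORDERS];
};

/* Pick the pool that matches both the caching mode and the order. */
static struct toy_pool *toy_select_pool(struct toy_pools *pools,
					enum toy_caching caching,
					unsigned int order)
{
	assert(caching < TOY_NUM_CACHING && order < TOY_NUM_ORDERS);
	return &pools->type[caching][order];
}
```

Two allocations backing the same object but differing in order (or in caching
mode) select different toy_pool instances here, so the pooled pages can no
longer be tied back to one backing object.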

> 
> One alternative would be a single large sparse shmem object common for
> all DRM objects, with a range allocator, but that also got pretty ugly
> when I tried to implement that.

Does not sound too crazy, though.

> 
> Finally (and I think that might be what Christian was getting at as
> well), without CONFIG_TRANSPARENT_HUGEPAGE, we'd only see order 0 shmem
> folios, right?

Right. Because large folios do not exist in such a world. So this is expected.

If you don't set CONFIG_TRANSPARENT_HUGEPAGE, expect the system to have bad
performance.

-- 
Cheers,

David
