On 5/13/26 10:37, David Hildenbrand (Arm) wrote:
> On 5/13/26 09:47, Christian König wrote:
>> Hi David & Thomas,
>>
>> On 5/12/26 22:03, David Hildenbrand (Arm) wrote:
>>> On 5/12/26 13:31, Thomas Hellström wrote:
>> ...
>>>>
>>>> OK, can eliminate those. Is VM_WARN_ON_FOLIO() preferred,
>>>> or any other type of assert?
>>>
>>> VM_WARN_ON_FOLIO() is usually what you want, or VM_WARN_ON_ONCE().
>>>
>>>>
>>>>
>>>> OK, let me understand the concern. The pages are allocated as multi-
>>>> page folios using alloc_pages(gfp, order), but typically not promoted
>>>> to compound pages, until inserted here. Is it that promotion that is of
>>>> concern or inserting pages of unknown origin into shmem? Anything we
>>>> can do to alleviate that concern?
>>>
>>> It's all rather questionable.
>>>
>>> A couple of points:
>>>
>>> a) The pages are allocated to be unmovable, but adding them to shmem
>>>    effectively turns them movable. Now you interfere with the page
>>>    allocator logic of placing movable and unmovable pages in a reasonable
>>>    way into pageblocks that group allocations of similar types.
>>>
>>> b) A driver is not supposed to decide which folio size will be allocated for
>>>    shmem.
>>
>> Exactly that is one of the major reasons why we aren't using shmem as
>> backing store for TTM buffers in the first place.
> 
> What was the problem with that the last time this was considered?
> 
> shmem nowadays supports THP (e.g., 2M) and even mTHP (e.g., 64K).
> 
> For internal mounts, it must be enabled accordingly
> (/sys/kernel/mm/transparent_hugepage/.../shmem_enabled).
> 
> Some distributions still default to "never". I guess if an admin enables it,
> you would just get THPs.

Yeah, exactly that is not acceptable. We have some customers who already use 
that approach through udmabuf, so we already have some experience with it.

And I can't count how often I had to explain that it's a configuration issue 
and that the admin has to enable THP to get decent performance.
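For reference, a minimal userspace sketch of how one might check that setting. It assumes the top-level knob /sys/kernel/mm/transparent_hugepage/shmem_enabled and the usual sysfs convention of marking the active value with brackets; the helper names active_option() and read_shmem_enabled() are made up for illustration and are not an existing API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy the bracketed (active) token of a sysfs-style option list, e.g.
 * "always within_size advise [never] deny force", into out.
 * Returns 0 on success, -1 if there is no well-formed [token] that fits.
 * Hypothetical helper, for illustration only.
 */
int active_option(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t len;

	assert(line && out);
	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open, ']');
	if (!close)
		return -1;
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}

/*
 * Read the active value of the top-level shmem THP knob.
 * Returns -1 if the file is missing or unreadable (assumption: the
 * kernel exposes it at this path when THP support is built in).
 */
int read_shmem_enabled(char *out, size_t outlen)
{
	char line[256];
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/shmem_enabled", "r");

	if (!f)
		return -1;
	if (!fgets(line, sizeof(line), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return active_option(line, out, outlen);
}
```

A caller would pass a small buffer to read_shmem_enabled() and compare the result against "never" to detect the distro default discussed above.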

> If "distro default" is the only problem, I guess we could think about how to
> improve that. For example, just let internal GPU DRM objects allocate any
> folio size available and supported etc.

Mhm, that doesn't sound so bad.

I think what drivers really need is that they can give the order to 
shmem_read_folio_gfp() and get a folio with that order or -ENOMEM as return.

In other words, we need to enforce it, and if the desired page size doesn't
work we can then still decide whether we want a fallback, based on the use case
the driver is trying to implement.
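
To illustrate the semantics I have in mind, here is a rough userspace mock, not kernel code. Both alloc_exact_order() and get_folio_order() are hypothetical names invented for this sketch; the mock "allocator" simply succeeds for any order up to max_avail. The point is only that the allocation itself never silently picks a different size, and the fallback policy stays with the caller.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for an "exact order or fail" allocation.
 * Succeeds (returning the order) only if order <= max_avail,
 * otherwise returns -ENOMEM. It never substitutes another order.
 */
int alloc_exact_order(unsigned int order, unsigned int max_avail)
{
	return order <= max_avail ? (int)order : -ENOMEM;
}

/*
 * Caller-side policy: try the desired order first. On -ENOMEM either
 * give up (allow_fallback == 0) or retry with successively smaller
 * orders down to 0. Returns the order obtained, or -ENOMEM.
 */
int get_folio_order(unsigned int want, unsigned int max_avail,
		    int allow_fallback)
{
	unsigned int order;
	int ret;

	/* Arbitrary sanity bound for this sketch only. */
	assert(want < 32);

	ret = alloc_exact_order(want, max_avail);
	if (ret != -ENOMEM || !allow_fallback)
		return ret;

	/* Walk down from want - 1 to 0. */
	for (order = want; order-- > 0; ) {
		ret = alloc_exact_order(order, max_avail);
		if (ret != -ENOMEM)
			return ret;
	}
	return -ENOMEM;
}
```

A use case that must have the large page size would call get_folio_order(want, max_avail, 0) and handle -ENOMEM itself, while one that only prefers it would pass 1 and accept whatever smaller order comes back.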

> Would that make it possible to just use shmem natively? (e.g., how would this
> interact with shmem features like folio migration, would that be workable with
> DRM objects?).

Mostly. There is still the use case for UC and USWC memory, but at least for
AMD GPUs that is mostly negligible (we need it for a handful of workarounds
for HW bugs etc.).

Thanks,
Christian.
