On 10.09.25 14:52, Thadeu Lima de Souza Cascardo wrote:
> On Wed, Sep 10, 2025 at 02:11:58PM +0200, Christian König wrote:
>> On 10.09.25 13:59, Thadeu Lima de Souza Cascardo wrote:
>>> When the TTM pool tries to allocate new pages, it starts with the max
>>> order. If no pages of that order are readily available, the page
>>> allocator will start reclaim. If direct reclaim fails, the pool
>>> allocator will reduce the order and retry, until it gets all the pages
>>> it wants at whatever orders the page allocator manages to satisfy.
>>>
>>> However, while the allocator is reclaiming, lower order pages might be
>>> available, which would work just fine for the pool allocator. Doing direct
>>> reclaim just introduces latency in allocating memory.
>>>
>>> The system should still start reclaiming in the background with kswapd, but
>>> the pool allocator should try to allocate a lower order page instead of
>>> directly reclaiming.
>>>
>>> If not even an order-1 page is available, the TTM pool allocator will
>>> eventually fall back to allocating order-0 pages, at which point it
>>> should and will directly reclaim.
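
[Aside, to make the proposal above concrete: roughly, the idea boils down
to a GFP tweak along these lines. This is only a sketch of the behaviour
being described, not the actual patch, and the helper name is made up:

	#include <linux/gfp.h>

	/*
	 * Hypothetical helper illustrating the idea: for order > 0 only
	 * wake kswapd and let the caller fall back to a smaller order on
	 * failure; only order 0 keeps full direct reclaim behaviour.
	 */
	static struct page *pool_try_alloc(gfp_t gfp_flags, unsigned int order)
	{
		if (order) {
			/* don't stall in direct reclaim for high orders */
			gfp_flags &= ~__GFP_DIRECT_RECLAIM;
			/* still kick background reclaim via kswapd */
			gfp_flags |= __GFP_KSWAPD_RECLAIM | __GFP_NOWARN;
		}

		return alloc_pages(gfp_flags, order);
	}

With flags like these the page allocator still wakes kswapd in the
background but returns NULL instead of stalling, and the caller moves on
to the next lower order.]
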
>>
>> Yeah, that was discussed quite a bit before, but at least for AMD GPUs that
>> is absolutely not something we should do.
>>
>> The performance difference between using high and low order pages can be up
>> to 30%. So accepting the extra allocation latency is vital for good
>> performance.
>>
>> We could of course make that depend on the HW you use if it isn't necessary 
>> for some other GPU, but at least both NVidia and Intel seem to have pretty 
>> much the same HW restrictions.
>>
>> NVidia has been working on extending this to even use 1GiB pages to reduce 
>> the TLB overhead even further.
> 
> But if the system cannot reclaim, or is already working hard on reclaiming,
> it will not be able to allocate that high order page and the pool allocator
> will resort to lower order pages anyway.
> 
> In case the system has pages available, it will use them. I think there is
> a balance to strike here, and I find this one reasonable. If the system is
> not under pressure, it will allocate those higher order pages, as expected.
> 
> I can look into the behavior when the system is fragmented, but I still
> believe that the pool already offers such protection by keeping those
> higher order pages around. It is when the system is under memory pressure
> that we need to resort to lower order pages.
> 
> What we are seeing here, on a low-memory (4GiB) single-node system with an
> APU, is a lot of latency when allocating memory: direct reclaim is triggered
> while trying to allocate order-10 pages, which fails, and down it goes until
> it gets to order-4 or order-3. With this change, we don't see those
> latencies anymore, and memory pressure goes down as well.
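
(For context, the order fallback described above works roughly like this
simplified sketch, not the exact TTM pool code:

	#include <linux/gfp.h>

	/*
	 * Simplified sketch of the fallback: start at the highest order and
	 * step down one order at a time whenever the allocation fails.
	 */
	static struct page *alloc_highest_order(gfp_t gfp_flags,
						unsigned int max_order)
	{
		unsigned int order;

		for (order = max_order; ; --order) {
			struct page *p = alloc_pages(gfp_flags, order);

			if (p || !order)
				return p;
		}
	}

With direct reclaim allowed at every step, each failing order pays the
reclaim latency before the next smaller order is even tried, which is where
the stalls described above come from.)
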
That reminds me of the scenario I described in the 00862edba135 ("drm/ttm: Use 
GFP_TRANSHUGE_LIGHT for allocating huge pages") commit log, where taking a 
filesystem backup could cause Firefox to freeze for on the order of a minute.

Something like that can't just be ignored as "not a problem" for a potential 
30% performance gain.


-- 
Earthling Michel Dänzer       \        GNOME / Xwayland / Mesa developer
https://redhat.com             \               Libre software enthusiast