On 10.09.25 14:52, Thadeu Lima de Souza Cascardo wrote:
> On Wed, Sep 10, 2025 at 02:11:58PM +0200, Christian König wrote:
>> On 10.09.25 13:59, Thadeu Lima de Souza Cascardo wrote:
>>> When the TTM pool tries to allocate new pages, it starts with max
>>> order. If there are no pages ready in the system, the page allocator
>>> will start reclaim. If direct reclaim fails, the allocator will
>>> reduce the order until it gets all the pages it wants with whatever
>>> order it succeeds to reclaim.
>>>
>>> However, while the allocator is reclaiming, lower order pages might
>>> be available, which would work just fine for the pool allocator.
>>> Doing direct reclaim just introduces latency in allocating memory.
>>>
>>> The system should still start reclaiming in the background with
>>> kswapd, but the pool allocator should try to allocate a lower order
>>> page instead of directly reclaiming.
>>>
>>> If not even an order-1 page is available, the TTM pool allocator
>>> will eventually get to allocating order-0 pages, at which point it
>>> should and will directly reclaim.
>>
>> Yeah, that was discussed quite a bit before, but at least for AMD
>> GPUs that is absolutely not something we should do.
>>
>> The performance difference between using high and low order pages
>> can be up to 30%. So the added extra latency is just vital for good
>> performance.
>>
>> We could of course make that depend on the HW you use if it isn't
>> necessary for some other GPU, but at least both NVidia and Intel
>> seem to have pretty much the same HW restrictions.
>>
>> NVidia has been working on extending this to even use 1GiB pages to
>> reduce the TLB overhead even further.
>
> But if the system cannot reclaim or is working hard on reclaiming, it
> will not allocate that page and the pool allocator will resort to
> lower order pages anyway.
>
> In case the system has pages available, it will use them. I think
> there is a balance here and I find this one reasonable. If the system
> is not under pressure, it will allocate those higher order pages, as
> expected.
>
> I can look into the behavior when the system might be fragmented, but
> I still believe that the pool is offering such protection by keeping
> those higher order pages around. It is when the system is under
> memory pressure that we need to resort to lower order pages.
>
> What we are seeing here is on a low-memory (4GiB) single-node system
> with an APU: it shows large latencies from direct reclaim while
> trying to allocate order-10 pages, which fails, and down it goes
> until it gets to order-4 or order-3. With this change, we don't see
> those latencies anymore and memory pressure goes down as well.

That reminds me of the scenario I described in the 00862edba135
("drm/ttm: Use GFP_TRANSHUGE_LIGHT for allocating huge pages") commit
log, where taking a filesystem backup could cause Firefox to freeze
for on the order of a minute.

Something like that can't just be ignored as "not a problem" for a
potential 30% performance gain.
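For reference, the fallback policy under discussion amounts to roughly
the following. This is a minimal sketch for illustration only; the
function name and exact GFP flag choices are my assumptions, not the
actual TTM pool allocator code:

#include <linux/gfp.h>

/*
 * Sketch of the proposed policy: for orders > 0, wake kswapd so
 * reclaim proceeds in the background, but never block on direct
 * reclaim; only the final order-0 attempt may enter direct reclaim.
 */
static struct page *pool_alloc_highest_order(unsigned int max_order)
{
	unsigned int order;
	struct page *p;
	gfp_t gfp;

	for (order = max_order; order > 0; order--) {
		/*
		 * GFP_USER includes __GFP_RECLAIM; masking out
		 * __GFP_DIRECT_RECLAIM leaves __GFP_KSWAPD_RECLAIM set,
		 * so kswapd is woken but this call never blocks on
		 * reclaim. On failure, fall through to the next lower
		 * order instead.
		 */
		gfp = (GFP_USER | __GFP_COMP | __GFP_NOWARN) &
		      ~__GFP_DIRECT_RECLAIM;

		p = alloc_pages(gfp, order);
		if (p)
			return p;
	}

	/* Last resort: an order-0 allocation may direct reclaim. */
	return alloc_pages(GFP_USER | __GFP_NOWARN, 0);
}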
--
Earthling Michel Dänzer       \       GNOME / Xwayland / Mesa developer
https://redhat.com             \            Libre software enthusiast