On Tue, 25 Mar 2025, Olivier Certner wrote:

Hi Olivier,

What is the drm code in question?  ttm_pool_alloc -> ttm_pool_alloc_page()?
As all other uses of __GFP_NORETRY in 6.1 (ignoring drm_printf.c) seem to be
in i915.

Yes, this is indeed targeted at ttm_pool_alloc_page() which tries to allocate 
contiguous pages to fill up the pool (but not in 5.10).  TTM pools are used by 
the amdgpu driver to build its translation tables.

Great;  I think with two other work in progress bits the disabled
fallback in there could be used again very soon but that'll be a
discussion for another time and place (and likely people).


Calls to functions other than (linux_)alloc_pages*() are unaffected by the 
change, and if you dig through all the references of __GFP_NORETRY/GFP_RETRY 
(including those from files under 'selftests/'), you'll see the built GFP flags 
are never used with (linux_)alloc_pages*(), except for the only reference you 
mentioned.

Are you sure?

i915_gem_object_get_pages_internal() in drm-6.1 at least seems to
conditionally pass it down:

      17 #define QUIET (__GFP_NORETRY | __GFP_NOWARN)
      ...
      74                         page = alloc_pages(gfp | (order ? QUIET : 
MAYFAIL),

Seems it can deal with allocation failures, lowering order or returning
-ENOMEM from the function so should be fine hopefully.

Yes, I was aware of this piece of code, but obviously it cannot cause any 
problem.

All calls to Linux's alloc_pages*() can fail *whatever* the passed GFP flags 
except for GFP_NOFAIL (and that's the only exception).

Ah I see; that's where the M_WAITOK logic in there comes from.  I was
wondering about it given others like GFP_KERNEL also being mapped to
that.  Sometimes that makes me wonder if we should indeed pass the gfp_t
all the way down to the actual function doing the alloc and only there
filter and convert rather than doing that earlier in wrapper functions.


 Callers always have to cope, and specifically when specifying __GFP_NORETRY it 
would be foolish not too (and that wouldn't be allowed in Linus' tree anyway).

If it wasn't for that, i915_gem_object_get_pages_internal() does the same 
lowering that ttm_pool_alloc_page() does anyway, as you noticed.

My sentence was indeed too strong, as I was still swapping in context for this 
work which was done months ago now.

I know how that feels. I just started looking at something I've done a
year+ ago completely outside my domain.


 I reviewed all callers not only for GFP_NORETRY but also for most others GFP 
flags (I have tweaked grep files for all of them and over multiple Linux 
versions), as I started some work to document what the Linux 
guarantees/behaviors really are and then some other work to rationalize how we 
translate them in FreeBSD (there seems to be several possible improvements 
here).  Unfortunately, I have stalled that last work for weeks now, and 
probably will for a significant while.

Given Linux's contract on __GFP_NORETRY, it is arguably not reasonable to spend 
time compacting memory on such calls, that's a deviation from what drivers are 
supposed to expect.

Oh, and the rest of the commit message also doesn't mention that I also tested 
this change on machines using the i915 driver, without observing any problem or 
change in behavior.

Fantastic!

Thanks you so much for the follow-up.  It clarifies a lot and makes sense and 
is very helpful!

Lots of joy,
Bjoern

--
Bjoern A. Zeeb                                                     r15:7

Reply via email to