On 17.08.24 08:24, Barry Song wrote:
From: Barry Song <v-songbao...@oppo.com>

-v3:
  * collect reviewed-by, acked-by etc. Michal, Christoph, Vlastimil, Davidlohr,
    thanks!
  * use Jason's patch[1] to fix vdpa and refine his changelog.
  * refine changelogs
[1] https://lore.kernel.org/all/20240808054320.10017-1-jasow...@redhat.com/

-v2:
  https://lore.kernel.org/linux-mm/20240731000155.109583-1-21cn...@gmail.com/

  * adjust vpda fix according to Jason and Michal's feedback, I would
    expect Jason to test it, thanks!
  * split BUG_ON of unavoidable failure and the case GFP_ATOMIC |
    __GFP_NOFAIL into two patches according to Vlastimil and Michal.
  * collect Michal's acked-by for patch 2 - the doc;
  * remove the new GFP_NOFAIL from this series, that one would be a
    separate enhancement patchset later on.

-v1:
  https://lore.kernel.org/linux-mm/20240724085544.299090-1-21cn...@gmail.com/

__GFP_NOFAIL carries the semantics of never failing, so its callers
do not check the return value:
   %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
   cannot handle allocation failures. The allocation could block
   indefinitely but will never return with failure. Testing for
   failure is pointless.


You should have called this series "mm: clarify nofail and BUG_ON on with invalid arguments with nofail"

I was a bit surprised to find that many BUG_ON in this series ;)

However, __GFP_NOFAIL can sometimes fail if it exceeds size limits
or is used with GFP_ATOMIC/GFP_NOWAIT in a non-sleepable context.
This can expose security vulnerabilities due to potential NULL
dereferences.

Which code is that that we are concerned about? Some OOT driver? Some untested in-tree driver? Just trying to understand what we are afraid about here. I'm afraid we cannot safe the world with some OOT drivers :(

Are we aware of such security vulnerabilities (out of pure interest!)?

Is the expectation that these code paths are never triggered during early testing where a WARN_ON_ONCE() would just highlight -- in addition to a likely crash -- that something is very wrong?


Since __GFP_NOFAIL does not support non-blocking allocation, we introduce
GFP_NOFAIL with inclusive blocking semantics and encourage using GFP_NOFAIL
as a replacement for __GFP_NOFAIL in non-mm.

I still wonder if specifying GFP_NOFAIL should not simply come with the semantics "implies direct reclaim. If called for non-blocking allocations, bad things will happen (no security bad things though)." ;)


If we must still fail a nofail allocation, we should trigger a BUG rather
than exposing NULL dereferences to callers who do not check the return
value.

I am not convinced that BUG_ON is the right tool here to save the world, but I see how we arrived here.

--
Cheers,

David / dhildenb


Reply via email to