On Thu, Aug 15, 2019 at 3:04 PM Jason Gunthorpe <j...@ziepe.ca> wrote:
> On Thu, Aug 15, 2019 at 10:44:29AM +0200, Michal Hocko wrote:
> > As the oom reaper is the primary guarantee of the oom handling forward
> > progress it cannot be blocked on anything that might depend on blockable
> > memory allocations. These are not really easy to track because they
> > might be indirect - e.g. notifier blocks on a lock which other context
> > holds while allocating memory or waiting for a flusher that needs memory
> > to perform its work.
> But lockdep *does* track all this and fs_reclaim_acquire() was created
> to solve exactly this problem.
> fs_reclaim is a lock and it flows through all the usual lockdep
> schemes like any other lock. Any time the page allocator wants to do
> something the would deadlock with reclaim it takes the lock.
> Failure is expressed by a deadlock cycle in the lockdep map, and
> lockdep can handle arbitary complexity through layers of locks, work
> queues, threads, etc.
> What is missing?

Lockdep doens't seen everything by far. E.g. a wait_event will be
caught by the annotations here, but not by lockdep. Plus lockdep does
not see through the wait_event, and doesn't realize if e.g. that event
will never signal because the worker is part of the deadlock loop.
cross-release was supposed to fix that, but seems like that will never

And since we're talking about mmu notifiers here and gpus/dma engines.
We have dma_fence_wait, which can wait for any hw/driver in the system
that takes part in shared/zero-copy buffer processing. Which at least
on the graphics side is everything. This pulls in enormous amounts of
deadlock potential that lockdep simply is blind about and will never

Arming might_sleep catches them all.

Cheers, Daniel

PS: Don't ask me about why we need these semantics for oom_reaper,
like I said just trying to follow the rules.
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
Intel-gfx mailing list

Reply via email to