On Wed, Nov 27, 2013 at 5:23 AM, Ben Widawsky <b...@bwidawsk.net> wrote:
> On Tue, Nov 26, 2013 at 04:55:50PM -0800, Ben Widawsky wrote:
>> If we end up calling the shrinker, which in turn requires the OOM
>> killer, we may end up infinitely waiting for a process to die if the OOM
>> chooses. The case that this prevents occurs in execbuf. The forked
>> variants of gem_evict_everything is a good way to hit it. This is
>> exacerbated by Daniel's recent patch to give OOM precedence to the GEM
>> tests.
>>
>> It's a twisted form of a deadlock.
>>
>> What occurs is the following (assume just 2 procs)
>> 1. proc A gets to execbuf while out of memory, gets struct_mutex.
>> 2. OOM killer comes in and chooses proc B
>> 3. proc B closes it's fds, which requires struct mutex, blocks
>> 4, OOM killer waits for B to die before killing another process (this
>> part is speculative)
>>
>> Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
>> Cc: Chris Wilson <ch...@chris-wilson.co.uk>
>> Signed-off-by: Ben Widawsky <b...@bwidawsk.net>
>
> I'd still like to know if I am crazy, but I'm now trying to defer the
> stuff we do on file close without using any allocs. Just an update...

Sounds intriguing, but tbh I don't really have a clue about things. What
about adding the relevant stuck task backtraces to the patch and
submitting this to a wider audience (lkml, mm-devel) as an akpm-probe?
The more botched the patch, the better the probe usually.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
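
For readers following the thread, the four steps in the quoted patch description, and the deferral idea from the follow-up, can be sketched in userspace roughly as below. This is only an illustration under loud assumptions: every name here (struct_mutex as a plain lock, deferred_closes, the helper functions) is a hypothetical stand-in and not i915 code, the OOM killer is reduced to a direct function call, and the "without using any allocs" constraint is not modelled at all. A non-blocking acquire is used so the sketch reports the stall instead of hanging.

```python
import threading
from collections import deque

# Hypothetical stand-ins; none of these names come from the i915 driver.
struct_mutex = threading.Lock()   # plays the role of struct_mutex
deferred_closes = deque()         # close work postponed until later


def close_fds_needing_lock(victim):
    """Step 3 as described: the victim needs struct_mutex to close its fds.

    The acquire is non-blocking so the sketch can report the stall
    rather than actually waiting forever.
    """
    if struct_mutex.acquire(blocking=False):
        try:
            return f"{victim}: closed fds"
        finally:
            struct_mutex.release()
    return f"{victim}: would block on struct_mutex"


def close_fds_deferred(victim):
    """Alternative: queue the close work so the victim can exit right away."""
    deferred_closes.append(victim)
    return f"{victim}: close deferred, free to die"


def run_deferred_closes():
    """Drain the postponed close work; meant to run under struct_mutex."""
    done = []
    while deferred_closes:
        done.append(deferred_closes.popleft())
    return done


def simulate():
    events = []
    with struct_mutex:
        # Step 1: proc A holds struct_mutex in execbuf while out of memory.
        # Step 2: the OOM killer picks proc B (modelled as a direct call).
        events.append(close_fds_needing_lock("B"))
        # Step 4 would be the OOM killer waiting on B indefinitely.
        # With deferral, B hands off its cleanup and can die immediately.
        events.append(close_fds_deferred("B"))
        # A, still holding the lock, later performs the postponed cleanup.
        events.append(f"A: ran deferred closes for {run_deferred_closes()}")
    return events


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

The point of the second path is only that the victim no longer has to take the lock held by the allocating process before it can exit; whether that is achievable in the driver without allocating is exactly the open question in the thread.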