On Wed, Dec 02, 2015 at 11:42:13AM -0500, Bob Peterson wrote:
> > Please take a look at this
> > again and figure out what the problematic cycle of events is, and then
> > work out how to avoid that happening in the first place. There is no
> > point in replacing one problem with another one, particularly one which
> > would likely be very tricky to debug,
> > Steve.
> The problematic cycle of events is well known:
> gfs2_clear_inode calls gfs2_glock_put() for the inode's glock,
> but if it's the very last put, it calls into dlm, which can block,
> and that's where we get into trouble.
> The livelock goes like this:
> 1. A fence operation needs memory, so it blocks on memory allocation.
> 2. Memory allocation blocks on slab shrinker.
> 3. Slab shrinker calls into vfs inode shrinker to free inodes from memory.
> 7. dlm blocks on a pending fence operation. Goto 1.

Therefore, the fence operation should be doing GFP_NOFS allocations
to prevent re-entry into the DLM via the filesystem via the shrinker....
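
To illustrate why dropping the FS bit breaks the cycle, here is a minimal
userspace sketch. It is a toy model, not kernel code: every name in it
(MODEL_GFP_*, model_inode_shrinker, model_glock_put,
model_alloc_under_pressure, dlm_calls) is hypothetical and made up for the
example. The one assumption carried over from the kernel is that the
filesystem shrinker declines to run when the allocation mask lacks the FS
bit, which is what GFP_NOFS relies on.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's __GFP_IO and __GFP_FS bits. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_IO)

/* Counts how often the model "calls into dlm". */
int dlm_calls;

/* Models the final glock put, which is the step that calls into dlm. */
void model_glock_put(void)
{
	dlm_calls++;
}

/*
 * Models the inode shrinker: it refuses to touch filesystem state
 * unless the caller's mask allows filesystem re-entry.
 * Returns true if it re-entered the filesystem.
 */
bool model_inode_shrinker(unsigned int gfp_mask)
{
	if (!(gfp_mask & MODEL_GFP_FS))
		return false;	/* NOFS caller: skip inode reclaim */

	model_glock_put();	/* FS allowed: evict inode, reach dlm */
	return true;
}

/*
 * Models an allocation made under memory pressure, where reclaim
 * always runs the shrinker with the caller's mask.
 */
bool model_alloc_under_pressure(unsigned int gfp_mask)
{
	return model_inode_shrinker(gfp_mask);
}
```

In this model, an allocation with MODEL_GFP_KERNEL reaches
model_glock_put(), while one with MODEL_GFP_NOFS never does, so the
allocating context cannot end up waiting on itself through the shrinker.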


Dave Chinner

