Hi,

On 03/07/17 15:56, Andreas Gruenbacher wrote:
These are the remaining patches for fixing a cluster-wide GFS2 and DLM
deadlock.

As explained in the previous posting of this patch queue, when inodes
are evicted, GFS2 currently calls into DLM.  Inode eviction can be
triggered by memory pressure, in the context of a random user-space
process.  If DLM happens to block in the process in question (for
example, if that process is a fence agent), GFS2 and DLM will deadlock.

This patch queue stops GFS2 from calling into DLM on the inode evict
path when under memory pressure.  It does so by first decoupling
destroying inodes and putting their associated glocks, which is what
ends up calling into DLM.  Second, when under memory pressure, it moves
putting glocks into work queue context where it cannot block DLM.
Third, when gfs2_drop_inode determines that an inode's link count has
hit zero under memory pressure, it puts that inode on the delete
workqueue (and keeps the inode in the icache) instead of causing
gfs2_evict_inode to delete the inode immediately.  The delete workqueue
will not be processed under memory pressure, so deleting inodes from
there is safe.

Does this mean that all the corner cases are now covered and that this
is now passing all the tests? If so, that is a really good step forward.
I know it has been a real slog to get to this point, but I think the
patch series is looking much better for all the hard work that has gone
into it.

Steve.

Thanks,
Andreas

Andreas Gruenbacher (4):
   gfs2: gfs2_glock_get: Wait on freeing glocks
   gfs2: Get rid of gfs2_set_nlink
   gfs2: gfs2_evict_inode: Put glocks asynchronously
   gfs2: Defer deleting inodes under memory pressure

  fs/gfs2/glock.c  | 145 ++++++++++++++++++++++++++++++++++++++++++++++---------
  fs/gfs2/glock.h  |   2 +
  fs/gfs2/glops.c  |  28 +----------
  fs/gfs2/incore.h |   1 +
  fs/gfs2/super.c  |  39 ++++++++++++++-
  5 files changed, 162 insertions(+), 53 deletions(-)
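
The deferral described in the cover letter (putting glocks from workqueue
context instead of inline when under memory pressure) can be illustrated
with a small user-space model. This is only a sketch under stated
assumptions, not the code from the patches: every name below
(model_glock, put_queue, model_glock_put, model_run_put_queue) is
hypothetical, the "workqueue" is a plain linked list drained by an
explicit call, and memory pressure is passed in as a flag because this
posting does not say how the kernel detects it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are GFS2 symbols. */
enum put_path { PUT_SYNC, PUT_DEFERRED };

struct model_glock {
    int refcount;             /* references currently held */
    int deferred_puts;        /* puts waiting for the worker */
    bool released;            /* stands in for the call into DLM */
    struct model_glock *next; /* link in the deferred-put queue */
};

struct put_queue {
    struct model_glock *head;
};

/* Stand-in for the step that would end up calling into DLM. */
static void release_lock(struct model_glock *gl)
{
    gl->released = true;
}

/*
 * Drop one reference. Under (simulated) memory pressure the put is only
 * recorded and the glock is queued, so the "DLM" step never runs in the
 * caller's context. Otherwise the reference is dropped inline.
 */
static enum put_path model_glock_put(struct put_queue *q,
                                     struct model_glock *gl,
                                     bool under_memory_pressure)
{
    /* There must be a reference left that is not already spoken for. */
    assert(gl->refcount > gl->deferred_puts);

    if (under_memory_pressure) {
        /* Queue the glock only on its first pending put. */
        if (gl->deferred_puts++ == 0) {
            gl->next = q->head;
            q->head = gl;
        }
        return PUT_DEFERRED;
    }

    if (--gl->refcount == 0)
        release_lock(gl);
    return PUT_SYNC;
}

/*
 * Worker side: apply all pending puts. Returns the number of glocks
 * taken off the queue.
 */
static size_t model_run_put_queue(struct put_queue *q)
{
    size_t processed = 0;

    while (q->head) {
        struct model_glock *gl = q->head;

        q->head = gl->next;
        gl->next = NULL;
        gl->refcount -= gl->deferred_puts;
        gl->deferred_puts = 0;
        if (gl->refcount == 0)
            release_lock(gl);
        processed++;
    }
    return processed;
}
```

In this model the caller that hits memory pressure returns without ever
reaching release_lock(); only the later call to model_run_put_queue()
does, which mirrors the idea of moving the DLM call out of the evicting
process's context.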

