These two patches fix a gfs2 deadlock in gfs2_rename.
The comments for the vfs function vfs_rename() describe situations in
which AB-BA deadlocks are possible but are prevented by a mutex,
s_vfs_rename_mutex.

While that's all true at a single-node level, gfs2 can make processes
running on two distinct cluster nodes behave as if they're on the same
node. But since different nodes have different s_vfs_rename_mutexes,
it's still possible to create a similar deadlock across nodes:
One node's gfs2_rename locks the two directory inodes based on their roles
of parent and child, in that order. But then the other node can reverse
their roles and request the same inodes, but in the opposite order.
IOW, both nodes lock parent-then-child, but since their roles switch,
it creates another AB-BA deadlock in which they wait forever for each other.
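
As a rough illustration of the role reversal described above, the
following user-space sketch models each rename as "lock the parent, then
the child" and checks whether two such requests take the same pair of
inodes in opposite order. The struct and function names are hypothetical
and are not taken from the gfs2 code.

```c
#include <stdbool.h>

/*
 * Hypothetical model of one node's rename: the inode in the "parent"
 * role is locked first, the inode in the "child" role second.
 */
struct rename_req {
	int parent_ino;	/* locked first */
	int child_ino;	/* locked second */
};

/*
 * Returns true when the two requests lock the same two inodes in
 * opposite order, which is the AB-BA pattern: each side can end up
 * holding its first lock while waiting for the other side's first lock.
 */
static bool abba_conflict(const struct rename_req *x,
			  const struct rename_req *y)
{
	return x->parent_ino != x->child_ino &&
	       x->parent_ino == y->child_ino &&
	       x->child_ino == y->parent_ino;
}
```

With node 1 requesting (parent 10, child 20) and node 2 requesting
(parent 20, child 10), abba_conflict() reports a conflict; two requests
with the same role assignment do not.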

This patch set resolves the deadlock by submitting both glock requests
to dlm asynchronously. It then waits for both to be granted, using a new
event that checks whether both glocks are locked properly.

The new event has a timeout, and if the timeout expires, it returns
-ESTALE, which signifies the locking has a conflict, and vfs retries
the locking operation.
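
The general shape of "request everything without blocking, wait a
bounded time, back out with -ESTALE" can be sketched in user-space C as
follows. This is only a model under stated assumptions: struct
fake_glock, try_grant(), release() and lock_all_or_estale() are
hypothetical names, a poll counter stands in for the timeout, and none
of this is the glock or dlm API used by the patches.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cluster-wide lock; owner 0 means unlocked. */
struct fake_glock {
	int owner;
};

/* Non-blocking attempt: succeeds if the lock is free or already ours. */
static bool try_grant(struct fake_glock *gl, int node)
{
	if (gl->owner == 0 || gl->owner == node) {
		gl->owner = node;
		return true;
	}
	return false;
}

/* Drop the lock only if this node is the one holding it. */
static void release(struct fake_glock *gl, int node)
{
	if (gl->owner == node)
		gl->owner = 0;
}

/*
 * Request every lock without blocking on any single one, then poll
 * until all are granted or the poll budget (standing in for a timeout)
 * is exhausted. On timeout, release whatever was granted and return
 * -ESTALE so that the caller can retry the whole operation.
 */
static int lock_all_or_estale(struct fake_glock **gls, size_t n,
			      int node, int max_polls)
{
	assert(max_polls > 0);

	for (int poll = 0; poll < max_polls; poll++) {
		size_t granted = 0;

		for (size_t i = 0; i < n; i++)
			if (try_grant(gls[i], node))
				granted++;
		if (granted == n)
			return 0;
	}

	for (size_t i = 0; i < n; i++)
		release(gls[i], node);
	return -ESTALE;
}
```

The point of the design is that no node sits forever holding one lock
while waiting for the other: on timeout it gives up everything it holds,
which lets the competing node make progress.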

The first patch separates the rgrp glock from the inode glocks because
the rgrp glock must be acquired last, rather than asynchronously. The
second patch implements the async glock requests and the new event wait.

Bob Peterson (2):
  gfs2: separate holder for rgrps in gfs2_rename
  gfs2: Use async glocks for rename

 fs/gfs2/glock.c      | 40 ++++++++++++++++++++++++++++++++++++++++
 fs/gfs2/glock.h      |  1 +
 fs/gfs2/incore.h     |  1 +
 fs/gfs2/inode.c      | 28 ++++++++++++++++++++++------
 fs/gfs2/ops_fstype.c |  1 +
 5 files changed, 65 insertions(+), 6 deletions(-)

-- 
2.21.0
