Hi,

On 01/12/15 15:42, Bob Peterson wrote:
----- Original Message -----
Hi,

On 25/11/15 14:22, Bob Peterson wrote:
----- Original Message -----
Hi,

On 19/11/15 18:42, Bob Peterson wrote:
This patch changes function gfs2_clear_inode() so that instead
of calling gfs2_glock_put() directly most of the time, it queues
the glock to the delayed work queue. That avoids a possible
deadlock where it calls dlm during a fence operation:
dlm waits for a fence operation, the fence operation waits for
memory, the shrinker waits for gfs2 to free an inode from memory,
but gfs2 waits for dlm.

Signed-off-by: Bob Peterson <rpete...@redhat.com>
---
    fs/gfs2/glock.c | 34 +++++++++++++++++-----------------
    fs/gfs2/glock.h |  1 +
    fs/gfs2/super.c |  5 ++++-
    3 files changed, 22 insertions(+), 18 deletions(-)
[snip]
Most of the patch seems to just rename the workqueue which makes it
tricky to spot the other changes. However, the below code seems to be
the new bit:

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 9d5c3f7..46e5004 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -24,6 +24,7 @@
    #include <linux/crc32.h>
    #include <linux/time.h>
    #include <linux/wait.h>
+#include <linux/workqueue.h>
    #include <linux/writeback.h>
    #include <linux/backing-dev.h>
    #include <linux/kernel.h>
@@ -1614,7 +1615,9 @@ out:
        ip->i_gl->gl_object = NULL;
        flush_delayed_work(&ip->i_gl->gl_work);
        gfs2_glock_add_to_lru(ip->i_gl);
-       gfs2_glock_put(ip->i_gl);
+       if (queue_delayed_work(gfs2_glock_workqueue,
+                              &ip->i_gl->gl_work, 0) == 0)
+               gfs2_glock_put(ip->i_gl);
        ip->i_gl = NULL;
        if (ip->i_iopen_gh.gh_gl) {
                ip->i_iopen_gh.gh_gl->gl_object = NULL;
This replaces a put with a queue, plus a put if the queuing fails (because
the work is already on the queue). That doesn't look quite right to me: if
calling gfs2_glock_put() was not safe before, then calling it
conditionally like this is still no safer, I think?

Steve.
Hi,

The call to gfs2_glock_put() in this case should be safe.

If queuing the delayed work fails, it means the glock reference count is
greater than 1, to be decremented when the glock state machine runs.
Which means this can't be the final glock_put().
Which means we can't possibly call into DLM, which means we can't block.
Which means it's safe.

Regards,

Bob Peterson
Red Hat File Systems
There is no reason that this cannot be the final glock put, since there
is no synchronization with the work that has been queued, so it might
well have run and decremented the ref count before we return from the
queuing function. It is unlikely that will be the case, but it is still
possible.

Steve.
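
The window described above can be replayed deterministically in user space. The following is only a toy model, not gfs2 code: struct toy_glock, toy_put(), toy_queue_work(), toy_run_work() and caller_does_final_put() are hypothetical names, and the model assumes (as the earlier reply states) that work which is already queued holds its own reference and drops it when it runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a glock; all names here are hypothetical. */
struct toy_glock {
    int  refcount;        /* references currently held */
    bool work_pending;    /* work queued but not yet run */
    bool freed;           /* the final put has happened */
    bool final_in_caller; /* the final put happened in the caller's context */
};

/* Drop one reference and record which context performed the final put. */
static void toy_put(struct toy_glock *gl, bool in_caller)
{
    assert(!gl->freed && gl->refcount > 0);
    if (--gl->refcount == 0) {
        gl->freed = true;
        gl->final_in_caller = in_caller;
    }
}

/* Models queue_delayed_work(): false means the work was already pending. */
static bool toy_queue_work(struct toy_glock *gl)
{
    if (gl->work_pending)
        return false;
    gl->work_pending = true;
    return true;
}

/* Models the work function finishing: it drops the reference it held. */
static void toy_run_work(struct toy_glock *gl)
{
    gl->work_pending = false;
    toy_put(gl, false);
}

/*
 * Replays the pattern "queue, and put if queuing failed".
 * worker_runs_first selects whether the already-pending work completes
 * in the window between the failed queue attempt and the caller's put.
 * Returns true if the caller ended up performing the final put.
 */
static bool caller_does_final_put(bool worker_runs_first)
{
    /* One reference for the inode, one held by work that is already queued. */
    struct toy_glock gl = { .refcount = 2, .work_pending = true };

    bool queued = toy_queue_work(&gl);  /* fails: work already pending */

    if (worker_runs_first)
        toy_run_work(&gl);              /* refcount drops from 2 to 1 */

    if (!queued)
        toy_put(&gl, true);             /* may now be the final put */

    if (!worker_runs_first)
        toy_run_work(&gl);              /* worker runs later instead */

    return gl.final_in_caller;
}
```

In this model, caller_does_final_put(true) reports that the caller performed the final put even though queuing failed, while caller_does_final_put(false) leaves the final put to the worker. The failed queue attempt alone therefore says nothing about which context drops the last reference.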

Hi Steve,

It's kind of an ugly hack, but can we do something like the patch below instead?

Regards,

Bob Peterson
Red Hat File Systems
---
commit 1949050b4b13c1b32ea45987fbf2936ae779609e
Author: Bob Peterson <rpete...@redhat.com>
Date:   Thu Nov 19 12:06:31 2015 -0600

GFS2: Make gfs2_clear_inode() not block on final glock put

This patch changes function gfs2_clear_inode() so that instead
of calling gfs2_glock_put, it calls a new gfs2_glock_put_noblock
function that avoids a possible deadlock that would occur should
it call dlm during a fence operation: dlm waits for a fence
operation, the fence operation waits for memory, the shrinker
waits for gfs2 to free an inode from memory, but gfs2 waits for
dlm. The new non-blocking glock_put does this:

1. It acquires the lockref to ensure no one else is messing with it.
2. If the lockref is put (not locked) it can safely return because
    it is not the last reference to the glock.
3. If this is the last reference, it tries to queue delayed work for
    the glock.
4. If it was able to queue the delayed work, it's safe to return
    because the glock_work_func will run in another process, so
    this one cannot block.
5. If it was unable to queue the delayed work, it needs to schedule
    and start the whole process again.

Signed-off-by: Bob Peterson <rpete...@redhat.com>

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index a4ff7b5..22870c6 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -178,6 +178,27 @@ void gfs2_glock_put(struct gfs2_glock *gl)
  }
/**
+ * gfs2_glock_put_noblock() - Decrement reference count on glock
+ * @gl: The glock to put
+ *
+ * This is the same as gfs2_glock_put() but it's not allowed to block
+ */
+
+void gfs2_glock_put_noblock(struct gfs2_glock *gl)
+{
+       while (1) {
+               if (lockref_put_or_lock(&gl->gl_lockref))
+                       break;
+
+               spin_unlock(&gl->gl_lockref.lock);
That just drops the ref count without doing anything.

+               if (queue_delayed_work(glock_workqueue, &gl->gl_work, 0) != 0)
+                       break;
You can't call queue_delayed_work on a glock for which you don't have a
ref count - it might not exist any more. Please take a look at this again
and figure out what the problematic cycle of events is, and then work out
how to avoid that happening in the first place. There is no point in
replacing one problem with another one, particularly one which would
likely be very tricky to debug.

Steve.

+
+               cond_resched();
+       }
+}
+
+/**
   * may_grant - check if its ok to grant a new lock
   * @gl: The glock
   * @gh: The lock request which we wish to grant
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index 46ab67f..d786446 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -182,6 +182,7 @@ extern int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
                          const struct gfs2_glock_operations *glops,
                          int create, struct gfs2_glock **glp);
  extern void gfs2_glock_put(struct gfs2_glock *gl);
+extern void gfs2_glock_put_noblock(struct gfs2_glock *gl);
  extern void gfs2_holder_init(struct gfs2_glock *gl, unsigned int state,
                             u16 flags, struct gfs2_holder *gh);
  extern void gfs2_holder_reinit(unsigned int state, u16 flags,
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 03fa155..188f2a5 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1613,7 +1613,7 @@ out:
        ip->i_gl->gl_object = NULL;
        flush_delayed_work(&ip->i_gl->gl_work);
        gfs2_glock_add_to_lru(ip->i_gl);
-       gfs2_glock_put(ip->i_gl);
+       gfs2_glock_put_noblock(ip->i_gl);
        ip->i_gl = NULL;
        if (ip->i_iopen_gh.gh_gl) {
                ip->i_iopen_gh.gh_gl->gl_object = NULL;
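
The review comment about queuing work on an object without holding a reference can be illustrated with a small user-space sketch. This is only a toy model of the lifetime rule, not a proposed fix for the deadlock: struct toy_obj, toy_hold(), toy_put(), toy_queue_work() and toy_queue_with_ref() are hypothetical names, and the model assumes that queued work owns one reference of its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy object with a reference count; all names are hypothetical. */
struct toy_obj {
    int  refcount;
    bool work_pending;
};

static void toy_hold(struct toy_obj *obj)
{
    obj->refcount++;
}

/* Returns true if this put released the last reference. */
static bool toy_put(struct toy_obj *obj)
{
    assert(obj->refcount > 0);
    return --obj->refcount == 0;
}

/* Models queue_delayed_work(): false means the work was already pending. */
static bool toy_queue_work(struct toy_obj *obj)
{
    if (obj->work_pending)
        return false;
    obj->work_pending = true;
    return true;
}

/*
 * Queue work while the caller still owns a reference.
 * The extra reference taken here belongs to the queued work. If the work
 * was already pending, that reference is given back immediately, and the
 * put cannot be the final one because the caller's own reference remains.
 */
static bool toy_queue_with_ref(struct toy_obj *obj)
{
    toy_hold(obj);
    if (!toy_queue_work(obj)) {
        bool last = toy_put(obj);
        assert(!last);
        (void)last;
        return false;
    }
    return true;
}
```

The point of the sketch is the ordering: the object is only touched between a hold and the matching put, so it cannot disappear underneath the queuing call. It says nothing about which context should drop the caller's own reference, which is the part the review asks to be worked out.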

