Before this patch, function do_xmote simply assumed that all writes
submitted to the journal had completed successfully, and it called
the go_unlock function to release the dlm lock. But if a write
failed and a revoke never made it to the journal, a journal replay
on another node will cause corruption if we let the go_inval
function continue and tell dlm to release the glock to another
node. This patch adds a couple of gfs2_assert_withdraw calls in
do_xmote, after the calls to go_sync and go_inval. If a journal
error was recorded, the asserts withdraw the file system, which
should cause another node to replay the journal before continuing,
thus protecting rgrp and dinode glocks and maintaining the
integrity of the metadata.
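
For illustration only, the ordering of the new checks can be modelled
in a small userspace sketch. Everything named model_* below is
hypothetical and is not a GFS2 API; the return convention of
model_assert_withdraw (zero when the condition holds) is assumed from
the way the patch tests the result with "if (!...)". The boolean
return value of model_release_checks is an addition for the sketch;
the patched do_xmote does not return early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the superblock state the patch inspects. */
struct model_sb {
	int log_errors;   /* models sdp->sd_log_errors */
	bool withdrawn;   /* set once any check has failed */
};

/* Hypothetical stand-in for the glock state the patch inspects. */
struct model_glock {
	int ail_count;            /* models gl->gl_ail_count */
	bool invalidate_pending;  /* models GLF_INVALIDATE_IN_PROGRESS */
	bool invalidated;         /* records that the go_inval stand-in ran */
};

/* Assumed convention: 0 if the condition holds, nonzero after withdrawing. */
static int model_assert_withdraw(struct model_sb *sb, bool cond)
{
	if (cond)
		return 0;
	sb->withdrawn = true;
	return -1;
}

/*
 * Mirrors the order of checks in the patched code path:
 * sync, check for log errors, invalidate, check again, and only
 * then check that nothing is left on the AIL list.
 * Returns true if no check failed.
 */
static bool model_release_checks(struct model_sb *sb, struct model_glock *gl)
{
	assert(sb != NULL && gl != NULL);

	/* The go_sync step would run here. */
	model_assert_withdraw(sb, sb->log_errors == 0);

	if (gl->invalidate_pending)
		gl->invalidated = true;	/* stand-in for go_inval */

	/* Only check the AIL count if the log-error check passed. */
	if (!model_assert_withdraw(sb, sb->log_errors == 0))
		model_assert_withdraw(sb, gl->ail_count == 0);

	gl->invalidate_pending = false;
	return !sb->withdrawn;
}
```

In this model, a nonzero log_errors or a nonzero ail_count marks the
superblock as withdrawn, which corresponds to the case where the
glock must not be handed to another node until the journal has been
replayed.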

Signed-off-by: Bob Peterson <rpete...@redhat.com>
---
 fs/gfs2/glock.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index ba61bba46785..afb336b65abd 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -566,8 +566,12 @@ __acquires(&gl->gl_lockref.lock)
        spin_unlock(&gl->gl_lockref.lock);
        if (glops->go_sync)
                glops->go_sync(gl);
+       gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_errors) == 0);
        if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags))
                glops->go_inval(gl, target == LM_ST_DEFERRED ? 0 : DIO_METADATA);
+
+       if (!gfs2_assert_withdraw(sdp, atomic_read(&sdp->sd_log_errors) == 0))
+               gfs2_assert_withdraw(sdp, !atomic_read(&gl->gl_ail_count));
        clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
 
        gfs2_glock_hold(gl);
-- 
2.20.1
