[Cluster-devel] [GFS2 PATCH] gfs2: fix withdraw sequence deadlock

Bob Peterson Wed, 22 Apr 2020 12:57:13 -0700

Hi,

This patch fixes a deadlock in the withdraw recovery sequence.
Before this patch, any attempt to map an area of the journal inode
other than the dinode itself resulted in -EIO, because gfs2_meta_read
checked for blkno != sdp->sd_jdesc->jd_no_addr. But function
signal_our_withdraw repeatedly calls check_journal_clean, which reads
the metadata (both dinode and indirect blocks) to see if the entire
journal is mapped. When that read returned -EIO to its caller, the
error would bubble back up and cause a consistency error, which would
try to withdraw-from-withdraw, resulting in a deadlock.


This patch changes the test in gfs2_meta_read so that other metadata
reads for the journal are not rejected out-of-hand with -EIO. It does
this by checking for the journal inode glock, which is the same for
all blocks in the journal. That allows check_journal_clean to do its
work while the journal is being recovered, without trying to withdraw
recursively.

Signed-off-by: Bob Peterson <rpete...@redhat.com>
---
 fs/gfs2/meta_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 4b72abcf83b2..c6d885dbe5f0 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -252,7 +252,7 @@ int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno, int flags,
        int num = 0;
 
        if (unlikely(gfs2_withdrawn(sdp)) &&
-           (!sdp->sd_jdesc || (blkno != sdp->sd_jdesc->jd_no_addr))) {
+           (!sdp->sd_jdesc || (gl != sdp->sd_jinode_gl))) {
                *bhp = NULL;
                return -EIO;
        }
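
For illustration only, here is a minimal userspace sketch of how the old
and new tests differ. It is not kernel code: the mock_* structures,
old_test_rejects, new_test_rejects and demo_scenario are hypothetical
stand-ins, the block addresses are arbitrary, and the withdrawn flag
stands in for gfs2_withdrawn(sdp). Only the two conditions themselves
are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * that the gfs2_meta_read test looks at are modeled. */
struct mock_glock { int id; };
struct mock_jdesc { uint64_t jd_no_addr; };  /* address of the journal dinode */
struct mock_sbd {
	bool withdrawn;                      /* stands in for gfs2_withdrawn(sdp) */
	struct mock_jdesc *sd_jdesc;
	struct mock_glock *sd_jinode_gl;     /* glock of the journal inode */
};

/* Old test: after a withdraw, only the journal dinode block itself passes. */
static bool old_test_rejects(const struct mock_sbd *sdp,
			     const struct mock_glock *gl, uint64_t blkno)
{
	(void)gl;
	return sdp->withdrawn &&
	       (!sdp->sd_jdesc || blkno != sdp->sd_jdesc->jd_no_addr);
}

/* New test: any block read under the journal inode glock passes. */
static bool new_test_rejects(const struct mock_sbd *sdp,
			     const struct mock_glock *gl, uint64_t blkno)
{
	(void)blkno;
	return sdp->withdrawn &&
	       (!sdp->sd_jdesc || gl != sdp->sd_jinode_gl);
}

/* Walks through the cases described in the commit message. */
static void demo_scenario(void)
{
	struct mock_glock jinode_gl = { 1 }, other_gl = { 2 };
	struct mock_jdesc jd = { 100 };
	struct mock_sbd sdp = { true, &jd, &jinode_gl };

	/* Journal dinode: accepted by both tests. */
	assert(!old_test_rejects(&sdp, &jinode_gl, 100));
	assert(!new_test_rejects(&sdp, &jinode_gl, 100));

	/* Journal indirect block: the old test rejects it with what would
	 * be -EIO, the new test lets it through. */
	assert(old_test_rejects(&sdp, &jinode_gl, 101));
	assert(!new_test_rejects(&sdp, &jinode_gl, 101));

	/* Unrelated metadata after a withdraw: still rejected by both. */
	assert(old_test_rejects(&sdp, &other_gl, 500));
	assert(new_test_rejects(&sdp, &other_gl, 500));
}
```

The middle case is the one check_journal_clean depends on: an indirect
block of the journal is read under the same glock as the dinode, so the
glock comparison accepts it where the block-address comparison did not.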
