Hi,

This patch fixes a deadlock in the withdraw recovery sequence.

Before this patch, once the file system had withdrawn, any attempt to
read journal metadata other than the journal's dinode block resulted in
-EIO, because gfs2_meta_read checked for
blkno != sdp->sd_jdesc->jd_no_addr. But signal_our_withdraw repeatedly
calls check_journal_clean, which reads the journal's metadata (both the
dinode and its indirect blocks) to see if the entire journal is mapped.
When gfs2_meta_read returned -EIO for one of those indirect blocks, the
error bubbled back up and caused a consistency error, which then tried
to withdraw from within the withdraw, resulting in a deadlock.
This patch changes the test in gfs2_meta_read so that other metadata
reads for the journal are not rejected out of hand with -EIO. It does
this by comparing the glock passed in against the journal inode glock
(sdp->sd_jinode_gl), which is the same for all blocks in the journal.
That allows check_journal_clean to do its work while the journal is
being recovered, without trying to withdraw recursively.

Signed-off-by: Bob Peterson <rpete...@redhat.com>
---
 fs/gfs2/meta_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 4b72abcf83b2..c6d885dbe5f0 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -252,7 +252,7 @@ int gfs2_meta_read(struct gfs2_glock *gl, u64 blkno, int flags,
 	int num = 0;
 
 	if (unlikely(gfs2_withdrawn(sdp)) &&
-	    (!sdp->sd_jdesc || (blkno != sdp->sd_jdesc->jd_no_addr))) {
+	    (!sdp->sd_jdesc || (gl != sdp->sd_jinode_gl))) {
 		*bhp = NULL;
 		return -EIO;
 	}
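
For reviewers who want to see the behavioural difference in isolation,
here is a minimal userspace sketch of the two predicates. It is only an
illustration: the structs are cut-down stand-ins holding just the fields
the test touches, the "withdrawn" flag stands in for
gfs2_withdrawn(sdp), and the helper names old_check_rejects and
new_check_rejects are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT the real kernel definitions. */
struct gfs2_glock { int id; };
struct gfs2_jdesc { uint64_t jd_no_addr; };	/* journal dinode block */
struct gfs2_sbd {
	bool withdrawn;			/* models gfs2_withdrawn(sdp) */
	struct gfs2_jdesc *sd_jdesc;
	struct gfs2_glock *sd_jinode_gl;	/* journal inode glock */
};

/* Old test: after a withdraw, only the journal dinode block is readable. */
static inline bool old_check_rejects(const struct gfs2_sbd *sdp,
				     const struct gfs2_glock *gl,
				     uint64_t blkno)
{
	(void)gl;	/* the old test ignored the glock */
	return sdp->withdrawn &&
	       (!sdp->sd_jdesc || blkno != sdp->sd_jdesc->jd_no_addr);
}

/* New test: after a withdraw, any block read under the journal inode
 * glock is allowed, whatever its block number. */
static inline bool new_check_rejects(const struct gfs2_sbd *sdp,
				     const struct gfs2_glock *gl,
				     uint64_t blkno)
{
	(void)blkno;	/* the new test ignores the block number */
	return sdp->withdrawn &&
	       (!sdp->sd_jdesc || gl != sdp->sd_jinode_gl);
}
```

With a withdrawn sdp whose journal dinode is at block 100, a read of
block 101 under the journal inode glock (an indirect block, say) is
rejected by the old predicate and accepted by the new one, while a read
under any other glock is still rejected by the new one.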