Several gfs2 functions failed to reserve enough revoke entries for their transactions in the journal. Function gfs2_trans_remove_revoke unconditionally decrements tr->tr_num_revoke, and if not enough revokes are reserved, the value goes from 0 to 4294967295 (-1, but it's an unsigned int). This is later re-added to the system-wide revoke numbers, thereby decrementing the value (sd_log_commited_revoke) "properly," but by accident. This worked properly most of the time because one transaction would reserve space for revokes, then it would be merged with the system transaction (sdp->sd_log_tr) and it usually did not run out, because you can hold a lot of revoke entries per log descriptor block. Some of the code, such as gfs2_write_revokes, would work around this and somehow got it right most of the time. However, some jdata tests with xfstests generic/269 encountered problems when it actually ran out.
This patch is part of a series that tries to do proper accounting of revokes. This patch adds sanity checking for revoke counts to function gfs2_trans_remove_revoke. Signed-off-by: Bob Peterson <[email protected]> --- fs/gfs2/trans.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/fs/gfs2/trans.c b/fs/gfs2/trans.c index 4d01fe19c125..07c2d2194a9b 100644 --- a/fs/gfs2/trans.c +++ b/fs/gfs2/trans.c @@ -261,10 +261,22 @@ void gfs2_trans_remove_revoke(struct gfs2_sbd *sdp, u64 blkno, unsigned int len) list_del_init(&bd->bd_list); gfs2_assert_withdraw(sdp, sdp->sd_log_num_revoke); sdp->sd_log_num_revoke--; - if (bd->bd_gl) + if (bd->bd_gl) { + sdp->sd_log_commited_revoke--; gfs2_glock_remove_revoke(bd->bd_gl); + } kmem_cache_free(gfs2_bufdata_cachep, bd); - tr->tr_num_revoke--; + if (tr->tr_num_revoke) + tr->tr_num_revoke--; + else if (sdp->sd_log_tr && + sdp->sd_log_tr->tr_num_revoke) + sdp->sd_log_tr->tr_num_revoke--; + else { + fs_warn(sdp, "Removing a revoke that was not " + "reserved! Please investigate.\n"); + gfs2_print_trans(sdp, tr); + BUG(); + } if (--n == 0) break; } -- 2.24.1
