Hi,
Further recovery testing revealed some problems with the withdraw code,
especially related to single-node (lock_nolock) withdraws. This patch set
fixes some of the recent issues.
Bob Peterson (4):
  gfs2: fix withdraw sequence deadlock
  gfs2: Fix error exit in do_xmote
  gfs2: Fix BUG during
When the gfs2_logd daemon withdrew, the withdraw sequence called
into make_fs_ro() to make the file system read-only. That caused the
journal descriptors to be freed. However, those journal descriptors
were used by gfs2_logd's call to gfs2_ail_flush_reqd(). This caused
a use-after-free and NULL poi
Before this patch, when the logd daemon was forced to withdraw, it
would try to request its journal be recovered by another cluster node.
However, in single-user cases with lock_nolock, there are no other
nodes to recover the journal. Function signal_our_withdraw() was
recognizing the lock_nolock s
Before this patch, if an error was detected from glock function go_sync by
function do_xmote, it would return. But the function had temporarily
unlocked the gl_lockref spin_lock, and it never re-locked it.
When the caller of do_xmote tried to unlock it again, it was already
unlocked, which resulte
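
For illustration only, the locking rule described above can be sketched in
user-space C. This is not the gfs2 code: the toy lock just tracks whether it
is held, and every demo_* name is hypothetical. It shows a function that is
entered with the lock held, drops it around a sync step, and re-takes it on
both the success and the error path, so the caller's final unlock is balanced.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy lock (hypothetical): only records whether it is currently held. */
struct demo_lock {
	bool held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	assert(!l->held);	/* taking a held lock would deadlock */
	l->held = true;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held);	/* releasing an unheld lock is the bug class described */
	l->held = false;
}

/* Hypothetical stand-in for the sync step; returns 0 or a negative errno. */
static int demo_sync(bool fail)
{
	return fail ? -EIO : 0;
}

/* Contract: called with the lock held, returns with the lock held. */
static int demo_xmote(struct demo_lock *l, bool fail)
{
	int ret;

	demo_lock_release(l);	/* drop the lock around the sync step */
	ret = demo_sync(fail);
	demo_lock_acquire(l);	/* re-take it on success AND on error */
	return ret;
}

/* Caller: locks, calls demo_xmote, reports the lock state, unlocks. */
static int demo_caller(bool fail, bool *held_on_return)
{
	struct demo_lock l = { .held = false };
	int ret;

	demo_lock_acquire(&l);
	ret = demo_xmote(&l, fail);
	*held_on_return = l.held;
	demo_lock_release(&l);	/* balanced only if demo_xmote re-locked */
	return ret;
}
```

If the error path in demo_xmote returned before re-acquiring the lock, the
final demo_lock_release() in demo_caller would trip its assertion, which is
the shape of the double unlock described above.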
After a gfs2 file system withdraw, any attempt to read metadata is
automatically rejected by function gfs2_meta_read() except for reads
of the journal inode. This turns out to be a problem because function
signal_our_withdraw() repeatedly calls check_journal_clean() which reads
the metadata (both i
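
As a rough sketch of the gating behavior described above (not the actual
gfs2_meta_read() implementation), the check can be modeled as follows. The
struct, the field names, and the choice of -EIO as the rejection code are
assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the file system state. */
struct demo_sb {
	bool withdrawn;			/* set once the file system has withdrawn */
	unsigned long jinode_no;	/* inode number of the journal inode */
};

/*
 * Model of the metadata read gate: after a withdraw, every metadata read
 * is rejected unless it targets the journal inode.
 * Returns 0 if the read may proceed, -EIO (assumed) if it is rejected.
 */
static int demo_meta_read(const struct demo_sb *sb, unsigned long inode_no)
{
	if (sb->withdrawn && inode_no != sb->jinode_no)
		return -EIO;
	return 0;
}
```

With this gate, a post-withdraw journal check that needs metadata belonging
to any inode other than the journal inode is refused.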
Hi,
I'm doing some testing on 5.7-rc2 which includes Bob's recovery patches.
I used a new xfstest (see the end of this mail) which injects some
IO errors to force the filesystem to be withdrawn and then checks
that it can be remounted successfully.
However, it hits a BUG() during umount() after i