On Mon, Jan 19, 2015 at 09:30:21AM -0500, Brian Foster wrote:
> The secondary superblock verification in xfs_repair was subject to a bug
> that unnecessarily leads to a brute force superblock scan if the last
> superblock in the fs happens to be corrupt. Normally, xfs_repair handles
> one-off superblock corruption gracefully using a heuristic that finds
> the most consistent superblock content across the set of secondary
> superblocks.
>
> Create a regression test for xfs_repair that corrupts the last
> superblock in the fs. Verify the superblock is updated from the
> previously verified sb content and a brute force scan is not initiated.
> In the event of failure, detect that a brute force scan has started and
> abort the repair in order to fail the test quickly.
>
> To support the test, extend the xfs_repair filter to handle corrupted
> superblock repair output and provide generic test output for arbitrary
> AG counts.
>
> Signed-off-by: Brian Foster <[email protected]>
> ---
>
> Hi all,
>
> This is an xfs_repair regression test to trigger the problem fixed by
> the following previously posted fix:
>
> http://oss.sgi.com/archives/xfs/2015-01/msg00244.html
>
> Thoughts appreciated, thanks.
...
> +# Start and monitor an xfs_repair of the scratch device. This test can induce a
> +# time consuming brute force superblock scan. Since a brute force scan means
> +# test failure, detect it and end the repair.
> +_xfs_repair_noscan()
> +{
> + # invoke repair directly so we can kill the process if need be
> + $XFS_REPAIR_PROG $SCRATCH_DEV 2>&1 | tee -a $seqres.full > $tmp.repair &
> + repair_pid=$!
> +
> + # monitor progress for as long as it is running
> + while [ `ps -q $repair_pid > /dev/null; echo $?` == 0 ]; do
while pgrep xfs_repair > /dev/null; do
> + grep "couldn't verify primary superblock" $tmp.repair \
> + > /dev/null 2>&1
> + if [ $? == 0 ]; then
> + # we've started a brute force scan. kill repair and
> + # fail the test
> + kill -9 $repair_pid >> $seqres.full 2>&1
> + wait >> $seqres.full 2>&1
> +
> + _fail "xfs_repair resorted to brute force scan"
> + fi
> +
> + sleep 1
> + done
> +
> + wait
> +
> + cat $tmp.repair | _filter_repair
> +}
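The monitor-and-abort pattern in the function above can be tried in isolation. What follows is only a rough sketch, not the test's actual code: the background job is a made-up stand-in for xfs_repair that prints the marker line and then keeps running, and the names log, job_pid and result are hypothetical. It backgrounds a single command rather than a pipeline so that $! is the pid of the job itself, and it polls with "kill -0", which sends no signal and only reports whether the pid still exists.

```shell
# Hypothetical stand-in for the repair run: emit the marker line that
# signals a brute force scan, then keep running for a while.
log=$(mktemp)
( echo "couldn't verify primary superblock"; exec sleep 30 ) > "$log" 2>&1 &
job_pid=$!

result="completed"

# Poll for as long as the job is alive. kill -0 delivers no signal,
# it only checks that the pid exists.
while kill -0 "$job_pid" 2>/dev/null; do
	if grep -q "couldn't verify primary superblock" "$log"; then
		# Marker seen: stop the job instead of waiting for it.
		kill -9 "$job_pid"
		result="aborted"
		break
	fi
	sleep 1
done

# Reap the job; its exit status is not interesting here.
wait "$job_pid" 2>/dev/null || true
rm -f "$log"

echo "$result"
```

Since the stand-in job always emits the marker, the loop ends by killing it rather than waiting out the sleep.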
> +
> +rm -f $seqres.full
> +
> +# get standard environment, filters and checks
> +. ./common/rc
> +. ./common/filter
> +. ./common/repair
> +
> +# real QA test starts here
> +
> +# Modify as appropriate.
> +_supported_fs xfs
> +_supported_os Linux
> +_require_scratch_nocheck
> +
> +_scratch_mkfs >> $seqres.full 2>&1 || _fail "mkfs failed"
> +
> +# corrupt the last secondary sb in the fs
> +agcount=`$XFS_DB_PROG -c "sb" -c "p agcount" $SCRATCH_DEV | awk '{ print $3 }'`
_scratch_mkfs | _filter_mkfs 2> $tmp.mkfs
. $tmp.mkfs

And now you have the agcount variable already set up (and most other
fs geometry variables that mkfs outputs).
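To illustrate the idea only: the filter writes shell variable assignments to stderr, and sourcing the captured file defines them in the test. In the sketch below, fake_mkfs and demo_filter are made-up stand-ins for the real _scratch_mkfs and _filter_mkfs; the sample output line and device name are invented, and the real filter handles far more fields than the single one shown here.

```shell
# Hypothetical stand-in for mkfs output; only the agcount field matters.
fake_mkfs()
{
	echo "meta-data=/dev/sdX isize=256 agcount=4, agsize=65536 blks"
}

# Hypothetical stand-in for the filter: pass the text through on stdout
# and emit "name=value" shell assignments on stderr.
demo_filter()
{
	while read -r line; do
		echo "$line"
		echo "$line" | sed -n 's/.*agcount=\([0-9]*\).*/agcount=\1/p' >&2
	done
}

tmp=$(mktemp)

# Capture the assignments from stderr; the filtered text is not needed.
fake_mkfs | demo_filter 2> "$tmp.mkfs" > /dev/null

# Sourcing the captured file defines agcount in the current shell.
. "$tmp.mkfs"
rm -f "$tmp" "$tmp.mkfs"

echo "agcount is $agcount"
```

After the source step, agcount can be used directly in shell arithmetic such as $((agcount - 1)).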
> +last_secondary=$((agcount - 1))
> +$XFS_DB_PROG -x -c "sb $last_secondary" -c "type data" \
You can just use "sb $((agcount - 1))" directly. The comment above
tells us that it's the last secondary sb we are corrupting....

Otherwise looks ok.

Cheers,

Dave.
--
Dave Chinner
[email protected]