On Wed, Apr 1, 2015 at 12:11 AM, Filipe Manana <fdman...@suse.com> wrote:
> At the moment we can not reliably and deterministically test that the
> transaction abortion code works as expected. For example in the past [1]
> we had an issue where that code returned the pinned extents to the free
> space caches allowing fstrim to perform a discard against the physical
> locations of the extents, and worse, when the fs was mounted with the
> option -o discard it explicitly did a discard on each pinned extent.
> This resulted in filesystem corruption, leaving the fs unmountable.
>
> This patch adds a debugfs file named abort_transaction, which has a default
> default value of an empty string, can only be written by someone with root
> privileges and when a string is written to it, it makes sure all subsequent
> transaction commits fail at the very end (right before writing the new
> superblock) if that string matches the label of the filesystem.
> This way we can for example write a deterministic fstest for commit [1]
> which looks like:
>
>   _require_btrfs_debugfs()
>   {
>       if [ -d /sys/kernel/debug/btrfs ]; then
>           BTRFS_DEBUG_FS=/sys/kernel/debug/btrfs
>       elif [ -d /debug/btrfs ]; then
>           BTRFS_DEBUG_FS=/debug
>       else
>           _notrun "btrfs debugfs not available"
>       fi
>
>       if [ ! -z $1 ]; then
>           if [ ! -e $BTRFS_DEBUG_FS/$1 ]; then
>               _notrun "btrfs debugfs path $1 not available"
>           fi
>       fi
>   }
>
>   _supported_fs btrfs
>   _supported_os Linux
>   _require_scratch
>   _require_btrfs_debugfs "abort_transaction"
>   _need_to_be_root
>
>   rm -f $seqres.full
>
>   # We will abort a btrfs transaction later, which always produces a warning 
> in
>   # dmesg. We do not want the test to fail because of this.
>   _disable_dmesg_check
>   fslabel="btrfs_fstest_$seq"
>   _scratch_mkfs -L $fslabel >>$seqres.full 2>&1
>   _scratch_mount "-o discard"
>   _require_batched_discard $SCRATCH_MNT
>
>   # Create a file and commit the current transaction.
>   echo -n "hello" > $SCRATCH_MNT/foo
>   sync
>
>   # Now update the file, which forces a COW operation of the fs root, adding
>   # the old root location to the pinned extents list.
>   echo -n " world" >> $SCRATCH_MNT/foo
>
>   # Now abort the current transaction, unmount the fs, mount it again and 
> verify
>   # we can open the file and read its content (which should match what it had
>   # when the last transaction committed successfully). Btrfs used to issue a
>   # discard operation on the extents in the pinned extents list, resulting in
>   # corruption of metadata and data, and used too to return the pinned extents
>   # to the free space caches, allowing future fstrim operations to perform a
>   # discard operation against the pinned exents.
>   echo -n "$fslabel" > $BTRFS_DEBUG_FS/abort_transaction
>   sync
>   echo > $BTRFS_DEBUG_FS/abort_transaction
>   $FSTRIM_PROG $SCRATCH_MNT
>
>   _scratch_unmount
>   _scratch_mount
>   echo "File content after transaction abort + remount: $(cat 
> $SCRATCH_MNT/foo)"
>
> The test's expected output is:
>
>   File content after transaction abort + remount: hello
>
> With patch [1] reverted the test fails with:
>
>   btrfs/088 2s ... - output mismatch (see 
> /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad)
>       --- tests/btrfs/088.out   2015-03-31 19:31:17.558436298 +0100
>       +++ /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad    
> 2015-03-31 19:58:12.741403640 +0100
>       @@ -1,2 +1,8 @@
>        QA output created by 088
>       -File content after transaction abort + remount: hello
>       +mount: wrong fs type, bad option, bad superblock on /dev/sdc,
>       +       missing codepage or helper program, or other error
>       +       In some cases useful info is found in syslog - try
>       +       dmesg | tail  or so
>       +
>       ...
>       (Run 'diff -u tests/btrfs/088.out 
> /home/fdmanana/git/hub/xfstests/results//btrfs/088.out.bad'  to see the 
> entire diff)
>   _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent (see 
> /home/fdmanana/git/hub/xfstests/results//btrfs/088.full)
>
>   $ cat /home/fdmanana/git/hub/xfstests/results//btrfs/088.full
>   (...)
>   _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
>   *** fsck.btrfs output ***
>   Check tree block failed, want=29573120, have=0
>   Check tree block failed, want=29573120, have=0
>   Check tree block failed, want=29573120, have=0
>   Check tree block failed, want=29573120, have=0
>   Check tree block failed, want=29573120, have=0
>   read block failed check_tree_block
>   Couldn't read tree root
>   Couldn't open file system
>   *** end fsck.btrfs output
>
> With this feature we can also get a fstest for the issue fixed by the patch
> that fixes log tree corruption when the fs is mounted with -o discard [2].
>
>   "Btrfs: fix log tree corruption when fs mounted with -o discard"
>
> [1] commit 678886bdc637 ("Btrfs: fix fs corruption on transaction abort
>                           if device supports discard")
> [2] "Btrfs: fix log tree corruption when fs mounted with -o discard"
>
> Signed-off-by: Filipe Manana <fdman...@suse.com>

I'll revise this later, so please don't consider this version (nor v1).
Want to make this more flexible and with a different layout to allow
later for triggering other events for testing purposes.

For testing the specific transaction abortion issue, I figured out a
different way to do it using the existing error injection mechanism in
the kernel (test: https://patchwork.kernel.org/patch/6150501/).

Thanks.

> ---
>
> V2: Allow this to select by label which filesystem will have its transaction
>     aborted. The previous version made a transaction abort for every mounted
>     btrfs filesystem. It was not a problem in my test vm since only the 
> devices
>     used by fstests were using a btrfs filesystem (everything else was ext4).
>
>  fs/btrfs/sysfs.c       | 61 
> ++++++++++++++++++++++++++++++++++++++++++++++++--
>  fs/btrfs/sysfs.h       |  2 ++
>  fs/btrfs/transaction.c |  9 ++++++++
>  3 files changed, 70 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index 94edb0a..30cb7a5 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -666,6 +666,7 @@ static struct dentry *btrfs_debugfs_root_dentry;
>
>  /* Debugging tunables and exported data */
>  u64 btrfs_debugfs_test;
> +static char btrfs_debugfs_label_trans_abort[BTRFS_LABEL_SIZE] = { 0 };
>
>  int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info)
>  {
> @@ -710,19 +711,75 @@ failure:
>         return error;
>  }
>
> +static ssize_t read_fs_label(struct file *file, char __user *user_buf,
> +                            size_t count, loff_t *ppos)
> +{
> +       char *label = file->private_data;
> +
> +       return simple_read_from_buffer(user_buf, count, ppos,
> +                                      label, strlen(label));
> +}
> +
> +static ssize_t write_fs_label(struct file *file, const char __user *user_buf,
> +                             size_t count, loff_t *ppos)
> +{
> +       char buf[BTRFS_LABEL_SIZE];
> +       size_t buf_size;
> +       char *label = file->private_data;
> +
> +       memset(buf, 0, BTRFS_LABEL_SIZE);
> +       buf_size = min(count, sizeof(buf) - 1);
> +       if (copy_from_user(buf, user_buf, buf_size))
> +               return -EFAULT;
> +       if (count > 0 && buf[count - 1] == '\n')
> +               buf[count - 1] = '\0';
> +       memcpy(label, buf, BTRFS_LABEL_SIZE);
> +       return count;
> +}
> +
> +static const struct file_operations fops_fs_label = {
> +       .read =         read_fs_label,
> +       .write =        write_fs_label,
> +       .open =         simple_open,
> +       .llseek =       default_llseek,
> +};
> +
>  static int btrfs_init_debugfs(void)
>  {
>  #ifdef CONFIG_DEBUG_FS
> +       struct dentry *dentry;
> +
>         btrfs_debugfs_root_dentry = debugfs_create_dir("btrfs", NULL);
>         if (!btrfs_debugfs_root_dentry)
>                 return -ENOMEM;
>
> -       debugfs_create_u64("test", S_IRUGO | S_IWUGO, 
> btrfs_debugfs_root_dentry,
> -                       &btrfs_debugfs_test);
> +       dentry = debugfs_create_u64("test", S_IRUGO | S_IWUGO,
> +                                   btrfs_debugfs_root_dentry,
> +                                   &btrfs_debugfs_test);
> +       if (!dentry)
> +               return -ENOMEM;
> +
> +       dentry = debugfs_create_file("abort_transaction",
> +                                    S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH,
> +                                    btrfs_debugfs_root_dentry,
> +                                    btrfs_debugfs_label_trans_abort,
> +                                    &fops_fs_label);
> +       if (!dentry) {
> +               debugfs_remove_recursive(btrfs_debugfs_root_dentry);
> +               return -ENOMEM;
> +       }
>  #endif
>         return 0;
>  }
>
> +bool debugfs_abort_transaction(struct btrfs_fs_info *fs_info)
> +{
> +       if (!btrfs_debugfs_label_trans_abort[0])
> +               return false;
> +       return strcmp(fs_info->super_copy->label,
> +                     btrfs_debugfs_label_trans_abort) == 0;
> +}
> +
>  int btrfs_init_sysfs(void)
>  {
>         int ret;
> diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
> index f7dd298..1faf0b5 100644
> --- a/fs/btrfs/sysfs.h
> +++ b/fs/btrfs/sysfs.h
> @@ -74,4 +74,6 @@ int btrfs_kobj_add_device(struct btrfs_fs_info *fs_info,
>                 struct btrfs_device *one_device);
>  int btrfs_kobj_rm_device(struct btrfs_fs_info *fs_info,
>                  struct btrfs_device *one_device);
> +
> +bool debugfs_abort_transaction(struct btrfs_fs_info *fs_info);
>  #endif /* _BTRFS_SYSFS_H_ */
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 5d8cff8..8bf47cd 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -32,6 +32,7 @@
>  #include "volumes.h"
>  #include "dev-replace.h"
>  #include "qgroup.h"
> +#include "sysfs.h"
>
>  #define BTRFS_ROOT_TRANS_TAG 0
>
> @@ -2029,6 +2030,14 @@ int btrfs_commit_transaction(struct btrfs_trans_handle 
> *trans,
>                 goto scrub_continue;
>         }
>
> +       if (unlikely(debugfs_abort_transaction(root->fs_info))) {
> +               btrfs_warn(root->fs_info,
> +                          "Aborting transaction due to debugfs request.");
> +               mutex_unlock(&root->fs_info->tree_log_mutex);
> +               ret = -EIO;
> +               goto scrub_continue;
> +       }
> +
>         ret = write_ctree_super(trans, root, 0);
>         if (ret) {
>                 mutex_unlock(&root->fs_info->tree_log_mutex);
> --
> 2.1.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to