RE: [PATCH] btrfs: Continue replace when set_block_ro failed

Zhao Lei Mon, 16 Nov 2015 03:44:12 -0800

Hi, Filipe Manana

> -----Original Message-----
> From: Filipe Manana [mailto:[email protected]]
> Sent: Monday, November 16, 2015 6:57 PM
> To: Zhao Lei <[email protected]>
> Cc: [email protected]
> Subject: Re: [PATCH] btrfs: Continue replace when set_block_ro failed
> 
> On Mon, Nov 16, 2015 at 10:44 AM, Zhao Lei <[email protected]> wrote:
> > Hi, Filipe Manana
> >
> >> -----Original Message-----
> >> From: [email protected]
> >> [mailto:[email protected]] On Behalf Of Filipe Manana
> >> Sent: Monday, November 16, 2015 6:27 PM
> >> To: Zhao Lei <[email protected]>
> >> Cc: [email protected]
> >> Subject: Re: [PATCH] btrfs: Continue replace when set_block_ro failed
> >>
> >> On Mon, Nov 16, 2015 at 10:07 AM, Zhao Lei <[email protected]>
> wrote:
> >> > Hi, Filipe Manana
> >> >
> >> > Thanks for review.
> >> >
> >> >> -----Original Message-----
> >> >> From: Filipe Manana [mailto:[email protected]]
> >> >> Sent: Friday, November 13, 2015 8:02 PM
> >> >> To: Zhao Lei <[email protected]>
> >> >> Cc: [email protected]
> >> >> Subject: Re: [PATCH] btrfs: Continue replace when set_block_ro
> >> >> failed
> >> >>
> >> >> On Fri, Nov 13, 2015 at 11:38 AM, Zhao Lei <[email protected]>
> wrote:
> >> >> > xfstests/011 failed in node with small_size filesystem.
> >> >> > Can be reproduced by following script:
> >> >> >   DEV_LIST="/dev/vdd /dev/vde"
> >> >> >   DEV_REPLACE="/dev/vdf"
> >> >> >
> >> >> >   do_test()
> >> >> >   {
> >> >> >       local mkfs_opt="$1"
> >> >> >       local size="$2"
> >> >> >
> >> >> >       dmesg -c >/dev/null
> >> >> >       umount $SCRATCH_MNT &>/dev/null
> >> >> >
> >> >> >       echo  mkfs.btrfs -f $mkfs_opt "${DEV_LIST[*]}"
> >> >> >       mkfs.btrfs -f $mkfs_opt "${DEV_LIST[@]}" || return 1
> >> >> >       mount "${DEV_LIST[0]}" $SCRATCH_MNT
> >> >> >
> >> >> >       echo -n "Writing big files"
> >> >> >       dd if=/dev/urandom of=$SCRATCH_MNT/t0 bs=1M
> >> >> count=1 >/dev/null 2>&1
> >> >> >       for ((i = 1; i <= size; i++)); do
> >> >> >           echo -n .
> >> >> >           /bin/cp $SCRATCH_MNT/t0 $SCRATCH_MNT/t$i || return
> 1
> >> >> >       done
> >> >> >       echo
> >> >> >
> >> >> >       echo Start replace
> >> >> >       btrfs replace start -Bf "${DEV_LIST[0]}" "$DEV_REPLACE"
> >> >> $SCRATCH_MNT || {
> >> >> >           dmesg
> >> >> >           return 1
> >> >> >       }
> >> >> >       return 0
> >> >> >   }
> >> >> >
> >> >> >   # Set size to value near fs size
> >> >> >   # for example, 1897 can trigger this bug in 2.6G device.
> >> >> >   #
> >> >> >   ./do_test "-d raid1 -m raid1" 1897
> >> >> >
> >> >> > System will report replace fail with following warning in dmesg:
> >> >> >
> >> >> > Reason:
> >> >> >  When big data writen to fs, the whole free space will be
> >> >> > allocated for data chunk.
> >> >> >  And operation as scrub need to set_block_ro(), and when there
> >> >> > is only one metadata chunk in system(or other metadata chunks
> >> >> > are all full), the function will try to allocate a new chunk,
> >> >> > and failed because no space in device.
> >> >> >
> >> >> > Fix:
> >> >> >  When set_block_ro failed for metadata chunk, it is not a
> >> >> > problem because scrub_lock forbids commit_trancaction.
> >> >>
> >> >> Hi Zhao, I'm confused reading this explanation.
> >> >>
> >> >> Why isn't it a problem if the block group gets modified while the
> >> >> transaction is ongoing? Through writepages() for example.
> >> >> Committing a transaction isn't the only way to write data or
> >> >> metadata to a block group.
> >> >>
> >> > Metadata chunks always updated in cow, the writepages() will write
> >> > new data to different place with operation done by replace.
> >>
> >> So why do we try to set the block group ro? Your change is really
> >> confusing - if we fail setting the block group to read-only mode, we
> >> just ignore it and proceed
> >> - why bother in the first place to set it read-only?
> >>
> > I find a problem that set_group_ro will failed for metadata when disk
> > almost full, but we also have a good news that the time when
> > set_group_ro failed is just when it is not necessary.
> 
> Yes - but the code will ignore the failure for any error code, not just 
> -ENOSPC.
> Surely btrfs_inc_block_group_ro() can return error codes different from
> -ENOSPC. You should make it clear in the code (and in the change log) why
> ENOSPC is such a special case - why we need to set the block to RO and why we
> can ignore the ENOSPC failure.
> 
Thanks for notice.
Will make, test and send v2.


Thanks
Zhaolei

> >
> >> Someone reading the code will find it fishy, and going back to git
> >> history, your change log doesn't really explains why this is being
> >> done and why is it safe to do it.
> >>
> > Thanks for notice, it is necessary to add detail explanation into changelog.
> >
> >> You also forgot to paste the warning you mention below "System will
> >> report replace fail with following warning in dmesg:" in the change log.
> >>
> > Agree, I'll paste it in changelog.
> >
> > Thanks
> > Zhaolei
> >
> >> thanks
> >>
> >> >
> >> > Thanks
> >> > Zhaolei
> >> >
> >> >> thanks
> >> >>
> >> >> >  Let replace continue in this case is no problem.
> >> >> >
> >> >> > Tested by above script, and xfstests/011, plus 100 times xfstests/070.
> >> >> >
> >> >> > Signed-off-by: Zhao Lei <[email protected]>
> >> >> > ---
> >> >> >  fs/btrfs/scrub.c | 9 ++++-----
> >> >> >  1 file changed, 4 insertions(+), 5 deletions(-)
> >> >> >
> >> >> > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index
> >> >> > a39f5d1..5a30753 100644
> >> >> > --- a/fs/btrfs/scrub.c
> >> >> > +++ b/fs/btrfs/scrub.c
> >> >> > @@ -3532,6 +3532,7 @@ int scrub_enumerate_chunks(struct
> >> >> > scrub_ctx
> >> >> *sctx,
> >> >> >         u64 length;
> >> >> >         u64 chunk_offset;
> >> >> >         int ret = 0;
> >> >> > +       int ro_set = 0;
> >> >> >         int slot;
> >> >> >         struct extent_buffer *l;
> >> >> >         struct btrfs_key key;
> >> >> > @@ -3617,10 +3618,7 @@ int scrub_enumerate_chunks(struct
> >> >> > scrub_ctx
> >> >> *sctx,
> >> >> >                 scrub_pause_on(fs_info);
> >> >> >                 ret = btrfs_inc_block_group_ro(root, cache);
> >> >> >                 scrub_pause_off(fs_info);
> >> >> > -               if (ret) {
> >> >> > -                       btrfs_put_block_group(cache);
> >> >> > -                       break;
> >> >> > -               }
> >> >> > +               ro_set = !ret;
> >> >> >
> >> >> >                 dev_replace->cursor_right = found_key.offset +
> >> length;
> >> >> >                 dev_replace->cursor_left = found_key.offset; @@
> >> >> > -3660,7 +3658,8 @@ int scrub_enumerate_chunks(struct scrub_ctx
> >> >> > *sctx,
> >> >> >
> >> >> >                 scrub_pause_off(fs_info);
> >> >> >
> >> >> > -               btrfs_dec_block_group_ro(root, cache);
> >> >> > +               if (ro_set)
> >> >> > +                       btrfs_dec_block_group_ro(root, cache);
> >> >> >
> >> >> >                 btrfs_put_block_group(cache);
> >> >> >                 if (ret)
> >> >> > --
> >> >> > 1.8.5.1
> >> >> >
> >> >> > --
> >> >> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
> >> >> > in the body of a message to [email protected] More
> >> >> > majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Filipe David Manana,
> >> >>
> >> >> "Reasonable men adapt themselves to the world.
> >> >>  Unreasonable men adapt the world to themselves.
> >> >>  That's why all progress depends on unreasonable men."
> >> >
> >>
> >>
> >>
> >> --
> >> Filipe David Manana,
> >>
> >> "Reasonable men adapt themselves to the world.
> >>  Unreasonable men adapt the world to themselves.
> >>  That's why all progress depends on unreasonable men."
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe
> >> linux-btrfs" in the body of a message to [email protected]
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
> 
> 
> 
> --
> Filipe David Manana,
> 
> "Reasonable men adapt themselves to the world.
>  Unreasonable men adapt the world to themselves.
>  That's why all progress depends on unreasonable men."

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: [PATCH] btrfs: Continue replace when set_block_ro failed

Reply via email to