RE: [PATCH] btrfs: Remove code for no-cow in scrub/replace

2015-10-26 Thread Zhao Lei
Hi, Filipe Manana

> -Original Message-
> From: Filipe Manana [mailto:fdman...@gmail.com]
> Sent: Friday, October 23, 2015 11:17 PM
> To: Jeff Mahoney <je...@suse.com>
> Cc: Zhao Lei <zhao...@cn.fujitsu.com>; linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH] btrfs: Remove code for no-cow in scrub/replace
> 
> On Fri, Oct 23, 2015 at 4:11 PM, Jeff Mahoney <je...@suse.com> wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > On 10/23/15 4:03 AM, Zhao Lei wrote:
> >> Since we set source bg to readonly in scrub/replace, we don't need to
> >> consider confliction of no-cow write in scrub/replace operaion.
> >
> > What happens if there's a read failure?  IIRC the initial purpose of
> > this code was to correct read failures during scrub and device
> > replacement by fetching the bad extent from another device if one is
> > available.
> 
> And we don't have xfstests, or any other automated test suite, to test those
> code paths.
> So completely useless to run xfstests in a loop for 5 days, especially the 
> generic
> tests which never trigger scrub runs...
> 
Completely agree you.

Something which was not writen in patch's comment, I also have a custom test,
which include scrub on bad-block device:

[root@ZLLINUX custom_tests]# ls -l
total 36
-rwxr-xr-x 1 root root 15931 Oct  9 21:08 btrfs_maintance.sh
-rwxr-xr-x 1 root root  2000 Apr 30 18:23 btrfs_out_of_space.sh
-rwxr-xr-x 1 root root  5588 May 26 09:47 btrfs_replace_sumerr.sh
-rwxr-xr-x 1 root root  1042 Sep 18 17:05 __loop_custom.sh
-rwxr-xr-x 1 root root   939 Jul 14 18:15 xfstests.sh
[root@ZLLINUX custom_tests]#

# grep '()' btrfs_maintance.sh
...
redundancy_func_test()
scrub_func_test()
replace_func_test()
many_faildisk_test()
[root@ZLLINUX custom_tests]#

And this patch also tested in above script:

# ./__loop_custom.sh ./btrfs_maintance.sh
Loop 0
replace_func_test: raidtype=raid1 mkfs_opt= mount_opt=-o nodatacow max_disk=
mkfs and mount: raid=raid1 dev_cnt=2 mkfs_opt= mount_opt=-o nodatacow:
Writting some data:
  du: 17, writting 1M file
  du: 18, writting 2M file
  du: 20, writting 4M file
  du: 24, writting 8M file
  du: 32, writting 16M file
  reach enough space: 48%
Start replace /dev/vdc -> /dev/vde
  OO:/dev/vdc OO:/dev/vdd #X:/dev/vde
  #O:/dev/vdc OO:/dev/vdd OO:/dev/vde
  Replace start in dmesg found
  Replace finish in dmesg found
  dmesg: OK
  file contents before remount: same
  dmesg: OK
  file contents after remount: same
Check prune /dev/vdc
  #O:/dev/vdc OO:/dev/vdd OO:/dev/vde
  prune dsk /dev/vdc
  #X:/dev/vdc OO:/dev/vdd OO:/dev/vde
  dmesg: OK
  fs contents: same
Check prune /dev/vdd
  #X:/dev/vdc OO:/dev/vdd OO:/dev/vde
  prune dsk /dev/vdd
  #X:/dev/vdc OX:/dev/vdd OO:/dev/vde
  dmesg: OK
  fs contents: same
Start replace /dev/vde -> /dev/vdc
  #X:/dev/vdc OX:/dev/vdd OO:/dev/vde
  OO:/dev/vdc OX:/dev/vdd #O:/dev/vde
  Replace start in dmesg found
  Replace finish in dmesg found
  dmesg: OK
  file contents before remount: same
  

I'll add above script into xfstests when it is complete
and stable.

Thanks
Zhaolei

> 
> >
> > See commit 0ef8e45158f (btrfs scrub: add fixup code for errors on
> > nodatasum files)
> >
> > - -Jeff
> >
> >> This patch removes special code for no-cow mode in scrub/replace,
> >> reduced 670 lines.
> >>
> >> Tested by continuous xfstests in 5 days, include generic and btrfs
> >> groups with 10 mount options include nodatacow.
> >>
> >> Signed-off-by: Zhao Lei <zhao...@cn.fujitsu.com> ---
> >> fs/btrfs/ctree.h |   1 - fs/btrfs/scrub.c | 669
> >> --- 2 files
> >> changed, 670 deletions(-)
> >>
> >> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index
> >> 938efe3..3387509 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h
> >> @@ -1688,7 +1688,6 @@ struct btrfs_fs_info { int
> >> scrub_workers_refcnt; struct btrfs_workqueue *scrub_workers; struct
> >> btrfs_workqueue *scrub_wr_completion_workers; -   struct
> >> btrfs_workqueue *scrub_nocow_workers; struct btrfs_workqueue
> >> *scrub_parity_workers;
> >>
> >> #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY diff --git a/fs/btrfs/scrub.c
> >> b/fs/btrfs/scrub.c index d64f557..6027679
> >> 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -205,32
> >> +205,6 @@ struct scrub_ctx { atomic_trefs; };
> >>
> >> -struct scrub_fixup_nodatasum { - struct scrub_ctx*sctx; -
> struct
> >> btrfs_device  *dev; - u64 logical; -  struct
> btrfs_root   *root; -
> >> struct btrfs_work work; - int 

RE: [PATCH] btrfs: Remove code for no-cow in scrub/replace

2015-10-26 Thread Zhao Lei
Hi, Jeff Mahoney

Thanks for review!

> -Original Message-
> From: Jeff Mahoney [mailto:je...@suse.com]
> Sent: Friday, October 23, 2015 11:11 PM
> To: Zhao Lei <zhao...@cn.fujitsu.com>; linux-btrfs@vger.kernel.org
> Subject: Re: [PATCH] btrfs: Remove code for no-cow in scrub/replace
> 
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 10/23/15 4:03 AM, Zhao Lei wrote:
> > Since we set source bg to readonly in scrub/replace, we don't need to
> > consider confliction of no-cow write in scrub/replace operaion.
> 
> What happens if there's a read failure?  IIRC the initial purpose of this code
> was to correct read failures during scrub and device replacement by fetching
> the bad extent from another device if one is available.
> 
> See commit 0ef8e45158f (btrfs scrub: add fixup code for errors on nodatasum
> files)
> 
"nodatasum" is used to check "is the data in non-cow state", and the reason
for using inode-writeback is to avoid same-time-writing in the block which is
in scrubing. Comment in newest code scrub.c L1055 can give us the detail.
(Introduced by comment: b5d67f64f)

Since the entire bg was set to readonly in scrub period, there are no same-time
write operation for both cow and non-cow bg, and the bio-based fix
operation can works for all above case.

Thanks
Zhaolei

> - -Jeff
> 
> > This patch removes special code for no-cow mode in scrub/replace,
> > reduced 670 lines.
> >
> > Tested by continuous xfstests in 5 days, include generic and btrfs
> > groups with 10 mount options include nodatacow.
> >
> > Signed-off-by: Zhao Lei <zhao...@cn.fujitsu.com> ---
> > fs/btrfs/ctree.h |   1 - fs/btrfs/scrub.c | 669
> > --- 2 files
> > changed, 670 deletions(-)
> >
> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index
> > 938efe3..3387509 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h
> > @@ -1688,7 +1688,6 @@ struct btrfs_fs_info { int scrub_workers_refcnt;
> > struct btrfs_workqueue *scrub_workers; struct
> > btrfs_workqueue *scrub_wr_completion_workers; - struct
> > btrfs_workqueue *scrub_nocow_workers; struct btrfs_workqueue
> > *scrub_parity_workers;
> >
> > #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY diff --git a/fs/btrfs/scrub.c
> > b/fs/btrfs/scrub.c index d64f557..6027679
> > 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -205,32
> > +205,6 @@ struct scrub_ctx { atomic_trefs; };
> >
> > -struct scrub_fixup_nodatasum { -   struct scrub_ctx*sctx; -
> > struct
> > btrfs_device*dev; - u64 logical; -  struct 
> > btrfs_root   *root; -
> > struct btrfs_work   work; - int mirror_num; -}; - 
> > -struct
> > scrub_nocow_inode { -   u64 inum; - u64 
> > offset; -   u64
>   root; -
> > struct list_headlist; -}; - -struct scrub_copy_nocow_ctx { -
> > struct scrub_ctx*sctx; -u64 logical; -  
> > u64 len; -
>   int
> > mirror_num; -   u64 physical_for_dev_replace; - 
> > struct
> list_head
> > inodes; -   struct btrfs_work   work; -}; - struct scrub_warning {
> > struct btrfs_path   *path; u64  extent_item_size; @@ 
> > -242,8
> +216,6
> > @@ struct scrub_warning {
> >
> > static void scrub_pending_bio_inc(struct scrub_ctx *sctx); static void
> > scrub_pending_bio_dec(struct scrub_ctx *sctx); -static void
> > scrub_pending_trans_workers_inc(struct scrub_ctx *sctx); -static void
> > scrub_pending_trans_workers_dec(struct scrub_ctx *sctx); static int
> > scrub_handle_errored_block(struct scrub_block *sblock_to_check);
> > static int scrub_setup_recheck_block(struct scrub_block
> > *original_sblock, struct scrub_block *sblocks_for_recheck); @@ -298,13
> > +270,6 @@ static int scrub_add_page_to_wr_bio(struct scrub_ctx *sctx,
> > static void scrub_wr_submit(struct scrub_ctx *sctx); static void
> > scrub_wr_bio_end_io(struct bio *bio, int err); static void
> > scrub_wr_bio_end_io_worker(struct btrfs_work *work); -static int
> > write_page_nocow(struct scrub_ctx *sctx, -  u64
> > physical_for_dev_replace, struct page *page); -static int
> > copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, - struct
> > scrub_copy_nocow_ctx *ctx); -static int copy_nocow_pages(struct
> > scrub_ctx *sctx, u64 logical, u64 len, - int mirror_num, u64
> > physical_for_dev_replace); -static void copy_noc

Re: [PATCH] btrfs: Remove code for no-cow in scrub/replace

2015-10-23 Thread Filipe Manana
On Fri, Oct 23, 2015 at 4:11 PM, Jeff Mahoney  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> On 10/23/15 4:03 AM, Zhao Lei wrote:
>> Since we set source bg to readonly in scrub/replace, we don't need
>> to consider confliction of no-cow write in scrub/replace operaion.
>
> What happens if there's a read failure?  IIRC the initial purpose of
> this code was to correct read failures during scrub and device
> replacement by fetching the bad extent from another device if one is
> available.

And we don't have xfstests, or any other automated test suite, to test
those code paths.
So completely useless to run xfstests in a loop for 5 days, especially
the generic tests which never trigger scrub runs...


>
> See commit 0ef8e45158f (btrfs scrub: add fixup code for errors on
> nodatasum files)
>
> - -Jeff
>
>> This patch removes special code for no-cow mode in scrub/replace,
>> reduced 670 lines.
>>
>> Tested by continuous xfstests in 5 days, include generic and btrfs
>> groups with 10 mount options include nodatacow.
>>
>> Signed-off-by: Zhao Lei  ---
>> fs/btrfs/ctree.h |   1 - fs/btrfs/scrub.c | 669
>> --- 2 files
>> changed, 670 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index
>> 938efe3..3387509 100644 --- a/fs/btrfs/ctree.h +++
>> b/fs/btrfs/ctree.h @@ -1688,7 +1688,6 @@ struct btrfs_fs_info { int
>> scrub_workers_refcnt; struct btrfs_workqueue *scrub_workers; struct
>> btrfs_workqueue *scrub_wr_completion_workers; -   struct
>> btrfs_workqueue *scrub_nocow_workers; struct btrfs_workqueue
>> *scrub_parity_workers;
>>
>> #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY diff --git
>> a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index d64f557..6027679
>> 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -205,32
>> +205,6 @@ struct scrub_ctx { atomic_trefs; };
>>
>> -struct scrub_fixup_nodatasum { - struct scrub_ctx*sctx; -   
>>  struct
>> btrfs_device  *dev; - u64 logical; -  struct 
>> btrfs_root   *root; -
>> struct btrfs_work work; - int mirror_num; -}; - 
>> -struct
>> scrub_nocow_inode { - u64 inum; - u64
>>  offset; -   u64 root; -
>> struct list_head  list; -}; - -struct scrub_copy_nocow_ctx { -
>> struct scrub_ctx  *sctx; -u64 logical; - 
>>  u64 len; -  int
>> mirror_num; - u64 physical_for_dev_replace; - struct 
>> list_head
>> inodes; - struct btrfs_work   work; -}; - struct scrub_warning {
>> struct btrfs_path *path; u64  extent_item_size; @@ 
>> -242,8 +216,6
>> @@ struct scrub_warning {
>>
>> static void scrub_pending_bio_inc(struct scrub_ctx *sctx); static
>> void scrub_pending_bio_dec(struct scrub_ctx *sctx); -static void
>> scrub_pending_trans_workers_inc(struct scrub_ctx *sctx); -static
>> void scrub_pending_trans_workers_dec(struct scrub_ctx *sctx);
>> static int scrub_handle_errored_block(struct scrub_block
>> *sblock_to_check); static int scrub_setup_recheck_block(struct
>> scrub_block *original_sblock, struct scrub_block
>> *sblocks_for_recheck); @@ -298,13 +270,6 @@ static int
>> scrub_add_page_to_wr_bio(struct scrub_ctx *sctx, static void
>> scrub_wr_submit(struct scrub_ctx *sctx); static void
>> scrub_wr_bio_end_io(struct bio *bio, int err); static void
>> scrub_wr_bio_end_io_worker(struct btrfs_work *work); -static int
>> write_page_nocow(struct scrub_ctx *sctx, -u64
>> physical_for_dev_replace, struct page *page); -static int
>> copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, -
>> struct scrub_copy_nocow_ctx *ctx); -static int
>> copy_nocow_pages(struct scrub_ctx *sctx, u64 logical, u64 len, -
>> int mirror_num, u64 physical_for_dev_replace); -static void
>> copy_nocow_pages_worker(struct btrfs_work *work); static void
>> __scrub_blocked_if_needed(struct btrfs_fs_info *fs_info); static
>> void scrub_blocked_if_needed(struct btrfs_fs_info *fs_info); static
>> void scrub_put_ctx(struct scrub_ctx *sctx); @@ -355,60 +320,6 @@
>> static void scrub_blocked_if_needed(struct btrfs_fs_info *fs_info)
>> scrub_pause_off(fs_info); }
>>
>> -/* - * used for workers that require transaction commits (i.e.,
>> for the - * NOCOW case) - */ -static void
>> scrub_pending_trans_workers_inc(struct scrub_ctx *sctx) -{ -  struct
>> btrfs_fs_info *fs_info = sctx->dev_root->fs_info; - -
>> atomic_inc(>refs); -/* - * increment scrubs_running to
>> prevent cancel requests from - * completing as long as a worker is
>> running. we must also -* increment scrubs_paused to prevent
>> deadlocking on pause - * requests used for transactions commits
>> (as the worker uses a -* transaction context). it is safe to
>> regard the worker -* as paused for all 

Re: [PATCH] btrfs: Remove code for no-cow in scrub/replace

2015-10-23 Thread Jeff Mahoney
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 10/23/15 4:03 AM, Zhao Lei wrote:
> Since we set source bg to readonly in scrub/replace, we don't need 
> to consider confliction of no-cow write in scrub/replace operaion.

What happens if there's a read failure?  IIRC the initial purpose of
this code was to correct read failures during scrub and device
replacement by fetching the bad extent from another device if one is
available.

See commit 0ef8e45158f (btrfs scrub: add fixup code for errors on
nodatasum files)

- -Jeff

> This patch removes special code for no-cow mode in scrub/replace, 
> reduced 670 lines.
> 
> Tested by continuous xfstests in 5 days, include generic and btrfs 
> groups with 10 mount options include nodatacow.
> 
> Signed-off-by: Zhao Lei  --- 
> fs/btrfs/ctree.h |   1 - fs/btrfs/scrub.c | 669
> --- 2 files
> changed, 670 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index
> 938efe3..3387509 100644 --- a/fs/btrfs/ctree.h +++
> b/fs/btrfs/ctree.h @@ -1688,7 +1688,6 @@ struct btrfs_fs_info { int
> scrub_workers_refcnt; struct btrfs_workqueue *scrub_workers; struct
> btrfs_workqueue *scrub_wr_completion_workers; -   struct
> btrfs_workqueue *scrub_nocow_workers; struct btrfs_workqueue
> *scrub_parity_workers;
> 
> #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY diff --git
> a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index d64f557..6027679
> 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -205,32
> +205,6 @@ struct scrub_ctx { atomic_trefs; };
> 
> -struct scrub_fixup_nodatasum { - struct scrub_ctx*sctx; -
> struct
> btrfs_device  *dev; - u64 logical; -  struct 
> btrfs_root   *root; -
> struct btrfs_work work; - int mirror_num; -}; - 
> -struct
> scrub_nocow_inode { - u64 inum; - u64 
> offset; -   u64 root; -
> struct list_head  list; -}; - -struct scrub_copy_nocow_ctx { -
> struct scrub_ctx  *sctx; -u64 logical; -  
> u64 len; -  int
> mirror_num; - u64 physical_for_dev_replace; - struct 
> list_head
> inodes; - struct btrfs_work   work; -}; - struct scrub_warning { 
> struct btrfs_path *path; u64  extent_item_size; @@ 
> -242,8 +216,6
> @@ struct scrub_warning {
> 
> static void scrub_pending_bio_inc(struct scrub_ctx *sctx); static
> void scrub_pending_bio_dec(struct scrub_ctx *sctx); -static void
> scrub_pending_trans_workers_inc(struct scrub_ctx *sctx); -static
> void scrub_pending_trans_workers_dec(struct scrub_ctx *sctx); 
> static int scrub_handle_errored_block(struct scrub_block
> *sblock_to_check); static int scrub_setup_recheck_block(struct
> scrub_block *original_sblock, struct scrub_block
> *sblocks_for_recheck); @@ -298,13 +270,6 @@ static int
> scrub_add_page_to_wr_bio(struct scrub_ctx *sctx, static void
> scrub_wr_submit(struct scrub_ctx *sctx); static void
> scrub_wr_bio_end_io(struct bio *bio, int err); static void
> scrub_wr_bio_end_io_worker(struct btrfs_work *work); -static int
> write_page_nocow(struct scrub_ctx *sctx, -u64
> physical_for_dev_replace, struct page *page); -static int
> copy_nocow_pages_for_inode(u64 inum, u64 offset, u64 root, -
> struct scrub_copy_nocow_ctx *ctx); -static int
> copy_nocow_pages(struct scrub_ctx *sctx, u64 logical, u64 len, -
> int mirror_num, u64 physical_for_dev_replace); -static void
> copy_nocow_pages_worker(struct btrfs_work *work); static void
> __scrub_blocked_if_needed(struct btrfs_fs_info *fs_info); static
> void scrub_blocked_if_needed(struct btrfs_fs_info *fs_info); static
> void scrub_put_ctx(struct scrub_ctx *sctx); @@ -355,60 +320,6 @@
> static void scrub_blocked_if_needed(struct btrfs_fs_info *fs_info) 
> scrub_pause_off(fs_info); }
> 
> -/* - * used for workers that require transaction commits (i.e.,
> for the - * NOCOW case) - */ -static void
> scrub_pending_trans_workers_inc(struct scrub_ctx *sctx) -{ -  struct
> btrfs_fs_info *fs_info = sctx->dev_root->fs_info; - -
> atomic_inc(>refs); -/* - * increment scrubs_running to
> prevent cancel requests from - * completing as long as a worker is
> running. we must also -* increment scrubs_paused to prevent
> deadlocking on pause - * requests used for transactions commits
> (as the worker uses a -* transaction context). it is safe to
> regard the worker -* as paused for all matters practical.
> effectively, we only - * avoid cancellation requests from
> completing. -  */ -   mutex_lock(_info->scrub_lock); -
> atomic_inc(_info->scrubs_running); -
> atomic_inc(_info->scrubs_paused); -
> mutex_unlock(_info->scrub_lock); - -   /* - * check if
> @scrubs_running=@scrubs_paused condition - * inside wait_event()
> is not an atomic