Re: [PATCH 00/16] btrfs-progs: Split lowmem mode check to its own

2018-01-18 Thread Su Yue



On 01/19/2018 01:37 PM, Qu Wenruo wrote:

The long planned cmds-check re-construction is finally here.

As the original cmds-check.c is getting larger and larger (already over
15K lines), it's always a good idea to split it into its own check/
directory.

This patchset does the following work:
1) Move cmds-check.c to check/main.c
2) Put code shared by both original and lowmem mode into
check/common.[ch]
3) Put lowmem code into check/lowmem.[ch]
With minor renaming to get rid of the unnecessary _v2 suffix.

The modification looks scary, but there is no functional change at all.

And considering how much the file structure changed, it's a good idea to
merge PART1 as quickly as possible, so there will be less pressure to
rebase newly incoming fsck-related code.

The real move happens in the 15th patch, which due to its size
(500KB+) may not be able to reach the mailing list.
So please fetch the whole patchset from github:
https://github.com/adam900710/btrfs-progs/tree/split_check

There will be a part 2, mostly moving original mode to its own
check/original.[ch], along with extra comments explaining how the two
different modes work.


It's fine to do the cleanup and add the extra comments in part 2.
So all patches except patch 04, which has the wrong title, are

Reviewed-by: Su Yue 

Qu Wenruo (16):
   btrfs-progs: Moves cmds-check.c to check/main.c
   btrfs-progs: check: Move original mode definitions to check/original.h
   btrfs-progs: check: Move definitions of lowmem mode to check/lowmem.h
   btrfs-progs: check: Move node_ptr structure to check/common.h
   btrfs-progs: check: Export check global variables to check/common.h
   btrfs-progs: check: Move imode_to_type function to check/common.h
   btrfs-progs: check: Move fs_root_objectid function to check/common.h
   btrfs-progs: check: Move count_csum_range function to check/common.c
   btrfs-progs: check: Move __create_inode_item function to
 check/common.c
   btrfs-progs: check: Move link_inode_to_lostfound function to common.c
   btrfs-progs: check: Move check_dev_size_alignment to check/common.c
   btrfs-progs: check: move reada_walk_down to check/common.c
   btrfs-progs: check: Move check_child_node to check/common.c
   btrfs-progs: check: Move reset_cached_block_groups to check/common.c
   btrfs-progs: check: Move lowmem check code to its own
 check/lowmem.[ch]
   btrfs-progs: check/lowmem: Cleanup unnecessary _v2 suffix

  Makefile                     |     6 +-
  check/common.c               |   351 +
  check/common.h               |   100 +
  check/lowmem.c               |  4571
  check/lowmem.h               |    67 +
  cmds-check.c => check/main.c | 16389 ++---
  check/original.h             |   293 +
  7 files changed, 11007 insertions(+), 10770 deletions(-)
  create mode 100644 check/common.c
  create mode 100644 check/common.h
  create mode 100644 check/lowmem.c
  create mode 100644 check/lowmem.h
  rename cmds-check.c => check/main.c (65%)
  create mode 100644 check/original.h




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 2/3] btrfs-progs: dir-item: Don't do extra filetype validaction check for btrfs_match_dir_item_name

2018-01-18 Thread Qu Wenruo


On 2018-01-19 15:39, Su Yue wrote:
> 
> 
> On 01/19/2018 03:25 PM, Qu Wenruo wrote:
>> btrfs_match_dir_item_name() checks whether the filetype is valid before
>> doing the search, which makes btrfs-progs unable to locate and remove an
>> invalid dir_index in btrfs_unlink().
>>
>> This function only affects btrfs_link() and btrfs_unlink() in the upper
>> layer, and normal check can find an invalid filetype by itself.
>>
> Lowmem mode can't handle wrong filetypes well now.
> I'm working on it. And this change is okay for me.

I think you mean *original* mode can't handle it.

With v4.14.1, lowmem mode can detect this problem without trouble:
---
checking fs roots
ERROR: root 5 INODE_ITEM[258] index 2 name file1 filetype 34 mismath
ERROR: root 5 DIR INDEX[257 2] missing name file1 filetype 1
ERROR: errors found in fs roots
found 131072 bytes used, error(s) found
---

And patch 1/3 will handle the repair, so it shouldn't be a problem for
lowmem.

Thanks,
Qu
> 
> Reviewed-by: Su Yue 
> 
>> So removing the filetype check is completely safe in this case, and will
>> enhance btrfs_unlink() to remove invalid dir_index/dir_item for repair.
>>
>> Signed-off-by: Qu Wenruo 
>> ---
>>   dir-item.c | 6 --
>>   1 file changed, 6 deletions(-)
>>
>> diff --git a/dir-item.c b/dir-item.c
>> index 462546c0eaf4..e0a0ab4d7a5d 100644
>> --- a/dir-item.c
>> +++ b/dir-item.c
>> @@ -294,12 +294,6 @@ static int verify_dir_item(struct btrfs_root *root,
>>   u16 namelen = BTRFS_NAME_LEN;
>>   u8 type = btrfs_dir_type(leaf, dir_item);
>>  
>> -    if (type >= BTRFS_FT_MAX) {
>> -    fprintf(stderr, "invalid dir item type: %d\n",
>> -   (int)type);
>> -    return 1;
>> -    }
>> -
>>   if (type == BTRFS_FT_XATTR)
>>   namelen = XATTR_NAME_MAX;
>>  
> 
> 





Re: [PATCH v2 3/3] btrfs-progs: dir-item: Make btrfs_delete_one_dir_name more robust to handle corrupted name len

2018-01-18 Thread Su Yue



On 01/19/2018 03:25 PM, Qu Wenruo wrote:

Function btrfs_delete_one_dir_name() will check if the dir_item is the
last content of the item, and delete the whole item if needed.

However, if @name_len of one dir_item/dir_index is corrupted and larger
than the item size, the function will still try to treat it as a partial
removal, which will screw up the whole leaf.

This patch enhances the item deletion check to cover a corrupted name
len, so in that case we just delete the whole item.



Reviewed-by: Su Yue 

Signed-off-by: Qu Wenruo 
---
  dir-item.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/dir-item.c b/dir-item.c
index e0a0ab4d7a5d..35e0615fb423 100644
--- a/dir-item.c
+++ b/dir-item.c
@@ -263,7 +263,6 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
  struct btrfs_path *path,
  struct btrfs_dir_item *di)
  {
-
struct extent_buffer *leaf;
u32 sub_item_len;
u32 item_len;
@@ -273,7 +272,15 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
sub_item_len = sizeof(*di) + btrfs_dir_name_len(leaf, di) +
btrfs_dir_data_len(leaf, di);
item_len = btrfs_item_size_nr(leaf, path->slots[0]);
-   if (sub_item_len == item_len) {
+
+   /*
+* If @sub_item_len is longer than @item_len, then it means the
+* name_len is just corrupted.
+* No good idea to know if there is anything we can recover from
+* the corrupted item.
+* Just delete the item.
+*/
+   if (sub_item_len >= item_len) {
ret = btrfs_del_item(trans, root, path);
} else {
unsigned long ptr = (unsigned long)di;






Re: [PATCH v2 2/3] btrfs-progs: dir-item: Don't do extra filetype validaction check for btrfs_match_dir_item_name

2018-01-18 Thread Su Yue



On 01/19/2018 03:25 PM, Qu Wenruo wrote:

btrfs_match_dir_item_name() checks whether the filetype is valid before
doing the search, which makes btrfs-progs unable to locate and remove an
invalid dir_index in btrfs_unlink().

This function only affects btrfs_link() and btrfs_unlink() in the upper
layer, and normal check can find an invalid filetype by itself.


Lowmem mode can't handle wrong filetypes well now.
I'm working on it. And this change is okay for me.

Reviewed-by: Su Yue 


So removing the filetype check is completely safe in this case, and will
enhance btrfs_unlink() to remove invalid dir_index/dir_item for repair.

Signed-off-by: Qu Wenruo 
---
  dir-item.c | 6 --
  1 file changed, 6 deletions(-)

diff --git a/dir-item.c b/dir-item.c
index 462546c0eaf4..e0a0ab4d7a5d 100644
--- a/dir-item.c
+++ b/dir-item.c
@@ -294,12 +294,6 @@ static int verify_dir_item(struct btrfs_root *root,
u16 namelen = BTRFS_NAME_LEN;
u8 type = btrfs_dir_type(leaf, dir_item);
  
-   if (type >= BTRFS_FT_MAX) {
-   fprintf(stderr, "invalid dir item type: %d\n",
-  (int)type);
-   return 1;
-   }
-
if (type == BTRFS_FT_XATTR)
namelen = XATTR_NAME_MAX;
  






Re: [PATCH v2 1/3] btrfs-progs: lowmem fsck: Remove corupted link before re-add correct link

2018-01-18 Thread Su Yue



On 01/19/2018 03:25 PM, Qu Wenruo wrote:

For repair_ternary_lowmem() used in lowmem mode, if it finds one of
DIR_INDEX/DIR_ITEM/INODE_REF missing, it will try to insert the correct
link.

However, for cases like an invalid type in DIR_INDEX, we should delete the
corrupted DIR_INDEX first before inserting the correct link.

This patch removes the corrupted link before re-inserting it.
This should solve the duplicated DIR_INDEX problem in the old lowmem mode
repair.

Cc: Sebastian Andrzej Siewior 


Reviewed-by: Su Yue 

Signed-off-by: Qu Wenruo 
---
  cmds-check.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 7fc30da83ea1..f302724dd840 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4997,6 +4997,10 @@ int repair_ternary_lowmem(struct btrfs_root *root, u64 dir_ino, u64 ino,
goto out;
}
if (stage == 1) {
+   ret = btrfs_unlink(trans, root, ino, dir_ino, index, name,
+  name_len, 0);
+   if (ret)
+   goto out;
ret = btrfs_add_link(trans, root, ino, dir_ino, name, name_len,
   filetype, &index, 1, 1);
goto out;






Re: [PATCH 05/16] btrfs-progs: check: Export check global variables to check/common.h

2018-01-18 Thread Qu Wenruo


On 2018-01-19 14:55, Su Yue wrote:
> 
> 
> On 01/19/2018 01:37 PM, Qu Wenruo wrote:
>> There are about a dozen variables which are used as "check global"
>> variables, like @total_csum_bytes or @no_holes.
>>
>> These variables are used freely across the check code; however, since
>> we're splitting the check code, they need to be exported so they can be
>> used in other files.
>>
>> This patch just exports them and adds declarations for them in
>> check/common.h.
>>
>> Signed-off-by: Qu Wenruo 
>> ---
>>   check/common.h | 17 +
>>   check/main.c   | 32 
>>   2 files changed, 33 insertions(+), 16 deletions(-)
>>
>> diff --git a/check/common.h b/check/common.h
>> index 25874aec597b..8d93ddbf4afb 100644
>> --- a/check/common.h
>> +++ b/check/common.h
>> @@ -36,4 +36,21 @@ struct node_refs {
>>   int full_backref[BTRFS_MAX_LEVEL];
>>   };
>>  
>> +extern u64 bytes_used;
>> +extern u64 total_csum_bytes;
>> +extern u64 total_btree_bytes;
>> +extern u64 total_fs_tree_bytes;
>> +extern u64 total_extent_tree_bytes;
>> +extern u64 btree_space_waste;
>> +extern u64 data_bytes_allocated;
>> +extern u64 data_bytes_referenced;
>> +extern struct list_head duplicate_extents;
>> +extern struct list_head delete_items;
>> +extern int no_holes;
>> +extern int init_extent_tree;
>> +extern int check_data_csum;
>> +extern struct btrfs_fs_info *global_info;
>> +extern struct task_ctx ctx;
>> +extern struct cache_tree *roots_info_cache;
>> +
>>   #endif
>> diff --git a/check/main.c b/check/main.c
>> index fbd73c42bee8..bb927ecc87ee 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -61,22 +61,22 @@ struct task_ctx {
>>   struct task_info *info;
>>   };
>>  
>> -static u64 bytes_used = 0;
>> -static u64 total_csum_bytes = 0;
>> -static u64 total_btree_bytes = 0;
>> -static u64 total_fs_tree_bytes = 0;
>> -static u64 total_extent_tree_bytes = 0;
>> -static u64 btree_space_waste = 0;
>> -static u64 data_bytes_allocated = 0;
>> -static u64 data_bytes_referenced = 0;
>> -static LIST_HEAD(duplicate_extents);
>> -static LIST_HEAD(delete_items);
>> -static int no_holes = 0;
>> -static int init_extent_tree = 0;
>> -static int check_data_csum = 0;
>> -static struct btrfs_fs_info *global_info;
>> -static struct task_ctx ctx = { 0 };
>> -static struct cache_tree *roots_info_cache = NULL;
>> +u64 bytes_used = 0;
>> +u64 total_csum_bytes = 0;
>> +u64 total_btree_bytes = 0;
>> +u64 total_fs_tree_bytes = 0;
>> +u64 total_extent_tree_bytes = 0;
>> +u64 btree_space_waste = 0;
>> +u64 data_bytes_allocated = 0;
>> +u64 data_bytes_referenced = 0;
>> +LIST_HEAD(duplicate_extents);
>> +LIST_HEAD(delete_items);
>> +int no_holes = 0;
>> +int init_extent_tree = 0;
>> +int check_data_csum = 0;
> 
> Just a small suggestion:
> Since the patchset only splits cmds-check.c without functional changes,
> maybe it's a good time to adjust those lines of old code according to the
> errors and warnings reported by checkpatch?

I'll do it in PART2, with new comments and format cleanup.

Right now I'd prefer this to get merged first to provide the basis for later
cleanup.

Thanks,
Qu

> 
> Thanks,
> Su
> 
>> +struct btrfs_fs_info *global_info;
>> +struct task_ctx ctx = { 0 };
>> +struct cache_tree *roots_info_cache = NULL;
>>     enum btrfs_check_mode {
>>   CHECK_MODE_ORIGINAL,
>>
> 
> 





[PATCH v2 0/3] Lowmem fsck repair to fix filetype mismatch

2018-01-18 Thread Qu Wenruo
Sebastian reported a filesystem corruption where a DIR_INDEX has the wrong
filetype compared with its INODE_ITEM.

Lowmem mode normally handles such a problem by checking DIR_INDEX,
DIR_ITEM and INODE_REF/INODE_ITEM to determine the correct file type.
In such a case, lowmem mode fsck can get the correct filetype.

When fixing the problem, lowmem mode will try to re-insert the correct
(DIR_INDEX, DIR_ITEM, INODE_REF) tuple, and if an existing correct
DIR_ITEM and INODE_REF are found, btrfs_link() will just skip them and only
insert the correct DIR_INDEX.

However, when inserting the correct DIR_INDEX, due to the extra DIR_INDEX
validation, the incorrect one will be skipped and the correct one will be
inserted after the invalid one.

This causes lowmem mode repair to create a duplicated DIR_INDEX.

This patchset fixes it by removing the whole (DIR_INDEX, DIR_ITEM,
INODE_REF) tuple before inserting the correct tuple.
And the removing part, btrfs_unlink(), will be enhanced to handle
incorrect tuple members more robustly.

Please note that, due to a bug in lowmem mode repair, btrfs check will
still show "error(s) found in fs tree" even if the repair is done successfully.

And the test case for this repair still needs extra work so that original mode
supports such repair; otherwise the test case won't pass the original mode test.

Changelog:
v2:
  No longer play tricks by adding new parameters to let btrfs_unlink()
  locate the invalid dir_index; instead, remove the unnecessary filetype
  check in verify_dir_item().
  This is safe since the users of the functions in dir-item.c are all in
  btrfs check, either repairing or checking, and both original mode and
  lowmem mode can handle it well.

Qu Wenruo (3):
  btrfs-progs: lowmem fsck: Remove corupted link before re-add correct
link
  btrfs-progs: dir-item: Don't do extra filetype validaction check for
btrfs_match_dir_item_name
  btrfs-progs: dir-item: Make btrfs_delete_one_dir_name more robust to
handle corrupted name len

 cmds-check.c |  4 
 dir-item.c   | 17 +
 2 files changed, 13 insertions(+), 8 deletions(-)

-- 
2.15.1
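
As a rough illustration of the failure mode described in this cover letter,
here is a minimal, self-contained sketch. It is not btrfs-progs code:
struct dir, dir_lookup(), dir_add_link(), dir_unlink(), dir_count() and
FT_MAX are hypothetical stand-ins for the real DIR_INDEX handling, and the
whole directory is modelled as a flat array. It only shows why a lookup that
skips entries with an invalid filetype lets the insert path create a
duplicate, while an unvalidated unlink followed by an insert does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FT_MAX 9          /* hypothetical: smallest invalid filetype value */
#define MAX_ENTRIES 8

struct dir_index {        /* hypothetical stand-in for one DIR_INDEX entry */
	const char *name;
	unsigned char filetype;
};

struct dir {
	struct dir_index entries[MAX_ENTRIES];
	size_t nr;
};

/* Find @name; with @validate set, entries with a bad filetype are skipped. */
static int dir_lookup(const struct dir *d, const char *name, int validate)
{
	for (size_t i = 0; i < d->nr; i++) {
		if (validate && d->entries[i].filetype >= FT_MAX)
			continue;
		if (strcmp(d->entries[i].name, name) == 0)
			return (int)i;
	}
	return -1;
}

/* Insert a link unless a match that passes validation already exists. */
static int dir_add_link(struct dir *d, const char *name, unsigned char type)
{
	assert(type < FT_MAX);
	if (dir_lookup(d, name, 1) >= 0)
		return 0;
	if (d->nr == MAX_ENTRIES)
		return -1;
	d->entries[d->nr].name = name;
	d->entries[d->nr].filetype = type;
	d->nr++;
	return 0;
}

/* Remove @name even if its filetype is invalid (lookup without validation). */
static int dir_unlink(struct dir *d, const char *name)
{
	int i = dir_lookup(d, name, 0);

	if (i < 0)
		return -1;
	d->entries[i] = d->entries[d->nr - 1];
	d->nr--;
	return 0;
}

/* Count entries named @name, valid or not. */
static size_t dir_count(const struct dir *d, const char *name)
{
	size_t n = 0;

	for (size_t i = 0; i < d->nr; i++)
		if (strcmp(d->entries[i].name, name) == 0)
			n++;
	return n;
}
```

In this model, calling dir_add_link() on a directory that already holds
"file1" with filetype 34 leaves two "file1" entries, whereas dir_unlink()
followed by dir_add_link() leaves exactly one with the correct type.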



[PATCH v2 2/3] btrfs-progs: dir-item: Don't do extra filetype validaction check for btrfs_match_dir_item_name

2018-01-18 Thread Qu Wenruo
btrfs_match_dir_item_name() checks whether the filetype is valid before
doing the search, which makes btrfs-progs unable to locate and remove an
invalid dir_index in btrfs_unlink().

This function only affects btrfs_link() and btrfs_unlink() in the upper
layer, and normal check can find an invalid filetype by itself.

So removing the filetype check is completely safe in this case, and will
enhance btrfs_unlink() to remove invalid dir_index/dir_item for repair.

Signed-off-by: Qu Wenruo 
---
 dir-item.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/dir-item.c b/dir-item.c
index 462546c0eaf4..e0a0ab4d7a5d 100644
--- a/dir-item.c
+++ b/dir-item.c
@@ -294,12 +294,6 @@ static int verify_dir_item(struct btrfs_root *root,
u16 namelen = BTRFS_NAME_LEN;
u8 type = btrfs_dir_type(leaf, dir_item);
 
-   if (type >= BTRFS_FT_MAX) {
-   fprintf(stderr, "invalid dir item type: %d\n",
-  (int)type);
-   return 1;
-   }
-
if (type == BTRFS_FT_XATTR)
namelen = XATTR_NAME_MAX;
 
-- 
2.15.1



[PATCH v2 1/3] btrfs-progs: lowmem fsck: Remove corupted link before re-add correct link

2018-01-18 Thread Qu Wenruo
For repair_ternary_lowmem() used in lowmem mode, if it finds one of
DIR_INDEX/DIR_ITEM/INODE_REF missing, it will try to insert the correct
link.

However, for cases like an invalid type in DIR_INDEX, we should delete the
corrupted DIR_INDEX first before inserting the correct link.

This patch removes the corrupted link before re-inserting it.
This should solve the duplicated DIR_INDEX problem in the old lowmem mode
repair.

Cc: Sebastian Andrzej Siewior 
Signed-off-by: Qu Wenruo 
---
 cmds-check.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 7fc30da83ea1..f302724dd840 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -4997,6 +4997,10 @@ int repair_ternary_lowmem(struct btrfs_root *root, u64 dir_ino, u64 ino,
goto out;
}
if (stage == 1) {
+   ret = btrfs_unlink(trans, root, ino, dir_ino, index, name,
+  name_len, 0);
+   if (ret)
+   goto out;
ret = btrfs_add_link(trans, root, ino, dir_ino, name, name_len,
   filetype, &index, 1, 1);
goto out;
-- 
2.15.1



[PATCH v2 3/3] btrfs-progs: dir-item: Make btrfs_delete_one_dir_name more robust to handle corrupted name len

2018-01-18 Thread Qu Wenruo
Function btrfs_delete_one_dir_name() will check if the dir_item is the
last content of the item, and delete the whole item if needed.

However, if @name_len of one dir_item/dir_index is corrupted and larger
than the item size, the function will still try to treat it as a partial
removal, which will screw up the whole leaf.

This patch enhances the item deletion check to cover a corrupted name
len, so in that case we just delete the whole item.

Signed-off-by: Qu Wenruo 
---
 dir-item.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/dir-item.c b/dir-item.c
index e0a0ab4d7a5d..35e0615fb423 100644
--- a/dir-item.c
+++ b/dir-item.c
@@ -263,7 +263,6 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
  struct btrfs_path *path,
  struct btrfs_dir_item *di)
 {
-
struct extent_buffer *leaf;
u32 sub_item_len;
u32 item_len;
@@ -273,7 +272,15 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
sub_item_len = sizeof(*di) + btrfs_dir_name_len(leaf, di) +
btrfs_dir_data_len(leaf, di);
item_len = btrfs_item_size_nr(leaf, path->slots[0]);
-   if (sub_item_len == item_len) {
+
+   /*
+* If @sub_item_len is longer than @item_len, then it means the
+* name_len is just corrupted.
+* No good idea to know if there is anything we can recover from
+* the corrupted item.
+* Just delete the item.
+*/
+   if (sub_item_len >= item_len) {
ret = btrfs_del_item(trans, root, path);
} else {
unsigned long ptr = (unsigned long)di;
-- 
2.15.1
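
The effect of switching the comparison from == to >= can be sketched in
isolation. The helper below is hypothetical (pick_del_action() and
enum del_action do not exist in btrfs-progs); @header_len stands in for
sizeof(struct btrfs_dir_item), and the lengths are passed in directly instead
of being read from a leaf. It only models the decision, not the deletion.

```c
#include <assert.h>
#include <stdint.h>

enum del_action {
	DEL_WHOLE_ITEM,   /* drop the entire item from the leaf */
	DEL_SUB_ITEM,     /* cut one entry out and shrink the item */
};

/*
 * Hypothetical helper mirroring the patched condition: an entry whose
 * computed length reaches or exceeds the item size is treated as
 * "delete the whole item", which also covers a corrupted name_len.
 */
static enum del_action pick_del_action(uint32_t header_len, uint32_t name_len,
				       uint32_t data_len, uint32_t item_len)
{
	uint32_t sub_item_len = header_len + name_len + data_len;

	/* A real item always holds at least one entry header. */
	assert(item_len >= header_len);

	if (sub_item_len >= item_len)
		return DEL_WHOLE_ITEM;
	return DEL_SUB_ITEM;
}
```

With the old == test, an entry claiming a name length far larger than the
item would fall into the partial-removal branch; with >=, it takes the
whole-item branch instead.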



Re: [PATCH 05/16] btrfs-progs: check: Export check global variables to check/common.h

2018-01-18 Thread Su Yue



On 01/19/2018 01:37 PM, Qu Wenruo wrote:

There are about a dozen variables which are used as "check global"
variables, like @total_csum_bytes or @no_holes.

These variables are used freely across the check code; however, since
we're splitting the check code, they need to be exported so they can be
used in other files.

This patch just exports them and adds declarations for them in
check/common.h.

Signed-off-by: Qu Wenruo 
---
  check/common.h | 17 +
  check/main.c   | 32 
  2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/check/common.h b/check/common.h
index 25874aec597b..8d93ddbf4afb 100644
--- a/check/common.h
+++ b/check/common.h
@@ -36,4 +36,21 @@ struct node_refs {
int full_backref[BTRFS_MAX_LEVEL];
  };
  
+extern u64 bytes_used;
+extern u64 total_csum_bytes;
+extern u64 total_btree_bytes;
+extern u64 total_fs_tree_bytes;
+extern u64 total_extent_tree_bytes;
+extern u64 btree_space_waste;
+extern u64 data_bytes_allocated;
+extern u64 data_bytes_referenced;
+extern struct list_head duplicate_extents;
+extern struct list_head delete_items;
+extern int no_holes;
+extern int init_extent_tree;
+extern int check_data_csum;
+extern struct btrfs_fs_info *global_info;
+extern struct task_ctx ctx;
+extern struct cache_tree *roots_info_cache;
+
  #endif
diff --git a/check/main.c b/check/main.c
index fbd73c42bee8..bb927ecc87ee 100644
--- a/check/main.c
+++ b/check/main.c
@@ -61,22 +61,22 @@ struct task_ctx {
struct task_info *info;
  };
  
-static u64 bytes_used = 0;
-static u64 total_csum_bytes = 0;
-static u64 total_btree_bytes = 0;
-static u64 total_fs_tree_bytes = 0;
-static u64 total_extent_tree_bytes = 0;
-static u64 btree_space_waste = 0;
-static u64 data_bytes_allocated = 0;
-static u64 data_bytes_referenced = 0;
-static LIST_HEAD(duplicate_extents);
-static LIST_HEAD(delete_items);
-static int no_holes = 0;
-static int init_extent_tree = 0;
-static int check_data_csum = 0;
-static struct btrfs_fs_info *global_info;
-static struct task_ctx ctx = { 0 };
-static struct cache_tree *roots_info_cache = NULL;
+u64 bytes_used = 0;
+u64 total_csum_bytes = 0;
+u64 total_btree_bytes = 0;
+u64 total_fs_tree_bytes = 0;
+u64 total_extent_tree_bytes = 0;
+u64 btree_space_waste = 0;
+u64 data_bytes_allocated = 0;
+u64 data_bytes_referenced = 0;
+LIST_HEAD(duplicate_extents);
+LIST_HEAD(delete_items);
+int no_holes = 0;
+int init_extent_tree = 0;
+int check_data_csum = 0;


Just a small suggestion:
Since the patchset only splits cmds-check.c without functional changes,
maybe it's a good time to adjust those lines of old code according to the
errors and warnings reported by checkpatch?


Thanks,
Su


+struct btrfs_fs_info *global_info;
+struct task_ctx ctx = { 0 };
+struct cache_tree *roots_info_cache = NULL;
  
  enum btrfs_check_mode {
CHECK_MODE_ORIGINAL,
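
For readers less familiar with the static-to-extern change in this patch, a
condensed single-file sketch of the pattern follows. It assumes uint64_t in
place of the project's u64 typedef, shows only two of the exported variables,
and account_csum() is a hypothetical user that is not part of the patch. In
the real tree the three sections would live in check/common.h, check/main.c
and some other check/*.c file respectively.

```c
#include <assert.h>
#include <stdint.h>

/* --- shared header part: declarations only, no storage is allocated --- */
extern uint64_t total_csum_bytes;
extern int no_holes;

/* --- exactly one .c file: the definitions, without 'static' --- */
uint64_t total_csum_bytes = 0;
int no_holes = 0;

/* --- any other file: needs only the declarations above to link --- */
static void account_csum(uint64_t bytes)   /* hypothetical helper */
{
	assert(bytes > 0);
	total_csum_bytes += bytes;
}
```

Keeping 'static' on the definitions would give them internal linkage, so the
extern declarations in the header could not be resolved from other files.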






Re: [PATCH 08/16] btrfs-progs: check: Move count_csum_range function to check/common.c

2018-01-18 Thread Su Yue




On 01/19/2018 01:37 PM, Qu Wenruo wrote:

Besides moving it to check/common.c, also:

1) Add extra comment of the function
2) Change @root paramter to @fs_info


'paramter' may be misspelled - perhaps 'parameter'?

Thanks,
Su

Since @root is never used, csum_root is picked from fs_info anyway.

Signed-off-by: Qu Wenruo 
---
  Makefile   |   2 +-
  check/common.c | 101 +
  check/common.h |   3 ++
  check/main.c   |  77 ++-
  4 files changed, 108 insertions(+), 75 deletions(-)
  create mode 100644 check/common.c

diff --git a/Makefile b/Makefile
index c4e2dc5b68a9..a00a982a18df 100644
--- a/Makefile
+++ b/Makefile
@@ -113,7 +113,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
   cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \
   cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o \
-  mkfs/common.o
+  mkfs/common.o check/common.o
  libbtrfs_objects = send-stream.o send-utils.o kernel-lib/rbtree.o btrfs-list.o \
   kernel-lib/crc32c.o messages.o \
   uuid-tree.o utils-lib.o rbtree-utils.o
diff --git a/check/common.c b/check/common.c
new file mode 100644
index ..ed4f2a40bac2
--- /dev/null
+++ b/check/common.c
@@ -0,0 +1,101 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include "ctree.h"
+#include "internal.h"
+#include "check/common.h"
+
+/*
+ * Search in csum tree to find how many bytes of range [@start, @start + @len)
+ * has the corresponding csum item.
+ *
+ * @start: range start
+ * @len:   range length
+ * @found: return value of found csum bytes
+ * unit is BYTE.
+ */
+int count_csum_range(struct btrfs_fs_info *fs_info, u64 start,
+u64 len, u64 *found)
+{
+   struct btrfs_key key;
+   struct btrfs_path path;
+   struct extent_buffer *leaf;
+   int ret;
+   size_t size;
+   *found = 0;
+   u64 csum_end;
+   u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
+
+   btrfs_init_path(&path);
+
+   key.objectid = BTRFS_EXTENT_CSUM_OBJECTID;
+   key.offset = start;
+   key.type = BTRFS_EXTENT_CSUM_KEY;
+
+   ret = btrfs_search_slot(NULL, fs_info->csum_root,
+   &key, &path, 0, 0);
+   if (ret < 0)
+   goto out;
+   if (ret > 0 && path.slots[0] > 0) {
+   leaf = path.nodes[0];
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0] - 1);
+   if (key.objectid == BTRFS_EXTENT_CSUM_OBJECTID &&
+   key.type == BTRFS_EXTENT_CSUM_KEY)
+   path.slots[0]--;
+   }
+
+   while (len > 0) {
+   leaf = path.nodes[0];
+   if (path.slots[0] >= btrfs_header_nritems(leaf)) {
+   ret = btrfs_next_leaf(fs_info->csum_root, &path);
+   if (ret > 0)
+   break;
+   else if (ret < 0)
+   goto out;
+   leaf = path.nodes[0];
+   }
+
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
+   if (key.objectid != BTRFS_EXTENT_CSUM_OBJECTID ||
+   key.type != BTRFS_EXTENT_CSUM_KEY)
+   break;
+
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
+   if (key.offset >= start + len)
+   break;
+
+   if (key.offset > start)
+   start = key.offset;
+
+   size = btrfs_item_size_nr(leaf, path.slots[0]);
+   csum_end = key.offset + (size / csum_size) *
+  fs_info->sectorsize;
+   if (csum_end > start) {
+   size = min(csum_end - start, len);
+   len -= size;
+   start += size;
+   *found += size;
+   }
+
+   path.slots[0]++;
+   }
+out:
+   btrfs_release_path(&path);
+   if (ret < 0)
+   return ret;
+   return 0;
+}
+
diff --git a/check/common.h b/check/common.h
index 77a0ab54166
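
The range accounting that count_csum_range() performs can be sketched without
the btree walk. The version below is a simplified model, not the patch's
code: struct csum_item and count_covered() are hypothetical, the csum items
are given as a sorted, non-overlapping array, and the sketch tracks a fixed
range end instead of decrementing @len as the patch does. Each item covers
(item_size / csum_size) sectors starting at its offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct csum_item {            /* hypothetical stand-in for one csum item */
	uint64_t offset;      /* key.offset: first byte covered */
	uint32_t item_size;   /* bytes of checksum data stored in the item */
};

/*
 * Count how many bytes of [start, start + len) are covered by @items,
 * which must be sorted by offset and must not overlap each other.
 */
static uint64_t count_covered(const struct csum_item *items, size_t nr,
			      uint64_t start, uint64_t len,
			      uint32_t csum_size, uint32_t sectorsize)
{
	uint64_t end = start + len;
	uint64_t found = 0;

	assert(csum_size > 0);

	for (size_t i = 0; i < nr && start < end; i++) {
		uint64_t item_end = items[i].offset +
			(uint64_t)(items[i].item_size / csum_size) * sectorsize;
		uint64_t covered;

		if (items[i].offset >= end)
			break;                  /* item starts past the range */
		if (item_end <= start)
			continue;               /* item ends before the range */
		if (items[i].offset > start)
			start = items[i].offset; /* skip the uncovered gap */

		covered = (item_end < end ? item_end : end) - start;
		found += covered;
		start += covered;
	}
	return found;
}
```

For example, with 4-byte checksums and 4096-byte sectors, an 8-byte item at
offset 0 covers [0, 8192), so a query for [4096, 20480) against that item and
a 4-byte item at offset 16384 counts two covered sectors.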

Re: [PATCH 04/16] btrfs-progs: check: Move node_ptr structure to check/common.h

2018-01-18 Thread Qu Wenruo


On 2018-01-19 13:52, Su Yue wrote:
> The structure name is 'node_refs' not 'node_ptr'.
> Misspelt the patch name?
> 

You got me!

I'll just update the title in github.

Thanks,
Qu

> Thanks,
> Su
> 
> On 01/19/2018 01:37 PM, Qu Wenruo wrote:
>> Signed-off-by: Qu Wenruo 
>> ---
>>   check/common.h | 39 +++
>>   check/main.c   | 11 +--
>>   2 files changed, 40 insertions(+), 10 deletions(-)
>>   create mode 100644 check/common.h
>>
>> diff --git a/check/common.h b/check/common.h
>> new file mode 100644
>> index ..25874aec597b
>> --- /dev/null
>> +++ b/check/common.h
>> @@ -0,0 +1,39 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License v2 as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; if not, write to the
>> + * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
>> + * Boston, MA 021110-1307, USA.
>> + */
>> +
>> +/*
>> + * Defines and function declarations for code shared by both lowmem and
>> + * original mode
>> + */
>> +#ifndef __BTRFS_CHECK_COMMON_H__
>> +#define __BTRFS_CHECK_COMMON_H__
>> +#include "ctree.h"
>> +
>> +/*
>> + * Use for tree walk to walk through trees whose leaves/nodes can be shared
>> + * between different trees. (Namely subvolume/fs trees)
>> + */
>> +struct node_refs {
>> +    u64 bytenr[BTRFS_MAX_LEVEL];
>> +    u64 refs[BTRFS_MAX_LEVEL];
>> +    int need_check[BTRFS_MAX_LEVEL];
>> +    /* field for checking all trees */
>> +    int checked[BTRFS_MAX_LEVEL];
>> +    /* the corresponding extent should be marked as full backref or not */
>> +    int full_backref[BTRFS_MAX_LEVEL];
>> +};
>> +
>> +#endif
>> diff --git a/check/main.c b/check/main.c
>> index dbd2b755c48f..fbd73c42bee8 100644
>> --- a/check/main.c
>> +++ b/check/main.c
>> @@ -45,6 +45,7 @@
>>   #include "help.h"
>>   #include "check/original.h"
>>   #include "check/lowmem.h"
>> +#include "check/common.h"
>>     enum task_position {
>>   TASK_EXTENTS,
>> @@ -1667,16 +1668,6 @@ static int process_one_leaf(struct btrfs_root *root, struct extent_buffer *eb,
>>   return ret;
>>   }
>>  
>> -struct node_refs {
>> -    u64 bytenr[BTRFS_MAX_LEVEL];
>> -    u64 refs[BTRFS_MAX_LEVEL];
>> -    int need_check[BTRFS_MAX_LEVEL];
>> -    /* field for checking all trees */
>> -    int checked[BTRFS_MAX_LEVEL];
>> -    /* the corresponding extent should be marked as full backref or not */
>> -    int full_backref[BTRFS_MAX_LEVEL];
>> -};
>> -
>>   static int update_nodes_refs(struct btrfs_root *root, u64 bytenr,
>>    struct extent_buffer *eb, struct node_refs *nrefs,
>>    u64 level, int check_all);
>>
> 
> 





Re: [PATCH 04/16] btrfs-progs: check: Move node_ptr structure to check/common.h

2018-01-18 Thread Su Yue

The structure name is 'node_refs' not 'node_ptr'.
Misspelt the patch name?

Thanks,
Su

On 01/19/2018 01:37 PM, Qu Wenruo wrote:

Signed-off-by: Qu Wenruo 
---
  check/common.h | 39 +++
  check/main.c   | 11 +--
  2 files changed, 40 insertions(+), 10 deletions(-)
  create mode 100644 check/common.h

diff --git a/check/common.h b/check/common.h
new file mode 100644
index ..25874aec597b
--- /dev/null
+++ b/check/common.h
@@ -0,0 +1,39 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+/*
+ * Defines and function declarations for code shared by both lowmem and
+ * original mode
+ */
+#ifndef __BTRFS_CHECK_COMMON_H__
+#define __BTRFS_CHECK_COMMON_H__
+#include "ctree.h"
+
+/*
+ * Use for tree walk to walk through trees whose leaves/nodes can be shared
+ * between different trees. (Namely subvolume/fs trees)
+ */
+struct node_refs {
+   u64 bytenr[BTRFS_MAX_LEVEL];
+   u64 refs[BTRFS_MAX_LEVEL];
+   int need_check[BTRFS_MAX_LEVEL];
+   /* field for checking all trees */
+   int checked[BTRFS_MAX_LEVEL];
+   /* the corresponding extent should be marked as full backref or not */
+   int full_backref[BTRFS_MAX_LEVEL];
+};
+
+#endif
diff --git a/check/main.c b/check/main.c
index dbd2b755c48f..fbd73c42bee8 100644
--- a/check/main.c
+++ b/check/main.c
@@ -45,6 +45,7 @@
  #include "help.h"
  #include "check/original.h"
  #include "check/lowmem.h"
+#include "check/common.h"
  
  enum task_position {
TASK_EXTENTS,
@@ -1667,16 +1668,6 @@ static int process_one_leaf(struct btrfs_root *root, struct extent_buffer *eb,
return ret;
  }
  
-struct node_refs {
-   u64 bytenr[BTRFS_MAX_LEVEL];
-   u64 refs[BTRFS_MAX_LEVEL];
-   int need_check[BTRFS_MAX_LEVEL];
-   /* field for checking all trees */
-   int checked[BTRFS_MAX_LEVEL];
-   /* the corresponding extent should be marked as full backref or not */
-   int full_backref[BTRFS_MAX_LEVEL];
-};
-
  static int update_nodes_refs(struct btrfs_root *root, u64 bytenr,
 struct extent_buffer *eb, struct node_refs *nrefs,
 u64 level, int check_all);






[PATCH 14/16] btrfs-progs: check: Move reset_cached_block_groups to check/common.c

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/common.c | 25 +
 check/common.h |  1 +
 check/main.c   | 27 ---
 3 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/check/common.c b/check/common.c
index 4cdc46b0ba7c..d6abf6d6733c 100644
--- a/check/common.c
+++ b/check/common.c
@@ -324,3 +324,28 @@ int check_child_node(struct extent_buffer *parent, int slot,
}
return ret;
 }
+
+void reset_cached_block_groups(struct btrfs_fs_info *fs_info)
+{
+   struct btrfs_block_group_cache *cache;
+   u64 start, end;
+   int ret;
+
+   while (1) {
+   ret = find_first_extent_bit(&fs_info->free_space_cache, 0,
+   &start, &end, EXTENT_DIRTY);
+   if (ret)
+   break;
+   clear_extent_dirty(&fs_info->free_space_cache, start, end);
+   }
+
+   start = 0;
+   while (1) {
+   cache = btrfs_lookup_first_block_group(fs_info, start);
+   if (!cache)
+   break;
+   if (cache->cached)
+   cache->cached = 0;
+   start = cache->key.objectid + cache->key.offset;
+   }
+}
diff --git a/check/common.h b/check/common.h
index d200a0c90e38..09745af4932f 100644
--- a/check/common.h
+++ b/check/common.h
@@ -95,5 +95,6 @@ void reada_walk_down(struct btrfs_root *root, struct extent_buffer *node,
 int slot);
 int check_child_node(struct extent_buffer *parent, int slot,
 struct extent_buffer *child);
+void reset_cached_block_groups(struct btrfs_fs_info *fs_info);
 
 #endif
diff --git a/check/main.c b/check/main.c
index af4e54857fbf..3c556db90c30 100644
--- a/check/main.c
+++ b/check/main.c
@@ -412,8 +412,6 @@ static void free_file_extent_holes(struct rb_root *holes)
}
 }
 
-static void reset_cached_block_groups(struct btrfs_fs_info *fs_info);
-
 static void record_root_in_trans(struct btrfs_trans_handle *trans,
 struct btrfs_root *root)
 {
@@ -10196,31 +10194,6 @@ static int prune_corrupt_blocks(struct btrfs_fs_info *info)
return 0;
 }
 
-static void reset_cached_block_groups(struct btrfs_fs_info *fs_info)
-{
-   struct btrfs_block_group_cache *cache;
-   u64 start, end;
-   int ret;
-
-   while (1) {
-   ret = find_first_extent_bit(&fs_info->free_space_cache, 0,
-   &start, &end, EXTENT_DIRTY);
-   if (ret)
-   break;
-   clear_extent_dirty(&fs_info->free_space_cache, start, end);
-   }
-
-   start = 0;
-   while (1) {
-   cache = btrfs_lookup_first_block_group(fs_info, start);
-   if (!cache)
-   break;
-   if (cache->cached)
-   cache->cached = 0;
-   start = cache->key.objectid + cache->key.offset;
-   }
-}
-
 static int check_extent_refs(struct btrfs_root *root,
 struct cache_tree *extent_cache)
 {
-- 
2.15.1
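
For readers following the move, a minimal standalone sketch of the second
loop in reset_cached_block_groups() may help. It is not btrfs-progs code:
struct demo_group and demo_lookup_first() are hypothetical stand-ins, and
the lookup is assumed to return the first group that contains or follows
the given bytenr. The point is only the iteration pattern: look up, clear
the flag, then step to the end of the group just found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block group covering [start, start + len). */
struct demo_group {
	uint64_t start;
	uint64_t len;
	int cached;
};

/*
 * Return the first group that ends after @bytenr, or NULL if there is none.
 * Assumes @groups is sorted by start, non-overlapping, and every len > 0.
 */
static struct demo_group *demo_lookup_first(struct demo_group *groups,
					     size_t nr, uint64_t bytenr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (groups[i].start + groups[i].len > bytenr)
			return &groups[i];
	}
	return NULL;
}

/* Clear the cached flag on every group; returns how many were visited. */
static size_t demo_reset_cached(struct demo_group *groups, size_t nr)
{
	uint64_t start = 0;
	size_t visited = 0;

	assert(groups != NULL || nr == 0);
	while (1) {
		struct demo_group *g = demo_lookup_first(groups, nr, start);

		if (!g)
			break;
		g->cached = 0;
		/* Step past this group so the next lookup finds its successor. */
		start = g->start + g->len;
		visited++;
	}
	return visited;
}
```

Advancing start to the end of the found group is what guarantees the loop
terminates, since each lookup can only return a later group.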



[PATCH 13/16] btrfs-progs: check: Move check_child_node to check/common.c

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/common.c | 49 +
 check/common.h |  2 ++
 check/main.c   | 49 -
 3 files changed, 51 insertions(+), 49 deletions(-)

diff --git a/check/common.c b/check/common.c
index 1ea9f506713a..4cdc46b0ba7c 100644
--- a/check/common.c
+++ b/check/common.c
@@ -275,3 +275,52 @@ void reada_walk_down(struct btrfs_root *root, struct extent_buffer *node,
readahead_tree_block(fs_info, bytenr, ptr_gen);
}
 }
+
+/*
+ * Check the child node/leaf by the following condition:
+ * 1. the first item key of the node/leaf should be the same with the one
+ *in parent.
+ * 2. block in parent node should match the child node/leaf.
+ * 3. generation of parent node and child's header should be consistent.
+ *
+ * Or the child node/leaf pointed by the key in parent is not valid.
+ *
+ * We hope to check leaf owner too, but since subvol may share leaves,
+ * which makes leaf owner check not so strong, key check should be
+ * sufficient enough for that case.
+ */
+int check_child_node(struct extent_buffer *parent, int slot,
+struct extent_buffer *child)
+{
+   struct btrfs_key parent_key;
+   struct btrfs_key child_key;
+   int ret = 0;
+
+   btrfs_node_key_to_cpu(parent, &parent_key, slot);
+   if (btrfs_header_level(child) == 0)
+   btrfs_item_key_to_cpu(child, &child_key, 0);
+   else
+   btrfs_node_key_to_cpu(child, &child_key, 0);
+
+   if (memcmp(&parent_key, &child_key, sizeof(parent_key))) {
+   ret = -EINVAL;
+   fprintf(stderr,
+   "Wrong key of child node/leaf, wanted: (%llu, %u, %llu), have: (%llu, %u, %llu)\n",
+   parent_key.objectid, parent_key.type, parent_key.offset,
+   child_key.objectid, child_key.type, child_key.offset);
+   }
+   if (btrfs_header_bytenr(child) != btrfs_node_blockptr(parent, slot)) {
+   ret = -EINVAL;
+   fprintf(stderr, "Wrong block of child node/leaf, wanted: %llu, have: %llu\n",
+   btrfs_node_blockptr(parent, slot),
+   btrfs_header_bytenr(child));
+   }
+   if (btrfs_node_ptr_generation(parent, slot) !=
+   btrfs_header_generation(child)) {
+   ret = -EINVAL;
+   fprintf(stderr, "Wrong generation of child node/leaf, wanted: %llu, have: %llu\n",
+   btrfs_header_generation(child),
+   btrfs_node_ptr_generation(parent, slot));
+   }
+   return ret;
+}
diff --git a/check/common.h b/check/common.h
index 34b2f8a9cd87..d200a0c90e38 100644
--- a/check/common.h
+++ b/check/common.h
@@ -93,5 +93,7 @@ int link_inode_to_lostfound(struct btrfs_trans_handle *trans,
 void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize);
 void reada_walk_down(struct btrfs_root *root, struct extent_buffer *node,
 int slot);
+int check_child_node(struct extent_buffer *parent, int slot,
+struct extent_buffer *child);
 
 #endif
diff --git a/check/main.c b/check/main.c
index aa7098a0be96..af4e54857fbf 100644
--- a/check/main.c
+++ b/check/main.c
@@ -1667,55 +1667,6 @@ out:
return ret;
 }
 
-/*
- * Check the child node/leaf by the following condition:
- * 1. the first item key of the node/leaf should be the same with the one
- *in parent.
- * 2. block in parent node should match the child node/leaf.
- * 3. generation of parent node and child's header should be consistent.
- *
- * Or the child node/leaf pointed by the key in parent is not valid.
- *
- * We hope to check leaf owner too, but since subvol may share leaves,
- * which makes leaf owner check not so strong, key check should be
- * sufficient enough for that case.
- */
-static int check_child_node(struct extent_buffer *parent, int slot,
-   struct extent_buffer *child)
-{
-   struct btrfs_key parent_key;
-   struct btrfs_key child_key;
-   int ret = 0;
-
-   btrfs_node_key_to_cpu(parent, &parent_key, slot);
-   if (btrfs_header_level(child) == 0)
-   btrfs_item_key_to_cpu(child, &child_key, 0);
-   else
-   btrfs_node_key_to_cpu(child, &child_key, 0);
-
-   if (memcmp(&parent_key, &child_key, sizeof(parent_key))) {
-   ret = -EINVAL;
-   fprintf(stderr,
-   "Wrong key of child node/leaf, wanted: (%llu, %u, %llu), have: (%llu, %u, %llu)\n",
-   parent_key.objectid, parent_key.type, parent_key.offset,
-   child_key.objectid, child_key.type, child_key.offset);
-   }
-   if (btrfs_header_bytenr(child) != btrfs_node_blockptr(parent, slot)) {
-   ret = -EINVAL;
-   fprintf(stderr, "Wrong block of child node/leaf, wanted: %llu, have: %llu\n",
- 
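
As a reading aid for the three conditions listed in the comment above
check_child_node(), here is a minimal sketch with simplified types. All
demo_* names are hypothetical and are not part of btrfs-progs. The sketch
compares key fields one by one instead of using memcmp(), because the
stand-in struct is not packed and may contain padding.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a (objectid, type, offset) key. */
struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* What the parent records about the child in one slot. */
struct demo_ptr {
	struct demo_key key;
	uint64_t blockptr;
	uint64_t generation;
};

/* What the child block says about itself. */
struct demo_child {
	struct demo_key first_key;
	uint64_t bytenr;
	uint64_t generation;
};

static int demo_key_equal(const struct demo_key *a, const struct demo_key *b)
{
	return a->objectid == b->objectid && a->type == b->type &&
	       a->offset == b->offset;
}

/* Returns 0 if parent and child agree, -EINVAL on any mismatch. */
static int demo_check_child(const struct demo_ptr *parent,
			    const struct demo_child *child)
{
	int ret = 0;

	/* 1. first key of the child must match the key stored in the parent */
	if (!demo_key_equal(&parent->key, &child->first_key))
		ret = -EINVAL;
	/* 2. block pointer in the parent must match the child's own bytenr */
	if (parent->blockptr != child->bytenr)
		ret = -EINVAL;
	/* 3. generation in the parent must match the child's header */
	if (parent->generation != child->generation)
		ret = -EINVAL;
	return ret;
}
```

Like the function in the patch, the sketch keeps checking after the first
mismatch, so a caller that printed diagnostics would see every failed
condition rather than only the first one.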

[PATCH 10/16] btrfs-progs: check: Move link_inode_to_lostfound function to common.c

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/common.c | 93 ++
 check/common.h |  5 
 check/main.c   | 92 -
 3 files changed, 98 insertions(+), 92 deletions(-)

diff --git a/check/common.c b/check/common.c
index 6a7d86dfd8f7..9051936a61cb 100644
--- a/check/common.c
+++ b/check/common.c
@@ -19,6 +19,7 @@
 #include "internal.h"
 #include "messages.h"
 #include "transaction.h"
+#include "utils.h"
 #include "check/common.h"
 
 /*
@@ -144,3 +145,95 @@ int insert_inode_item(struct btrfs_trans_handle *trans,
 
return 0;
 }
+
+static int get_highest_inode(struct btrfs_trans_handle *trans,
+struct btrfs_root *root, struct btrfs_path *path,
+u64 *highest_ino)
+{
+   struct btrfs_key key, found_key;
+   int ret;
+
+   btrfs_init_path(path);
+   key.objectid = BTRFS_LAST_FREE_OBJECTID;
+   key.offset = -1;
+   key.type = BTRFS_INODE_ITEM_KEY;
+   ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
+   if (ret == 1) {
+   btrfs_item_key_to_cpu(path->nodes[0], &found_key,
+   path->slots[0] - 1);
+   *highest_ino = found_key.objectid;
+   ret = 0;
+   }
+   if (*highest_ino >= BTRFS_LAST_FREE_OBJECTID)
+   ret = -EOVERFLOW;
+   btrfs_release_path(path);
+   return ret;
+}
+
+/*
+ * Link inode to dir 'lost+found'. Increase @ref_count.
+ *
+ * Returns 0 means success.
+ * Returns <0 means failure.
+ */
+int link_inode_to_lostfound(struct btrfs_trans_handle *trans,
+   struct btrfs_root *root,
+   struct btrfs_path *path,
+   u64 ino, char *namebuf, u32 name_len,
+   u8 filetype, u64 *ref_count)
+{
+   char *dir_name = "lost+found";
+   u64 lost_found_ino;
+   int ret;
+   u32 mode = 0700;
+
+   btrfs_release_path(path);
+   ret = get_highest_inode(trans, root, path, &lost_found_ino);
+   if (ret < 0)
+   goto out;
+   lost_found_ino++;
+
+   ret = btrfs_mkdir(trans, root, dir_name, strlen(dir_name),
+ BTRFS_FIRST_FREE_OBJECTID, &lost_found_ino,
+ mode);
+   if (ret < 0) {
+   error("failed to create '%s' dir: %s", dir_name, strerror(-ret));
+   goto out;
+   }
+   ret = btrfs_add_link(trans, root, ino, lost_found_ino,
+namebuf, name_len, filetype, NULL, 1, 0);
+   /*
+* Add ".INO" suffix several times to handle case where
+* "FILENAME.INO" is already taken by another file.
+*/
+   while (ret == -EEXIST) {
+   /*
+* Conflicting file name, add ".INO" as suffix * +1 for '.'
+*/
+   if (name_len + count_digits(ino) + 1 > BTRFS_NAME_LEN) {
+   ret = -EFBIG;
+   goto out;
+   }
+   snprintf(namebuf + name_len, BTRFS_NAME_LEN - name_len,
+".%llu", ino);
+   name_len += count_digits(ino) + 1;
+   ret = btrfs_add_link(trans, root, ino, lost_found_ino, namebuf,
+name_len, filetype, NULL, 1, 0);
+   }
+   if (ret < 0) {
+   error("failed to link the inode %llu to %s dir: %s",
+ ino, dir_name, strerror(-ret));
+   goto out;
+   }
+
+   ++*ref_count;
+   printf("Moving file '%.*s' to '%s' dir since it has no valid backref\n",
+  name_len, namebuf, dir_name);
+out:
+   btrfs_release_path(path);
+   if (ret)
+   error("failed to move file '%.*s' to '%s' dir", name_len,
+   namebuf, dir_name);
+   return ret;
+}
+
diff --git a/check/common.h b/check/common.h
index efab05ad6b68..9a3488ae365a 100644
--- a/check/common.h
+++ b/check/common.h
@@ -85,5 +85,10 @@ int count_csum_range(struct btrfs_fs_info *fs_info, u64 start,
 int insert_inode_item(struct btrfs_trans_handle *trans,
  struct btrfs_root *root, u64 ino, u64 size,
  u64 nbytes, u64 nlink, u32 mode);
+int link_inode_to_lostfound(struct btrfs_trans_handle *trans,
+   struct btrfs_root *root,
+   struct btrfs_path *path,
+   u64 ino, char *namebuf, u32 name_len,
+   u8 filetype, u64 *ref_count);
 
 #endif
diff --git a/check/main.c b/check/main.c
index e594b5986a47..0da25c460336 100644
--- a/check/main.c
+++ b/check/main.c
@@ -2960,98 +2960,6 @@ out:
return ret;
 }
 
-static int get_highest_inode(struct btrfs_trans_handle *trans,
-   struct btrfs_root *root,
-   struct btrfs_path *pat
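
The ".INO" suffix handling in link_inode_to_lostfound() can be tried in
isolation with the sketch below. It is only an illustration under stated
assumptions: DEMO_NAME_LEN stands in for BTRFS_NAME_LEN with an assumed
value of 255, demo_count_digits() is a local helper written for this note,
and the caller is assumed to provide a buffer of DEMO_NAME_LEN + 1 bytes so
that the terminating NUL always fits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 255	/* assumed limit for this sketch */

/* Number of decimal digits needed to print @num (1 for zero). */
static int demo_count_digits(uint64_t num)
{
	int n = 0;

	do {
		n++;
		num /= 10;
	} while (num);
	return n;
}

/*
 * Append ".<ino>" to the name stored in @buf, which currently has
 * @name_len bytes. Returns the new length, or -1 if the suffixed name
 * would exceed DEMO_NAME_LEN (nothing is written in that case).
 */
static int demo_add_ino_suffix(char *buf, int name_len, uint64_t ino)
{
	int extra = demo_count_digits(ino) + 1;	/* +1 for '.' */

	assert(name_len >= 0 && name_len <= DEMO_NAME_LEN);
	if (name_len + extra > DEMO_NAME_LEN)
		return -1;
	snprintf(buf + name_len, (size_t)(DEMO_NAME_LEN + 1 - name_len),
		 ".%llu", (unsigned long long)ino);
	return name_len + extra;
}
```

Calling the helper again on its own result appends a second suffix, which
mirrors how the loop in the patch retries while the name is still taken.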

[PATCH 16/16] btrfs-progs: check/lowmem: Cleanup unnecessary _v2 suffix

2018-01-18 Thread Qu Wenruo
There used to be some functions with a _v2 suffix to distinguish them from
similar original mode functions.

Now that the lowmem code has moved to its own check/lowmem.[ch], clean up
such _v2 suffixes. For functions that really need to be distinguished
from original mode (exported functions), change the _v2 suffix to
_lowmem.

Signed-off-by: Qu Wenruo 
---
 check/lowmem.c | 46 +++---
 check/lowmem.h |  4 ++--
 check/main.c   |  4 ++--
 3 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/check/lowmem.c b/check/lowmem.c
index 7fb2016edb4a..fdd4d624881e 100644
--- a/check/lowmem.c
+++ b/check/lowmem.c
@@ -28,8 +28,8 @@
 #include "check/common.h"
 #include "check/lowmem.h"
 
-static int calc_extent_flag_v2(struct btrfs_root *root, struct extent_buffer *eb,
-  u64 *flags_ret)
+static int calc_extent_flag(struct btrfs_root *root, struct extent_buffer *eb,
+   u64 *flags_ret)
 {
struct btrfs_root *extent_root = root->fs_info->extent_root;
struct btrfs_root_item *ri = &root->root_item;
@@ -225,7 +225,7 @@ static int update_nodes_refs(struct btrfs_root *root, u64 bytenr,
}
 
if (check_all && eb) {
-   calc_extent_flag_v2(root, eb, &flags);
+   calc_extent_flag(root, eb, &flags);
if (flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)
nrefs->full_backref[level] = 1;
}
@@ -2084,8 +2084,8 @@ out:
  * Returns <0  Fatal error, must exit the whole check
  * Returns 0   No errors found
  */
-static int process_one_leaf_v2(struct btrfs_root *root, struct btrfs_path *path,
-  struct node_refs *nrefs, int *level, int ext_ref)
+static int process_one_leaf(struct btrfs_root *root, struct btrfs_path *path,
+   struct node_refs *nrefs, int *level, int ext_ref)
 {
struct extent_buffer *cur = path->nodes[0];
struct btrfs_key key;
@@ -3898,10 +3898,10 @@ out:
  * Returns <0  Fatal error, must exit the whole check
  * Returns 0   No errors found
  */
-static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
-struct btrfs_root *root, struct btrfs_path *path,
-int *level, struct node_refs *nrefs, int ext_ref,
-int check_all)
+static int walk_down_tree(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, struct btrfs_path *path,
+ int *level, struct node_refs *nrefs, int ext_ref,
+ int check_all)
 {
enum btrfs_tree_block_status status;
u64 bytenr;
@@ -3971,8 +3971,8 @@ static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
 
ret = 0;
if (!check_all)
-   ret = process_one_leaf_v2(root, path, nrefs,
- level, ext_ref);
+   ret = process_one_leaf(root, path, nrefs,
+  level, ext_ref);
else
ret = check_leaf_items(trans, root, path,
   nrefs, account_file_data);
@@ -3996,7 +3996,7 @@ static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
if (ret < 0)
break;
/*
-* check all trees in check_chunks_and_extent_v2
+* check all trees in check_chunks_and_extent
 * check shared node once in check_fs_roots
 */
if (!check_all && !nrefs->need_check[*level - 1]) {
@@ -4049,8 +4049,8 @@ static int walk_down_tree_v2(struct btrfs_trans_handle *trans,
return err;
 }
 
-static int walk_up_tree_v2(struct btrfs_root *root, struct btrfs_path *path,
-  int *level)
+static int walk_up_tree(struct btrfs_root *root, struct btrfs_path *path,
+   int *level)
 {
int i;
struct extent_buffer *leaf;
@@ -4203,7 +4203,7 @@ out:
 }
 
 /*
- * This function calls walk_down_tree_v2 and walk_up_tree_v2 to check tree
+ * This function calls walk_down_tree and walk_up_tree to check tree
  * blocks and integrity of fs tree items.
  *
  * @root: the root of the tree to be checked.
@@ -4260,8 +4260,8 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
}
 
while (1) {
-   ret = walk_down_tree_v2(trans, root, &path, &level, &nrefs,
-   ext_ref, check_all);
+   ret = walk_down_tree(trans, root, &path, &level, &nrefs,
+ext_ref, check_all);
 
err |= !!ret;
 
@@ -4271,7 +4271,7 @@ static int check_btrfs_root(struct btrfs_trans_handle *trans,
break;
  

[PATCH 09/16] btrfs-progs: check: Move __create_inode_item function to check/common.c

2018-01-18 Thread Qu Wenruo
Move __create_inode_item() function to check/common.c and rename it to
insert_inode_item(), with comment added.

Signed-off-by: Qu Wenruo 
---
 check/common.c | 45 +
 check/common.h |  3 +++
 check/main.c   | 36 ++--
 3 files changed, 50 insertions(+), 34 deletions(-)

diff --git a/check/common.c b/check/common.c
index ed4f2a40bac2..6a7d86dfd8f7 100644
--- a/check/common.c
+++ b/check/common.c
@@ -14,8 +14,11 @@
  * Boston, MA 021110-1307, USA.
  */
 
+#include <time.h>
 #include "ctree.h"
 #include "internal.h"
+#include "messages.h"
+#include "transaction.h"
 #include "check/common.h"
 
 /*
@@ -99,3 +102,45 @@ out:
return 0;
 }
 
+/*
+ * Wrapper to insert one inode item into given @root
+ * Timestamp will be set to current time.
+ *
+ * @root:  the root to insert inode item into
+ * @ino:   inode number
+ * @size:  inode size
+ * @nbytes:nbytes (real used size, without hole)
+ * @nlink: number of links
+ * @mode:  file mode, including S_IF* bits
+ */
+int insert_inode_item(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, u64 ino, u64 size,
+ u64 nbytes, u64 nlink, u32 mode)
+{
+   struct btrfs_inode_item ii;
+   time_t now = time(NULL);
+   int ret;
+
+   btrfs_set_stack_inode_size(&ii, size);
+   btrfs_set_stack_inode_nbytes(&ii, nbytes);
+   btrfs_set_stack_inode_nlink(&ii, nlink);
+   btrfs_set_stack_inode_mode(&ii, mode);
+   btrfs_set_stack_inode_generation(&ii, trans->transid);
+   btrfs_set_stack_timespec_nsec(&ii.atime, 0);
+   btrfs_set_stack_timespec_sec(&ii.ctime, now);
+   btrfs_set_stack_timespec_nsec(&ii.ctime, 0);
+   btrfs_set_stack_timespec_sec(&ii.mtime, now);
+   btrfs_set_stack_timespec_nsec(&ii.mtime, 0);
+   btrfs_set_stack_timespec_sec(&ii.otime, 0);
+   btrfs_set_stack_timespec_nsec(&ii.otime, 0);
+
+   ret = btrfs_insert_inode(trans, root, ino, &ii);
+   ASSERT(!ret);
+
+   warning("root %llu inode %llu recreating inode item, this may "
+   "be incomplete, please check permissions and content after "
+   "the fsck completes.\n", (unsigned long long)root->objectid,
+   (unsigned long long)ino);
+
+   return 0;
+}
diff --git a/check/common.h b/check/common.h
index cd64798f4804..efab05ad6b68 100644
--- a/check/common.h
+++ b/check/common.h
@@ -82,5 +82,8 @@ static inline int fs_root_objectid(u64 objectid)
 
 int count_csum_range(struct btrfs_fs_info *fs_info, u64 start,
 u64 len, u64 *found);
+int insert_inode_item(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, u64 ino, u64 size,
+ u64 nbytes, u64 nlink, u32 mode);
 
 #endif
diff --git a/check/main.c b/check/main.c
index b891f4815d30..e594b5986a47 100644
--- a/check/main.c
+++ b/check/main.c
@@ -2691,45 +2691,13 @@ static int delete_dir_index(struct btrfs_root *root,
return ret;
 }
 
-static int __create_inode_item(struct btrfs_trans_handle *trans,
-  struct btrfs_root *root, u64 ino, u64 size,
-  u64 nbytes, u64 nlink, u32 mode)
-{
-   struct btrfs_inode_item ii;
-   time_t now = time(NULL);
-   int ret;
-
-   btrfs_set_stack_inode_size(&ii, size);
-   btrfs_set_stack_inode_nbytes(&ii, nbytes);
-   btrfs_set_stack_inode_nlink(&ii, nlink);
-   btrfs_set_stack_inode_mode(&ii, mode);
-   btrfs_set_stack_inode_generation(&ii, trans->transid);
-   btrfs_set_stack_timespec_nsec(&ii.atime, 0);
-   btrfs_set_stack_timespec_sec(&ii.ctime, now);
-   btrfs_set_stack_timespec_nsec(&ii.ctime, 0);
-   btrfs_set_stack_timespec_sec(&ii.mtime, now);
-   btrfs_set_stack_timespec_nsec(&ii.mtime, 0);
-   btrfs_set_stack_timespec_sec(&ii.otime, 0);
-   btrfs_set_stack_timespec_nsec(&ii.otime, 0);
-
-   ret = btrfs_insert_inode(trans, root, ino, &ii);
-   ASSERT(!ret);
-
-   warning("root %llu inode %llu recreating inode item, this may "
-   "be incomplete, please check permissions and content after "
-   "the fsck completes.\n", (unsigned long long)root->objectid,
-   (unsigned long long)ino);
-
-   return 0;
-}
-
 static int create_inode_item_lowmem(struct btrfs_trans_handle *trans,
struct btrfs_root *root, u64 ino,
u8 filetype)
 {
u32 mode = (filetype == BTRFS_FT_DIR ? S_IFDIR : S_IFREG) | 0755;
 
-   return __create_inode_item(trans, root, ino, 0, 0, 0, mode);
+   return insert_inode_item(trans, root, ino, 0, 0, 0, mode);
 }
 
 static int create_inode_item(struct btrfs_root *root,
@@ -2762,7 +2730,7 @@ static int create_inode_item(struct btrfs_root *root,
mode =  S_IFREG | 0755;
}
 
-   ret = __create_inode_item(trans, root, rec->i

[PATCH 12/16] btrfs-progs: check: move reada_walk_down to check/common.c

2018-01-18 Thread Qu Wenruo
Both original and lowmem modes share this function to do readahead.

Signed-off-by: Qu Wenruo 
---
 check/common.c | 23 +++
 check/common.h |  2 ++
 check/main.c   | 22 --
 3 files changed, 25 insertions(+), 22 deletions(-)

diff --git a/check/common.c b/check/common.c
index 7392ed6b472f..1ea9f506713a 100644
--- a/check/common.c
+++ b/check/common.c
@@ -20,6 +20,7 @@
 #include "messages.h"
 #include "transaction.h"
 #include "utils.h"
+#include "disk-io.h"
 #include "check/common.h"
 
 /*
@@ -252,3 +253,25 @@ void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize)
warning("this can be fixed by 'btrfs rescue fix-device-size'");
}
 }
+
+void reada_walk_down(struct btrfs_root *root, struct extent_buffer *node,
+int slot)
+{
+   struct btrfs_fs_info *fs_info = root->fs_info;
+   u64 bytenr;
+   u64 ptr_gen;
+   u32 nritems;
+   int i;
+   int level;
+
+   level = btrfs_header_level(node);
+   if (level != 1)
+   return;
+
+   nritems = btrfs_header_nritems(node);
+   for (i = slot; i < nritems; i++) {
+   bytenr = btrfs_node_blockptr(node, i);
+   ptr_gen = btrfs_node_ptr_generation(node, i);
+   readahead_tree_block(fs_info, bytenr, ptr_gen);
+   }
+}
diff --git a/check/common.h b/check/common.h
index 72146b444a79..34b2f8a9cd87 100644
--- a/check/common.h
+++ b/check/common.h
@@ -91,5 +91,7 @@ int link_inode_to_lostfound(struct btrfs_trans_handle *trans,
u64 ino, char *namebuf, u32 name_len,
u8 filetype, u64 *ref_count);
 void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize);
+void reada_walk_down(struct btrfs_root *root, struct extent_buffer *node,
+int slot);
 
 #endif
diff --git a/check/main.c b/check/main.c
index a8155630df18..aa7098a0be96 100644
--- a/check/main.c
+++ b/check/main.c
@@ -1667,28 +1667,6 @@ out:
return ret;
 }
 
-static void reada_walk_down(struct btrfs_root *root,
-   struct extent_buffer *node, int slot)
-{
-   struct btrfs_fs_info *fs_info = root->fs_info;
-   u64 bytenr;
-   u64 ptr_gen;
-   u32 nritems;
-   int i;
-   int level;
-
-   level = btrfs_header_level(node);
-   if (level != 1)
-   return;
-
-   nritems = btrfs_header_nritems(node);
-   for (i = slot; i < nritems; i++) {
-   bytenr = btrfs_node_blockptr(node, i);
-   ptr_gen = btrfs_node_ptr_generation(node, i);
-   readahead_tree_block(fs_info, bytenr, ptr_gen);
-   }
-}
-
 /*
  * Check the child node/leaf by the following condition:
  * 1. the first item key of the node/leaf should be the same with the one
-- 
2.15.1



[PATCH 11/16] btrfs-progs: check: Move check_dev_size_alignment to check/common.c

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/common.c | 15 +++
 check/common.h |  1 +
 check/main.c   | 16 
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/check/common.c b/check/common.c
index 9051936a61cb..7392ed6b472f 100644
--- a/check/common.c
+++ b/check/common.c
@@ -237,3 +237,18 @@ out:
return ret;
 }
 
+/*
+ * Extra (optional) check for dev_item size to report possbile problem on a new
+ * kernel.
+ */
+void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize)
+{
+   if (!IS_ALIGNED(total_bytes, sectorsize)) {
+   warning(
+"unaligned total_bytes detected for devid %llu, have %llu should be aligned to %u",
+   devid, total_bytes, sectorsize);
+   warning(
+"this is OK for older kernel, but may cause kernel warning for newer kernels");
+   warning("this can be fixed by 'btrfs rescue fix-device-size'");
+   }
+}
diff --git a/check/common.h b/check/common.h
index 9a3488ae365a..72146b444a79 100644
--- a/check/common.h
+++ b/check/common.h
@@ -90,5 +90,6 @@ int link_inode_to_lostfound(struct btrfs_trans_handle *trans,
struct btrfs_path *path,
u64 ino, char *namebuf, u32 name_len,
u8 filetype, u64 *ref_count);
+void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize);
 
 #endif
diff --git a/check/main.c b/check/main.c
index 0da25c460336..a8155630df18 100644
--- a/check/main.c
+++ b/check/main.c
@@ -10708,22 +10708,6 @@ static int check_device_used(struct device_record *dev_rec,
}
 }
 
-/*
- * Extra (optional) check for dev_item size to report possbile problem on a new
- * kernel.
- */
-static void check_dev_size_alignment(u64 devid, u64 total_bytes, u32 sectorsize)
-{
-   if (!IS_ALIGNED(total_bytes, sectorsize)) {
-   warning(
-"unaligned total_bytes detected for devid %llu, have %llu should be aligned to %u",
-   devid, total_bytes, sectorsize);
-   warning(
-"this is OK for older kernel, but may cause kernel warning for newer kernels");
-   warning("this can be fixed by 'btrfs rescue fix-device-size'");
-   }
-}
-
 /*
  * Unlike device size alignment check above, some super total_bytes check
  * failure can lead to mount failure for newer kernel.
-- 
2.15.1
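
For anyone unfamiliar with the alignment test behind
check_dev_size_alignment(), a minimal sketch follows. The demo_* helpers
are written for this note and assume the alignment is a power of two, as
sector sizes are. demo_round_down() only shows what an aligned value looks
like; whether 'btrfs rescue fix-device-size' computes it this way is not
stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero if @x is a multiple of @a; @a must be a power of two. */
static int demo_is_aligned(uint64_t x, uint32_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x & ((uint64_t)a - 1)) == 0;
}

/* Largest multiple of @a that is not greater than @x. */
static uint64_t demo_round_down(uint64_t x, uint32_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~((uint64_t)a - 1);
}
```

With a 4096-byte sector, a device size of 1048576 + 512 bytes fails the
test, and rounding down gives 1048576.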



[PATCH 08/16] btrfs-progs: check: Move count_csum_range function to check/common.c

2018-01-18 Thread Qu Wenruo
Besides moving it to check/common.c, also:

1) Add an extra comment for the function
2) Change the @root parameter to @fs_info
   @root is never used; csum_root is picked from fs_info anyway.

Signed-off-by: Qu Wenruo 
---
 Makefile   |   2 +-
 check/common.c | 101 +
 check/common.h |   3 ++
 check/main.c   |  77 ++-
 4 files changed, 108 insertions(+), 75 deletions(-)
 create mode 100644 check/common.c

diff --git a/Makefile b/Makefile
index c4e2dc5b68a9..a00a982a18df 100644
--- a/Makefile
+++ b/Makefile
@@ -113,7 +113,7 @@ cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
   cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \
   cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o \
-  mkfs/common.o
+  mkfs/common.o check/common.o
 libbtrfs_objects = send-stream.o send-utils.o kernel-lib/rbtree.o btrfs-list.o \
   kernel-lib/crc32c.o messages.o \
   uuid-tree.o utils-lib.o rbtree-utils.o
diff --git a/check/common.c b/check/common.c
new file mode 100644
index ..ed4f2a40bac2
--- /dev/null
+++ b/check/common.c
@@ -0,0 +1,101 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include "ctree.h"
+#include "internal.h"
+#include "check/common.h"
+
+/*
+ * Search in csum tree to find how many bytes of range [@start, @start + @len)
+ * has the corresponding csum item.
+ *
+ * @start: range start
+ * @len:   range length
+ * @found: return value of found csum bytes
+ * unit is BYTE.
+ */
+int count_csum_range(struct btrfs_fs_info *fs_info, u64 start,
+u64 len, u64 *found)
+{
+   struct btrfs_key key;
+   struct btrfs_path path;
+   struct extent_buffer *leaf;
+   int ret;
+   size_t size;
+   *found = 0;
+   u64 csum_end;
+   u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
+
+   btrfs_init_path(&path);
+
+   key.objectid = BTRFS_EXTENT_CSUM_OBJECTID;
+   key.offset = start;
+   key.type = BTRFS_EXTENT_CSUM_KEY;
+
+   ret = btrfs_search_slot(NULL, fs_info->csum_root,
+   &key, &path, 0, 0);
+   if (ret < 0)
+   goto out;
+   if (ret > 0 && path.slots[0] > 0) {
+   leaf = path.nodes[0];
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0] - 1);
+   if (key.objectid == BTRFS_EXTENT_CSUM_OBJECTID &&
+   key.type == BTRFS_EXTENT_CSUM_KEY)
+   path.slots[0]--;
+   }
+
+   while (len > 0) {
+   leaf = path.nodes[0];
+   if (path.slots[0] >= btrfs_header_nritems(leaf)) {
+   ret = btrfs_next_leaf(fs_info->csum_root, &path);
+   if (ret > 0)
+   break;
+   else if (ret < 0)
+   goto out;
+   leaf = path.nodes[0];
+   }
+
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
+   if (key.objectid != BTRFS_EXTENT_CSUM_OBJECTID ||
+   key.type != BTRFS_EXTENT_CSUM_KEY)
+   break;
+
+   btrfs_item_key_to_cpu(leaf, &key, path.slots[0]);
+   if (key.offset >= start + len)
+   break;
+
+   if (key.offset > start)
+   start = key.offset;
+
+   size = btrfs_item_size_nr(leaf, path.slots[0]);
+   csum_end = key.offset + (size / csum_size) *
+  fs_info->sectorsize;
+   if (csum_end > start) {
+   size = min(csum_end - start, len);
+   len -= size;
+   start += size;
+   *found += size;
+   }
+
+   path.slots[0]++;
+   }
+out:
+   btrfs_release_path(&path);
+   if (ret < 0)
+   return ret;
+   return 0;
+}
+
diff --git a/check/common.h b/check/common.h
index 77a0ab54166f..cd64798f4804 100644
--- a/check/common.h
+++ b/check/common.h
@@ -80,4 +80,7 @@ static inline int fs_root_objectid(u64
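
The per-item arithmetic in count_csum_range() can be checked on its own
with the sketch below. It handles a single csum item rather than walking
the tree, and demo_csum_overlap() with its parameter names is hypothetical.
As in the patch, an item holding item_size / csum_size checksums is taken
to cover that many sectors starting at the item's key offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of [start, start + len) covered by one csum item that begins at
 * @item_offset and stores @item_size bytes of checksums.
 */
static uint64_t demo_csum_overlap(uint64_t start, uint64_t len,
				  uint64_t item_offset, uint32_t item_size,
				  uint16_t csum_size, uint32_t sectorsize)
{
	uint64_t end = start + len;
	uint64_t csum_end;

	assert(csum_size != 0);
	/* Item starts at or beyond the end of the range: no overlap. */
	if (item_offset >= end)
		return 0;
	/* Skip the uncovered part of the range in front of the item. */
	if (item_offset > start)
		start = item_offset;
	/* One checksum per sector, so the item covers this many bytes. */
	csum_end = item_offset +
		   (uint64_t)(item_size / csum_size) * sectorsize;
	if (csum_end <= start)
		return 0;
	return (csum_end < end ? csum_end : end) - start;
}
```

For example, with 4-byte checksums and 4096-byte sectors, a 16-byte item at
offset 8192 covers [8192, 24576), so the range [4096, 12288) has 4096
covered bytes.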

[PATCH 01/16] btrfs-progs: Moves cmds-check.c to check/main.c

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 Makefile | 4 ++--
 cmds-check.c => check/main.c | 0
 2 files changed, 2 insertions(+), 2 deletions(-)
 rename cmds-check.c => check/main.c (100%)

diff --git a/Makefile b/Makefile
index 6369e8f4209c..c4e2dc5b68a9 100644
--- a/Makefile
+++ b/Makefile
@@ -109,7 +109,7 @@ objects = ctree.o disk-io.o kernel-lib/radix-tree.o extent-tree.o print-tree.o \
  fsfeatures.o kernel-lib/tables.o kernel-lib/raid56.o transaction.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
-  cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-check.o \
+  cmds-quota.o cmds-qgroup.o cmds-replace.o check/main.o \
   cmds-restore.o cmds-rescue.o chunk-recover.o super-recover.o \
   cmds-property.o cmds-fi-usage.o cmds-inspect-dump-tree.o \
   cmds-inspect-dump-super.o cmds-inspect-tree-stats.o cmds-fi-du.o \
@@ -544,7 +544,7 @@ clean: $(CLEANDIRS)
kernel-shared/*.o kernel-shared/*.o.d \
image/*.o image/*.o.d \
convert/*.o convert/*.o.d \
-   mkfs/*.o mkfs/*.o.d \
+   mkfs/*.o mkfs/*.o.d check/*.o check/*.o.d \
  dir-test ioctl-test quick-test library-test library-test-static \
   mktables btrfs.static mkfs.btrfs.static fssum \
  $(check_defs) \
diff --git a/cmds-check.c b/check/main.c
similarity index 100%
rename from cmds-check.c
rename to check/main.c
-- 
2.15.1

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 00/16] btrfs-progs: Split lowmem mode check to its own

2018-01-18 Thread Qu Wenruo
The long planned cmds-check re-construction is finally here.

As the original cmds-check.c is getting larger and larger (already over
15K lines), it's always a good idea to split it into its own check/
directory.

This patchset does the following work:
1) Move cmds-check.c to check/main.c
2) Put code shared by both original and lowmem mode into
   check/common.[ch]
3) Put lowmem code into check/lowmem.[ch]
   With minor renaming to get rid of unnecessary _v2 suffix.

The modification looks scary, but there is no functional change at all.

And considering how much the file structure changes, it's a good idea to
merge part 1 as quickly as possible, so there will be less pressure to
rebase new incoming fsck-related code.

The real move work happens in the 15th patch, which, due to its size
(500KB+), may not be able to reach the mailing list.
So please fetch the whole patchset from github:
https://github.com/adam900710/btrfs-progs/tree/split_check

There will be a part 2, mostly moving original mode to its own
check/original.[ch], along with extra comment explaining how the two
different modes work.

Qu Wenruo (16):
  btrfs-progs: Moves cmds-check.c to check/main.c
  btrfs-progs: check: Move original mode definitions to check/original.h
  btrfs-progs: check: Move definitions of lowmem mode to check/lowmem.h
  btrfs-progs: check: Move node_ptr structure to check/common.h
  btrfs-progs: check: Export check global variables to check/common.h
  btrfs-progs: check: Move imode_to_type function to check/common.h
  btrfs-progs: check: Move fs_root_objectid function to check/common.h
  btrfs-progs: check: Move count_csum_range function to check/common.c
  btrfs-progs: check: Move __create_inode_item function to
check/common.c
  btrfs-progs: check: Move link_inode_to_lostfound function to common.c
  btrfs-progs: check: Move check_dev_size_alignment to check/common.c
  btrfs-progs: check: move reada_walk_down to check/common.c
  btrfs-progs: check: Move check_child_node to check/common.c
  btrfs-progs: check: Move reset_cached_block_groups to check/common.c
  btrfs-progs: check: Move lowmem check code to its own
check/lowmem.[ch]
  btrfs-progs: check/lowmem: Cleanup unnecessary _v2 suffix

 Makefile | 6 +-
 check/common.c   |   351 +
 check/common.h   |   100 +
 check/lowmem.c   |  4571 
 check/lowmem.h   |67 +
 cmds-check.c => check/main.c | 16389 ++---
 check/original.h |   293 +
 7 files changed, 11007 insertions(+), 10770 deletions(-)
 create mode 100644 check/common.c
 create mode 100644 check/common.h
 create mode 100644 check/lowmem.c
 create mode 100644 check/lowmem.h
 rename cmds-check.c => check/main.c (65%)
 create mode 100644 check/original.h

-- 
2.15.1



[PATCH 03/16] btrfs-progs: check: Move definitions of lowmem mode to check/lowmem.h

2018-01-18 Thread Qu Wenruo
Unlike original mode, lowmem mode mostly uses normal tree operations, so
it needs no structure definitions, only a set of error bits.

Signed-off-by: Qu Wenruo 
---
 check/lowmem.h | 62 ++
 check/main.c   | 39 +---
 2 files changed, 63 insertions(+), 38 deletions(-)
 create mode 100644 check/lowmem.h

diff --git a/check/lowmem.h b/check/lowmem.h
new file mode 100644
index ..e6ca7634022c
--- /dev/null
+++ b/check/lowmem.h
@@ -0,0 +1,62 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+/*
+ * Defines and function declarations for lowmem mode check.
+ */
+#ifndef __BTRFS_CHECK_LOWMEM_H__
+#define __BTRFS_CHECK_LOWMEM_H__
+
+#define ROOT_DIR_ERROR (1<<1)  /* bad ROOT_DIR */
+#define DIR_ITEM_MISSING   (1<<2)  /* DIR_ITEM not found */
+#define DIR_ITEM_MISMATCH  (1<<3)  /* DIR_ITEM found but not match */
+#define INODE_REF_MISSING  (1<<4)  /* INODE_REF/INODE_EXTREF not found */
+#define INODE_ITEM_MISSING (1<<5)  /* INODE_ITEM not found */
+#define INODE_ITEM_MISMATCH (1<<6)  /* INODE_ITEM found but not match */
+#define FILE_EXTENT_ERROR  (1<<7)  /* bad FILE_EXTENT */
+#define ODD_CSUM_ITEM  (1<<8)  /* CSUM_ITEM error */
+#define CSUM_ITEM_MISSING  (1<<9)  /* CSUM_ITEM not found */
+#define LINK_COUNT_ERROR   (1<<10) /* INODE_ITEM nlink count error */
+#define NBYTES_ERROR   (1<<11) /* INODE_ITEM nbytes count error */
+#define ISIZE_ERROR (1<<12) /* INODE_ITEM size count error */
+#define ORPHAN_ITEM (1<<13) /* INODE_ITEM no reference */
+#define NO_INODE_ITEM  (1<<14) /* no inode_item */
+#define LAST_ITEM  (1<<15) /* Complete this tree traversal */
+#define ROOT_REF_MISSING   (1<<16) /* ROOT_REF not found */
+#define ROOT_REF_MISMATCH  (1<<17) /* ROOT_REF found but not match */
+#define DIR_INDEX_MISSING   (1<<18) /* INODE_INDEX not found */
+#define DIR_INDEX_MISMATCH  (1<<19) /* INODE_INDEX found but not match */
+#define DIR_COUNT_AGAIN (1<<20) /* DIR isize should be recalculated */
+#define BG_ACCOUNTING_ERROR (1<<21) /* Block group accounting error */
+
+/*
+ * Error bit for low memory mode check.
+ *
+ * Currently no caller cares about it yet.  Just internal use for error
+ * classification.
+ */
+#define BACKREF_MISSING (1 << 0) /* Backref missing in extent tree */
+#define BACKREF_MISMATCH   (1 << 1) /* Backref exists but does not match */
+#define BYTES_UNALIGNED (1 << 2) /* Some bytes are not aligned */
+#define REFERENCER_MISSING (1 << 3) /* Referencer not found */
+#define REFERENCER_MISMATCH (1 << 4) /* Referenceer found but does not match */
+#define CROSSING_STRIPE_BOUNDARY (1 << 4) /* For kernel scrub workaround */
+#define ITEM_SIZE_MISMATCH (1 << 5) /* Bad item size */
+#define UNKNOWN_TYPE   (1 << 6) /* Unknown type */
+#define ACCOUNTING_MISMATCH (1 << 7) /* Used space accounting error */
+#define CHUNK_TYPE_MISMATCH (1 << 8)
+
+#endif
diff --git a/check/main.c b/check/main.c
index c91a949ff7cc..dbd2b755c48f 100644
--- a/check/main.c
+++ b/check/main.c
@@ -44,6 +44,7 @@
 #include "hash.h"
 #include "help.h"
 #include "check/original.h"
+#include "check/lowmem.h"
 
 enum task_position {
TASK_EXTENTS,
@@ -85,28 +86,6 @@ enum btrfs_check_mode {
 
 static enum btrfs_check_mode check_mode = CHECK_MODE_DEFAULT;
 
-#define ROOT_DIR_ERROR (1<<1)  /* bad ROOT_DIR */
-#define DIR_ITEM_MISSING   (1<<2)  /* DIR_ITEM not found */
-#define DIR_ITEM_MISMATCH  (1<<3)  /* DIR_ITEM found but not match */
-#define INODE_REF_MISSING  (1<<4)  /* INODE_REF/INODE_EXTREF not found */
-#define INODE_ITEM_MISSING (1<<5)  /* INODE_ITEM not found */
-#define INODE_ITEM_MISMATCH (1<<6)  /* INODE_ITEM found but not match */
-#define FILE_EXTENT_ERROR  (1<<7)  /* bad FILE_EXTENT */
-#define ODD_CSUM_ITEM  (1<<8)  /* CSUM_ITEM error */
-#define CSUM_ITEM_MISSING  (1<<9)  /* CSUM_ITEM not found */
-#define LINK_COUNT_ERROR   (1<<10) /* INODE_ITEM nlink count error */
-#define NBYTES_ERROR   (1<<11) /* INODE_ITEM nbytes count error */
-#define ISIZE_ERROR (1<<12) /* INODE_ITEM size count error */
-#define ORPHAN_ITEM(1<<13) /* INODE_ITEM no refer
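
For readers unfamiliar with the error-bit style used by lowmem mode, here
is a minimal sketch of how such bits are typically accumulated and
tested. Three of the defines are copied from the patch; the checker
function and its parameters are hypothetical and only illustrate the
pattern, they are not lowmem code.

```c
#include <assert.h>

/* Three of the bits from the patch, reproduced for the sketch. */
#define DIR_ITEM_MISSING	(1<<2)
#define INODE_REF_MISSING	(1<<4)
#define NBYTES_ERROR		(1<<11)

/* Hypothetical checker: ORs together one bit per problem it finds. */
static int check_example_inode(int has_dir_item, int has_inode_ref,
			       int nbytes_ok)
{
	int err = 0;

	if (!has_dir_item)
		err |= DIR_ITEM_MISSING;
	if (!has_inode_ref)
		err |= INODE_REF_MISSING;
	if (!nbytes_ok)
		err |= NBYTES_ERROR;
	return err;
}
```

A caller can then test individual problems with "err & NBYTES_ERROR" and
treat a return value of 0 as "no error found".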

[PATCH 05/16] btrfs-progs: check: Export check global variables to check/common.h

2018-01-18 Thread Qu Wenruo
There are a dozen or so variables which are used as "check global"
variables, like @total_csum_bytes or @no_holes.

These variables are used freely across the check code, however since
we're splitting check code, they need to be exported so they can be used
in other files.

This patch just exports them and adds declarations for them in
check/common.h.

Signed-off-by: Qu Wenruo 
---
 check/common.h | 17 +
 check/main.c   | 32 
 2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/check/common.h b/check/common.h
index 25874aec597b..8d93ddbf4afb 100644
--- a/check/common.h
+++ b/check/common.h
@@ -36,4 +36,21 @@ struct node_refs {
int full_backref[BTRFS_MAX_LEVEL];
 };
 
+extern u64 bytes_used;
+extern u64 total_csum_bytes;
+extern u64 total_btree_bytes;
+extern u64 total_fs_tree_bytes;
+extern u64 total_extent_tree_bytes;
+extern u64 btree_space_waste;
+extern u64 data_bytes_allocated;
+extern u64 data_bytes_referenced;
+extern struct list_head duplicate_extents;
+extern struct list_head delete_items;
+extern int no_holes;
+extern int init_extent_tree;
+extern int check_data_csum;
+extern struct btrfs_fs_info *global_info;
+extern struct task_ctx ctx;
+extern struct cache_tree *roots_info_cache;
+
 #endif
diff --git a/check/main.c b/check/main.c
index fbd73c42bee8..bb927ecc87ee 100644
--- a/check/main.c
+++ b/check/main.c
@@ -61,22 +61,22 @@ struct task_ctx {
struct task_info *info;
 };
 
-static u64 bytes_used = 0;
-static u64 total_csum_bytes = 0;
-static u64 total_btree_bytes = 0;
-static u64 total_fs_tree_bytes = 0;
-static u64 total_extent_tree_bytes = 0;
-static u64 btree_space_waste = 0;
-static u64 data_bytes_allocated = 0;
-static u64 data_bytes_referenced = 0;
-static LIST_HEAD(duplicate_extents);
-static LIST_HEAD(delete_items);
-static int no_holes = 0;
-static int init_extent_tree = 0;
-static int check_data_csum = 0;
-static struct btrfs_fs_info *global_info;
-static struct task_ctx ctx = { 0 };
-static struct cache_tree *roots_info_cache = NULL;
+u64 bytes_used = 0;
+u64 total_csum_bytes = 0;
+u64 total_btree_bytes = 0;
+u64 total_fs_tree_bytes = 0;
+u64 total_extent_tree_bytes = 0;
+u64 btree_space_waste = 0;
+u64 data_bytes_allocated = 0;
+u64 data_bytes_referenced = 0;
+LIST_HEAD(duplicate_extents);
+LIST_HEAD(delete_items);
+int no_holes = 0;
+int init_extent_tree = 0;
+int check_data_csum = 0;
+struct btrfs_fs_info *global_info;
+struct task_ctx ctx = { 0 };
+struct cache_tree *roots_info_cache = NULL;
 
 enum btrfs_check_mode {
CHECK_MODE_ORIGINAL,
-- 
2.15.1
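
The static-to-extern change in this patch follows the usual C pattern
for sharing a global across files. A minimal single-file sketch of that
pattern, using two of the variable names from the patch; the
account_csum() helper is hypothetical and only shows a second user of
the variable:

```c
#include <assert.h>
#include <stdint.h>

/* What the shared header carries: declarations only, no storage. */
extern uint64_t total_csum_bytes;
extern int no_holes;

/* What exactly one .c file carries: the definitions (no "static"). */
uint64_t total_csum_bytes = 0;
int no_holes = 0;

/* Hypothetical helper that could live in any file including the header. */
static void account_csum(uint64_t bytes)
{
	total_csum_bytes += bytes;
}
```

With "static" kept on the definition, the extern declaration in another
file would fail to link, which is why the patch drops it.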



[PATCH 07/16] btrfs-progs: check: Move fs_root_objectid function to check/common.h

2018-01-18 Thread Qu Wenruo
Just another small wrapper shared between original and lowmem mode.

Signed-off-by: Qu Wenruo 
---
 check/common.h |  8 
 check/main.c   | 10 --
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/check/common.h b/check/common.h
index 3e0a5ebee54b..77a0ab54166f 100644
--- a/check/common.h
+++ b/check/common.h
@@ -72,4 +72,12 @@ static inline u8 imode_to_type(u32 imode)
 #undef S_SHIFT
 }
 
+static inline int fs_root_objectid(u64 objectid)
+{
+   if (objectid == BTRFS_TREE_RELOC_OBJECTID ||
+   objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
+   return 1;
+   return is_fstree(objectid);
+}
+
 #endif
diff --git a/check/main.c b/check/main.c
index eaa8e7fbde20..9ecbac8f19c3 100644
--- a/check/main.c
+++ b/check/main.c
@@ -2167,8 +2167,6 @@ out:
return err;
 }
 
-static int fs_root_objectid(u64 objectid);
-
 /*
  * Update global fs information.
  */
@@ -4250,14 +4248,6 @@ skip_walking:
return ret;
 }
 
-static int fs_root_objectid(u64 objectid)
-{
-   if (objectid == BTRFS_TREE_RELOC_OBJECTID ||
-   objectid == BTRFS_DATA_RELOC_TREE_OBJECTID)
-   return 1;
-   return is_fstree(objectid);
-}
-
 static int check_fs_roots(struct btrfs_fs_info *fs_info,
  struct cache_tree *root_cache)
 {
-- 
2.15.1
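
To show what the wrapper accepts, here is a stand-alone sketch. The
EX_-prefixed constants and ex_is_fstree() are stand-ins written for this
example; the numeric values are assumptions and the authoritative ones
are in ctree.h.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values, assumed for this sketch; see ctree.h for the real ones. */
#define EX_FS_TREE_OBJECTID		5ULL
#define EX_FIRST_FREE_OBJECTID		256ULL
#define EX_TREE_RELOC_OBJECTID		((uint64_t)-8)
#define EX_DATA_RELOC_TREE_OBJECTID	((uint64_t)-9)

/* Stand-in for is_fstree(): the default fs tree or any subvolume id. */
static inline int ex_is_fstree(uint64_t rootid)
{
	return rootid == EX_FS_TREE_OBJECTID ||
	       (int64_t)rootid >= (int64_t)EX_FIRST_FREE_OBJECTID;
}

/* Same shape as the wrapper in the patch, built on the stand-ins above. */
static inline int ex_fs_root_objectid(uint64_t objectid)
{
	if (objectid == EX_TREE_RELOC_OBJECTID ||
	    objectid == EX_DATA_RELOC_TREE_OBJECTID)
		return 1;
	return ex_is_fstree(objectid);
}
```

Under these assumptions, the two relocation trees are accepted in
addition to everything is_fstree() already accepts, while other reserved
ids are rejected.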



[PATCH 06/16] btrfs-progs: check: Move imode_to_type function to check/common.h

2018-01-18 Thread Qu Wenruo
This function is shared between original and lowmem mode, and it's small
enough, so move it to check/common.h.

Signed-off-by: Qu Wenruo 
---
 check/common.h | 19 +++
 check/main.c   | 17 -
 2 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/check/common.h b/check/common.h
index 8d93ddbf4afb..3e0a5ebee54b 100644
--- a/check/common.h
+++ b/check/common.h
@@ -20,6 +20,8 @@
  */
 #ifndef __BTRFS_CHECK_COMMON_H__
 #define __BTRFS_CHECK_COMMON_H__
+
+#include 
 #include "ctree.h"
 
 /*
@@ -53,4 +55,21 @@ extern struct btrfs_fs_info *global_info;
 extern struct task_ctx ctx;
 extern struct cache_tree *roots_info_cache;
 
+static inline u8 imode_to_type(u32 imode)
+{
+#define S_SHIFT 12
+   static unsigned char btrfs_type_by_mode[S_IFMT >> S_SHIFT] = {
+   [S_IFREG >> S_SHIFT]= BTRFS_FT_REG_FILE,
+   [S_IFDIR >> S_SHIFT]= BTRFS_FT_DIR,
+   [S_IFCHR >> S_SHIFT]= BTRFS_FT_CHRDEV,
+   [S_IFBLK >> S_SHIFT]= BTRFS_FT_BLKDEV,
+   [S_IFIFO >> S_SHIFT]= BTRFS_FT_FIFO,
+   [S_IFSOCK >> S_SHIFT]   = BTRFS_FT_SOCK,
+   [S_IFLNK >> S_SHIFT]= BTRFS_FT_SYMLINK,
+   };
+
+   return btrfs_type_by_mode[(imode & S_IFMT) >> S_SHIFT];
+#undef S_SHIFT
+}
+
 #endif
diff --git a/check/main.c b/check/main.c
index bb927ecc87ee..eaa8e7fbde20 100644
--- a/check/main.c
+++ b/check/main.c
@@ -425,23 +425,6 @@ static void record_root_in_trans(struct btrfs_trans_handle *trans,
}
 }
 
-static u8 imode_to_type(u32 imode)
-{
-#define S_SHIFT 12
-   static unsigned char btrfs_type_by_mode[S_IFMT >> S_SHIFT] = {
-   [S_IFREG >> S_SHIFT]= BTRFS_FT_REG_FILE,
-   [S_IFDIR >> S_SHIFT]= BTRFS_FT_DIR,
-   [S_IFCHR >> S_SHIFT]= BTRFS_FT_CHRDEV,
-   [S_IFBLK >> S_SHIFT]= BTRFS_FT_BLKDEV,
-   [S_IFIFO >> S_SHIFT]= BTRFS_FT_FIFO,
-   [S_IFSOCK >> S_SHIFT]   = BTRFS_FT_SOCK,
-   [S_IFLNK >> S_SHIFT]= BTRFS_FT_SYMLINK,
-   };
-
-   return btrfs_type_by_mode[(imode & S_IFMT) >> S_SHIFT];
-#undef S_SHIFT
-}
-
 static int device_record_compare(struct rb_node *node1, struct rb_node *node2)
 {
struct device_record *rec1;
-- 
2.15.1
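
The lookup-table technique in imode_to_type() can be tried outside the
tree with the sketch below. The EX_FT_* codes are hypothetical stand-ins
for the BTRFS_FT_* values, and only three types are mapped. Unlike the
patch, the sketch sizes the table one slot larger so that a mode with
all S_IFMT bits set still indexes inside the array; that is a choice of
this example, not something the patch does.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE	/* make S_IFMT and friends visible in strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical type codes for this sketch; 0 means "unknown". */
enum ex_ftype { EX_FT_UNKNOWN, EX_FT_REG, EX_FT_DIR, EX_FT_SYMLINK };

#define EX_S_SHIFT 12

static inline uint8_t ex_imode_to_type(uint32_t imode)
{
	/* Designated initializers; unlisted slots default to EX_FT_UNKNOWN. */
	static const uint8_t type_by_mode[(S_IFMT >> EX_S_SHIFT) + 1] = {
		[S_IFREG >> EX_S_SHIFT] = EX_FT_REG,
		[S_IFDIR >> EX_S_SHIFT] = EX_FT_DIR,
		[S_IFLNK >> EX_S_SHIFT] = EX_FT_SYMLINK,
	};

	/* Mask out the permission bits, then index by the type nibble. */
	return type_by_mode[(imode & S_IFMT) >> EX_S_SHIFT];
}
```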



[PATCH 04/16] btrfs-progs: check: Move node_ptr structure to check/common.h

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/common.h | 39 +++
 check/main.c   | 11 +--
 2 files changed, 40 insertions(+), 10 deletions(-)
 create mode 100644 check/common.h

diff --git a/check/common.h b/check/common.h
new file mode 100644
index ..25874aec597b
--- /dev/null
+++ b/check/common.h
@@ -0,0 +1,39 @@
+/*
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+/*
+ * Defines and function declarations for code shared by both lowmem and
+ * original mode
+ */
+#ifndef __BTRFS_CHECK_COMMON_H__
+#define __BTRFS_CHECK_COMMON_H__
+#include "ctree.h"
+
+/*
+ * Use for tree walk to walk through trees whose leaves/nodes can be shared
+ * between different trees. (Namely subvolume/fs trees)
+ */
+struct node_refs {
+   u64 bytenr[BTRFS_MAX_LEVEL];
+   u64 refs[BTRFS_MAX_LEVEL];
+   int need_check[BTRFS_MAX_LEVEL];
+   /* field for checking all trees */
+   int checked[BTRFS_MAX_LEVEL];
+   /* the corresponding extent should be marked as full backref or not */
+   int full_backref[BTRFS_MAX_LEVEL];
+};
+
+#endif
diff --git a/check/main.c b/check/main.c
index dbd2b755c48f..fbd73c42bee8 100644
--- a/check/main.c
+++ b/check/main.c
@@ -45,6 +45,7 @@
 #include "help.h"
 #include "check/original.h"
 #include "check/lowmem.h"
+#include "check/common.h"
 
 enum task_position {
TASK_EXTENTS,
@@ -1667,16 +1668,6 @@ static int process_one_leaf(struct btrfs_root *root, struct extent_buffer *eb,
return ret;
 }
 
-struct node_refs {
-   u64 bytenr[BTRFS_MAX_LEVEL];
-   u64 refs[BTRFS_MAX_LEVEL];
-   int need_check[BTRFS_MAX_LEVEL];
-   /* field for checking all trees */
-   int checked[BTRFS_MAX_LEVEL];
-   /* the corresponding extent should be marked as full backref or not */
-   int full_backref[BTRFS_MAX_LEVEL];
-};
-
 static int update_nodes_refs(struct btrfs_root *root, u64 bytenr,
 struct extent_buffer *eb, struct node_refs *nrefs,
 u64 level, int check_all);
-- 
2.15.1



[PATCH 02/16] btrfs-progs: check: Move original mode definitions to check/original.h

2018-01-18 Thread Qu Wenruo
Signed-off-by: Qu Wenruo 
---
 check/main.c | 269 +-
 check/original.h | 293 +++
 2 files changed, 294 insertions(+), 268 deletions(-)
 create mode 100644 check/original.h

diff --git a/check/main.c b/check/main.c
index 7fc30da83ea1..c91a949ff7cc 100644
--- a/check/main.c
+++ b/check/main.c
@@ -43,6 +43,7 @@
 #include "kernel-shared/ulist.h"
 #include "hash.h"
 #include "help.h"
+#include "check/original.h"
 
 enum task_position {
TASK_EXTENTS,
@@ -84,35 +85,6 @@ enum btrfs_check_mode {
 
 static enum btrfs_check_mode check_mode = CHECK_MODE_DEFAULT;
 
-struct extent_backref {
-   struct rb_node node;
-   unsigned int is_data:1;
-   unsigned int found_extent_tree:1;
-   unsigned int full_backref:1;
-   unsigned int found_ref:1;
-   unsigned int broken:1;
-};
-
-static inline struct extent_backref* rb_node_to_extent_backref(struct rb_node *node)
-{
-   return rb_entry(node, struct extent_backref, node);
-}
-
-struct data_backref {
-   struct extent_backref node;
-   union {
-   u64 parent;
-   u64 root;
-   };
-   u64 owner;
-   u64 offset;
-   u64 disk_bytenr;
-   u64 bytes;
-   u64 ram_bytes;
-   u32 num_refs;
-   u32 found_ref;
-};
-
 #define ROOT_DIR_ERROR (1<<1)  /* bad ROOT_DIR */
 #define DIR_ITEM_MISSING   (1<<2)  /* DIR_ITEM not found */
 #define DIR_ITEM_MISMATCH  (1<<3)  /* DIR_ITEM found but not match */
@@ -135,11 +107,6 @@ struct data_backref {
 #define DIR_COUNT_AGAIN (1<<20) /* DIR isize should be recalculated */
 #define BG_ACCOUNTING_ERROR (1<<21) /* Block group accounting error */
 
-static inline struct data_backref* to_data_backref(struct extent_backref *back)
-{
-   return container_of(back, struct data_backref, node);
-}
-
 static int compare_data_backref(struct rb_node *node1, struct rb_node *node2)
 {
struct extent_backref *ext1 = rb_node_to_extent_backref(node1);
@@ -185,34 +152,6 @@ static int compare_data_backref(struct rb_node *node1, struct rb_node *node2)
return 0;
 }
 
-/*
- * Much like data_backref, just removed the undetermined members
- * and change it to use list_head.
- * During extent scan, it is stored in root->orphan_data_extent.
- * During fs tree scan, it is then moved to inode_rec->orphan_data_extents.
- */
-struct orphan_data_extent {
-   struct list_head list;
-   u64 root;
-   u64 objectid;
-   u64 offset;
-   u64 disk_bytenr;
-   u64 disk_len;
-};
-
-struct tree_backref {
-   struct extent_backref node;
-   union {
-   u64 parent;
-   u64 root;
-   };
-};
-
-static inline struct tree_backref* to_tree_backref(struct extent_backref *back)
-{
-   return container_of(back, struct tree_backref, node);
-}
-
 static int compare_tree_backref(struct rb_node *node1, struct rb_node *node2)
 {
struct extent_backref *ext1 = rb_node_to_extent_backref(node1);
@@ -254,212 +193,6 @@ static int compare_extent_backref(struct rb_node *node1, struct rb_node *node2)
return compare_tree_backref(node1, node2);
 }
 
-/* Explicit initialization for extent_record::flag_block_full_backref */
-enum { FLAG_UNSET = 2 };
-
-struct extent_record {
-   struct list_head backrefs;
-   struct list_head dups;
-   struct rb_root backref_tree;
-   struct list_head list;
-   struct cache_extent cache;
-   struct btrfs_disk_key parent_key;
-   u64 start;
-   u64 max_size;
-   u64 nr;
-   u64 refs;
-   u64 extent_item_refs;
-   u64 generation;
-   u64 parent_generation;
-   u64 info_objectid;
-   u32 num_duplicates;
-   u8 info_level;
-   unsigned int flag_block_full_backref:2;
-   unsigned int found_rec:1;
-   unsigned int content_checked:1;
-   unsigned int owner_ref_checked:1;
-   unsigned int is_root:1;
-   unsigned int metadata:1;
-   unsigned int bad_full_backref:1;
-   unsigned int crossing_stripes:1;
-   unsigned int wrong_chunk_type:1;
-};
-
-static inline struct extent_record* to_extent_record(struct list_head *entry)
-{
-   return container_of(entry, struct extent_record, list);
-}
-
-struct inode_backref {
-   struct list_head list;
-   unsigned int found_dir_item:1;
-   unsigned int found_dir_index:1;
-   unsigned int found_inode_ref:1;
-   u8 filetype;
-   u8 ref_type;
-   int errors;
-   u64 dir;
-   u64 index;
-   u16 namelen;
-   char name[0];
-};
-
-static inline struct inode_backref* to_inode_backref(struct list_head *entry)
-{
-   return list_entry(entry, struct inode_backref, list);
-}
-
-struct root_item_record {
-   struct list_head list;
-   u64 objectid;
-   u64 bytenr;
-   u64 last_snapshot;
-   u8 level;
-   u8 drop_level;
-   struct btrfs_key drop_key;
-};
-
-#de

Re: [PATCH] btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it

2018-01-18 Thread Qu Wenruo


On 2018-01-18 20:47, Nikolay Borisov wrote:
> No functional changes, just makes the code more readable
> 
> Signed-off-by: Nikolay Borisov 

Reviewed-by: Qu Wenruo 

Thanks,
Qu

> ---
>  fs/btrfs/inode.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 269d129ffb1f..e9690e2aba09 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4775,8 +4775,8 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
>   u64 block_start;
>   u64 block_end;
>  
> - if ((offset & (blocksize - 1)) == 0 &&
> - (!len || ((len & (blocksize - 1)) == 0)))
> + if (IS_ALIGNED(offset, blocksize) &&
> + (!len || IS_ALIGNED(len, blocksize)))
>   goto out;
>  
>   block_start = round_down(from, blocksize);
> 
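
For anyone wanting to convince themselves that the two forms agree, here
is a small userspace sketch. EX_IS_ALIGNED is a stand-in written for
this example, not the kernel macro itself, and like the open-coded mask
it only makes sense when the alignment is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel macro; valid when a is a power of two. */
#define EX_IS_ALIGNED(x, a)	(((x) & ((uint64_t)(a) - 1)) == 0)

/* The open-coded test removed by the patch. */
static int aligned_opencoded(uint64_t offset, uint64_t len, uint32_t blocksize)
{
	return (offset & (blocksize - 1)) == 0 &&
	       (!len || ((len & (blocksize - 1)) == 0));
}

/* The same test expressed through the macro. */
static int aligned_macro(uint64_t offset, uint64_t len, uint32_t blocksize)
{
	return EX_IS_ALIGNED(offset, blocksize) &&
	       (!len || EX_IS_ALIGNED(len, blocksize));
}
```

Both helpers return the same result for any offset and len as long as
blocksize is a power of two, which is the "no functional changes" claim
in miniature.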





Re: [PATCH v5 01/19] fs: new API for handling inode->i_version

2018-01-18 Thread Jeff Layton
On Thu, 2018-01-18 at 16:38 -0500, J. Bruce Fields wrote:
> On Tue, Jan 09, 2018 at 09:10:41AM -0500, Jeff Layton wrote:
> > --- /dev/null
> > +++ b/include/linux/iversion.h
> > @@ -0,0 +1,236 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _LINUX_IVERSION_H
> > +#define _LINUX_IVERSION_H
> > +
> > +#include 
> > +
> > +/*
> > + * The change attribute (i_version) is mandated by NFSv4 and is mostly for
> > + * knfsd, but is also used for other purposes (e.g. IMA). The i_version must
> > + * appear different to observers if there was a change to the inode's data or
> > + * metadata since it was last queried.
> > + *
> > + * Observers see the i_version as a 64-bit number that never changes.
> 
> I don't understand that sentence.
> 

That's because it's utter nonsense. I noticed that the other day and
fixed it in my tree. It now reads:

* Observers see the i_version as a 64-bit number that never decreases.

> > If it
> > + * remains the same since it was last checked, then nothing has changed in the
> > + * inode. If it's different then something has changed. Observers cannot infer
> > + * anything about the nature or magnitude of the changes from the value, only
> > + * that the inode has changed in some fashion.
> 
> As we've discussed before, there may be brief windows where the first
> two statements aren't quite correct.  I think that would be worth a
> mention if we can keep it concise.  Maybe add something like this?:
> 
>   It may be impractical for filesystems to keep i_version updates
>   atomic with respect to the changes that cause them.  They
>   should, however, guarantee that i_version updates are never
>   visible before the changes that caused them.  Also, i_version
>   updates should never be delayed longer than it takes the
>   original change to reach disk.

That makes sense. I added it in pretty much verbatim. I think we mostly
follow the latter "should" already.

> Or maybe those details are best left to documentation on the relevant
> parts of the api below (maybe inode_maybe_inc_iversion?).
> 
> I dunno if it's also worth mentioning that nfsd doesn't actually use the
> raw i_version--it mixes it with ctime to prevent i_version reuse after
> reboot.  Presumably that doesn't matter to IMA since it doesn't compare
> i_version across reboots.
> 

I think I won't document that here. nfsd is a consumer of i_version.
What it does with it is sort of its own business. Might be good to have
a comment blurb in the nfsd code about it though.

> The documentation here is all very helpful, thanks.

Thanks for all of the suggestions so far!
-- 
Jeff Layton 
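
The observer contract discussed above (compare the value with the one
seen earlier, infer nothing else) can be sketched as follows. The ex_
types and helpers are hypothetical stand-ins for illustration, not the
kernel's struct inode or the iversion.h API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's struct inode or its helpers. */
struct ex_inode {
	uint64_t i_version;
};

struct ex_observer {
	uint64_t last_seen;
};

/* Record the current value so a later call can compare against it. */
static void ex_observe(struct ex_observer *obs, const struct ex_inode *inode)
{
	obs->last_seen = inode->i_version;
}

/* Nonzero if the inode changed since ex_observe(); says nothing about how. */
static int ex_changed(const struct ex_observer *obs,
		      const struct ex_inode *inode)
{
	return inode->i_version != obs->last_seen;
}
```

The observer only ever tests for inequality, so the size of the jump
between two values carries no meaning.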


Re: [PATCH v5 02/19] fs: don't take the i_lock in inode_inc_iversion

2018-01-18 Thread J. Bruce Fields
On Tue, Jan 09, 2018 at 09:10:42AM -0500, Jeff Layton wrote:
> From: Jeff Layton 
> 
> The rationale for taking the i_lock when incrementing this value is
> lost in antiquity. The readers of the field don't take it (at least
> not universally), so my assumption is that it was only done here to
> serialize incrementors.
> 
> If that is indeed the case, then we can drop the i_lock from this
> codepath and treat it as a atomic64_t for the purposes of
> incrementing it. This allows us to use inode_inc_iversion without
> any danger of lock inversion.
> 
> Note that the read side is not fetched atomically with this change.
> The assumption here is that that is not a critical issue since the
> i_version is not fully synchronized with anything else anyway.

So I guess it's theoretically possible that e.g. if you read while it's
incrementing from 2^32-1 to 2^32 you could read 0, 1, or 2^32+1?

If so then you could see an i_version value reused and incorrectly
decide that a file hadn't changed.

But it's such a tiny case, and I think you convert this to atomic64_t
later anyway, so, whatever.

--b.

> 
> Signed-off-by: Jeff Layton 
> ---
>  include/linux/iversion.h | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/iversion.h b/include/linux/iversion.h
> index d09cc3a08740..5ad9eaa3a9b0 100644
> --- a/include/linux/iversion.h
> +++ b/include/linux/iversion.h
> @@ -104,12 +104,13 @@ inode_set_iversion_queried(struct inode *inode, u64 new)
>  static inline bool
>  inode_maybe_inc_iversion(struct inode *inode, bool force)
>  {
> - spin_lock(&inode->i_lock);
> - inode->i_version++;
> - spin_unlock(&inode->i_lock);
> + atomic64_t *ivp = (atomic64_t *)&inode->i_version;
> +
> + atomic64_inc(ivp);
>   return true;
>  }
>  
> +
>  /**
>   * inode_inc_iversion - forcibly increment i_version
>   * @inode: inode that needs to be updated
> -- 
> 2.14.3
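
As a userspace analogy for the lock-free increment in the quoted patch,
the sketch below uses C11 atomics in place of atomic64_inc(). The struct
and helper names are hypothetical, and the relaxed memory order is a
choice of this example. Reading through an atomic load, as done here,
is one way to sidestep the torn 32-bit read discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode field; not the kernel's type. */
struct ex_inode {
	_Atomic uint64_t i_version;
};

/* Lock-free increment: concurrent callers never lose an update. */
static void ex_inc_iversion(struct ex_inode *inode)
{
	atomic_fetch_add_explicit(&inode->i_version, 1, memory_order_relaxed);
}

/* Atomic 64-bit load: both halves always come from the same value. */
static uint64_t ex_read_iversion(struct ex_inode *inode)
{
	return atomic_load_explicit(&inode->i_version, memory_order_relaxed);
}
```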


Re: [PATCH v5 01/19] fs: new API for handling inode->i_version

2018-01-18 Thread J. Bruce Fields
On Tue, Jan 09, 2018 at 09:10:41AM -0500, Jeff Layton wrote:
> --- /dev/null
> +++ b/include/linux/iversion.h
> @@ -0,0 +1,236 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_IVERSION_H
> +#define _LINUX_IVERSION_H
> +
> +#include 
> +
> +/*
> + * The change attribute (i_version) is mandated by NFSv4 and is mostly for
> + * knfsd, but is also used for other purposes (e.g. IMA). The i_version must
> + * appear different to observers if there was a change to the inode's data or
> + * metadata since it was last queried.
> + *
> + * Observers see the i_version as a 64-bit number that never changes.

I don't understand that sentence.

> If it
> + * remains the same since it was last checked, then nothing has changed in the
> + * inode. If it's different then something has changed. Observers cannot infer
> + * anything about the nature or magnitude of the changes from the value, only
> + * that the inode has changed in some fashion.

As we've discussed before, there may be brief windows where the first
two statements aren't quite correct.  I think that would be worth a
mention if we can keep it concise.  Maybe add something like this?:

It may be impractical for filesystems to keep i_version updates
atomic with respect to the changes that cause them.  They
should, however, guarantee that i_version updates are never
visible before the changes that caused them.  Also, i_version
updates should never be delayed longer than it takes the
original change to reach disk.

Or maybe those details are best left to documentation on the relevant
parts of the api below (maybe inode_maybe_inc_iversion?).

I dunno if it's also worth mentioning that nfsd doesn't actually use the
raw i_version--it mixes it with ctime to prevent i_version reuse after
reboot.  Presumably that doesn't matter to IMA since it doesn't compare
i_version across reboots.

The documentation here is all very helpful, thanks.

--b.


Re: [PATCH RESEND v4 0/4] device_list_add() peparation to add reappearing missing device

2018-01-18 Thread David Sterba
On Thu, Jan 18, 2018 at 10:02:32PM +0800, Anand Jain wrote:
> (Apply on top of my patchset
>[PATCH v4 0/6] preparatory work to add device forget
>  for conflict free apply. They don't actually depend on
>  each other though).

> Cleanup of device_list_add(), mainly in preparation to handle
> reappearing missing device which its next reroll will be sent
> separately.

I'm adding the two patchsets to the 4.16 queue but will push the updated
branch after the current tests finish and I also test the updated branch
as well.


Re: [PATCH v6 00/99] XArray version 6

2018-01-18 Thread Matthew Wilcox
On Thu, Jan 18, 2018 at 05:56:12PM +0100, David Sterba wrote:
> On Thu, Jan 18, 2018 at 08:48:43AM -0800, Matthew Wilcox wrote:
> > Thank you!  I shall attempt to debug.  Was this with a btrfs root
> > filesystem?  I'm most suspicious of those patches right now, since they've
> > received next to no testing.  I'm going to put together a smaller patchset
> > which just does the page cache conversion and nothing else in the hope
> > that we can get that merged this year.
> 
> No, the root is ext3 and there was no btrfs filesystem mounted at the
> time.

Found it; I was missing a prerequisite patch.  New (smaller) patch series
coming soon.


Re: [PATCH v6 00/99] XArray version 6

2018-01-18 Thread David Sterba
On Thu, Jan 18, 2018 at 08:48:43AM -0800, Matthew Wilcox wrote:
> Thank you!  I shall attempt to debug.  Was this with a btrfs root
> filesystem?  I'm most suspicious of those patches right now, since they've
> received next to no testing.  I'm going to put together a smaller patchset
> which just does the page cache conversion and nothing else in the hope
> that we can get that merged this year.

No, the root is ext3 and there was no btrfs filesystem mounted at the
time.


Re: [PATCH v2 00/10] bugfixes and regression tests of btrfs_get_extent

2018-01-18 Thread David Sterba
On Mon, Jan 08, 2018 at 08:57:34PM +0100, David Sterba wrote:
> On Fri, Jan 05, 2018 at 12:51:07PM -0700, Liu Bo wrote:
> > Although
> > commit e6c4efd87ab0 ("btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map")
> > fixed up the negative em->len, it has introduced several regressions,
> > several of which have been fixed by
> > 
> > commit 32be3a1ac6d0 ("btrfs: Fix the wrong condition judgment about subset extent map"),
> > commit 8dff9c853410 ("Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent") and
> > commit 8e2bd3b7fac9 ("Btrfs: deal with existing encompassing extent map in btrfs_get_extent()").
> > 
> > Unfortunately, there is one more regression, which was caught recently by a
> > user's workload.
> > 
> > While debugging the above issue, I found that all of these bugs are caused
> > by some racy situations, which can be very tricky to reproduce, so I
> > created several extent map specific test cases in btrfs's selftest
> > framework.
> > 
> > Patch 1-2 fix two bugs.
> > Patch 3-4 are preparatory work.
> > Patch 5-7 are regression tests for the logic that handles EEXIST from
> > adding an extent map.
> > Patch 8-10 are for debugging: one is a direct tracepoint and the other is
> > to enable kprobe on merge_extent_mapping.
> > 
> > v2:
> > - Improve commit log to provide more details about the bug.
> > - Adjust bugfixes to the front so that we can merge them first.
> 
> Patchset updated in for-next. Expected merge target is 4.16, review is
> still needed (and welcome).

Patchset added to 4.16.


Re: [PATCH v6 00/99] XArray version 6

2018-01-18 Thread Matthew Wilcox
On Thu, Jan 18, 2018 at 05:07:50PM +0100, David Sterba wrote:
> On Wed, Jan 17, 2018 at 12:20:24PM -0800, Matthew Wilcox wrote:
> > From: Matthew Wilcox 
> > 
> > This version of the XArray has no known bugs.
> 
> I've booted this patchset on 2 boxes, both had random problems during
> boot. On one I was not able to diagnose what went wrong. On the other
> one the system booted up to userspace and failed to set up networking.
> Serial console worked and the network service complained about wrong
> format of /usr/share/wicked/schema/team.xml . That's supposed to be a
> text file, though hexdump showed me lots of zeros. Trimmed output:
> 
>   00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> (similar output here)
> *
> 0a10  00 00 00 00 00 00 00 00  11 03 00 00 00 00 00 00  ||
> 0a20  20 8b 7f 01 00 00 00 00  a0 84 7d 01 00 00 00 00  | .}.|
> 0a30  00 00 00 00 00 00 00 00  10 89 7f 01 00 00 00 00  ||
> 0a40  a0 84 7d 01 00 00 00 00  00 00 00 00 00 00 00 00  |..}.|
> 0a50  80 8a 7f 01 00 00 00 00  e0 cf 7d 01 00 00 00 00  |..}.|
> 0a60  00 00 00 00 00 00 00 00  60 8a 7f 01 00 00 00 00  |`...|
> 0a70  a0 84 7d 01 00 00 00 00  00 00 00 00 00 00 00 00  |..}.|
> 0a80  30 89 7f 01 00 00 00 00  a0 84 7d 01 00 00 00 00  |0.}.|
> 0a90  00 00 00 00 00 00 00 00  60 f2 7f 01 00 00 00 00  |`...|
> 0aa0  40 fd 7e 01 00 00 00 00  00 00 00 00 00 00 00 00  |@.~.|
> 0ab0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
> *
> 1000  3e 0a 20 20 3c 2f 6d 65  74 68 6f 64 3e 0a 3c 2f  |>.  </method>.</|
> 1010  73 65 72 76 69 63 65 3e  0a                       |service>.|
> 
> There's something at the end of the file that does look like a xml fragment.
> The file size is 4121. This looks to me like exactly the first page of the file
> was not read correctly.
> 
> The xml file is supposed to be read-only during startup, so there was no write
> in flight. 'rpm -Vv' reported only this file corrupted. Booting to other
> kernels was fine, network up, and the file was ok again. So the
> corruption happened only in memory, which leads me to the conclusion that
> there is an unknown bug in your patchset.

Thank you!  I shall attempt to debug.  Was this with a btrfs root
filesystem?  I'm most suspicious of those patches right now, since they've
received next to no testing.  I'm going to put together a smaller patchset
which just does the page cache conversion and nothing else in the hope
that we can get that merged this year.


Re: [PATCH v4 2/4] btrfs: cleanup btrfs_mount() using btrfs_mount_root()

2018-01-18 Thread David Sterba
On Thu, Jan 18, 2018 at 12:48:37PM +0800, Anand Jain wrote:
> > On 2018/01/16 20:45, Anand Jain wrote:
> >> On 01/16/2018 03:26 AM, David Sterba wrote:
> >>> On Fri, Jan 12, 2018 at 06:14:40PM +0800, Anand Jain wrote:
> 
>  Misono,
> 
>  This change is causing subsequent (subvol) mount to fail when device
>  option is specified. The simplest eg for failure is ..
>    mkfs.btrfs -qf /dev/sdc /dev/sdb
>    mount -o device=/dev/sdb /dev/sdc /btrfs
>    mount -o device=/dev/sdb /dev/sdc /btrfs1
>   mount: /dev/sdc is already mounted or /btrfs1 busy
> 
>   Looks like
> blkdev_get_by_path() <-- is failing.
> btrfs_scan_one_device()
> btrfs_parse_early_options()
> btrfs_mount()
> 
>  Which is due to different holders (viz. btrfs_root_fs_type and
>  btrfs_fs_type) one is used for vfs_mount and other for scan,
>  so they form different holders and can't let EXCL open which
>  is needed for both scan and open.
> >>> This looks close to what I see in the random test failures. I've
> >>> reverted your patch "btrfs: optimize move uuid_mutex closer to the
> >>> critical section" as I bisected to it. The uuid mutex around
> >>> blkdev_get_by_path probably protected the concurrent mount and scan so they
> >>> did not ask for EXCL at the same time.
> >>>
> >>> Reverting (or removing the patch from the current misc-next) queue is
> >>> simpler for me ATM as I want to get to a stable base now, we can add it
> >>> later if we understand the issue with the mount/scan.
> >>Right. I don't see above test case failing on your branch [1] which
> >>does not have the uuid_mutex patch.
> 
>   Sorry, I was wrong. It looks like I had booted the wrong kernel to test.
>   So I see the same problem even though you have reverted the patch
> 'btrfs: optimize move uuid_mutex closer to the critical section'
>   in [1].

Yeah, the revert was the result of an unreliable bisect, though I tried to
run the reproducers repeatedly. I'm going to consider the patch again.


Re: [PATCH v6 00/99] XArray version 6

2018-01-18 Thread David Sterba
On Wed, Jan 17, 2018 at 12:20:24PM -0800, Matthew Wilcox wrote:
> From: Matthew Wilcox 
> 
> This version of the XArray has no known bugs.

I've booted this patchset on 2 boxes, both had random problems during
boot. On one I was not able to diagnose what went wrong. On the other
one the system booted up to userspace and failed to set up networking.
Serial console worked and the network service complained about wrong
format of /usr/share/wicked/schema/team.xml . That's supposed to be a
text file, though hexdump showed me lots of zeros. Trimmed output:

  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
(similar output here)
*
0a10  00 00 00 00 00 00 00 00  11 03 00 00 00 00 00 00  ||
0a20  20 8b 7f 01 00 00 00 00  a0 84 7d 01 00 00 00 00  | .}.|
0a30  00 00 00 00 00 00 00 00  10 89 7f 01 00 00 00 00  ||
0a40  a0 84 7d 01 00 00 00 00  00 00 00 00 00 00 00 00  |..}.|
0a50  80 8a 7f 01 00 00 00 00  e0 cf 7d 01 00 00 00 00  |..}.|
0a60  00 00 00 00 00 00 00 00  60 8a 7f 01 00 00 00 00  |`...|
0a70  a0 84 7d 01 00 00 00 00  00 00 00 00 00 00 00 00  |..}.|
0a80  30 89 7f 01 00 00 00 00  a0 84 7d 01 00 00 00 00  |0.}.|
0a90  00 00 00 00 00 00 00 00  60 f2 7f 01 00 00 00 00  |`...|
0aa0  40 fd 7e 01 00 00 00 00  00 00 00 00 00 00 00 00  |@.~.|
0ab0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ||
*
1000  3e 0a 20 20 3c 2f 6d 65  74 68 6f 64 3e 0a 3c 2f  |>.  </method>.</|
1010  73 65 72 76 69 63 65 3e  0a                       |service>.|

There's something at the end of the file that does look like a xml fragment.
The file size is 4121. This looks to me like exactly the first page of the file
was not read correctly.

The xml file is supposed to be read-only during startup, so there was no write
in flight. 'rpm -Vv' reported only this file corrupted. Booting to other
kernels was fine, network up, and the file was ok again. So the
corruption happened only in memory, which leads me to the conclusion that
there is an unknown bug in your patchset.


Re: [PATCH] Btrfs: do not cache rbio pages if using raid6 recover

2018-01-18 Thread David Sterba
On Fri, Jan 12, 2018 at 06:07:02PM -0700, Liu Bo wrote:
> Since raid6 recover tries all possible combinations of failed stripes,
> 
> - when raid6 rebuild algorithm is used, i.e. raid6_datap_recov() and
>   raid6_2data_recov(), it may change the in-memory content of failed
>   stripes, if such a raid bio is cached, a later raid write rmw or recover
>   can steal @stripe_pages from it instead of reading from disks, such that
>   it carries the wrong content to do write rmw or recovery and ends up
>   with corruption or recovery failures.
> 
> - when raid5 rebuild algorithm is used, i.e. xor, raid bio can be cached
>   because the only failed stripe which contains @rbio->bio_pages gets
>   modified, others remain the same so that their in-memory content is
>   consistent with their on-disk content.
> 
> This adds a check to skip caching rbio if using raid6 recover.
> 
> Signed-off-by: Liu Bo 

Added to 4.16 queue, thanks.


Re: [PATCH] Btrfs: raid56: iterate raid56 internal bio with bio_for_each_segment_all

2018-01-18 Thread David Sterba
On Fri, Jan 12, 2018 at 06:07:01PM -0700, Liu Bo wrote:
> Bio iterated by set_bio_pages_uptodate() is raid56 internal one, so it
> will never be a BIO_CLONED bio, and since this is called by end_io
> functions, bio->bi_iter.bi_size is zero, we mustn't use
> bio_for_each_segment() as that is a no-op if bi_size is zero.
> 
> Fixes: 6592e58c6b68e61f003a01ba29a3716e7e2e9484 ("Btrfs: fix write corruption due to bio cloning on raid5/6")
> Cc:  # v4.12-rc6+
> Signed-off-by: Liu Bo 

Tested and added to 4.16 queue, thanks.


Re: [PATCH 2/2] btrfs: remove unused arg from parse_subvol_options()

2018-01-18 Thread David Sterba
On Wed, Jan 17, 2018 at 05:38:31PM +0900, Misono, Tomohiro wrote:
> Remove unused arg 'holder' from parse_subvol_options(), which has been
> forgotten to be cleaned in the commit b99beb110e2d ("btrfs: split
> parse_early_options() in two").
> 
> Signed-off-by: Tomohiro Misono 

Thanks, added to the rest of mount patches.


Re: [PATCH 1/2] btrfs: fix the bug of device scan/ready for mounted, filesystem

2018-01-18 Thread David Sterba
On Wed, Jan 17, 2018 at 05:37:39PM +0900, Misono, Tomohiro wrote:
> commit ae3acc5fc0bf ("btrfs: cleanup btrfs_mount() using
> btrfs_mount_root()") introduces a bug that "btrfs device scan/ready" for
> mounted filesystem fails.
> 
> This is because fs_info->bdev_holder has been changed to hold
> btrfs_root_fs_type instead of btrfs_fs_type by this commit, but ioctl
> for device scan/ready still uses btrfs_fs_type to call
> btrfs_scan_one_device(). This leads to failure of blkdev_get_by_path()
> for mounted filesystem because of different holder type.
> 
> Fix this by specifying btrfs_root_fs_type for btrfs_scan_one_device() in
> the path of device ready/scan ioctl.
> 
> Signed-off-by: Tomohiro Misono 

Thanks. I'd rather fold that in to "btrfs: cleanup btrfs_mount() using
btrfs_mount_root()" where the semantics of holder actually changes.
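
The holder mismatch described in the quoted commit message can be modelled
in a few lines of userspace C. This is only a simplified sketch of an
exclusive claim, not the block layer's code: struct model_bdev and
model_claim_excl() are hypothetical names, and the two holder tags stand in
for btrfs_fs_type and btrfs_root_fs_type. It assumes an exclusive open
succeeds when the device is unclaimed or already claimed by the same holder,
and fails with -EBUSY otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a block device that tracks its exclusive holder. */
struct model_bdev {
	const void *holder;	/* opaque tag identifying the current holder */
	int holders;		/* number of successful exclusive claims */
};

/*
 * Model of an exclusive open: the claim succeeds if nobody holds the
 * device yet or if the same holder tag is presented again. A different
 * tag is refused with -EBUSY.
 */
static int model_claim_excl(struct model_bdev *bdev, const void *holder)
{
	assert(bdev != NULL && holder != NULL);

	if (bdev->holder && bdev->holder != holder)
		return -EBUSY;

	bdev->holder = holder;
	bdev->holders++;
	return 0;
}
```

In this model, a mount that claims the device with one tag and a later scan
that presents a different tag reproduces the "already mounted or busy"
failure, while presenting the same tag in both paths lets the second claim
succeed.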


Re: [PATCH v6 85/99] btrfs: Remove unused spinlock

2018-01-18 Thread David Sterba
On Wed, Jan 17, 2018 at 12:21:49PM -0800, Matthew Wilcox wrote:
> From: Matthew Wilcox 
> 
> The reada_lock in struct btrfs_device was only initialised, and not
> actually used.  That's good because there's another lock also called
> reada_lock in the btrfs_fs_info that was quite heavily used.  Remove
> this one.
> 
> Signed-off-by: Matthew Wilcox 

I'll pick this one now, thanks.


Re: [PATCH 1/2] Btrfs: fix missing inode i_size update after zero range operation

2018-01-18 Thread David Sterba
On Thu, Jan 18, 2018 at 11:34:20AM +, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> For a fallocate's zero range operation that targets a range with an end
> that is not aligned to the sector size, we can end up not updating the
> inode's i_size. This happens when the last page of the range maps to an
> unwritten (prealloc) extent and before that last page we have either a
> hole or a written extent. This is because in this scenario we relied
> on a call to btrfs_prealloc_file_range() to update the inode's i_size,
> however it can only update the i_size to the "down aligned" end of the
> range.
> 
> Example:
> 
>  $ mkfs.btrfs -f /dev/sdc
>  $ mount /dev/sdc /mnt
>  $ xfs_io -f -c "pwrite -S 0xff 0 428K" /mnt/foobar
>  $ xfs_io -c "falloc -k 428K 4K" /mnt/foobar
>  $ xfs_io -c "fzero 0 430K" /mnt/foobar
>  $ du --bytes /mnt/foobar
>  438272   /mnt/foobar
> 
> The inode's i_size was left as 428Kb (438272 bytes) when it should have
> been updated to 430Kb (440320 bytes).
> Fix this by always updating the inode's i_size explicitly after zeroing
> the range.
> 
> Signed-off-by: Filipe Manana 

Added to 4.16 queue, thanks.
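
The size arithmetic in the quoted example can be checked with a small
sketch. It assumes a 4096-byte sector size, which the report does not state,
and round_down_pow2() is a hypothetical helper, not a btrfs function. It
only illustrates why an end offset of 430K that gets aligned down lands back
on 428K, the stale i_size that du printed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round x down to a multiple of align.
 * align must be a non-zero power of two (assumed: the sector size).
 */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

/* Convert a size given in units of 1024 bytes to bytes. */
static uint64_t kib_to_bytes(uint64_t kib)
{
	return kib * 1024;
}
```

Under these assumptions, round_down_pow2(kib_to_bytes(430), 4096) is 438272
bytes (428K), whereas the zeroed range actually ends at 440320 bytes (430K),
which matches the two sizes quoted in the commit message.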


Re: [PATCH 2/2] Btrfs: fix space leak after fallocate and zero range operations

2018-01-18 Thread David Sterba
On Thu, Jan 18, 2018 at 01:45:20PM +0200, Nikolay Borisov wrote:
> >  [95795.015577] R10: 06b4 R11: 0246 R12: 7fa6791bae64
> >  [95795.016569] R13:  R14: 563386706210 R15: 7ffccf0ab160
> >  [95795.017662] Code: 00 00 00 4c 8b a3 98 25 00 00 49 83 bc 24 60 ff ff ff 00 75 16 49 83 bc 24 68 ff ff ff 00 75 0b 49 83 bc 24 70 ff ff ff 00 74 16 <0f> ff 49 8d b4 24 18 ff ff ff 31 c9 31 d2 48 89 df e8 93 7a ff
> >  [95795.020538] ---[ end trace e95877675c6ec00d ]---
> >  [95795.021259] BTRFS info (device sdi): space_info 4 has 1072775168 free, is not full
> >  [95795.022390] BTRFS info (device sdi): space_info total=1073741824, used=114688, pinned=0, reserved=0, may_use=786432, readonly=65536
> > 
> > Fix this by ensuring the zero range operation does not call
> > btrfs_truncate_block() if the corresponding extent is an unwritten one
> > (it's pointless anyway, since reading from an unwritten extent yields
> > zeroes).
> > 
> > Signed-off-by: Filipe Manana 

Thanks.

> This fixes the leaks Josef and I have seen, so:
> 
> Tested-by: Nikolay Borisov 

Thanks.

Added to current misc-next (that's going to be added to 4.16 queue after
I let it pass through fstests).


Re: [PATCH] Fstests: btrfs/011 fix device mounted when tests aborts

2018-01-18 Thread Anand Jain



On 01/17/2018 04:10 AM, Liu Bo wrote:

One of btrfs tests, btrfs/011, uses SCRATCH_DEV_POOL and puts a
non-SCRATCH_DEV device as the first one when doing mkfs, and this makes
_require_scratch{_nocheck} confused since it checks mount point with
SCRATCH_DEV only.

This adds _scratch_unmount to _cleanup() to umount SCRATCH_MNT by
011 itself.

Signed-off-by: Liu Bo 


Reviewed-by: Anand Jain 
Tested-by: Anand Jain 

Thanks, Anand


---
  tests/btrfs/011 | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tests/btrfs/011 b/tests/btrfs/011
index 28f1388..f4c5309 100755
--- a/tests/btrfs/011
+++ b/tests/btrfs/011
@@ -51,6 +51,7 @@ _cleanup()
fi
wait
rm -f $tmp.tmp
+   _scratch_unmount > /dev/null 2>&1
  }
  trap "_cleanup; exit \$status" 0 1 2 3 15
  




[PATCH 2/4] btrfs: set the total_devices in device_list_add()

2018-01-18 Thread Anand Jain
btrfs_scan_one_device() is the only caller of device_list_add(), and it
sets btrfs_fs_devices::total_devices when device_list_add() succeeds.
This can be done within device_list_add() itself.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0b145276ff46..66e5dada2d74 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -843,6 +843,8 @@ static noinline int device_list_add(const char *path,
if (!fs_devices->opened)
device->generation = found_transid;
 
+   fs_devices->total_devices = btrfs_super_num_devices(disk_super);
+
*fs_devices_ret = fs_devices;
 
return 0;
@@ -1184,7 +1186,6 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
struct page *page;
int ret;
u64 devid;
-   u64 total_devices;
u64 bytenr;
 
/*
@@ -1206,12 +1207,9 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
}
 
devid = btrfs_stack_device_id(&disk_super->dev_item);
-   total_devices = btrfs_super_num_devices(disk_super);
 
mutex_lock(&uuid_mutex);
ret = device_list_add(path, disk_super, devid, fs_devices_ret);
-   if (!ret && fs_devices_ret)
-   (*fs_devices_ret)->total_devices = total_devices;
mutex_unlock(&uuid_mutex);
 
btrfs_release_disk_super(page);
-- 
2.7.0



[PATCH 4/4] btrfs: drop devid as device_list_add() arg

2018-01-18 Thread Anand Jain
Since struct btrfs_super_block is already passed in, device_list_add()
can get devid the same way its caller does.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d93ee0b91ad9..e947e47f8fff 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -730,12 +730,13 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
  * error pointer when failed
  */
 static noinline struct btrfs_device *device_list_add(const char *path,
-  struct btrfs_super_block *disk_super, u64 devid)
+  struct btrfs_super_block *disk_super)
 {
struct btrfs_device *device;
struct btrfs_fs_devices *fs_devices;
struct rcu_string *name;
u64 found_transid = btrfs_super_generation(disk_super);
+   u64 devid = btrfs_stack_device_id(&disk_super->dev_item);
 
fs_devices = find_fsid(disk_super->fsid);
if (!fs_devices) {
@@ -1183,7 +1184,6 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
struct block_device *bdev;
struct page *page;
int ret = 0;
-   u64 devid;
u64 bytenr;
 
/*
@@ -1204,10 +1204,8 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
goto error_bdev_put;
}
 
-   devid = btrfs_stack_device_id(&disk_super->dev_item);
-
mutex_lock(&uuid_mutex);
-   device = device_list_add(path, disk_super, devid);
+   device = device_list_add(path, disk_super);
mutex_unlock(&uuid_mutex);
if (IS_ERR(device))
ret = PTR_ERR(device);
-- 
2.7.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/4] btrfs: get device pointer from device_list_add()

2018-01-18 Thread Anand Jain
Instead of passing a pointer to btrfs_fs_devices as an argument to
device_list_add(), it is better to return a pointer to btrfs_device;
then we have both the btrfs_device and the btrfs_fs_devices pointers.
btrfs_device is needed to handle a reappearing missing device.

Signed-off-by: Anand Jain 
---
 fs/btrfs/volumes.c | 34 ++
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 66e5dada2d74..d93ee0b91ad9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -726,12 +726,11 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
  * Add new device to list of registered devices
  *
  * Returns:
- * 0   - device already known or newly added
- * < 0 - error
+ * device pointer which was just added or updated when successful
+ * error pointer when failed
  */
-static noinline int device_list_add(const char *path,
-  struct btrfs_super_block *disk_super,
-  u64 devid, struct btrfs_fs_devices **fs_devices_ret)
+static noinline struct btrfs_device *device_list_add(const char *path,
+  struct btrfs_super_block *disk_super, u64 devid)
 {
struct btrfs_device *device;
struct btrfs_fs_devices *fs_devices;
@@ -742,7 +741,7 @@ static noinline int device_list_add(const char *path,
if (!fs_devices) {
fs_devices = alloc_fs_devices(disk_super->fsid);
if (IS_ERR(fs_devices))
-   return PTR_ERR(fs_devices);
+   return ERR_PTR(PTR_ERR(fs_devices));
 
list_add(&fs_devices->list, &fs_uuids);
 
@@ -754,19 +753,19 @@ static noinline int device_list_add(const char *path,
 
if (!device) {
if (fs_devices->opened)
-   return -EBUSY;
+   return ERR_PTR(-EBUSY);
 
device = btrfs_alloc_device(NULL, &devid,
disk_super->dev_item.uuid);
if (IS_ERR(device)) {
/* we can safely leave the fs_devices entry around */
-   return PTR_ERR(device);
+   return device;
}
 
name = rcu_string_strdup(path, GFP_NOFS);
if (!name) {
free_device(device);
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
}
rcu_assign_pointer(device->name, name);
 
@@ -820,12 +819,12 @@ static noinline int device_list_add(const char *path,
 * with larger generation number or the last-in if
 * generation are equal.
 */
-   return -EEXIST;
+   return ERR_PTR(-EEXIST);
}
 
name = rcu_string_strdup(path, GFP_NOFS);
if (!name)
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
rcu_string_free(device->name);
rcu_assign_pointer(device->name, name);
if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) {
@@ -845,9 +844,7 @@ static noinline int device_list_add(const char *path,
 
fs_devices->total_devices = btrfs_super_num_devices(disk_super);
 
-   *fs_devices_ret = fs_devices;
-
-   return 0;
+   return device;
 }
 
 static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
@@ -1182,9 +1179,10 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
  struct btrfs_fs_devices **fs_devices_ret)
 {
struct btrfs_super_block *disk_super;
+   struct btrfs_device *device;
struct block_device *bdev;
struct page *page;
-   int ret;
+   int ret = 0;
u64 devid;
u64 bytenr;
 
@@ -1209,8 +1207,12 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
devid = btrfs_stack_device_id(&disk_super->dev_item);
 
mutex_lock(&uuid_mutex);
-   ret = device_list_add(path, disk_super, devid, fs_devices_ret);
+   device = device_list_add(path, disk_super, devid);
mutex_unlock(&uuid_mutex);
+   if (IS_ERR(device))
+   ret = PTR_ERR(device);
+
+   *fs_devices_ret = device->fs_devices;
 
btrfs_release_disk_super(page);
 
-- 
2.7.0
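
The change from an int return to a pointer return relies on the kernel's
error-pointer convention. The following is a minimal userspace sketch of
that convention, not kernel code: the ERR_PTR(), PTR_ERR() and IS_ERR()
helpers are re-implemented here for illustration, and struct model_device,
model_lookup() and model_scan_one() are hypothetical stand-ins for
btrfs_device, device_list_add() and btrfs_scan_one_device(). One point worth
noting is that the last hunk as posted assigns *fs_devices_ret even when
IS_ERR(device) is true; the sketch dereferences the pointer only on the
success path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-implementation of the error-pointer helpers (illustrative). */
static void *ERR_PTR(intptr_t error)
{
	return (void *)error;
}

static intptr_t PTR_ERR(const void *ptr)
{
	return (intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-ins for btrfs_fs_devices and btrfs_device. */
struct model_fs_devices { int total_devices; };
struct model_device { struct model_fs_devices *fs_devices; };

/* Stand-in for device_list_add(): returns the device or an error pointer. */
static struct model_device *model_lookup(struct model_device *dev, int busy)
{
	if (busy)
		return ERR_PTR(-EBUSY);
	return dev;
}

/* Caller pattern: convert the error, dereference only when valid. */
static int model_scan_one(struct model_device *dev, int busy,
			  struct model_fs_devices **fs_devices_ret)
{
	struct model_device *device = model_lookup(dev, busy);

	if (IS_ERR(device))
		return (int)PTR_ERR(device);

	*fs_devices_ret = device->fs_devices;
	return 0;
}
```

With this shape the caller gets both pieces of information from a single
return value: a valid device pointer, from which fs_devices is reachable, or
a negative errno encoded in the pointer.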



[PATCH 1/4] btrfs: move pr_info into device_list_add

2018-01-18 Thread Anand Jain
Commit 60999ca4b403 ("btrfs: make device scan less noisy")
adds return value 1 to device_list_add(), so that parent function can
call pr_info only when new device is added. Move the pr_info() part
into device_list_add() so that this function can be kept simple.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 29 +++--
 1 file changed, 11 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 68674da7f5fc..0b145276ff46 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -726,8 +726,7 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
  * Add new device to list of registered devices
  *
  * Returns:
- * 1   - first time device is seen
- * 0   - device already known
+ * 0   - device already known or newly added
  * < 0 - error
  */
 static noinline int device_list_add(const char *path,
@@ -737,7 +736,6 @@ static noinline int device_list_add(const char *path,
struct btrfs_device *device;
struct btrfs_fs_devices *fs_devices;
struct rcu_string *name;
-   int ret = 0;
u64 found_transid = btrfs_super_generation(disk_super);
 
fs_devices = find_fsid(disk_super->fsid);
@@ -777,9 +775,16 @@ static noinline int device_list_add(const char *path,
fs_devices->num_devices++;
mutex_unlock(&fs_devices->device_list_mutex);
 
-   ret = 1;
device->fs_devices = fs_devices;
btrfs_free_stale_devices(path, device);
+
+   if (disk_super->label[0])
+   pr_info("BTRFS: device label %s devid %llu transid %llu %s\n",
+   disk_super->label, devid, found_transid, path);
+   else
+   pr_info("BTRFS: device fsid %pU devid %llu transid %llu %s\n",
+   disk_super->fsid, devid, found_transid, path);
+
} else if (!device->name || strcmp(device->name->str, path)) {
/*
 * When FS is already mounted.
@@ -840,7 +845,7 @@ static noinline int device_list_add(const char *path,
 
*fs_devices_ret = fs_devices;
 
-   return ret;
+   return 0;
 }
 
 static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
@@ -1179,7 +1184,6 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
struct page *page;
int ret;
u64 devid;
-   u64 transid;
u64 total_devices;
u64 bytenr;
 
@@ -1202,25 +1206,14 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
}
 
devid = btrfs_stack_device_id(&disk_super->dev_item);
-   transid = btrfs_super_generation(disk_super);
total_devices = btrfs_super_num_devices(disk_super);
 
mutex_lock(&uuid_mutex);
ret = device_list_add(path, disk_super, devid, fs_devices_ret);
-   if (ret >= 0 && fs_devices_ret)
+   if (!ret && fs_devices_ret)
(*fs_devices_ret)->total_devices = total_devices;
mutex_unlock(&uuid_mutex);
 
-   if (ret > 0) {
-   if (disk_super->label[0])
-   pr_info("BTRFS: device label %s ", disk_super->label);
-   else
-   pr_info("BTRFS: device fsid %pU ", disk_super->fsid);
-
-   pr_cont("devid %llu transid %llu %s\n", devid, transid, path);
-   ret = 0;
-   }
-
btrfs_release_disk_super(page);
 
 error_bdev_put:
-- 
2.7.0



[PATCH RESEND v4 0/4] device_list_add() preparation to add reappearing missing device

2018-01-18 Thread Anand Jain
(Apply on top of my patchset
   [PATCH v4 0/6] preparatory work to add device forget
 for conflict free apply. They don't actually depend on
 each other though).

v3->v4:
 @3/4: Just return device instead of ERR_PTR(PTR_ERR(device));

v2->v3:
 Fix device_list_add() fn description which was still referring to the
 previous return values.

v1->v2:
 Drop patch 5/5 for uuid_mutex optimize. That was wrong. Thanks Josef.
 In patch 3/5 make btrfs_device * as return.

Cleanup of device_list_add(), mainly in preparation for handling a
reappearing missing device; the next reroll of that work will be sent
separately.

Anand Jain (4):
  btrfs: move pr_info into device_list_add
  btrfs: set the total_devices in device_list_add()
  btrfs: get device pointer from device_list_add()
  btrfs: drop devid as device_list_add() arg

 fs/btrfs/volumes.c | 63 +++---
 1 file changed, 27 insertions(+), 36 deletions(-)

-- 
2.7.0


[PATCH v5 0/6] preparatory work to add device forget

2018-01-18 Thread Anand Jain
v4->v5:
 Fix fn name btrfs_free_stale_device() to btrfs_free_stale_devices()
 in the comments and commit title. No code change.
 Add received reviewed-by.

v3->v4:
 Mainly fix as per comments from Josef.
 @3/6: rename btrfs_free_stale_device() to btrfs_free_stale_devices()
 @4/6: reorg logic, init not_found = 0; drop else part
 @5/6: added new in v4. Renames arg cur_dev to skip_dev
 @6/6: v3:5/6 is merged to v4:6/6
 checkpatch error fixes.

v2->v3:
 @ 6/6:
 add btrfs_free_stale_device() fn description, suggested by Nikolay
 Fix line with longer than 80 char
 
v1->v2:
 @ 6/6:
 btrfs_device::name is null when we have missing device and
 unmounted. So we still need to check for dev->name.

We can reuse the function btrfs_free_stale_device() to add feature
to forget a scanned device or all stale devices. So this patch set
proposes following changes to it.


Anand Jain (6):
  btrfs: cleanup btrfs_free_stale_device() usage
  btrfs: no need to check for btrfs_fs_devices::seeding
  btrfs: make btrfs_free_stale_device() to iterate all stales
  btrfs: make btrfs_free_stale_devices() argument optional
  btrfs: rename btrfs_free_stale_devices() arg to skip_dev
  btrfs: make btrfs_free_stale_devices() to match the path

 fs/btrfs/volumes.c | 59 +++---
 1 file changed, 25 insertions(+), 34 deletions(-)

-- 
2.7.0



[PATCH 5/6] btrfs: rename btrfs_free_stale_devices() arg to skip_dev

2018-01-18 Thread Anand Jain
No functional changes.
Rename the btrfs_free_stale_devices() arg to skip_dev, so that the
name reflects what the arg is for.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bba98d043402..a3edd4d92c57 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -606,7 +606,7 @@ static void pending_bios_fn(struct btrfs_work *work)
 }
 
 
-static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
+static void btrfs_free_stale_devices(struct btrfs_device *skip_dev)
 {
struct btrfs_fs_devices *fs_devs, *tmp_fs_devs;
struct btrfs_device *dev, *tmp_dev;
@@ -620,7 +620,7 @@ static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
 &fs_devs->devices, dev_list) {
int not_found = 0;
 
-   if (cur_dev && (cur_dev == dev || !dev->name))
+   if (skip_dev && (skip_dev == dev || !dev->name))
continue;
 
/*
@@ -630,9 +630,9 @@ static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
 * either use mapper or non mapper path throughout.
 */
rcu_read_lock();
-   if (cur_dev)
+   if (skip_dev)
not_found = strcmp(rcu_str_deref(dev->name),
-                          rcu_str_deref(cur_dev->name));
+                          rcu_str_deref(skip_dev->name));
rcu_read_unlock();
if (not_found)
continue;
-- 
2.7.0



[PATCH 2/6] btrfs: no need to check for btrfs_fs_devices::seeding

2018-01-18 Thread Anand Jain
There is no need to check for btrfs_fs_devices::seeding when we
have checked for btrfs_fs_devices::opened, because we can't sprout
without its seed FS being opened.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 25b91776d036..3f481da9cae7 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -619,8 +619,6 @@ static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
 
if (fs_devs->opened)
continue;
-   if (fs_devs->seeding)
-   continue;
 
list_for_each_entry(dev, &fs_devs->devices, dev_list) {
 
-- 
2.7.0



[PATCH 6/6] btrfs: make btrfs_free_stale_devices() to match the path

2018-01-18 Thread Anand Jain
From: Anand Jain 

Update btrfs_free_stale_devices() to match the given device path and
delete the matching device. (It searches only the list of unmounted
devices.) Also drop the comment about a different path being used for
the same device: since we will now have a CLI to clean up any device,
that is no longer a concern.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index a3edd4d92c57..68674da7f5fc 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -605,8 +605,17 @@ static void pending_bios_fn(struct btrfs_work *work)
run_scheduled_bios(device);
 }
 
-
-static void btrfs_free_stale_devices(struct btrfs_device *skip_dev)
+/*
+ * btrfs_free_stale_devices()
+ *  Search and remove all stale (devices which are not mounted) devices.
+ *  When both inputs are NULL, it will search and release all stale devices.
+ *  path:  Optional. When provided will it release all unmounted devices
+ * matching this path only.
+ *  skip_dev:  Optional. Will skip this device when searching for the stale
+ * devices.
+ */
+static void btrfs_free_stale_devices(const char *path,
+struct btrfs_device *skip_dev)
 {
struct btrfs_fs_devices *fs_devs, *tmp_fs_devs;
struct btrfs_device *dev, *tmp_dev;
@@ -620,19 +629,15 @@ static void btrfs_free_stale_devices(struct btrfs_device *skip_dev)
 &fs_devs->devices, dev_list) {
int not_found = 0;
 
-   if (skip_dev && (skip_dev == dev || !dev->name))
+   if (skip_dev && skip_dev == dev)
+   continue;
+   if (path && !dev->name)
continue;
 
-   /*
-* Todo: This won't be enough. What if the same device
-* comes back (with new uuid and) with its mapper path?
-* But for now, this does help as mostly an admin will
-* either use mapper or non mapper path throughout.
-*/
rcu_read_lock();
-   if (skip_dev)
+   if (path)
not_found = strcmp(rcu_str_deref(dev->name),
-                          rcu_str_deref(skip_dev->name));
+  path);
rcu_read_unlock();
if (not_found)
continue;
@@ -774,7 +779,7 @@ static noinline int device_list_add(const char *path,
 
ret = 1;
device->fs_devices = fs_devices;
-   btrfs_free_stale_devices(device);
+   btrfs_free_stale_devices(path, device);
} else if (!device->name || strcmp(device->name->str, path)) {
/*
 * When FS is already mounted.
-- 
2.7.0



[PATCH 4/6] btrfs: make btrfs_free_stale_devices() argument optional

2018-01-18 Thread Anand Jain
From: Anand Jain 

This updates the btrfs_free_stale_devices() helper function to delete all
unmounted devices when the arg is NULL.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index cd234a5dc763..bba98d043402 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -611,9 +611,6 @@ static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
struct btrfs_fs_devices *fs_devs, *tmp_fs_devs;
struct btrfs_device *dev, *tmp_dev;
 
-   if (!cur_dev->name)
-   return;
-
list_for_each_entry_safe(fs_devs, tmp_fs_devs, &fs_uuids, list) {
 
if (fs_devs->opened)
@@ -621,11 +618,9 @@ static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
 
list_for_each_entry_safe(dev, tmp_dev,
 &fs_devs->devices, dev_list) {
-   int not_found;
+   int not_found = 0;
 
-   if (dev == cur_dev)
-   continue;
-   if (!dev->name)
+   if (cur_dev && (cur_dev == dev || !dev->name))
continue;
 
/*
@@ -635,8 +630,9 @@ static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
 * either use mapper or non mapper path throughout.
 */
rcu_read_lock();
-   not_found = strcmp(rcu_str_deref(dev->name),
-   rcu_str_deref(cur_dev->name));
+   if (cur_dev)
+   not_found = strcmp(rcu_str_deref(dev->name),
+                                  rcu_str_deref(cur_dev->name));
rcu_read_unlock();
if (not_found)
continue;
-- 
2.7.0



[PATCH 3/6] btrfs: make btrfs_free_stale_device() to iterate all stales

2018-01-18 Thread Anand Jain
From: Anand Jain 

Let the list iterator iterate further, and find and delete the other
stale devices as well. This is in preparation for adding support for
stale device cleanup requested from user land. Also rename
btrfs_free_stale_device() to btrfs_free_stale_devices().

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3f481da9cae7..cd234a5dc763 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -606,21 +606,22 @@ static void pending_bios_fn(struct btrfs_work *work)
 }
 
 
-static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
+static void btrfs_free_stale_devices(struct btrfs_device *cur_dev)
 {
-   struct btrfs_fs_devices *fs_devs;
-   struct btrfs_device *dev;
+   struct btrfs_fs_devices *fs_devs, *tmp_fs_devs;
+   struct btrfs_device *dev, *tmp_dev;
 
if (!cur_dev->name)
return;
 
-   list_for_each_entry(fs_devs, &fs_uuids, list) {
-   int del = 1;
+   list_for_each_entry_safe(fs_devs, tmp_fs_devs, &fs_uuids, list) {
 
if (fs_devs->opened)
continue;
 
-   list_for_each_entry(dev, &fs_devs->devices, dev_list) {
+   list_for_each_entry_safe(dev, tmp_dev,
+&fs_devs->devices, dev_list) {
+   int not_found;
 
if (dev == cur_dev)
continue;
@@ -634,14 +635,12 @@ static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
 * either use mapper or non mapper path throughout.
 */
rcu_read_lock();
-   del = strcmp(rcu_str_deref(dev->name),
+   not_found = strcmp(rcu_str_deref(dev->name),
rcu_str_deref(cur_dev->name));
rcu_read_unlock();
-   if (!del)
-   break;
-   }
+   if (not_found)
+   continue;
 
-   if (!del) {
/* delete the stale device */
if (fs_devs->num_devices == 1) {
btrfs_sysfs_remove_fsid(fs_devs);
@@ -652,7 +651,6 @@ static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
list_del(&dev->dev_list);
free_device(dev);
}
-   break;
}
}
 }
@@ -780,7 +778,7 @@ static noinline int device_list_add(const char *path,
 
ret = 1;
device->fs_devices = fs_devices;
-   btrfs_free_stale_device(device);
+   btrfs_free_stale_devices(device);
} else if (!device->name || strcmp(device->name->str, path)) {
/*
 * When FS is already mounted.
-- 
2.7.0



[PATCH 1/6] btrfs: cleanup btrfs_free_stale_device() usage

2018-01-18 Thread Anand Jain
We call btrfs_free_stale_device() only when we alloc a new
struct btrfs_device (ret=1), so move it closer to where we
alloc the new device. Also drop the comments.

Signed-off-by: Anand Jain 
Reviewed-by: Josef Bacik 
---
 fs/btrfs/volumes.c | 8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d393808071d5..25b91776d036 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -782,6 +782,7 @@ static noinline int device_list_add(const char *path,
 
ret = 1;
device->fs_devices = fs_devices;
+   btrfs_free_stale_device(device);
} else if (!device->name || strcmp(device->name->str, path)) {
/*
 * When FS is already mounted.
@@ -840,13 +841,6 @@ static noinline int device_list_add(const char *path,
if (!fs_devices->opened)
device->generation = found_transid;
 
-   /*
-* if there is new btrfs on an already registered device,
-* then remove the stale device entry.
-*/
-   if (ret > 0)
-   btrfs_free_stale_device(device);
-
*fs_devices_ret = fs_devices;
 
return ret;
-- 
2.7.0



Re: [PATCH 1/2] btrfs: fix device order consistency

2018-01-18 Thread Anand Jain



On 01/18/2018 04:32 PM, Nikolay Borisov wrote:
> On 18.01.2018 04:32, Anand Jain wrote:
> > By maintaining the device order consistency it makes reproducing
> > the problem more consistent. So fix this by having the devices
>
> Which problem is that ?

 I noticed it when trying to reproduce a raid1 missed writes issue
 (an xfstests test case is coming up). It is good to have in general
 for any device-related issue.

Thanks, Anand

> > sorted by some order within the kernel, lets say by devid.

Signed-off-by: Anand Jain 
---
  fs/btrfs/volumes.c | 16 
  1 file changed, 16 insertions(+)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d393808071d5..68be58a5b03f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -27,6 +27,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include "ctree.h"
  #include "extent_map.h"
@@ -1108,6 +1109,20 @@ static int __btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
return ret;
  }
  
+static int device_sort(void *priv, struct list_head *a, struct list_head *b)
+{
+   struct btrfs_device *dev1, *dev2;
+
+   dev1 = list_entry(a, struct btrfs_device, dev_list);
+   dev2 = list_entry(b, struct btrfs_device, dev_list);
+
+   if (dev1->devid < dev2->devid)
+   return -1;
+   else if (dev1->devid > dev2->devid)
+   return 1;
+   return 0;
+}
+
  int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
   fmode_t flags, void *holder)
  {
@@ -1118,6 +1133,7 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
fs_devices->opened++;
ret = 0;
} else {
+   list_sort(NULL, &fs_devices->devices, device_sort);
ret = __btrfs_open_devices(fs_devices, flags, holder);
}
mutex_unlock(&uuid_mutex);




[PATCH] btrfs: Use IS_ALIGNED in btrfs_truncate_block instead of opencoding it

2018-01-18 Thread Nikolay Borisov
No functional changes, just makes the code more readable

Signed-off-by: Nikolay Borisov 
---
 fs/btrfs/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 269d129ffb1f..e9690e2aba09 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4775,8 +4775,8 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
u64 block_start;
u64 block_end;
 
-   if ((offset & (blocksize - 1)) == 0 &&
-   (!len || ((len & (blocksize - 1)) == 0)))
+   if (IS_ALIGNED(offset, blocksize) &&
+   (!len || IS_ALIGNED(len, blocksize)))
goto out;
 
block_start = round_down(from, blocksize);
-- 
2.7.4



Re: [PATCH 2/2] Btrfs: fix space leak after fallocate and zero range operations

2018-01-18 Thread Nikolay Borisov


On 18.01.2018 13:34, fdman...@kernel.org wrote:
> From: Filipe Manana 
> 
> If we do a buffered write after a zero range operation that has an
> unaligned (with the filesystem's sector size) end which also falls within
> an unwritten (prealloc) extent that is currently beyond the inode's
> i_size, and the zero range operation has the flag FALLOC_FL_KEEP_SIZE,
> we end up leaking data and metadata space. This happens because when
> zeroing a range we call btrfs_truncate_block(), which does delalloc
> (loads the page and partially zeroes its content), and in the buffered
> write path we only clear existing delalloc space reservation for the
> range we are writing into if that range starts at an offset smaller than
> the inode's i_size, which makes sense since we can not have delalloc
> extents beyond the i_size, only unwritten extents are allowed.
> 
> Example reproducer:
> 
>  $ mkfs.btrfs -f /dev/sdb
>  $ mount /dev/sdb /mnt
>  $ xfs_io -f -c "falloc -k 428K 4K" /mnt/foobar
>  $ xfs_io -c "fzero -k 0 430K" /mnt/foobar
>  $ xfs_io -c "pwrite -S 0xaa 428K 4K" /mnt/foobar
>  $ umount /mnt
> 
> After the unmount we get the metadata and data space leaks reported in
> dmesg/syslog:
> 
>  [95794.602253] [ cut here ]
>  [95794.603322] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9561 
> btrfs_destroy_inode+0x4e/0x206 [btrfs]
>  [95794.605167] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc 
> aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg 
> i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 
> ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 
> async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
> libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod 
> virtio_scsi ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring 
> virtio libata scsi_mod e1000 [last unloaded: btrfs]
>  [95794.613000] CPU: 0 PID: 31496 Comm: umount Tainted: GW   
> 4.14.0-rc6-btrfs-next-54+ #1
>  [95794.614448] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>  [95794.615972] task: 880075aa0240 task.stack: c90001734000
>  [95794.617114] RIP: 0010:btrfs_destroy_inode+0x4e/0x206 [btrfs]
>  [95794.618001] RSP: 0018:c90001737d00 EFLAGS: 00010202
>  [95794.618721] RAX:  RBX: 880070fa1418 RCX: 
> c90001737c7c
>  [95794.619645] RDX: 000175aa0240 RSI: 0001 RDI: 
> 880070fa1418
>  [95794.620711] RBP: c90001737d38 R08:  R09: 
> 
>  [95794.621932] R10: c90001737c48 R11: 88007123e158 R12: 
> 880075b6a000
>  [95794.623124] R13: 88006145c000 R14: 880070fa1418 R15: 
> 880070c3b4a0
>  [95794.624188] FS:  7fa6793c92c0() GS:88023fc0() 
> knlGS:
>  [95794.625578] CS:  0010 DS:  ES:  CR0: 80050033
>  [95794.626522] CR2: 56338670d048 CR3: 610dc005 CR4: 
> 001606f0
>  [95794.627647] Call Trace:
>  [95794.628128]  destroy_inode+0x3d/0x55
>  [95794.628573]  evict+0x177/0x17e
>  [95794.629010]  dispose_list+0x50/0x71
>  [95794.629478]  evict_inodes+0x132/0x141
>  [95794.630289]  generic_shutdown_super+0x3f/0x10b
>  [95794.630864]  kill_anon_super+0x12/0x1c
>  [95794.631383]  btrfs_kill_super+0x16/0x21 [btrfs]
>  [95794.631930]  deactivate_locked_super+0x30/0x68
>  [95794.632539]  deactivate_super+0x36/0x39
>  [95794.633200]  cleanup_mnt+0x49/0x67
>  [95794.633818]  __cleanup_mnt+0x12/0x14
>  [95794.634416]  task_work_run+0x82/0xa6
>  [95794.634902]  prepare_exit_to_usermode+0xe1/0x10c
>  [95794.635525]  syscall_return_slowpath+0x18c/0x1af
>  [95794.636122]  entry_SYSCALL_64_fastpath+0xab/0xad
>  [95794.636834] RIP: 0033:0x7fa678cb99a7
>  [95794.637370] RSP: 002b:7ffccf0aaed8 EFLAGS: 0246 ORIG_RAX: 
> 00a6
>  [95794.638672] RAX:  RBX: 563386706030 RCX: 
> 7fa678cb99a7
>  [95794.639596] RDX: 0001 RSI:  RDI: 
> 56338670ca90
>  [95794.640703] RBP: 56338670ca90 R08: 56338670c740 R09: 
> 0015
>  [95794.641773] R10: 06b4 R11: 0246 R12: 
> 7fa6791bae64
>  [95794.643150] R13:  R14: 563386706210 R15: 
> 7ffccf0ab160
>  [95794.644249] Code: ff 4c 8b a8 80 06 00 00 48 8b 87 c0 01 00 00 48 85 c0 
> 74 02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 
> <0f> ff 83 bb 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00
>  [95794.646929] ---[ end trace e95877675c6ec007 ]---
>  [95794.647751] [ cut here ]
>  [95794.648509] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9562 
> btrfs_destroy_inode+0x59/0x206 [btrfs]
>  [95794.649842] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc 
> aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc p

[PATCH 1/2] Btrfs: fix missing inode i_size update after zero range operation

2018-01-18 Thread fdmanana
From: Filipe Manana 

For a fallocate's zero range operation that targets a range with an end
that is not aligned to the sector size, we can end up not updating the
inode's i_size. This happens when the last page of the range maps to an
unwritten (prealloc) extent and before that last page we have either a
hole or a written extent. This is because in this scenario we relied
on a call to btrfs_prealloc_file_range() to update the inode's i_size,
however it can only update the i_size to the "down aligned" end of the
range.

Example:

 $ mkfs.btrfs -f /dev/sdc
 $ mount /dev/sdc /mnt
 $ xfs_io -f -c "pwrite -S 0xff 0 428K" /mnt/foobar
 $ xfs_io -c "falloc -k 428K 4K" /mnt/foobar
 $ xfs_io -c "fzero 0 430K" /mnt/foobar
 $ du --bytes /mnt/foobar
 438272 /mnt/foobar

The inode's i_size was left as 428Kb (438272 bytes) when it should have
been updated to 430Kb (440320 bytes).
Fix this by always updating the inode's i_size explicitly after zeroing
the range.

Signed-off-by: Filipe Manana 
---
 fs/btrfs/file.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index dc95d9590d2d..9ad0465d2e8e 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -3026,9 +3026,12 @@ static int btrfs_zero_range(struct inode *inode,
unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart,
 lockend, &cached_state, GFP_KERNEL);
/* btrfs_prealloc_file_range releases reserved space on error */
-   if (ret)
+   if (ret) {
space_reserved = false;
+   goto out;
+   }
}
+   ret = btrfs_fallocate_update_isize(inode, offset + len, mode);
  out:
if (ret && space_reserved)
btrfs_free_reserved_data_space(inode, data_reserved,
-- 
2.11.0



[PATCH 2/2] Btrfs: fix space leak after fallocate and zero range operations

2018-01-18 Thread fdmanana
From: Filipe Manana 

If we do a buffered write after a zero range operation that has an
unaligned (with the filesystem's sector size) end which also falls within
an unwritten (prealloc) extent that is currently beyond the inode's
i_size, and the zero range operation has the flag FALLOC_FL_KEEP_SIZE,
we end up leaking data and metadata space. This happens because when
zeroing a range we call btrfs_truncate_block(), which does delalloc
(loads the page and partially zeroes its content), and in the buffered
write path we only clear existing delalloc space reservation for the
range we are writing into if that range starts at an offset smaller than
the inode's i_size, which makes sense since we can not have delalloc
extents beyond the i_size, only unwritten extents are allowed.

Example reproducer:

 $ mkfs.btrfs -f /dev/sdb
 $ mount /dev/sdb /mnt
 $ xfs_io -f -c "falloc -k 428K 4K" /mnt/foobar
 $ xfs_io -c "fzero -k 0 430K" /mnt/foobar
 $ xfs_io -c "pwrite -S 0xaa 428K 4K" /mnt/foobar
 $ umount /mnt

After the unmount we get the metadata and data space leaks reported in
dmesg/syslog:

 [95794.602253] [ cut here ]
 [95794.603322] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9561 
btrfs_destroy_inode+0x4e/0x206 [btrfs]
 [95794.605167] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc 
aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg 
i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 
ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod sd_mod virtio_scsi 
ata_generic crc32c_intel ata_piix floppy virtio_pci virtio_ring virtio libata 
scsi_mod e1000 [last unloaded: btrfs]
 [95794.613000] CPU: 0 PID: 31496 Comm: umount Tainted: GW   
4.14.0-rc6-btrfs-next-54+ #1
 [95794.614448] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
 [95794.615972] task: 880075aa0240 task.stack: c90001734000
 [95794.617114] RIP: 0010:btrfs_destroy_inode+0x4e/0x206 [btrfs]
 [95794.618001] RSP: 0018:c90001737d00 EFLAGS: 00010202
 [95794.618721] RAX:  RBX: 880070fa1418 RCX: 
c90001737c7c
 [95794.619645] RDX: 000175aa0240 RSI: 0001 RDI: 
880070fa1418
 [95794.620711] RBP: c90001737d38 R08:  R09: 

 [95794.621932] R10: c90001737c48 R11: 88007123e158 R12: 
880075b6a000
 [95794.623124] R13: 88006145c000 R14: 880070fa1418 R15: 
880070c3b4a0
 [95794.624188] FS:  7fa6793c92c0() GS:88023fc0() 
knlGS:
 [95794.625578] CS:  0010 DS:  ES:  CR0: 80050033
 [95794.626522] CR2: 56338670d048 CR3: 610dc005 CR4: 
001606f0
 [95794.627647] Call Trace:
 [95794.628128]  destroy_inode+0x3d/0x55
 [95794.628573]  evict+0x177/0x17e
 [95794.629010]  dispose_list+0x50/0x71
 [95794.629478]  evict_inodes+0x132/0x141
 [95794.630289]  generic_shutdown_super+0x3f/0x10b
 [95794.630864]  kill_anon_super+0x12/0x1c
 [95794.631383]  btrfs_kill_super+0x16/0x21 [btrfs]
 [95794.631930]  deactivate_locked_super+0x30/0x68
 [95794.632539]  deactivate_super+0x36/0x39
 [95794.633200]  cleanup_mnt+0x49/0x67
 [95794.633818]  __cleanup_mnt+0x12/0x14
 [95794.634416]  task_work_run+0x82/0xa6
 [95794.634902]  prepare_exit_to_usermode+0xe1/0x10c
 [95794.635525]  syscall_return_slowpath+0x18c/0x1af
 [95794.636122]  entry_SYSCALL_64_fastpath+0xab/0xad
 [95794.636834] RIP: 0033:0x7fa678cb99a7
 [95794.637370] RSP: 002b:7ffccf0aaed8 EFLAGS: 0246 ORIG_RAX: 
00a6
 [95794.638672] RAX:  RBX: 563386706030 RCX: 
7fa678cb99a7
 [95794.639596] RDX: 0001 RSI:  RDI: 
56338670ca90
 [95794.640703] RBP: 56338670ca90 R08: 56338670c740 R09: 
0015
 [95794.641773] R10: 06b4 R11: 0246 R12: 
7fa6791bae64
 [95794.643150] R13:  R14: 563386706210 R15: 
7ffccf0ab160
 [95794.644249] Code: ff 4c 8b a8 80 06 00 00 48 8b 87 c0 01 00 00 48 85 c0 74 
02 0f ff 48 83 bb e0 02 00 00 00 74 02 0f ff 83 bb 3c ff ff ff 00 74 02 <0f> ff 
83 bb 40 ff ff ff 00 74 02 0f ff 48 83 bb f8 fe ff ff 00
 [95794.646929] ---[ end trace e95877675c6ec007 ]---
 [95794.647751] [ cut here ]
 [95794.648509] WARNING: CPU: 0 PID: 31496 at fs/btrfs/inode.c:9562 
btrfs_destroy_inode+0x59/0x206 [btrfs]
 [95794.649842] Modules linked in: btrfs xfs ppdev ghash_clmulni_intel pcbc 
aesni_intel aes_x86_64 crypto_simd cryptd glue_helper parport_pc psmouse sg 
i2c_piix4 parport i2c_core evdev pcspkr button serio_raw sunrpc loop autofs4 
ext4 crc16 mbcache jbd2 zstd_decompress zstd_compress xxhash raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcr

Re: [PATCH 1/2] btrfs: fix device order consistency

2018-01-18 Thread Nikolay Borisov


On 18.01.2018 04:32, Anand Jain wrote:
> By maintaining the device order consistency it makes reproducing
> the problem more consistent. So fix this by having the devices

Which problem is that ?

> sorted by some order within the kernel, lets say by devid.
> 
> Signed-off-by: Anand Jain 
> ---
>  fs/btrfs/volumes.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d393808071d5..68be58a5b03f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "ctree.h"
>  #include "extent_map.h"
> @@ -1108,6 +1109,20 @@ static int __btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>   return ret;
>  }
>  
> +static int device_sort(void *priv, struct list_head *a, struct list_head *b)
> +{
> + struct btrfs_device *dev1, *dev2;
> +
> + dev1 = list_entry(a, struct btrfs_device, dev_list);
> + dev2 = list_entry(b, struct btrfs_device, dev_list);
> +
> + if (dev1->devid < dev2->devid)
> + return -1;
> + else if (dev1->devid > dev2->devid)
> + return 1;
> + return 0;
> +}
> +
>  int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>  fmode_t flags, void *holder)
>  {
> @@ -1118,6 +1133,7 @@ int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
>   fs_devices->opened++;
>   ret = 0;
>   } else {
> + list_sort(NULL, &fs_devices->devices, device_sort);
>   ret = __btrfs_open_devices(fs_devices, flags, holder);
>   }
>   mutex_unlock(&uuid_mutex);
> 