Re: [PATCH v3 2/4] btrfs-progs: lowmem check: Fix false alert about referencer count mismatch

2017-07-02 Thread Lu Fengqi
On Sun, Jul 02, 2017 at 03:50:31PM +0200, Henk Slager wrote:
>
>With this patch applied to v4.11, I ran:
># btrfs check -p --mode lowmem /dev/mapper/smr
>
>The 'referencer count mismatch' messages are gone, but, likely due to
>other hidden corruption, the check took longer than I had planned, so
>I cancelled it after 5 days.
>
>In summary, the kernel and the lowmem check appear to report the same
>issue; the lowmem check output is this (repeating):
>[...]
>parent transid verify failed on 6350669414400 wanted 24678 found 24184
>parent transid verify failed on 6350645837824 wanted 24678 found 23277
>Ignoring transid failure
>leaf parent key incorrect 6350645837824
>ERROR: extent[6349151535104 16384] backref lost (owner: 2, level: 0)
>ERROR: check leaf failed root 2 bytenr 6349151535104 level 0, force continue check
>parent transid verify failed on 6350645837824 wanted 24678 found 23277
>Ignoring transid failure
>leaf parent key incorrect 6350645837824
>ERROR: extent[6349150486528 16384] backref lost (owner: 2, level: 0)
>ERROR: check leaf failed root 2 bytenr 6349150486528 level 0, force continue check

This looks like the extent tree has some problems. Could you run the
following command to dump the relevant part of the extent tree for me?

# btrfs-debug-tree -t 2 /dev/mapper/smr | grep -C 10 -e 6349151535104 -e 6349150486528

>My plan is now to image the whole 8TB fs to extra/new storage hardware
>with dd and then see if I can get the copy fixed. But it might take a
>year before I do so (it is not critical w.r.t. data-loss, it's cold
>storage, multi-year btrfs features test).
>
>

-- 
Thanks,
Lu


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/4] btrfs-progs: lowmem check: Fix false alert about referencer count mismatch

2017-07-02 Thread Henk Slager
On Mon, Jun 26, 2017 at 12:37 PM, Lu Fengqi  wrote:
> The normal back reference counting doesn't care about the extent referred
> to by the extent data in a shared leaf. The check_extent_data_backref
> function needs to skip any leaf whose owner does not match root_id.
>
> Reported-by: Marc MERLIN 
> Signed-off-by: Lu Fengqi 
> ---
>  cmds-check.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/cmds-check.c b/cmds-check.c
> index 70d2b7f2..f42968cd 100644
> --- a/cmds-check.c
> +++ b/cmds-check.c
> @@ -10692,7 +10692,8 @@ static int check_extent_data_backref(struct btrfs_fs_info *fs_info,
>  		leaf = path.nodes[0];
>  		slot = path.slots[0];
>
> -		if (slot >= btrfs_header_nritems(leaf))
> +		if (slot >= btrfs_header_nritems(leaf) ||
> +		    btrfs_header_owner(leaf) != root_id)
>  			goto next;
>  		btrfs_item_key_to_cpu(leaf, &key, slot);
>  		if (key.objectid != objectid || key.type != BTRFS_EXTENT_DATA_KEY)
> --
> 2.13.1

With this patch applied to v4.11, I ran:
# btrfs check -p --mode lowmem /dev/mapper/smr

The 'referencer count mismatch' messages are gone, but, likely due to
other hidden corruption, the check took longer than I had planned, so
I cancelled it after 5 days.

In summary, the kernel and the lowmem check appear to report the same
issue; the lowmem check output is this (repeating):
[...]
parent transid verify failed on 6350669414400 wanted 24678 found 24184
parent transid verify failed on 6350645837824 wanted 24678 found 23277
Ignoring transid failure
leaf parent key incorrect 6350645837824
ERROR: extent[6349151535104 16384] backref lost (owner: 2, level: 0)
ERROR: check leaf failed root 2 bytenr 6349151535104 level 0, force continue check
parent transid verify failed on 6350645837824 wanted 24678 found 23277
Ignoring transid failure
leaf parent key incorrect 6350645837824
ERROR: extent[6349150486528 16384] backref lost (owner: 2, level: 0)
ERROR: check leaf failed root 2 bytenr 6349150486528 level 0, force continue check
^C

My plan is now to image the whole 8TB fs to extra/new storage hardware
with dd and then see if I can get the copy fixed. But it might take a
year before I do so (it is not critical w.r.t. data-loss, it's cold
storage, multi-year btrfs features test).


[PATCH v3 2/4] btrfs-progs: lowmem check: Fix false alert about referencer count mismatch

2017-06-26 Thread Lu Fengqi
The normal back reference counting doesn't care about the extent referred
to by the extent data in a shared leaf. The check_extent_data_backref
function needs to skip any leaf whose owner does not match root_id.

Reported-by: Marc MERLIN 
Signed-off-by: Lu Fengqi 
---
 cmds-check.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/cmds-check.c b/cmds-check.c
index 70d2b7f2..f42968cd 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -10692,7 +10692,8 @@ static int check_extent_data_backref(struct btrfs_fs_info *fs_info,
 		leaf = path.nodes[0];
 		slot = path.slots[0];
 
-		if (slot >= btrfs_header_nritems(leaf))
+		if (slot >= btrfs_header_nritems(leaf) ||
+		    btrfs_header_owner(leaf) != root_id)
 			goto next;
 		btrfs_item_key_to_cpu(leaf, &key, slot);
 		if (key.objectid != objectid || key.type != BTRFS_EXTENT_DATA_KEY)
-- 
2.13.1


