Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-04 Thread Qu Wenruo


On 2018/9/5 4:37 AM, Chris Murphy wrote:
> On Tue, Sep 4, 2018 at 10:22 AM, Etienne Champetier wrote:
> 
>> Do you have a procedure to copy all subvolumes and skip errors? (I have
>> ~200 snapshots)
> 
> If they're already read-only snapshots, then script an iteration of
> btrfs send receive to a new volume.

Wouldn't a simple "cp -r" work here?
(If what matters is the data, not the subvolume layout.)
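
A minimal sketch of the "copy the data and skip errors" idea, in Python so that every failure is recorded while the copy carries on. It assumes the damaged filesystem is already mounted read-only. The function name copy_tree_skip_errors and the paths in the usage line are hypothetical, and the subvolume layout is not preserved: snapshots are copied as ordinary directories.

```python
import os
import shutil


def copy_tree_skip_errors(src, dst):
    """Copy the tree under src into dst; return a list of (path, error) failures."""
    failed = []

    def on_walk_error(err):
        # Called by os.walk when a directory cannot be listed.
        failed.append((err.filename, str(err)))

    for root, dirs, files in os.walk(src, onerror=on_walk_error):
        rel = os.path.relpath(root, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        try:
            os.makedirs(target_dir, exist_ok=True)
        except OSError as exc:
            failed.append((root, str(exc)))
            dirs[:] = []  # do not descend below a directory we could not create
            continue
        for name in files:
            source = os.path.join(root, name)
            try:
                # follow_symlinks=False copies a symlink as a symlink.
                shutil.copy2(source, os.path.join(target_dir, name),
                             follow_symlinks=False)
            except OSError as exc:
                # Typically EIO on a damaged file: note it and keep going.
                failed.append((source, str(exc)))
    return failed


if __name__ == "__main__":
    # Hypothetical mount points, for illustration only.
    for path, error in copy_tree_skip_errors("/mnt/old", "/mnt/new"):
        print("FAILED", path, error)
```

Symlinked directories are neither followed nor recreated in this sketch, and data shared between snapshots is duplicated on the destination.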

Thanks,
Qu

> 
> Btrfs seed-sprout would be ideal, however in this case I don't think
> it can help because a.) it's temporarily one file system, which could
> mean the corruption is inherited; and b.) I'm not sure it's multiple
> device aware, so either "btrfstune -S 1" might fail on 2+ device
> Btrfs volumes, or possibly it insists on a two device sprout in order
> to replicate a two device seed.
> 
> If they're not already read-only, it's tricky because it sounds like
> mounting rw is possibly risky, and taking read only snapshots might
> fail anyway. There is no way to make read only snapshots unless the
> volume can be written to; and no way to force a rw subvolume to be
> treated as if it were read only even if the volume is mounted read
> only. And it takes a read only subvolume for send to work.
> 
> 





Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-04 Thread Chris Murphy
On Tue, Sep 4, 2018 at 10:22 AM, Etienne Champetier wrote:

> Do you have a procedure to copy all subvolumes and skip errors? (I have
> ~200 snapshots)

If they're already read-only snapshots, then script an iteration of
btrfs send receive to a new volume.
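
One way such an iteration could look, as a hedged Python sketch rather than a tested recovery procedure. It pipes "btrfs send" into "btrfs receive" for each snapshot and keeps going when one fails. The function names, the snapshot paths and the destination mount point are all hypothetical. Each snapshot is sent in full here, so sharing between snapshots is not preserved on the new volume.

```python
import subprocess


def send_receive(snapshot, dest_dir):
    """Pipe `btrfs send SNAPSHOT` into `btrfs receive DEST_DIR`; True on success."""
    sender = subprocess.Popen(["btrfs", "send", snapshot],
                              stdout=subprocess.PIPE)
    receiver = subprocess.Popen(["btrfs", "receive", dest_dir],
                                stdin=sender.stdout)
    # Close our copy of the pipe so the sender notices if the receiver exits.
    sender.stdout.close()
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    return send_rc == 0 and recv_rc == 0


def copy_snapshots(snapshots, dest_dir, transfer=send_receive):
    """Try every snapshot in order; return (succeeded, failed) path lists."""
    succeeded, failed = [], []
    for snap in snapshots:
        try:
            ok = transfer(snap, dest_dir)
        except OSError:
            ok = False  # e.g. the btrfs binary could not be started
        (succeeded if ok else failed).append(snap)
    return succeeded, failed


if __name__ == "__main__":
    # Hypothetical read-only snapshot paths and destination mount point.
    snaps = ["/mnt/old/snapshots/2018-09-01", "/mnt/old/snapshots/2018-09-02"]
    done, broken = copy_snapshots(snaps, "/mnt/new")
    print("sent:", done)
    print("failed:", broken)
```

The transfer argument exists only so the loop can be exercised without touching a real filesystem.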

Btrfs seed-sprout would be ideal, however in this case I don't think
it can help because a.) it's temporarily one file system, which could
mean the corruption is inherited; and b.) I'm not sure it's multiple
device aware, so either "btrfstune -S 1" might fail on 2+ device
Btrfs volumes, or possibly it insists on a two device sprout in order
to replicate a two device seed.

If they're not already read-only, it's tricky because it sounds like
mounting rw is possibly risky, and taking read only snapshots might
fail anyway. There is no way to make read only snapshots unless the
volume can be written to; and no way to force a rw subvolume to be
treated as if it were read only even if the volume is mounted read
only. And it takes a read only subvolume for send to work.


-- 
Chris Murphy


Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-04 Thread Etienne Champetier
Thanks Qu, one last question I think

On Tue, Sep 4, 2018 at 08:33, Qu Wenruo wrote:
>
> On 2018/9/4 7:53 PM, Etienne Champetier wrote:
> > Hi Qu,
> >
> > On Mon, Sep 3, 2018 at 20:27, Qu Wenruo wrote:
> >>
> >> On 2018/9/3 10:18 PM, Etienne Champetier wrote:
> >>> Hello btrfs hackers,
> >>>
> >>> I have a computer acting as backup server with BTRFS RAID1, and I
> >>> would like to know the different options to rebuild this RAID
> >>> (I saw this thread
> >>> https://www.spinics.net/lists/linux-btrfs/msg68679.html but there was
> >>> no raid 1)
> >>>
> >>> # uname -a
> >>> Linux servmaison 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00
> >>> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> >>>
> >>> # btrfs --version
> >>> btrfs-progs v4.4
> >>>
> >>> # dmesg
> >>> ...
> >>> [ 1955.581972] BTRFS critical (device sda2): corrupt leaf, bad key
> >>> order: block=6020235362304,root=1, slot=63
> >>> [ 1955.582299] BTRFS critical (device sda2): corrupt leaf, bad key
> >>> order: block=6020235362304,root=1, slot=63
> >
> > Now running a Fedora 28 install kernel
> >
> > # uname -a
> > Linux servmaison 4.16.3-301.fc28.x86_64 #1 SMP Mon Apr 23 21:59:58 UTC
> > 2018 x86_64 x86_64 x86_64 GNU/Linux
> > # btrfs --version
> > btrfs-progs v4.15.1
>
> Unfortunately, even with the latest btrfs-progs release (v4.17.1, and
> even the devel branch), btrfs check stops checking if the free space
> cache is corrupted.
>
> So we didn't get any useful info from btrfs check.
>
> The following diff would let the check continue (if you really want
> that, rather than starting to salvage your data):
> --
> diff --git a/check/main.c b/check/main.c
> index b361cd7e26a0..4f720163221e 100644
> --- a/check/main.c
> +++ b/check/main.c
> @@ -9885,7 +9885,6 @@ int cmd_check(int argc, char **argv)
>  			error("errors found in free space tree");
>  		else
>  			error("errors found in free space cache");
> -		goto out;
>  	}
>  
>  	/*
> --
>
>
> As for the tree block dump: the corrupted tree block belongs to the
> extent tree, which could be good news (depending on how you define
> GOOD news).
>
> The corruption is not an easy fix; it's not just a swapped slot.
> The corrupted slot (item 64, whose key objectid is 5946810351616) is way
> beyond the extent data range, so btrfs-progs can't fix it easily.
>
> Considering how large the bytenr difference is, and the generation gap
> (53167 vs. the current generation 1555950), the bug happened a long,
> long time ago (days or weeks before 2016-06-04). So it's a little too
> late to fix (unless someone could send me a time machine).
>
> On the other hand, this means any WRITE could easily fail due to the
> corrupted extent tree, but your fs should be OK if mounted RO, so you
> can copy your data out.
>

Do you have a procedure to copy all subvolumes and skip errors? (I have
~200 snapshots)

> >
> >>
> >> Please provide the following dump:
> >>
> >> # btrfs inspect dump-tree -t root /dev/sda2
> >> # btrfs inspect dump-tree -b 6020235362304 /dev/sda2
> >
> > All requested dumps are in this repo:
> > https://github.com/champtar/debugraidbtrfs
> >
> [snip]
> >>
> >> If it's the only problem, "btrfs check --repair" indeed could fix it.
> >
> > Also available in https://github.com/champtar/debugraidbtrfs, here is
> > the "btrfs check --readonly /dev/sda2" output:
> > 
> > checking extents
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad key ordering 63 64
> > bad block 6020235362304
> > ERROR: errors found in extent allocation tree or chunk allocation
> > checking free space cache
> > there is no free space entry for 6011561750528-5942842273792
> > there is no free space entry for 6011561750528-6012044050432
> > cache appears valid but isn't 6010970308608
> > there is no free space entry for 6015529828352-5946810351616
> > there is no free space entry for 6015529828352-6016339017728
> > cache appears valid but isn't 6015265275904
> > there is no free space entry for 6139476623360-6070757146624
> > there is no free space entry for 6139476623360-6139852881920
> > cache appears valid but isn't 6138779140096
> > ERROR: errors found in free space cache
> > Checking filesystem on /dev/sda2
> > UUID: 4917db5e-fc20-4369-9556-83082a32d4cd
> > found 1321120776195 bytes used, error(s) found
> > total csum bytes: 0
> > total tree bytes: 1163182080
> > total fs tree bytes: 0
> > total extent tree bytes: 1161740288
> > btree space waste bytes: 290512355
> > file data blocks allocated: 618135552
> >  referenced 618135552
> > 
>
> 

Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-04 Thread Qu Wenruo


On 2018/9/4 7:53 PM, Etienne Champetier wrote:
> Hi Qu,
> 
> On Mon, Sep 3, 2018 at 20:27, Qu Wenruo wrote:
>>
>> On 2018/9/3 10:18 PM, Etienne Champetier wrote:
>>> Hello btrfs hackers,
>>>
>>> I have a computer acting as backup server with BTRFS RAID1, and I
>>> would like to know the different options to rebuild this RAID
>>> (I saw this thread
>>> https://www.spinics.net/lists/linux-btrfs/msg68679.html but there was
>>> no raid 1)
>>>
>>> # uname -a
>>> Linux servmaison 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00
>>> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> # btrfs --version
>>> btrfs-progs v4.4
>>>
>>> # dmesg
>>> ...
>>> [ 1955.581972] BTRFS critical (device sda2): corrupt leaf, bad key
>>> order: block=6020235362304,root=1, slot=63
>>> [ 1955.582299] BTRFS critical (device sda2): corrupt leaf, bad key
>>> order: block=6020235362304,root=1, slot=63
> 
> Now running a Fedora 28 install kernel
> 
> # uname -a
> Linux servmaison 4.16.3-301.fc28.x86_64 #1 SMP Mon Apr 23 21:59:58 UTC
> 2018 x86_64 x86_64 x86_64 GNU/Linux
> # btrfs --version
> btrfs-progs v4.15.1

Unfortunately, even with the latest btrfs-progs release (v4.17.1, and
even the devel branch), btrfs check stops checking if the free space
cache is corrupted.

So we didn't get any useful info from btrfs check.

The following diff would let the check continue (if you really want
that, rather than starting to salvage your data):
--
diff --git a/check/main.c b/check/main.c
index b361cd7e26a0..4f720163221e 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9885,7 +9885,6 @@ int cmd_check(int argc, char **argv)
 			error("errors found in free space tree");
 		else
 			error("errors found in free space cache");
-		goto out;
 	}
 
 	/*
--


As for the tree block dump: the corrupted tree block belongs to the
extent tree, which could be good news (depending on how you define
GOOD news).

The corruption is not an easy fix; it's not just a swapped slot.
The corrupted slot (item 64, whose key objectid is 5946810351616) is way
beyond the extent data range, so btrfs-progs can't fix it easily.

Considering how large the bytenr difference is, and the generation gap
(53167 vs. the current generation 1555950), the bug happened a long,
long time ago (days or weeks before 2016-06-04). So it's a little too
late to fix (unless someone could send me a time machine).
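
To put a number on the bytenr difference, here is a small worked check in Python. The three pairs are copied from the "there is no free space entry for A-B" lines of the btrfs check output quoted in this thread; the second value of the middle pair is the key objectid 5946810351616 named above. The helper name differing_bits is made up for this sketch. It only does arithmetic on the reported numbers and says nothing about what caused the corruption, which this thread does not establish.

```python
# (first, second) values from the "no free space entry for A-B" lines
# of the quoted "btrfs check --readonly /dev/sda2" output.
PAIRS = [
    (6011561750528, 5942842273792),
    (6015529828352, 5946810351616),
    (6139476623360, 6070757146624),
]


def differing_bits(a, b):
    """Return the bit positions at which a and b differ."""
    x = a ^ b
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


for first, second in PAIRS:
    diff = first - second
    print(f"{first} - {second} = {diff} bytes "
          f"({diff / 2**30:.0f} GiB), differing bits: "
          f"{differing_bits(first, second)}")

# Generation gap mentioned in the mail: current generation minus the
# generation recorded for the corrupted item.
print("generation gap:", 1555950 - 53167)
```

In each pair the second value is smaller than the first by the same amount, which is why the reported ranges look inverted.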

On the other hand, this means any WRITE could easily fail due to the
corrupted extent tree, but your fs should be OK if mounted RO, so you
can copy your data out.

> 
>>
>> Please provide the following dump:
>>
>> # btrfs inspect dump-tree -t root /dev/sda2
>> # btrfs inspect dump-tree -b 6020235362304 /dev/sda2
> 
> All requested dumps are in this repo:
> https://github.com/champtar/debugraidbtrfs
> 
[snip]
>>
>> If it's the only problem, "btrfs check --repair" indeed could fix it.
> 
> Also available in https://github.com/champtar/debugraidbtrfs, here is
> the "btrfs check --readonly /dev/sda2" output:
> 
> checking extents
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad key ordering 63 64
> bad block 6020235362304
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache
> there is no free space entry for 6011561750528-5942842273792
> there is no free space entry for 6011561750528-6012044050432
> cache appears valid but isn't 6010970308608
> there is no free space entry for 6015529828352-5946810351616
> there is no free space entry for 6015529828352-6016339017728
> cache appears valid but isn't 6015265275904
> there is no free space entry for 6139476623360-6070757146624
> there is no free space entry for 6139476623360-6139852881920
> cache appears valid but isn't 6138779140096
> ERROR: errors found in free space cache
> Checking filesystem on /dev/sda2
> UUID: 4917db5e-fc20-4369-9556-83082a32d4cd
> found 1321120776195 bytes used, error(s) found
> total csum bytes: 0
> total tree bytes: 1163182080
> total fs tree bytes: 0
> total extent tree bytes: 1161740288
> btree space waste bytes: 290512355
> file data blocks allocated: 618135552
>  referenced 618135552
> 

As expected, btrfs-progs is unable to fix it.

> 
> Thanks
> Etienne
> 
> P.S: sorry for the initial duplicate email, it took a very long time
> to show up in https://www.spinics.net/lists/linux-btrfs/maillist.html,
> thought it was discarded as I was not subscribed to the list

It's pretty common; I have even sent patches twice for the same reason.

And just another kindly note, for "btrfs check" or "btrfs inspect

Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-03 Thread Qu Wenruo


On 2018/9/3 10:18 PM, Etienne Champetier wrote:
> Hello btrfs hackers,
> 
> I have a computer acting as backup server with BTRFS RAID1, and I
> would like to know the different options to rebuild this RAID
> (I saw this thread
> https://www.spinics.net/lists/linux-btrfs/msg68679.html but there was
> no raid 1)
> 
> # uname -a
> Linux servmaison 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00
> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
> 
> # btrfs --version
> btrfs-progs v4.4
> 
> # dmesg
> ...
> [ 1955.581972] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63
> [ 1955.582299] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63

Please provide the following dump:

# btrfs inspect dump-tree -t root /dev/sda2
# btrfs inspect dump-tree -b 6020235362304 /dev/sda2

The first one inspects the current root tree to find any corruption.
The second one shows the corrupted tree block, to compare with the
first output and check whether it's already in the committed tree.

The critical error message already shows what's causing the bug:
your root tree is corrupted, some keys are not in the correct order.

All the following errors could be caused by this problem.
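
For readers unfamiliar with the message: every item in a btrfs leaf has a key (objectid, type, offset), and the keys in a leaf must be strictly increasing when compared field by field in that order. A minimal Python sketch of that ordering check follows. The function name first_bad_slot and the sample keys are made up for illustration and are not taken from the dumps in this thread.

```python
def first_bad_slot(keys):
    """Return the first slot i with keys[i] >= keys[i + 1], or None if ordered.

    Each key is an (objectid, type, offset) tuple; Python compares tuples
    field by field, left to right, which is the ordering a leaf needs.
    """
    for slot in range(len(keys) - 1):
        if keys[slot] >= keys[slot + 1]:
            return slot
    return None


if __name__ == "__main__":
    # Hypothetical leaf: the third key has an objectid far below its
    # neighbours, so the check trips at slot 1 (slot 1 vs. slot 2).
    sample = [
        (1000, 168, 4096),
        (2000, 168, 4096),
        (1500, 168, 4096),
        (3000, 168, 4096),
    ]
    print("first bad slot:", first_bad_slot(sample))
```

A report of "slot=63", or "bad key ordering 63 64" from btrfs check, corresponds to this kind of comparison failing between items 63 and 64 of the block.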

[snip]
> 
> # btrfs fi show /
> Label: none  uuid: 4917db5e-fc20-4369-9556-83082a32d4cd
> Total devices 2 FS bytes used 2.25TiB
> devid    1 size 3.64TiB used 2.34TiB path /dev/sda2
> devid    2 size 3.64TiB used 2.34TiB path /dev/sdb2
> 
> # btrfs device stats /
> [/dev/sda2].write_io_errs   0
> [/dev/sda2].read_io_errs    0
> [/dev/sda2].flush_io_errs   0
> [/dev/sda2].corruption_errs 0
> [/dev/sda2].generation_errs 0
> [/dev/sdb2].write_io_errs   0
> [/dev/sdb2].read_io_errs    0
> [/dev/sdb2].flush_io_errs   0
> [/dev/sdb2].corruption_errs 0
> [/dev/sdb2].generation_errs 0
> 
> device stats report no errors :(
> 
> # btrfs fi df /
> Data, RAID1: total=2.32TiB, used=2.23TiB
> System, RAID1: total=96.00MiB, used=368.00KiB
> Metadata, RAID1: total=22.00GiB, used=19.12GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> # btrfs scrub status /
> scrub status for 4917db5e-fc20-4369-9556-83082a32d4cd
> scrub started at Mon Sep  3 05:32:52 2018, interrupted after
> 00:27:35, not running
> total bytes scrubbed: 514.05GiB with 0 errors
> 
> I've already tried 2 times to run btrfs scrub (after reboot), but it
> stops before the end, with the previous dmesg error
> 
> My question is what is the safest way to rebuild this BTRFS RAID1?

It depends on the inspect dump-tree output.

> I haven't tried "btrfs check --repair" yet

We need "btrfs check --readonly" output to verify if the bad key order
is the only problem.

If it's the only problem, "btrfs check --repair" indeed could fix it.

Thanks,
Qu

> (I can boot on a more up to date Linux live if it helps)
> 
> Thanks
> Etienne
> 





Re: RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-03 Thread Chris Murphy
On Mon, Sep 3, 2018 at 7:52 AM, Etienne Champetier wrote:
> Hello linux-btrfs,
>
> I have a computer acting as backup server with BTRFS RAID1, and I
> would like to know the different options to rebuild this RAID
> (I saw this thread
> https://www.spinics.net/lists/linux-btrfs/msg68679.html but there was
> no raid 1)
>
> # uname -a
> Linux servmaison 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00
> UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
> # btrfs --version
> btrfs-progs v4.4
>
> # dmesg
> ...
> [ 1955.581972] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63
> [ 1955.582299] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63
> [ 1955.582414] ------------[ cut here ]------------
> [ 1955.582452] WARNING: CPU: 0 PID: 2071 at
> /build/linux-osVS4h/linux-4.4.0/fs/btrfs/extent-tree.c:2930
> btrfs_run_delayed_refs+0x26b/0x2a0 [btrfs]()
> [ 1955.582454] BTRFS: Transaction aborted (error -5)
> [ 1955.582456] Modules linked in: eeepc_wmi asus_wmi sparse_keymap
> ppdev intel_rapl x86_pkg_temp_thermal snd_hda_codec_hdmi
> snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic coretemp
> snd_hda_intel snd_hda_codec bridge kvm_intel crct10dif_pclmul stp
> crc32_pclmul kvm snd_hda_core snd_hwdep llc ghash_clmulni_intel
> irqbypass snd_pcm input_leds serio_raw snd_timer 8250_fintek snd
> mei_me ie31200_edac mei lpc_ich mac_hid soundcore edac_core parport_pc
> shpchp parport ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core
> ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4
> btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
> async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
> hid_generic usbhid pata_acpi hid i915 aesni_intel i2c_algo_bit
> aes_x86_64 glue_helper
> [ 1955.582509]  drm_kms_helper lrw gf128mul ablk_helper syscopyarea
> cryptd sysfillrect sysimgblt fb_sys_fops ahci drm r8169 libahci mii
> wmi fjes video
> [ 1955.582522] CPU: 0 PID: 2071 Comm: kworker/u8:1 Not tainted
> 4.4.0-134-generic #160-Ubuntu
> [ 1955.582524] Hardware name: System manufacturer System Product
> Name/P8H77-M PRO, BIOS 1003 10/12/2012
> [ 1955.582546] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
> [ 1955.582548]  0286 e1236dd013ef459f 88034938fc98
> 814039f3
> [ 1955.582552]  88034938fce0 c03ff478 88034938fcd0
> 81084982
> [ 1955.582555]  880405a62980 8804076f7800 8803e6c6d0a0
> 02ec
> [ 1955.582558] Call Trace:
> [ 1955.582566]  [] dump_stack+0x63/0x90
> [ 1955.582571]  [] warn_slowpath_common+0x82/0xc0
> [ 1955.582574]  [] warn_slowpath_fmt+0x5c/0x80
> [ 1955.582592]  [] ?
> __btrfs_run_delayed_refs+0xce7/0x1220 [btrfs]
> [ 1955.582608]  [] btrfs_run_delayed_refs+0x26b/0x2a0 
> [btrfs]
> [ 1955.582624]  [] delayed_ref_async_start+0x37/0x90 [btrfs]
> [ 1955.582643]  [] btrfs_scrubparity_helper+0xcf/0x320 
> [btrfs]
> [ 1955.582661]  [] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
> [ 1955.582666]  [] process_one_work+0x16b/0x490
> [ 1955.582670]  [] worker_thread+0x4b/0x4d0
> [ 1955.582674]  [] ? process_one_work+0x490/0x490
> [ 1955.582677]  [] kthread+0xe7/0x100
> [ 1955.582680]  [] ? kthread_create_on_node+0x1e0/0x1e0
> [ 1955.582685]  [] ret_from_fork+0x55/0x80
> [ 1955.582689]  [] ? kthread_create_on_node+0x1e0/0x1e0
> [ 1955.582691] ---[ end trace cc65b5ec2d2430fc ]---
> [ 1955.582694] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:2930: errno=-5 IO failure
> [ 1955.582743] BTRFS info (device sda2): forced readonly
> [ 1955.595017] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63
> [ 1955.595106] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:2930: errno=-5 IO failure
> [ 1955.604374] BTRFS critical (device sda2): corrupt leaf, bad key
> order: block=6020235362304,root=1, slot=63
> [ 1955.60] BTRFS: error (device sda2) in
> btrfs_run_delayed_refs:2930: errno=-5 IO failure
> [ 1955.605331] BTRFS warning (device sda2): failed setting block group
> ro, ret=-30
> [ 1955.605334] BTRFS warning (device sda2): failed setting block group
> ro, ret=-30
>
> # btrfs fi show /
> Label: none  uuid: 4917db5e-fc20-4369-9556-83082a32d4cd
> Total devices 2 FS bytes used 2.25TiB
> devid    1 size 3.64TiB used 2.34TiB path /dev/sda2
> devid    2 size 3.64TiB used 2.34TiB path /dev/sdb2
>
> # btrfs device stats /
> [/dev/sda2].write_io_errs   0
> [/dev/sda2].read_io_errs    0
> [/dev/sda2].flush_io_errs   0
> [/dev/sda2].corruption_errs 0
> [/dev/sda2].generation_errs 0
> [/dev/sdb2].write_io_errs   0
> [/dev/sdb2].read_io_errs    0
> [/dev/sdb2].flush_io_errs   0
> [/dev/sdb2].corruption_errs 0
> [/dev/sdb2].generation_errs 0
>
> device stats report no errors :(
>
> # btrfs fi df /
> Data, RAID1: total=2.32TiB, used=2.23TiB
> System, RAID1: total=96.00MiB, used=368.00KiB
> Metadata, RAID1: total=22.00GiB, used=19.12GiB
> 

RAID1 & BTRFS critical (device sda2): corrupt leaf, bad key order

2018-09-03 Thread Etienne Champetier
Hello btrfs hackers,

I have a computer acting as backup server with BTRFS RAID1, and I
would like to know the different options to rebuild this RAID
(I saw this thread
https://www.spinics.net/lists/linux-btrfs/msg68679.html but there was
no raid 1)

# uname -a
Linux servmaison 4.4.0-134-generic #160-Ubuntu SMP Wed Aug 15 14:58:00
UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.4

# dmesg
...
[ 1955.581972] BTRFS critical (device sda2): corrupt leaf, bad key
order: block=6020235362304,root=1, slot=63
[ 1955.582299] BTRFS critical (device sda2): corrupt leaf, bad key
order: block=6020235362304,root=1, slot=63
[ 1955.582414] ------------[ cut here ]------------
[ 1955.582452] WARNING: CPU: 0 PID: 2071 at
/build/linux-osVS4h/linux-4.4.0/fs/btrfs/extent-tree.c:2930
btrfs_run_delayed_refs+0x26b/0x2a0 [btrfs]()
[ 1955.582454] BTRFS: Transaction aborted (error -5)
[ 1955.582456] Modules linked in: eeepc_wmi asus_wmi sparse_keymap
ppdev intel_rapl x86_pkg_temp_thermal snd_hda_codec_hdmi
snd_hda_codec_realtek intel_powerclamp snd_hda_codec_generic coretemp
snd_hda_intel snd_hda_codec bridge kvm_intel crct10dif_pclmul stp
crc32_pclmul kvm snd_hda_core snd_hwdep llc ghash_clmulni_intel
irqbypass snd_pcm input_leds serio_raw snd_timer 8250_fintek snd
mei_me ie31200_edac mei lpc_ich mac_hid soundcore edac_core parport_pc
shpchp parport ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core
ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4
btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
hid_generic usbhid pata_acpi hid i915 aesni_intel i2c_algo_bit
aes_x86_64 glue_helper
[ 1955.582509]  drm_kms_helper lrw gf128mul ablk_helper syscopyarea
cryptd sysfillrect sysimgblt fb_sys_fops ahci drm r8169 libahci mii
wmi fjes video
[ 1955.582522] CPU: 0 PID: 2071 Comm: kworker/u8:1 Not tainted
4.4.0-134-generic #160-Ubuntu
[ 1955.582524] Hardware name: System manufacturer System Product
Name/P8H77-M PRO, BIOS 1003 10/12/2012
[ 1955.582546] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[ 1955.582548]  0286 e1236dd013ef459f 88034938fc98
814039f3
[ 1955.582552]  88034938fce0 c03ff478 88034938fcd0
81084982
[ 1955.582555]  880405a62980 8804076f7800 8803e6c6d0a0
02ec
[ 1955.582558] Call Trace:
[ 1955.582566]  [] dump_stack+0x63/0x90
[ 1955.582571]  [] warn_slowpath_common+0x82/0xc0
[ 1955.582574]  [] warn_slowpath_fmt+0x5c/0x80
[ 1955.582592]  [] ?
__btrfs_run_delayed_refs+0xce7/0x1220 [btrfs]
[ 1955.582608]  [] btrfs_run_delayed_refs+0x26b/0x2a0 [btrfs]
[ 1955.582624]  [] delayed_ref_async_start+0x37/0x90 [btrfs]
[ 1955.582643]  [] btrfs_scrubparity_helper+0xcf/0x320 [btrfs]
[ 1955.582661]  [] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[ 1955.582666]  [] process_one_work+0x16b/0x490
[ 1955.582670]  [] worker_thread+0x4b/0x4d0
[ 1955.582674]  [] ? process_one_work+0x490/0x490
[ 1955.582677]  [] kthread+0xe7/0x100
[ 1955.582680]  [] ? kthread_create_on_node+0x1e0/0x1e0
[ 1955.582685]  [] ret_from_fork+0x55/0x80
[ 1955.582689]  [] ? kthread_create_on_node+0x1e0/0x1e0
[ 1955.582691] ---[ end trace cc65b5ec2d2430fc ]---
[ 1955.582694] BTRFS: error (device sda2) in
btrfs_run_delayed_refs:2930: errno=-5 IO failure
[ 1955.582743] BTRFS info (device sda2): forced readonly
[ 1955.595017] BTRFS critical (device sda2): corrupt leaf, bad key
order: block=6020235362304,root=1, slot=63
[ 1955.595106] BTRFS: error (device sda2) in
btrfs_run_delayed_refs:2930: errno=-5 IO failure
[ 1955.604374] BTRFS critical (device sda2): corrupt leaf, bad key
order: block=6020235362304,root=1, slot=63
[ 1955.60] BTRFS: error (device sda2) in
btrfs_run_delayed_refs:2930: errno=-5 IO failure
[ 1955.605331] BTRFS warning (device sda2): failed setting block group
ro, ret=-30
[ 1955.605334] BTRFS warning (device sda2): failed setting block group
ro, ret=-30

# btrfs fi show /
Label: none  uuid: 4917db5e-fc20-4369-9556-83082a32d4cd
Total devices 2 FS bytes used 2.25TiB
devid    1 size 3.64TiB used 2.34TiB path /dev/sda2
devid    2 size 3.64TiB used 2.34TiB path /dev/sdb2

# btrfs device stats /
[/dev/sda2].write_io_errs   0
[/dev/sda2].read_io_errs    0
[/dev/sda2].flush_io_errs   0
[/dev/sda2].corruption_errs 0
[/dev/sda2].generation_errs 0
[/dev/sdb2].write_io_errs   0
[/dev/sdb2].read_io_errs    0
[/dev/sdb2].flush_io_errs   0
[/dev/sdb2].corruption_errs 0
[/dev/sdb2].generation_errs 0

device stats report no errors :(

# btrfs fi df /
Data, RAID1: total=2.32TiB, used=2.23TiB
System, RAID1: total=96.00MiB, used=368.00KiB
Metadata, RAID1: total=22.00GiB, used=19.12GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

# btrfs scrub status /
scrub status for 4917db5e-fc20-4369-9556-83082a32d4cd
scrub started at Mon Sep  3 05:32:52 2018, interrupted after
00:27:35, not running
total bytes scrubbed: 514.05GiB with 0 errors

I've already tried 
