On 2/24/21 9:23 AM, Neal Gompa wrote:
On Tue, Feb 23, 2021 at 10:05 AM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/22/21 11:03 PM, Neal Gompa wrote:
On Mon, Feb 22, 2021 at 2:34 PM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/21/21 1:27 PM, Neal Gompa wrote:
On Wed, Feb 17, 2021 at 11:44 AM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/17/21 11:29 AM, Neal Gompa wrote:
On Wed, Feb 17, 2021 at 9:59 AM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/17/21 9:50 AM, Neal Gompa wrote:
On Wed, Feb 17, 2021 at 9:36 AM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/16/21 9:05 PM, Neal Gompa wrote:
On Tue, Feb 16, 2021 at 4:24 PM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/16/21 3:29 PM, Neal Gompa wrote:
On Tue, Feb 16, 2021 at 1:11 PM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/16/21 11:27 AM, Neal Gompa wrote:
On Tue, Feb 16, 2021 at 10:19 AM Josef Bacik <jo...@toxicpanda.com> wrote:

On 2/14/21 3:25 PM, Neal Gompa wrote:
Hey all,

So one of my main computers recently had a disk controller failure
that caused my machine to freeze. After rebooting, Btrfs refuses to
mount. I tried to do a mount and the following errors show up in the
journal:

Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3): disk space caching is enabled
Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3): has skinny extents
Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657, invalid inode transid: has 888896 expect [0, 888895]
Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): block=796082176 read time tree block corruption detected
Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657, invalid inode transid: has 888896 expect [0, 888895]
Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): block=796082176 read time tree block corruption detected
Feb 14 15:20:49 localhost-live kernel: BTRFS warning (device sda3): couldn't read tree root
Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): open_ctree failed
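
To make the tree-checker complaint above easier to read, here is a minimal shell sketch that pulls the fields out of a "corrupt leaf" line and reports how far the inode's transid sits beyond the accepted range. The function name explain_transid is made up for illustration (it is not part of btrfs-progs), and the pattern assumes the exact message wording shown in this journal, with each message on a single unwrapped line.

```shell
#!/bin/sh
# Hypothetical helper, not part of btrfs-progs: reads kernel log lines on
# stdin and summarises every "invalid inode transid" report it finds.
explain_transid() {
    # Capture root, block, slot, ino, the transid found, and the upper bound.
    sed -nE 's/.*corrupt leaf: root=([0-9]+) block=([0-9]+) slot=([0-9]+) ino=([0-9]+), invalid inode transid: has ([0-9]+) expect \[0, ([0-9]+)\].*/\1 \2 \4 \5 \6/p' |
    while read -r root block ino has max; do
        # ahead_by is how many transactions past the accepted maximum we are.
        echo "root=$root block=$block ino=$ino transid=$has max=$max ahead_by=$((has - max))"
    done
}
```

Feeding it the journal (for example with journalctl -k | explain_transid) shows that inode 203657 carries a transid exactly one past the upper bound the checker accepts.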

I've tried mounting with -o recovery,ro and get the same errors. I can't
find any good information on how to recover in this scenario, even just
enough to copy data off.

I'm on Fedora 33, the system was on Linux kernel version 5.9.16 and
the Fedora 33 live ISO I'm using has Linux kernel version 5.10.14. I'm
using btrfs-progs v5.10.

Can anyone help?

Can you try

btrfs check --clear-space-cache v1 /dev/whatever

That should fix the inode generation thing so it's sane, and then the tree
checker will allow the fs to be read, hopefully.  If not we can work out some
other magic.  Thanks,

Josef

I got the same error as I did with btrfs check --readonly...


Oh lovely, what does btrfs check --readonly --backup do?


No dice...

# btrfs check --readonly --backup /dev/sda3
Opening filesystem to check...
parent transid verify failed on 791281664 wanted 888893 found 888895
parent transid verify failed on 791281664 wanted 888893 found 888895
parent transid verify failed on 791281664 wanted 888893 found 888895

Hey look, the block we're looking for. I wrote you some magic, just pull

https://github.com/josefbacik/btrfs-progs/tree/for-neal

build, and then run

btrfs-neal-magic /dev/sda3 791281664 888895

This will force us to point at the old root with (hopefully) the right bytenr
and gen, and then hopefully you'll be able to recover from there.  This is kind
of saucy, so yolo, but I can undo it if it makes things worse.  Thanks,
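
For anyone following along, the two numbers passed to btrfs-neal-magic above match the block number and the "found" generation in the "parent transid verify failed" lines. A small sketch of pulling them out of the check output follows. The helper name transid_mismatch_args is hypothetical, and reading the arguments as "bytenr, then generation" is inferred from the command and the "bytenr and gen" remark above, not from any documentation of the tool.

```shell
#!/bin/sh
# Hypothetical helper: reads "btrfs check" output on stdin and prints
# "<bytenr> <generation found>" once for each distinct failing block.
transid_mismatch_args() {
    sed -nE 's/.*parent transid verify failed on ([0-9]+) wanted ([0-9]+) found ([0-9]+).*/\1 \3/p' |
    sort -u   # the same message is usually repeated several times
}
```

Usage would look something like: btrfs check --readonly --backup /dev/sda3 2>&1 | transid_mismatch_args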


# btrfs check --readonly /dev/sda3
Opening filesystem to check...
ERROR: could not setup extent tree
ERROR: cannot open file system
# btrfs check --clear-space-cache v1 /dev/sda3
Opening filesystem to check...
ERROR: could not setup extent tree
ERROR: cannot open file system

It's better, but still no dice... :(



Hmm it's not telling us what's wrong with the extent tree, which is annoying.
Does mount -o rescue=all,ro work now that the root tree is normal?  Thanks,


Nope, I see this in the journal:

Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): enabling all of the rescue options
Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring data csums
Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring bad roots
Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disabling log replay at mount time
Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disk space caching is enabled
Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): has skinny extents
Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree level mismatch detected, bytenr=791281664 level expected=1 has=2
Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree level mismatch detected, bytenr=791281664 level expected=1 has=2
Feb 17 09:49:40 localhost-live kernel: BTRFS warning (device sda3): couldn't read tree root
Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): open_ctree failed



Ok git pull for-neal, rebuild, then run

btrfs-neal-magic /dev/sda3 791281664 888895 2

I thought of this yesterday but in my head was like "naaahhhh, what are the chances
that the level doesn't match??".  Thanks,
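
The extra trailing 2 in that command lines up with the "has=2" value in the "tree level mismatch" journal lines. As a rough sketch, the block number and the level actually present can be extracted like this. The helper name level_mismatch_args is hypothetical, and treating the fourth argument as the tree level is inferred from the thread, not from the tool's source.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log lines on stdin and prints
# "<bytenr> <level actually found>" once for each distinct block.
level_mismatch_args() {
    sed -nE 's/.*tree level mismatch detected, bytenr=([0-9]+) level expected=([0-9]+) has=([0-9]+).*/\1 \3/p' |
    sort -u   # the kernel logs the mismatch more than once per mount attempt
}
```

Usage would look something like: journalctl -k | level_mismatch_args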


Tried rescue mount again after running that and got a stack trace in
the kernel, detailed in the following attached log.

Huh, I wonder how I didn't hit this when testing. I must have only tested with
zeroing the extent root and the csum root.  You're going to have to build a
kernel with a fix for this

https://paste.centos.org/view/7b48aaea

and see if that gets you further.  Thanks,


I built a kernel as an RPM with your patch[1] and tried it.

[root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sdb3 /mnt
Killed

The log from the journal is attached.


Ahh crud my bad, this should do it

https://paste.centos.org/view/ac2e61ef


Patch doesn't apply (note it is patch 667 below):

Ah sorry, should have just sent you an iterative patch.  You can take the above
patch and just delete the hunk from volumes.c as you already have that applied
and then it'll work.  Thanks,
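
Dropping one file's changes from a patch can also be scripted instead of edited by hand. The sketch below assumes the patch is in git format, where each file's section starts with a "diff --git" line; the layout of the pasted patch itself is not shown in this thread, so that is an assumption. The function name drop_file_from_patch is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads a git-format patch on stdin and writes it to
# stdout without the section for any file whose "diff --git" header line
# contains the string given as $1.
drop_file_from_patch() {
    awk -v target="$1" '
        # Each "diff --git" line starts a new per-file section; decide there
        # whether this section should be skipped.
        /^diff --git / { skip = (index($0, target) > 0) }
        # Lines before the first section (mail headers, commit message) are kept.
        !skip { print }
    '
}
```

For example, drop_file_from_patch volumes.c < full.patch > rest.patch, where full.patch and rest.patch are placeholder file names.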


Failed with a weird error...?

[root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sda3 /mnt
mount: /mnt: mount(2) system call failed: No such file or directory.

Journal log with traceback attached.

Last one maybe?

https://paste.centos.org/view/80edd6fd

thanks,

Josef
