On Tue, Feb 23, 2021 at 10:05 AM Josef Bacik <jo...@toxicpanda.com> wrote:
>
> On 2/22/21 11:03 PM, Neal Gompa wrote:
> > On Mon, Feb 22, 2021 at 2:34 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>
> >> On 2/21/21 1:27 PM, Neal Gompa wrote:
> >>> On Wed, Feb 17, 2021 at 11:44 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>
> >>>> On 2/17/21 11:29 AM, Neal Gompa wrote:
> >>>>> On Wed, Feb 17, 2021 at 9:59 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>
> >>>>>> On 2/17/21 9:50 AM, Neal Gompa wrote:
> >>>>>>> On Wed, Feb 17, 2021 at 9:36 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>>>
> >>>>>>>> On 2/16/21 9:05 PM, Neal Gompa wrote:
> >>>>>>>>> On Tue, Feb 16, 2021 at 4:24 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> On 2/16/21 3:29 PM, Neal Gompa wrote:
> >>>>>>>>>>> On Tue, Feb 16, 2021 at 1:11 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 2/16/21 11:27 AM, Neal Gompa wrote:
> >>>>>>>>>>>>> On Tue, Feb 16, 2021 at 10:19 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 2/14/21 3:25 PM, Neal Gompa wrote:
> >>>>>>>>>>>>>>> Hey all,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> So one of my main computers recently had a disk controller failure
> >>>>>>>>>>>>>>> that caused my machine to freeze. After rebooting, Btrfs refuses to
> >>>>>>>>>>>>>>> mount. I tried to do a mount and the following errors show up in the
> >>>>>>>>>>>>>>> journal:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3): disk space caching is enabled
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3): has skinny extents
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657, invalid inode transid: has 888896 expect [0, 888895]
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): block=796082176 read time tree block corruption detected
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657, invalid inode transid: has 888896 expect [0, 888895]
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): block=796082176 read time tree block corruption detected
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS warning (device sda3): couldn't read tree root
> >>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3): open_ctree failed
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I've tried to do -o recovery,ro mount and get the same issue. I can't
> >>>>>>>>>>>>>>> seem to find any reasonably good information on how to do recovery in
> >>>>>>>>>>>>>>> this scenario, even to just recover enough to copy data off.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm on Fedora 33, the system was on Linux kernel version 5.9.16 and
> >>>>>>>>>>>>>>> the Fedora 33 live ISO I'm using has Linux kernel version 5.10.14. I'm
> >>>>>>>>>>>>>>> using btrfs-progs v5.10.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Can anyone help?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Can you try
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> btrfs check --clear-space-cache v1 /dev/whatever
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> That should fix the inode generation thing so it's sane, and then the tree
> >>>>>>>>>>>>>> checker will allow the fs to be read, hopefully.  If not we can work out some
> >>>>>>>>>>>>>> other magic.  Thanks,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Josef
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I got the same error as I did with btrfs-check --readonly...
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Oh lovely, what does btrfs check --readonly --backup do?
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> No dice...
> >>>>>>>>>>>
> >>>>>>>>>>> # btrfs check --readonly --backup /dev/sda3
> >>>>>>>>>>>> Opening filesystem to check...
> >>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>>>>>>>
> >>>>>>>>>> Hey look the block we're looking for, I wrote you some magic, just pull
> >>>>>>>>>>
> >>>>>>>>>> https://github.com/josefbacik/btrfs-progs/tree/for-neal
> >>>>>>>>>>
> >>>>>>>>>> build, and then run
> >>>>>>>>>>
> >>>>>>>>>> btrfs-neal-magic /dev/sda3 791281664 888895
> >>>>>>>>>>
> >>>>>>>>>> This will force us to point at the old root with (hopefully) the right bytenr
> >>>>>>>>>> and gen, and then hopefully you'll be able to recover from there.  This is kind
> >>>>>>>>>> of saucy, so yolo, but I can undo it if it makes things worse.  Thanks,
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> # btrfs check --readonly /dev/sda3
> >>>>>>>>>> Opening filesystem to check...
> >>>>>>>>>> ERROR: could not setup extent tree
> >>>>>>>>>> ERROR: cannot open file system
> >>>>>>>>> # btrfs check --clear-space-cache v1 /dev/sda3
> >>>>>>>>>> Opening filesystem to check...
> >>>>>>>>>> ERROR: could not setup extent tree
> >>>>>>>>>> ERROR: cannot open file system
> >>>>>>>>>
> >>>>>>>>> It's better, but still no dice... :(
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Hmm it's not telling us what's wrong with the extent tree, which is annoying.
> >>>>>>>> Does mount -o rescue=all,ro work now that the root tree is normal?  Thanks,
> >>>>>>>>
> >>>>>>>
> >>>>>>> Nope, I see this in the journal:
> >>>>>>>
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): enabling all of the rescue options
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring data csums
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring bad roots
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disabling log replay at mount time
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disk space caching is enabled
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): has skinny extents
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree level mismatch detected, bytenr=791281664 level expected=1 has=2
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree level mismatch detected, bytenr=791281664 level expected=1 has=2
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS warning (device sda3): couldn't read tree root
> >>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): open_ctree failed
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> Ok git pull for-neal, rebuild, then run
> >>>>>>
> >>>>>> btrfs-neal-magic /dev/sda3 791281664 888895 2
> >>>>>>
> >>>>>> I thought of this yesterday but in my head was like "naaahhhh, what are the
> >>>>>> chances that the level doesn't match??".  Thanks,
> >>>>>>
> >>>>>
> >>>>> Tried rescue mount again after running that and got a stack trace in
> >>>>> the kernel, detailed in the following attached log.
> >>>>
> >>>> Huh I wonder how I didn't hit this when testing, I must have only tested with
> >>>> zeroing the extent root and the csum root.  You're going to have to build a
> >>>> kernel with a fix for this
> >>>>
> >>>> https://paste.centos.org/view/7b48aaea
> >>>>
> >>>> and see if that gets you further.  Thanks,
> >>>>
> >>>
> >>> I built a kernel as an RPM with your patch[1] and tried it.
> >>>
> >>> [root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sdb3 /mnt
> >>> Killed
> >>>
> >>> The log from the journal is attached.
> >>
> >>
> >> Ahh crud my bad, this should do it
> >>
> >> https://paste.centos.org/view/ac2e61ef
> >>
> >
> > Patch doesn't apply (note it is patch 667 below):
>
> Ah sorry, should have just sent you an iterative patch.  You can take the above
> patch and just delete the hunk from volumes.c as you already have that applied
> and then it'll work.  Thanks,
>

The mount still fails, now with a weird error...?

[root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sda3 /mnt
mount: /mnt: mount(2) system call failed: No such file or directory.

Journal log with traceback attached.



-- 
真実はいつも一つ!/ Always, there's only one truth!
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): enabling all of the rescue options
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): ignoring data csums
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): ignoring bad roots
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): disabling log replay at mount time
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): disk space caching is enabled
Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): has skinny extents
Feb 24 09:12:41 fedora kernel: we tried to search with a NULL root
Feb 24 09:12:41 fedora kernel: CPU: 0 PID: 1760 Comm: mount Not tainted 5.11.0-155.nealbtrfstest.1.fc34.x86_64 #1
Feb 24 09:12:41 fedora kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
Feb 24 09:12:41 fedora kernel: Call Trace:
Feb 24 09:12:41 fedora kernel:  dump_stack+0x6b/0x83
Feb 24 09:12:41 fedora kernel:  btrfs_search_slot.cold+0x11/0x1b
Feb 24 09:12:41 fedora kernel:  ? btrfs_init_dev_replace+0x36/0x450
Feb 24 09:12:41 fedora kernel:  btrfs_init_dev_replace+0x71/0x450
Feb 24 09:12:41 fedora kernel:  open_ctree+0x1054/0x1610
Feb 24 09:12:41 fedora kernel:  btrfs_mount_root.cold+0x13/0xfa
Feb 24 09:12:41 fedora kernel:  legacy_get_tree+0x27/0x40
Feb 24 09:12:41 fedora kernel:  vfs_get_tree+0x25/0xb0
Feb 24 09:12:41 fedora kernel:  vfs_kern_mount.part.0+0x71/0xb0
Feb 24 09:12:41 fedora kernel:  btrfs_mount+0x131/0x3d0
Feb 24 09:12:41 fedora kernel:  ? legacy_get_tree+0x27/0x40
Feb 24 09:12:41 fedora kernel:  ? btrfs_show_options+0x640/0x640
Feb 24 09:12:41 fedora kernel:  legacy_get_tree+0x27/0x40
Feb 24 09:12:41 fedora kernel:  vfs_get_tree+0x25/0xb0
Feb 24 09:12:41 fedora kernel:  path_mount+0x441/0xa80
Feb 24 09:12:41 fedora kernel:  __x64_sys_mount+0xf4/0x130
Feb 24 09:12:41 fedora kernel:  do_syscall_64+0x33/0x40
Feb 24 09:12:41 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb 24 09:12:41 fedora kernel: RIP: 0033:0x7f644730352e
Feb 24 09:12:41 fedora kernel: Code: 48 8b 0d 45 19 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 12 19 0c 00 f7 d8 64 89 01 48
Feb 24 09:12:41 fedora kernel: RSP: 002b:00007ffd338bfac8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
Feb 24 09:12:41 fedora kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f644730352e
Feb 24 09:12:41 fedora kernel: RDX: 000055d4c6983690 RSI: 000055d4c6983730 RDI: 000055d4c69836b0
Feb 24 09:12:41 fedora kernel: RBP: 000055d4c6983460 R08: 000055d4c69836f0 R09: 00007f64473c5a60
Feb 24 09:12:41 fedora kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Feb 24 09:12:41 fedora kernel: R13: 000055d4c69836b0 R14: 000055d4c6983690 R15: 000055d4c6983460
Feb 24 09:12:41 fedora kernel: BTRFS warning (device sda3): failed to read fs tree: -2
Feb 24 09:12:41 fedora kernel: BTRFS error (device sda3): open_ctree failed
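
For what it's worth, the -2 in the "failed to read fs tree" warning is a
negated errno, and errno 2 is ENOENT, which would explain why mount(8)
reports "No such file or directory". Below is a rough sketch of how one
might pull the non-info BTRFS lines out of a journal dump and name any
trailing error code. The regexes and the summarize() helper are my own
assumptions for illustration, not part of btrfs-progs or the kernel.

```python
import errno
import re

# Matches kernel journal lines such as:
#   "... kernel: BTRFS warning (device sda3): failed to read fs tree: -2"
BTRFS_LINE = re.compile(
    r"BTRFS (?P<level>info|warning|error|critical) "
    r"\(device (?P<dev>[^)]+)\): (?P<msg>.*)"
)
# A message ending in ": -N" is assumed to carry a negated errno value.
TRAILING_ERRNO = re.compile(r": -(\d+)$")


def summarize(lines):
    """Return (level, device, message, errno_name_or_None) for non-info BTRFS lines."""
    out = []
    for line in lines:
        m = BTRFS_LINE.search(line)
        if not m or m["level"] == "info":
            continue
        code = TRAILING_ERRNO.search(m["msg"])
        # errno.errorcode maps numeric codes to names, e.g. 2 -> "ENOENT".
        name = errno.errorcode.get(int(code.group(1))) if code else None
        out.append((m["level"], m["dev"], m["msg"], name))
    return out


sample = [
    "Feb 24 09:12:41 fedora kernel: BTRFS info (device sda3): has skinny extents",
    "Feb 24 09:12:41 fedora kernel: BTRFS warning (device sda3): failed to read fs tree: -2",
    "Feb 24 09:12:41 fedora kernel: BTRFS error (device sda3): open_ctree failed",
]

for level, dev, msg, name in summarize(sample):
    print(level, dev, msg, name or "")
```

Feeding it the output of journalctl -k saved to a file should work the
same way, as long as the message format matches the lines above.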
