On Wed, Feb 17, 2021 at 9:59 AM Josef Bacik <jo...@toxicpanda.com> wrote:
>
> On 2/17/21 9:50 AM, Neal Gompa wrote:
> > On Wed, Feb 17, 2021 at 9:36 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>
> >> On 2/16/21 9:05 PM, Neal Gompa wrote:
> >>> On Tue, Feb 16, 2021 at 4:24 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>
> >>>> On 2/16/21 3:29 PM, Neal Gompa wrote:
> >>>>> On Tue, Feb 16, 2021 at 1:11 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>
> >>>>>> On 2/16/21 11:27 AM, Neal Gompa wrote:
> >>>>>>> On Tue, Feb 16, 2021 at 10:19 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> >>>>>>>>
> >>>>>>>> On 2/14/21 3:25 PM, Neal Gompa wrote:
> >>>>>>>>> Hey all,
> >>>>>>>>>
> >>>>>>>>> So one of my main computers recently had a disk controller failure
> >>>>>>>>> that caused my machine to freeze. After rebooting, Btrfs refuses to
> >>>>>>>>> mount. I tried to do a mount and the following errors show up in the
> >>>>>>>>> journal:
> >>>>>>>>>
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3):
> >>>>>>>>>> disk space caching is enabled
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info (device sda3):
> >>>>>>>>>> has skinny extents
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device
> >>>>>>>>>> sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657,
> >>>>>>>>>> invalid inode transid: has 888896 expect [0, 888895]
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3):
> >>>>>>>>>> block=796082176 read time tree block corruption detected
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical (device
> >>>>>>>>>> sda3): corrupt leaf: root=401 block=796082176 slot=15 ino=203657,
> >>>>>>>>>> invalid inode transid: has 888896 expect [0, 888895]
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3):
> >>>>>>>>>> block=796082176 read time tree block corruption detected
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS warning (device
> >>>>>>>>>> sda3): couldn't read tree root
> >>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error (device sda3):
> >>>>>>>>>> open_ctree failed
> >>>>>>>>>
> >>>>>>>>> I've tried to do -o recovery,ro mount and get the same issue. I can't
> >>>>>>>>> seem to find any reasonably good information on how to do recovery in
> >>>>>>>>> this scenario, even to just recover enough to copy data off.
> >>>>>>>>>
> >>>>>>>>> I'm on Fedora 33, the system was on Linux kernel version 5.9.16 and
> >>>>>>>>> the Fedora 33 live ISO I'm using has Linux kernel version 5.10.14. I'm
> >>>>>>>>> using btrfs-progs v5.10.
> >>>>>>>>>
> >>>>>>>>> Can anyone help?
> >>>>>>>>
> >>>>>>>> Can you try
> >>>>>>>>
> >>>>>>>> btrfs check --clear-space-cache v1 /dev/whatever
> >>>>>>>>
> >>>>>>>> That should fix the inode generation thing so it's sane, and then the tree
> >>>>>>>> checker will allow the fs to be read, hopefully. If not we can work out some
> >>>>>>>> other magic. Thanks,
> >>>>>>>>
> >>>>>>>> Josef
> >>>>>>>
> >>>>>>> I got the same error as I did with btrfs-check --readonly...
> >>>>>>>
> >>>>>>
> >>>>>> Oh lovely, what does btrfs check --readonly --backup do?
> >>>>>>
> >>>>>
> >>>>> No dice...
> >>>>>
> >>>>> # btrfs check --readonly --backup /dev/sda3
> >>>>>> Opening filesystem to check...
> >>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>>> parent transid verify failed on 791281664 wanted 888893 found 888895
> >>>>
> >>>> Hey look the block we're looking for, I wrote you some magic, just pull
> >>>>
> >>>> https://github.com/josefbacik/btrfs-progs/tree/for-neal
> >>>>
> >>>> build, and then run
> >>>>
> >>>> btrfs-neal-magic /dev/sda3 791281664 888895
> >>>>
> >>>> This will force us to point at the old root with (hopefully) the right bytenr
> >>>> and gen, and then hopefully you'll be able to recover from there. This is kind
> >>>> of saucy, so yolo, but I can undo it if it makes things worse. Thanks,
> >>>>
> >>>
> >>> # btrfs check --readonly /dev/sda3
> >>>> Opening filesystem to check...
> >>>> ERROR: could not setup extent tree
> >>>> ERROR: cannot open file system
> >>> # btrfs check --clear-space-cache v1 /dev/sda3
> >>>> Opening filesystem to check...
> >>>> ERROR: could not setup extent tree
> >>>> ERROR: cannot open file system
> >>>
> >>> It's better, but still no dice... :(
> >>>
> >>>
> >>
> >> Hmm it's not telling us what's wrong with the extent tree, which is annoying.
> >> Does mount -o rescue=all,ro work now that the root tree is normal? Thanks,
> >>
> >
> > Nope, I see this in the journal:
> >
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): enabling
> >> all of the rescue options
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring
> >> data csums
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): ignoring
> >> bad roots
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disabling
> >> log replay at mount time
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): disk
> >> space caching is enabled
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device sda3): has
> >> skinny extents
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree
> >> level mismatch detected, bytenr=791281664 level expected=1 has=2
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3): tree
> >> level mismatch detected, bytenr=791281664 level expected=1 has=2
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS warning (device sda3):
> >> couldn't read tree root
> >> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device sda3):
> >> open_ctree failed
> >
> >
>
> Ok git pull for-neal, rebuild, then run
>
> btrfs-neal-magic /dev/sda3 791281664 888895 2
>
> I thought of this yesterday but in my head was like "naaahhhh, whats the chances
> that the level doesn't match??". Thanks,
>

I tried the rescue mount again after running that and got a stack trace
in the kernel; it is detailed in the attached log below.

--
真実はいつも一つ!/ Always, there's only one truth!

Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): enabling all of the rescue options
Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): ignoring data csums
Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): ignoring bad roots
Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): disabling log replay at mount time
Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): disk space caching is enabled
Feb 17 11:24:35 localhost-live kernel: BTRFS info (device sda3): has skinny extents
Feb 17 11:24:35 localhost-live kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Feb 17 11:24:35 localhost-live kernel: #PF: supervisor read access in kernel mode
Feb 17 11:24:35 localhost-live kernel: #PF: error_code(0x0000) - not-present page
Feb 17 11:24:35 localhost-live kernel: PGD 0 P4D 0
Feb 17 11:24:35 localhost-live kernel: Oops: 0000 [#1] SMP PTI
Feb 17 11:24:35 localhost-live kernel: CPU: 0 PID: 4095 Comm: mount Not tainted 5.11.0-0.rc7.149.fc34.x86_64 #1
Feb 17 11:24:35 localhost-live kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/22/2020
Feb 17 11:24:35 localhost-live kernel: RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
Feb 17 11:24:35 localhost-live kernel: Code: 01 00 00 53 48 83 ec 40 48 8b 47 78 48 8d 54 24 2f c6 44 24 37 f9 48 89 44 24 38 48 8b 47 38 31 ff 48 c7 44 24 2f 00 00 00 00 <48> 8b 70 30 e8 0b b4 fb ff 41 89 c5 85 c0 74 52 49 8d 84 24 40 01
Feb 17 11:24:35 localhost-live kernel: RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
Feb 17 11:24:35 localhost-live kernel: RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
Feb 17 11:24:35 localhost-live kernel: RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
Feb 17 11:24:35 localhost-live kernel: RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
Feb 17 11:24:35 localhost-live kernel: R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
Feb 17 11:24:35 localhost-live kernel: R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
Feb 17 11:24:35 localhost-live kernel: FS: 00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
Feb 17 11:24:35 localhost-live kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 17 11:24:35 localhost-live kernel: CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
Feb 17 11:24:35 localhost-live kernel: Call Trace:
Feb 17 11:24:35 localhost-live kernel: ? btrfs_init_dev_stats+0x1f/0xf0
Feb 17 11:24:35 localhost-live kernel: btrfs_init_dev_stats+0x62/0xf0
Feb 17 11:24:35 localhost-live kernel: open_ctree+0x1019/0x15ff
Feb 17 11:24:35 localhost-live kernel: btrfs_mount_root.cold+0x13/0xfa
Feb 17 11:24:35 localhost-live kernel: legacy_get_tree+0x27/0x40
Feb 17 11:24:35 localhost-live kernel: vfs_get_tree+0x25/0xb0
Feb 17 11:24:35 localhost-live kernel: vfs_kern_mount.part.0+0x71/0xb0
Feb 17 11:24:35 localhost-live kernel: btrfs_mount+0x131/0x3d0
Feb 17 11:24:35 localhost-live kernel: ? legacy_get_tree+0x27/0x40
Feb 17 11:24:35 localhost-live kernel: ? btrfs_show_options+0x640/0x640
Feb 17 11:24:35 localhost-live kernel: legacy_get_tree+0x27/0x40
Feb 17 11:24:35 localhost-live kernel: vfs_get_tree+0x25/0xb0
Feb 17 11:24:35 localhost-live kernel: path_mount+0x441/0xa80
Feb 17 11:24:35 localhost-live kernel: __x64_sys_mount+0xf4/0x130
Feb 17 11:24:35 localhost-live kernel: do_syscall_64+0x33/0x40
Feb 17 11:24:35 localhost-live kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb 17 11:24:35 localhost-live kernel: RIP: 0033:0x7f698c04e52e
Feb 17 11:24:35 localhost-live kernel: Code: 48 8b 0d 45 19 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 12 19 0c 00 f7 d8 64 89 01 48
Feb 17 11:24:35 localhost-live kernel: RSP: 002b:00007ffdf52bc518 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
Feb 17 11:24:35 localhost-live kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f698c04e52e
Feb 17 11:24:35 localhost-live kernel: RDX: 00005603f068e690 RSI: 00005603f068e730 RDI: 00005603f068e6b0
Feb 17 11:24:35 localhost-live kernel: RBP: 00005603f068e460 R08: 00005603f068e6f0 R09: 00007f698c110a60
Feb 17 11:24:35 localhost-live kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
Feb 17 11:24:35 localhost-live kernel: R13: 00005603f068e6b0 R14: 00005603f068e690 R15: 00005603f068e460
Feb 17 11:24:35 localhost-live kernel: Modules linked in: snd_seq_dummy snd_hrtimer uinput nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables rfkill nfnetlink ip6table_filter ip6_tables iptable_filter vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event snd_ens1371 snd_ac97_codec ac97_bus snd_rawmidi snd_seq snd_seq_device snd_pcm intel_rapl_msr snd_timer intel_rapl_common snd pktcdvd vmwgfx rapl soundcore gameport ttm drm_kms_helper vmw_balloon cec pcspkr joydev vmw_vmci i2c_piix4 drm zram ip_tables nls_utf8 isofs squashfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw mptspi mptscsih vmxnet3 mptbase scsi_transport_spi ata_generic pata_acpi sunrpc be2iscsi bnx2i cnic uio
Feb 17 11:24:35 localhost-live kernel: cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi loop fuse scsi_transport_iscsi
Feb 17 11:24:35 localhost-live kernel: CR2: 0000000000000030
Feb 17 11:24:35 localhost-live kernel: ---[ end trace a8d9dde9a9afac6b ]---
Feb 17 11:24:35 localhost-live kernel: RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
Feb 17 11:24:35 localhost-live kernel: Code: 01 00 00 53 48 83 ec 40 48 8b 47 78 48 8d 54 24 2f c6 44 24 37 f9 48 89 44 24 38 48 8b 47 38 31 ff 48 c7 44 24 2f 00 00 00 00 <48> 8b 70 30 e8 0b b4 fb ff 41 89 c5 85 c0 74 52 49 8d 84 24 40 01
Feb 17 11:24:35 localhost-live kernel: RSP: 0018:ffffa60285fbfb68 EFLAGS: 00010246
Feb 17 11:24:35 localhost-live kernel: RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
Feb 17 11:24:35 localhost-live kernel: RDX: ffffa60285fbfb97 RSI: ffff88b82e7a2a10 RDI: 0000000000000000
Feb 17 11:24:35 localhost-live kernel: RBP: ffff88b88f806b3c R08: 0000000000000000 R09: 0000000000000000
Feb 17 11:24:35 localhost-live kernel: R10: ffff88b82e7a2a10 R11: 0000000000000000 R12: ffff88b88f806a00
Feb 17 11:24:35 localhost-live kernel: R13: ffff88b88f806478 R14: ffff88b88f806a00 R15: ffff88b82e7a2a10
Feb 17 11:24:35 localhost-live kernel: FS: 00007f698be1ec40(0000) GS:ffff88b937e00000(0000) knlGS:0000000000000000
Feb 17 11:24:35 localhost-live kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 17 11:24:35 localhost-live kernel: CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
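
As a reading aid for the oops above, here is a minimal Python sketch that pulls out the three values that matter most for a first look: the kernel-mode faulting symbol, RAX, and CR2. The function name summarize_oops and the shortened SAMPLE excerpt are illustrative assumptions, not part of any btrfs or kernel tooling. In this log the marked instruction bytes (<48> 8b 70 30) appear to be a 64-bit load from offset 0x30 of RAX, and RAX is 0, which lines up with the reported fault address of 0x30.

```python
import re

# Shortened excerpt of the oops in this mail; only the lines the sketch needs.
SAMPLE = """\
Feb 17 11:24:35 localhost-live kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Feb 17 11:24:35 localhost-live kernel: RIP: 0010:btrfs_device_init_dev_stats+0x4c/0x1f0
Feb 17 11:24:35 localhost-live kernel: RAX: 0000000000000000 RBX: ffff88b88f806498 RCX: ffff88b82e7a2a10
Feb 17 11:24:35 localhost-live kernel: CR2: 0000000000000030 CR3: 0000000092c9c006 CR4: 00000000003706f0
Feb 17 11:24:35 localhost-live kernel: RIP: 0033:0x7f698c04e52e
"""


def summarize_oops(log_text):
    """Extract the kernel-mode faulting symbol, RAX and CR2 from an x86-64 oops.

    Hypothetical helper for illustration; returns None if a field is missing.
    """
    # Code segment 0010 marks the kernel-mode RIP line (0033 is user space).
    rip = re.search(r"RIP: 0010:([\w.]+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", log_text)
    # \b keeps this from matching the ORIG_RAX field of the user-space dump.
    rax = re.search(r"\bRAX: ([0-9a-f]{16})", log_text)
    cr2 = re.search(r"\bCR2: ([0-9a-f]{16})", log_text)
    if not (rip and rax and cr2):
        return None
    return {
        "symbol": rip.group(1),               # function containing the fault
        "offset": int(rip.group(2), 16),      # offset of the faulting instruction
        "size": int(rip.group(3), 16),        # total size of that function
        "rax": int(rax.group(1), 16),         # base register of the faulting load
        "fault_address": int(cr2.group(1), 16),
    }


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

A zero base register combined with a small fault address usually points at a field read through a NULL structure pointer; which pointer that is inside btrfs_device_init_dev_stats cannot be determined from the log alone.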