On Thu, Mar 4, 2021 at 3:25 PM Josef Bacik <jo...@toxicpanda.com> wrote: > > On 3/3/21 2:38 PM, Neal Gompa wrote: > > On Wed, Mar 3, 2021 at 1:42 PM Josef Bacik <jo...@toxicpanda.com> wrote: > >> > >> On 2/24/21 10:47 PM, Neal Gompa wrote: > >>> On Wed, Feb 24, 2021 at 10:44 AM Josef Bacik <jo...@toxicpanda.com> wrote: > >>>> > >>>> On 2/24/21 9:23 AM, Neal Gompa wrote: > >>>>> On Tue, Feb 23, 2021 at 10:05 AM Josef Bacik <jo...@toxicpanda.com> > >>>>> wrote: > >>>>>> > >>>>>> On 2/22/21 11:03 PM, Neal Gompa wrote: > >>>>>>> On Mon, Feb 22, 2021 at 2:34 PM Josef Bacik <jo...@toxicpanda.com> > >>>>>>> wrote: > >>>>>>>> > >>>>>>>> On 2/21/21 1:27 PM, Neal Gompa wrote: > >>>>>>>>> On Wed, Feb 17, 2021 at 11:44 AM Josef Bacik <jo...@toxicpanda.com> > >>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> On 2/17/21 11:29 AM, Neal Gompa wrote: > >>>>>>>>>>> On Wed, Feb 17, 2021 at 9:59 AM Josef Bacik > >>>>>>>>>>> <jo...@toxicpanda.com> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> On 2/17/21 9:50 AM, Neal Gompa wrote: > >>>>>>>>>>>>> On Wed, Feb 17, 2021 at 9:36 AM Josef Bacik > >>>>>>>>>>>>> <jo...@toxicpanda.com> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 2/16/21 9:05 PM, Neal Gompa wrote: > >>>>>>>>>>>>>>> On Tue, Feb 16, 2021 at 4:24 PM Josef Bacik > >>>>>>>>>>>>>>> <jo...@toxicpanda.com> wrote: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> On 2/16/21 3:29 PM, Neal Gompa wrote: > >>>>>>>>>>>>>>>>> On Tue, Feb 16, 2021 at 1:11 PM Josef Bacik > >>>>>>>>>>>>>>>>> <jo...@toxicpanda.com> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On 2/16/21 11:27 AM, Neal Gompa wrote: > >>>>>>>>>>>>>>>>>>> On Tue, Feb 16, 2021 at 10:19 AM Josef Bacik > >>>>>>>>>>>>>>>>>>> <jo...@toxicpanda.com> wrote: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> On 2/14/21 3:25 PM, Neal Gompa wrote: > >>>>>>>>>>>>>>>>>>>>> Hey all, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> So one of my main computers recently had a disk > >>>>>>>>>>>>>>>>>>>>> controller failure > >>>>>>>>>>>>>>>>>>>>> that caused my machine to freeze. After rebooting, > >>>>>>>>>>>>>>>>>>>>> Btrfs refuses to > >>>>>>>>>>>>>>>>>>>>> mount. I tried to do a mount and the following errors > >>>>>>>>>>>>>>>>>>>>> show up in the > >>>>>>>>>>>>>>>>>>>>> journal: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info > >>>>>>>>>>>>>>>>>>>>>> (device sda3): disk space caching is enabled > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS info > >>>>>>>>>>>>>>>>>>>>>> (device sda3): has skinny extents > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical > >>>>>>>>>>>>>>>>>>>>>> (device sda3): corrupt leaf: root=401 block=796082176 > >>>>>>>>>>>>>>>>>>>>>> slot=15 ino=203657, invalid inode transid: has 888896 > >>>>>>>>>>>>>>>>>>>>>> expect [0, 888895] > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error > >>>>>>>>>>>>>>>>>>>>>> (device sda3): block=796082176 read time tree block > >>>>>>>>>>>>>>>>>>>>>> corruption detected > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS critical > >>>>>>>>>>>>>>>>>>>>>> (device sda3): corrupt leaf: root=401 block=796082176 > >>>>>>>>>>>>>>>>>>>>>> slot=15 ino=203657, invalid inode transid: has 888896 > >>>>>>>>>>>>>>>>>>>>>> expect [0, 888895] > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error > >>>>>>>>>>>>>>>>>>>>>> (device sda3): block=796082176 read time tree block > >>>>>>>>>>>>>>>>>>>>>> corruption detected > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS warning > >>>>>>>>>>>>>>>>>>>>>> (device sda3): couldn't read tree root > >>>>>>>>>>>>>>>>>>>>>> Feb 14 15:20:49 localhost-live kernel: BTRFS error > >>>>>>>>>>>>>>>>>>>>>> (device sda3): open_ctree failed > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I've tried to do -o recovery,ro mount and get the same > >>>>>>>>>>>>>>>>>>>>> issue. I can't > >>>>>>>>>>>>>>>>>>>>> seem to find any reasonably good information on how to > >>>>>>>>>>>>>>>>>>>>> do recovery in > >>>>>>>>>>>>>>>>>>>>> this scenario, even to just recover enough to copy data > >>>>>>>>>>>>>>>>>>>>> off. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> I'm on Fedora 33, the system was on Linux kernel > >>>>>>>>>>>>>>>>>>>>> version 5.9.16 and > >>>>>>>>>>>>>>>>>>>>> the Fedora 33 live ISO I'm using has Linux kernel > >>>>>>>>>>>>>>>>>>>>> version 5.10.14. I'm > >>>>>>>>>>>>>>>>>>>>> using btrfs-progs v5.10. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Can anyone help? > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Can you try > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> btrfs check --clear-space-cache v1 /dev/whatever > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> That should fix the inode generation thing so it's sane, > >>>>>>>>>>>>>>>>>>>> and then the tree > >>>>>>>>>>>>>>>>>>>> checker will allow the fs to be read, hopefully. If not > >>>>>>>>>>>>>>>>>>>> we can work out some > >>>>>>>>>>>>>>>>>>>> other magic. Thanks, > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Josef > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I got the same error as I did with btrfs-check > >>>>>>>>>>>>>>>>>>> --readonly... > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Oh lovely, what does btrfs check --readonly --backup do? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> No dice... > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> # btrfs check --readonly --backup /dev/sda3 > >>>>>>>>>>>>>>>>>> Opening filesystem to check... > >>>>>>>>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 > >>>>>>>>>>>>>>>>>> found 888895 > >>>>>>>>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 > >>>>>>>>>>>>>>>>>> found 888895 > >>>>>>>>>>>>>>>>>> parent transid verify failed on 791281664 wanted 888893 > >>>>>>>>>>>>>>>>>> found 888895 > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Hey look the block we're looking for, I wrote you some > >>>>>>>>>>>>>>>> magic, just pull > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> https://github.com/josefbacik/btrfs-progs/tree/for-neal > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> build, and then run > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> btrfs-neal-magic /dev/sda3 791281664 888895 > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> This will force us to point at the old root with (hopefully) > >>>>>>>>>>>>>>>> the right bytenr > >>>>>>>>>>>>>>>> and gen, and then hopefully you'll be able to recover from > >>>>>>>>>>>>>>>> there. This is kind > >>>>>>>>>>>>>>>> of saucy, so yolo, but I can undo it if it makes things > >>>>>>>>>>>>>>>> worse. Thanks, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> # btrfs check --readonly /dev/sda3 > >>>>>>>>>>>>>>>> Opening filesystem to check... > >>>>>>>>>>>>>>>> ERROR: could not setup extent tree > >>>>>>>>>>>>>>>> ERROR: cannot open file system > >>>>>>>>>>>>>>> # btrfs check --clear-space-cache v1 /dev/sda3 > >>>>>>>>>>>>>>>> Opening filesystem to check... > >>>>>>>>>>>>>>>> ERROR: could not setup extent tree > >>>>>>>>>>>>>>>> ERROR: cannot open file system > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> It's better, but still no dice... :( > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Hmm it's not telling us what's wrong with the extent tree, > >>>>>>>>>>>>>> which is annoying. > >>>>>>>>>>>>>> Does mount -o rescue=all,ro work now that the root tree is > >>>>>>>>>>>>>> normal? Thanks, > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> Nope, I see this in the journal: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): enabling all of the rescue options > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): ignoring data csums > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): ignoring bad roots > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): disabling log replay at mount time > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): disk space caching is enabled > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS info (device > >>>>>>>>>>>>>> sda3): has skinny extents > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device > >>>>>>>>>>>>>> sda3): tree level mismatch detected, bytenr=791281664 level > >>>>>>>>>>>>>> expected=1 has=2 > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device > >>>>>>>>>>>>>> sda3): tree level mismatch detected, bytenr=791281664 level > >>>>>>>>>>>>>> expected=1 has=2 > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS warning (device > >>>>>>>>>>>>>> sda3): couldn't read tree root > >>>>>>>>>>>>>> Feb 17 09:49:40 localhost-live kernel: BTRFS error (device > >>>>>>>>>>>>>> sda3): open_ctree failed > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Ok git pull for-neal, rebuild, then run > >>>>>>>>>>>> > >>>>>>>>>>>> btrfs-neal-magic /dev/sda3 791281664 888895 2 > >>>>>>>>>>>> > >>>>>>>>>>>> I thought of this yesterday but in my head was like "naaahhhh, > >>>>>>>>>>>> whats the chances > >>>>>>>>>>>> that the level doesn't match??". Thanks, > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Tried rescue mount again after running that and got a stack trace > >>>>>>>>>>> in > >>>>>>>>>>> the kernel, detailed in the following attached log. > >>>>>>>>>> > >>>>>>>>>> Huh I wonder how I didn't hit this when testing, I must have only > >>>>>>>>>> tested with > >>>>>>>>>> zero'ing the extent root and the csum root. You're going to have > >>>>>>>>>> to build a > >>>>>>>>>> kernel with a fix for this > >>>>>>>>>> > >>>>>>>>>> https://paste.centos.org/view/7b48aaea > >>>>>>>>>> > >>>>>>>>>> and see if that gets you further. Thanks, > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> I built a kernel build as an RPM with your patch[1] and tried it. > >>>>>>>>> > >>>>>>>>> [root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sdb3 /mnt > >>>>>>>>> Killed > >>>>>>>>> > >>>>>>>>> The log from the journal is attached. > >>>>>>>> > >>>>>>>> > >>>>>>>> Ahh crud my bad, this should do it > >>>>>>>> > >>>>>>>> https://paste.centos.org/view/ac2e61ef > >>>>>>>> > >>>>>>> > >>>>>>> Patch doesn't apply (note it is patch 667 below): > >>>>>> > >>>>>> Ah sorry, should have just sent you an iterative patch. You can take > >>>>>> the above > >>>>>> patch and just delete the hunk from volumes.c as you already have that > >>>>>> applied > >>>>>> and then it'll work. Thanks, > >>>>>> > >>>>> > >>>>> Failed with a weird error...? > >>>>> > >>>>> [root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sda3 /mnt > >>>>> mount: /mnt: mount(2) system call failed: No such file or directory. > >>>>> > >>>>> Journal log with traceback attached. > >>>> > >>>> Last one maybe? > >>>> > >>>> https://paste.centos.org/view/80edd6fd > >>>> > >>> > >>> Similar weird failure: > >>> > >>> [root@fedora ~]# mount -t btrfs -o rescue=all,ro /dev/sdb3 /mnt > >>> mount: /mnt: mount(2) system call failed: No such file or directory. > >>> > >>> No crash in the journal this time, though: > >>> > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): enabling all of > >>>> the rescue options > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): ignoring data > >>>> csums > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): ignoring bad > >>>> roots > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): disabling log > >>>> replay at mount time > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): disk space > >>>> caching is enabled > >>>> Feb 24 22:43:19 fedora kernel: BTRFS info (device sdb3): has skinny > >>>> extents > >>>> Feb 24 22:43:19 fedora kernel: BTRFS warning (device sdb3): failed to > >>>> read fs tree: -2 > >>>> Feb 24 22:43:19 fedora kernel: BTRFS error (device sdb3): open_ctree > >>>> failed > >>> > >>> > >> > >> Sorry Neal, you replied when I was in the middle of something and promptly > >> forgot about it. I figured the fs root was fine, can you do the following > >> so I > >> can figure out from the error messages what might be wrong > >> > >> btrfs check --readonly > >> btrfs restore -D > >> btrfs restore -l > >> > > > > It didn't work.. Here's the output: > > > > [root@fedora ~]# btrfs check --readonly /dev/sdb3 > > Opening filesystem to check... > > ERROR: could not setup extent tree > > ERROR: cannot open file system > > [root@fedora ~]# btrfs restore -D /dev/sdb3 /mnt > > WARNING: could not setup extent tree, skipping it > > Couldn't setup device tree > > Could not open root, trying backup super > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > Ignoring transid failure > > WARNING: could not setup extent tree, skipping it > > Couldn't setup device tree > > Could not open root, trying backup super > > ERROR: superblock bytenr 274877906944 is larger than device size > > 263132807168 > > Could not open root, trying backup super > > [root@fedora ~]# btrfs restore -l /dev/sdb3 /mnt > > WARNING: could not setup extent tree, skipping it > > Couldn't setup device tree > > Could not open root, trying backup super > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > parent transid verify failed on 796082176 wanted 888894 found 888896 > > Ignoring transid failure > > WARNING: could not setup extent tree, skipping it > > Couldn't setup device tree > > Could not open root, trying backup super > > ERROR: superblock bytenr 274877906944 is larger than device size > > 263132807168 > > Could not open root, trying backup super > > > > > > Hmm OK I think we want the neal magic for this one too, but before we go doing > that can I get a > > btrfs inspect-internal -f /dev/whatever > > so I can make sure I'm not just blindly clobbering something. Thanks, >
Doesn't work, did you mean some other command? [root@fedora ~]# btrfs inspect-internal -f /dev/sdb3 btrfs inspect-internal: unknown token '-f' -- 真実はいつも一つ!/ Always, there's only one truth!