Apologies for the dupe, Chris; I neglected to hit Reply-All. Comments below.

On Mon, Dec 3, 2018 at 9:56 PM Chris Murphy <li...@colorremedies.com> wrote:
>
> On Mon, Dec 3, 2018 at 8:32 PM Mike Javorski <mike.javor...@gmail.com> wrote:
> >
> > Need a bit of advice here ladies / gents. I am running into an issue
> > which Qu Wenruo seems to have posted a patch for several weeks ago
> > (see https://patchwork.kernel.org/patch/10694997/).
> >
> > Here is the relevant dmesg output which led me to Qu's patch.
> > ----
> > [   10.032475] BTRFS critical (device sdb): corrupt leaf: root=2
> > block=24655027060736 slot=20 bg_start=13188988928 bg_len=10804527104,
> > invalid block group size, have 10804527104 expect (0, 10737418240]
> > [   10.032493] BTRFS error (device sdb): failed to read block groups: -5
> > [   10.053365] BTRFS error (device sdb): open_ctree failed
> > ----
> >
> > This server has a 16-disk btrfs filesystem (RAID6) which I boot
> > periodically to btrfs-send snapshots to. The machine is running
> > Arch Linux, and I had just updated to their latest 4.19.4 kernel
> > package (from 4.18.10, which was working fine). I've tried the
> > 4.19.6 kernel that is in testing, but that doesn't resolve the
> > issue. From what I can see on kernel.org, the patch above has not
> > been pushed to stable or to Linus' tree.
> >
> > At this point the question is what to do. Is my FS toast? Could I
> > revert to the 4.18.10 kernel and boot safely? I don't know whether
> > the 4.19 boot process might have flipped some bits that would make
> > reverting problematic.
>
> That patch is not yet merged in linux-next, so to use it you'd need to
> apply it yourself and compile a kernel. I can't tell for sure whether
> it'd help.
>
> But the less you change the file system, the better the chance of
> saving it. I have no idea why there'd be a corrupt leaf just from a
> kernel version change, though.
>
> Needless to say, raid56 just seems fragile once it runs into any kind
> of trouble. I personally wouldn't boot off it at all. I would only
> mount it from another system, ideally an installed one, though a live
> system with the kernel versions you need would also work. That way you
> can gather more information without making changes; booting off it, by
> contrast, would mount it rw almost immediately (if the mount succeeds
> at all) and write a bunch of changes to the file system.
>

If the boot could corrupt the disk, that ship has already sailed: I
have already attempted to mount the volume with both the 4.19.4 and
4.19.6 kernels, and both attempts failed with the log lines in my
original message. I am hoping Qu notices this thread at some point,
since they are the author of both the original patch that introduced
the check which is now failing and the un-merged patch linked earlier
that adjusts the check condition.
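
If it does come down to applying that patch and building a kernel
myself, the rough plan I have in mind looks something like the below
(an untested sketch; the stable tag, the reuse of /proc/config.gz and
the patch filename are my own assumptions, with the patch itself being
the mbox saved from the patchwork page):
----
$ git clone --depth 1 --branch v4.19.6 \
      https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
$ cd linux
$ git am ~/qu-block-group-size-check.patch   # mbox saved from patchwork
$ zcat /proc/config.gz > .config             # reuse the running Arch config
$ make olddefconfig && make -j"$(nproc)"
----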

What I don't know is whether everything up to that mount failure was
read-only, in which case I can safely revert to the older kernel, or
whether something could have been written to disk before the mount
call failed. I don't want to risk the 23 TiB of snapshot data stored
here if there's an easy workaround :-). I realize there are risks with
the RAID56 code, but I've done my best to mitigate them with
server-grade hardware, ECC memory, a UPS, and a redundant copy of the
data via btrfs-send to this machine. Losing this snapshot volume would
not be the end of the world, but I am about to upgrade the primary
server (which is currently running 4.19.4 without issue, btw) and want
a best-effort snapshot/backup in place before I do.
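
For what it's worth, if reverting to 4.18.10 turns out to be safe, my
plan is to mount strictly read-only first so nothing gets written,
something like the line below (the mount point is just an example; my
understanding is that a plain ro mount can still replay the log tree,
and nologreplay avoids even that):
----
# mount -o ro,nologreplay /dev/sdb /mnt/snapshots
----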

> Whether it's a case of 4.18.10 not detecting corruption that 4.19
> sees, or of 4.19 having already caused it, your best chance is to not
> mount it rw, and to not run check --repair, until you get some
> feedback from a developer.
>

+1

> The things I'd like to see are:
> # btrfs rescue super -v /anydevice/
> # btrfs insp dump-s -f /anydevice/
>
> The first command will tell us whether all the supers are the same and
> valid across all devices. The second one, assuming it's pointed at a
> device with a valid super, will tell us whether there's a log root
> value other than 0. Both of those are read-only commands.
>

It was my understanding that "btrfs rescue" writes to the disk (the
man page says it recovers superblocks, which would imply writes). If
you are sure both commands leave the on-disk data structures
completely intact, I'm willing to run them.
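
Assuming they really are read-only, these are the exact invocations
I'd run, with the abbreviations expanded as I understand them and
/dev/sdb taken from the dmesg output above:
----
# btrfs rescue super-recover -v /dev/sdb
# btrfs inspect-internal dump-super -f /dev/sdb
----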

- mike
