On Fri, Feb 26, 2021 at 9:01 AM Sebastian Roller
<sebastian.rol...@gmail.com> wrote:
>
> > > I think you best chance is to start out trying to restore from a
> > > recent snapshot. As long as the failed controller wasn't writing
> > > totally spurious data in random locations, that snapshot should be
> > > intact.
> >
> > i.e. the strategy for this is btrfs restore -r option
> >
> > That only takes subvolid. You can get a subvolid listing with -l
> > option but this doesn't show the subvolume names yet (patch is
> > pending)
> > https://github.com/kdave/btrfs-progs/issues/289
> >
> > As an alternative to applying that and building yourself, you can
> > approximate it with:
> >
> > sudo btrfs insp dump-t -t 1 /dev/sda6 | grep -A 1 ROOT_REF
> >
> > e.g.
> > item 9 key (FS_TREE ROOT_REF 631) itemoff 14799 itemsize 26
> >     root ref key dirid 256 sequence 54 name varlog34
> >
>
> Using this command I got a complete list of all the snapshots back to
> 2016 with full name.
> I tried to restore from different snapshots and using btrfs restore -t
> from some other older roots.
> Unfortunately no matter which root I restore from, the files are
> always the same. I selected a list of some larger files, namely ppts
> and sgmls from one of our own tools, and restored them from different
> roots. Then I compared the files by checksums. They are the same from
> all roots I could find the files.
> The output of btrfs restore gives me some errors for checksums and
> deflate, but most of the files are just listed as restored.
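
If it helps for walking through that many snapshots, the ROOT_REF listing can be turned into a subvolid/name table with a few lines of Python. This is only a rough sketch: parse_root_refs and the two regexes are names made up for illustration, not anything from btrfs-progs, and it assumes every entry has the two-line layout of the varlog34 example quoted above (a key line followed by a "root ref key ... name" line).

```python
import re

# Key line, e.g. "item 9 key (FS_TREE ROOT_REF 631) itemoff 14799 itemsize 26".
# The number after ROOT_REF is the subvolume id; the first field is the parent tree.
KEY_RE = re.compile(r"key \((\S+) ROOT_REF (\d+)\)")
# Following line, e.g. "root ref key dirid 256 sequence 54 name varlog34".
NAME_RE = re.compile(r"root ref key dirid \d+ sequence \d+ name (.*)$")


def parse_root_refs(text):
    """Return a list of (subvolid, parent, name) tuples from dump-tree text.

    Assumes the layout shown in the thread; lines that match neither
    pattern (such as the "--" separators grep -A prints) are ignored.
    """
    refs = []
    pending = None  # (subvolid, parent) still waiting for its name line
    for line in text.splitlines():
        key = KEY_RE.search(line)
        if key:
            pending = (int(key.group(2)), key.group(1))
            continue
        name = NAME_RE.search(line)
        if name and pending is not None:
            refs.append((pending[0], pending[1], name.group(1)))
            pending = None
    return refs


# The example entry from this thread, used as sample input.
SAMPLE = (
    "item 9 key (FS_TREE ROOT_REF 631) itemoff 14799 itemsize 26\n"
    "    root ref key dirid 256 sequence 54 name varlog34\n"
)

for subvolid, parent, name in parse_root_refs(SAMPLE):
    print(subvolid, parent, name)  # prints: 631 FS_TREE varlog34
```

The first column is the subvolid that btrfs restore -r expects, so the table can be sorted or filtered by name before picking which snapshots to try.
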
>
> Errors look like this:
>
> Restoring /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/AWI/AWI_6.14-2_2015.zip
> Restoring /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/AWI/installInstructions.txt
> Done searching /Hardware_Software/ABAQUS/AWI
> checksum verify failed on 57937054842880 found 000000B6 wanted 00000000
> ERROR: lzo decompress failed: -4
> Error copying data for /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/CM/CMA_win86_32_2012.0928.3/setup.exe
> Error searching /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/CM/CMA_win86_32_2012.0928.3/setup.exe
> ERROR: lzo decompress failed: -4
> Error copying data for /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/CM/CMAInstaller.msi
> ERROR: lzo decompress failed: -4
> Error copying data for /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/CM/setup.exe
> Error searching /mnt/dumpo/recover/transfer/Hardware_Software/ABAQUS/CM/setup.exe
>
> Most of the files are just listed as "Restoring ...". Still they are
> severely damaged afterwards. They seem to contain "holes" filled with
> 0x00 (this is from some rudimentary hexdump examination of the files.)
>
> Any chance to recover/restore from that? Thanks.
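
To get beyond eyeballing hexdump output, the 0x00 "holes" can be located mechanically. The following is a minimal sketch, not an established tool: find_zero_runs is a hypothetical helper, and the 4096-byte default threshold is an arbitrary choice (picked to match a common 4 KiB block size). It is only a heuristic, since files that legitimately contain long zero regions will be reported too.

```python
import re
import sys

# Matches the first byte that is not 0x00 (searched at C speed by the re module).
NONZERO = re.compile(rb"[^\x00]")


def find_zero_runs(path, min_len=4096, chunk_size=1 << 20):
    """Return (offset, length) for every run of 0x00 bytes >= min_len in path."""
    runs = []
    run_start = None  # absolute offset of the zero run in progress, if any
    offset = 0        # absolute offset of the chunk being scanned
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            pos = 0
            while pos < len(chunk):
                if run_start is None:
                    # Not inside a run: jump to the next zero byte.
                    pos = chunk.find(b"\x00", pos)
                    if pos == -1:
                        break
                    run_start = offset + pos
                # Inside a run: find where it ends within this chunk.
                end = NONZERO.search(chunk, pos)
                if end is None:
                    break  # the run continues into the next chunk
                length = offset + end.start() - run_start
                if length >= min_len:
                    runs.append((run_start, length))
                run_start = None
                pos = end.start()
            offset += len(chunk)
    # A run that reaches the end of the file is still a run.
    if run_start is not None and offset - run_start >= min_len:
        runs.append((run_start, offset - run_start))
    return runs


if __name__ == "__main__":
    # Usage: python3 script.py FILE [FILE ...]
    for path in sys.argv[1:]:
        for start, length in find_zero_runs(path):
            print(f"{path}: {length} zero bytes at offset {start}")
```

Running it over the same file restored from several roots would show whether the holes sit at the same offsets each time, which is what identical checksums across roots would suggest.
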

I don't know. The exact nature of the damage from a failing controller
adds a significant unknown. If it had simply stopped writing anything
at all, there'd be no problem. But it sounds like it wrote spurious or
corrupt data, possibly into locations that weren't even supposed to be
written to.

I think if the snapshot b-tree is ok, and the chunk b-tree is ok, then
it should be possible to recover the data correctly without needing
any other tree. I'm not sure if that's how btrfs restore already
works.

Kernel 5.11 has a new feature, mount -o ro,rescue=all, that is more
tolerant of mounting when there are various kinds of problems. But
there's another thread where a failed controller is thwarting
recovery, and that code is being looked at for further enhancement.

https://lore.kernel.org/linux-btrfs/CAEg-Je-DJW3saYKA2OBLwgyLU6j0JOF7NzXzECi0HJ5hft_5=a...@mail.gmail.com/

--
Chris Murphy