Re: Raid0 rescue
OK, this time also -mraid1 -draid0, and I filled it with some more metadata this time, but then I formatted NTFS, then ext4, then xfs, and then wiped those signatures. Brutal, especially ext4, which writes a lot more stuff and zeros a bunch of areas too.

# btrfs rescue super -v /dev/mapper/vg-2
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
        Device: id = 2, name = /dev/mapper/vg-2
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

                device name = /dev/mapper/vg-2
                superblock bytenr = 67108864

        [All bad supers]:

All supers are valid, no need to recover

Obviously vg-2 is missing its first superblock, and this tool is not complaining about it at all. A normal mount does not work (generic open_ctree error).

# btrfs check /dev/mapper/vg-1
warning, device 2 is missing

Umm, no. But yeah, because the first super is missing, the kernel isn't considering it a Btrfs volume at all. There are also other errors from the check, due to metadata being stepped on, I'm guessing. But we need a way to fix an obviously stepped-on first super, and I don't like the idea of using btrfs check for that anyway. All I need is the first copy fixed up, and then just do a normal mount. But let's see how messy this gets, pointing check at the damaged device and the known good 2nd super (-s 0 is the first super, so -s 1 is the second copy):

# btrfs check -s 1 /dev/mapper/vg-2
using SB copy 1, bytenr 67108864
...skipping checksum errors etc.

OK, so I guess I have to try --repair.

# btrfs check --repair -s1 /dev/mapper/vg-2
enabling repair mode
using SB copy 1, bytenr 67108864
...skipping checksum errors etc.

[root@f26wnuc ~]# btrfs rescue super -v /dev/mapper/vg-1
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

        [All bad supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

That is fucked. It broke the previously good super on vg-1?

[root@f26wnuc ~]# btrfs rescue super -v /dev/mapper/vg-2
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
        Device: id = 2, name = /dev/mapper/vg-2
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

                device name = /dev/mapper/vg-2
                superblock bytenr = 67108864

        [All bad supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

Worse, it did not actually fix the bad/missing superblock on vg-2 either. Let's answer Y to its questions...

[root@f26wnuc ~]# btrfs rescue super -v /dev/mapper/vg-2
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
        Device: id = 2, name = /dev/mapper/vg-2
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

                device name = /dev/mapper/vg-2
                superblock bytenr = 67108864

        [All bad supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

Make sure this is a btrfs disk otherwise the tool will destroy other fs, Are you sure? [y/N]: y
checksum verify failed on 20971520 found 348F13AD wanted 8100
checksum verify failed on 20971520 found 348F13AD wanted 8100
Recovered bad superblocks successful

[root@f26wnuc ~]# btrfs rescue super -v /dev/mapper/vg-2
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
        Device: id = 2, name = /dev/mapper/vg-2
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

                device name = /dev/mapper/vg-2
                superblock bytenr = 65536

                device name = /dev/mapper/vg-2
                superblock bytenr = 67108864

        [All bad supers]:

All supers are valid, no need to recover

OK! That's better! Mount it. dmesg: https://pastebin.com/6kVzYLfZ

Pretty boring: a bad tree block, and then some read errors corrected. I get more similarly formatted errors, different numbers... but no failures. Scrub it...

# btrfs scrub status /mnt/yo
scrub status for b2ee5125-cf56-493a-b094-81fe8330115a
        scrub started at Wed Aug 16 23:08:54 2017, running for 00:00:30
        total bytes scrubbed: 1.19GiB with 5 errors
        error details: csum=5
        corrected errors: 5, uncorrectable errors: 0, unverified errors: 0

There's almost no data on this file system; it's mostly metadata, which is raid1, so that's why the data survives. But even in the previous example where some data is clobbered, the data loss is limited. The file system itself survives, and can continue to be used. The 'btrfs rescue super' function could be better, and it looks like there's a bug in btrfs check's superblock repair.

Chris Murphy
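An aside: before reaching for check --repair, it can help to confirm directly which superblock copies are intact. This is a hedged sketch, assuming a btrfs-progs new enough to have 'btrfs inspect-internal dump-super' (the successor to btrfs-show-super); the copies live at 64KiB, 64MiB, and 256GiB:

# btrfs inspect-internal dump-super --all /dev/mapper/vg-2

A quicker sanity check of just the first copy is to read its magic; a healthy superblock has the string "_BHRfS_M" at 64KiB + 64 bytes:

# dd if=/dev/mapper/vg-2 bs=1 skip=$((65536 + 64)) count=8 2>/dev/null | hexdump -C

If that magic is zeroed or shows leftovers from the other filesystem, only that copy was stepped on, and super-recover should be able to rewrite it from the surviving copies.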
Re: Raid0 rescue
I'm testing explicitly for this case:

# lvs
  LV         VG Attr       LSize   Pool       Origin Data%  Meta%  Move Log Cpy%Sync Convert
  1          vg Vwi-a-tz--  10.00g thintastic        0.00
  2          vg Vwi-a-tz--  10.00g thintastic        0.00
  thintastic vg twi-aotz-- 100.00g                   0.00   0.38

# mkfs.btrfs -f -mraid1 -draid0 /dev/mapper/vg-1 /dev/mapper/vg-2
...

Mount and copy some varied data to the volume; most files are less than 64KiB, and some are even less than 2KiB. So there will be a mix of files that will definitely get nerfed by damaged stripes, many that will survive because they live entirely on the drive that wasn't accidentally formatted, and some that are inline in metadata. But for sure the file system *ought* to survive.

umount, and then format NTFS:

# mkfs.ntfs -f /dev/mapper/vg-2

Now, here's a curious bit:

# wipefs /dev/mapper/vg-2
offset               type
0x1fe                dos   [partition table]

0x10040              btrfs   [filesystem]
                     UUID:  bebaedc5-96a1-4163-9527-8254ecae817e

0x3                  ntfs   [filesystem]
                     UUID:  67AD98CF36096C70

So the two supers can co-exist. That is invariably going to cause kernel code confusion. blkid will consider it neither NTFS nor Btrfs, so it's sort of in a zombie situation. Get this:

# btrfs rescue super -v /dev/mapper/vg-1
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

        [All bad supers]:

All supers are valid, no need to recover

# btrfs rescue super -v /dev/mapper/vg-2
All Devices:
        Device: id = 1, name = /dev/mapper/vg-1
        Device: id = 2, name = /dev/mapper/vg-2
Before Recovering:
        [All good supers]:
                device name = /dev/mapper/vg-1
                superblock bytenr = 65536

                device name = /dev/mapper/vg-1
                superblock bytenr = 67108864

                device name = /dev/mapper/vg-2
                superblock bytenr = 65536

                device name = /dev/mapper/vg-2
                superblock bytenr = 67108864

        [All bad supers]:

All supers are valid, no need to recover

So the first command sees the supers only on vg-1; it doesn't go looking at vg-2 at all, presumably because kernel code is ignoring that device due to the two different file system supers (?). But the second command forces it to look at vg-2, says the Btrfs supers there are fine, and then also auto-discovers the vg-1 device. OK, so I'm just going to cheat at this point and wipefs just the NTFS magic so this device is seen as Btrfs again.

# wipefs -n -o 0x3 /dev/mapper/vg-2
/dev/mapper/vg-2: 8 bytes were erased at offset 0x0003 (ntfs): 4e 54 46 53 20 20 20 20
# wipefs -o 0x3 /dev/mapper/vg-2
/dev/mapper/vg-2: 8 bytes were erased at offset 0x0003 (ntfs): 4e 54 46 53 20 20 20 20
# partprobe
# blkid
...
/dev/mapper/vg-1: UUID="bebaedc5-96a1-4163-9527-8254ecae817e" UUID_SUB="ef9dbcf0-bb0b-4faf-a7b4-02f1c92631e4" TYPE="btrfs"
/dev/mapper/vg-2: UUID="bebaedc5-96a1-4163-9527-8254ecae817e" UUID_SUB="490504ea-4ee4-47ad-91a7-58b6ccf4be8e" TYPE="btrfs" PTTYPE="dos"
...

OK, good. Except, what is PTTYPE? Ohh, that's the first entry in the wipefs listing way at the top, I bet.

[root@f26wnuc ~]# wipefs -o 0x1fe /dev/mapper/vg-2
/dev/mapper/vg-2: 2 bytes were erased at offset 0x01fe (dos): 55 aa
# blkid
...
/dev/mapper/vg-1: UUID="bebaedc5-96a1-4163-9527-8254ecae817e" UUID_SUB="ef9dbcf0-bb0b-4faf-a7b4-02f1c92631e4" TYPE="btrfs"
/dev/mapper/vg-2: UUID="bebaedc5-96a1-4163-9527-8254ecae817e" UUID_SUB="490504ea-4ee4-47ad-91a7-58b6ccf4be8e" TYPE="btrfs"
...

Yep! OK, let's just try a normal mount. It mounts, no errors at all. List all the files on the file system (about 700): no errors. Cat a few to /dev/null manually: no errors. OK, I'm bored. Let's just scrub it.

[root@f26wnuc yo]# btrfs scrub status /mnt/yo/
scrub status for bebaedc5-96a1-4163-9527-8254ecae817e
        scrub started at Wed Aug 16 19:40:26 2017, running for 00:00:10
        total bytes scrubbed: 529.62MiB with 181 errors
        error details: csum=181
        corrected errors: 0, uncorrectable errors: 181, unverified errors: 0

One file is affected, the large ~1+GiB file.

[77898.116429] BTRFS warning (device dm-6): checksum error at logical 1621229568 on dev /dev/mapper/vg-2, sector 2621568, root 5, inode 257, offset 517341184, length 4096, links 1 (path: Fedora-Workstation-Live-x86_64-Rawhide-20170814.n.0.iso)
[77898.116463] BTRFS error (device dm-6): bdev /dev/mapper/vg-2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
[77898.116478] BTRFS error (device dm-6): unable to fixup (regular) error at logical 1621229568 on dev /dev/mapper/vg-2

There are about 9 more of those kinds of messages. Anyway, that looks to me like the file itself is nerfed by the NTFS format, but the file system itself wasn't hit. There are no fixups, which is expected: the data is raid0, so there's only one copy and nothing to repair it from.
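An aside on the wipefs steps above: when erasing a signature by offset, it's worth doing a dry run first and keeping a copy of the erased bytes in case the wrong magic gets hit. A hedged sketch, assuming a util-linux wipefs that has --no-act, --backup, and --types, and the same device path as in the test above:

# wipefs -n -a -t ntfs /dev/mapper/vg-2
# wipefs --backup -a -t ntfs /dev/mapper/vg-2

The first command only reports what would be erased; the second erases just the NTFS magic and drops a wipefs-*.bak file in root's home directory, which can be written back with dd if it turns out to be needed.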
Re: Raid0 rescue
On Tue, Aug 1, 2017 at 12:36 PM, Alan Brand wrote:
> I successfully repaired the superblock, copied it from one of the backups.
> My biggest problem now is that the UUID for the disk has changed due
> to the reformatting and no longer matches what is in the metadata.
> I need to make linux recognize the partition as btrfs and have the correct
> UUID.
> Any suggestions?

Huh, insofar as I'm aware, Btrfs does not track a "disk" UUID or partition UUID.

A better qualified set of steps for fixing this would be:

a.) restore the partitioning, if any
b.) wipefs the NTFS signature to invalidate the NTFS file system
c.) use super-recover to put correct supers back on both drives
d.) mount the file system
e.) do a full scrub

The last step is optional but best practice. It'll actively do fixups, and you'll get an error message with the path to each file that is not recoverable. Alternatively, a metadata-only balance will do fixups and will be much faster, but you won't get information right away about which files are damaged.

--
Chris Murphy
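For concreteness, a hedged sketch of steps (b) through (e); the device names and mountpoint below are made up, and step (a) depends entirely on how the drive was originally partitioned:

# wipefs -a -t ntfs /dev/sdb                 (b: drop the NTFS signature; add -n first for a dry run)
# btrfs rescue super-recover -v /dev/sdb     (c: rewrite bad or missing supers from a good copy)
# mount /dev/sda /mnt                        (d: normal mount; either member device can be named)
# btrfs scrub start /mnt                     (e: full scrub, fixes what it can, logs what it can't)
# btrfs scrub status /mnt

The metadata-only alternative mentioned above would be a filtered balance, e.g.:

# btrfs balance start -m /mnt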
Re: Raid0 rescue
On Thu, Jul 27, 2017 at 8:49 AM, Alan Brand wrote:
> I know I am screwed but hope someone here can point at a possible solution.
>
> I had a pair of btrfs drives in a raid0 configuration. One of the
> drives was pulled by mistake, put in a windows box, and a quick NTFS
> format was done. Then much screaming occurred.
>
> I know the data is still there. Is there anyway to rebuild the raid
> bringing in the bad disk? I know some info is still good, for example
> metadata0 is corrupt but 1 and 2 are good.
> The trees look bad which is probably the killer.

Well, the first step is to check and fix the superblocks. After that the normal code should just discover the bad stuff, fetch good copies from the good drive, write them to the corrupt one passively, and eventually fix the file system itself. There are probably only a few files corrupted irrecoverably.

It's probably worth testing for this explicitly. It's not a wild scenario, and it's something Btrfs should be able to recover from gracefully. The gotcha for a totally automatic recovery is the superblocks, because there's no *one true right way* for the kernel to just assume the remaining Btrfs supers are more valid than the NTFS supers.

So then the question is, which tool should fix this up? I'd say both 'btrfs rescue super-recover' and 'btrfs check' should do this. The difference being that super-recover would fix only the supers, with kernel code doing passive fixups as problems are encountered once the fs is mounted, while 'check --repair' would fix the supers and additionally fix missing metadata on the corrupt drive, using user space code on an unmounted file system. Both should work, or at least both should be fail-safe.

--
Chris Murphy
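Concretely, the two candidates would be invoked something like this (the device path is hypothetical; -s picks which superblock copy check starts from):

# btrfs rescue super-recover -v /dev/sdb
# btrfs check --repair -s 1 /dev/sdb

super-recover only rewrites the superblocks and leaves everything else to the mounted kernel code, while check --repair works offline and also tries to repair the trees themselves.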
Re: Raid0 rescue
On Thu, Jul 27, 2017 at 08:25:19PM, Duncan wrote:
> > Welcome to RAID-0...
>
> As Hugo implies, RAID-0 mode, not just for btrfs but in general, is well
> known among admins for being "garbage data not worth trying to recover"
> mode. Not only is there no redundancy, but with raid0 you're
> deliberately increasing the chances of loss because now loss of any one
> device pretty well makes garbage of the entire array, and loss of any
> single device in a group of more than one is more likely than loss of any
> single device by itself.

Disks don't quite die once a week, you see. Using raid0 is actually quite rational in a good number of setups:

* You need backups _anyway_. No raid level removes this requirement.
* You can give a machine twice as much immediate storage with raid0 as with raid1.
* You get twice as many disks you can use for backup.

Redundant raid is good for two things:

* uptime
* reducing the chance of losing data between the last backup and the failure

For the second point, do you happen to know of a filesystem that gives you cheap hourly backups that don't take half an hour just to stat?

Thus, you need to make a decision: would you prefer to take the time trying to recover, with a good chance of failure anyway -- or accept a priori that every failure means hitting the backups? Obviously, it depends on the use case.

This said, I don't have a raid0 anywhere.

Meow!
--
⢀⣴⠾⠻⢶⣦⠀ What Would Jesus Do, MUD/MMORPG edition:
⣾⠁⢰⠒⠀⣿⡁ • multiplay with an admin char to benefit your mortal
⢿⡄⠘⠷⠚⠋⠀ • abuse item cloning bugs (the five fishes + two breads affair)
⠈⠳⣄ • use glitches to walk on water
Re: Raid0 rescue
Hugo Mills posted on Thu, 27 Jul 2017 15:10:38 +0000 as excerpted:

> On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
>> I know I am screwed but hope someone here can point at a possible
>> solution.
>>
>> I had a pair of btrfs drives in a raid0 configuration. One of the
>> drives was pulled by mistake, put in a windows box, and a quick NTFS
>> format was done. Then much screaming occurred.
>>
>> I know the data is still there. [...]
>> I can't run a normal recovery as only half of each file is there.
>
> Welcome to RAID-0...

Hugo, Chris Murphy, or one of the devs should they take an interest, are your best bets for current recovery. This reply only tries to fill in some recommendations for an eventual rebuild.

As Hugo implies, RAID-0 mode, not just for btrfs but in general, is well known among admins for being "garbage data not worth trying to recover" mode. Not only is there no redundancy, but with raid0 you're deliberately increasing the chances of loss, because now loss of any one device pretty well makes garbage of the entire array, and loss of any single device in a group of more than one is more likely than loss of any single device by itself.

So the first rule of raid0: don't use it unless the data you're putting on it is indeed not worth trying to rebuild, either because you keep the backups updated and it's easier to just go back to them than to even try recovery of the raid0, or because the data really is garbage data (internet cache, temp files, etc.) that it's really just better to scrap and let the cache rebuild than to try to recover.

That's in general. For btrfs in particular, there are some additional considerations, altho they don't change the above.

If the data isn't quite down to the raid0-garbage, just-give-up-and-start-over level, with btrfs what you likely want is metadata raid1, data single mode, which is the btrfs multi-device default. The raid1 metadata mode means there are two copies of metadata, one on each of two different devices, so it'll tolerate loss of a single device and still let you at least know where the files are located, giving you a chance at recovery. And since metadata is typically a small fraction of the total, you'll not be sacrificing /too/ much space for that additional safety.

The single data mode will normally put files (under a gig filesize anyway, tho as the size increases toward a gig the chances of it all being on a single device go down) all on one device, so with the loss of a device, you'll either still have the file or you won't. The contrast with raid0 mode is that its line is 64k instead of a gig, above which the file will be striped across multiple devices. So indeed, with a two-device raid0, half of each file, in alternating 64k pieces, is what you have left if one of the devices goes bad, while with single, your chances of whole-file recovery, assuming it wasn't /entirely/ on the bad device, are pretty good up to a gig or so.

And because btrfs is still in the stabilizing, "get-the-code-correct-before-you-worry-about-optimizing-it" mode, unlike more mature raid implementations such as the kernel's mdraid, btrfs still normally accesses only one device at a time, so btrfs raid0 only gets you the space advantage, not the usual raid0 speed advantage. So btrfs single mode isn't really much if any slower than raid0, while being much safer and offering the same (or even better, in the case of differing device sizes) size advantage as raid0.

Put differently, there's really very little case for choosing btrfs raid0 mode at this time. Maybe some years in the future, when raid0 mode is speed-optimized, that will change, but for now single mode is safer, makes better use of space in the case of unequal device sizes, and is generally as fast as raid0 mode, so single mode is almost certainly a better choice. (A sketch of creating or converting to that layout follows at the end of this message.)

Meanwhile, back to the general case again:

Admin's first rule of backups: The *true* value you place on your data is defined not by arbitrary claims, but by the number of backups of that data you have. No backups, much like putting the data on raid0, defines that data as of garbage value: not worth the trouble to try to recover in the case of raid0, not worth the trouble of making the backup in the first place in the case of no backup.

Of course really valuable data will have multiple backups, generally some of which are off-site in case the entire site is lost (flood, fire, earthquake, bomb, etc), while others are on-site in order to facilitate easy recovery from a less major disaster, should it be necessary.

Which means, regardless of whether files are lost or not, what was of most value as defined by an admin's actions (or lack of them in the case of not having a backup) is always saved: either the time/resources/trouble of making the backup in the first place, if the data wasn't worth it, or the data, if the backup was made and is thus available.
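For reference, a hedged sketch of that raid1-metadata / single-data layout, both at mkfs time and as an in-place conversion of an existing multi-device filesystem (device paths and mountpoint are made up):

# mkfs.btrfs -m raid1 -d single /dev/sdb /dev/sdc
# btrfs balance start -mconvert=raid1 -dconvert=single /mnt

The convert filters rewrite every existing chunk to the new profile, so on a large filesystem the balance takes a while, but it can run with the filesystem mounted and in use.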
Re: Raid0 rescue
> > Correct, I should have said 'superblock'.
> > It is/was raid0. Funny thing is that this all happened when I was
> > prepping to convert to raid1.
>
> If youre metadata was also RAID-0, then your filesystem is almost
> certainly toast. If any part of the btrfs metadata was overwritten by
> some of the NTFS metadata, then the FS will be broken (somewhere) and
> probably not in a fixable way.

It should have been raid1, as I believe that is the default for metadata when creating a btrfs volume. How do I put the good copy back on the corrupt volume? I can't even look at the metadata on the good disk, as it complains about one of the disks being missing.

> > running a btrfs-find-root shows this (which gives me hope)
> > Well block 4871870791680(gen: 73257 level: 1) seems good, but
> > generation/level doesn't match, want gen: 73258 level: 1
> > Well block 4639933562880(gen: 73256 level: 1) seems good, but
> > generation/level doesn't match, want gen: 73258 level: 1
> > Well block 4639935168512(gen: 73255 level: 1) seems good, but
> > generation/level doesn't match, want gen: 73258 level: 1
> > Well block 4639926239232(gen: 73242 level: 0) seems good, but
> > generation/level doesn't match, want gen: 73258 level: 1
> >
> > but when I run btrfs inspect-internal dump-tree -r /dev/sdc1
> >
> > checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
> > checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
> > checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
> > checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
> > bytenr mismatch, want=874856448, have=8568478783891655077
>
> This would suggest that some fairly important part of the metadata was
> damaged. You'll probably spend far less effort recovering the data by
> restoring your backups than trying to fix this.
>
> Hugo.
>
> > root tree: 4871875543040 level 1
> > chunk tree: 20971520 level 1
> > extent tree key (EXTENT_TREE ROOT_ITEM 0) 4871875559424 level 2
> > device tree key (DEV_TREE ROOT_ITEM 0) 4635801976832 level 1
> > fs tree key (FS_TREE ROOT_ITEM 0) 4871870414848 level 3
> > checksum tree key (CSUM_TREE ROOT_ITEM 0) 4871876034560 level 3
> > uuid tree key (UUID_TREE ROOT_ITEM 0) 29376512 level 0
> > checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> > checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> > checksum verify failed on 728891392 found F4F3A4AD wanted E6D063C7
> > checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> > bytenr mismatch, want=728891392, have=269659807399918462
> > total bytes 5000989728768
> > bytes used 3400345264128
> >
> > On Thu, Jul 27, 2017 at 11:10 AM, Hugo Mills wrote:
> > > On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
> > > > I know I am screwed but hope someone here can point at a possible solution.
> > > >
> > > > I had a pair of btrfs drives in a raid0 configuration. One of the
> > > > drives was pulled by mistake, put in a windows box, and a quick NTFS
> > > > format was done. Then much screaming occurred.
> > > >
> > > > I know the data is still there.
> > >
> > > Well, except for all the parts overwritten by a blank NTFS metadata
> > > structure.
> > >
> > > > Is there anyway to rebuild the raid
> > > > bringing in the bad disk? I know some info is still good, for example
> > > > metadata0 is corrupt but 1 and 2 are good.
> > >
> > > I assume you mean superblock there.
> > >
> > > > The trees look bad which is probably the killer.
> > >
> > > We really should improve the error messages at some point. Whatever
> > > you're inferring from the kernel logs is probably not quite right. :)
> > >
> > > What's the metadata configuration on this FS? Also RAID-0? or RAID-1?
> > >
> > > > I can't run a normal recovery as only half of each file is there.
> > >
> > > Welcome to RAID-0...
> > >
> > > Hugo.
Re: Raid0 rescue
On Thu, Jul 27, 2017 at 03:43:37PM -0400, Alan Brand wrote:
> Correct, I should have said 'superblock'.
> It is/was raid0. Funny thing is that this all happened when I was
> prepping to convert to raid1.

If your metadata was also RAID-0, then your filesystem is almost certainly toast. If any part of the btrfs metadata was overwritten by some of the NTFS metadata, then the FS will be broken (somewhere) and probably not in a fixable way.

> running a btrfs-find-root shows this (which gives me hope)
> Well block 4871870791680(gen: 73257 level: 1) seems good, but
> generation/level doesn't match, want gen: 73258 level: 1
> Well block 4639933562880(gen: 73256 level: 1) seems good, but
> generation/level doesn't match, want gen: 73258 level: 1
> Well block 4639935168512(gen: 73255 level: 1) seems good, but
> generation/level doesn't match, want gen: 73258 level: 1
> Well block 4639926239232(gen: 73242 level: 0) seems good, but
> generation/level doesn't match, want gen: 73258 level: 1
>
> but when I run btrfs inspect-internal dump-tree -r /dev/sdc1
>
> checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
> checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
> checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
> checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
> bytenr mismatch, want=874856448, have=8568478783891655077

This would suggest that some fairly important part of the metadata was damaged. You'll probably spend far less effort recovering the data by restoring your backups than trying to fix this.

Hugo.

> root tree: 4871875543040 level 1
> chunk tree: 20971520 level 1
> extent tree key (EXTENT_TREE ROOT_ITEM 0) 4871875559424 level 2
> device tree key (DEV_TREE ROOT_ITEM 0) 4635801976832 level 1
> fs tree key (FS_TREE ROOT_ITEM 0) 4871870414848 level 3
> checksum tree key (CSUM_TREE ROOT_ITEM 0) 4871876034560 level 3
> uuid tree key (UUID_TREE ROOT_ITEM 0) 29376512 level 0
> checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> checksum verify failed on 728891392 found F4F3A4AD wanted E6D063C7
> checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
> bytenr mismatch, want=728891392, have=269659807399918462
> total bytes 5000989728768
> bytes used 3400345264128
>
> On Thu, Jul 27, 2017 at 11:10 AM, Hugo Mills wrote:
> > On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
> > > I know I am screwed but hope someone here can point at a possible solution.
> > >
> > > I had a pair of btrfs drives in a raid0 configuration. One of the
> > > drives was pulled by mistake, put in a windows box, and a quick NTFS
> > > format was done. Then much screaming occurred.
> > >
> > > I know the data is still there.
> >
> > Well, except for all the parts overwritten by a blank NTFS metadata
> > structure.
> >
> > > Is there anyway to rebuild the raid
> > > bringing in the bad disk? I know some info is still good, for example
> > > metadata0 is corrupt but 1 and 2 are good.
> >
> > I assume you mean superblock there.
> >
> > > The trees look bad which is probably the killer.
> >
> > We really should improve the error messages at some point. Whatever
> > you're inferring from the kernel logs is probably not quite right. :)
> >
> > What's the metadata configuration on this FS? Also RAID-0? or RAID-1?
> >
> > > I can't run a normal recovery as only half of each file is there.
> >
> > Welcome to RAID-0...
> >
> > Hugo.

--
Hugo Mills              | Great oxymorons of the world, no. 1:
hugo@... carfax.org.uk  | Family Holiday
http://carfax.org.uk/   |
PGP: E2AB1DE4           |
Re: Raid0 rescue
Correct, I should have said 'superblock'.
It is/was raid0. Funny thing is that this all happened when I was prepping to convert to raid1.

running a btrfs-find-root shows this (which gives me hope)

Well block 4871870791680(gen: 73257 level: 1) seems good, but
generation/level doesn't match, want gen: 73258 level: 1
Well block 4639933562880(gen: 73256 level: 1) seems good, but
generation/level doesn't match, want gen: 73258 level: 1
Well block 4639935168512(gen: 73255 level: 1) seems good, but
generation/level doesn't match, want gen: 73258 level: 1
Well block 4639926239232(gen: 73242 level: 0) seems good, but
generation/level doesn't match, want gen: 73258 level: 1

but when I run btrfs inspect-internal dump-tree -r /dev/sdc1

checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
checksum verify failed on 874856448 found 5A85B5D9 wanted 17E3CB7D
checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
checksum verify failed on 874856448 found 2204C752 wanted C6ADDF7E
bytenr mismatch, want=874856448, have=8568478783891655077
root tree: 4871875543040 level 1
chunk tree: 20971520 level 1
extent tree key (EXTENT_TREE ROOT_ITEM 0) 4871875559424 level 2
device tree key (DEV_TREE ROOT_ITEM 0) 4635801976832 level 1
fs tree key (FS_TREE ROOT_ITEM 0) 4871870414848 level 3
checksum tree key (CSUM_TREE ROOT_ITEM 0) 4871876034560 level 3
uuid tree key (UUID_TREE ROOT_ITEM 0) 29376512 level 0
checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
checksum verify failed on 728891392 found F4F3A4AD wanted E6D063C7
checksum verify failed on 728891392 found 75E2752C wanted D6CA4FB4
bytenr mismatch, want=728891392, have=269659807399918462
total bytes 5000989728768
bytes used 3400345264128


On Thu, Jul 27, 2017 at 11:10 AM, Hugo Mills wrote:
> On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
>> I know I am screwed but hope someone here can point at a possible solution.
>>
>> I had a pair of btrfs drives in a raid0 configuration. One of the
>> drives was pulled by mistake, put in a windows box, and a quick NTFS
>> format was done. Then much screaming occurred.
>>
>> I know the data is still there.
>
> Well, except for all the parts overwritten by a blank NTFS metadata
> structure.
>
>> Is there anyway to rebuild the raid
>> bringing in the bad disk? I know some info is still good, for example
>> metadata0 is corrupt but 1 and 2 are good.
>
> I assume you mean superblock there.
>
>> The trees look bad which is probably the killer.
>
> We really should improve the error messages at some point. Whatever
> you're inferring from the kernel logs is probably not quite right. :)
>
> What's the metadata configuration on this FS? Also RAID-0? or RAID-1?
>
>> I can't run a normal recovery as only half of each file is there.
>
> Welcome to RAID-0...
>
> Hugo.
>
> --
> Hugo Mills              | We don't just borrow words; on occasion, English has
> hugo@... carfax.org.uk  | pursued other languages down alleyways to beat them
> http://carfax.org.uk/   | unconscious and rifle their pockets for new
> PGP: E2AB1DE4           | vocabulary.  James D. Nicoll
Re: Raid0 rescue
On Thu, Jul 27, 2017 at 10:49:37AM -0400, Alan Brand wrote:
> I know I am screwed but hope someone here can point at a possible solution.
>
> I had a pair of btrfs drives in a raid0 configuration. One of the
> drives was pulled by mistake, put in a windows box, and a quick NTFS
> format was done. Then much screaming occurred.
>
> I know the data is still there.

Well, except for all the parts overwritten by a blank NTFS metadata structure.

> Is there anyway to rebuild the raid
> bringing in the bad disk? I know some info is still good, for example
> metadata0 is corrupt but 1 and 2 are good.

I assume you mean superblock there.

> The trees look bad which is probably the killer.

We really should improve the error messages at some point. Whatever you're inferring from the kernel logs is probably not quite right. :)

What's the metadata configuration on this FS? Also RAID-0? or RAID-1?

> I can't run a normal recovery as only half of each file is there.

Welcome to RAID-0...

Hugo.

--
Hugo Mills              | We don't just borrow words; on occasion, English has
hugo@... carfax.org.uk  | pursued other languages down alleyways to beat them
http://carfax.org.uk/   | unconscious and rifle their pockets for new
PGP: E2AB1DE4           | vocabulary.  James D. Nicoll
Raid0 rescue
I know I am screwed, but I hope someone here can point at a possible solution.

I had a pair of btrfs drives in a raid0 configuration. One of the drives was pulled by mistake, put in a windows box, and a quick NTFS format was done. Then much screaming occurred.

I know the data is still there. Is there any way to rebuild the raid, bringing in the bad disk? I know some info is still good; for example, metadata0 is corrupt but 1 and 2 are good. The trees look bad, which is probably the killer. I can't run a normal recovery as only half of each file is there.

$100 reward if you come up with a workable solution.