Re: Several unhappy btrfs's after RAID meltdown

2012-11-15 Thread Ryan C. Underwood

Finally made some more progress on one of my melted down btrfs from
earlier this year.

First I hacked find-root.c to not stop scanning the disk when it
thinks it has found the real root.  I wanted it to print out all
possible roots.  I saved the stderr output to a logfile.  About 1226
possible roots were found.
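
The capture step looked roughly like this (a sketch; the 'log' filename
matches what the loop below reads, and the binary name assumes the hacked
btrfs-progs tree):

# save every candidate-root line the hacked find-root prints on stderr
./btrfs-find-root /dev/mapper/tr5ut-vicep--library 2> log
# rough count of candidates (one "... block NNN ..." line each)
grep -c 'block' log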

Then I iterated over each of these with btrfs-restore, trying each one
as the tree root to see what files could be found:

for temp in `sed 's/^.*block \([0-9]\+\).*$/\1/' log`; do
    echo $temp
    nice ./btrfs-restore -t $temp /dev/mapper/tr5ut-vicep--library /mnt/recovery
done

In this way I was able to recover about 36GB of data, and the
directory structure of what was recovered looks fine.  The data looks
fine too, judging by scanning MIME types with file and spot-checking a
few text and HTML files manually.
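
The MIME-type pass was along these lines (a sketch; the target directory
is the restore destination used in the loop above):

# tally MIME types across everything restored, then spot-check a few
# text/plain and text/html hits by hand
find /mnt/recovery -type f -print0 | xargs -0 file --mime-type \
    | awk -F': ' '{print $2}' | sort | uniq -c | sort -rn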

There is still a lot of data missing though.  If I am reading this
correctly there was about 300GB of data which compressed to 254GB
on-disk.

Label: 'vicep-library'  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
Total devices 1 FS bytes used 254.35GB
devid    1 size 1.00TB used 299.04GB path /dev/dm-27

A lot of my btrfs-restore output looks like this:

318259351552
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
parent transid verify failed on 318259351552 wanted 575931 found 546662
Ignoring transid failure
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
parent transid verify failed on 318125375488 wanted 541528 found 572360
Ignoring transid failure
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
parent transid verify failed on 561016832 wanted 544038 found 574369
Ignoring transid failure
leaf parent key incorrect 561016832
Root objectid is 5
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
parent transid verify failed on 164073472 wanted 544650 found 562972
Ignoring transid failure
leaf parent key incorrect 164073472
Error searching -1

As far as I can see, only root objectid 5 was found; at least I don't
see any others in the output.  That could account for the missing data.
How could I get to the other root objects?
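
One way to double-check that, assuming the output of each btrfs-restore
run in the loop above had also been saved (hypothetical per-candidate
files named out.$temp):

# list every root objectid any candidate root ever reported
grep -h 'Root objectid is' out.* | sort | uniq -c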

-- 
Ryan C. Underwood, neme...@icequake.net




Re: Several unhappy btrfs's after RAID meltdown

2012-06-01 Thread Ryan C. Underwood

I made a little bit of progress recovering this mess; it seems
btrfs-progs has improved since I last tried.

# ./btrfs-find-root /dev/mapper/tr5ut-vicep--library
[..]
Well block 317865713664 seems great, but generation doesn't match, have=574372, want=575931
Well block 317874491392 seems great, but generation doesn't match, have=575930, want=575931
Found tree root at 317874626560

Seems like this is a good sign that btrfs-find-root was able to find
the root.

But I'm still stuck on this trying to run btrfs-restore:

# ./btrfs-restore -v -i -u 1 -t 317874626560 /dev/mapper/tr5ut-vicep--library .
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
Csum didn't match
btrfs-restore: disk-io.c:441:
find_and_setup_root: Assertion `!(ret)' failed.
Aborted

It seems like -i should ignore the csum mismatch; what am I missing?

-- 
Ryan C. Underwood, neme...@icequake.net




Re: Several unhappy btrfs's after RAID meltdown

2012-02-13 Thread David Sterba
On Sun, Feb 12, 2012 at 10:31:34AM -0600, Ryan C. Underwood wrote:
> So, I examined the below filesystem, the one of the two that I would
> really like to restore.  There is basically nothing but zeros, and
> very occasionally a sparse string of data, until exactly 0x20
> offset,

This matches the start of an allocation cluster.

> ... at which point the data is suddenly very packed and looks like
> usual compressed data should.  Is there a way one could de-LZO the
> data chunkwise and dump to another device so I could even get an idea
> what I am looking at?

If the blocks are in the right order, you can decompress the raw data from
the format:

[4B total length] [4B compressed chunk length][chunk data] [another chunk]

there is no signature of the compressed extent boundaries, but the
lengths stored are always smaller than 128K, so it's hex values like

23 04 00 00 | 34 01 00 00 | lzo data...

and should be detectable in the block sequence.
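
For example, the two length fields can be eyeballed at any candidate
offset like this (a sketch; the offset below is only a placeholder, and
the values are little-endian as in the example above):

# dump the first 8 bytes (total length, then first chunk length) of a
# 4K-aligned block; both should decode to values below 128K (0x20000)
off=33554432    # placeholder offset, must be 4K-aligned
dd if=/dev/mapper/tr5ut-vicep--library bs=4096 skip=$((off / 4096)) count=1 2>/dev/null | xxd -l 8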

> What about a 'superblock' signature I can scan
> for?

_BHRfS_M at offset 0x40 in a 4kb aligned block
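
So a brute-force scan for candidate superblocks could look like this (a
sketch; it reads the whole device in one pass, so expect it to be slow):

# print byte offsets of every _BHRfS_M hit, keeping only those that sit
# at offset 0x40 (64) inside a 4K-aligned block
grep -obUa '_BHRfS_M' /dev/mapper/tr5ut-vicep--library \
    | awk -F: '$1 % 4096 == 64 {print $1}'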


david


Re: Several unhappy btrfs's after RAID meltdown

2012-02-12 Thread Ryan C. Underwood

So, I examined the below filesystem, the one of the two that I would
really like to restore.  There is basically nothing but zeros, and
very occasionally a sparse string of data, until exactly 0x20
offset, at which point the data is suddenly very packed and looks like
usual compressed data should.  Is there a way one could de-LZO the
data chunkwise and dump to another device so I could even get an idea
what I am looking at?  What about a 'superblock' signature I can scan
for?

 # /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--library /mnt2
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 491D9C1A found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 Csum didn't match
 restore: root-tree.c:46: btrfs_find_last_root: Assertion
 `!(path->slots[0] == 0)' failed.
 Aborted

-- 
Ryan C. Underwood, neme...@icequake.net


Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Duncan
Ryan C. Underwood posted on Mon, 06 Feb 2012 21:39:45 -0600 as excerpted:

> Does anyone have any idea how I should proceed with the below quoted
> situation?  Unfortunately, I am going to have to give up on btrfs if it
> is really so fragile.  I am using kernel 3.2.2 and btrfs-tools from
> November.

Regardless of the technical details of your situation, keep in mind that 
btrfs is still experimental at this time and remains under heavy 
development, as you'll have noticed if you read the kernel's changelogs 
or this list at all.  Kernel 3.2.2 is relatively recent, although you 
could try the latest 3.3 rc or git kernel as well, but I'd suggest a 
btrfs-tools rebuild, as a November build really isn't particularly current.

However, complaining about the fragility of a still in development and 
marked experimental filesystem would seem disingenuous at best.  
Particularly when it's used on top of a dmcrypt layer that btrfs was 
known to have issues with (see the wiki), **AND** when you were using 
raid-5 and had not just a single spindle failure, but a double-spindle 
failure, a situation that's well outside anything raid-5 claims to handle 
(raid-6 OTOH... or triple-redundant raid-1 or raid-10...).

OK, so given you're running an experimental filesystem on a block-device 
stack it's known to have problems with, you surely had backups if the 
data was at all important to you.  Simply restore from those backups.  If 
you didn't care to make backups when running in such a known-unstable 
situation, well, the data couldn't have been that important to you after 
all, as you didn't care about it enough to do those backups, and, by the 
sound of things, not even enough to be informed about the development and 
stability status of the filesystem and block-device stack you were using.

IOW, yes, btrfs is to be considered fragile at this point.  It's still in 
development, there's not even an error-correcting btrfsck yet, and you 
were using it on a block-device stack that the wiki specifically mentions 
is problematic.  Both the btrfs kernel option and the wiki have big 
warnings about the stability at this point, specifically stating that 
it's not to be trusted to safely hold data yet.  If you were using it 
contrary to those warnings and lost data due to lack of backups, there's 
no one to blame but yourself.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Ryan C. Underwood

> > Unfortunately, I am going to have to give up on btrfs if it
> > is really so fragile.
>
> However, complaining about the fragility of a still in development and
> marked experimental filesystem would seem disingenuous at best.
[snip paragraphs of tut-tutting]
> IOW, yes, btrfs is to be considered fragile at this point.

So you re-stated my position.  I gave btrfs a chance but it is still
apparently far more fragile than ext4 when corruption is introduced --
although btrfs is the filesystem of the two which is specifically
designed to provide internal fault tolerance and resilience.  Is there
a fine line between user feedback and disingenuous complaining
that I am not aware of?

The data in question is not that important, though I would like to have
it back, considering it should mostly still be there, just as it was on
the ext4 volumes.  40MB of bad sectors on one 2TB disk in a 6TB volume
does not seem like a lot.  Even if the whole beginning of the volume was
wiped out, surely there is the equivalent of backup superblocks?  I can
hack on this if I could just get a clue where to start.
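
(For reference, btrfs does keep superblock copies at fixed offsets, 64KiB,
64MiB and 256GiB from the start of the device, each with the _BHRfS_M
magic 0x40 bytes in.  A quick check of all three, as a sketch:)

# print the 8 magic bytes at the primary and the two mirror superblock offsets
for off in $((64*1024)) $((64*1024*1024)) $((256*1024*1024*1024)); do
    printf '%s: ' $off
    dd if=/dev/mapper/tr5ut-vicep--library bs=1 skip=$((off + 64)) count=8 2>/dev/null
    echo
done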

-- 
Ryan C. Underwood, neme...@icequake.net


Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Mitch Harder
On Tue, Feb 7, 2012 at 8:04 AM, Ryan C. Underwood
nemesis-li...@icequake.net wrote:

  Unfortunately, I am going to have to give up on btrfs if it
  is really so fragile.

 However, complaining about the fragility of a still in development
 and
 marked experimental filesystem would seem disingenuous at best.
 [snip paragraphs of tut-tutting]
 IOW, yes, btrfs is to be considered fragile at this point.

 So you re-stated my position.  I gave btrfs a chance but it is still
 apparently far more fragile than ext4 when corruption is introduced --
 although btrfs is the filesystem of the two which is specifically
 designed to provide internal fault tolerance and resilience.  Is there
 a fine line between user feedback and disingenuous complaining
 that I am not aware of?

 The data in question is not that important, though I would like to
 have it back considering it should mostly still be there as on the
 ext4 volumes.  40MB of bad sectors on one 2TB disk in a 6TB volume
 does not seem like a lot.  Even if the whole beginning of the volume
 was wiped out surely there is the equivalent of backup superblocks?  I
 can hack if I could just get a clue where to start.


Since you're getting "failed to read /dev/sr0" messages, that might be
an indication there are some newer btrfs-progs tools available.

You might want to try building btrfs-progs from the git repository:
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=summary

There are some recovery tools there that may extract your data (look
at the recover program).
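
(A rough build recipe, for reference; the clone URL is the git:// form of
the gitweb link above, and the name of the restore binary has varied
between trees, so treat this as a sketch:)

git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
cd btrfs-progs
make
./restore -v <device> <target-dir>    # 'btrfs-restore' in some later trees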


Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Ryan C. Underwood

On Tue, Feb 07, 2012 at 12:17:23PM +0800, Liu Bo wrote:
 
  The failure occurred while the volumes were online and in use, so in
  addition to what was unreadable, all pending writes to the device
  between the failure and when the problem was discovered were lost as
  well.
 
 
 Hi Ryan,
 
 So on the failure, what does dmesg show?  checksum errors?

Dmesg at the time showed block errors on the RAID due to the multi-disk
failure.  I have attached a log from then, including the btrfs unhappiness.

Here is the oops I currently get on 3.2.2 when trying to mount the one of
the two btrfs volumes that btrfs-show is able to detect:

[ 1023.151683] device label vicep-library devid 1 transid 575931 
/dev/mapper/tr5ut-vicep--library
[ 1023.152136] btrfs: use lzo compression
[ 1023.152174] btrfs: disk space caching is enabled
[ 1023.191409] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 8E19212D level 0
[ 1023.211750] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 491D9C1A level 0
[ 1023.216243] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 8E19212D level 0
[ 1023.224252] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 491D9C1A level 0
[ 1023.224521] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 491D9C1A level 0
[ 1023.232211] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 8E19212D level 0
[ 1023.232456] btrfs: dm-32 checksum verify failed on 317874630656 wanted 
28ABE8A6 found 491D9C1A level 0
[ 1023.232549] [ cut here ]
[ 1023.232591] kernel BUG at fs/btrfs/disk-io.c:1203!
[ 1023.232627] invalid opcode:  [#1] SMP
[ 1023.232723] CPU 1
[ 1023.232755] Modules linked in: ext2 ext4 jbd2 crc16 it87 hwmon_vid loop 
snd_hda_codec_hdmi tpm_tis tpm tpm_bios snd_hda_codec_realtek pcspkr evdev wmi 
snd_hda_intel i2c_piix4 i2c_core k8temp edac_core e
dac_mce_amd snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore 
snd_page_alloc shpchp processor button thermal_sys pci_hotplug parport_pc 
parport ext3 jbd mbcache dm_snapshot aes_x86_64 aes_generic dm_crypt dm_mod 
raid1 md_mod nbd btrfs zlib_deflate crc32c libcrc32c xts gf128mul sg sr_mod 
cdrom sd_mod crc_t10dif ata_generic ohci_hcd pata_atiixp firewire_ohci ahci 
libahci firewire_core ehci_hcd libata tulip crc_itu_t scsi_mod usbcore floppy 
r8169 mii usb_common [last unloaded: scsi_wait_scan]
[ 1023.235168]
[ 1023.235203] Pid: 4829, comm: mount Not tainted 3.2.2 #3 Gigabyte Technology 
Co., Ltd. GA-MA78GPM-DS2H/GA-MA78GPM-DS2H
[ 1023.235335] RIP: 0010:[a00c0a9a]  [a00c0a9a] 
find_and_setup_root+0x5c/0xdc [btrfs]
[ 1023.235437] RSP: 0018:8801a7607b98  EFLAGS: 00010282
[ 1023.235473] RAX: fffe RBX: 8801a798b800 RCX: 0005
[ 1023.235510] RDX: fffb RSI: 0001af60 RDI: ea00069b1d40
[ 1023.235547] RBP: 8801a798f800 R08: a00bc092 R09: 
[ 1023.235584] R10: 8801a798f800 R11:  R12: 0002
[ 1023.235621] R13: 8801a7989400 R14: 0008c9bb R15: 8801a772f718
[ 1023.235659] FS:  7fee836557e0() GS:8801afc4() 
knlGS:
[ 1023.235699] CS:  0010 DS:  ES:  CR0: 80050033
[ 1023.235735] CR2: 7fee836a4000 CR3: 0001a7f41000 CR4: 06e0
[ 1023.235772] DR0:  DR1:  DR2: 
[ 1023.235809] DR3:  DR6: 0ff0 DR7: 0400
[ 1023.235846] Process mount (pid: 4829, threadinfo 8801a7606000, task 
8801a7f521c0)
[ 1023.235886] Stack:
[ 1023.235920]  0002 0008c9bb 8801a772f000 
8801a798f800
[ 1023.236080]  8801a772d000 a00c4436 0003 
817b16e0
[ 1023.236240]  1000 1000811a57f0 8801a7989680 
8801a798b800
[ 1023.236400] Call Trace:
[ 1023.236448]  [a00c4436] ? open_ctree+0xf6c/0x1535 [btrfs]
[ 1023.236489]  [810ff823] ? sget+0x39a/0x3ac
[ 1023.236501]  [a00a9fb5] ? btrfs_mount+0x3a2/0x539 [btrfs]
[ 1023.236501]  [810d54ed] ? pcpu_next_pop+0x37/0x43
[ 1023.236501]  [810d50f3] ? cpumask_next+0x18/0x1a
[ 1023.236501]  [810d6502] ? pcpu_alloc+0x875/0x8be
[ 1023.236501]  [810ff3ab] ? mount_fs+0x6c/0x14a
[ 1023.236501]  [81113715] ? vfs_kern_mount+0x61/0x97
[ 1023.236501]  [81114a2c] ? do_kern_mount+0x49/0xd6
[ 1023.236501]  [811151e1] ? do_mount+0x728/0x792
[ 1023.236501]  [810ee478] ? alloc_pages_current+0xa7/0xc9
[ 1023.236501]  [811152d3] ? sys_mount+0x88/0xc3
[ 1023.236501]  [81341152] ? system_call_fastpath+0x16/0x1b
[ 1023.236501] Code: 24 24 e8 23 f5 ff ff 48 8d 53 20 48 8d 8b 0f 01 00 00 4c 
89 e6 48 89 ef e8 fc b4 ff ff 89 c2 b8 fe ff ff ff 83 fa 00 7f 79 74 04 0f 0b 
eb fe 80 bb 0e 01 00 00 00 48 8b ab c0 00 00 

Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Ryan C. Underwood

On Tue, Feb 07, 2012 at 08:36:15AM -0600, Mitch Harder wrote:
 
> Since you're getting "failed to read /dev/sr0" messages, that might be
> an indication there are some newer btrfs-progs tools available.
 
> You might want to try building btrfs-progs from the git repository:
> http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git;a=summary

I did so; here's the new output (not much changed):
# /usr/local/btrfs-progs/bin/btrfs-show
**
** WARNING: this program is considered deprecated
** Please consider to switch to the btrfs utility
**
failed to read /dev/sr0: No medium found
Label: vicep-library  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
Total devices 1 FS bytes used 254.35GB
devid    1 size 1.00TB used 299.04GB path /dev/dm-32

Btrfs Btrfs v0.19

# /usr/local/btrfs-progs/bin/btrfs device scan
Scanning for Btrfs filesystems
failed to read /dev/sr0

 
> There are some recovery tools there that may extract your data (look
> at the recover program).

I found a 'restore' program; are you referring to the mount option '-o
recovery'?

-- 
Ryan C. Underwood, neme...@icequake.net


Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Ryan C. Underwood

Output of 'restore':

# /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--clones /mnt2
No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones
Could not open root, trying backup super
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=3150973834573588028
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
read block failed check_tree_block
Couldn't read tree root
Could not open root, trying backup super
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=3150973834573588028
Check tree block failed, want=298807296, have=13791616683601169802
Check tree block failed, want=298807296, have=13791616683601169802
read block failed check_tree_block
Couldn't read tree root
Could not open root, trying backup super
[...here the output ends, seems to not complete?]

# /usr/local/btrfs-progs/bin/restore -v /dev/mapper/tr5ut-vicep--library /mnt2
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
checksum verify failed on 317874630656 wanted 491D9C1A found FFA6
checksum verify failed on 317874630656 wanted 8E19212D found FFA6
Csum didn't match
restore: root-tree.c:46: btrfs_find_last_root: Assertion
`!(path->slots[0] == 0)' failed.
Aborted


-- 
Ryan C. Underwood, neme...@icequake.net


Re: Several unhappy btrfs's after RAID meltdown

2012-02-07 Thread Chris Samuel
On Tuesday 07 February 2012 20:53:59 Duncan wrote:

> Kernel 3.2.2 is relatively recent, although you could
> try the latest 3.3 rc or git kernel as well

Please keep in mind that work done in git does not appear to get 
backported to the stable updates for releases (such as 3.2.x); in 
other words, you'll have the same btrfs code as in the first 3.2 
release.

You will need to use RCs (or git) for the current btrfs kernel code.

> Particularly when it's used on top of a dmcrypt layer that btrfs was
> known to have issues with

I believe the issues between btrfs and dm-crypt have been sorted out 
as of 3.2 (going on an earlier posting of Chris Mason's).

Returning to the OP's case, I'm surprised that ext4 was able to get 
anything back, and I'd say that's a testament to its long development 
life (ext -> ext2 -> ext3 -> ext4) in comparison to btrfs.  If that 
happened on a system I was sysadmin'ing (and it has: losing an entire 
tray of drives in a RAID array due to controller firmware bugs really 
spoils your day) I'd be reaching for the backup tapes about now.

Best of luck!
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP




Re: Several unhappy btrfs's after RAID meltdown

2012-02-06 Thread Ryan C. Underwood

Does anyone have any idea how I should proceed with the below quoted
situation?  Unfortunately, I am going to have to give up on btrfs if
it is really so fragile.  I am using kernel 3.2.2 and btrfs-tools
from November.

On Sun, Feb 05, 2012 at 12:41:28PM -0600, Ryan C. Underwood wrote:
 
 Hi,
 
 I had a RAID5 double disk failure (40 megs or so bad sectors near
 middle of the second failed disk), bad news but I recovered what I was
 able to.
 
 The RAID contained a dm-crypt physical volume which then contained
 four logical volumes.  Two are EXT4 and two BTRFS, about 1TB in size
 each.
 
 The failure occurred while the volumes were online and in use, so in
 addition to what was unreadable, all pending writes to the device
 between the failure and when the problem was discovered were lost as
 well.
 
 The two ext4, fortunately, had some relatively minor corruption which
 was cleared up with a few rounds of fsck.  The two btrfs are
 completely unhappy though and I do not know how to proceed, since
 btrfs problems are new to me.  Any suggestions are welcome.
 
 Here is the basic picture of what is going on.
 
 # cat /etc/fstab
 # file system mount point   type  options   dump  pass
 #/dev/mapper/tr5ut-media/mnt/media  btrfs
 defaults,compress=lzo,space_cache 0   2
 
 /dev/mapper/tr5ut-media /mnt/media  ext4 defaults 0 2
 
 /dev/mapper/tr5ut-vicep--library   /vicepa auto
 defaults,compress=lzo,space_cache  0   2
 
 /dev/mapper/tr5ut-vicep--clones /vicepb auto
 defaults,compress=lzo,space_cache  0   2
 
 
 You can see that btrfs device scan does not find anything, while
 btrfs-show finds one of the volumes and not the other.  Fscking the
 found volume halts due to checksum and assertion errors, while fscking
 the other volume fails completely, I guess due to a missing
 'superblock' type structure?
 
 
 seraph:~# btrfs device scan
 Scanning for Btrfs filesystems
 failed to read /dev/sr0
 
 
 seraph:~# btrfs-show
 **
 ** WARNING: this program is considered deprecated
 ** Please consider to switch to the btrfs utility
 **
 failed to read /dev/sr0: No medium found
 Label: vicep-library  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
 Total devices 1 FS bytes used 254.35GB
 devid    1 size 1.00TB used 299.04GB path /dev/dm-32
 
 Btrfs Btrfs v0.19
 
 
 seraph:~# btrfsck /dev/mapper/tr5ut-vicep--library
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 491D9C1A found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 Csum didn't match
 btrfsck: root-tree.c:46: btrfs_find_last_root: Assertion
 `!(path->slots[0] == 0)' failed.
 Aborted
 
 
 seraph:~# btrfsck /dev/mapper/tr5ut-vicep--clones
 No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones
 
 
 seraph:~# dpkg -l btrfs-tools
 Desired=Unknown/Install/Remove/Purge/Hold
 |
 Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
 |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
 ||/ Name  Version
 Description
 +++-=-=-==
 ii  btrfs-tools   0.19+2005-2
 Checksumming Copy on Write Filesystem utilities
 
 
 -- 
 Ryan C. Underwood, neme...@icequake.net
 

-- 
Ryan C. Underwood, neme...@icequake.net


Re: Several unhappy btrfs's after RAID meltdown

2012-02-06 Thread Liu Bo
On 02/07/2012 11:39 AM, Ryan C. Underwood wrote:
 Does anyone have any idea how I should proceed with the below quoted
 situation?  Unfortunately, I am going to have to give up on btrfs if
 it is really so fragile.  I am using kernel 3.2.2 and btrfs-tools
 from November.
 
 On Sun, Feb 05, 2012 at 12:41:28PM -0600, Ryan C. Underwood wrote:
 Hi,

 I had a RAID5 double disk failure (40 megs or so bad sectors near
 middle of the second failed disk), bad news but I recovered what I was
 able to.

 The RAID contained a dm-crypt physical volume which then contained
 four logical volumes.  Two are EXT4 and two BTRFS, about 1TB in size
 each.

 The failure occurred while the volumes were online and in use, so in
 addition to what was unreadable, all pending writes to the device
 between the failure and when the problem was discovered were lost as
 well.


Hi Ryan,

So on the failure, what does dmesg show?  checksum errors?


 The two ext4, fortunately, had some relatively minor corruption which
 was cleared up with a few rounds of fsck.  The two btrfs are
 completely unhappy though and I do not know how to proceed, since
 btrfs problems are new to me.  Any suggestions are welcome.


btrfsck is not ready for data recovery, but only for error checking.
But btrfs-tools do have some features that may help us, e.g. zero-log.

For more recovery details, refer to the thread from Hugo:
http://www.spinics.net/lists/linux-btrfs/msg14890.html
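
(For reference, btrfs-zero-log simply discards the log tree so that mount
stops trying to replay it; the invocation takes just the device, as a
sketch:)

./btrfs-zero-log /dev/mapper/tr5ut-vicep--library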


thanks,
liubo

 Here is the basic picture of what is going on.

 # cat /etc/fstab
 # file system mount point   type  options   dump  pass
 #/dev/mapper/tr5ut-media/mnt/media  btrfs
 defaults,compress=lzo,space_cache 0   2

 /dev/mapper/tr5ut-media /mnt/media  ext4 defaults 0 2

 /dev/mapper/tr5ut-vicep--library   /vicepa auto
 defaults,compress=lzo,space_cache  0   2

 /dev/mapper/tr5ut-vicep--clones /vicepb auto
 defaults,compress=lzo,space_cache  0   2


 You can see that btrfs device scan does not find anything, while
 btrfs-show finds one of the volumes and not the other.  Fscking the
 found volume halts due to checksum and assertion errors, while fscking
 the other volume fails completely, I guess due to a missing
 'superblock' type structure?


 seraph:~# btrfs device scan
 Scanning for Btrfs filesystems
 failed to read /dev/sr0


 seraph:~# btrfs-show
 **
 ** WARNING: this program is considered deprecated
 ** Please consider to switch to the btrfs utility
 **
 failed to read /dev/sr0: No medium found
 Label: vicep-library  uuid: 89b14d35-b31a-4fbe-a2d9-cb83cbcd3851
 Total devices 1 FS bytes used 254.35GB
 devid    1 size 1.00TB used 299.04GB path /dev/dm-32

 Btrfs Btrfs v0.19


 seraph:~# btrfsck /dev/mapper/tr5ut-vicep--library
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 checksum verify failed on 317874630656 wanted 491D9C1A found FFA6
 checksum verify failed on 317874630656 wanted 8E19212D found FFA6
 Csum didn't match
 btrfsck: root-tree.c:46: btrfs_find_last_root: Assertion
 `!(path->slots[0] == 0)' failed.
 Aborted


 seraph:~# btrfsck /dev/mapper/tr5ut-vicep--clones
 No valid Btrfs found on /dev/mapper/tr5ut-vicep--clones


 seraph:~# dpkg -l btrfs-tools
 Desired=Unknown/Install/Remove/Purge/Hold
 |
 Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
 |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
 ||/ Name  Version
 Description
 +++-=-=-==
 ii  btrfs-tools   0.19+2005-2
 Checksumming Copy on Write Filesystem utilities


 -- 
 Ryan C. Underwood, neme...@icequake.net

 
