Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-07 Thread Sander
Hello Steve,

Steve Freitas wrote (ao):
 Alright, I'll trash it and start over with a different drive.

At the risk of stating the obvious: you could do a few destructive
badblocks runs on that disk to see whether SMART keeps adding to the
bad-block list.

With kind regards, Sander
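
A minimal sketch of how such a before/after comparison could be automated, assuming the SMART attributes are captured as text from `smartctl -A` before and after each destructive badblocks run (write mode, `-w`, which erases the disk). The attribute names Reallocated_Sector_Ct and Current_Pending_Sector and the ten-column layout are assumptions that vary by drive and smartmontools version, and the two snapshots below are made-up sample data, not output from the disk discussed in this thread.

```python
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")


def parse_counters(smart_text):
    """Extract raw values of the watched attributes from smartctl -A style text."""
    counters = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            counters[fields[1]] = int(fields[9])
    return counters


def grew(before, after):
    """Return the attributes whose raw value increased between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Hypothetical snapshots for illustration only.
BEFORE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
"""
AFTER = """\
  5 Reallocated_Sector_Ct   0x0033   099   099   036    Pre-fail  Always       -       20
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(grew(parse_counters(BEFORE), parse_counters(AFTER)))
```

If the reallocated count keeps rising from one run to the next, the disk is probably still degrading.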

-- 
Humilis IT Services and Solutions
http://www.humilis.net
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))

2010-01-07 Thread Steve Freitas
Hi all,

I was under the mistaken impression that btrfs checksumming, in its
current default configuration, protected your data from bitrot. It
appears this is not the case:

On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
 On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
  So please correct me if I have some mistaken assumptions. I thought
  btrfs would be tolerant of that -- if a block failed the checksum test,
  it would reconstruct and remap it. 

 Only if enough redundancy is left. And with the default setup btrfs is only 
 mirroring the metadata not the data.

So can someone please tell me what the current state-of-the-art is of
data protection with btrfs? Does it differ with single-device versus
multiple-device configurations? Is it possible to enable data
checksumming now? Under what conditions? And will it do what a naive
user would expect it to do, namely, correct for diverse kinds of errors
in your storage subsystem? If not, what does it do? Etc...

Any and all information is much appreciated.

Thanks!

Steve



Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))

2010-01-07 Thread jim owens
Steve Freitas wrote:
 Hi all,
 
 I was under the mistaken impression that btrfs checksumming, in its
 current default configuration, protected your data from bitrot. It
 appears this is not the case:
 
 On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
 On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
 So please correct me if I have some mistaken assumptions. I thought
 btrfs would be tolerant of that -- if a block failed the checksum test,
 it would reconstruct and remap it. 
 
 Only if enough redundancy is left. And with the default setup btrfs is only 
 mirroring the metadata not the data.
 
 So can someone please tell me what the current state-of-the-art is of
 data protection with btrfs? Does it differ with single-device versus
 multiple-device configurations? Is it possible to enable data
 checksumming now? Under what conditions? And will it do what a naive
 user would expect it to do, namely, correct for diverse kinds of errors
 in your storage subsystem? If not, what does it do? Etc...

First, understand that a checksum only says this block is good or bad.

The checksum can not be used to reconstruct the data.

Checksums are present for all btrfs blocks unless you explicitly shut
them off with mount/ioctl/fcntl options.

To have a good copy you can use as a replacement block, you must
use either btrfs raid1 or raid10.  You can use raid1 with 1 drive,
in a mode called dup where both copies are made to that device.

By default with 1 drive, btrfs uses dup for metadata and 1 copy
(nodup) for file data blocks. To get file data dup, you just use
mkfs.btrfs -d raid1.

If you have btrfs raid, it will find the good block on a read, but
AFAIK we don't have tools yet to automatically reallocate the bad one.

jim
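
The distinction between detecting and repairing can be illustrated with a small Python sketch. It is a toy model, not btrfs code: `zlib.crc32` stands in for the checksum (btrfs uses a different CRC variant), and `read_block` is a hypothetical helper that tries each stored copy in turn.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum: zlib's CRC-32, not the CRC-32C variant btrfs uses.
    return zlib.crc32(block)


def read_block(copies, stored_csum):
    """Return the first copy whose checksum matches, or raise if none does."""
    for data in copies:
        if checksum(data) == stored_csum:
            return data
    raise IOError("checksum mismatch on every copy; nothing to rebuild from")


good = b"file data block"
csum = checksum(good)
rotted = b"file dat\x00 block"  # same length, one byte damaged

# One copy: the damage is detected, but there is no good data to return.
try:
    read_block([rotted], csum)
    single_copy_result = "returned data"
except IOError:
    single_copy_result = "read error"

# Two copies (as with raid1 or dup): the intact copy is returned instead.
mirrored_result = read_block([rotted, good], csum)
```

With a single copy the read fails; with a second copy the reader still gets correct data, although in this sketch, as in the description above, the damaged copy is left in place.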


Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))

2010-01-07 Thread Johannes Hirte
On Thursday 07 January 2010 20:29:49, jim owens wrote:
 Steve Freitas wrote:
  Hi all,
 
  I was under the mistaken impression that btrfs checksumming, in its
  current default configuration, protected your data from bitrot. It
  appears this is not the case:
 
  On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
  On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
  So please correct me if I have some mistaken assumptions. I thought
  btrfs would be tolerant of that -- if a block failed the checksum test,
  it would reconstruct and remap it.
 
  Only if enough redundancy is left. And with the default setup btrfs is
  only mirroring the metadata not the data.
 
  So can someone please tell me what the current state-of-the-art is of
  data protection with btrfs? Does it differ with single-device versus
  multiple-device configurations? Is it possible to enable data
  checksumming now? Under what conditions? And will it do what a naive
  user would expect it to do, namely, correct for diverse kinds of errors
  in your storage subsystem? If not, what does it do? Etc...
 
 First, understand that a checksum only says this block is good or bad.
 
 The checksum can not be used to reconstruct the data.
 
 Checksums are present for all btrfs blocks unless you explicitly shut
 them off with mount/ioctl/fcntl options.
 
 To have a good copy you can use as a replacement block, you must
 use either btrfs raid1 or raid10.  You can use raid1 with 1 drive,
 in a mode called dup where both copies are made to that device.
 
 By default with 1 drive, btrfs uses dup for metadata and 1 copy
 (nodup) for file data blocks. To get file data dup, you just use
 mkfs.btrfs -d raid1.
 
 If you have btrfs raid, it will find the good block on a read, but
 AFAIK we don't have tools yet to automatically reallocate the bad one.
 
 jim

Additionally, I repeat Sander's suggestion: check your drive for bad
blocks. It sounds very likely that your drive is bad, and you will get into
trouble again with the newly created FS. And the Oops you posted smells like
a bug in the btrfs code.

regards,
  Johannes


Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-06 Thread Sander
Hello Steve,

Steve Freitas wrote (ao):
 Should I take it by the lack of list response that I should just flush
 this partition down the toilet and start over? Or is everybody either
 flummoxed or on vacation?

I don't have your original mail, but I think I remember you mentioned a
lot of bad sectors on that disk reported by SMART.

If that is indeed the case, it might be difficult for the people who might
be able to help you, to help you.

Please ignore me if I confused your mail with another.

With kind regards, Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-06 Thread Steve Freitas
Hi Sander,

On Wed, 2010-01-06 at 08:52 +0100, Sander wrote:
 I don't have your original mail, but I think I remember you mentioned a
 lot of bad sectors on that disk reported by SMART.
 
 If that is indeed the case, it might be difficult for the people who might
 be able to help you, to help you.

Thanks for your response. You're correct about the bad sector warning.
So please correct me if I have some mistaken assumptions. I thought
btrfs would be tolerant of that -- if a block failed the checksum test,
it would reconstruct and remap it. (Also, I assumed that if a drive
hadn't filled its bad sector remapping table, it could handle it at the
hardware level, and SMART's warning was just that -- a warning, not a
dire pronouncement of utter unsuitability -- but that's something else.)

Steve



Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-06 Thread Johannes Hirte
On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
 Hi Sander,
 
 On Wed, 2010-01-06 at 08:52 +0100, Sander wrote:
  I don't have your original mail, but I think I remember you mentioned a
  lot of bad sectors on that disk reported by SMART.
 
  If that is indeed the case, it might be difficult for the people who might
  be able to help you, to help you.
 
 Thanks for your response. You're correct about the bad sector warning.
 So please correct me if I have some mistaken assumptions. I thought
 btrfs would be tolerant of that -- if a block failed the checksum test,
 it would reconstruct and remap it.

Only if enough redundancy is left. And with the default setup btrfs is only
mirroring the metadata not the data.

 (Also, I assumed that if a drive
 hadn't filled its bad sector remapping table, it could handle it at the
 hardware level, and SMART's warning was just that -- a warning, not a
 dire pronouncement of utter unsuitability -- but that's something else.)

The drive only remaps bad sectors when they are written. Until then,
they are only marked as pending. Since you wrote that SMART detected
many bad blocks, I suspect the FS is really damaged. And as btrfsck is
limited, I don't think it can fix this.

regards,
  Johannes
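
The pending-versus-remapped behaviour described above can be sketched as a toy model in Python. `ToyDrive` and all its fields are hypothetical names, and the logic is a deliberate simplification: real firmware is more involved and differs between vendors.

```python
class ToyDrive:
    """Toy model: a failing read marks a sector pending; a write remaps it."""

    def __init__(self, bad_sectors, spare_sectors):
        self.bad = set(bad_sectors)   # physically damaged sectors
        self.pending = set()          # noticed on read, not yet remapped
        self.remapped = set()         # redirected to a spare sector
        self.spares = spare_sectors   # spare sectors still available
        self.data = {}

    def read(self, sector):
        if sector in self.bad and sector not in self.remapped:
            # The drive cannot return the data; it only records the problem.
            self.pending.add(sector)
            raise IOError(f"unreadable sector {sector}")
        return self.data.get(sector, b"")

    def write(self, sector, payload):
        if sector in self.bad and sector not in self.remapped:
            # New data is available, so the sector can move to a spare.
            if self.spares == 0:
                raise IOError("no spare sectors left")
            self.spares -= 1
            self.remapped.add(sector)
            self.pending.discard(sector)
        self.data[sector] = payload


drive = ToyDrive(bad_sectors={7}, spare_sectors=1)
try:
    drive.read(7)
except IOError:
    pass
pending_after_read = set(drive.pending)  # sector 7 is pending, data is lost
drive.write(7, b"new")                   # the write triggers the remap
```

The point of the model is that the remap only makes the sector usable again; whatever was stored there before the failure is gone unless another copy exists elsewhere.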


Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-06 Thread Steve Freitas
Hi Johannes,

On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
 On Wednesday 06 January 2010 16:59:55, Steve Freitas wrote:
  Thanks for your response. You're correct about the bad sector warning.
  So please correct me if I have some mistaken assumptions. I thought
  btrfs would be tolerant of that -- if a block failed the checksum test,
  it would reconstruct and remap it. 
 Only if enough redundancy is left. And with the default setup btrfs is only 
 mirroring the metadata not the data.

Okay. What capacity does btrfs have for reconstructing data, and how do
I enable it (if any) for a new partition? I think I've confused
checksums with magical ponies.

 The drive only remaps bad sectors when they are written. Until then,
 they are only marked as pending. Since you wrote that SMART detected
 many bad blocks, I suspect the FS is really damaged. And as btrfsck is
 limited, I don't think it can fix this.

Alright, I'll trash it and start over with a different drive.

Thanks,

Steve



Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-05 Thread Steve Freitas
Should I take it by the lack of list response that I should just flush
this partition down the toilet and start over? Or is everybody either
flummoxed or on vacation?

Steve

On Sun, 2010-01-03 at 16:37 -0800, Steve Freitas wrote:
 On Sun, 2010-01-03 at 14:57 -0800, Steve Freitas wrote:
  Got some more information. I installed Debian on another disk (rescue)
  running 2.6.32, pulled the latest btrfs module code from git, applied an
  earlier mentioned patch[1], then compiled and loaded the new module.
  It's able to mount the volume initially...
 
 I've just tried it again with the pure git pull, no patch, and the
 result (of an ls -R /mnt/btrfs_vol) was the same. 'Cept this time it
 never gave me a kernel traceback, just unending lines like:
 
 Jan  3 16:36:36 rescue kernel: [ 1046.494252] parent transid verify failed on 69140480 wanted 28342 found 29646
 
 Steve
 


Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)

2010-01-03 Thread Steve Freitas
On Sun, 2010-01-03 at 14:57 -0800, Steve Freitas wrote:
 Got some more information. I installed Debian on another disk (rescue)
 running 2.6.32, pulled the latest btrfs module code from git, applied an
 earlier mentioned patch[1], then compiled and loaded the new module.
 It's able to mount the volume initially...

I've just tried it again with the pure git pull, no patch, and the
result (of an ls -R /mnt/btrfs_vol) was the same. 'Cept this time it
never gave me a kernel traceback, just unending lines like:

Jan  3 16:36:36 rescue kernel: [ 1046.494252] parent transid verify failed on 69140480 wanted 28342 found 29646

Steve
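
When a log fills with lines like the one above, pulling out the three numbers makes it easier to see how many distinct blocks are affected. A minimal Python sketch, assuming the message text always has the shape shown in this mail; the field name `block` for the first number is a label chosen here, not one taken from the kernel source.

```python
import re

# Matches the message body seen in the log line quoted in this mail.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<block>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_error(line):
    """Return the three numbers from one log line, or None if it does not match."""
    match = TRANSID_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


line = (
    "Jan  3 16:36:36 rescue kernel: [ 1046.494252] parent transid verify "
    "failed on 69140480 wanted 28342 found 29646"
)
info = parse_transid_error(line)
gap = info["found"] - info["wanted"]  # how far apart the two transids are
```

Feeding every matching line of the kernel log through `parse_transid_error` and collecting the `block` values would show whether the errors concentrate on a few blocks or are spread across the volume.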
