What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
Hi all,

I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:

On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> On Wednesday, 06 January 2010 16:59:55, Steve Freitas wrote:
> > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
>
> Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata not the data.

So can someone please tell me what the current state-of-the-art is of data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...

Any and all information is much appreciated. Thanks!

Steve

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
Steve Freitas wrote:
> Hi all,
>
> I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:
>
> On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> > On Wednesday, 06 January 2010 16:59:55, Steve Freitas wrote:
> > > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
> >
> > Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata not the data.
>
> So can someone please tell me what the current state-of-the-art is of data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...

First, understand that a checksum only says this block is good or bad. The checksum can not be used to reconstruct the data.

Checksums are present for all btrfs blocks unless you explicitly shut them off with mount/ioctl/fcntl options.

To have a good copy you can use as a replacement block, you must use either btrfs raid1 or raid10. You can use raid1 with 1 drive, in a mode called dup where both copies are made to that device.

By default with 1 drive, btrfs uses dup for metadata and 1 copy (nodup) for file data blocks. To get file data dup, you just use mkfs.btrfs -d raid1.

If you have btrfs raid, it will find the good block on a read, but AFAIK we don't have tools yet to automatically reallocate the bad one.

jim
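The distinction jim draws, that a checksum detects a bad block but only a second copy can replace it, can be illustrated with a small Python sketch. This is not btrfs code: the names `checksum` and `read_block` are hypothetical, and `zlib.crc32` (plain CRC-32) stands in for whatever checksum the filesystem actually uses. It only models the read path described above, where each stored copy is verified against a checksum recorded at write time.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum (plain CRC-32); an assumption for illustration only.
    return zlib.crc32(block)


def read_block(copies, expected):
    """Return (data, bad_copy_indices) for the first copy that verifies.

    `copies` models the stored copies of one block: one entry for a
    single-copy layout, two for a mirrored or dup layout.
    Raises OSError when no copy matches the recorded checksum.
    """
    bad = []
    for index, data in enumerate(copies):
        if checksum(data) == expected:
            return data, bad
        # The checksum can only say "this copy is bad"; it cannot rebuild it.
        bad.append(index)
    raise OSError("checksum mismatch on every copy; data is unrecoverable")


original = b"file data block"
recorded = checksum(original)          # stored when the block was written
corrupted = b"file dbta block"         # the same block after one byte rotted

# Single copy: the corruption is detected, but nothing is left to read from.
try:
    read_block([corrupted], recorded)
    single_copy_result = "read succeeded"
except OSError as err:
    single_copy_result = f"read failed: {err}"
print(single_copy_result)

# Two copies: the bad one is skipped and the good one is returned.
data, bad = read_block([corrupted, original], recorded)
print(data, "bad copies:", bad)
```

Note that the sketch only reports which copies failed. Rewriting a bad copy from the good one is a separate step, which matches jim's remark that finding the good block on read and reallocating the bad one are different things.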
Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
On Thursday, 07 January 2010 20:29:49, jim owens wrote:
> Steve Freitas wrote:
> > Hi all,
> >
> > I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:
> >
> > On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> > > On Wednesday, 06 January 2010 16:59:55, Steve Freitas wrote:
> > > > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
> > >
> > > Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata not the data.
> >
> > So can someone please tell me what the current state-of-the-art is of data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...
>
> First, understand that a checksum only says this block is good or bad. The checksum can not be used to reconstruct the data.
>
> Checksums are present for all btrfs blocks unless you explicitly shut them off with mount/ioctl/fcntl options.
>
> To have a good copy you can use as a replacement block, you must use either btrfs raid1 or raid10. You can use raid1 with 1 drive, in a mode called dup where both copies are made to that device.
>
> By default with 1 drive, btrfs uses dup for metadata and 1 copy (nodup) for file data blocks. To get file data dup, you just use mkfs.btrfs -d raid1.
>
> If you have btrfs raid, it will find the good block on a read, but AFAIK we don't have tools yet to automatically reallocate the bad one.
>
> jim

Additionally, I repeat the suggestion from Sander: check your drive for bad blocks. It sounds very likely that your drive is bad and you will get into trouble again with the newly created FS.
And the Oops you've posted smells like a bug in btrfs code.

regards,
Johannes