Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
Hello Steve,

Steve Freitas wrote (ao):
> Alright, I'll trash it and start over with a different drive.

At the risk of stating the obvious: you could do a few destructive badblocks runs on that disk to see whether SMART keeps adding to the bad blocks list.

With kind regards, Sander
--
Humilis IT Services and Solutions
http://www.humilis.net
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
Hi all,

I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:

On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> On Wednesday, 6 January 2010 at 16:59:55, Steve Freitas wrote:
> > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
>
> Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata, not the data.

So can someone please tell me what the current state of the art is in data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...

Any and all information is much appreciated. Thanks!

Steve
Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
Steve Freitas wrote:
> Hi all,
>
> I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:
>
> On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> > On Wednesday, 6 January 2010 at 16:59:55, Steve Freitas wrote:
> > > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
> >
> > Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata, not the data.
>
> So can someone please tell me what the current state of the art is in data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...

First, understand that a checksum only says whether a block is good or bad. The checksum cannot be used to reconstruct the data.

Checksums are present for all btrfs blocks unless you explicitly shut them off with mount/ioctl/fcntl options.

To have a good copy you can use as a replacement block, you must use either btrfs raid1 or raid10. You can use raid1 with 1 drive, in a mode called dup where both copies are made to that device.

By default with 1 drive, btrfs uses dup for metadata and 1 copy (nodup) for file data blocks. To get file data dup, you just use mkfs.btrfs -d raid1.

If you have btrfs raid, it will find the good block on a read, but AFAIK we don't have tools yet to automatically reallocate the bad one.

jim
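As a rough illustration of jim's point that a checksum detects a bad block but cannot rebuild it, here is a minimal Python sketch. It is not btrfs code: the names checksum and read_block are made up for this example, and zlib.crc32 is only a stand-in for the CRC32C that btrfs uses.

```python
import zlib
from typing import Sequence


def checksum(block: bytes) -> int:
    # Assumption: plain CRC32 from zlib stands in for btrfs's CRC32C.
    return zlib.crc32(block)


def read_block(copies: Sequence[bytes], stored_sum: int) -> bytes:
    """Return the first copy whose checksum matches; fail if none does."""
    for data in copies:
        if checksum(data) == stored_sum:
            return data
    raise IOError("checksum mismatch on every copy; nothing to recover from")


good = b"important file data"
stored = checksum(good)
rotted = b"important file dat\x00"  # same length, last byte corrupted

# One copy (the default for file data on a single drive):
# the corruption is detected, but there is nothing to repair it with.
try:
    read_block([rotted], stored)
    single_copy_result = "returned data"
except IOError:
    single_copy_result = "read error"

# Two copies (dup or raid1): the reader falls back to the intact copy.
mirrored_result = read_block([rotted, good], stored)
```

With a single copy the read fails with an error instead of silently returning bad data; with a second copy the read succeeds. Note that the sketch, like the behaviour jim describes, does not rewrite the bad copy.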
Re: What protection does btrfs checksumming currently give? (Was Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck))
On Thursday, 7 January 2010 at 20:29:49, jim owens wrote:
> Steve Freitas wrote:
> > Hi all,
> >
> > I was under the mistaken impression that btrfs checksumming, in its current default configuration, protected your data from bitrot. It appears this is not the case:
> >
> > On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> > > On Wednesday, 6 January 2010 at 16:59:55, Steve Freitas wrote:
> > > > So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
> > >
> > > Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata, not the data.
> >
> > So can someone please tell me what the current state of the art is in data protection with btrfs? Does it differ with single-device versus multiple-device configurations? Is it possible to enable data checksumming now? Under what conditions? And will it do what a naive user would expect it to do, namely, correct for diverse kinds of errors in your storage subsystem? If not, what does it do? Etc...
>
> First, understand that a checksum only says whether a block is good or bad. The checksum cannot be used to reconstruct the data.
>
> Checksums are present for all btrfs blocks unless you explicitly shut them off with mount/ioctl/fcntl options.
>
> To have a good copy you can use as a replacement block, you must use either btrfs raid1 or raid10. You can use raid1 with 1 drive, in a mode called dup where both copies are made to that device.
>
> By default with 1 drive, btrfs uses dup for metadata and 1 copy (nodup) for file data blocks. To get file data dup, you just use mkfs.btrfs -d raid1.
>
> If you have btrfs raid, it will find the good block on a read, but AFAIK we don't have tools yet to automatically reallocate the bad one.
>
> jim

Additionally, I repeat Sander's suggestion: check your drive for bad blocks. It sounds very likely that your drive is bad, and you will get into trouble again with the newly created FS.

And the Oops you've posted smells like a bug in btrfs code.

regards, Johannes
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
Hello Steve,

Steve Freitas wrote (ao):
> Should I take it by the lack of list response that I should just flush this partition down the toilet and start over? Or is everybody either flummoxed or on vacation?

I don't have your original mail, but I think I remember you mentioned a lot of bad sectors on that disk reported by SMART. If that is indeed the case, it might be difficult for the people who might be able to help you, to help you.

Please ignore me if I confused your mail with another.

With kind regards, Sander
--
Humilis IT Services and Solutions
http://www.humilis.net
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
Hi Sander,

On Wed, 2010-01-06 at 08:52 +0100, Sander wrote:
> I don't have your original mail, but I think I remember you mentioned a lot of bad sectors on that disk reported by SMART. If that is indeed the case, it might be difficult for the people who might be able to help you, to help you.

Thanks for your response. You're correct about the bad sector warning. So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.

(Also, I assumed that if a drive hadn't filled its bad sector remapping table, it could handle it at the hardware level, and SMART's warning was just that -- a warning, not a dire pronouncement of utter unsuitability -- but that's something else.)

Steve
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
On Wednesday, 6 January 2010 at 16:59:55, Steve Freitas wrote:
> Hi Sander,
>
> On Wed, 2010-01-06 at 08:52 +0100, Sander wrote:
> > I don't have your original mail, but I think I remember you mentioned a lot of bad sectors on that disk reported by SMART. If that is indeed the case, it might be difficult for the people who might be able to help you, to help you.
>
> Thanks for your response. You're correct about the bad sector warning. So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.

Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata, not the data.

> (Also, I assumed that if a drive hadn't filled its bad sector remapping table, it could handle it at the hardware level, and SMART's warning was just that -- a warning, not a dire pronouncement of utter unsuitability -- but that's something else.)

Bad sectors are only remapped by the drive at write time. Until such a write happens, they are only marked as pending. Since you wrote that SMART detected many bad blocks, I suspect the FS is really damaged. And as btrfsck is limited, I don't think it can fix this.

regards, Johannes
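The pending-versus-remapped behaviour Johannes describes can be pictured with a toy model. This is only a sketch under simplifying assumptions: ToyDrive and its fields are invented for illustration, and real drive firmware is more involved (for example, it may retry the original location before using a spare).

```python
class ToyDrive:
    """Toy model of a disk with a small pool of spare sectors (not real firmware)."""

    def __init__(self, sectors: int, spares: int) -> None:
        self.data = {n: b"\x00" for n in range(sectors)}
        self.damaged = set()      # sectors whose media can no longer be read
        self.pending = set()      # damaged sectors the drive has noticed on a read
        self.reallocated = set()  # sectors that were remapped to a spare
        self.spares = spares

    def read(self, n: int) -> bytes:
        if n in self.damaged:
            # A failed read cannot be remapped: there is no good data to move.
            self.pending.add(n)
            raise IOError(f"unrecoverable read error on sector {n}")
        return self.data[n]

    def write(self, n: int, payload: bytes) -> None:
        if n in self.damaged:
            if self.spares == 0:
                raise IOError("spare pool exhausted")
            # Fresh data is arriving, so it can be placed on a spare sector.
            self.spares -= 1
            self.damaged.discard(n)
            self.pending.discard(n)
            self.reallocated.add(n)
        self.data[n] = payload


drive = ToyDrive(sectors=8, spares=2)
drive.damaged.add(3)

try:
    drive.read(3)          # the read fails and the sector becomes "pending"
except IOError:
    pass
pending_after_read = set(drive.pending)

drive.write(3, b"fresh data")  # the write triggers the remap
```

In this model the old contents of sector 3 are gone either way; the remap only makes the location usable again, which is why a filesystem without a second copy of the data cannot get it back.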
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
Hi Johannes,

On Wed, 2010-01-06 at 18:24 +0100, Johannes Hirte wrote:
> On Wednesday, 6 January 2010 at 16:59:55, Steve Freitas wrote:
> > Thanks for your response. You're correct about the bad sector warning. So please correct me if I have some mistaken assumptions. I thought btrfs would be tolerant of that -- if a block failed the checksum test, it would reconstruct and remap it.
>
> Only if enough redundancy is left. And with the default setup btrfs is only mirroring the metadata, not the data.

Okay. What capacity does btrfs have for reconstructing data, and how do I enable it (if any) for a new partition? I think I've confused checksums with magical ponies.

> Bad sectors are only remapped by the drive at write time. Until such a write happens, they are only marked as pending. Since you wrote that SMART detected many bad blocks, I suspect the FS is really damaged. And as btrfsck is limited, I don't think it can fix this.

Alright, I'll trash it and start over with a different drive.

Thanks,
Steve
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
Should I take it by the lack of list response that I should just flush this partition down the toilet and start over? Or is everybody either flummoxed or on vacation?

Steve

On Sun, 2010-01-03 at 16:37 -0800, Steve Freitas wrote:
> On Sun, 2010-01-03 at 14:57 -0800, Steve Freitas wrote:
> > Got some more information. I installed Debian on another disk (rescue) running 2.6.32, pulled the latest btrfs module code from git, applied an earlier mentioned patch[1], then compiled and loaded the new module. It's able to mount the volume initially...
>
> I've just tried it again with the pure git pull, no patch, and the result (of an ls -R /mnt/btrfs_vol) was the same. 'Cept this time it never gave me a kernel traceback, just unending lines like:
>
> Jan 3 16:36:36 rescue kernel: [ 1046.494252] parent transid verify failed on 69140480 wanted 28342 found 29646
>
> Steve
Re: btrfs volume mounts and dies (was Re: Segfault in btrfsck)
On Sun, 2010-01-03 at 14:57 -0800, Steve Freitas wrote:
> Got some more information. I installed Debian on another disk (rescue) running 2.6.32, pulled the latest btrfs module code from git, applied an earlier mentioned patch[1], then compiled and loaded the new module. It's able to mount the volume initially...

I've just tried it again with the pure git pull, no patch, and the result (of an ls -R /mnt/btrfs_vol) was the same. 'Cept this time it never gave me a kernel traceback, just unending lines like:

Jan 3 16:36:36 rescue kernel: [ 1046.494252] parent transid verify failed on 69140480 wanted 28342 found 29646

Steve
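For anyone wading through a log full of such lines, a small Python sketch can pull out the three numbers so they can be counted or grouped. The helper name parse_transid_error and the TransidError type are made up here, and the pattern is based only on the single sample line quoted in this thread. As far as I understand the message, "wanted" is the transaction id the parent expected for the block and "found" is the one actually read from disk, but treat that reading as an assumption.

```python
import re
from typing import NamedTuple, Optional


class TransidError(NamedTuple):
    block: int   # the number after "failed on" in the log line
    wanted: int  # the transid reported as "wanted"
    found: int   # the transid reported as "found"


# Pattern derived from the one sample line in this thread.
PATTERN = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def parse_transid_error(line: str) -> Optional[TransidError]:
    """Return the parsed fields, or None if the line is not such an error."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return TransidError(*(int(group) for group in match.groups()))


sample = (
    "Jan 3 16:36:36 rescue kernel: [ 1046.494252] "
    "parent transid verify failed on 69140480 wanted 28342 found 29646"
)
err = parse_transid_error(sample)
```

Feeding every kernel log line through parse_transid_error and collecting the distinct block values would show whether the errors all point at one block or at many.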