Failed Disk RAID10 Problems

2014-05-28 Thread Justin Brown
Hi, I have a Btrfs RAID 10 (data and metadata) file system that I believe suffered a disk failure. In my attempt to replace the disk, I think that I've made the problem worse and need some help recovering it. I happened to notice a lot of errors in the journal: end_request: I/O error, dev

Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy
On May 28, 2014, at 12:19 AM, Justin Brown justin.br...@fandingo.org wrote: Hi, I have a Btrfs RAID 10 (data and metadata) file system that I believe suffered a disk failure. In my attempt to replace the disk, I think that I've made the problem worse and need some help recovering it. I

Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy
On May 28, 2014, at 1:03 AM, Chris Murphy li...@colorremedies.com wrote: For future reference, it should work to add a device and then use btrfs device delete missing (if not, it's probably a bug). Chris Murphy
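The order of operations suggested in this reply can be sketched as a dry-run helper. This is only an illustration: the function name `replacement_commands`, the device path, and the mount point are hypothetical, and the helper builds the two command lines (`btrfs device add`, then `btrfs device delete missing`) without executing anything.

```python
def replacement_commands(new_device, mount_point):
    """Build (but do not run) the commands for swapping in a new device.

    Hypothetical helper: it only returns argument lists, so it is safe
    to call on any machine.
    """
    return [
        # First add the new device to the mounted file system.
        ["btrfs", "device", "add", new_device, mount_point],
        # Then drop the device that is no longer present.
        ["btrfs", "device", "delete", "missing", mount_point],
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute the real device and mount point.
    for cmd in replacement_commands("/dev/sdX", "/mnt/media"):
        print(" ".join(cmd))
```

Returning argument lists rather than running them keeps the sketch reviewable before anything touches a degraded array.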

[PATCH 1/1] btrfs: use btrfs_scratch_super function to zero super

2014-05-28 Thread Anand Jain
From: Anand Jain anand.j...@oracle.com Signed-off-by: Anand Jain anand.j...@oracle.com --- fs/btrfs/volumes.c | 51 +++ fs/btrfs/volumes.h |2 +- 2 files changed, 8 insertions(+), 45 deletions(-) diff --git a/fs/btrfs/volumes.c

[PATCH 1/2] btrfs-progs: fix uninitialized number count in chunk-recover

2014-05-28 Thread Gui Hecheng
When counting the number of unordered device extents in chunk-recover, the counter should be reinitialized before it is used. Also, introduce a new function for the counting job. Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com --- chunk-recover.c | 20 ++-- 1 file changed, 14

[PATCH 4/4] Btrfs-progs: fsck: fix wrong check for btrfs_read_fs_root()

2014-05-28 Thread Wang Shilong
When encountering a corrupted fs root node, fsck hits the following message: Check tree block failed, want=29360128, have=0 Check tree block failed, want=29360128, have=0 Check tree block failed, want=29360128, have=0 Check tree block failed, want=29360128, have=0 Check tree block failed,

[PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode

2014-05-28 Thread Wang Shilong
Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- cmds-check.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/cmds-check.c b/cmds-check.c index db7df80..0e4e042 100644 --- a/cmds-check.c +++ b/cmds-check.c @@ -6810,8 +6810,7 @@ int cmd_check(int argc, char

[PATCH 2/4] Btrfs-progs: fsck: disallow partial opening if critical roots corrupted

2014-05-28 Thread Wang Shilong
If the btrfs tree root is corrupted, fsck will hit the following segmentation fault: enabling repair mode Check tree block failed, want=29376512, have=0 Check tree block failed, want=29376512, have=0 Check tree block failed, want=29376512, have=0 Check tree block failed, want=29376512, have=0 Check tree

[PATCH 3/4] Btrfs-progs: fsck: deal with corrupted csum root

2014-05-28 Thread Wang Shilong
If the checksum root is corrupted, fsck will hit a segmentation fault. This is because if we fail to load the checksum root, the root's node is NULL, which causes NULL pointer dereferences later. To fix this problem, we do something like extent tree rebuilding: allocate a new one and clear the uptodate flag. We will

[PATCH v2] Btrfs-progs: fsck: add an option to check data csums

2014-05-28 Thread Wang Shilong
This patch adds an option '--check-data-csum' to verify data checksums. fsck won't check data csums unless users specify this option explicitly. Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com --- v1-v2: addressed comments from David. read as much data as possible at a time. ---

Re: 3.15-rc6 - btrfs-transacti:4157 blocked for more than 120 seconds.

2014-05-28 Thread Chris Mason
On 05/28/2014 01:53 AM, Torbjørn wrote: It's actually a raid10 array of 11 dm-crypt devices. I'm able to read data from the array (accessing files), and also read directly from all the underlying dm-crypt devices using dd, if that's what you meant. I have not rebooted the system since that

Re: [PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode

2014-05-28 Thread Eric Sandeen
The subject and the comment say what this change does, but that's obvious from reading the code. Nothing says *why* the change has been made. What does this fix, and how does it fix it? Can you add/update the commit log so that some reader in the future (or for that matter, a reviewer in the

Re: [PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode

2014-05-28 Thread Shilong Wang
2014-05-28 21:56 GMT+08:00 Eric Sandeen sand...@redhat.com: The subject and the comment say what this change does, but that's obvious from reading the code. Nothing says *why* the change has been made. What does this fix, and how does it fix it? Yup, the reason that we allow partial opening

Re: 3.15-rc6 - btrfs-transacti:4157 blocked for more than 120 seconds.

2014-05-28 Thread Torbjørn
On 05/28/2014 03:41 PM, Chris Mason wrote: On 05/28/2014 01:53 AM, Torbjørn wrote: It's actually a raid10 array of 11 dm-crypt devices. I'm able to read data from the array (accessing files), and also read directly from all the underlying dm-crypt devices using dd, if that's what you meant. I

Re: [PATCH 1/4] Btrfs-progs: fsck: only allow partial opening under repair mode

2014-05-28 Thread Eric Sandeen
On 5/28/14, 9:10 AM, Shilong Wang wrote: 2014-05-28 21:56 GMT+08:00 Eric Sandeen sand...@redhat.com: The subject and the comment say what this change does, but that's obvious from reading the code. Nothing says *why* the change has been made. What does this fix, and how does it fix it?

Re: [PATCH] btrfs-show-super: don't try to print not-superblocks

2014-05-28 Thread David Sterba
On Tue, May 13, 2014 at 09:03:04PM -0500, Eric Sandeen wrote: If we point btrfs-show-super at a not-btrfs-device and try to print all superblocks, bad things are apt to happen: superblock: bytenr=274877906944, device=/dev/sdc2 -

Re: [PATCH 1/3] btrfs-progs: cleanup btrfs-rescue output msgs

2014-05-28 Thread David Sterba
On Thu, May 15, 2014 at 09:29:07AM +0800, Gui Hecheng wrote: Use enum defined error codes to represent different kinds of errs for super-recover and chunk-recover. I think this change hides the low-level errors (like ENOMEM) that can possibly result into recovery not possible, though it can be

Re: [PATCH] btrfs-progs: provide better error message for raid profile mismatch

2014-05-28 Thread David Sterba
On Fri, May 16, 2014 at 05:20:56PM +0900, Hidetoshi Seto wrote: Current error messages are like following: Error: unable to create FS with metadata profile 32 (have 2 devices) Error: unable to create FS with metadata profile 256 (have 2 devices) Obviously it is hard for users to

Re: [PATCH] btrfs-image: Fix a data race in build_chunk_tree.

2014-05-28 Thread David Sterba
On Sun, May 18, 2014 at 10:40:42PM -0700, Adam Buchbinder wrote: A mdrestore_struct was being written to without its mutex being held. This race was found with ThreadSanitizer; the relevant part of the report looks like this: WARNING: ThreadSanitizer: data race (pid=18828) Write of size 8

Re: [PATCH] btrfs-progs: Improve the parse_size() error message.

2014-05-28 Thread David Sterba
On Tue, May 20, 2014 at 03:51:45PM +0800, Qu Wenruo wrote: When using parse_size(), even if a non-numeric value is passed, it only gives the error message "ERROR: size value is empty", which is quite confusing for end users. This patch introduces more meaningful error messages for the following

Re: [PATCH] Add some simple end-to-end tests for btrfs-convert.

2014-05-28 Thread David Sterba
On Wed, May 21, 2014 at 10:20:27AM -0700, Adam Buchbinder wrote: These use the system's mke2fs, and don't require loop devices or root privileges. Nice, thanks.

Fwd: Failed Disk RAID10 Problems

2014-05-28 Thread Justin Brown
Chris, Thanks for the tip. I was able to mount the drive as degraded and recovery. Then, I deleted the faulty drive, leaving me with the following array: Label: media uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5 Total devices 6 FS bytes used 2.40TiB devid1 size 931.51GiB used 919.88GiB

Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy
On May 28, 2014, at 12:39 PM, Justin Brown justin.br...@fandingo.org wrote: Chris, Thanks for the tip. I was able to mount the drive as degraded and recovery. Then, I deleted the faulty drive, leaving me with the following array: Label: media uuid:

Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy
On May 28, 2014, at 12:39 PM, Justin Brown justin.br...@fandingo.org wrote: Chris, Thanks for the tip. I was able to mount the drive as degraded and recovery. Then, I deleted the faulty drive, leaving me with the following array: Label: media uuid:

Re: [PATCH] btrfs-progs: Improve the parse_size() error message.

2014-05-28 Thread Mike Fleetwood
On 20 May 2014 08:51, Qu Wenruo quwen...@cn.fujitsu.com wrote: When using parse_size(), even if a non-numeric value is passed, it only gives the error message "ERROR: size value is empty", which is quite confusing for end users. This patch introduces more meaningful error messages for the

Re: [PATCH] btrfs-progs: Improve the parse_size() error message.

2014-05-28 Thread Qu Wenruo
Thanks for the comments. Original Message Subject: Re: [PATCH] btrfs-progs: Improve the parse_size() error message. From: David Sterba dste...@suse.cz To: Qu Wenruo quwen...@cn.fujitsu.com Date: 2014-05-29 01:20 On Tue, May 20, 2014 at 03:51:45PM +0800, Qu Wenruo wrote:

Re: [PATCH] btrfs-progs: Improve the parse_size() error message.

2014-05-28 Thread Qu Wenruo
Original Message Subject: Re: [PATCH] btrfs-progs: Improve the parse_size() error message. From: Mike Fleetwood mike.fleetw...@googlemail.com To: Qu Wenruo quwen...@cn.fujitsu.com Date: 2014-05-29 05:07 On 20 May 2014 08:51, Qu Wenruo quwen...@cn.fujitsu.com wrote: When

[PATCH] btrfs-progs: Add dev uuid output for print_dev_item().

2014-05-28 Thread Qu Wenruo
The original print_dev_item() only prints the device id, total bytes and bytes used. When debugging problems related to duplicated device ids, the dev uuid is needed to distinguish different devices, since the device id is not reliable. This patch adds dev uuid output. Signed-off-by: Qu Wenruo

[PATCH] btrfs: replace EINVAL with ERANGE for resize when ULLONG_MAX

2014-05-28 Thread Gui Hecheng
To be accurate about the error case, if the new size is beyond ULLONG_MAX, return ERANGE instead of EINVAL. Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com --- fs/btrfs/ioctl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index
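The distinction this patch draws can be illustrated with a short sketch. It is not the kernel code: `grow_size` is a hypothetical helper, and where the kernel returns a negative errno, this Python version raises an `OSError` carrying `errno.ERANGE` for a result that would exceed ULLONG_MAX.

```python
import errno

# Largest value an unsigned 64-bit integer can hold.
ULLONG_MAX = 2**64 - 1


def grow_size(old_size, delta):
    """Return old_size + delta, rejecting results beyond ULLONG_MAX.

    Hypothetical helper: an out-of-range result is reported with
    ERANGE (value too large) rather than EINVAL (malformed argument).
    """
    # Compare against the remaining headroom so the check itself
    # never needs a value larger than ULLONG_MAX.
    if delta > ULLONG_MAX - old_size:
        raise OSError(errno.ERANGE, "new size is out of range")
    return old_size + delta
```

Checking `delta > ULLONG_MAX - old_size` mirrors how the test has to be written in C, where computing the sum first would already wrap around.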

[PATCH v2] btrfs-progs: Improve the parse_size() error message.

2014-05-28 Thread Qu Wenruo
When using parse_size(), even if a non-numeric value is passed, it only gives the error message "ERROR: size value is empty", which is quite confusing for end users. This patch introduces more meaningful error messages for the following new cases: 1) Invalid size string (non-numeric string) 2) Minus
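The idea of reporting distinct errors for distinct failures can be sketched as follows. This is an illustration, not the btrfs-progs implementation: it covers only the empty-string case and the non-numeric case named above, the wording of the second message is an assumption, and unit suffixes and the truncated second case are left out.

```python
import re


def parse_size(text):
    """Parse a plain decimal byte count.

    Illustrative only: raises ValueError with a message that says
    which kind of bad input was seen, instead of one generic message.
    """
    if text == "":
        # The only case the original message described accurately.
        raise ValueError("size value is empty")
    if re.fullmatch(r"[0-9]+", text) is None:
        # Non-numeric input gets its own message (wording is assumed).
        raise ValueError(f"size value '{text}' is invalid")
    return int(text)
```

With separate messages, a user who types `abc` is told the value is invalid rather than being told it is empty.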

Re: [PATCH 1/3] btrfs-progs: cleanup btrfs-rescue output msgs

2014-05-28 Thread Gui Hecheng
On Wed, 2014-05-28 at 18:24 +0200, David Sterba wrote: On Thu, May 15, 2014 at 09:29:07AM +0800, Gui Hecheng wrote: Use enum defined error codes to represent different kinds of errs for super-recover and chunk-recover. I think this change hides the low-level errors (like ENOMEM) that can