Re: Btrfs Storage Array Corrupted

2012-03-26 Thread Travis Shivers
Hello,

I was just wondering what the status of the patch that would fix my
problem is. I have not heard from anyone in over a month, and I just
wanted to check in and see how it is progressing, or if you have put
it on hold for more important work like kernel 3.4. Thank you again
for helping me.

On Fri, Mar 16, 2012 at 12:26 PM, Travis Shivers ttshiv...@gmail.com wrote:

 When you finish the patch, where should I get it from? Is it a general btrfs 
 patch, and I would grab it in the next kernel release, or is it a specific 
 patch for me, and I would grab it from a btrfs repository? Thanks again for 
 all of your help.


 On Wed, Feb 29, 2012 at 5:58 PM, Travis Shivers ttshiv...@gmail.com wrote:

 I was running a fairly old version of the kernel:
 Linux server 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:44:39 UTC
 2012 x86_64 x86_64 x86_64 GNU/Linux

 On Wed, Feb 29, 2012 at 5:44 PM, Chris Mason chris.ma...@oracle.com wrote:
  On Wed, Feb 29, 2012 at 05:11:24PM -0600, Travis Shivers wrote:
  Thank you all for helping. My btrfs array consists of 4 disks: 2 (2
  TB) disks and 2(500 GB) disks. Since I have disks of different sizes,
  I have the array being mirrored so that there are two copies of a file
  on two separate disks. The data and metadata are mirrored.
 
  I originally made the array by using this command:
 
  # mkfs.btrfs -m raid1 -d raid1 /dev/sd[abcd]
  (The drives were originally those letters)
 
 
  All of the disks sit in an external 4 bay ESATA enclosure going into a
  PCI-E RAID card set up as JBOD, so I can use btrfs' software
  mirroring. This is the enclosure that I have:
  http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029
 
  The corruption was unexpected. I am not entirely sure what caused it,
  but a few days before the corruption, there were several power
  outages. I do not think that the problem is with the actual hard drive
  hardware since they are fairly new (6 months old) and they pass all
  SMART tests. After a reboot, the btrfs array refused to mount and
  started giving off errors. I do weekly scrubs, balances, and
  defragmentation.
 
  Ok, all of this should have worked.  Which kernel were you running when
  you had the power outages?
 
  I'm testing out the patch to skip the extent allocation tree at mount.
  That will be the easiest way to get to the data (readonly, but it'll
  work).
 
  -chris


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Btrfs Storage Array Corrupted

2012-03-16 Thread Travis Shivers
When you finish the patch, where should I get it from? Is it a general
btrfs patch, and I would grab it in the next kernel release, or is it
a specific patch for me, and I would grab it from a btrfs repository?
Thanks again for all of your help.


Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Chris Mason
On Tue, Feb 28, 2012 at 09:36:35PM -0600, Travis Shivers wrote:
 I upgraded my kernel so my version is now:
 Linux server 3.3.0-030300rc5-generic #201202251535 SMP Sat Feb 25
 20:36:29 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
 
 The problem has not been solved and I still get the previous errors.

Ok,

Step one is to grab the development version of btrfs-progs, which
currently sits in the dangerdonteveruse branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git dangerdonteveruse

Run btrfs-debug-tree -R /dev/sdh

and then run btrfs-debug-tree -b 5568194695168 /dev/sdh

and then run btrfsck /dev/sdh

Send the results of all three here, it should tell us which tree that
block belongs to, and from there we'll figure out the best way to fix
it.
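
The three diagnostics above can be gathered in one pass. The following is a minimal sketch, assuming the tools were built in the current directory from the dangerdonteveruse branch. The helper name btrfs_diag_cmds and the log file names are made up for illustration. It only prints the commands, so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Fetch and build the tools first, for example:
#   git clone -b dangerdonteveruse \
#       git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
#   cd btrfs-progs && make
#
# Print (do not run) the three diagnostic commands requested above, each
# redirected to its own log file.
# Hypothetical helper: the function name and log file names are illustrative.
btrfs_diag_cmds() {
    dev=$1      # for example /dev/sdh
    block=$2    # for example 5568194695168, the block named in the dmesg errors
    echo "./btrfs-debug-tree -R $dev > debug-tree-roots.log 2>&1"
    echo "./btrfs-debug-tree -b $block $dev > debug-tree-block.log 2>&1"
    echo "./btrfsck $dev > btrfsck.log 2>&1"
}

btrfs_diag_cmds /dev/sdh 5568194695168
```

Once the printed commands look right, the same output can be piped into sh to execute them.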

-chris

 
 # mount /dev/sdh /mnt/main
 mount: wrong fs type, bad option, bad superblock on /dev/sdh,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail  or so
 
 # dmesg
 [  232.985248] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
 4 transid 43477 /dev/sdi
 [  232.985434] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
 3 transid 43477 /dev/sdh
 [  233.027881] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
 2 transid 43477 /dev/sdg
 [  233.065675] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
 1 transid 43476 /dev/sdf
 [  284.384320] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
 3 transid 43477 /dev/sdh
 [  284.427076] btrfs: disk space caching is enabled
 [  284.442565] verify_parent_transid: 2 callbacks suppressed
 [  284.442572] parent transid verify failed on 5568194695168 wanted
 43477 found 43151
 [  284.442834] parent transid verify failed on 5568194695168 wanted
 43477 found 43151
 [  284.443151] parent transid verify failed on 5568194695168 wanted
 43477 found 43151
 [  284.443159] parent transid verify failed on 5568194695168 wanted
 43477 found 43151
 [  284.445740] btrfs: open_ctree failed
 
 
 On Tue, Feb 28, 2012 at 9:16 PM, cwillu cwi...@cwillu.com wrote:
  On Tue, Feb 28, 2012 at 9:00 PM, Travis Shivers ttshiv...@gmail.com wrote:
  Where should I grab the source from? The main repo that you have
  listed on your main wiki page
  (https://btrfs.wiki.kernel.org/articles/b/t/r/Btrfs_source_repositories.html)
  is down: 
  git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
 
  The btrfs wiki is at http://btrfs.ipv5.de .  The kernel.org one is a
  static snapshot of the contents made nearly a year ago, prior to the
  kernel.org break-in, and should be ignored.
 
  git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git is
  the development tree, although the above patch is in mainline as of
  3.3rc5, which probably makes that the easiest way to try it.


Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Travis Shivers
Here is the output from the commands:

# ./btrfs-debug-tree -R /dev/sdh
failed to read /dev/sr0: No medium found
failed to read /dev/sde: No medium found
failed to read /dev/sdd: No medium found
failed to read /dev/sdc: No medium found
failed to read /dev/sdb: No medium found
failed to read /dev/sda: No medium found
parent transid verify failed on 5568194695168 wanted 43477 found 43151
parent transid verify failed on 5568194695168 wanted 43477 found 43151
parent transid verify failed on 5568194695168 wanted 43477 found 43151
parent transid verify failed on 5568194695168 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568194748416 wanted 43477 found 43151
parent transid verify failed on 5568194748416 wanted 43477 found 43151
parent transid verify failed on 5568194748416 wanted 43477 found 43151
parent transid verify failed on 5568194748416 wanted 43477 found 43151
Ignoring transid failure
root tree: 5568194412544 level 1
chunk tree: 20979712 level 1
extent tree key (EXTENT_TREE ROOT_ITEM 0) 5568194416640 level 3
device tree key (DEV_TREE ROOT_ITEM 0) 4895076519936 level 1
fs tree key (FS_TREE ROOT_ITEM 0) 4895092506624 level 2
checksum tree key (CSUM_TREE ROOT_ITEM 0) 5568194695168 level 0
parent transid verify failed on 5568194801664 wanted 43477 found 43151
parent transid verify failed on 5568194801664 wanted 43477 found 43151
parent transid verify failed on 5568194801664 wanted 43477 found 43151
parent transid verify failed on 5568194801664 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568194674688 wanted 43477 found 43151
parent transid verify failed on 5568194674688 wanted 43477 found 43151
parent transid verify failed on 5568194674688 wanted 43477 found 43151
parent transid verify failed on 5568194674688 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568194678784 wanted 43477 found 43151
parent transid verify failed on 5568194678784 wanted 43477 found 43151
parent transid verify failed on 5568194678784 wanted 43477 found 43151
parent transid verify failed on 5568194678784 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568194809856 wanted 43477 found 43151
parent transid verify failed on 5568194809856 wanted 43477 found 43151
parent transid verify failed on 5568194809856 wanted 43477 found 43151
parent transid verify failed on 5568194809856 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568194875392 wanted 43477 found 42983
parent transid verify failed on 5568194875392 wanted 43477 found 42983
parent transid verify failed on 5568194875392 wanted 43477 found 42983
parent transid verify failed on 5568194875392 wanted 43477 found 42983
Ignoring transid failure
parent transid verify failed on 5568195104768 wanted 43477 found 43151
parent transid verify failed on 5568195104768 wanted 43477 found 43151
parent transid verify failed on 5568195104768 wanted 43477 found 43151
parent transid verify failed on 5568195104768 wanted 43477 found 43151
Ignoring transid failure
parent transid verify failed on 5568195043328 wanted 43477 found 43151
parent transid verify failed on 5568195162112 wanted 43477 found 43175
parent transid verify failed on 5568195162112 wanted 43477 found 43175
parent transid verify failed on 5568195162112 wanted 43477 found 43175
parent transid verify failed on 5568195162112 wanted 43477 found 43175
Ignoring transid failure
parent transid verify failed on 5568195166208 wanted 43477 found 43175
parent transid verify failed on 5568195166208 wanted 43477 found 43175
parent transid verify failed on 5568195166208 wanted 43477 found 43175
parent transid verify failed on 5568195166208 wanted 43477 found 43175
Ignoring transid failure
btrfs root backup slot 0
tree root gen 9799893461141291008 block 0
extent root gen 67174399 block 976369115086847
chunk root gen 18446605274118684671 block 9799972705260863487
device root gen 977658994114559 block 18446638534628474880
csum root gen 94490787839 block 18446638559949619199
fs root gen 262144 block 1048576
974850661629952 used 0 total 977659432419327 devices
btrfs root backup slot 1
tree root gen 16777216 block 38655295488
extent root gen 1179648 block 6989415099341275135
chunk root gen 18446605285113004031 block 977659432353792
device root gen 9223372036861329408 block 0
csum root gen 65535 block 977659424489472
fs root gen 4295032832 block 25769803776
282399669551104 used 282400664715264 total 9799892621752008704 devices
btrfs root backup slot 2
tree root gen 65535 block 18446744073709551615
extent root gen 977659447099391 block 977659447033856
chunk root gen 0 block 0
device root gen 

Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Chris Mason
On Wed, Feb 29, 2012 at 03:57:19PM -0600, Travis Shivers wrote:
 Here is the output from the commands:
 
 # ./btrfs-debug-tree -R /dev/sdh
 failed to read /dev/sr0: No medium found
 failed to read /dev/sde: No medium found
 failed to read /dev/sdd: No medium found
 failed to read /dev/sdc: No medium found
 failed to read /dev/sdb: No medium found
 failed to read /dev/sda: No medium found
 parent transid verify failed on 5568194695168 wanted 43477 found 43151

So far all the blocks that have come up look like they are in the extent
allocation tree.  This helps because it is the easiest to recover.
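
To see which blocks are affected without reading through the repeated lines, the saved output can be condensed. This is a rough sketch that assumes the messages keep the "on BLOCK wanted W found F" wording shown in this thread. The function name summarize_transid is hypothetical.

```shell
#!/bin/sh
# Summarize "parent transid verify failed" lines: one row per block, with the
# wanted and found generations and how far behind the found copy is.
# Reads the named files, or standard input when no file is given.
summarize_transid() {
    awk '/parent transid verify failed on/ {
             # Scan the fields so that a leading dmesg timestamp does not matter.
             for (i = 1; i <= NF; i++) {
                 if ($i == "on")     block  = $(i + 1)
                 if ($i == "wanted") wanted = $(i + 1)
                 if ($i == "found")  found  = $(i + 1)
             }
             if (!(block in seen)) {
                 seen[block] = 1
                 printf "%s wanted=%d found=%d behind=%d\n", block, wanted, found, wanted - found
             }
         }' "$@"
}

# Example with two lines taken from the output in this thread.
printf '%s\n' \
  'parent transid verify failed on 5568194695168 wanted 43477 found 43151' \
  'parent transid verify failed on 5568194875392 wanted 43477 found 42983' |
  summarize_transid
```

Each block is reported once. Messages that wrap across lines, as the dmesg output does in this archive, need to be rejoined first.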

I can also make a patch for you against 3.3-rc that skips reading it
entirely, which should make it possible to copy things off.

But before I do that, could you describe the raid array?  Was it
mirrored or raid10?  What exactly happened when it stopped working?

-chris


Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Travis Shivers
Thank you all for helping. My btrfs array consists of 4 disks: 2 (2
TB) disks and 2(500 GB) disks. Since I have disks of different sizes,
I have the array being mirrored so that there are two copies of a file
on two separate disks. The data and metadata are mirrored.
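
For reference, the space a two-copy mirror can offer on disks of mixed sizes can be estimated with a common rule of thumb: the smaller of half the total and the total minus the largest disk. The sketch below assumes that rule and uses nominal sizes in GB. The helper name usable_raid1_gb is hypothetical, and real figures will differ because of binary units and metadata.

```shell
#!/bin/sh
# Rough usable capacity of a two-copy (raid1) profile on devices of mixed
# sizes, assuming the rule of thumb: min(total / 2, total - largest device).
# Sizes are whole numbers of GB. usable_raid1_gb is a hypothetical helper name.
usable_raid1_gb() {
    total=0
    largest=0
    for size in "$@"; do
        total=$((total + size))
        if [ "$size" -gt "$largest" ]; then
            largest=$size
        fi
    done
    half=$((total / 2))
    rest=$((total - largest))
    if [ "$half" -lt "$rest" ]; then
        echo "$half"
    else
        echo "$rest"
    fi
}

# Two 2 TB disks and two 500 GB disks, as in the array described above.
usable_raid1_gb 2000 2000 500 500
```

With these nominal sizes the estimate is 2500 GB.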

I originally made the array by using this command:

# mkfs.btrfs -m raid1 -d raid1 /dev/sd[abcd]
(The drives were originally those letters)


All of the disks sit in an external 4 bay ESATA enclosure going into a
PCI-E RAID card set up as JBOD, so I can use btrfs' software
mirroring. This is the enclosure that I have:
http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029

The corruption was unexpected. I am not entirely sure what caused it,
but a few days before the corruption, there were several power
outages. I do not think that the problem is with the actual hard drive
hardware since they are fairly new (6 months old) and they pass all
SMART tests. After a reboot, the btrfs array refused to mount and
started giving off errors. I do weekly scrubs, balances, and
defragmentation.

Here is what btrfs filesystem show says:

# btrfs filesystem show
Label: none  uuid: 2c11a326-5630-484e-9f1d-9dab777a1028
	Total devices 4 FS bytes used 1.08TB
	devid    1 size 1.82TB used 1.08TB path /dev/sdf
	devid    2 size 1.82TB used 1.08TB path /dev/sdg
	devid    3 size 465.76GB used 8.00MB path /dev/sdh
	devid    4 size 465.76GB used 8.00MB path /dev/sdi

Btrfs Btrfs v0.19

This is my normal mount line for the array in /etc/fstab:

UUID=2c11a326-5630-484e-9f1d-9dab777a1028 /mnt/main btrfs noatime,nodiratime,compress=lzo,space_cache,inode_cache 0 1


On Wed, Feb 29, 2012 at 4:14 PM, Chris Mason chris.ma...@oracle.com wrote:
 On Wed, Feb 29, 2012 at 03:57:19PM -0600, Travis Shivers wrote:
 Here is the output from the commands:

 # ./btrfs-debug-tree -R /dev/sdh
 failed to read /dev/sr0: No medium found
 failed to read /dev/sde: No medium found
 failed to read /dev/sdd: No medium found
 failed to read /dev/sdc: No medium found
 failed to read /dev/sdb: No medium found
 failed to read /dev/sda: No medium found
 parent transid verify failed on 5568194695168 wanted 43477 found 43151

 So far all the blocks that have come up look like they are in the extent
 allocation tree.  This helps because it is the easiest to recover.

 I can also make a patch for you against 3.3-rc that skips reading it
 entirely, which should make it possible to copy things off.

 But before I do that, could you describe the raid array?  Was it
 mirrored or raid10?  What exactly happened when it stopped working?

 -chris


Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Chris Mason
On Wed, Feb 29, 2012 at 05:11:24PM -0600, Travis Shivers wrote:
 Thank you all for helping. My btrfs array consists of 4 disks: 2 (2
 TB) disks and 2(500 GB) disks. Since I have disks of different sizes,
 I have the array being mirrored so that there are two copies of a file
 on two separate disks. The data and metadata are mirrored.
 
 I originally made the array by using this command:
 
 # mkfs.btrfs -m raid1 -d raid1 /dev/sd[abcd]
 (The drives were originally those letters)
 
 
 All of the disks sit in an external 4 bay ESATA enclosure going into a
 PCI-E RAID card set up as JBOD, so I can use btrfs' software
 mirroring. This is the enclosure that I have:
 http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029
 
 The corruption was unexpected. I am not entirely sure what caused it,
 but a few days before the corruption, there were several power
 outages. I do not think that the problem is with the actual hard drive
 hardware since they are fairly new (6 months old) and they pass all
 SMART tests. After a reboot, the btrfs array refused to mount and
 started giving off errors. I do weekly scrubs, balances, and
 defragmentation.

Ok, all of this should have worked.  Which kernel were you running when
you had the power outages?

I'm testing out the patch to skip the extent allocation tree at mount.
That will be the easiest way to get to the data (readonly, but it'll
work).

-chris


Re: Btrfs Storage Array Corrupted

2012-02-29 Thread Travis Shivers
I was running a fairly old version of the kernel:
Linux server 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:44:39 UTC
2012 x86_64 x86_64 x86_64 GNU/Linux

On Wed, Feb 29, 2012 at 5:44 PM, Chris Mason chris.ma...@oracle.com wrote:
 On Wed, Feb 29, 2012 at 05:11:24PM -0600, Travis Shivers wrote:
 Thank you all for helping. My btrfs array consists of 4 disks: 2 (2
 TB) disks and 2(500 GB) disks. Since I have disks of different sizes,
 I have the array being mirrored so that there are two copies of a file
 on two separate disks. The data and metadata are mirrored.

 I originally made the array by using this command:

 # mkfs.btrfs -m raid1 -d raid1 /dev/sd[abcd]
 (The drives were originally those letters)


 All of the disks sit in an external 4 bay ESATA enclosure going into a
 PCI-E RAID card set up as JBOD, so I can use btrfs' software
 mirroring. This is the enclosure that I have:
 http://www.newegg.com/Product/Product.aspx?Item=N82E16816132029

 The corruption was unexpected. I am not entirely sure what caused it,
 but a few days before the corruption, there were several power
 outages. I do not think that the problem is with the actual hard drive
 hardware since they are fairly new (6 months old) and they pass all
 SMART tests. After a reboot, the btrfs array refused to mount and
 started giving off errors. I do weekly scrubs, balances, and
 defragmentation.

 Ok, all of this should have worked.  Which kernel were you running when
 you had the power outages?

 I'm testing out the patch to skip the extent allocation tree at mount.
 That will be the easiest way to get to the data (readonly, but it'll
 work).

 -chris


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Fajar A. Nugraha
On Wed, Feb 29, 2012 at 7:13 AM, Travis Shivers ttshiv...@gmail.com wrote:
 # ./btrfs-zero-log /dev/sdh
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 parent transid verify failed on 5568194695168 wanted 43477 found 43151
 Ignoring transid failure

Did you try a read-only mount (-o ro) after you run btrfs-zero-log?
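
A read-only attempt along those lines could look like the sketch below. It assumes the device and mount point used elsewhere in this thread. The helper name try_ro_mount is made up, and it has to be run as root.

```shell
#!/bin/sh
# Try a read-only mount; if it fails, show the most recent kernel messages.
# try_ro_mount is a hypothetical helper name; run it as root.
try_ro_mount() {
    dev=$1
    mnt=$2
    if mount -o ro "$dev" "$mnt"; then
        echo "mounted $dev read-only on $mnt"
    else
        echo "read-only mount of $dev failed; recent kernel messages:" >&2
        dmesg | tail -n 20 >&2
        return 1
    fi
}

# Example (commented out so that pasting this block mounts nothing):
# try_ro_mount /dev/sdh /mnt/main
```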

-- 
Fajar


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Gurudas Pai




# btrfs-debug-tree /dev/sdh

couldn't open because of unsupported option features (8).
btrfs-debug-tree: disk-io.c:679: open_ctree_fd: Assertion `!(1)' failed




See if following patch helps.

Author: Chris Mason <chris.ma...@oracle.com>
Date:   Wed Feb 22 12:36:24 2012 -0500

Btrfs: clear the extent uptodate bits during parent transid failures

If btrfs reads a block and finds a parent transid mismatch, it clears
the uptodate flags on the extent buffer, and the pages inside it.  But
we only clear the uptodate bits in the state tree if the block straddles
more than one page.

This is from an old optimization from to reduce contention on the extent
state tree.  But it is buggy because the code that retries a read from
a different copy of the block is going to find the uptodate state bits
set and skip the IO.

The end result of the bug is that we'll never actually read the good
copy (if there is one).

The fix here is to always clear the uptodate state bits, which is safe
because this code is only called when the parent transid fails.

Signed-off-by: Chris Mason <chris.ma...@oracle.com>

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1e8d5e5..a4dc892 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3852,10 +3852,9 @@ int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
 	num_pages = num_extent_pages(eb->start, eb->len);
 	clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
 
-	if (eb_straddles_pages(eb)) {
-		clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
-				      cached_state, GFP_NOFS);
-	}
+	clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
+			      cached_state, GFP_NOFS);
+
 	for (i = 0; i < num_pages; i++) {
 		page = extent_buffer_page(eb, i);
 		if (page)
--
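
The effect described in the commit message can be illustrated with a toy model. This is only a sketch of the reasoning, not kernel code: it covers a block that does not straddle pages, and every name in it is illustrative rather than a kernel symbol.

```shell
#!/bin/sh
# Toy model of the retry logic described in the commit message.
# Mirror 0 holds a stale block (generation 43151) and mirror 1 holds the good
# one (generation 43477). state_uptodate stands in for the uptodate bit kept
# in the extent state tree. All names are illustrative, not kernel symbols.
read_with_retry() {
    always_clear=$1          # 1 = behaviour after the fix, 0 = before it
    wanted=43477
    buffer_gen=43151         # the first read came from the stale mirror 0
    state_uptodate=1         # the state tree says the range is up to date

    # A parent transid mismatch was detected on the first copy.
    if [ "$always_clear" -eq 1 ]; then
        state_uptodate=0     # the fix: always clear the state bits
    fi

    # Retry from mirror 1. The IO is skipped if the range still looks up to date.
    if [ "$state_uptodate" -eq 0 ]; then
        buffer_gen=43477     # a real read from the good mirror
    fi

    if [ "$buffer_gen" -eq "$wanted" ]; then
        echo "ok"
    else
        echo "transid failure"
    fi
}

read_with_retry 0   # before the fix
read_with_retry 1   # after the fix
```

In this model the first call reports a transid failure because the retry never reaches the good mirror, while the second call succeeds.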



Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Chris Mason
On Tue, Feb 28, 2012 at 05:41:45PM -0800, Gurudas Pai wrote:
 
 
 # btrfs-debug-tree /dev/sdh
 couldn't open because of unsupported option features (8).
 btrfs-debug-tree: disk-io.c:679: open_ctree_fd: Assertion `!(1)' failed
 
 
 
 
 See if following patch helps.

This patch may just fix things (if we're really lucky).  If not, yes we
can help figure out where the problem is and get past it.

-chris

 
 Author: Chris Mason <chris.ma...@oracle.com>
 Date:   Wed Feb 22 12:36:24 2012 -0500
 
 Btrfs: clear the extent uptodate bits during parent transid failures
 
 If btrfs reads a block and finds a parent transid mismatch, it clears
 the uptodate flags on the extent buffer, and the pages inside it.  But
 we only clear the uptodate bits in the state tree if the block straddles
 more than one page.
 
 This is from an old optimization from to reduce contention on the extent
 state tree.  But it is buggy because the code that retries a read from
 a different copy of the block is going to find the uptodate state bits
 set and skip the IO.
 
 The end result of the bug is that we'll never actually read the good
 copy (if there is one).
 
 The fix here is to always clear the uptodate state bits, which is safe
 because this code is only called when the parent transid fails.
 
 Signed-off-by: Chris Mason <chris.ma...@oracle.com>
 
 diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
 index 1e8d5e5..a4dc892 100644
 --- a/fs/btrfs/extent_io.c
 +++ b/fs/btrfs/extent_io.c
 @@ -3852,10 +3852,9 @@ int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
  	num_pages = num_extent_pages(eb->start, eb->len);
  	clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
 
 -	if (eb_straddles_pages(eb)) {
 -		clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
 -				      cached_state, GFP_NOFS);
 -	}
 +	clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
 +			      cached_state, GFP_NOFS);
 +
  	for (i = 0; i < num_pages; i++) {
  		page = extent_buffer_page(eb, i);
  		if (page)
 --
 


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Travis Shivers
Where should I grab the source from? The main repo that you have
listed on your main wiki page
(https://btrfs.wiki.kernel.org/articles/b/t/r/Btrfs_source_repositories.html)
is down: 
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git

Has this patch been pushed to a repository where I can grab it?

On Tue, Feb 28, 2012 at 8:50 PM, Chris Mason chris.ma...@oracle.com wrote:
 On Tue, Feb 28, 2012 at 05:41:45PM -0800, Gurudas Pai wrote:


 # btrfs-debug-tree /dev/sdh
 couldn't open because of unsupported option features (8).
 btrfs-debug-tree: disk-io.c:679: open_ctree_fd: Assertion `!(1)' failed




 See if following patch helps.

 This patch may just fix things (if we're really lucky).  If not, yes we
 can help figure out where the problem is and get past it.

 -chris


 Author: Chris Mason <chris.ma...@oracle.com>
 Date:   Wed Feb 22 12:36:24 2012 -0500

     Btrfs: clear the extent uptodate bits during parent transid failures

     If btrfs reads a block and finds a parent transid mismatch, it clears
     the uptodate flags on the extent buffer, and the pages inside it.  But
     we only clear the uptodate bits in the state tree if the block straddles
     more than one page.

     This is from an old optimization from to reduce contention on the extent
     state tree.  But it is buggy because the code that retries a read from
     a different copy of the block is going to find the uptodate state bits
     set and skip the IO.

     The end result of the bug is that we'll never actually read the good
     copy (if there is one).

     The fix here is to always clear the uptodate state bits, which is safe
     because this code is only called when the parent transid fails.

     Signed-off-by: Chris Mason <chris.ma...@oracle.com>

 diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
 index 1e8d5e5..a4dc892 100644
 --- a/fs/btrfs/extent_io.c
 +++ b/fs/btrfs/extent_io.c
 @@ -3852,10 +3852,9 @@ int clear_extent_buffer_uptodate(struct extent_io_tree *tree,
        num_pages = num_extent_pages(eb->start, eb->len);
        clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);

 -     if (eb_straddles_pages(eb)) {
 -             clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
 -                                   cached_state, GFP_NOFS);
 -     }
 +     clear_extent_uptodate(tree, eb->start, eb->start + eb->len - 1,
 +                           cached_state, GFP_NOFS);
 +
        for (i = 0; i < num_pages; i++) {
                page = extent_buffer_page(eb, i);
                if (page)
 --



Re: Btrfs Storage Array Corrupted

2012-02-28 Thread cwillu
On Tue, Feb 28, 2012 at 9:00 PM, Travis Shivers ttshiv...@gmail.com wrote:
 Where should I grab the source from? The main repo that you have
 listed on your main wiki page
 (https://btrfs.wiki.kernel.org/articles/b/t/r/Btrfs_source_repositories.html)
 is down: 
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git

The btrfs wiki is at http://btrfs.ipv5.de .  The kernel.org one is a
static snapshot of the contents made nearly a year ago, prior to the
kernel.org break-in, and should be ignored.

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git is
the development tree, although the above patch is in mainline as of
3.3rc5, which probably makes that the easiest way to try it.


Re: Btrfs Storage Array Corrupted

2012-02-28 Thread Travis Shivers
I upgraded my kernel so my version is now:
Linux server 3.3.0-030300rc5-generic #201202251535 SMP Sat Feb 25
20:36:29 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

The problem has not been solved and I still get the previous errors.

# mount /dev/sdh /mnt/main
mount: wrong fs type, bad option, bad superblock on /dev/sdh,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

# dmesg
[  232.985248] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
4 transid 43477 /dev/sdi
[  232.985434] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
3 transid 43477 /dev/sdh
[  233.027881] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
2 transid 43477 /dev/sdg
[  233.065675] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
1 transid 43476 /dev/sdf
[  284.384320] device fsid 2c11a326-5630-484e-9f1d-9dab777a1028 devid
3 transid 43477 /dev/sdh
[  284.427076] btrfs: disk space caching is enabled
[  284.442565] verify_parent_transid: 2 callbacks suppressed
[  284.442572] parent transid verify failed on 5568194695168 wanted
43477 found 43151
[  284.442834] parent transid verify failed on 5568194695168 wanted
43477 found 43151
[  284.443151] parent transid verify failed on 5568194695168 wanted
43477 found 43151
[  284.443159] parent transid verify failed on 5568194695168 wanted
43477 found 43151
[  284.445740] btrfs: open_ctree failed


On Tue, Feb 28, 2012 at 9:16 PM, cwillu cwi...@cwillu.com wrote:
 On Tue, Feb 28, 2012 at 9:00 PM, Travis Shivers ttshiv...@gmail.com wrote:
 Where should I grab the source from? The main repo that you have
 listed on your main wiki page
 (https://btrfs.wiki.kernel.org/articles/b/t/r/Btrfs_source_repositories.html)
 is down: 
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git

 The btrfs wiki is at http://btrfs.ipv5.de .  The kernel.org one is a
 static snapshot of the contents made nearly a year ago, prior to the
 kernel.org break-in, and should be ignored.

 git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git is
 the development tree, although the above patch is in mainline as of
 3.3rc5, which probably makes that the easiest way to try it.