mount after reboot of btrfs RAID-10 fails with "btrfs: failed to read the system array on sda"

2010-01-23 Thread 0bo0
after a simple reboot,

btrfs-show
Label: TEST  uuid: 2ac85206-2d88-47d7-a1e7-a93d80b199f8
Total devices 4 FS bytes used 28.00KB
devid    1 size 931.51GB used 2.03GB path /dev/sda
devid    2 size 931.51GB used 2.01GB path /dev/sdb
devid    3 size 931.51GB used 2.01GB path /dev/sdc
devid    4 size 931.51GB used 2.01GB path /dev/sdd

but,

mount /dev/sda /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sda,
   missing codepage or helper program, or other error
   In some cases useful info is found in syslog - try
   dmesg | tail  or so

where,

tail -f /var/log/messages,

Jan 23 21:49:23 test kernel: [   94.949335] device fsid f9452f77524a701a-28bb2c0e9bab5a99 devid 1 transid 17 /dev/sda
Jan 23 21:49:23 test kernel: [   94.951716] btrfs: failed to read the system array on sda
Jan 23 21:49:23 test kernel: [   94.952748] btrfs: open_ctree failed

mkfs.btrfs -m raid10 -d raid10 /dev/sd[abcd]

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

adding device /dev/sdb id 2
adding device /dev/sdc id 3
adding device /dev/sdd id 4
fs created label (null) on /dev/sda
nodesize 4096 leafsize 4096 sectorsize 4096 size 3.64TB
Btrfs Btrfs v0.19

mount /dev/sda /mnt
df -H | grep -i sda
/dev/sda   4.1T   29k   4.1T   1% /mnt

fyi,

lsb_release -ri
  Distributor ID: SUSE LINUX
  Release: 11.2
uname -a
  Linux test 2.6.31.8-0.1-xen #1 SMP 2009-12-15 23:55:40 +0100 x86_64 x86_64 x86_64 GNU/Linux
rpm -qa | grep btr
  btrfsprogs-0.19-10.1.x86_64


a bug?
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RAID-10 arrays built with btrfs & md report 2x difference in available size?

2010-01-23 Thread 0bo0
I created a btrfs RAID-10 array across 4-drives,

 mkfs.btrfs -L TEST -m raid10 -d raid10 /dev/sda /dev/sdb /dev/sdc /dev/sdd
 btrfs-show
Label: TEST  uuid: 2ac85206-2d88-47d7-a1e7-a93d80b199f8
Total devices 4 FS bytes used 28.00KB
devid    1 size 931.51GB used 2.03GB path /dev/sda
devid    2 size 931.51GB used 2.01GB path /dev/sdb
devid    4 size 931.51GB used 2.01GB path /dev/sdd
devid    3 size 931.51GB used 2.01GB path /dev/sdc

@ mount,

 mount /dev/sda /mnt
 df -H | grep /dev/sda
/dev/sda   4.1T   29k   4.1T   1% /mnt

for RAID-10 across 4 drives, shouldn't the reported/available size be
1/2 x 4TB ~ 2TB?

e.g., using mdadm to build a RAID-10 array across the same drives,

 mdadm -v --create /dev/md0 --level=raid10 --raid-devices=4 /dev/sd[abcd]1
 pvcreate /dev/md0
pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/md0        lvm2 --   1.82T 1.82T

is the difference in available array space real, an artifact, or a
misunderstanding on my part?

thanks.
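
For reference, the figures quoted above can be reconciled with a little arithmetic. The sketch below is not from the thread. It assumes four identical drives of 931.51 GiB each (reading btrfs-show's "GB" as binary units), that `df -H` prints decimal units, and that `pvs` prints binary units. The function names are illustrative only.

```python
# Rough capacity arithmetic for a 4-drive array; all inputs are assumptions
# taken from the btrfs-show output quoted in the message.
GIB = 2**30
TIB = 2**40
TB = 10**12


def raw_bytes(num_drives, drive_gib):
    """Sum of all drive sizes, ignoring any redundancy."""
    return num_drives * drive_gib * GIB


def raid10_usable_bytes(num_drives, drive_gib):
    """RAID-10 keeps two copies of every block, so half the raw space is usable."""
    return raw_bytes(num_drives, drive_gib) / 2


raw = raw_bytes(4, 931.51)
usable = raid10_usable_bytes(4, 931.51)
print(f"raw:    {raw / TB:.2f} TB = {raw / TIB:.2f} TiB")
print(f"usable: {usable / TB:.2f} TB = {usable / TIB:.2f} TiB")
```

Under these assumptions the raw total is about 4.00 TB (3.64 TiB) and the RAID-10 usable size is about 2.00 TB (1.82 TiB). The 4.1T from df is therefore close to the raw total, while the 1.82T from pvs matches half of it, which suggests the two tools are reporting different quantities rather than different capacities.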


when/why to use different raid values for btrfs data & metadata?

2010-01-23 Thread mail ignored
Hi,

Just getting started with btrfs.

I understand that btrfs stores data/metadata in two different tree
structures – one for file/directory names, and one for data blocks.

Reading @,

 http://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
  Use raid10 for both data and metadata
mkfs.btrfs -m raid10 -d raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

and @,

 "Churning Butter(FS): An Interview with Chris Mason"
  http://www.linux-mag.com/id/7329

CM Today you can do this:
mkfs.btrfs -m raid1 -d raid10 /dev/sda /dev/sdb /dev/sdc /dev/sdd
And you’ll get metadata on raid1 and data on raid10. The raid10 will
use all four drives and the raid1 will use two drives at a time. Yes,
btrfs allows you to pick different values for data or metadata.

The fact that I *can* setup data & metadata differently is clear.  But
I'm not at all clear *why* I'd want to, or what the advantages are.
I'd guess it's a balance/combination of performance & resiliency.

Naively "-m raid10 -d raid10" seems to make the most sense -- if I
have it, use it.

Are there any benchmarks, guidelines or recommendations?
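
As a reading aid for the quoted mkfs example, the difference between the two profiles on four drives can be written down as a toy model. This is only a sketch of what the quote describes (raid1 uses two drives at a time, raid10 uses all four) plus the general fact that both keep two copies of each block. The `Profile` class and `stripe_width` helper are hypothetical names, not btrfs code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    copies: int             # copies kept of every block
    devices_per_chunk: int  # drives a single chunk is spread over


# Toy description of the four-drive example; not btrfs allocator code.
FOUR_DRIVE_PROFILES = {
    "raid1": Profile(copies=2, devices_per_chunk=2),
    "raid10": Profile(copies=2, devices_per_chunk=4),
}


def stripe_width(profile):
    """Number of drives holding distinct data in parallel within one chunk."""
    return profile.devices_per_chunk // profile.copies


for name, p in FOUR_DRIVE_PROFILES.items():
    print(f"{name}: {p.copies} copies, {p.devices_per_chunk} drives per chunk, "
          f"stripe width {stripe_width(p)}")
```

In this model both profiles survive the loss of one drive, and they differ only in how widely a chunk is striped. Whether that striping matters for metadata is the open question the message asks.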


severe hardlink bug

2010-01-23 Thread Michael Niederle
I'm using btrfs with kernel 2.6.32.2 (built in) as the root file system of a
Gentoo Linux installation.

While attempting to install the plt-scheme package a strange error about link
counts occurred ([Error 31] Too many Links).

I was able to create a simple scenario to reproduce the error:

Create an empty directory and run the following script within the directory:

#!/bin/bash
# Create one file, then try to hard-link it 80 times in the same directory.

base=.tab-snip-class_scrbl.dep._portage_merge_.9766
echo "abc" > "$base"
for x in $(seq 80); do
    echo "$x"
    ln "$base" "$base.$x"
done


(The names are taken from the original Gentoo package.)

When I run this script the first 66 links are created properly. All further
link creations fail:

...
63
64
65
66
67
ln: creating hard link to `.tab-snip-class_scrbl.dep._portage_merge_.9766': Too many links
68
ln: creating hard link to `.tab-snip-class_scrbl.dep._portage_merge_.9766': Too many links
69
ln: creating hard link to `.tab-snip-class_scrbl.dep._portage_merge_.9766': Too many links
70
ln: creating hard link to `.tab-snip-class_scrbl.dep._portage_merge_.9766': Too many links
...


I was able to reproduce the bug independently on two different machines.

During earlier tests I determined that filename length has an effect as well as
the link count.

Greetings, Michael
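
Since the report says filename length matters as well as the link count, one way to explore that is to repeat the test with names of different lengths. The following is a minimal sketch, not part of the original report. The helper name `count_hard_links`, the chosen lengths, and the limit of 200 are all arbitrary assumptions. Pass a directory on the filesystem under test as the first argument. Without one, the system temporary directory is used.

```python
import errno
import os
import sys
import tempfile


def count_hard_links(directory, base_name, max_links):
    """Create up to max_links hard links to one file; return how many succeeded."""
    base = os.path.join(directory, base_name)
    with open(base, "w") as f:
        f.write("abc\n")
    created = 0
    for i in range(1, max_links + 1):
        try:
            os.link(base, f"{base}.{i}")
        except OSError as e:
            if e.errno == errno.EMLINK:  # "Too many links"
                break
            raise
        created += 1
    return created


if __name__ == "__main__":
    # Optional argument: a directory on the filesystem to probe.
    root = sys.argv[1] if len(sys.argv) > 1 else None
    for name_len in (8, 46, 128):
        with tempfile.TemporaryDirectory(dir=root) as d:
            n = count_hard_links(d, "x" * name_len, 200)
            print(f"name length {name_len}: {n} links created")
```

A length of 46 matches the file name used in the script above. If the limit depends on name length, the reported counts should differ between the three runs.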


Re: Btrfs st_nlink for directories

2010-01-23 Thread Chris Mason
On Sun, Jan 24, 2010 at 02:12:59AM +0530, Aneesh Kumar K. V wrote:
> On Fri, 22 Jan 2010 20:28:12 -0600, Neil Schemenauer  wrote:
> > Hi,
> > 
> > It looks like Btrfs does not follow Unix traditions for st_nlink
> > attribute of directories. It seems to be always one, no matter the
> > number of sub-directories.
> > 
> > Is this intentional? I couldn't find it discussed anywhere. I
> > gather the Mac OS HFS+ doesn't follow traditional st_nlink behavior
> > as well. The 'find' man page has this note:
> 
> I have sent patches with message-id
> 1264279089-14913-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com
> to the list. Let me know if they work for you.

Thanks for taking a look at this Aneesh, but in btrfs we always have a
link count of one on directories.

It's a design decision so that we don't end up limited in the total
number of subdirs we can create.  reiser3 did something similar,
switching to 1 when the link count got high.  I think the other
filesystems may have added something along these lines as well by now.

Btrfs just leaves it at one all the time.

-chris
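
To make the consequence for userspace concrete: under the traditional convention a directory's st_nlink is 2 plus its number of subdirectories, so a tool can derive the subdirectory count without reading the directory. With a constant link count of one that shortcut is unavailable and the tool has to scan. The sketch below is illustrative only and is not taken from find or from btrfs. Both function names are hypothetical.

```python
import os
import tempfile


def subdir_count_from_nlink(path):
    """Return the subdirectory count implied by st_nlink, or None if unusable.

    Assumes the traditional convention nlink == 2 + number of subdirectories.
    A link count below 2 (for example, always 1) carries no such information.
    """
    nlink = os.stat(path).st_nlink
    if nlink < 2:
        return None
    return nlink - 2


def count_subdirs_by_scanning(path):
    """Count subdirectories the slow but always-valid way."""
    with os.scandir(path) as entries:
        return sum(1 for e in entries if e.is_dir(follow_symlinks=False))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        for name in ("a", "b", "c"):
            os.mkdir(os.path.join(top, name))
        print("hint from st_nlink:", subdir_count_from_nlink(top))
        print("count from scan:   ", count_subdirs_by_scanning(top))
```

On a filesystem that always reports 1, the hint comes back as None and only the scan gives an answer.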



Re: [PATCH 2/3] btrfs: Update directory link count correctly while creation

2010-01-23 Thread Chris Mason
On Sun, Jan 24, 2010 at 02:08:08AM +0530, Aneesh Kumar K.V wrote:
> This patch makes sure we update the directory link count correctly
> during mkdir and rename

Directories always have a link count of 1, this is the widely accepted
way of saying: we don't count subdirs via the link count.

-chris


Re: Btrfs st_nlink for directories

2010-01-23 Thread Aneesh Kumar K. V
On Fri, 22 Jan 2010 20:28:12 -0600, Neil Schemenauer  wrote:
> Hi,
> 
> It looks like Btrfs does not follow Unix traditions for st_nlink
> attribute of directories. It seems to be always one, no matter the
> number of sub-directories.
> 
> Is this intentional? I couldn't find it discussed anywhere. I
> gather the Mac OS HFS+ doesn't follow traditional st_nlink behavior
> as well. The 'find' man page has this note:

I have sent patches with message-id
1264279089-14913-1-git-send-email-aneesh.ku...@linux.vnet.ibm.com
to the list. Let me know if they work for you.

-aneesh




[PATCH 1/3] btrfs: Drop the link count update from btrfs_unlink_inode

2010-01-23 Thread Aneesh Kumar K.V
This helps to clean up the rename and unlink cases

Signed-off-by: Aneesh Kumar K.V 
---
 fs/btrfs/compat.h   |    1 +
 fs/btrfs/inode.c    |   12 ++++++++----
 fs/btrfs/tree-log.c |    5 ++---
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/compat.h b/fs/btrfs/compat.h
index 7c4503e..19c82bb 100644
--- a/fs/btrfs/compat.h
+++ b/fs/btrfs/compat.h
@@ -3,5 +3,6 @@
 
 #define btrfs_drop_nlink(inode) drop_nlink(inode)
 #define btrfs_inc_nlink(inode) inc_nlink(inode)
+#define btrfs_clear_nlink(inode) clear_nlink(inode)
 
 #endif /* _COMPAT_H_ */
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b330e27..32aa27d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2550,7 +2550,6 @@ err:
btrfs_i_size_write(dir, dir->i_size - name_len * 2);
inode->i_ctime = dir->i_mtime = dir->i_ctime = CURRENT_TIME;
btrfs_update_inode(trans, root, dir);
-   btrfs_drop_nlink(inode);
ret = btrfs_update_inode(trans, root, inode);
 out:
return ret;
@@ -2583,9 +2582,10 @@ static int btrfs_unlink(struct inode *dir, struct dentry *dentry)
btrfs_set_trans_block_group(trans, dir);
 
btrfs_record_unlink_dir(trans, dir, dentry->d_inode, 0);
-
ret = btrfs_unlink_inode(trans, root, dir, dentry->d_inode,
 dentry->d_name.name, dentry->d_name.len);
+   btrfs_drop_nlink(dentry->d_inode);
+   btrfs_update_inode(trans, root, dentry->d_inode);
 
if (inode->i_nlink == 0)
ret = btrfs_orphan_add(trans, inode);
@@ -2701,8 +2701,11 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
/* now the directory is empty */
err = btrfs_unlink_inode(trans, root, dir, dentry->d_inode,
 dentry->d_name.name, dentry->d_name.len);
-   if (!err)
+   if (!err) {
btrfs_i_size_write(inode, 0);
+   btrfs_drop_nlink(inode);
+   btrfs_update_inode(trans, root, inode);
+   }
 out:
nr = trans->blocks_used;
ret = btrfs_end_transaction_throttle(trans, root);
@@ -5575,7 +5578,6 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
old_dentry->d_name.name,
old_dentry->d_name.len);
} else {
-   btrfs_inc_nlink(old_dentry->d_inode);
ret = btrfs_unlink_inode(trans, root, old_dir,
 old_dentry->d_inode,
 old_dentry->d_name.name,
@@ -5598,6 +5600,8 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 new_dentry->d_inode,
 new_dentry->d_name.name,
 new_dentry->d_name.len);
+   btrfs_drop_nlink(new_dentry->d_inode);
+   btrfs_update_inode(trans, dest, new_dentry->d_inode);
}
BUG_ON(ret);
if (new_inode->i_nlink == 0) {
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 4a9434b..fb2d6a1 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -671,6 +671,8 @@ static noinline int drop_one_dir_item(struct btrfs_trans_handle *trans,
BUG_ON(ret);
 
ret = btrfs_unlink_inode(trans, root, dir, inode, name, name_len);
+   btrfs_drop_nlink(inode);
+   btrfs_update_inode(trans, root, inode);
BUG_ON(ret);
kfree(name);
 
@@ -869,9 +871,7 @@ conflict_again:
 
if (!backref_in_log(log, key, victim_name,
victim_name_len)) {
-   btrfs_inc_nlink(inode);
btrfs_release_path(root, path);
-
ret = btrfs_unlink_inode(trans, root, dir,
 inode, victim_name,
 victim_name_len);
@@ -1426,7 +1426,6 @@ again:
ret = link_to_fixup_dir(trans, root,
path, location.objectid);
BUG_ON(ret);
-   btrfs_inc_nlink(inode);
ret = btrfs_unlink_inode(trans, root, dir, inode,
 name, name_len);
BUG_ON(ret);
-- 
1.6.6.1.394.gdedc0



[PATCH 2/3] btrfs: Update directory link count correctly while creation

2010-01-23 Thread Aneesh Kumar K.V
This patch makes sure we update the directory link count correctly
during mkdir and rename


Signed-off-by: Aneesh Kumar K.V 
---
 fs/btrfs/inode.c |   21 +++++++++++++++++++++
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 32aa27d..9a56b3b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4575,6 +4575,10 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
btrfs_set_trans_block_group(trans, inode);
 
btrfs_i_size_write(inode, 0);
+   /*
+* Directories have link count 2
+*/
+   btrfs_inc_nlink(inode);
err = btrfs_update_inode(trans, root, inode);
if (err)
goto out_fail;
@@ -4585,6 +4589,13 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
if (err)
goto out_fail;
 
+   /*
+* if we are adding a directory we need
+* to bump the link count of parent
+*/
+   btrfs_inc_nlink(dentry->d_parent->d_inode);
+   btrfs_update_inode(trans, root, dentry->d_parent->d_inode);
+
d_instantiate(dentry, inode);
drop_on_err = 0;
btrfs_update_inode_block_group(trans, inode);
@@ -5613,6 +5624,16 @@ static int btrfs_rename(struct inode *old_dir, struct dentry *old_dentry,
ret = btrfs_add_link(trans, new_dir, old_inode,
 new_dentry->d_name.name,
 new_dentry->d_name.len, 0, index);
+
+   if (!new_inode && S_ISDIR(old_inode->i_mode)) {
+   /*
+* Bump the parent directory link count, if we
+* end up adding a new inode
+*/
+   btrfs_inc_nlink(new_dir);
+   btrfs_update_inode(trans, dest, new_dir);
+   }
+
BUG_ON(ret);
 
if (old_inode->i_ino != BTRFS_FIRST_FREE_OBJECTID) {
-- 
1.6.6.1.394.gdedc0



[PATCH 3/3] btrfs: Update the directory link count correctly

2010-01-23 Thread Aneesh Kumar K.V
This patch makes sure we update the directory link count properly
during delete

Signed-off-by: Aneesh Kumar K.V 
---
 fs/btrfs/inode.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9a56b3b..12b9d6e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2703,8 +2703,17 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 dentry->d_name.name, dentry->d_name.len);
if (!err) {
btrfs_i_size_write(inode, 0);
-   btrfs_drop_nlink(inode);
+   /*
+* Directories have link count 2.
+*/
+   btrfs_clear_nlink(dentry->d_inode);
btrfs_update_inode(trans, root, inode);
+   /*
+* If we are removing a directory drop the link count from
+* the parent directory
+*/
+   btrfs_drop_nlink(dir);
+   btrfs_update_inode(trans, root, dir);
}
 out:
nr = trans->blocks_used;
-- 
1.6.6.1.394.gdedc0



Re: [btrfs] kernel BUG at include/linux/spinlock.h:376!

2010-01-23 Thread Johannes Hirte
On Thursday, 14 January 2010 at 20:37:08, Chris Mason wrote:
> On Thu, Jan 07, 2010 at 10:29:32PM +0100, Johannes Hirte wrote:
> > One of my btrfs filesystems gives the following bug message on access:
> > 
> > Jan  6 23:08:12 datengrab kernel: [ cut here ]
> > Jan  6 23:08:12 datengrab kernel: kernel BUG at include/linux/spinlock.h:376!
> > Jan  6 23:08:12 datengrab kernel: invalid opcode:  [#1] SMP
> > Jan  6 23:08:12 datengrab kernel: last sysfs file: /sys/devices/pci:00/:00:18.3/temp1_input
> > Jan  6 23:08:12 datengrab kernel: CPU 1
> > Jan  6 23:08:12 datengrab kernel: Pid: 2837, comm: btrfs-endio-wri Not tainted 2.6.33-rc3-00033-g03b7675 #12 TYAN Tiger K8W Dual AMD Opteron, S2875/To Be Filled By O.E.M.
> > Jan  6 23:08:12 datengrab kernel: RIP: 0010:[]  [] btrfs_assert_tree_locked+0x16/0x1c
> 
> Well, we really should have this tree block locked, but
> btrfs_mark_extent_written is doing some special things.  Is the trace
> always the same?

Sorry for the long delay. Yes the trace was always the same. I can't test 
patches, since I'm not working on the corrupted FS anymore. The bug only 
occurred on the corrupted filesystem. But as I've seen, Yan Zheng has tracked 
it down (commit 6c7d54ac87f338c479d9729e8392eca3f76e11e1). 

I still suspect that the FS corruption was caused by this bug. It hasn't
happened again. If it does, I'll report.

regards,
  Johannes


btrfsck failed

2010-01-23 Thread Michael Niederle
I tried an (offline) btrfsck and got the following error message:

> btrfsck /dev/sdb3
btrfsck: btrfsck.c:584: splice_shared_node: Assertion `!(src == &src_node->root_cache)' failed.
Aborted

I use kernel version 2.6.32.2 with builtin btrfs-drivers.

Greetings, Michael