Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-14 Thread Yan, Zheng
I found another bug. There is code (btrfs_save_ino_cache) that modifies
fs trees after create_pending_snapshots is called. This can corrupt your fs.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Cc: Josef

 I encountered the following panic using the 'btrfs-unstable + for-linus'
 kernel.

 I ran the btrfs fi bal /test5 command, and the mount options of /test5
 are as follows:

  /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)

 So, just a btrfs fi bal would lead to the bug?
 I think so.

 It should be specific to the inode caching code.  The balancing code is
 finding the inode map cache extents, but it doesn't know how to relocate
 them.

 However, the panic has occurred even if inode_cache is turned off.
 Is this another problem?
 

I don't think the free inode cache is the cause of the bug here (even if
inode_cache is turned on).

What I have found out is:

1. git checkout a4abeea41adfa3c143c289045f4625dfaeba2212

So the top commit is the removal of trans_mutex, with no delayed_inode patch
or free inode cache patchset in the tree, and the bug can be triggered.

2. git checkout 2a1eb4614d984d5cd4c928784e9afcf5c07f93be

So the top commit is the one before the trans_mutex removal, and no bug is
triggered.

3. test linus' tree

bug triggered.

4. revert that suspicious commit manually from linus' tree

no bug.

So either that commit is buggy or it reveals some bugs that were masked by
the trans_mutex.



Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
Add a mutex to btrfs_init_reloc_root() to prevent the reloc tree
creation from concurrent execution.
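
The idea of serializing a check-then-create sequence can be illustrated outside the kernel. The following is a minimal userspace sketch using POSIX threads, not the patch itself; `demo_root`, `demo_init_reloc_root` and `demo_run` are hypothetical names chosen only to echo the functions discussed in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a root and its lazily created reloc tree. */
struct demo_root {
	pthread_mutex_t reloc_mutex; /* plays the role of a reloc mutex */
	int *reloc_root;             /* NULL until the "tree" is created */
	int creations;               /* how many times creation actually ran */
};

static int demo_tree; /* dummy object standing in for the created tree */

/* Check and create under one lock, so at most one thread creates. */
static void demo_init_reloc_root(struct demo_root *root)
{
	pthread_mutex_lock(&root->reloc_mutex);
	if (!root->reloc_root) {
		root->reloc_root = &demo_tree;
		root->creations++;
	}
	pthread_mutex_unlock(&root->reloc_mutex);
}

static void *demo_worker(void *arg)
{
	demo_init_reloc_root(arg);
	return NULL;
}

/* Race several threads on one root and return the creation count. */
int demo_run(int nthreads)
{
	struct demo_root root;
	pthread_t tid[16];
	int i;

	assert(nthreads > 0 && nthreads <= 16);
	pthread_mutex_init(&root.reloc_mutex, NULL);
	root.reloc_root = NULL;
	root.creations = 0;

	for (i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, demo_worker, &root);
	for (i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);

	pthread_mutex_destroy(&root.reloc_mutex);
	return root.creations;
}
```

Without the lock, two threads could both see a NULL `reloc_root` and both run the creation step; with it, `demo_run()` should report exactly one creation however many threads race.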



Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Yan, Zheng wrote:
 Add a mutex to btrfs_init_reloc_root() to prevent the reloc tree
 creation from concurrent execution.

Thanks!

Unfortunately I can still encounter BUG() in different places in
each run:

kernel BUG at fs/btrfs/extent-tree.c:6173!

kernel BUG at fs/btrfs/volumes.c:2567!

 


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Li Zefan's message of 2011-06-13 03:13:13 -0400:
 so either that commit is buggy or it reveals some bugs covered by the 
 trans_mutex.

Ok, what dataset are you balancing?

What are you doing concurrently with the balance?

I haven't been able to trigger this here.

-chris


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
The usage of trans_mutex in relocation code is subtle. It controls the
interaction of relocation with transaction start, transaction commit and
snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.




Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

What requirements of the relocation stuff are we currently missing?  I'd
rather add extra waiting for relocation than go back to the mutex.

-chris

 


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Chris Mason's message of 2011-06-13 09:12:06 -0400:
 Excerpts from Li Zefan's message of 2011-06-13 03:13:13 -0400:
  so either that commit is buggy or it reveals some bugs covered by the 
  trans_mutex.
 
 Ok, what dataset are you balancing?
 
 What are you doing concurrently with the balance?
 
 I haven't been able to trigger this here.

There we go.  It took about two hours but I hit this with the balancing
exerciser that was posted.

-chris


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

So, I've got a mutex around the reloc_root here and that was almost but
not quite enough.  It looks like the biggest problem is that we need to
wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.

I'm surviving much longer with a patch in place that synchronizes
btrfs_record_root_in_trans better.

Zheng if you have other comments on the locking please let me know.

-chris
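
The patch mentioned above is not shown in this thread. As a rough illustration only, the requirement "wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots" can be sketched in userspace with a POSIX condition variable; every name below (`demo_reloc`, `demo_merge`, `demo_record`, `demo_wait_for_merge`) is hypothetical and the kernel code may well use a different mechanism.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state; not a kernel structure. */
struct demo_reloc {
	pthread_mutex_t lock;
	pthread_cond_t merge_done;
	int merging;    /* nonzero while a merge is in progress */
	int merged;     /* set once the merge has completed */
	int saw_merged; /* what the recorder observed after waiting */
};

/* Stand-in for the merge step: finish the merge, then wake waiters. */
static void *demo_merge(void *arg)
{
	struct demo_reloc *r = arg;

	pthread_mutex_lock(&r->lock);
	r->merged = 1;
	r->merging = 0;
	pthread_cond_broadcast(&r->merge_done);
	pthread_mutex_unlock(&r->lock);
	return NULL;
}

/* Stand-in for the record step: block while a merge is running. */
static void *demo_record(void *arg)
{
	struct demo_reloc *r = arg;

	pthread_mutex_lock(&r->lock);
	while (r->merging)
		pthread_cond_wait(&r->merge_done, &r->lock);
	assert(!r->merging);
	r->saw_merged = r->merged;
	pthread_mutex_unlock(&r->lock);
	return NULL;
}

/* Start with a merge "in progress"; return what the recorder saw. */
int demo_wait_for_merge(void)
{
	struct demo_reloc r;
	pthread_t recorder, merger;

	pthread_mutex_init(&r.lock, NULL);
	pthread_cond_init(&r.merge_done, NULL);
	r.merging = 1;
	r.merged = 0;
	r.saw_merged = 0;

	pthread_create(&recorder, NULL, demo_record, &r);
	pthread_create(&merger, NULL, demo_merge, &r);
	pthread_join(recorder, NULL);
	pthread_join(merger, NULL);

	pthread_cond_destroy(&r.merge_done);
	pthread_mutex_destroy(&r.lock);
	return r.saw_merged;
}
```

Because the recorder re-checks `merging` in a loop while holding the lock, it can only proceed after the merge has finished, regardless of which thread is scheduled first.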


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
On Tue, Jun 14, 2011 at 3:55 AM, Chris Mason chris.ma...@oracle.com wrote:
 So, I've got a mutex around the reloc_root here and that was almost but
 not quite enough.  It looks like the biggest problem is that we need to
 wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.

 I'm surviving much longer with a patch in place that synchronizes
 btrfs_record_root_in_trans better.

 Zheng if you have other comments on the locking please let me know.


following untested patch may help.
---
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 378b5b4..0b20dda 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -951,6 +951,7 @@ struct btrfs_fs_info {
 	struct mutex cleaner_mutex;
 	struct mutex chunk_mutex;
 	struct mutex volume_mutex;
+	struct mutex reloc_mutex;
 	/*
 	 * this protects the ordered operations list only while we are
 	 * processing all of the entries on it.  This way we make
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9f68c68..28f8b11 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1714,6 +1714,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 	mutex_init(&fs_info->transaction_kthread_mutex);
 	mutex_init(&fs_info->cleaner_mutex);
 	mutex_init(&fs_info->volume_mutex);
+	mutex_init(&fs_info->reloc_mutex);
 	init_rwsem(&fs_info->extent_commit_sem);
 	init_rwsem(&fs_info->cleanup_work_sem);
 	init_rwsem(&fs_info->subvol_sem);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index b1ef27c..620e4af 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1330,18 +1330,20 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,
 			  struct btrfs_root *root)
 {
 	struct btrfs_root *reloc_root;
-	struct reloc_control *rc = root->fs_info->reloc_ctl;
+	struct reloc_control *rc;
 	int clear_rsv = 0;

+	mutex_lock(&root->fs_info->reloc_mutex);
 	if (root->reloc_root) {
 		reloc_root = root->reloc_root;
 		reloc_root->last_trans = trans->transid;
-		return 0;
+		goto unlock;
 	}

+	rc = root->fs_info->reloc_ctl;
 	if (!rc || !rc->create_reloc_tree ||
 	    root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID)
-		return 0;
+		goto unlock;

 	if (!trans->block_rsv) {
 		trans->block_rsv = rc->block_rsv;
@@ -1353,6 +1355,8 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,

 	__add_reloc_root(reloc_root);
 	root->reloc_root = reloc_root;
+unlock:
+	mutex_unlock(&root->fs_info->reloc_mutex);
 	return 0;
 }

@@ -1367,8 +1371,9 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	int del = 0;
 	int ret;

+	mutex_lock(&root->fs_info->reloc_mutex);
 	if (!root->reloc_root)
-		return 0;
+		goto unlock;

 	reloc_root = root->reloc_root;
 	root_item = &reloc_root->root_item;
@@ -1390,6 +1395,8 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	ret = btrfs_update_root(trans, root->fs_info->tree_root,
 				&reloc_root->root_key, root_item);
 	BUG_ON(ret);
+unlock:
+	mutex_unlock(&root->fs_info->reloc_mutex);
 	return 0;
 }

@@ -2142,10 +2149,10 @@ int prepare_to_merge(struct reloc_control *rc, int err)
 	u64 num_bytes = 0;
 	int ret;

-	spin_lock(&root->fs_info->trans_lock);
+	mutex_lock(&root->fs_info->reloc_mutex);
 	rc->merging_rsv_size += root->nodesize * (BTRFS_MAX_LEVEL - 1) * 2;
 	rc->merging_rsv_size += rc->nodes_relocated * 2;
-	spin_unlock(&root->fs_info->trans_lock);
+	mutex_unlock(&root->fs_info->reloc_mutex);
 again:
 	if (!err) {
 		num_bytes = rc->merging_rsv_size;
@@ -2214,9 +2221,9 @@ int merge_reloc_roots(struct reloc_control *rc)
 	int ret;
 again:
 	root = rc->extent_root;
-	spin_lock(&root->fs_info->trans_lock);
+	mutex_lock(&root->fs_info->reloc_mutex);
 	list_splice_init(&rc->reloc_roots, &reloc_roots);
-	spin_unlock(&root->fs_info->trans_lock);
+	mutex_unlock(&root->fs_info->reloc_mutex);

 	while (!list_empty(&reloc_roots)) {
 		found = 1;
@@ -3590,17 +3597,17 @@ next:
 static void set_reloc_control(struct reloc_control *rc)
 {
 	struct btrfs_fs_info *fs_info = rc->extent_root->fs_info;
-	spin_lock(&fs_info->trans_lock);
+	mutex_lock(&fs_info->reloc_mutex);
 	fs_info->reloc_ctl = rc;
-	spin_unlock(&fs_info->trans_lock);
+	mutex_unlock(&fs_info->reloc_mutex);
 }

 static void unset_reloc_control(struct reloc_control *rc)
 {
 

Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Yan, Zheng wrote:
 following untested patch may help.

I've tested this patch, and the bug was triggered in minutes as usual.

Also I've tested a patch in an offline email from Chris, which survived
the test.