Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-14 Thread Yan, Zheng
I found another bug. Some code (btrfs_save_ino_cache) modifies fs trees
after create_pending_snapshots is called. This can corrupt your fs.

On Mon, Jun 13, 2011 at 3:13 PM, Li Zefan l...@cn.fujitsu.com wrote:
 so either that commit is buggy or it reveals some bugs covered by the 
 trans_mutex.


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Cc: Josef

 I encountered following panic using 'btrfs-unstable + for-linus'
 kernel.

 I ran btrfs fi bal /test5 command, and mount option of /test5
 is as follows:

  /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)

 So, just a btrfs fi bal would lead to the bug?
 I think so.

 It should be specific to the inode caching code.  The balancing code is
 finding the inode map cache extents, but it doesn't know how to relocate
 them.
 
 However, the panic has occurred even if inode_cache is turned off.
 Is this another problem?
 

I don't think the free inode cache is the cause of the bug here (even if
inode_cache is turned on).

What I have found out is:

1. git checkout a4abeea41adfa3c143c289045f4625dfaeba2212

The top commit is the removal of trans_mutex, with neither the delayed_inode
patch nor the free inode cache patchset in the tree, and the bug can be
triggered.

2. git checkout 2a1eb4614d984d5cd4c928784e9afcf5c07f93be

The top commit is the one just before the trans_mutex removal, and the bug is
not triggered.

3. Test Linus' tree

The bug is triggered.

4. Manually revert that suspicious commit from Linus' tree

No bug.

So either that commit is buggy, or it exposes bugs that were previously
hidden by trans_mutex.



Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
Add a mutex to btrfs_init_reloc_root() to prevent the reloc tree
creation from concurrent execution.
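
The patch itself is not included in this message. As a rough userspace
illustration of the idea only, the sketch below uses a POSIX mutex to keep
the "does a reloc root already exist?" check and the creation inside one
critical section, so that two callers cannot both create it. The types
fs_root and reloc_root, the create_count field and the function names are
hypothetical stand-ins, not the btrfs definitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for btrfs structures; not the kernel definitions. */
struct reloc_root {
	unsigned long last_trans;
};

struct fs_root {
	pthread_mutex_t reloc_mutex;   /* plays the role of the proposed mutex */
	struct reloc_root *reloc_root; /* NULL until first created */
	int create_count;              /* how many times creation actually ran */
};

static void fs_root_init(struct fs_root *root)
{
	pthread_mutex_init(&root->reloc_mutex, NULL);
	root->reloc_root = NULL;
	root->create_count = 0;
}

/*
 * Check-then-create with the check and the creation in one critical
 * section. Returns 0 on success, -1 if allocation fails.
 */
static int init_reloc_root(struct fs_root *root, unsigned long transid)
{
	int ret = 0;

	assert(root != NULL);

	pthread_mutex_lock(&root->reloc_mutex);
	if (root->reloc_root) {
		/* Already created by an earlier caller: only refresh it. */
		root->reloc_root->last_trans = transid;
		goto unlock;
	}

	root->reloc_root = malloc(sizeof(*root->reloc_root));
	if (!root->reloc_root) {
		ret = -1;
		goto unlock;
	}
	root->reloc_root->last_trans = transid;
	root->create_count++;
unlock:
	pthread_mutex_unlock(&root->reloc_mutex);
	return ret;
}
```

Calling init_reloc_root() twice on the same root creates the reloc root
once; the second call only refreshes last_trans.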




Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Yan, Zheng wrote:
 Add a mutex to btrfs_init_reloc_root() to prevent the reloc tree
 creation from concurrent execution.

Thanks!

Unfortunately I still hit BUG() in a different place on each run:

kernel BUG at fs/btrfs/extent-tree.c:6173!

kernel BUG at fs/btrfs/volumes.c:2567!

 


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Li Zefan's message of 2011-06-13 03:13:13 -0400:
 so either that commit is buggy or it reveals some bugs covered by the 
 trans_mutex.

Ok, what dataset are you balancing?

What are you doing concurrently with the balance?

I haven't been able to trigger this here.

-chris


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
The usage of trans_mutex in the relocation code is subtle. It controls the
interaction of relocation with transaction start, transaction commit and
snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.





Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in the relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

What requirements of the relocation stuff are we currently missing?  I'd
rather add extra waiting for relocation than go back to the mutex.

-chris

 
 


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Chris Mason's message of 2011-06-13 09:12:06 -0400:
 Ok, what dataset are you balancing?
 
 What are you doing concurrently with the balance?
 
 I haven't been able to trigger this here.

There we go.  It took about two hours but I hit this with the balancing
exerciser that was posted.

-chris


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Chris Mason
Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in the relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

So, I've got a mutex around the reloc_root here and that was almost but
not quite enough.  It looks like the biggest problem is that we need to
wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.

I'm surviving much longer with a patch in place that synchronizes
btrfs_record_root_in_trans better.
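
The patch mentioned here is not shown in the thread, so the following is
only an assumption-laden userspace sketch of the kind of waiting described
above: a caller that records a root blocks while a merge is in progress.
It uses a POSIX mutex and condition variable; reloc_sync, merge_begin,
merge_end, record_root_in_trans (as written here) and the merging flag are
invented for illustration and are not the btrfs implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state; the names below are illustrative, not btrfs fields. */
struct reloc_sync {
	pthread_mutex_t lock;
	pthread_cond_t merge_done;
	bool merging;        /* true while a merge is in progress */
	int roots_recorded;  /* number of roots recorded so far */
};

static void reloc_sync_init(struct reloc_sync *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_cond_init(&s->merge_done, NULL);
	s->merging = false;
	s->roots_recorded = 0;
}

/* Mark the start of a merge; recorders arriving after this will wait. */
static void merge_begin(struct reloc_sync *s)
{
	pthread_mutex_lock(&s->lock);
	assert(!s->merging);
	s->merging = true;
	pthread_mutex_unlock(&s->lock);
}

/* Mark the end of a merge and wake every waiting recorder. */
static void merge_end(struct reloc_sync *s)
{
	pthread_mutex_lock(&s->lock);
	s->merging = false;
	pthread_cond_broadcast(&s->merge_done);
	pthread_mutex_unlock(&s->lock);
}

/*
 * Record a root, first waiting for any in-progress merge to finish.
 * Returns the number of roots recorded so far.
 */
static int record_root_in_trans(struct reloc_sync *s)
{
	int n;

	pthread_mutex_lock(&s->lock);
	while (s->merging)
		pthread_cond_wait(&s->merge_done, &s->lock);
	n = ++s->roots_recorded;
	pthread_mutex_unlock(&s->lock);
	return n;
}
```

With this shape, a thread calling record_root_in_trans() between
merge_begin() and merge_end() sleeps until the merge finishes, instead of
racing with it.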

Zheng, if you have other comments on the locking, please let me know.

-chris


Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Yan, Zheng
On Tue, Jun 14, 2011 at 3:55 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in the relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

 So, I've got a mutex around the reloc_root here and that was almost but
 not quite enough.  It looks like the biggest problem is that we need to
 wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.

 I'm surviving much longer with a patch in place that synchronizes
 btrfs_record_root_in_trans better.

 Zheng if you have other comments on the locking please let me know.


The following untested patch may help.
---
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 378b5b4..0b20dda 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -951,6 +951,7 @@ struct btrfs_fs_info {
 	struct mutex cleaner_mutex;
 	struct mutex chunk_mutex;
 	struct mutex volume_mutex;
+	struct mutex reloc_mutex;
 	/*
 	 * this protects the ordered operations list only while we are
 	 * processing all of the entries on it.  This way we make
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 9f68c68..28f8b11 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1714,6 +1714,7 @@ struct btrfs_root *open_ctree(struct super_block *sb,
 	mutex_init(&fs_info->transaction_kthread_mutex);
 	mutex_init(&fs_info->cleaner_mutex);
 	mutex_init(&fs_info->volume_mutex);
+	mutex_init(&fs_info->reloc_mutex);
 	init_rwsem(&fs_info->extent_commit_sem);
 	init_rwsem(&fs_info->cleanup_work_sem);
 	init_rwsem(&fs_info->subvol_sem);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index b1ef27c..620e4af 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -1330,18 +1330,20 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,
 			  struct btrfs_root *root)
 {
 	struct btrfs_root *reloc_root;
-	struct reloc_control *rc = root->fs_info->reloc_ctl;
+	struct reloc_control *rc;
 	int clear_rsv = 0;

+	mutex_lock(&root->fs_info->reloc_mutex);
 	if (root->reloc_root) {
 		reloc_root = root->reloc_root;
 		reloc_root->last_trans = trans->transid;
-		return 0;
+		goto unlock;
 	}

+	rc = root->fs_info->reloc_ctl;
 	if (!rc || !rc->create_reloc_tree ||
 	    root->root_key.objectid == BTRFS_TREE_RELOC_OBJECTID)
-		return 0;
+		goto unlock;

 	if (!trans->block_rsv) {
 		trans->block_rsv = rc->block_rsv;
@@ -1353,6 +1355,8 @@ int btrfs_init_reloc_root(struct btrfs_trans_handle *trans,

 	__add_reloc_root(reloc_root);
 	root->reloc_root = reloc_root;
+unlock:
+	mutex_unlock(&root->fs_info->reloc_mutex);
 	return 0;
 }

@@ -1367,8 +1371,9 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	int del = 0;
 	int ret;

+	mutex_lock(&root->fs_info->reloc_mutex);
 	if (!root->reloc_root)
-		return 0;
+		goto unlock;

 	reloc_root = root->reloc_root;
 	root_item = &reloc_root->root_item;
@@ -1390,6 +1395,8 @@ int btrfs_update_reloc_root(struct btrfs_trans_handle *trans,
 	ret = btrfs_update_root(trans, root->fs_info->tree_root,
 				&reloc_root->root_key, root_item);
 	BUG_ON(ret);
+unlock:
+	mutex_unlock(&root->fs_info->reloc_mutex);
 	return 0;
 }

@@ -2142,10 +2149,10 @@ int prepare_to_merge(struct reloc_control *rc, int err)
 	u64 num_bytes = 0;
 	int ret;

-	spin_lock(&root->fs_info->trans_lock);
+	mutex_lock(&root->fs_info->reloc_mutex);
 	rc->merging_rsv_size += root->nodesize * (BTRFS_MAX_LEVEL - 1) * 2;
 	rc->merging_rsv_size += rc->nodes_relocated * 2;
-	spin_unlock(&root->fs_info->trans_lock);
+	mutex_unlock(&root->fs_info->reloc_mutex);
 again:
 	if (!err) {
 		num_bytes = rc->merging_rsv_size;
@@ -2214,9 +2221,9 @@ int merge_reloc_roots(struct reloc_control *rc)
 	int ret;
 again:
 	root = rc->extent_root;
-	spin_lock(&root->fs_info->trans_lock);
+	mutex_lock(&root->fs_info->reloc_mutex);
 	list_splice_init(&rc->reloc_roots, &reloc_roots);
-	spin_unlock(&root->fs_info->trans_lock);
+	mutex_unlock(&root->fs_info->reloc_mutex);

 	while (!list_empty(&reloc_roots)) {
 		found = 1;
@@ -3590,17 +3597,17 @@ next:
 static void set_reloc_control(struct reloc_control *rc)
 {
 	struct btrfs_fs_info *fs_info = rc->extent_root->fs_info;
-	spin_lock(&fs_info->trans_lock);
+	mutex_lock(&fs_info->reloc_mutex);
 	fs_info->reloc_ctl = rc;
-	spin_unlock(&fs_info->trans_lock);
+	mutex_unlock(&fs_info->reloc_mutex);
 }

 static void unset_reloc_control(struct reloc_control *rc)
 {
 

Re: bug caused by removal of trans_mutex? (Was: Re: kernel BUG at fs/btrfs/extent-tree.c:6164!)

2011-06-13 Thread Li Zefan
Yan, Zheng wrote:
 On Tue, Jun 14, 2011 at 3:55 AM, Chris Mason chris.ma...@oracle.com wrote:
 Excerpts from Yan, Zheng's message of 2011-06-13 10:58:35 -0400:
 The usage of trans_mutex in the relocation code is subtle. It controls the
 interaction of relocation with transaction start, transaction commit and
 snapshot creation. Simply replacing trans_mutex with trans_lock is wrong.

 So, I've got a mutex around the reloc_root here and that was almost but
 not quite enough.  It looks like the biggest problem is that we need to
 wait in btrfs_record_root_in_trans for anyone inside merge_reloc_roots.

 I'm surviving much longer with a patch in place that synchronizes
 btrfs_record_root_in_trans better.

 Zheng if you have other comments on the locking please let me know.

 
 following untested patch may help.

I've tested this patch, and the bug was triggered in minutes as usual.

Also I've tested a patch in an offline email from Chris, which survived
the test.


Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread Tsutomu Itoh
Hi liubo,

(2011/06/07 14:31), liubo wrote:
 On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
 Hi,

 I encountered following panic using 'btrfs-unstable + for-linus'
 kernel.

 I ran btrfs fi bal /test5 command, and mount option of /test5
 is as follows:

  /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)

 
 So, just a btrfs fi bal would lead to the bug?

I think so.

 
 I've figured out the warnings, but not reproduced the bug yet...
 I used 'btrfs-unstable + for-linus', whose top commit is
 
 commit aa0467d8d2a00e75b2bb6a56a4ee6d70c5d1928f
 Author: David Sterba dste...@suse.cz
 Date:   Fri Jun 3 16:29:08 2011 +0200
 
 btrfs: fix uninitialized variable warning

It's the same as my environment.

 
 and tried on 1) a single disk, 2) 2 disks and 3) 4 disks respectively,
 but none of them led to the bug below.

The test script and the volume layout that I am using are the same as in
the following mail.

  http://marc.info/?l=linux-btrfs&m=130680171426371&w=2

In my environment, the panic occurs within about 30 minutes of starting
the test script.

Thanks,
Tsutomu

 
 I guess maybe I miss something to reproduce it?
 
 thanks,
 liubo
 
 Thanks,
 Tsutomu

 =

 btrfs: relocating block group 23383244800 flags 20
 btrfs: found 2959 extents
 [ cut here ]
 WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 [btrfs]()
 Hardware name: PRIMERGY
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3
 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc
 parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp
 pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif
 sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod
 floppy [last unloaded: microcode]
 Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
 Call Trace:
  [8106004f] warn_slowpath_common+0x7f/0xc0
  [810600aa] warn_slowpath_null+0x1a/0x20
  [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
  [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
  [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
  [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
  [811687f8] evict+0x78/0x170
  [81168c92] iput+0xe2/0x1a0
  [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
  [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 ---[ end trace e5c5cb2e98a3cd1a ]---
 btrfs: relocating block group 20971520 flags 18
 btrfs: relocating block group 34925969408 flags 18
 btrfs: found 1 extents
 [ cut here ]
 kernel BUG at fs/btrfs/extent-tree.c:6164!
 invalid opcode:  [#1] SMP
 last sysfs file: /sys/kernel/mm/ksm/run
 CPU 0
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 
 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc 
 parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp 
 pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif 
 sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod 
 floppy [last unloaded: microcode]

 Pid: 4109, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4 FUJITSU-SV 
  PRIMERGY/D2399
 RIP: 0010:[a0325b95]  [a0325b95] 
 walk_up_proc+0x375/0x420 [btrfs]
 RSP: 0018:8801801eb9c8  EFLAGS: 00010286
 RAX: 0005 RBX: 880167a70140 RCX: fff8
 RDX: 8801801ea000 RSI: 8800 RDI: 880194909fa8
 RBP: 8801801eba18 R08:  R09: 0005
 R10: 0001 R11: 880194909fa8 R12: 
 R13: 88013973d000 R14: 88015ad4d9a0 R15: 880042203920
 FS:  7fa86bcb9740() GS:88019fc0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 0033cf60b0c0 CR3: 000181cf7000 CR4: 06f0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process btrfs (pid: 4109, threadinfo 8801801ea000, task 88011a4914a0)
 Stack:
  8801801eba18 880194909fa8 8801

Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread Tsutomu Itoh
: 48 8b 75 20 48 89 c3 48 8b 7d 18 e8 c9 bd ff ff 48 39 d8 77 26 b8 1d 00 
00 00 e9 15 ff ff ff a8 01 0f 85 8c fe ff ff 0f 0b eb fe 0f 0b eb fe 0f 0b 0f 
1f 84 00 00 00 00 00 eb f6 4c 89 fb 44 8b
RIP  [a0325422] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
 RSP 880156d2d7b8

 
 Thanks,
 Tsutomu
 

 I guess maybe I miss something to reproduce it?

 thanks,
 liubo

 Thanks,
 Tsutomu

 =

 btrfs: relocating block group 23383244800 flags 20
 btrfs: found 2959 extents
 [ cut here ]
 WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 [btrfs]()
 Hardware name: PRIMERGY
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3
 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc
 parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp
 pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif
 sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod
 floppy [last unloaded: microcode]
 Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
 Call Trace:
  [8106004f] warn_slowpath_common+0x7f/0xc0
  [810600aa] warn_slowpath_null+0x1a/0x20
  [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
  [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
  [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
  [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
  [811687f8] evict+0x78/0x170
  [81168c92] iput+0xe2/0x1a0
  [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
  [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 ---[ end trace e5c5cb2e98a3cd1a ]---
 btrfs: relocating block group 20971520 flags 18
 btrfs: relocating block group 34925969408 flags 18
 btrfs: found 1 extents
 [ cut here ]
 kernel BUG at fs/btrfs/extent-tree.c:6164!
 invalid opcode:  [#1] SMP
 last sysfs file: /sys/kernel/mm/ksm/run
 CPU 0
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 
 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc 
 parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp 
 pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif 
 sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod 
 floppy [last unloaded: microcode]

 Pid: 4109, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4 
 FUJITSU-SV  PRIMERGY/D2399
 RIP: 0010:[a0325b95]  [a0325b95] 
 walk_up_proc+0x375/0x420 [btrfs]
 RSP: 0018:8801801eb9c8  EFLAGS: 00010286
 RAX: 0005 RBX: 880167a70140 RCX: fff8
 RDX: 8801801ea000 RSI: 8800 RDI: 880194909fa8
 RBP: 8801801eba18 R08:  R09: 0005
 R10: 0001 R11: 880194909fa8 R12: 
 R13: 88013973d000 R14: 88015ad4d9a0 R15: 880042203920
 FS:  7fa86bcb9740() GS:88019fc0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 0033cf60b0c0 CR3: 000181cf7000 CR4: 06f0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process btrfs (pid: 4109, threadinfo 8801801ea000, task 
 88011a4914a0)
 Stack:
  8801801eba18 880194909fa8 8801 a03280e8
  8801801eba58 88015ad4d9a0  
  8801801ea000 880167a70140 8801801eba78 a0325d71
 Call Trace:
  [a03280e8] ? btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
  [a0325d71] walk_up_tree+0x131/0x1b0 [btrfs]
  [a03260b0] btrfs_drop_snapshot+0x2c0/0x5c0 [btrfs]
  [a03328b0] ? btrfs_read_fs_root_no_name+0x1b0/0x280 [btrfs]
  [a037b45f] merge_reloc_roots+0xdf/0x150 [btrfs]
  [a037f311] relocate_block_group+0x481/0x660 [btrfs]
  [a0334d15] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
  [a037f6a3] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
  [a0368d80] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
  [a035e22b] btrfs_relocate_chunk+0x8b/0x670 [btrfs

Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread Tsutomu Itoh
:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process btrfs (pid: 3849, threadinfo 880156d2c000, task 880192a04ab0)
 Stack:
  880193893000 1000 880156d2d808 880156d2d920
  880156d2d808 000181f61000  880193893000
  0001 880156d2da18 000181f61000 009000a8
 Call Trace:
  [a03271d6] __btrfs_free_extent+0xd6/0x730 [btrfs]
  [a0356960] ? map_extent_buffer+0xb0/0xc0 [btrfs]
  [a0326d19] ? update_block_group+0xd9/0x2a0 [btrfs]
  [a0327ed8] run_clustered_refs+0x6a8/0x7f0 [btrfs]
  [a03280e8] btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
  [a033638d] btrfs_commit_transaction+0x7d/0x790 [btrfs]
  [a0335ac5] ? join_transaction+0x25/0x250 [btrfs]
  [81081de0] ? wake_up_bit+0x40/0x40
  [a037f377] relocate_block_group+0x4e7/0x660 [btrfs]
  [a0334d15] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
  [a037f6a3] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
  [a0368d80] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
  [a035e22b] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
  [a031303d] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 Code: 48 8b 75 20 48 89 c3 48 8b 7d 18 e8 c9 bd ff ff 48 39 d8 77 26 b8 1d 00 
 00 00 e9 15 ff ff ff a8 01 0f 85 8c fe ff ff 0f 0b eb fe 0f 0b eb fe 0f 0b 
 0f 1f 84 00 00 00 00 00 eb f6 4c 89 fb 44 8b
 RIP  [a0325422] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
  RSP 880156d2d7b8
 

 Thanks,
 Tsutomu


 I guess maybe I miss something to reproduce it?

 thanks,
 liubo

 Thanks,
 Tsutomu

 =

 btrfs: relocating block group 23383244800 flags 20
 btrfs: found 2959 extents
 [ cut here ]
 WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 
 [btrfs]()
 Hardware name: PRIMERGY
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c
 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
 parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt
 iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4
 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi
 ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
 Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
 Call Trace:
  [8106004f] warn_slowpath_common+0x7f/0xc0
  [810600aa] warn_slowpath_null+0x1a/0x20
  [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
  [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
  [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
  [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
  [811687f8] evict+0x78/0x170
  [81168c92] iput+0xe2/0x1a0
  [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
  [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 ---[ end trace e5c5cb2e98a3cd1a ]---
 btrfs: relocating block group 20971520 flags 18
 btrfs: relocating block group 34925969408 flags 18
 btrfs: found 1 extents
 [ cut here ]
 kernel BUG at fs/btrfs/extent-tree.c:6164!
 invalid opcode:  [#1] SMP
 last sysfs file: /sys/kernel/mm/ksm/run
 CPU 0
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c 
 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev 
 parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt 
 iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 
 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi 
 ata_generic ata_piix libata scsi_mod floppy

Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread liubo
:  R11: 0015 R12: 00b2
 R13: 88019474b4f8 R14: 0001 R15: 
 FS:  7faaf3753740() GS:88019fc0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 017bfea0 CR3: 000156e53000 CR4: 06f0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process btrfs (pid: 3849, threadinfo 880156d2c000, task 880192a04ab0)
 Stack:
  880193893000 1000 880156d2d808 880156d2d920
  880156d2d808 000181f61000  880193893000
  0001 880156d2da18 000181f61000 009000a8
 Call Trace:
  [a03271d6] __btrfs_free_extent+0xd6/0x730 [btrfs]
  [a0356960] ? map_extent_buffer+0xb0/0xc0 [btrfs]
  [a0326d19] ? update_block_group+0xd9/0x2a0 [btrfs]
  [a0327ed8] run_clustered_refs+0x6a8/0x7f0 [btrfs]
  [a03280e8] btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
  [a033638d] btrfs_commit_transaction+0x7d/0x790 [btrfs]
  [a0335ac5] ? join_transaction+0x25/0x250 [btrfs]
  [81081de0] ? wake_up_bit+0x40/0x40
  [a037f377] relocate_block_group+0x4e7/0x660 [btrfs]
  [a0334d15] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
  [a037f6a3] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
  [a0368d80] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
  [a035e22b] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
  [a031303d] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 Code: 48 8b 75 20 48 89 c3 48 8b 7d 18 e8 c9 bd ff ff 48 39 d8 77 26 b8 1d 
 00 00 00 e9 15 ff ff ff a8 01 0f 85 8c fe ff ff 0f 0b eb fe 0f 0b eb fe 0f 
 0b 0f 1f 84 00 00 00 00 00 eb f6 4c 89 fb 44 8b
 RIP  [a0325422] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
  RSP 880156d2d7b8

 Thanks,
 Tsutomu

 I guess maybe I miss something to reproduce it?

 thanks,
 liubo

 Thanks,
 Tsutomu

 =

 btrfs: relocating block group 23383244800 flags 20
 btrfs: found 2959 extents
 [ cut here ]
 WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 
 [btrfs]()
 Hardware name: PRIMERGY
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c
 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
 parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt
 iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4
 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi
 ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
 Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
 Call Trace:
  [8106004f] warn_slowpath_common+0x7f/0xc0
  [810600aa] warn_slowpath_null+0x1a/0x20
  [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
  [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
  [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
  [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
  [811687f8] evict+0x78/0x170
  [81168c92] iput+0xe2/0x1a0
  [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
  [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 ---[ end trace e5c5cb2e98a3cd1a ]---
 btrfs: relocating block group 20971520 flags 18
 btrfs: relocating block group 34925969408 flags 18
 btrfs: found 1 extents
 [ cut here ]
 kernel BUG at fs/btrfs/extent-tree.c:6164!
 invalid opcode:  [#1] SMP
 last sysfs file: /sys/kernel/mm/ksm/run
 CPU 0
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c

Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread Chris Mason
Excerpts from liubo's message of 2011-06-07 04:36:56 -0400:
 On 06/07/2011 04:24 PM, Tsutomu Itoh wrote:
  (2011/06/07 15:17), Tsutomu Itoh wrote:
  (2011/06/07 14:59), Tsutomu Itoh wrote:
  Hi liubo,
 
  (2011/06/07 14:31), liubo wrote:
  On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
  Hi,
 
  I encountered following panic using 'btrfs-unstable + for-linus'
  kernel.
 
  I ran btrfs fi bal /test5 command, and mount option of /test5
  is as follows:
 
   /dev/sdc3 on /test5 type btrfs 
  (rw,space_cache,compress=lzo,inode_cache)
 
  So, just a btrfs fi bal would lead to the bug?
  I think so.

It should be specific to the inode caching code.  The balancing code is
finding the inode map cache extents, but it doesn't know how to relocate
them.

I think we need to switch the inode map cache over to regular extents
that are not preallocated.  It will fix the overflow problem and it will
fix the balancing.

There are a lot of special cases for the free extent cache that don't
apply to the inode map cache, and I think sharing the extent
preallocation is hurting us.

-chris


Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-07 Thread Tsutomu Itoh
(2011/06/08 0:46), Chris Mason wrote:
 Excerpts from liubo's message of 2011-06-07 04:36:56 -0400:
 On 06/07/2011 04:24 PM, Tsutomu Itoh wrote:
 (2011/06/07 15:17), Tsutomu Itoh wrote:
 (2011/06/07 14:59), Tsutomu Itoh wrote:
 Hi liubo,

 (2011/06/07 14:31), liubo wrote:
 On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
 Hi,

 I encountered following panic using 'btrfs-unstable + for-linus'
 kernel.

 I ran btrfs fi bal /test5 command, and mount option of /test5
 is as follows:

  /dev/sdc3 on /test5 type btrfs 
 (rw,space_cache,compress=lzo,inode_cache)

 So, just a btrfs fi bal would lead to the bug?
 I think so.
 
 It should be specific to the inode caching code.  The balancing code is
 finding the inode map cache extents, but it doesn't know how to relocate
 them.

However, the panic also occurs when inode_cache is turned off.
Is this a separate problem?

---
Tsutomu



device fsid a46d03b5cb35c93-4713fead8acc709e devid 1 transid 7 /dev/sdc3
btrfs: enabling disk space caching
btrfs: use lzo compression
device fsid 914b303425ef9825-e448135c0d20babe devid 1 transid 7 /dev/sdd4
btrfs: disk space caching is enabled
btrfs: relocating block group 1103101952 flags 9
btrfs: found 540 extents
btrfs: found 540 extents
[ cut here ]
kernel BUG at fs/btrfs/extent-tree.c:1424!
invalid opcode:  [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 jbd 
dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg 
pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug 
i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom 
megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last 
unloaded: microcode]

Pid: 26884, comm: btrfs Not tainted 2.6.39btrfs-test+ #4 FUJITSU-SV  
PRIMERGY/D2399
RIP: 0010:[a0325422]  [a0325422] 
lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
RSP: 0018:8801475db748  EFLAGS: 00010202
RAX: 0001 RBX: 880141d1a6d0 RCX: 8801475da000
RDX: 0008 RSI: 8800 RDI: 
RBP: 8801475db7e8 R08: 0001 R09: 6db6db6db6db6db7
R10: 0001 R11: 0014 R12: 00b8
R13: 880142bc8a08 R14: 0001 R15: 000d
FS:  7fbbaa8b0740() GS:88019fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0033cfeda340 CR3: 000145c04000 CR4: 06f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process btrfs (pid: 26884, threadinfo 8801475da000, task 880160806ab0)
Stack:
 8801475db778 a0331ca6 88018c1087c8 8801475db830
 0821 000181f43000 8801475db7e8 88012cc27800
 082e 0794475db9a9 000181f43000 004000a8
Call Trace:
 [a0331ca6] ? btrfs_mark_buffer_dirty+0xb6/0x130 [btrfs]
 [a03255a9] insert_inline_extent_backref+0x69/0x100 [btrfs]
 [81140376] ? kmem_cache_alloc+0x186/0x190
 [a03256e3] __btrfs_inc_extent_ref+0xa3/0x1e0 [btrfs]
 [a0326d19] ? update_block_group+0xd9/0x2a0 [btrfs]
 [a0327e94] run_clustered_refs+0x664/0x7f0 [btrfs]
 [a03280e8] btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
 [a033638d] btrfs_commit_transaction+0x7d/0x790 [btrfs]
 [81081de0] ? wake_up_bit+0x40/0x40
 [a037922d] prepare_to_merge+0x1fd/0x230 [btrfs]
 [a037f306] relocate_block_group+0x476/0x660 [btrfs]
 [a0334d15] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
 [a037f6a3] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
 [a0368d80] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
 [a035e22b] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
 [a031303d] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
 [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
 [8115dc41] ? path_openat+0x101/0x3d0
 [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
 [8111e358] ? handle_mm_fault+0x148/0x270
 [814809e8] ? do_page_fault+0x1d8/0x4b0
 [81160d6a] do_vfs_ioctl+0x9a/0x540
 [811612b1] sys_ioctl+0xa1/0xb0
 [81484ec2] system_call_fastpath+0x16/0x1b
Code: 48 8b 75 20 48 89 c3 48 8b 7d 18 e8 c9 bd ff ff 48 39 d8 77 26 b8 1d 00 
00 00 e9 15 ff ff ff a8 01 0f 85 8c fe ff ff 0f 0b eb fe 0f 0b eb fe 0f 0b 0f 
1f 84 00 00 00 00 00 eb f6 4c 89 fb 44 8b
RIP  [a0325422] lookup_inline_extent_backref+0x2d2/0x3f0 [btrfs]
 RSP 8801475db748


 
 I think we need to 

kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-06 Thread Tsutomu Itoh
Hi,

I encountered the following panic using the 'btrfs-unstable + for-linus'
kernel.

I ran the btrfs fi bal /test5 command, and the mount options of /test5
are as follows:

 /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)

Thanks,
Tsutomu

=

btrfs: relocating block group 23383244800 flags 20
btrfs: found 2959 extents
[ cut here ]
WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 [btrfs]()
Hardware name: PRIMERGY
Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 jbd
dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg
pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug
i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom
megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last
unloaded: microcode]
Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
Call Trace:
 [8106004f] warn_slowpath_common+0x7f/0xc0
 [810600aa] warn_slowpath_null+0x1a/0x20
 [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
 [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
 [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
 [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
 [811687f8] evict+0x78/0x170
 [81168c92] iput+0xe2/0x1a0
 [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
 [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
 [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
 [8115dc41] ? path_openat+0x101/0x3d0
 [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
 [8111e358] ? handle_mm_fault+0x148/0x270
 [814809e8] ? do_page_fault+0x1d8/0x4b0
 [81160d6a] do_vfs_ioctl+0x9a/0x540
 [811612b1] sys_ioctl+0xa1/0xb0
 [81484ec2] system_call_fastpath+0x16/0x1b
---[ end trace e5c5cb2e98a3cd1a ]---
btrfs: relocating block group 20971520 flags 18
btrfs: relocating block group 34925969408 flags 18
btrfs: found 1 extents
[ cut here ]
kernel BUG at fs/btrfs/extent-tree.c:6164!
invalid opcode:  [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 jbd 
dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg 
pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug 
i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom 
megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod floppy [last 
unloaded: microcode]

Pid: 4109, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4 FUJITSU-SV
  PRIMERGY/D2399
RIP: 0010:[a0325b95]  [a0325b95] walk_up_proc+0x375/0x420 
[btrfs]
RSP: 0018:8801801eb9c8  EFLAGS: 00010286
RAX: 0005 RBX: 880167a70140 RCX: fff8
RDX: 8801801ea000 RSI: 8800 RDI: 880194909fa8
RBP: 8801801eba18 R08:  R09: 0005
R10: 0001 R11: 880194909fa8 R12: 
R13: 88013973d000 R14: 88015ad4d9a0 R15: 880042203920
FS:  7fa86bcb9740() GS:88019fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0033cf60b0c0 CR3: 000181cf7000 CR4: 06f0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process btrfs (pid: 4109, threadinfo 8801801ea000, task 88011a4914a0)
Stack:
 8801801eba18 880194909fa8 8801 a03280e8
 8801801eba58 88015ad4d9a0  
 8801801ea000 880167a70140 8801801eba78 a0325d71
Call Trace:
 [a03280e8] ? btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
 [a0325d71] walk_up_tree+0x131/0x1b0 [btrfs]
 [a03260b0] btrfs_drop_snapshot+0x2c0/0x5c0 [btrfs]
 [a03328b0] ? btrfs_read_fs_root_no_name+0x1b0/0x280 [btrfs]
 [a037b45f] merge_reloc_roots+0xdf/0x150 [btrfs]
 [a037f311] relocate_block_group+0x481/0x660 [btrfs]
 [a0334d15] ? btrfs_clean_old_snapshots+0x35/0x150 [btrfs]
 [a037f6a3] btrfs_relocate_block_group+0x1b3/0x2e0 [btrfs]
 [a0368d80] ? btrfs_tree_unlock+0x50/0x50 [btrfs]
 [a035e22b] btrfs_relocate_chunk+0x8b/0x670 [btrfs]
 [a031303d] ? btrfs_set_path_blocking+0x3d/0x50 [btrfs]
 [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
 [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs

Re: kernel BUG at fs/btrfs/extent-tree.c:6164!

2011-06-06 Thread liubo
On 06/06/2011 04:33 PM, Tsutomu Itoh wrote:
 Hi,
 
 I encountered following panic using 'btrfs-unstable + for-linus'
 kernel.
 
 I ran btrfs fi bal /test5 command, and mount option of /test5
 is as follows:
 
  /dev/sdc3 on /test5 type btrfs (rw,space_cache,compress=lzo,inode_cache)
 

So, just a btrfs fi bal would lead to the bug?

I've figured out the warnings, but have not reproduced the bug yet.
I used 'btrfs-unstable + for-linus', whose top commit is

commit aa0467d8d2a00e75b2bb6a56a4ee6d70c5d1928f
Author: David Sterba dste...@suse.cz
Date:   Fri Jun 3 16:29:08 2011 +0200

btrfs: fix uninitialized variable warning

and tried it on 1) a single disk, 2) 2 disks and 3) 4 disks respectively,
but none of them led to the bug below.

Maybe I am missing something needed to reproduce it?

thanks,
liubo

 Thanks,
 Tsutomu
 
 =
 
 btrfs: relocating block group 23383244800 flags 20
 btrfs: found 2959 extents
 [ cut here ]
 WARNING: at fs/btrfs/transaction.c:213 start_transaction+0x2a7/0x2b0 [btrfs]()
 Hardware name: PRIMERGY
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c
 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev
 parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt
 iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4
 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas pata_acpi
 ata_generic ata_piix libata scsi_mod floppy [last unloaded: microcode]
 Pid: 23781, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4
 Call Trace:
  [8106004f] warn_slowpath_common+0x7f/0xc0
  [810600aa] warn_slowpath_null+0x1a/0x20
  [a0337047] start_transaction+0x2a7/0x2b0 [btrfs]
  [a035498d] ? btrfs_wait_ordered_range+0x10d/0x160 [btrfs]
  [a0337323] btrfs_start_transaction+0x13/0x20 [btrfs]
  [a033bbca] btrfs_evict_inode+0x11a/0x260 [btrfs]
  [811687f8] evict+0x78/0x170
  [81168c92] iput+0xe2/0x1a0
  [a031f171] btrfs_remove_block_group+0x141/0x3c0 [btrfs]
  [a035e6ea] btrfs_relocate_chunk+0x54a/0x670 [btrfs]
  [a0357668] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
  [a031be51] ? btrfs_previous_item+0xb1/0x150 [btrfs]
  [a035f43a] btrfs_balance+0x21a/0x2b0 [btrfs]
  [8115dc41] ? path_openat+0x101/0x3d0
  [a03685bc] btrfs_ioctl+0x51c/0xc40 [btrfs]
  [8111e358] ? handle_mm_fault+0x148/0x270
  [814809e8] ? do_page_fault+0x1d8/0x4b0
  [81160d6a] do_vfs_ioctl+0x9a/0x540
  [811612b1] sys_ioctl+0xa1/0xb0
  [81484ec2] system_call_fastpath+0x16/0x1b
 ---[ end trace e5c5cb2e98a3cd1a ]---
 btrfs: relocating block group 20971520 flags 18
 btrfs: relocating block group 34925969408 flags 18
 btrfs: found 1 extents
 [ cut here ]
 kernel BUG at fs/btrfs/extent-tree.c:6164!
 invalid opcode:  [#1] SMP
 last sysfs file: /sys/kernel/mm/ksm/run
 CPU 0
 Modules linked in: autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand 
 acpi_cpufreq freq_table mperf ipv6 btrfs zlib_deflate crc32c libcrc32c ext3 
 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc 
 parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp 
 pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif 
 sr_mod cdrom megaraid_sas pata_acpi ata_generic ata_piix libata scsi_mod 
 floppy [last unloaded: microcode]
 
 Pid: 4109, comm: btrfs Tainted: GW   2.6.39btrfs-test+ #4 FUJITSU-SV  
 PRIMERGY/D2399
 RIP: 0010:[a0325b95]  [a0325b95] walk_up_proc+0x375/0x420 
 [btrfs]
 RSP: 0018:8801801eb9c8  EFLAGS: 00010286
 RAX: 0005 RBX: 880167a70140 RCX: fff8
 RDX: 8801801ea000 RSI: 8800 RDI: 880194909fa8
 RBP: 8801801eba18 R08:  R09: 0005
 R10: 0001 R11: 880194909fa8 R12: 
 R13: 88013973d000 R14: 88015ad4d9a0 R15: 880042203920
 FS:  7fa86bcb9740() GS:88019fc0() knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 0033cf60b0c0 CR3: 000181cf7000 CR4: 06f0
 DR0:  DR1:  DR2: 
 DR3:  DR6: 0ff0 DR7: 0400
 Process btrfs (pid: 4109, threadinfo 8801801ea000, task 88011a4914a0)
 Stack:
  8801801eba18 880194909fa8 8801 a03280e8
  8801801eba58 88015ad4d9a0  
  8801801ea000 880167a70140 8801801eba78 a0325d71
 Call Trace:
  [a03280e8] ? btrfs_run_delayed_refs+0xc8/0x210 [btrfs]
  [a0325d71] walk_up_tree+0x131/0x1b0 [btrfs]
  [a03260b0] btrfs_drop_snapshot+0x2c0/0x5c0 [btrfs]
  [a03328b0