Re: [RFC PATCH 0/3] apply the Probabilistic Skiplist on btrfs

2012-01-06 Thread Liu Bo
On 01/06/2012 07:51 AM, David Sterba wrote:
> Hi, I've let it run through xfstests and ended at 091, patches applied
> on top of 3.2, mount options
> compress-force=lzo,discard,inode_cache,space_cache,autodefrag
> fresh mkfs with defaults.
 


Hi David,

Thanks a lot for your work!

I also found this and have fixed it.

I will send a V2 patchset after it passes xfstests.

thanks,
liubo



[RFC PATCH 0/3] apply the Probabilistic Skiplist on btrfs

2012-01-05 Thread Liu Bo
Since we are inclined to apply a lockless scheme to some btrfs objects for
higher performance, we want to build an RCU version of the Probabilistic Skiplist.

Our skiplist algorithm is based on the skiplist experiments of
Con Kolivas ker...@kolivas.org for the BFS CPU scheduler.
More details about the skiplist design are in patch 1.
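
For readers who have not met the data structure, here is a minimal userspace
sketch of a probabilistic skiplist. It is not the code from patch 1: every
name in it (sl_node, sl_list, sl_insert, sl_search, SL_MAX_LEVEL) is
hypothetical, the promotion probability of 1/2 and the 16-level cap are
assumptions, and it has no locking or RCU at all.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SL_MAX_LEVEL 16   /* assumed cap, not taken from the patch */

struct sl_node {
	uint64_t key;
	int level;                          /* forward pointers in use */
	struct sl_node *next[SL_MAX_LEVEL];
};

struct sl_list {
	struct sl_node head;                /* sentinel; its key is never compared */
	int level;                          /* highest level currently in use */
};

static void sl_init(struct sl_list *l)
{
	l->level = 1;
	l->head.key = 0;
	l->head.level = SL_MAX_LEVEL;
	for (int i = 0; i < SL_MAX_LEVEL; i++)
		l->head.next[i] = NULL;
}

/* Promote a node one level with probability 1/2, up to SL_MAX_LEVEL. */
static int sl_random_level(void)
{
	int level = 1;

	while (level < SL_MAX_LEVEL && (rand() & 1))
		level++;
	return level;
}

/* Walk from the top level down, moving right while the next key is smaller. */
static struct sl_node *sl_search(struct sl_list *l, uint64_t key)
{
	struct sl_node *n = &l->head;

	for (int i = l->level - 1; i >= 0; i--)
		while (n->next[i] && n->next[i]->key < key)
			n = n->next[i];
	n = n->next[0];
	return (n && n->key == key) ? n : NULL;
}

/* Returns 0 on success, -1 on a duplicate key or allocation failure. */
static int sl_insert(struct sl_list *l, uint64_t key)
{
	struct sl_node *update[SL_MAX_LEVEL];
	struct sl_node *n = &l->head;
	struct sl_node *node;
	int level;

	/* Remember the rightmost node before the key on every level in use. */
	for (int i = l->level - 1; i >= 0; i--) {
		while (n->next[i] && n->next[i]->key < key)
			n = n->next[i];
		update[i] = n;
	}
	if (n->next[0] && n->next[0]->key == key)
		return -1;

	node = malloc(sizeof(*node));
	if (!node)
		return -1;

	level = sl_random_level();
	assert(level >= 1 && level <= SL_MAX_LEVEL);

	/* Levels not used so far hang directly off the sentinel. */
	for (int i = l->level; i < level; i++)
		update[i] = &l->head;
	if (level > l->level)
		l->level = level;

	node->key = key;
	node->level = level;
	for (int i = 0; i < level; i++) {
		node->next[i] = update[i]->next[i];
		update[i]->next[i] = node;
	}
	return 0;
}
```

A caller would run sl_init() once, then sl_insert() keys such as extent start
offsets and find them again with sl_search(). Unlike an rbtree, an insert only
rewrites a few forward pointers and never rebalances, which is one reason a
skiplist is often considered easier to pair with lockless readers.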

Right now we plan to apply the skiplist to extent_map and extent_state.

We choose extent_map first, since it is read-mostly and the change is
quite direct. All we need to do is
a) replace the rbtree with a skiplist,
b) add RCU support.
More details are in patch 2 and patch 3.
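
To illustrate why a read-mostly structure suits step b), here is a rough
userspace sketch of the publish-then-read ordering that RCU readers depend on.
It is not RCU and not the code from patch 3. It uses C11 atomics where the
kernel would use rcu_assign_pointer() and rcu_dereference(), it has no removal
or grace-period handling, and it uses a plain sorted list rather than a
skiplist to stay short. The names em_node, em_list, em_lookup and em_insert
are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for an extent mapping covering [start, start + len). */
struct em_node {
	uint64_t start;
	uint64_t len;
	_Atomic(struct em_node *) next;
};

struct em_list {
	_Atomic(struct em_node *) first;    /* sorted by start, no overlaps */
};

/*
 * Reader side: takes no lock. Each acquire load pairs with the writer's
 * release store, so a reader that can see a node also sees its fields
 * fully initialized.
 */
static struct em_node *em_lookup(struct em_list *l, uint64_t offset)
{
	struct em_node *n = atomic_load_explicit(&l->first, memory_order_acquire);

	while (n && n->start <= offset) {
		if (offset - n->start < n->len)
			return n;               /* offset falls inside this extent */
		n = atomic_load_explicit(&n->next, memory_order_acquire);
	}
	return NULL;
}

/*
 * Writer side: writers must be serialized by the caller (for example with
 * a mutex). Returns 0 on success, -1 on overlap, empty range or allocation
 * failure.
 */
static int em_insert(struct em_list *l, uint64_t start, uint64_t len)
{
	_Atomic(struct em_node *) *link = &l->first;
	struct em_node *cur = atomic_load_explicit(link, memory_order_relaxed);
	struct em_node *prev = NULL;
	struct em_node *node;

	if (len == 0)
		return -1;

	while (cur && cur->start < start) {
		prev = cur;
		link = &cur->next;
		cur = atomic_load_explicit(link, memory_order_relaxed);
	}
	if (prev && start - prev->start < prev->len)
		return -1;                  /* overlaps the previous extent */
	if (cur && cur->start - start < len)
		return -1;                  /* overlaps the next extent */

	node = malloc(sizeof(*node));
	if (!node)
		return -1;

	/* Initialize the node completely before any reader can reach it. */
	node->start = start;
	node->len = len;
	atomic_init(&node->next, cur);

	/* Publish: this single release store makes the node visible. */
	atomic_store_explicit(link, node, memory_order_release);
	return 0;
}
```

The point of the sketch is that an insert becomes visible through a single
pointer store, so concurrent lookups see the list either without or with the
new node and never a half-linked one. The hard part it leaves out, and that a
real RCU conversion has to solve, is deciding when a removed node can be freed.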

I've done some simple performance tests on my 2-core box and there is no
obvious difference, but I want to focus on the design side first and make
sure there are no more bugs in it.

As a long-term goal, we want to move the skiplist to lib, like lib/rbtree.c.

MORE TESTS ARE WELCOME!

Liu Bo (3):
  Btrfs: add the Probabilistic Skiplist
  Btrfs: rebuild extent_map based on skiplist
  Btrfs: convert rwlock to RCU for extent_map

 fs/btrfs/Makefile      |    2 +-
 fs/btrfs/compression.c |    8 +-
 fs/btrfs/disk-io.c     |   15 ++-
 fs/btrfs/extent_io.c   |   13 +-
 fs/btrfs/extent_map.c  |  296 ++--
 fs/btrfs/extent_map.h  |   21 +++-
 fs/btrfs/file.c        |   23 +++-
 fs/btrfs/inode.c       |   69 
 fs/btrfs/ioctl.c       |    8 +-
 fs/btrfs/relocation.c  |    9 +-
 fs/btrfs/scrub.c       |    4 +-
 fs/btrfs/skiplist.c    |   98 
 fs/btrfs/skiplist.h    |  217 +++
 fs/btrfs/volumes.c     |   68 ++-
 14 files changed, 651 insertions(+), 200 deletions(-)
 create mode 100644 fs/btrfs/skiplist.c
 create mode 100644 fs/btrfs/skiplist.h

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/3] apply the Probabilistic Skiplist on btrfs

2012-01-05 Thread David Sterba
Hi, I've let it run through xfstests and ended at 091, patches applied
on top of 3.2, mount options
compress-force=lzo,discard,inode_cache,space_cache,autodefrag
fresh mkfs with defaults.

[ 1081.623819] btrfs: force lzo compression
[ 1081.629166] btrfs: enabling inode map caching
[ 1081.634853] btrfs: enabling auto defrag
[ 1081.638569] btrfs: disk space caching is enabled
[ 1119.693957] [ cut here ]
[ 1119.697876] kernel BUG at fs/btrfs/file.c:530!
[ 1119.697876] invalid opcode:  [#1] SMP
[ 1119.697876] CPU 1
[ 1119.697876] Modules linked in: loop btrfs aoe
[ 1119.697876]
[ 1119.697876] Pid: 25819, comm: fsx Not tainted 3.2.0-default+ #95 Intel 
Corporation Santa Rosa platform/Matanzas
[ 1119.697876] RIP: 0010:[a0048a18]  [a0048a18] 
btrfs_drop_extent_cache+0x3f8/0x400 [btrfs]
[ 1119.697876] RSP: 0018:88000c47f698  EFLAGS: 00010282
[ 1119.697876] RAX: ffef RBX: 88006ff01e48 RCX: 00026fff
[ 1119.697876] RDX: 88006ed5d830 RSI: 00022000 RDI: 
[ 1119.697876] RBP: 88000c47f738 R08:  R09: 00022000
[ 1119.697876] R10: fffe R11: 00026fff R12: 88001ada9e48
[ 1119.697876] R13: 0001f000 R14:  R15: 88000c47f708
[ 1119.697876] FS:  7f262e570700() GS:88007de0() 
knlGS:
[ 1119.697876] CS:  0010 DS:  ES:  CR0: 8005003b
[ 1119.697876] CR2: 7fc4364fc000 CR3: 79435000 CR4: 06e0
[ 1119.697876] DR0:  DR1:  DR2: 
[ 1119.697876] DR3:  DR6: 0ff0 DR7: 0400
[ 1119.697876] Process fsx (pid: 25819, threadinfo 88000c47e000, task 
880063640700)
[ 1119.697876] Stack:
[ 1119.697876]  8800 81092040 88000c47f6f0 
01000246
[ 1119.697876]  0001  3000 
88006e5c44f0
[ 1119.697876]  88006e5c43e0   

[ 1119.697876] Call Trace:
[ 1119.697876]  [81092040] ? trace_hardirqs_on_caller+0x20/0x1d0
[ 1119.697876]  [a003a0b0] ? csum_exist_in_range+0xa0/0xa0 [btrfs]
[ 1119.697876]  [a003f296] cow_file_range+0x136/0x3e0 [btrfs]
[ 1119.697876]  [810921fd] ? trace_hardirqs_on+0xd/0x10
[ 1119.697876]  [a003f8a7] run_delalloc_nocow+0x367/0x820 [btrfs]
[ 1119.697876]  [81357dae] ? do_raw_spin_unlock+0x5e/0xb0
[ 1119.697876]  [a00400c9] run_delalloc_range+0x369/0x370 [btrfs]
[ 1119.697876]  [a00582c0] __extent_writepage+0x5f0/0x750 [btrfs]
[ 1119.697876]  [81349f4d] ? radix_tree_gang_lookup_tag_slot+0x8d/0xd0
[ 1119.697876]  [810f30d1] ? find_get_pages_tag+0x111/0x1b0
[ 1119.697876]  [a0058692] 
extent_write_cache_pages.clone.0+0x272/0x3f0 [btrfs]
[ 1119.697876]  [81357dae] ? do_raw_spin_unlock+0x5e/0xb0
[ 1119.697876]  [81131604] ? kfree+0xd4/0x180
[ 1119.697876]  [81092040] ? trace_hardirqs_on_caller+0x20/0x1d0
[ 1119.697876]  [a0058a56] extent_writepages+0x46/0x60 [btrfs]
[ 1119.697876]  [a003b590] ? acls_after_inode_item+0xd0/0xd0 [btrfs]
[ 1119.697876]  [a003ad17] btrfs_writepages+0x27/0x30 [btrfs]
[ 1120.018734]  [810fdcc4] do_writepages+0x24/0x40
[ 1120.018734]  [810f3cdb] __filemap_fdatawrite_range+0x5b/0x60
[ 1120.018734]  [810f3d3a] filemap_write_and_wait_range+0x5a/0x80
[ 1120.018734]  [a004859a] btrfs_file_aio_write+0x4da/0x560 [btrfs]
[ 1120.018734]  [8113a852] do_sync_write+0xe2/0x120
[ 1120.018734]  [8187d2ad] ? __mutex_unlock_slowpath+0xdd/0x180
[ 1120.018734]  [8187d35e] ? mutex_unlock+0xe/0x10
[ 1120.018734]  [a004703f] ? btrfs_file_llseek+0x6f/0x390 [btrfs]
[ 1120.018734]  [8113b15e] vfs_write+0xce/0x190
[ 1120.018734]  [8113b4a4] sys_write+0x54/0xa0
[ 1120.018734]  [81887a82] system_call_fastpath+0x16/0x1b
[ 1120.018734] Code: 5e 41 5f c9 c3 0f 0b be bf 01 00 00 48 c7 c7 e6 02 09 a0 
48 89 95 68 ff ff ff e8 e4 a2 00 e1 48 8b 95 68 ff ff ff e9 3c fc ff ff 0f 0b 
0f 0b 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 53
[ 1120.018734] RIP  [a0048a18] btrfs_drop_extent_cache+0x3f8/0x400 
[btrfs]
[ 1120.018734]  RSP 88000c47f698
[ 1120.047841] ---[ end trace ca0f509767e0195d ]---

xfstests/091 output:

091 57s ... [19:47:50] [19:48:28] [failed, exit status 1] - output mismatch 
(see 091.out.bad)
--- 091.out 2011-11-01 10:31:12.0 +0100
+++ 091.out.bad 2012-01-05 19:48:28.0 +0100
@@ -5,3 +5,41 @@
 fsx -N 1 -o 8192 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
 fsx -N 1 -o 32768 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
 fsx -N 1 -o 128000 -l 50 -r PSIZE -t BSIZE -w BSIZE -Z -W
+./091: line 46: 25819 Segmentation fault  $here/ltp/fsx $args $TEST_DIR/junk >> $seq.full 2>&1
+fsx -N 1 -l 50 -r