Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-07-07 Thread Liu Bo
On 07/06/2012 11:37 PM, Chris Mason wrote:

> On Wed, Jun 13, 2012 at 07:50:52PM -0600, Liu Bo wrote:
>> On 06/14/2012 12:07 AM, Zach Brown wrote:
>>>
>>>>   int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
>>>>   {
>>> [...]
>>>> +	ret = radix_tree_insert(&tree->csum, (unsigned long)start,
>>>> +			       (void *)((unsigned long)private << 1));
>>> Will this fail for 64bit files on 32bit hosts?
>>
>> In theory it will fail, but crc32c returns a u32, so private will originally
>> be a u32, and it'd be ok on 32bit hosts.
>
> The (unsigned long)start part looks wrong though.  This is the byte offset
> from 0, so on a 32 bit machine you won't be able to have large files.
>
> The page cache also has this limitation, but it gains extra bits
> counting page indexes instead of byte indexes.
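
To make the concern about (unsigned long)start concrete, here is a minimal user-space sketch, not kernel code. It models a 32-bit host's unsigned long with uint32_t and assumes 4K pages (a shift of 12). All helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages */

/* Models (unsigned long)start on a 32-bit host: the upper 32 bits are dropped. */
static uint32_t key_from_byte_offset(uint64_t start)
{
	return (uint32_t)start;
}

/* Key by page number instead; this gains SKETCH_PAGE_SHIFT bits of range. */
static uint32_t key_from_page_index(uint64_t start)
{
	return (uint32_t)(start >> SKETCH_PAGE_SHIFT);
}

/* Returns 1 if two distinct byte offsets end up with the same byte-offset key. */
static int byte_keys_collide(uint64_t a, uint64_t b)
{
	assert(a != b);
	return key_from_byte_offset(a) == key_from_byte_offset(b);
}
```

In this model, byte offsets 0 and 4 GiB (1ULL << 32) collide on the byte-offset key, while their page-index keys stay distinct. With 4K pages a 32-bit page index only wraps at 2^44 bytes (16 TiB), which is the "extra bits" the page cache gets.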
 


I see.

> I've made that change here and I'm benchmarking it on my big flash ;)
>


Thanks a lot. :)

I must note that this patchset is still at a very early stage, and this week
I've fixed a deadlock bug hidden in the 4th patch (it can be triggered by
xfstests 208).

I'm planning to set up a worker thread, or just use the 'endio_meta' thread,
for merge_state, and to do more tuning work to lessen the writer-lock overhead.

thanks,
liubo

> -chris
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
 




Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-14 Thread Zach Brown



>>> +	BUG_ON(ret);
>>
>> I wonder if we can patch BUG_ON() to break the build if its only
>> argument is ret.
>
> why?


Well, I'm mostly joking :).  That would be a very silly change to make.

But only mostly joking.  btrfs does have a real fragility problem from
all these incomplete error handling paths:

$ grep 'BUG_ON(ret.*)' fs/btrfs/*.c | wc -l
197

We should be fixing these, not adding more.  I don't think any patches
should be merged which add more of these.
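
As a rough illustration of the alternative, here is a user-space sketch of an insert path that hands the error back to the caller instead of asserting on it. It does not use the kernel radix tree; the sketch_* table and helpers are hypothetical stand-ins. The only detail borrowed from the real API is that radix_tree_insert() can return -EEXIST or -ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_SLOTS 4	/* hypothetical tiny fixed-size table */

struct sketch_table {
	unsigned long keys[SKETCH_SLOTS];
	void *items[SKETCH_SLOTS];
	size_t used;
};

/* Insert that reports failure instead of crashing: -EEXIST or -ENOMEM. */
static int sketch_insert(struct sketch_table *t, unsigned long key, void *item)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->used; i++)
		if (t->keys[i] == key)
			return -EEXIST;
	if (t->used == SKETCH_SLOTS)
		return -ENOMEM;
	t->keys[t->used] = key;
	t->items[t->used] = item;
	t->used++;
	return 0;
}

/* The caller propagates the error where the patch has BUG_ON(ret). */
static int sketch_set_private(struct sketch_table *t, unsigned long start,
			      void *priv)
{
	int ret = sketch_insert(t, start, priv);

	if (ret)
		return ret;	/* let the caller decide how to recover */
	return 0;
}
```

The point is only that the caller sees -EEXIST or -ENOMEM and can unwind. What the right recovery is in each btrfs call site is a separate question that this sketch does not try to answer.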

- z


[PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-13 Thread Liu Bo
We used to attach a checksum to an extent state covering a 4K range for read
endio, but now we want to use larger ranges as a performance optimization, so
instead we create a radix tree for checksums, where each item holds the
checksum of 4K of data.

Signed-off-by: Liu Bo <liubo2...@cn.fujitsu.com>
---
 fs/btrfs/extent_io.c |   84 --
 fs/btrfs/extent_io.h |2 +
 fs/btrfs/inode.c |7 +---
 3 files changed, 23 insertions(+), 70 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 2c8f7b2..2923ede 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -117,10 +117,12 @@ void extent_io_tree_init(struct extent_io_tree *tree,
 {
 	tree->state = RB_ROOT;
 	INIT_RADIX_TREE(&tree->buffer, GFP_ATOMIC);
+	INIT_RADIX_TREE(&tree->csum, GFP_ATOMIC);
 	tree->ops = NULL;
 	tree->dirty_bytes = 0;
 	spin_lock_init(&tree->lock);
 	spin_lock_init(&tree->buffer_lock);
+	spin_lock_init(&tree->csum_lock);
 	tree->mapping = mapping;
 }
 
@@ -703,15 +705,6 @@ static void cache_state(struct extent_state *state,
 	}
 }
 
-static void uncache_state(struct extent_state **cached_ptr)
-{
-	if (cached_ptr && (*cached_ptr)) {
-		struct extent_state *state = *cached_ptr;
-		*cached_ptr = NULL;
-		free_extent_state(state);
-	}
-}
-
 /*
  * set some bits on a range in the tree.  This may require allocations or
  * sleeping, so the gfp mask is used to indicate what is allowed.
@@ -1666,56 +1659,32 @@ out:
  */
 int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
 {
-	struct rb_node *node;
-	struct extent_state *state;
 	int ret = 0;
 
-	spin_lock(&tree->lock);
-	/*
-	 * this search will find all the extents that end after
-	 * our range starts.
-	 */
-	node = tree_search(tree, start);
-	if (!node) {
-		ret = -ENOENT;
-		goto out;
-	}
-	state = rb_entry(node, struct extent_state, rb_node);
-	if (state->start != start) {
-		ret = -ENOENT;
-		goto out;
-	}
-	state->private = private;
-out:
-	spin_unlock(&tree->lock);
+	spin_lock(&tree->csum_lock);
+	ret = radix_tree_insert(&tree->csum, (unsigned long)start,
+			       (void *)((unsigned long)private << 1));
+	BUG_ON(ret);
+	spin_unlock(&tree->csum_lock);
 	return ret;
 }
 
 int get_state_private(struct extent_io_tree *tree, u64 start, u64 *private)
 {
-	struct rb_node *node;
-	struct extent_state *state;
-	int ret = 0;
+	void **slot = NULL;
 
-	spin_lock(&tree->lock);
-	/*
-	 * this search will find all the extents that end after
-	 * our range starts.
-	 */
-	node = tree_search(tree, start);
-	if (!node) {
-		ret = -ENOENT;
-		goto out;
-	}
-	state = rb_entry(node, struct extent_state, rb_node);
-	if (state->start != start) {
-		ret = -ENOENT;
-		goto out;
+	spin_lock(&tree->csum_lock);
+	slot = radix_tree_lookup_slot(&tree->csum, (unsigned long)start);
+	if (!slot) {
+		spin_unlock(&tree->csum_lock);
+		return -ENOENT;
 	}
-	*private = state->private;
-out:
-	spin_unlock(&tree->lock);
-	return ret;
+	*private = (u64)(*slot) >> 1;
+
+	radix_tree_delete(&tree->csum, (unsigned long)start);
+	spin_unlock(&tree->csum_lock);
+
+	return 0;
 }
 
 /*
@@ -2294,7 +2263,6 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 	do {
 		struct page *page = bvec->bv_page;
 		struct extent_state *cached = NULL;
-		struct extent_state *state;
 
 		pr_debug("end_bio_extent_readpage: bi_vcnt=%d, idx=%d, err=%d, "
 			 "mirror=%ld\n", bio->bi_vcnt, bio->bi_idx, err,
@@ -2313,21 +2281,10 @@ static void end_bio_extent_readpage(struct bio *bio, int err)
 		if (++bvec <= bvec_end)
 			prefetchw(&bvec->bv_page->flags);
 
-		spin_lock(&tree->lock);
-		state = find_first_extent_bit_state(tree, start, EXTENT_LOCKED);
-		if (state && state->start == start) {
-			/*
-			 * take a reference on the state, unlock will drop
-			 * the ref
-			 */
-			cache_state(state, &cached);
-		}
-		spin_unlock(&tree->lock);
-
 		mirror = (int)(unsigned long)bio->bi_bdev;
 		if (uptodate && tree->ops && tree->ops->readpage_end_io_hook) {
 			ret = tree->ops->readpage_end_io_hook(page, start, end,
-							      state, mirror);
+							      NULL, mirror);

Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-13 Thread Zach Brown



>   int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
>   {

[...]

> +	ret = radix_tree_insert(&tree->csum, (unsigned long)start,
> +			       (void *)((unsigned long)private << 1));

Will this fail for 64bit files on 32bit hosts?

> +	BUG_ON(ret);


I wonder if we can patch BUG_ON() to break the build if its only
argument is ret.

- z


Re: [PATCH 1/4] Btrfs: use radix tree for checksum

2012-06-13 Thread Liu Bo
On 06/14/2012 12:07 AM, Zach Brown wrote:

 
>>   int set_state_private(struct extent_io_tree *tree, u64 start, u64 private)
>>   {
> [...]
>> +	ret = radix_tree_insert(&tree->csum, (unsigned long)start,
>> +			       (void *)((unsigned long)private << 1));
>
> Will this fail for 64bit files on 32bit hosts?


In theory it will fail, but crc32c returns a u32, so private will originally
be a u32, and it'd be ok on 32bit hosts.
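
One way to check the value side of this is a small user-space sketch of the shift-by-one encoding from the patch. It models unsigned long as uint32_t for a 32-bit host and as uint64_t for a 64-bit host. The helper names are hypothetical, and the assumption that the shift exists to keep the low bit of the stored pointer clear is mine, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Encode as in the patch, with unsigned long modeled as 32 bits wide. */
static uint32_t encode32(uint32_t csum)
{
	uint32_t slot = csum << 1;	/* bit 31 of csum is shifted out here */

	assert((slot & 1u) == 0);	/* low bit is always clear after the shift */
	return slot;
}

static uint32_t decode32(uint32_t slot)
{
	return slot >> 1;
}

/* Same encoding, with unsigned long modeled as 64 bits wide. */
static uint64_t encode64(uint32_t csum)
{
	return (uint64_t)csum << 1;
}

static uint32_t decode64(uint64_t slot)
{
	return (uint32_t)(slot >> 1);
}

/* Return 1 if the checksum survives an encode/decode round trip. */
static int roundtrips32(uint32_t csum)
{
	return decode32(encode32(csum)) == csum;
}

static int roundtrips64(uint32_t csum)
{
	return decode64(encode64(csum)) == csum;
}
```

In this model every 32-bit checksum round-trips through the 64-bit slot, but a checksum with its top bit set (for example 0x80000000) does not round-trip through the 32-bit slot, because the left shift drops that bit. Whether that matters for the real patch depends on details outside this sketch.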

 
>> +	BUG_ON(ret);
>
> I wonder if we can patch BUG_ON() to break the build if its only
> argument is ret.
 


why?

thanks,
liubo

> - z

