Re: btrfs hang in flush-btrfs-5

2011-07-13 Thread Josef Bacik
On 07/11/2011 05:21 PM, Jeremy Sanders wrote:
 Josef Bacik wrote:
 
 On 07/11/2011 07:40 AM, Jeremy Sanders wrote:
 Jeremy Sanders wrote:

 Hi - I'm trying btrfs with kernel 2.6.38.8-32.fc15.x86_64 (a Fedora
 kernel). I'm just doing a tar-to-tar copy onto the file system with
 compress-force=zlib. Here are some traces of the stuck processes.

 I've managed to reproduce the hang using the latest btrfs from the
 repository. To get it to compile under 2.6.38.8 I had to remove some of
 the tracing lines and an ioctl which wasn't defined. Here is where it is
 stuck:


 Hrm well that is just unlikely and hard to hit.  Will you try this and
 see if it helps you?  Thanks,
 
 It's got quite a bit further than it did before and hasn't crashed yet.
 I will let you know when it has finished OK.
 
 However, I see that the btrfs-delalloc (rather than endio-write) thread is
 taking up 100% of CPU, and the write speed seems to have dropped during the
 copying. The copy started out using endio-write fully on both cores and is
 now using delalloc a lot.
 


When you see that, can you use sysrq+w or sysrq+t to get a stack trace of
what it's doing, so I can see if it's something that can be fixed?  Thanks,

Josef
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Delayed inode operations not doing the right thing with enospc

2011-07-13 Thread Josef Bacik
On 07/12/2011 11:20 AM, Christian Brunner wrote:
 2011/6/7 Josef Bacik jo...@redhat.com:
 On 06/06/2011 09:39 PM, Miao Xie wrote:
 On fri, 03 Jun 2011 14:46:10 -0400, Josef Bacik wrote:
 I got a lot of these when running stress.sh on my test box



 This is because use_block_rsv() is having to do a
 reserve_metadata_bytes(), which shouldn't happen as we should have
 reserved enough space for those operations to complete.  This is
 happening because use_block_rsv() will call get_block_rsv(), which, if
 root->ref_cows is set (which is the case on all fs roots), will use
 trans->block_rsv, which will only have what the current transaction
 starter had reserved.

 What needs to be done instead is to have a special block reserve: any
 reservation that is done at create time for these inodes is migrated
 to this special reserve, and then when you run the delayed inode items
 stuff you set trans->block_rsv to the special block reserve so the
 accounting is all done properly.

 This is just off the top of my head, there may be a better way to do it,
 I've not actually looked at the delayed inode code at all.

 I would do this myself but I have an ever-increasing list of shit to do
 so will somebody pick this up and fix it please?  Thanks,

 Sorry, it's my mistake.
 I forgot to set trans->block_rsv to global_block_rsv, since we have migrated
 the space from trans_block_rsv to global_block_rsv.

 I'll fix it soon.


 There is another problem, we're failing xfstest 204.  I tried making
 reserve_metadata_bytes commit the transaction regardless of whether or
 not there were pinned bytes but the test just hung there.  Usually it
 takes 7 seconds to run and I ctrl+c'ed it after a couple of minutes.
 204 just creates a crap ton of files, which is what is killing us.
 There needs to be a way to start flushing delayed inode items so we can
 reclaim the space they are holding onto so we don't get enospc, and it
 needs to be better than just committing the transaction because that is
 dog slow.  Thanks,

 Josef
 
 Is there a solution for this?
 
 I'm running a 2.6.38.8 kernel with all the btrfs patches from 3.0-rc7
 (except the plugging). When starting a ceph rebuild on the btrfs
 volumes I get a lot of warnings from block_rsv_use_bytes in
 use_block_rsv:
 

Ok I think I've got this nailed down.  Will you run with this patch and make 
sure the warnings go away?  Thanks,

Josef

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 52d7eca..2263d29 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -112,9 +112,6 @@ struct btrfs_inode {
 */
u64 disk_i_size;
 
-   /* flags field from the on disk inode */
-   u32 flags;
-
/*
 * if this is a directory then index_cnt is the counter for the index
 * number for new files that are created
@@ -128,14 +125,8 @@ struct btrfs_inode {
 */
u64 last_unlink_trans;
 
-   /*
-* Counters to keep track of the number of extent item's we may use due
-* to delalloc and such.  outstanding_extents is the number of extent
-* items we think we'll end up using, and reserved_extents is the number
-* of extent items we've reserved metadata for.
-*/
-   atomic_t outstanding_extents;
-   atomic_t reserved_extents;
+   /* flags field from the on disk inode */
+   u32 flags;
 
/*
 * ordered_data_close is set by truncate when a file that used
@@ -151,12 +142,21 @@ struct btrfs_inode {
unsigned orphan_meta_reserved:1;
unsigned dummy_inode:1;
unsigned in_defrag:1;
-
/*
 * always compress this one file
 */
unsigned force_compress:4;
 
+   /*
+* Counters to keep track of the number of extent item's we may use due
+* to delalloc and such.  outstanding_extents is the number of extent
+* items we think we'll end up using, and reserved_extents is the number
+* of extent items we've reserved metadata for.
+*/
+   spinlock_t extents_count_lock;
+   unsigned outstanding_extents;
+   unsigned reserved_extents;
+
struct btrfs_delayed_node *delayed_node;
 
struct inode vfs_inode;
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index be02cae..3ba4d5f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2133,7 +2133,7 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
 
 /* extent-tree.c */
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
-						 int num_items)
+						 unsigned num_items)
 {
 	return (root->leafsize + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
 		3 * num_items;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3e52b85..65a721c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3952,13 +3952,35 @@ static u64 

[PATCH] Btrfs: don't print the leaf if we had an error

2011-07-13 Thread Josef Bacik
In __btrfs_free_extent we will print the leaf if we fail to find the extent we
wanted, but the problem is that if we get an error we won't have a leaf, so this
often leads to a NULL pointer dereference and we lose the error that actually
occurred.  So only print the leaf if ret > 0, which means we didn't find the
item we were looking for but we didn't error either.  This way the error is
preserved.

Signed-off-by: Josef Bacik jo...@redhat.com
---
 fs/btrfs/extent-tree.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3e52b85..152669b 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4447,7 +4447,9 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans,
 			printk(KERN_ERR "umm, got %d back from search"
 			       ", was looking for %llu\n", ret,
 			       (unsigned long long)bytenr);
-			btrfs_print_leaf(extent_root, path->nodes[0]);
+			if (ret > 0)
+				btrfs_print_leaf(extent_root,
+						 path->nodes[0]);
 		}
 		BUG_ON(ret);
 		extent_slot = path->slots[0];
-- 
1.7.5.2
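
The fix above depends on the three-way return convention of the tree search: a negative value is an error (and there may be no leaf to print), zero means the item was found, and a positive value means "not found, but the path is valid". The following is a minimal userspace sketch of that convention, not btrfs code; `struct toy_path`, `toy_search()` and `should_print_leaf()` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree path; leaf stays NULL when the search failed. */
struct toy_path {
	const int *leaf;	/* sorted keys, or NULL if no leaf was read */
	size_t nritems;
	size_t slot;
};

/*
 * Toy search using the same convention as the commit message:
 * < 0 on error, 0 if the key was found, > 0 if it was not found.
 */
static int toy_search(const int *keys, size_t n, int key, struct toy_path *path)
{
	size_t i;

	path->leaf = NULL;
	path->nritems = 0;
	path->slot = 0;
	if (!keys)
		return -EIO;	/* simulated read failure: no leaf available */

	path->leaf = keys;
	path->nritems = n;
	for (i = 0; i < n && keys[i] < key; i++)
		;
	path->slot = i;
	return (i < n && keys[i] == key) ? 0 : 1;
}

/* Returns 1 only when dumping the leaf is safe, i.e. when ret > 0. */
static int should_print_leaf(int ret, const struct toy_path *path)
{
	if (ret > 0) {
		assert(path->leaf != NULL);	/* "not found" always has a leaf */
		return 1;
	}
	return 0;	/* found (0) or error (< 0): nothing to print */
}
```

In this sketch an error return leaves `path->leaf` as NULL, which is the case the patch guards against before calling the print helper.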



[PATCH 00/16] Btrfs: fixes and cleanups for 3.1

2011-07-13 Thread Li Zefan
The first 4 patches are bug fixes, and the remaining are small
cleanups that have sat in my git tree for some time.

The first 3 patches have been sent to the list before.

We save some bytes after this patchset:

   text    data     bss     dec     hex filename
 426387    3854    1024  431265   694a1 fs/btrfs/btrfs.o.orig
 425651    3854    1024  430529   691c1 fs/btrfs/btrfs.o

I've run xfstests for testing.

===

Li Zefan (12):
  Btrfs: copy string correctly in INO_LOOKUP ioctl
  Btrfs: fix space leak when skipping small extents during trimming
  Btrfs: fix space leak when trimming free extents
  Btrfs: check the nodatasum flag when writing compressed files
  Btrfs: use wait_event()
  Btrfs: remove a BUG_ON() in btrfs_commit_transaction()
  Btrfs: remove remaining ref-cache code
  Btrfs: make acl functions really no-op if acl is not enabled
  Btrfs: remove redundant code for dir item lookup
  Btrfs: clean up search_extent_mapping()
  Btrfs: clean up code for extent_map lookup
  Btrfs: clean up code for merging extent maps

Xiao Guangrong (4):
  Btrfs: remove unused members from struct extent_state
  Btrfs: clean up for insert_state()
  Btrfs: clean up for wait_extent_bit()
  Btrfs: clean up for find_first_extent_bit()

 fs/btrfs/Makefile           |    4 +-
 fs/btrfs/acl.c              |   17 -
 fs/btrfs/compression.c      |   14 +++-
 fs/btrfs/ctree.h            |   15 -
 fs/btrfs/dir-item.c         |   30 +
 fs/btrfs/extent_io.c        |   80 --
 fs/btrfs/extent_io.h        |    2 -
 fs/btrfs/extent_map.c       |  155 ++-
 fs/btrfs/free-space-cache.c |   43 +++-
 fs/btrfs/ioctl.c            |    3 +-
 fs/btrfs/ref-cache.c        |   68 ---
 fs/btrfs/ref-cache.h        |   52 --
 fs/btrfs/transaction.c      |   65 +++
 13 files changed, 142 insertions(+), 406 deletions(-)
 delete mode 100644 fs/btrfs/ref-cache.c
 delete mode 100644 fs/btrfs/ref-cache.h



[PATCH 01/16] Btrfs: copy string correctly in INO_LOOKUP ioctl

2011-07-13 Thread Li Zefan
Memory areas [ptr, ptr+total_len] and [name, name+total_len]
may overlap, so it's wrong to use memcpy().

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a3c4751..08a4580 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1755,11 +1755,10 @@ static noinline int btrfs_search_path_in_tree(struct btrfs_fs_info *info,
 		key.objectid = key.offset;
 		key.offset = (u64)-1;
 		dirid = key.objectid;
-
 	}
 	if (ptr < name)
 		goto out;
-	memcpy(name, ptr, total_len);
+	memmove(name, ptr, total_len);
 	name[total_len]='\0';
 	ret = 0;
 out:
-- 
1.7.3.1
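
To illustrate why the overlap matters, here is a small userspace sketch that builds a path right-aligned in a buffer and then shifts it to the front, which is roughly the situation the commit message describes. `build_path()` and its parameters are hypothetical and only assume standard C; this is not the btrfs function.

```c
#include <assert.h>
#include <string.h>

/*
 * Assemble "a/bb/ccc" from its components, filling the buffer from the
 * end towards the front, then move the result to the start of the buffer.
 * The source and destination of the final move can overlap, so memmove()
 * is required; memcpy() has undefined behaviour for overlapping regions.
 * Returns the path length, or -1 if the buffer is too small.
 */
static int build_path(char *buf, size_t buflen,
		      const char *const *parts, size_t nparts)
{
	char *ptr;
	size_t total_len = 0;
	size_t i;

	if (buflen == 0)
		return -1;
	ptr = buf + buflen - 1;		/* last byte is reserved for the NUL */

	/* Walk the components from last to first. */
	for (i = nparts; i-- > 0; ) {
		size_t len = strlen(parts[i]);
		size_t need = len + (total_len ? 1 : 0);

		if (need > (size_t)(ptr - buf))
			return -1;	/* would run past the front of buf */
		if (total_len)
			*--ptr = '/';
		ptr -= len;
		memcpy(ptr, parts[i], len);	/* distinct buffers: memcpy is fine */
		total_len += need;
	}

	memmove(buf, ptr, total_len);	/* regions may overlap */
	buf[total_len] = '\0';
	return (int)total_len;
}
```

With a 16-byte buffer and the components "a", "bb" and "ccc", the assembled string occupies bytes 7 to 14 and is moved to bytes 0 to 7, so the two regions share byte 7.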


[PATCH 02/16] Btrfs: fix space leak when skipping small extents during trimming

2011-07-13 Thread Li Zefan
We're taking a free space extent out of the free space cache, trimming
it and then putting it back into the cache.

However, an extent that is smaller than the specified minimum length
is taken out but never put back, which leaks space.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/free-space-cache.c |   34 +-
 1 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index bf0d615..901585d 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2463,6 +2463,7 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 	u64 bytes = 0;
 	u64 actually_trimmed;
 	int ret = 0;
+	int update_ret;
 
 	*trimmed = 0;
 
@@ -2486,6 +2487,7 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 		}
 
 		if (entry->bitmap) {
+			bytes = 0;
 			ret = search_bitmap(ctl, entry, &start, &bytes);
 			if (!ret) {
 				if (start >= end) {
@@ -2493,6 +2495,8 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 					break;
 				}
 				bytes = min(bytes, end - start);
+				if (bytes < minlen)
+					goto next;
 				bitmap_clear_bits(ctl, entry, start, bytes);
 				if (entry->bytes == 0)
 					free_bitmap(ctl, entry);
@@ -2506,33 +2510,29 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 		} else {
 			start = entry->offset;
 			bytes = min(entry->bytes, end - start);
+			if (bytes < minlen)
+				goto next;
 			unlink_free_space(ctl, entry);
 			kmem_cache_free(btrfs_free_space_cachep, entry);
 		}
 
 		spin_unlock(&ctl->tree_lock);
 
-		if (bytes >= minlen) {
-			int update_ret;
-			update_ret = btrfs_update_reserved_bytes(block_group,
-								 bytes, 1, 1);
+		update_ret = btrfs_update_reserved_bytes(block_group,
+							 bytes, 1, 1);
 
-			ret = btrfs_error_discard_extent(fs_info->extent_root,
-							 start,
-							 bytes,
-							 &actually_trimmed);
+		ret = btrfs_error_discard_extent(fs_info->extent_root, start,
+						 bytes, &actually_trimmed);
 
-			btrfs_add_free_space(block_group, start, bytes);
-			if (!update_ret)
-				btrfs_update_reserved_bytes(block_group,
-							    bytes, 0, 1);
+		btrfs_add_free_space(block_group, start, bytes);
+		if (!update_ret)
+			btrfs_update_reserved_bytes(block_group, bytes, 0, 1);
 
-			if (ret)
-				break;
-			*trimmed += actually_trimmed;
-		}
+		if (ret)
+			break;
+		*trimmed += actually_trimmed;
+next:
 		start += bytes;
-		bytes = 0;
 
 		if (fatal_signal_pending(current)) {
 			ret = -ERESTARTSYS;
-- 
1.7.3.1
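
The idea of the fix can be shown with a toy model: decide whether an extent is large enough before taking it out of the cache, so a skipped extent never leaves it. This is only a sketch under simplifying assumptions (a flat array instead of the real free space cache, no locking, no bitmaps); `struct toy_extent`, `cached_bytes()` and `toy_trim()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_extent {
	uint64_t offset;
	uint64_t bytes;
	int in_cache;	/* 1 while the extent is tracked as free space */
};

/* Sum of the bytes currently tracked in the toy cache. */
static uint64_t cached_bytes(const struct toy_extent *e, size_t n)
{
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (e[i].in_cache)
			sum += e[i].bytes;
	return sum;
}

/*
 * "Trim" every extent of at least minlen bytes and return the number of
 * bytes trimmed. The size check comes before the extent is unlinked, so
 * an extent that is too small is skipped while still in the cache.
 */
static uint64_t toy_trim(struct toy_extent *e, size_t n, uint64_t minlen)
{
	uint64_t trimmed = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (e[i].bytes < minlen)
			continue;		/* plays the role of "goto next" */
		e[i].in_cache = 0;		/* take it out while discarding */
		trimmed += e[i].bytes;		/* stand-in for the discard call */
		e[i].in_cache = 1;		/* put it back afterwards */
	}
	return trimmed;
}
```

The invariant worth checking is that the total returned by `cached_bytes()` is the same before and after `toy_trim()`, whatever `minlen` is.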


[PATCH 03/16] Btrfs: fix space leak when trimming free extents

2011-07-13 Thread Li Zefan
When the end of an extent exceeds the end of the specified range,
the extent will be accidentally truncated.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/free-space-cache.c |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 901585d..cf4bffd 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -2512,8 +2512,15 @@ int btrfs_trim_block_group(struct btrfs_block_group_cache *block_group,
 			bytes = min(entry->bytes, end - start);
 			if (bytes < minlen)
 				goto next;
+
 			unlink_free_space(ctl, entry);
-			kmem_cache_free(btrfs_free_space_cachep, entry);
+			if (bytes < entry->bytes) {
+				entry->offset = entry->offset + bytes;
+				entry->bytes = entry->bytes - bytes;
+				link_free_space(ctl, entry);
+			} else {
+				kmem_cache_free(btrfs_free_space_cachep, entry);
+			}
 		}
 
 		spin_unlock(&ctl->tree_lock);
-- 
1.7.3.1
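
A worked example of the split may help: when only the front part of an extent falls inside the trim range, the remainder has to stay in the cache with an adjusted offset and length. The sketch below assumes a bare (offset, bytes) pair and a hypothetical helper `take_for_trim()`; it mirrors the arithmetic in the hunk above, not the surrounding locking or tree handling.

```c
#include <assert.h>
#include <stdint.h>

struct toy_entry {
	uint64_t offset;
	uint64_t bytes;
};

/*
 * Carve out the part of *entry that lies below 'end' and return its size.
 * *entry is updated to describe what is left; entry->bytes == 0 means the
 * whole extent was consumed and the entry could be freed.
 */
static uint64_t take_for_trim(struct toy_entry *entry, uint64_t end)
{
	uint64_t avail, bytes;

	assert(end >= entry->offset);
	avail = end - entry->offset;
	bytes = entry->bytes < avail ? entry->bytes : avail;

	entry->offset += bytes;		/* remainder starts after the trimmed part */
	entry->bytes -= bytes;
	return bytes;
}
```

For an extent at offset 1000 with 500 bytes and a range ending at 1200, 200 bytes are taken and the entry becomes offset 1200 with 300 bytes, instead of losing those 300 bytes.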


[PATCH 05/16] Btrfs: use wait_event()

2011-07-13 Thread Li Zefan
Use wait_event() when possible to avoid code duplication.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/transaction.c |   59 +--
 1 files changed, 7 insertions(+), 52 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 51dcec8..34a30ea 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -216,17 +216,11 @@ static void wait_current_trans(struct btrfs_root *root)
 	spin_lock(&root->fs_info->trans_lock);
 	cur_trans = root->fs_info->running_transaction;
 	if (cur_trans && cur_trans->blocked) {
-		DEFINE_WAIT(wait);
 		atomic_inc(&cur_trans->use_count);
 		spin_unlock(&root->fs_info->trans_lock);
-		while (1) {
-			prepare_to_wait(&root->fs_info->transaction_wait, &wait,
-					TASK_UNINTERRUPTIBLE);
-			if (!cur_trans->blocked)
-				break;
-			schedule();
-		}
-		finish_wait(&root->fs_info->transaction_wait, &wait);
+
+		wait_event(root->fs_info->transaction_wait,
+			   !cur_trans->blocked);
 		put_transaction(cur_trans);
 	} else {
 		spin_unlock(&root->fs_info->trans_lock);
@@ -362,15 +356,7 @@ struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *root
 static noinline int wait_for_commit(struct btrfs_root *root,
 				    struct btrfs_transaction *commit)
 {
-	DEFINE_WAIT(wait);
-	while (!commit->commit_done) {
-		prepare_to_wait(&commit->commit_wait, &wait,
-				TASK_UNINTERRUPTIBLE);
-		if (commit->commit_done)
-			break;
-		schedule();
-	}
-	finish_wait(&commit->commit_wait, &wait);
+	wait_event(commit->commit_wait, commit->commit_done);
 	return 0;
 }
 
@@ -1080,22 +1066,7 @@ int btrfs_transaction_blocked(struct btrfs_fs_info *info)
 static void wait_current_trans_commit_start(struct btrfs_root *root,
 					    struct btrfs_transaction *trans)
 {
-	DEFINE_WAIT(wait);
-
-	if (trans->in_commit)
-		return;
-
-	while (1) {
-		prepare_to_wait(&root->fs_info->transaction_blocked_wait, &wait,
-				TASK_UNINTERRUPTIBLE);
-		if (trans->in_commit) {
-			finish_wait(&root->fs_info->transaction_blocked_wait,
-				    &wait);
-			break;
-		}
-		schedule();
-		finish_wait(&root->fs_info->transaction_blocked_wait, &wait);
-	}
+	wait_event(root->fs_info->transaction_blocked_wait, trans->in_commit);
 }
 
 /*
@@ -1105,24 +1076,8 @@ static void wait_current_trans_commit_start(struct btrfs_root *root,
 static void wait_current_trans_commit_start_and_unblock(struct btrfs_root *root,
 					 struct btrfs_transaction *trans)
 {
-	DEFINE_WAIT(wait);
-
-	if (trans->commit_done || (trans->in_commit && !trans->blocked))
-		return;
-
-	while (1) {
-		prepare_to_wait(&root->fs_info->transaction_wait, &wait,
-				TASK_UNINTERRUPTIBLE);
-		if (trans->commit_done ||
-		    (trans->in_commit && !trans->blocked)) {
-			finish_wait(&root->fs_info->transaction_wait,
-				    &wait);
-			break;
-		}
-		schedule();
-		finish_wait(&root->fs_info->transaction_wait,
-			    &wait);
-	}
+	wait_event(root->fs_info->transaction_wait,
+		   trans->commit_done || (trans->in_commit && !trans->blocked));
 }
 
 /*
-- 
1.7.3.1


[PATCH 06/16] Btrfs: remove a BUG_ON() in btrfs_commit_transaction()

2011-07-13 Thread Li Zefan
wait_for_commit() always returns 0.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/transaction.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 34a30ea..40726ac 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -353,11 +353,10 @@ struct btrfs_trans_handle *btrfs_start_ioctl_transaction(struct btrfs_root *root
 }
 
 /* wait for a transaction commit to be fully complete */
-static noinline int wait_for_commit(struct btrfs_root *root,
+static noinline void wait_for_commit(struct btrfs_root *root,
 				    struct btrfs_transaction *commit)
 {
 	wait_event(commit->commit_wait, commit->commit_done);
-	return 0;
 }
 
 int btrfs_wait_for_commit(struct btrfs_root *root, u64 transid)
@@ -1184,8 +1183,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		atomic_inc(&cur_trans->use_count);
 		btrfs_end_transaction(trans, root);
 
-		ret = wait_for_commit(root, cur_trans);
-		BUG_ON(ret);
+		wait_for_commit(root, cur_trans);
 
 		put_transaction(cur_trans);
 
-- 
1.7.3.1


[PATCH 07/16] Btrfs: remove remaining ref-cache code

2011-07-13 Thread Li Zefan
Since commit f2a97a9dbd86eb1ef956bdf20e05c507b32beb96
(btrfs: remove all unused functions), there are no extern functions
at all in ref-cache.c, so just remove the remaining dead code.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/ref-cache.c |   68 --
 fs/btrfs/ref-cache.h |   52 --
 2 files changed, 0 insertions(+), 120 deletions(-)
 delete mode 100644 fs/btrfs/ref-cache.c
 delete mode 100644 fs/btrfs/ref-cache.h

diff --git a/fs/btrfs/ref-cache.c b/fs/btrfs/ref-cache.c
deleted file mode 100644
index 82d569c..000
--- a/fs/btrfs/ref-cache.c
+++ /dev/null
@@ -1,68 +0,0 @@
-/*
- * Copyright (C) 2008 Oracle.  All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-
-#include <linux/sched.h>
-#include <linux/slab.h>
-#include <linux/sort.h>
-#include "ctree.h"
-#include "ref-cache.h"
-#include "transaction.h"
-
-static struct rb_node *tree_insert(struct rb_root *root, u64 bytenr,
-				   struct rb_node *node)
-{
-	struct rb_node **p = &root->rb_node;
-	struct rb_node *parent = NULL;
-	struct btrfs_leaf_ref *entry;
-
-	while (*p) {
-		parent = *p;
-		entry = rb_entry(parent, struct btrfs_leaf_ref, rb_node);
-
-		if (bytenr < entry->bytenr)
-			p = &(*p)->rb_left;
-		else if (bytenr > entry->bytenr)
-			p = &(*p)->rb_right;
-		else
-			return parent;
-	}
-
-	entry = rb_entry(node, struct btrfs_leaf_ref, rb_node);
-	rb_link_node(node, parent, p);
-	rb_insert_color(node, root);
-	return NULL;
-}
-
-static struct rb_node *tree_search(struct rb_root *root, u64 bytenr)
-{
-	struct rb_node *n = root->rb_node;
-	struct btrfs_leaf_ref *entry;
-
-	while (n) {
-		entry = rb_entry(n, struct btrfs_leaf_ref, rb_node);
-		WARN_ON(!entry->in_tree);
-
-		if (bytenr < entry->bytenr)
-			n = n->rb_left;
-		else if (bytenr > entry->bytenr)
-			n = n->rb_right;
-		else
-			return n;
-	}
-	return NULL;
-}
diff --git a/fs/btrfs/ref-cache.h b/fs/btrfs/ref-cache.h
deleted file mode 100644
index 24f7001..000
--- a/fs/btrfs/ref-cache.h
+++ /dev/null
@@ -1,52 +0,0 @@
-/*
- * Copyright (C) 2008 Oracle.  All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-#ifndef __REFCACHE__
-#define __REFCACHE__
-
-struct btrfs_extent_info {
-   /* bytenr and num_bytes find the extent in the extent allocation tree */
-   u64 bytenr;
-   u64 num_bytes;
-
-   /* objectid and offset find the back reference for the file */
-   u64 objectid;
-   u64 offset;
-};
-
-struct btrfs_leaf_ref {
-   struct rb_node rb_node;
-   struct btrfs_leaf_ref_tree *tree;
-   int in_tree;
-   atomic_t usage;
-
-   u64 root_gen;
-   u64 bytenr;
-   u64 owner;
-   u64 generation;
-   int nritems;
-
-   struct list_head list;
-   struct btrfs_extent_info extents[];
-};
-
-static inline size_t btrfs_leaf_ref_size(int nr_extents)
-{
-   return sizeof(struct btrfs_leaf_ref) +
-  sizeof(struct btrfs_extent_info) * nr_extents;
-}
-#endif
-- 
1.7.3.1


[PATCH 08/16] Btrfs: make acl functions really no-op if acl is not enabled

2011-07-13 Thread Li Zefan
So there's no overhead for something we don't use.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/Makefile |4 +++-
 fs/btrfs/acl.c|   17 -
 fs/btrfs/ctree.h  |   15 ---
 3 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 9b72dcf..40e6ac0 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -6,5 +6,7 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o 
root-tree.o dir-item.o \
   transaction.o inode.o file.o tree-defrag.o \
   extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
   extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
-  export.o tree-log.o acl.o free-space-cache.o zlib.o lzo.o \
+  export.o tree-log.o free-space-cache.o zlib.o lzo.o \
   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o
+
+btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
index f66fc99..b206d4c 100644
--- a/fs/btrfs/acl.c
+++ b/fs/btrfs/acl.c
@@ -28,8 +28,6 @@
 #include "btrfs_inode.h"
 #include "xattr.h"
 
-#ifdef CONFIG_BTRFS_FS_POSIX_ACL
-
 static struct posix_acl *btrfs_get_acl(struct inode *inode, int type)
 {
int size;
@@ -318,18 +316,3 @@ const struct xattr_handler btrfs_xattr_acl_access_handler 
= {
.get= btrfs_xattr_acl_get,
.set= btrfs_xattr_acl_set,
 };
-
-#else /* CONFIG_BTRFS_FS_POSIX_ACL */
-
-int btrfs_acl_chmod(struct inode *inode)
-{
-   return 0;
-}
-
-int btrfs_init_acl(struct btrfs_trans_handle *trans,
-  struct inode *inode, struct inode *dir)
-{
-   return 0;
-}
-
-#endif /* CONFIG_BTRFS_FS_POSIX_ACL */
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 60e13ef..b5097e2 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2644,12 +2644,21 @@ do {
\
 /* acl.c */
 #ifdef CONFIG_BTRFS_FS_POSIX_ACL
 int btrfs_check_acl(struct inode *inode, int mask, unsigned int flags);
-#else
-#define btrfs_check_acl NULL
-#endif
 int btrfs_init_acl(struct btrfs_trans_handle *trans,
   struct inode *inode, struct inode *dir);
 int btrfs_acl_chmod(struct inode *inode);
+#else
+#define btrfs_check_acl NULL
+static inline int btrfs_init_acl(struct btrfs_trans_handle *trans,
+struct inode *inode, struct inode *dir)
+{
+   return 0;
+}
+static inline int btrfs_acl_chmod(struct inode *inode)
+{
+   return 0;
+}
+#endif
 
 /* relocation.c */
 int btrfs_relocate_block_group(struct btrfs_root *root, u64 group_start);
-- 
1.7.3.1
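
The pattern used here (real prototypes when the option is on, `static inline` stubs in the header when it is off) can be sketched in plain C as follows. `TOY_FEATURE_ACL`, `toy_init_acl()`, `toy_acl_chmod()` and `toy_create_inode()` are hypothetical names standing in for the config symbol and the btrfs functions; the sketch assumes the option is left undefined.

```c
#include <assert.h>

/* TOY_FEATURE_ACL is a hypothetical switch standing in for the config option. */
#ifdef TOY_FEATURE_ACL
/* Feature enabled: the real implementations live in their own source file. */
int toy_init_acl(int inode_id);
int toy_acl_chmod(int inode_id);
#else
/* Feature disabled: the calls compile down to a constant, no object file needed. */
static inline int toy_init_acl(int inode_id)
{
	(void)inode_id;
	return 0;
}

static inline int toy_acl_chmod(int inode_id)
{
	(void)inode_id;
	return 0;
}
#endif

/* Caller code is identical in both configurations. */
static int toy_create_inode(int inode_id)
{
	int ret = toy_init_acl(inode_id);

	if (ret)
		return ret;
	return toy_acl_chmod(inode_id);
}
```

Because the stubs are visible to the compiler at every call site, the disabled case costs no function call, which is the overhead the commit message refers to.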


[PATCH 09/16] Btrfs: remove redundant code for dir item lookup

2011-07-13 Thread Li Zefan
When we search a dir item with a specific hash code, we can
just return NULL without further checking if btrfs_search_slot()
returns 1.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/dir-item.c |   30 ++
 1 files changed, 2 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c
index 685f259..81533a0 100644
--- a/fs/btrfs/dir-item.c
+++ b/fs/btrfs/dir-item.c
@@ -203,8 +203,6 @@ struct btrfs_dir_item *btrfs_lookup_dir_item(struct btrfs_trans_handle *trans,
 	struct btrfs_key key;
 	int ins_len = mod < 0 ? -1 : 0;
 	int cow = mod != 0;
-	struct btrfs_key found_key;
-	struct extent_buffer *leaf;
 
 	key.objectid = dir;
 	btrfs_set_key_type(&key, BTRFS_DIR_ITEM_KEY);
@@ -214,18 +212,7 @@ struct btrfs_dir_item *btrfs_lookup_dir_item(struct btrfs_trans_handle *trans,
 	ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
 	if (ret < 0)
 		return ERR_PTR(ret);
-	if (ret > 0) {
-		if (path->slots[0] == 0)
-			return NULL;
-		path->slots[0]--;
-	}
-
-	leaf = path->nodes[0];
-	btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
-
-	if (found_key.objectid != dir ||
-	    btrfs_key_type(&found_key) != BTRFS_DIR_ITEM_KEY ||
-	    found_key.offset != key.offset)
+	if (ret > 0)
 		return NULL;
 
 	return btrfs_match_dir_item_name(root, path, name, name_len);
@@ -320,8 +307,6 @@ struct btrfs_dir_item *btrfs_lookup_xattr(struct btrfs_trans_handle *trans,
 	struct btrfs_key key;
 	int ins_len = mod < 0 ? -1 : 0;
 	int cow = mod != 0;
-	struct btrfs_key found_key;
-	struct extent_buffer *leaf;
 
 	key.objectid = dir;
 	btrfs_set_key_type(&key, BTRFS_XATTR_ITEM_KEY);
@@ -329,18 +314,7 @@ struct btrfs_dir_item *btrfs_lookup_xattr(struct btrfs_trans_handle *trans,
 	ret = btrfs_search_slot(trans, root, &key, path, ins_len, cow);
 	if (ret < 0)
 		return ERR_PTR(ret);
-	if (ret > 0) {
-		if (path->slots[0] == 0)
-			return NULL;
-		path->slots[0]--;
-	}
-
-	leaf = path->nodes[0];
-	btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
-
-	if (found_key.objectid != dir ||
-	    btrfs_key_type(&found_key) != BTRFS_XATTR_ITEM_KEY ||
-	    found_key.offset != key.offset)
+	if (ret > 0)
 		return NULL;
 
 	return btrfs_match_dir_item_name(root, path, name, name_len);
-- 
1.7.3.1


[PATCH 10/16] Btrfs: clean up search_extent_mapping()

2011-07-13 Thread Li Zefan
rb_node returned by __tree_search() can be a valid pointer or NULL,
but won't be some errno.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_map.c |   17 +++--
 1 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 2d04103..911a9db 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -379,23 +379,12 @@ struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 		em = rb_entry(next, struct extent_map, rb_node);
 		goto found;
 	}
-	if (!rb_node) {
-		em = NULL;
-		goto out;
-	}
-	if (IS_ERR(rb_node)) {
-		em = ERR_CAST(rb_node);
-		goto out;
-	}
-	em = rb_entry(rb_node, struct extent_map, rb_node);
-	goto found;
-
-	em = NULL;
-	goto out;
+	if (!rb_node)
+		return NULL;
 
+	em = rb_entry(rb_node, struct extent_map, rb_node);
 found:
 	atomic_inc(&em->refs);
-out:
 	return em;
 }
 
-- 
1.7.3.1


[PATCH 11/16] Btrfs: clean up code for extent_map lookup

2011-07-13 Thread Li Zefan
lookup_extent_mapping() and search_extent_mapping() can share most of
their code.

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_map.c |   85 +
 1 files changed, 29 insertions(+), 56 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 911a9db..df7a803 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -299,19 +299,8 @@ static u64 range_end(u64 start, u64 len)
return start + len;
 }
 
-/**
- * lookup_extent_mapping - lookup extent_map
- * @tree:  tree to lookup in
- * @start: byte offset to start the search
- * @len:   length of the lookup range
- *
- * Find and return the first extent_map struct in @tree that intersects the
- * [start, len] range.  There may be additional objects in the tree that
- * intersect, so check the object returned carefully to make sure that no
- * additional lookups are needed.
- */
-struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree,
-u64 start, u64 len)
+struct extent_map *__lookup_extent_mapping(struct extent_map_tree *tree,
+  u64 start, u64 len, int strict)
 {
struct extent_map *em;
struct rb_node *rb_node;
@@ -320,38 +309,42 @@ struct extent_map *lookup_extent_mapping(struct 
extent_map_tree *tree,
u64 end = range_end(start, len);
 
rb_node = __tree_search(tree-map, start, prev, next);
-   if (!rb_node  prev) {
-   em = rb_entry(prev, struct extent_map, rb_node);
-   if (end  em-start  start  extent_map_end(em))
-   goto found;
-   }
-   if (!rb_node  next) {
-   em = rb_entry(next, struct extent_map, rb_node);
-   if (end  em-start  start  extent_map_end(em))
-   goto found;
-   }
if (!rb_node) {
-   em = NULL;
-   goto out;
-   }
-   if (IS_ERR(rb_node)) {
-   em = ERR_CAST(rb_node);
-   goto out;
+   if (prev)
+   rb_node = prev;
+   else if (next)
+   rb_node = next;
+   else
+   return NULL;
}
+
em = rb_entry(rb_node, struct extent_map, rb_node);
-   if (end  em-start  start  extent_map_end(em))
-   goto found;
 
-   em = NULL;
-   goto out;
+   if (strict  !(end  em-start  start  extent_map_end(em)))
+   return NULL;
 
-found:
atomic_inc(em-refs);
-out:
return em;
 }
 
 /**
+ * lookup_extent_mapping - lookup extent_map
+ * @tree:  tree to lookup in
+ * @start: byte offset to start the search
+ * @len:   length of the lookup range
+ *
+ * Find and return the first extent_map struct in @tree that intersects the
+ * [start, len] range.  There may be additional objects in the tree that
+ * intersect, so check the object returned carefully to make sure that no
+ * additional lookups are needed.
+ */
+struct extent_map *lookup_extent_mapping(struct extent_map_tree *tree,
+u64 start, u64 len)
+{
+   return __lookup_extent_mapping(tree, start, len, 1);
+}
+
+/**
  * search_extent_mapping - find a nearby extent map
  * @tree:  tree to lookup in
  * @start: byte offset to start the search
@@ -365,27 +358,7 @@ out:
 struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 u64 start, u64 len)
 {
-   struct extent_map *em;
-   struct rb_node *rb_node;
-   struct rb_node *prev = NULL;
-   struct rb_node *next = NULL;
-
-   rb_node = __tree_search(&tree->map, start, &prev, &next);
-   if (!rb_node && prev) {
-   em = rb_entry(prev, struct extent_map, rb_node);
-   goto found;
-   }
-   if (!rb_node && next) {
-   em = rb_entry(next, struct extent_map, rb_node);
-   goto found;
-   }
-   if (!rb_node)
-   return NULL;
-
-   em = rb_entry(rb_node, struct extent_map, rb_node);
-found:
-   atomic_inc(&em->refs);
-   return em;
+   return __lookup_extent_mapping(tree, start, len, 0);
 }
 
 /**
-- 
1.7.3.1


[PATCH 12/16] Btrfs: clean up code for merging extent maps

2011-07-13 Thread Li Zefan
unpin_extent_cache() and add_extent_mapping() share the same code
that merges extent maps.
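
For readers without the tree at hand, the shared merge step can be sketched in userspace roughly as below. This is only an illustration under stated assumptions: struct toy_map, toy_mergeable(), toy_unlink() and toy_try_merge() are hypothetical names, a doubly linked list stands in for the rb-tree, and the adjacency test is deliberately simpler than the real mergable_maps().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct extent_map. */
struct toy_map {
	uint64_t start;        /* logical start offset */
	uint64_t len;          /* logical length */
	uint64_t block_start;  /* on-disk start */
	struct toy_map *prev;  /* neighbours in offset order */
	struct toy_map *next;
	int in_tree;           /* 1 while linked into the list */
};

/* Assumption: maps merge when adjacent both logically and on disk. */
static int toy_mergeable(const struct toy_map *a, const struct toy_map *b)
{
	return a->start + a->len == b->start &&
	       a->block_start + a->len == b->block_start;
}

/* Remove a map from the list and mark it as no longer tracked. */
static void toy_unlink(struct toy_map *m)
{
	if (m->prev)
		m->prev->next = m->next;
	if (m->next)
		m->next->prev = m->prev;
	m->prev = NULL;
	m->next = NULL;
	m->in_tree = 0;
}

/* Shared helper: absorb the previous and the next neighbour into em. */
static void toy_try_merge(struct toy_map *em)
{
	struct toy_map *merge;

	merge = em->prev;
	if (merge && toy_mergeable(merge, em)) {
		em->start = merge->start;
		em->len += merge->len;
		em->block_start = merge->block_start;
		toy_unlink(merge);
	}

	merge = em->next;
	if (merge && toy_mergeable(em, merge)) {
		em->len += merge->len;
		toy_unlink(merge);
	}
}
```

Both callers then reduce to "do the caller-specific work, then call the helper once", which is the shape the diff below gives unpin_extent_cache() and add_extent_mapping().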

Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_map.c |   59 +---
 1 files changed, 21 insertions(+), 38 deletions(-)

diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index df7a803..7c97b33 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -183,22 +183,10 @@ static int mergable_maps(struct extent_map *prev, struct extent_map *next)
return 0;
 }
 
-int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len)
+static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
 {
-   int ret = 0;
struct extent_map *merge = NULL;
struct rb_node *rb;
-   struct extent_map *em;
-
-   write_lock(&tree->lock);
-   em = lookup_extent_mapping(tree, start, len);
-
-   WARN_ON(!em || em->start != start);
-
-   if (!em)
-   goto out;
-
-   clear_bit(EXTENT_FLAG_PINNED, &em->flags);
 
if (em->start != 0) {
rb = rb_prev(&em->rb_node);
@@ -225,6 +213,24 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len)
merge->in_tree = 0;
free_extent_map(merge);
}
+}
+
+int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len)
+{
+   int ret = 0;
+   struct extent_map *em;
+
+   write_lock(&tree->lock);
+   em = lookup_extent_mapping(tree, start, len);
+
+   WARN_ON(!em || em->start != start);
+
+   if (!em)
+   goto out;
+
+   clear_bit(EXTENT_FLAG_PINNED, &em->flags);
+
+   try_merge_map(tree, em);
 
free_extent_map(em);
 out:
@@ -247,7 +253,6 @@ int add_extent_mapping(struct extent_map_tree *tree,
   struct extent_map *em)
 {
int ret = 0;
-   struct extent_map *merge = NULL;
struct rb_node *rb;
struct extent_map *exist;
 
@@ -263,30 +268,8 @@ int add_extent_mapping(struct extent_map_tree *tree,
goto out;
}
atomic_inc(&em->refs);
-   if (em->start != 0) {
-   rb = rb_prev(&em->rb_node);
-   if (rb)
-   merge = rb_entry(rb, struct extent_map, rb_node);
-   if (rb && mergable_maps(merge, em)) {
-   em->start = merge->start;
-   em->len += merge->len;
-   em->block_len += merge->block_len;
-   em->block_start = merge->block_start;
-   merge->in_tree = 0;
-   rb_erase(&merge->rb_node, &tree->map);
-   free_extent_map(merge);
-   }
-   }
-   rb = rb_next(&em->rb_node);
-   if (rb)
-   merge = rb_entry(rb, struct extent_map, rb_node);
-   if (rb && mergable_maps(em, merge)) {
-   em->len += merge->len;
-   em->block_len += merge->len;
-   rb_erase(&merge->rb_node, &tree->map);
-   merge->in_tree = 0;
-   free_extent_map(merge);
-   }
+
+   try_merge_map(tree, em);
 out:
return ret;
 }
-- 
1.7.3.1


[PATCH 13/16] Btrfs: remove unused members from struct extent_state

2011-07-13 Thread Li Zefan
From: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

These members are not used at all.

Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_io.h |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index a11a92e..d04ca37 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -108,8 +108,6 @@ struct extent_state {
wait_queue_head_t wq;
atomic_t refs;
unsigned long state;
-   u64 split_start;
-   u64 split_end;
 
/* for use by the FS */
u64 private;
-- 
1.7.3.1


[PATCH 14/16] Btrfs: clean up for insert_state()

2011-07-13 Thread Li Zefan
From: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

Don't duplicate set_state_bits().
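
As a rough userspace illustration of the idea (not the kernel code itself): the insert path stops open-coding the dirty-byte accounting and the bit update, and calls the one helper that already does both. All toy_* names and the TOY_* bit values below are hypothetical, and the callback hook is left out.

```c
#include <stdint.h>

#define TOY_DIRTY    (1u << 0)
#define TOY_CTLBITS  (1u << 31)  /* hypothetical control bit, never stored */

struct toy_tree {
	uint64_t dirty_bytes;    /* bytes currently marked dirty */
};

struct toy_state {
	uint64_t start;
	uint64_t end;            /* inclusive */
	unsigned state;          /* bit flags */
};

/* The single place that accounts for dirty bytes and sets the bits. */
static int toy_set_state_bits(struct toy_tree *tree, struct toy_state *st,
			      unsigned bits)
{
	unsigned bits_to_set = bits & ~TOY_CTLBITS;

	/* Count the range only on the clean -> dirty transition. */
	if ((bits_to_set & TOY_DIRTY) && !(st->state & TOY_DIRTY))
		tree->dirty_bytes += st->end - st->start + 1;
	st->state |= bits_to_set;
	return 0;
}

/* Insert path: fill in the range, then reuse the helper above. */
static int toy_insert_state(struct toy_tree *tree, struct toy_state *st,
			    uint64_t start, uint64_t end, unsigned bits)
{
	if (end < start)
		return -1;
	st->start = start;
	st->end = end;
	return toy_set_state_bits(tree, st, bits);
}
```

Keeping the accounting in one function means a later change to how dirty bytes are counted only has to be made once.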

Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index b181a94..6c7394f 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -308,6 +308,9 @@ static void clear_state_cb(struct extent_io_tree *tree,
tree->ops->clear_bit_hook(tree->mapping->host, state, bits);
 }
 
+static int set_state_bits(struct extent_io_tree *tree,
+ struct extent_state *state, int *bits);
+
 /*
  * insert an extent_state struct into the tree.  'bits' are set on the
  * struct before it is inserted.
@@ -323,7 +326,6 @@ static int insert_state(struct extent_io_tree *tree,
int *bits)
 {
struct rb_node *node;
-   int bits_to_set = *bits & ~EXTENT_CTLBITS;
int ret;
 
if (end < start) {
@@ -334,13 +336,11 @@ static int insert_state(struct extent_io_tree *tree,
}
state->start = start;
state->end = end;
-   ret = set_state_cb(tree, state, bits);
+
+   ret = set_state_bits(tree, state, bits);
if (ret)
return ret;
 
-   if (bits_to_set & EXTENT_DIRTY)
-   tree->dirty_bytes += end - start + 1;
-   state->state |= bits_to_set;
node = tree_insert(&tree->state, end, &state->rb_node);
if (node) {
struct extent_state *found;
-- 
1.7.3.1


[PATCH 15/16] Btrfs: clean up for wait_extent_bit()

2011-07-13 Thread Li Zefan
From: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

We can just use cond_resched_lock().
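
The pattern being folded away is "if a reschedule is pending, drop the lock, yield, retake the lock". A single-threaded userspace sketch of that pattern follows; struct toy_lock, toy_need_resched and the toy_* functions are hypothetical stand-ins for spinlock_t, need_resched() and the kernel helper, and the toy lock does no real locking.

```c
#include <sched.h>

/* Hypothetical stand-in for a spinlock; it only records its state. */
struct toy_lock {
	int held;   /* 1 while the lock is "held" */
	int drops;  /* how many times it was dropped and retaken */
};

/* Set by the caller in this sketch, not by a scheduler. */
static int toy_need_resched;

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/*
 * Drop the lock, yield and retake it, but only when a reschedule is
 * pending.  Returns 1 if the lock was dropped, 0 otherwise.
 */
static int toy_cond_resched_lock(struct toy_lock *l)
{
	if (!toy_need_resched)
		return 0;
	toy_lock_release(l);
	sched_yield();
	toy_need_resched = 0;
	toy_lock_acquire(l);
	l->drops++;
	return 1;
}

/* Walk nr items under the lock, offering to yield after each one. */
static int toy_walk(struct toy_lock *l, int nr)
{
	int i, visited = 0;

	toy_lock_acquire(l);
	for (i = 0; i < nr; i++) {
		visited++;
		toy_cond_resched_lock(l);
	}
	toy_lock_release(l);
	return visited;
}
```

As with the open-coded version, anything read under the lock before the call may be stale afterwards whenever the helper reports that it dropped the lock.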

Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c |6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6c7394f..1959a63 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -658,11 +658,7 @@ again:
if (start > end)
break;
 
-   if (need_resched()) {
-   spin_unlock(&tree->lock);
-   cond_resched();
-   spin_lock(&tree->lock);
-   }
+   cond_resched_lock(&tree->lock);
}
 out:
spin_unlock(&tree->lock);
-- 
1.7.3.1


[PATCH 16/16] Btrfs: clean up for find_first_extent_bit()

2011-07-13 Thread Li Zefan
From: Xiao Guangrong xiaoguangr...@cn.fujitsu.com

find_first_extent_bit() and find_first_extent_bit_state() share
most of the code, and we can just make the former call the latter.
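
The resulting shape is a thin locked wrapper around a helper that expects the lock to be held. A userspace sketch, assuming a sorted array in place of the rb-tree and an int flag in place of the spinlock; the toy_* names are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

struct toy_state {
	uint64_t start;
	uint64_t end;    /* inclusive */
	unsigned bits;
};

struct toy_tree {
	const struct toy_state *states;  /* sorted by offset, non-overlapping */
	size_t nr;
	int locked;                      /* stands in for the tree lock */
};

/*
 * Caller must hold the lock.  Returns the first state that ends at or
 * after 'start' and has any of 'bits' set, or NULL if there is none.
 */
static const struct toy_state *
toy_find_first_state(const struct toy_tree *t, uint64_t start, unsigned bits)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		const struct toy_state *s = &t->states[i];

		if (s->end >= start && (s->bits & bits))
			return s;
	}
	return NULL;
}

/* Locked wrapper: returns 0 and fills the outputs on a hit, 1 otherwise. */
static int toy_find_first_bit(struct toy_tree *t, uint64_t start,
			      uint64_t *start_ret, uint64_t *end_ret,
			      unsigned bits)
{
	const struct toy_state *s;
	int ret = 1;

	t->locked = 1;
	s = toy_find_first_state(t, start, bits);
	if (s) {
		*start_ret = s->start;
		*end_ret = s->end;
		ret = 0;
	}
	t->locked = 0;
	return ret;
}
```

The search loop lives in one place, and the wrapper only adds locking and the translation from a pointer result to the 0/1 return convention.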

Signed-off-by: Xiao Guangrong xiaoguangr...@cn.fujitsu.com
Signed-off-by: Li Zefan l...@cn.fujitsu.com
---
 fs/btrfs/extent_io.c |   64 ++---
 1 files changed, 24 insertions(+), 40 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1959a63..a31ffa5 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1055,46 +1055,6 @@ static int set_range_writeback(struct extent_io_tree *tree, u64 start, u64 end)
return 0;
 }
 
-/*
- * find the first offset in the io tree with 'bits' set. zero is
- * returned if we find something, and *start_ret and *end_ret are
- * set to reflect the state struct that was found.
- *
- * If nothing was found, 1 is returned, < 0 on error
- */
-int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
- u64 *start_ret, u64 *end_ret, int bits)
-{
-   struct rb_node *node;
-   struct extent_state *state;
-   int ret = 1;
-
-   spin_lock(&tree->lock);
-   /*
-    * this search will find all the extents that end after
-    * our range starts.
-    */
-   node = tree_search(tree, start);
-   if (!node)
-   goto out;
-
-   while (1) {
-   state = rb_entry(node, struct extent_state, rb_node);
-   if (state->end >= start && (state->state & bits)) {
-   *start_ret = state->start;
-   *end_ret = state->end;
-   ret = 0;
-   break;
-   }
-   node = rb_next(node);
-   if (!node)
-   break;
-   }
-out:
-   spin_unlock(&tree->lock);
-   return ret;
-}
-
 /* find the first state struct with 'bits' set after 'start', and
  * return it.  tree->lock must be held.  NULL will returned if
  * nothing was found after 'start'
@@ -1127,6 +1087,30 @@ out:
 }
 
 /*
+ * find the first offset in the io tree with 'bits' set. zero is
+ * returned if we find something, and *start_ret and *end_ret are
+ * set to reflect the state struct that was found.
+ *
+ * If nothing was found, 1 is returned, < 0 on error
+ */
+int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
+ u64 *start_ret, u64 *end_ret, int bits)
+{
+   struct extent_state *state;
+   int ret = 1;
+
+   spin_lock(&tree->lock);
+   state = find_first_extent_bit_state(tree, start, bits);
+   if (state) {
+   *start_ret = state->start;
+   *end_ret = state->end;
+   ret = 0;
+   }
+   spin_unlock(&tree->lock);
+   return ret;
+}
+
+/*
  * find a contiguous range of bytes in the file marked as delalloc, not
  * more than 'max_bytes'.  start and end are used to return the range,
  *
-- 
1.7.3.1


Re: Mis-Design of Btrfs?

2011-07-13 Thread NeilBrown
On Wed, 29 Jun 2011 10:29:53 +0100 Ric Wheeler rwhee...@redhat.com wrote:

 On 06/27/2011 07:46 AM, NeilBrown wrote:
  On Thu, 23 Jun 2011 12:53:37 +0200 Nico Schottelius
  nico-lkml-20110...@schottelius.org  wrote:
 
  Good morning devs,
 
  I'm wondering whether the raid- and volume-management-builtin of btrfs is
  actually a sane idea or not.
  Currently we do have md/device-mapper support for raid
  already, btrfs lacks raid5 support and re-implements stuff that
  has already been done.
 
  I'm aware of the fact that it is very useful to know on which devices
  we are in a filesystem. But I'm wondering, whether it wouldn't be
  smarter to generalise the information exposure through the VFS layer
  instead of replicating functionality:
 
  Physical:   USB-HD   SSD   USB-Flash  | Exposes information to
  Raid:   Raid1, Raid5, Raid10, etc.| higher levels
  Crypto: Luks  |
  LVM:Groups/Volumes|
  FS: xfs/jfs/reiser/ext3   v
 
  Thus a filesystem like ext3 could be aware that it is running
  on a USB HD, enable -o sync by default or have the filesystem
  to rewrite blocks when running on crypto or optimise for an SSD, ...
  I would certainly agree that exposing information to higher levels is a good
  idea.  To some extent we do.  But it isn't always as easy as it might sound.
  Choosing exactly what information to expose is the challenge.  If you lack
  sufficient foresight you might expose something which turns out to be
  very specific to just one device, so all those upper levels which make use of
  the information find they are really special-casing one specific device,
  which isn't a good idea.
 
 
  However it doesn't follow that RAID5 should not be implemented in BTRFS.
  The levels that you have drawn are just one perspective.  While that has
  value, it may not be universal.
  I could easily argue that the LVM layer is a mistake and that filesystems
  should provide that functionality directly.
  I could almost argue the same for crypto.
  RAID1 can make a lot of sense to be tightly integrated with the FS.
  RAID5 ... I'm less convinced, but then I have a vested interest there so that
  isn't an objective assessment.
 
  Part of the way Linux works is that s/he who writes the code gets to make
  the design decisions.   The BTRFS developers might create something truly
  awesome, or might end up having to support a RAID feature that they
  subsequently think is a bad idea.  But it really is their decision to make.
 
  NeilBrown
 
 
 One more thing to add here is that I think that we still have a chance to
 increase the sharing between btrfs and the MD stack if we can get those
 changes made. No one likes to duplicate code, but we will need a richer
 interface between the block and file system layer to help close that gap.
 
 Ric
 

I'm certainly open to suggestions and collaboration.  Do you have in mind any
particular way to make the interface richer??

NeilBrown