[PATCH 00/10] several fixes and cleanups

2012-03-27 Thread Liu Bo
This patchset consists of one bug fix for chunk allocation,
six bug fixes for autodefrag, and some other cleanups.

I've tested it with xfstests with the autodefrag mount option enabled.

Liu Bo (10):
  Btrfs: show useful info in space reservation tracepoint
  Btrfs: fix deadlock during allocating chunks
  Btrfs: fix race between direct io and autodefrag
  Btrfs: fix the mismatch of page-mapping
  Btrfs: fix recursive defragment with autodefrag option
  Btrfs: add a check to decide if we should defrag the range
  Btrfs: do not bother to defrag an extent if it is a big real extent
  Btrfs: update to the right index of defragment
  Btrfs: use PagePrivate2 to check ordered data
  Btrfs: drop cache with VACANCY em when we fail to start a transaction

 fs/btrfs/extent-tree.c |   79 --
 fs/btrfs/inode-map.c   |    6 +--
 fs/btrfs/inode.c       |   59 ---
 fs/btrfs/ioctl.c       |   89 +++-
 fs/btrfs/transaction.c |    3 +-
 5 files changed, 151 insertions(+), 85 deletions(-)

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 01/10] Btrfs: show useful info in space reservation tracepoint

2012-03-27 Thread Liu Bo
o For space info, the type of the space info is useful for debugging.
o For a transaction handle, its transid is useful.
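
As a rough illustration of why the flags read better in a trace than a raw
pointer: the type bits can be decoded straight into a name. This is a
user-space sketch, not kernel code. The three bit values mirror
BTRFS_BLOCK_GROUP_DATA, _SYSTEM and _METADATA; toy_space_info_type() is a
made-up helper for this note only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Type bits, mirroring BTRFS_BLOCK_GROUP_{DATA,SYSTEM,METADATA}. */
#define TOY_BG_DATA     (1ULL << 0)
#define TOY_BG_SYSTEM   (1ULL << 1)
#define TOY_BG_METADATA (1ULL << 2)

/* Hypothetical helper: map space_info flags to a printable type name. */
static const char *toy_space_info_type(uint64_t flags)
{
	/* Drop profile bits (RAID levels, DUP) and keep only the type. */
	uint64_t type = flags & (TOY_BG_DATA | TOY_BG_SYSTEM | TOY_BG_METADATA);

	if (type == TOY_BG_DATA)
		return "data";
	if (type == TOY_BG_SYSTEM)
		return "system";
	if (type == TOY_BG_METADATA)
		return "metadata";
	if (type == (TOY_BG_DATA | TOY_BG_METADATA))
		return "mixed";
	return "unknown";
}
```

For example, a traced flags value of 0x14 (the metadata bit plus the RAID1
profile bit) decodes to "metadata", which a pointer value could never tell us.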

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c |   29 ++---
 fs/btrfs/inode-map.c   |6 ++
 fs/btrfs/transaction.c |3 +--
 3 files changed, 13 insertions(+), 25 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 37e0a80..f3d367a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3312,8 +3312,7 @@ commit_trans:
 	}
 	data_sinfo->bytes_may_use += bytes;
 	trace_btrfs_space_reservation(root->fs_info, "space_info",
-				      (u64)(unsigned long)data_sinfo,
-				      bytes, 1);
+				      data_sinfo->flags, bytes, 1);
 	spin_unlock(&data_sinfo->lock);
 
 	return 0;
@@ -3334,8 +3333,7 @@ void btrfs_free_reserved_data_space(struct inode *inode, u64 bytes)
 	spin_lock(&data_sinfo->lock);
 	data_sinfo->bytes_may_use -= bytes;
 	trace_btrfs_space_reservation(root->fs_info, "space_info",
-				      (u64)(unsigned long)data_sinfo,
-				      bytes, 0);
+				      data_sinfo->flags, bytes, 0);
 	spin_unlock(&data_sinfo->lock);
 }
 
@@ -3700,9 +3698,7 @@ again:
 		if (used + orig_bytes <= space_info->total_bytes) {
 			space_info->bytes_may_use += orig_bytes;
 			trace_btrfs_space_reservation(root->fs_info,
-					      "space_info",
-					      (u64)(unsigned long)space_info,
-					      orig_bytes, 1);
+				"space_info", space_info->flags, orig_bytes, 1);
 			ret = 0;
 		} else {
 			/*
@@ -3771,9 +3767,7 @@ again:
 		if (used + num_bytes < space_info->total_bytes + avail) {
 			space_info->bytes_may_use += orig_bytes;
 			trace_btrfs_space_reservation(root->fs_info,
-					      "space_info",
-					      (u64)(unsigned long)space_info,
-					      orig_bytes, 1);
+				"space_info", space_info->flags, orig_bytes, 1);
 			ret = 0;
 		} else {
 			wait_ordered = true;
@@ -3918,8 +3912,7 @@ static void block_rsv_release_bytes(struct btrfs_fs_info *fs_info,
 			spin_lock(&space_info->lock);
 			space_info->bytes_may_use -= num_bytes;
 			trace_btrfs_space_reservation(fs_info, "space_info",
-					      (u64)(unsigned long)space_info,
-					      num_bytes, 0);
+					space_info->flags, num_bytes, 0);
 			space_info->reservation_progress++;
 			spin_unlock(&space_info->lock);
 		}
@@ -4137,14 +4130,14 @@ static void update_global_block_rsv(struct btrfs_fs_info *fs_info)
 		block_rsv->reserved += num_bytes;
 		sinfo->bytes_may_use += num_bytes;
 		trace_btrfs_space_reservation(fs_info, "space_info",
-				      (u64)(unsigned long)sinfo, num_bytes, 1);
+				      sinfo->flags, num_bytes, 1);
 	}
 
 	if (block_rsv->reserved >= block_rsv->size) {
 		num_bytes = block_rsv->reserved - block_rsv->size;
 		sinfo->bytes_may_use -= num_bytes;
 		trace_btrfs_space_reservation(fs_info, "space_info",
-				      (u64)(unsigned long)sinfo, num_bytes, 0);
+				      sinfo->flags, num_bytes, 0);
 		sinfo->reservation_progress++;
 		block_rsv->reserved = block_rsv->size;
 		block_rsv->full = 1;
@@ -4198,8 +4191,7 @@ void btrfs_trans_release_metadata(struct btrfs_trans_handle *trans,
 		return;
 
 	trace_btrfs_space_reservation(root->fs_info, "transaction",
-				      (u64)(unsigned long)trans,
-				      trans->bytes_reserved, 0);
+				      trans->transid, trans->bytes_reserved, 0);
 	btrfs_block_rsv_release(root, trans->block_rsv, trans->bytes_reserved);
 	trans->bytes_reserved = 0;
 }
@@ -4716,9 +4708,8 @@ static int btrfs_update_reserved_bytes(struct btrfs_block_group_cache *cache,
 			space_info->bytes_reserved += num_bytes;
 			if (reserve == RESERVE_ALLOC) {
 				trace_btrfs_space_reservation(cache->fs_info,
-						      "space_info",
- 

[PATCH 02/10][RESEND] Btrfs: fix deadlock during allocating chunks

2012-03-27 Thread Liu Bo
This deadlock comes from xfstests 251.

We hold the chunk_mutex for the whole of a chunk allocation.  If we find
in the middle of it that the system chunk space is used up, we need to
allocate a new system chunk, which recurses into chunk allocation and ends
up deadlocking on chunk_mutex.
So instead we need to allocate the system chunk first when we find that
the system space would hit ENOSPC.
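
The locking problem described above can be modelled loosely in user space.
Everything below is hypothetical (struct toy_fs, the TOY_* constants, both
toy_do_chunk_alloc_* functions): a flag stands in for the non-recursive
chunk_mutex, and a second lock attempt returns an error instead of hanging.
The real call chain in the kernel is more involved than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_EDEADLK      (-35)		/* stand-in: lock taken twice */
#define TOY_SYSTEM_NEED  4096u		/* SYSTEM bytes one allocation uses */
#define TOY_SYSTEM_CHUNK 65536u		/* bytes a new SYSTEM chunk adds */

/* Hypothetical allocator state; not a kernel structure. */
struct toy_fs {
	bool chunk_mutex_held;	/* models the non-recursive chunk_mutex */
	uint64_t system_left;	/* free bytes left in SYSTEM space */
	int system_chunks;	/* SYSTEM chunks allocated so far */
	int other_chunks;	/* data/metadata chunks allocated so far */
};

/* Book-keeping for one new chunk; the caller must hold the mutex. */
static void toy_account(struct toy_fs *fs, bool system)
{
	assert(fs->chunk_mutex_held);
	if (system) {
		fs->system_left += TOY_SYSTEM_CHUNK;
		fs->system_chunks++;
	} else {
		fs->system_left -= TOY_SYSTEM_NEED;
		fs->other_chunks++;
	}
}

/* Naive: discovers the SYSTEM shortage mid-way and re-enters itself. */
static int toy_do_chunk_alloc_naive(struct toy_fs *fs, bool system)
{
	int ret = 0;

	if (fs->chunk_mutex_held)
		return TOY_EDEADLK;	/* a real mutex would hang here */
	fs->chunk_mutex_held = true;

	if (!system && fs->system_left < TOY_SYSTEM_NEED)
		ret = toy_do_chunk_alloc_naive(fs, true);
	if (ret == 0)
		toy_account(fs, system);

	fs->chunk_mutex_held = false;
	return ret;
}

/* Fixed: tops up SYSTEM space first, without re-entering the locked path. */
static int toy_do_chunk_alloc_fixed(struct toy_fs *fs, bool system)
{
	if (fs->chunk_mutex_held)
		return TOY_EDEADLK;
	fs->chunk_mutex_held = true;

	if (!system && fs->system_left < TOY_SYSTEM_NEED)
		toy_account(fs, true);
	toy_account(fs, system);

	fs->chunk_mutex_held = false;
	return 0;
}
```

Starting from an empty SYSTEM space, the naive variant reports the double
lock, while the fixed variant ends up with one SYSTEM chunk and one ordinary
chunk.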

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/extent-tree.c |   50 
 1 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index f3d367a..fe5bbc7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3394,6 +3394,50 @@ static int should_alloc_chunk(struct btrfs_root *root,
 	return 1;
 }
 
+static u64 get_system_chunk_thresh(struct btrfs_root *root, u64 type)
+{
+	u64 num_dev;
+
+	if (type & BTRFS_BLOCK_GROUP_RAID10 ||
+	    type & BTRFS_BLOCK_GROUP_RAID0)
+		num_dev = root->fs_info->fs_devices->rw_devices;
+	else if (type & BTRFS_BLOCK_GROUP_RAID1)
+		num_dev = 2;
+	else
+		num_dev = 1;	/* DUP or single */
+
+	/* metadata for updating devices and chunk tree */
+	return btrfs_calc_trans_metadata_size(root, num_dev + 1);
+}
+
+static void check_system_chunk(struct btrfs_trans_handle *trans,
+			       struct btrfs_root *root, u64 type)
+{
+	struct btrfs_space_info *info;
+	u64 left;
+	u64 thresh;
+
+	info = __find_space_info(root->fs_info, BTRFS_BLOCK_GROUP_SYSTEM);
+	spin_lock(&info->lock);
+	left = info->total_bytes - info->bytes_used - info->bytes_pinned -
+		info->bytes_reserved - info->bytes_readonly;
+	spin_unlock(&info->lock);
+
+	thresh = get_system_chunk_thresh(root, type);
+	if (left < thresh && btrfs_test_opt(root, ENOSPC_DEBUG)) {
+		printk(KERN_INFO "left=%llu, need=%llu, flags=%llu\n",
+		       left, thresh, type);
+		dump_space_info(info, 0, 0);
+	}
+
+	if (left < thresh) {
+		u64 flags;
+
+		flags = btrfs_get_alloc_profile(root->fs_info->chunk_root, 0);
+		btrfs_alloc_chunk(trans, root, flags);
+	}
+}
+
 static int do_chunk_alloc(struct btrfs_trans_handle *trans,
 			  struct btrfs_root *extent_root, u64 alloc_bytes,
 			  u64 flags, int force)
@@ -3466,6 +3510,12 @@ again:
 		force_metadata_allocation(fs_info);
 	}
 
+	/*
+	 * Check if we have enough space in SYSTEM chunk because we may need
+	 * to update devices.
+	 */
+	check_system_chunk(trans, extent_root, flags);
+
 	ret = btrfs_alloc_chunk(trans, extent_root, flags);
 	if (ret < 0 && ret != -ENOSPC)
 		goto out;
-- 
1.6.5.2



[PATCH 04/10] Btrfs: fix the mismatch of page-mapping

2012-03-27 Thread Liu Bo
commit 600a45e1d5e376f679ff9ecc4ce9452710a6d27c
(Btrfs: fix deadlock on page lock when doing auto-defragment)
fixes the deadlock on the page lock, but it also introduces another bug.

A page may have been truncated between the unlock and the re-lock,
so we need to look it up again to get the right one.

And since we now hold the i_mutex lock, the inode size cannot change and
we can drop the isize overflow checks.
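
The page count clamp that replaces the per-page EOF check can be sketched
like this. It is a stand-alone model assuming 4 KiB pages;
toy_clamp_page_count() and TOY_PAGE_SHIFT are names invented here, and unlike
the diff the sketch tests for an empty file before computing the last page
index.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/*
 * Hypothetical helper: how many of the num_pages requested pages, starting
 * at page index start_index, lie inside a file of isize bytes.
 */
static uint64_t toy_clamp_page_count(uint64_t isize, uint64_t start_index,
				     uint64_t num_pages)
{
	uint64_t file_end;	/* index of the last page of the file */
	uint64_t avail;		/* pages from start_index up to file_end */

	if (isize == 0)
		return 0;
	file_end = (isize - 1) >> TOY_PAGE_SHIFT;
	if (start_index > file_end)
		return 0;

	avail = file_end - start_index + 1;
	return num_pages < avail ? num_pages : avail;
}
```

With a 40960-byte file, a request for 16 pages from index 0 is clamped to 10,
and the same request from index 8 is clamped to 2. Because the size is stable
under i_mutex, this only has to be computed once per cluster.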

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |   35 +++
 1 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0acc828..81faa78 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -856,6 +856,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	u64 isize = i_size_read(inode);
 	u64 page_start;
 	u64 page_end;
+	u64 page_cnt;
 	int ret;
 	int i;
 	int i_done;
@@ -864,19 +865,21 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	struct extent_io_tree *tree;
 	gfp_t mask = btrfs_alloc_write_mask(inode->i_mapping);
 
-	if (isize == 0)
-		return 0;
 	file_end = (isize - 1) >> PAGE_CACHE_SHIFT;
+	if (!isize || start_index > file_end)
+		return 0;
+
+	page_cnt = min_t(u64, (u64)num_pages, (u64)file_end - start_index + 1);
 
 	ret = btrfs_delalloc_reserve_space(inode,
-					   num_pages << PAGE_CACHE_SHIFT);
+					   page_cnt << PAGE_CACHE_SHIFT);
 	if (ret)
 		return ret;
 	i_done = 0;
 	tree = &BTRFS_I(inode)->io_tree;
 
 	/* step one, lock all the pages */
-	for (i = 0; i < num_pages; i++) {
+	for (i = 0; i < page_cnt; i++) {
 		struct page *page;
 again:
 		page = find_or_create_page(inode->i_mapping,
@@ -898,6 +901,15 @@ again:
 			btrfs_start_ordered_extent(inode, ordered, 1);
 			btrfs_put_ordered_extent(ordered);
 			lock_page(page);
+			/*
+			 * we unlocked the page above, so we need check if
+			 * it was released or not.
+			 */
+			if (page->mapping != inode->i_mapping) {
+				unlock_page(page);
+				page_cache_release(page);
+				goto again;
+			}
 		}
 
 		if (!PageUptodate(page)) {
@@ -911,15 +923,6 @@ again:
 			}
 		}
 
-		isize = i_size_read(inode);
-		file_end = (isize - 1) >> PAGE_CACHE_SHIFT;
-		if (!isize || page->index > file_end) {
-			/* whoops, we blew past eof, skip this page */
-			unlock_page(page);
-			page_cache_release(page);
-			break;
-		}
-
 		if (page->mapping != inode->i_mapping) {
 			unlock_page(page);
 			page_cache_release(page);
@@ -953,12 +956,12 @@ again:
 			  EXTENT_DO_ACCOUNTING, 0, 0, &cached_state,
 			  GFP_NOFS);
 
-	if (i_done != num_pages) {
+	if (i_done != page_cnt) {
 		spin_lock(&BTRFS_I(inode)->lock);
 		BTRFS_I(inode)->outstanding_extents++;
 		spin_unlock(&BTRFS_I(inode)->lock);
 		btrfs_delalloc_release_space(inode,
-				     (num_pages - i_done) << PAGE_CACHE_SHIFT);
+				     (page_cnt - i_done) << PAGE_CACHE_SHIFT);
 	}
 
 
@@ -983,7 +986,7 @@ out:
 		unlock_page(pages[i]);
 		page_cache_release(pages[i]);
 	}
-	btrfs_delalloc_release_space(inode, num_pages << PAGE_CACHE_SHIFT);
+	btrfs_delalloc_release_space(inode, page_cnt << PAGE_CACHE_SHIFT);
 	return ret;
 
 }
-- 
1.6.5.2



[PATCH 06/10] Btrfs: add a check to decide if we should defrag the range

2012-03-27 Thread Liu Bo
If our file's layout is as follows:
| hole | data1 | hole | data2 |

we do not need to defrag this file, because this file has holes and
cannot be merged into one extent.
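
The layout argument above can be sketched as a small model. The extent kinds
and both helpers are hypothetical and only mirror the idea of the check: an
extent with no real (non-hole, non-inline) neighbour on either side has
nothing to merge with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical extent kinds for the model; not the kernel's extent_map. */
enum toy_extent_kind { TOY_HOLE, TOY_INLINE, TOY_REAL };

/* A neighbour is a merge candidate only if it exists and is a real extent. */
static bool toy_mergeable(const enum toy_extent_kind *layout, size_t n,
			  size_t idx)
{
	return idx < n && layout[idx] == TOY_REAL;
}

/* True when extent idx has no real extent directly before or after it. */
static bool toy_nothing_to_merge(const enum toy_extent_kind *layout, size_t n,
				 size_t idx)
{
	bool prev_ok, next_ok;

	assert(idx < n);
	prev_ok = idx > 0 && toy_mergeable(layout, n, idx - 1);
	next_ok = toy_mergeable(layout, n, idx + 1);
	return !prev_ok && !next_ok;
}
```

For the layout | hole | data1 | hole | data2 |, both data extents report
"nothing to merge", so defragging them would only rewrite data without
reducing the extent count.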

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |   36 +++-
 1 files changed, 35 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 81faa78..66a4933 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -769,6 +769,31 @@ none:
 	return -ENOENT;
 }
 
+/*
+ * Validity check of prev em and next em:
+ * 1) no prev/next em
+ * 2) prev/next em is a hole/inline extent
+ */
+static int check_adjacent_extents(struct inode *inode, struct extent_map *em)
+{
+	struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree;
+	struct extent_map *prev = NULL, *next = NULL;
+	int ret = 0;
+
+	read_lock(&em_tree->lock);
+	prev = lookup_extent_mapping(em_tree, em->start - 1, (u64)-1);
+	next = lookup_extent_mapping(em_tree, em->start + em->len, (u64)-1);
+	read_unlock(&em_tree->lock);
+
+	if ((!prev || prev->block_start >= EXTENT_MAP_LAST_BYTE) &&
+	    (!next || next->block_start >= EXTENT_MAP_LAST_BYTE))
+		ret = 1;
+	free_extent_map(prev);
+	free_extent_map(next);
+
+	return ret;
+}
+
 static int should_defrag_range(struct inode *inode, u64 start, u64 len,
 			       int thresh, u64 *last_len, u64 *skip,
 			       u64 *defrag_end)
@@ -806,8 +831,16 @@ static int should_defrag_range(struct inode *inode, u64 start, u64 len,
 	}
 
 	/* this will cover holes, and inline extents */
-	if (em->block_start >= EXTENT_MAP_LAST_BYTE)
+	if (em->block_start >= EXTENT_MAP_LAST_BYTE) {
+		ret = 0;
+		goto out;
+	}
+
+	/* If we have nothing to merge with us, just skip. */
+	if (check_adjacent_extents(inode, em)) {
 		ret = 0;
+		goto out;
+	}
 
 	/*
 	 * we hit a real extent, if it is big don't bother defragging it again
@@ -815,6 +848,7 @@ static int should_defrag_range(struct inode *inode, u64 start, u64 len,
 	if ((*last_len == 0 || *last_len >= thresh) && em->len >= thresh)
 		ret = 0;
 
+out:
 	/*
 	 * last_len ends up being a counter of how many bytes we've defragged.
 	 * every time we choose not to defrag an extent, we reset *last_len
-- 
1.6.5.2



[PATCH 05/10][RESEND] Btrfs: fix recursive defragment with autodefrag option

2012-03-27 Thread Liu Bo
Reproduce:
$ mkfs.btrfs disk
$ mount disk /mnt -o autodefrag
$ dd if=/dev/zero of=/mnt/foobar bs=4k count=10 2>/dev/null && sync
$ for i in `seq 9 -2 0`; do dd if=/dev/zero of=/mnt/foobar bs=4k count=1 \
  seek=$i conv=notrunc 2> /dev/null; done && sync

After this, foobar gets defragged again and again.
The same happens with -o autodefrag,compress.

Reasons:
When the cleaner kthread fetches inodes from the defrag tree and defrags
them, it dirties their pages and submits them.  This leads to another DATA COW,
in which the inode being processed is inserted into the defrag tree again.

This patch sets a rule for the COW code: only insert an inode when we are
really going to do some defragmentation.
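
The rule above boils down to a changed condition in the two COW paths. A
stand-alone sketch of the old and new predicates follows; the function names
are invented for this note, and the inclusive [start, end] byte range and the
16K/64K cutoffs are taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old rule: any small write that ends at or before the on-disk size. */
static bool toy_old_should_queue(uint64_t start, uint64_t end,
				 uint64_t disk_i_size, uint64_t limit)
{
	assert(end >= start);
	return end <= disk_i_size && (end - start + 1) < limit;
}

/*
 * New rule: a small write that does not cover the whole file, i.e. it
 * starts past offset 0 or stops short of the on-disk size.
 */
static bool toy_new_should_queue(uint64_t start, uint64_t end,
				 uint64_t disk_i_size, uint64_t limit)
{
	assert(end >= start);
	return (end - start + 1) < limit &&
	       (start > 0 || end + 1 < disk_i_size);
}
```

For the 40960-byte foobar from the reproducer, a writeback covering the whole
file (start 0, end 40959) satisfies the old rule but not the new one, which is
what stops the defrag pass from re-queueing the inode it has just rewritten.
A 4K overwrite in the middle of the file still qualifies under both.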

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/inode.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 892b347..7f5018d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -344,8 +344,9 @@ static noinline int compress_file_range(struct inode *inode,
 	int will_compress;
 	int compress_type = root->fs_info->compress_type;
 
-	/* if this is a small write inside eof, kick off a defragbot */
-	if (end <= BTRFS_I(inode)->disk_i_size && (end - start + 1) < 16 * 1024)
+	/* if this is a small write inside eof, kick off a defrag */
+	if ((end - start + 1) < 16 * 1024 &&
+	    (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
 		btrfs_add_inode_defrag(NULL, inode);
 
 	actual_end = min_t(u64, isize, end + 1);
@@ -800,7 +801,8 @@ static noinline int cow_file_range(struct inode *inode,
 	ret = 0;
 
 	/* if this is a small write inside eof, kick off defrag */
-	if (end <= BTRFS_I(inode)->disk_i_size && num_bytes < 64 * 1024)
+	if (num_bytes < 64 * 1024 &&
+	    (start > 0 || end + 1 < BTRFS_I(inode)->disk_i_size))
 		btrfs_add_inode_defrag(trans, inode);
 
 	if (start == 0) {
-- 
1.6.5.2



[PATCH 08/10] Btrfs: update to the right index of defragment

2012-03-27 Thread Liu Bo
When we use autodefrag, we forget to update the index which indicates
the last page we've dirtied.  As a result we set dirty flags on the same
set of pages again and again.
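
A minimal model of the loop step, assuming 4 KiB pages;
toy_next_newer_off() is a hypothetical helper written for this note, not a
function in ioctl.c:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/*
 * *i is the page index the cluster started at, ret the number of pages
 * just dirtied (negative on error).  Advances *i past the dirtied pages
 * and returns the byte offset the next search should resume from.
 */
static uint64_t toy_next_newer_off(uint64_t newer_off, uint64_t *i, int ret)
{
	uint64_t by_index;

	if (ret > 0)
		*i += (uint64_t)ret;

	by_index = *i << TOY_PAGE_SHIFT;
	return newer_off + 1 > by_index ? newer_off + 1 : by_index;
}
```

With i = 0, newer_off = 0 and 16 pages dirtied, the search resumes at byte
65536. If the index is not advanced (ret = 0 in the model), it resumes at
byte 1, which is still inside the first page, so the same pages are picked up
again.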

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7a6d15c..e3cb770 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1172,6 +1172,9 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 			if (newer_off == (u64)-1)
 				break;
 
+			if (ret > 0)
+				i += ret;
+
 			newer_off = max(newer_off + 1,
 					(u64)i << PAGE_CACHE_SHIFT);
 
 
-- 
1.6.5.2



[PATCH 10/10] Btrfs: drop cache with VACANCY em when we fail to start a transaction

2012-03-27 Thread Liu Bo
We need to drop a VACANCY em (if there is one) when we fail to start a transaction.

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/inode.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index bacf441..2b2f0b6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3406,9 +3406,6 @@ int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size)
 				break;
 			}
 
-			btrfs_drop_extent_cache(inode, hole_start,
-					last_byte - 1, 0);
-
 			btrfs_update_inode(trans, root, inode);
 			btrfs_end_transaction(trans, root);
 		}
@@ -3419,6 +3416,9 @@ int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size)
 			break;
 	}
 
+	if (em && test_bit(EXTENT_FLAG_VACANCY, &em->flags))
+		btrfs_drop_extent_cache(inode, hole_start, last_byte - 1, 0);
+
 	free_extent_map(em);
 	unlock_extent_cached(io_tree, hole_start, block_end - 1, &cached_state,
 			     GFP_NOFS);
-- 
1.6.5.2



[PATCH 09/10] Btrfs: use PagePrivate2 to check ordered data

2012-03-27 Thread Liu Bo
If a page has the PagePrivate2 flag set, it is still ordered data,
so we can check this flag directly instead of looking up an ordered
extent.
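
The "whoever clears the flag finishes the IO" rule that this relies on can be
sketched in user space with a C11 atomic. struct toy_page and both functions
are stand-ins invented for the sketch, not the kernel's page flag helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page carrying an "ordered data" flag. */
struct toy_page {
	atomic_bool ordered;	/* plays the role of PagePrivate2 */
	int finished;		/* how many times completion ran */
};

/* Atomically clear the flag and report whether it was set before. */
static bool toy_test_clear_ordered(struct toy_page *page)
{
	return atomic_exchange(&page->ordered, false);
}

/* Invalidate path: only the caller that clears the flag completes the IO. */
static void toy_invalidate(struct toy_page *page)
{
	if (toy_test_clear_ordered(page))
		page->finished++;
}
```

Calling toy_invalidate() twice on a page whose flag starts out set runs the
completion exactly once. No separate lookup is needed to decide whether
ordered data is still pending.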

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/inode.c |   45 +++--
 1 files changed, 15 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7f5018d..bacf441 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6345,8 +6345,6 @@ static int btrfs_releasepage(struct page *page, gfp_t gfp_flags)
 static void btrfs_invalidatepage(struct page *page, unsigned long offset)
 {
 	struct extent_io_tree *tree;
-	struct btrfs_ordered_extent *ordered;
-	struct extent_state *cached_state = NULL;
 	u64 page_start = page_offset(page);
 	u64 page_end = page_start + PAGE_CACHE_SIZE - 1;
 
@@ -6365,35 +6363,22 @@ static void btrfs_invalidatepage(struct page *page, unsigned long offset)
 		btrfs_releasepage(page, GFP_NOFS);
 		return;
 	}
-	lock_extent_bits(tree, page_start, page_end, 0, &cached_state,
-			 GFP_NOFS);
-	ordered = btrfs_lookup_ordered_extent(page->mapping->host,
-					   page_offset(page));
-	if (ordered) {
-		/*
-		 * IO on this page will never be started, so we need
-		 * to account for any ordered extents now
-		 */
-		clear_extent_bit(tree, page_start, page_end,
-				 EXTENT_DIRTY | EXTENT_DELALLOC |
-				 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING, 1, 0,
-				 &cached_state, GFP_NOFS);
-		/*
-		 * whoever cleared the private bit is responsible
-		 * for the finish_ordered_io
-		 */
-		if (TestClearPagePrivate2(page)) {
-			btrfs_finish_ordered_io(page->mapping->host,
-						page_start, page_end);
-		}
-		btrfs_put_ordered_extent(ordered);
-		cached_state = NULL;
-		lock_extent_bits(tree, page_start, page_end, 0, &cached_state,
-				 GFP_NOFS);
-	}
+	/*
+	 * IO on this page will never be started, so we need
+	 * to account for any ordered extents now
+	 */
 	clear_extent_bit(tree, page_start, page_end,
-		 EXTENT_LOCKED | EXTENT_DIRTY | EXTENT_DELALLOC |
-		 EXTENT_DO_ACCOUNTING, 1, 1, &cached_state, GFP_NOFS);
+			 EXTENT_DIRTY | EXTENT_DELALLOC |
+			 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING, 1, 1,
+			 NULL, GFP_NOFS);
+	/*
+	 * whoever cleared the private bit is responsible
+	 * for the finish_ordered_io
+	 */
+	if (TestClearPagePrivate2(page)) {
+		btrfs_finish_ordered_io(page->mapping->host,
+					page_start, page_end);
+	}
 	__btrfs_releasepage(page, GFP_NOFS);
 
 	ClearPageChecked(page);
-- 
1.6.5.2



[PATCH 07/10] Btrfs: do not bother to defrag an extent if it is a big real extent

2012-03-27 Thread Liu Bo
$ mkfs.btrfs /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs/ -oautodefrag
$ dd if=/dev/zero of=/mnt/btrfs/foobar bs=4k count=10 oflag=direct 2>/dev/null
$ filefrag -v /mnt/btrfs/foobar
Filesystem type is: 9123683e
File size of /mnt/btrfs/foobar is 40960 (10 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0     3072              10 eof
/mnt/btrfs/foobar: 1 extent found

Now we have a big real extent [0, 40960), but autodefrag will still defrag it.

$ sync
$ filefrag -v /mnt/btrfs/foobar
Filesystem type is: 9123683e
File size of /mnt/btrfs/foobar is 40960 (10 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0     3082              10 eof
/mnt/btrfs/foobar: 1 extent found

The physical start moved from 3072 to 3082, so the extent was rewritten for
no gain.  If what we find is already a big real extent, just skip it.

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |9 +++--
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 66a4933..7a6d15c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1126,12 +1126,9 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 		if (!(inode->i_sb->s_flags & MS_ACTIVE))
 			break;
 
-		if (!newer_than &&
-		    !should_defrag_range(inode, (u64)i << PAGE_CACHE_SHIFT,
-					PAGE_CACHE_SIZE,
-					extent_thresh,
-					&last_len, &skip,
-					&defrag_end)) {
+		if (!should_defrag_range(inode, (u64)i << PAGE_CACHE_SHIFT,
+					 PAGE_CACHE_SIZE, extent_thresh,
+					 &last_len, &skip, &defrag_end)) {
 			unsigned long next;
 			/*
 			 * the should_defrag function tells us how much to skip
-- 
1.6.5.2



[PATCH 03/10] Btrfs: fix race between direct io and autodefrag

2012-03-27 Thread Liu Bo
The bug is from running xfstests 209 with autodefrag.

The race is as follows:
      t1                          t2(autodefrag)
   direct IO
     invalidate pagecache
     dio(old data)              add_inode_defrag
     invalidate pagecache
   endio

   direct IO
     invalidate pagecache
                                run_defrag
                                  readpage(old data)
                                  set page dirty (old data)
     dio(new data, rewrite)
     invalidate pagecache (*)
     endio

t2(autodefrag) will read old data into the pagecache via readpage and mark
the pages dirty.  Meanwhile, invalidate pagecache(*) will fail because of the
dirty flags on the pages.  So the old data may be flushed to disk by the
flush thread, which leads to data loss.

The same applies to user-space defragment programs.

The patch fixes this race by holding i_mutex while we readpage and set the
page dirty.

Signed-off-by: Liu Bo liubo2...@cn.fujitsu.com
Signed-off-by: Miao Xie mi...@cn.fujitsu.com
---
 fs/btrfs/ioctl.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index d8b5471..0acc828 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1123,12 +1123,16 @@ int btrfs_defrag_file(struct inode *inode, struct file *file,
 			ra_index += max_cluster;
 		}
 
+		mutex_lock(&inode->i_mutex);
 		ret = cluster_pages_for_defrag(inode, pages, i, cluster);
-		if (ret < 0)
+		if (ret < 0) {
+			mutex_unlock(&inode->i_mutex);
 			goto out_ra;
+		}
 
 		defrag_count += ret;
 		balance_dirty_pages_ratelimited_nr(inode->i_mapping, ret);
+		mutex_unlock(&inode->i_mutex);
 
 		if (newer_than) {
 			if (newer_off == (u64)-1)
-- 
1.6.5.2



btrfs csum failed, scrub ok

2012-03-27 Thread Christoph Groth
I have a freshly installed system with btrfs as the root file system.
The machine is running linux 3.2.  The raid1 btrfs file system lives on
two new hard drives.

About one day after installation the following message appeared in
kern.log.  There were no other errors.

root@mim:/var/log# grep 'btrfs.*fail' kern.log
Mar 27 01:07:46 mim kernel: [ 6480.233861] btrfs csum failed ino 453509 off 1495040 csum 3301532933 private 4156998194
Mar 27 01:07:46 mim kernel: [ 6480.234470] btrfs csum failed ino 453509 off 1499136 csum 1873118812 private 3512102188
Mar 27 01:07:46 mim kernel: [ 6480.234572] btrfs csum failed ino 453509 off 1503232 csum 1034640717 private 2041007647
Mar 27 01:07:46 mim kernel: [ 6480.234670] btrfs csum failed ino 453509 off 1507328 csum 889729013 private 2342095239
Mar 27 01:07:46 mim kernel: [ 6480.237977] btrfs csum failed ino 453509 off 1503232 csum 1518679450 private 2041007647
Mar 27 01:07:46 mim kernel: [ 6480.238149] btrfs csum failed ino 453509 off 1507328 csum 889729013 private 2342095239
Mar 27 01:07:46 mim kernel: [ 6480.238330] btrfs csum failed ino 453509 off 1495040 csum 3234580989 private 4156998194
Mar 27 01:07:46 mim kernel: [ 6480.238447] btrfs csum failed ino 453509 off 1499136 csum 1873118812 private 3512102188
Mar 27 01:07:46 mim kernel: [ 6480.243873] btrfs csum failed ino 453509 off 1503232 csum 2184012753 private 2041007647
Mar 27 01:07:46 mim kernel: [ 6480.243962] btrfs csum failed ino 453509 off 1507328 csum 240604621 private 2342095239

inode 453509 belongs to a file installed by dpkg

root@mim:/# find / -inum 453509 -ls
453509 1976 -rw-r--r--   1 root root  2020832 Mar  7 21:11 /usr/lib/libreoffice/basis3.4/program/libsblx.so

That file seems to be ok, there are no errors when re-reading it.

A scrub done the morning after the incident also didn't find any
problems:

root@mim:/home/cwg# btrfs scrub status /
scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686
	scrub started at Tue Mar 27 10:37:49 2012 and finished after 3921 seconds
	total bytes scrubbed: 550.20GB with 0 errors

Also inspecting the SMART status of the hard drives does not reveal any
problems.

Is this a bug in btrfs, or am I supposed to be afraid that the new hard
drives are not working reliably?  Or could this be the effect of some
cosmic ray hitting my machine?  (It doesn't have ECC.)  Or is it normal
that hard drives sometimes make errors?  (In that case the additional
layer of btrfs checksumming seems to be a very good thing.)
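
For readers unfamiliar with the check behind those log lines, here is a
minimal sketch of a block checksum comparison. It assumes that "csum" is the
value computed from the data just read and "private" is the stored value, and
that the checksum is CRC-32C; both are my reading rather than something stated
in this thread. The code is a plain textbook CRC-32C with invented function
names, and btrfs's own seeding and finalisation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t toy_crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical read-side check: recompute and compare with the stored value. */
static int toy_block_csum_ok(const uint8_t *buf, size_t len, uint32_t stored)
{
	return toy_crc32c(buf, len) == stored;
}
```

A single flipped bit anywhere in the buffer makes toy_block_csum_ok() fail.
In such a model the stored value stays fixed while the computed value depends
on whatever bytes the read actually delivered.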

Christoph



Re: btrfs csum failed, scrub ok

2012-03-27 Thread Roman Mamedov
On Tue, 27 Mar 2012 12:57:31 +0200
Christoph Groth c...@falma.de wrote:

> root@mim:/# find / -inum 453509 -ls
> 453509 1976 -rw-r--r--   1 root root  2020832 Mar  7 21:11 /usr/lib/libreoffice/basis3.4/program/libsblx.so
>
> That file seems to be ok, there are no errors when re-reading it.

How about

$ sudo apt-get install debsums
$ debsums libreoffice-core | grep libsblx.so

-- 
With respect,
Roman

~~~
Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free.




Re: btrfs csum failed, scrub ok

2012-03-27 Thread Christoph Groth
Roman Mamedov r...@romanrm.ru writes:

> On Tue, 27 Mar 2012 12:57:31 +0200
> Christoph Groth c...@falma.de wrote:
>
>> root@mim:/# find / -inum 453509 -ls
>> 453509 1976 -rw-r--r--   1 root root  2020832 Mar  7 21:11 /usr/lib/libreoffice/basis3.4/program/libsblx.so
>>
>> That file seems to be ok, there are no errors when re-reading it.
>
> How about
>
> $ sudo apt-get install debsums
> $ debsums libreoffice-core | grep libsblx.so

Good idea!

$ debsums libreoffice-core | grep libsblx.so
/usr/lib/libreoffice/basis3.4/program/libsblx.so  OK

I'm still puzzled by this incident.

Christoph



Re: btrfs: open_ctree failed

2012-03-27 Thread Not Zippy
One entire subvolume was restored. But there were 4 subvolumes on that
partition. Is there a way to specify/force the restore of a different
subvolume ?

find-root seems to only find a single root.

thanks

On Mon, Mar 26, 2012 at 3:47 PM, Hugo Mills h...@carfax.org.uk wrote:
> On Mon, Mar 26, 2012 at 03:36:13PM -0700, Not Zippy wrote:
>> Hugo
>> I did try the dangerdonteveruse branch and thats the error btrfsck
>> --repair gave me.
>
>    Oooh, a brave one, I see. ;)
>
>> Looks like the btrfs-restore command may work (thanks!). And yes I
>> do have backups for the important data - I had some other data on
>> there which would need to be d/l again..
>
>    Excellent. We don't need to set the hounds onto you, then.
>
>> I don't dabble that much with the kernel - this is a straight ubuntu
>> which I regularly do their updates - Can I advance the kernel beyond ?
>
>    Yes, there's a PPA[1] for it (documented in the Getting Started
> page on the btrfs wiki at [2]).
>
> [1] http://kernel.ubuntu.com/~kernel-ppa/mainline/
> [2] http://btrfs.ipv5.de/index.php?title=Getting_started#Ubuntu_Linux



Re: btrfs: open_ctree failed

2012-03-27 Thread Hugo Mills
On Tue, Mar 27, 2012 at 05:58:17AM -0700, Not Zippy wrote:
> One entire subvolume was restored. But there were 4 subvolumes on that
> partition. Is there a way to specify/force the restore of a different
> subvolume ?
>
> find-root seems to only find a single root.

   There is only a single root tree, so that's understandable. If you
have a look at the documentation for restore[1], it mentions (right
near the bottom of the page) that -r will allow you to select an
alternative subvolume to recover from.

   Hugo.

[1] http://btrfs.ipv5.de/index.php?title=Restore

> thanks
>
> On Mon, Mar 26, 2012 at 3:47 PM, Hugo Mills h...@carfax.org.uk wrote:
>> On Mon, Mar 26, 2012 at 03:36:13PM -0700, Not Zippy wrote:
>>> Hugo
>>> I did try the dangerdonteveruse branch and thats the error btrfsck
>>> --repair gave me.
>>
>>    Oooh, a brave one, I see. ;)
>>
>>> Looks like the btrfs-restore command may work (thanks!). And yes I
>>> do have backups for the important data - I had some other data on
>>> there which would need to be d/l again..
>>
>>    Excellent. We don't need to set the hounds onto you, then.
>>
>>> I don't dabble that much with the kernel - this is a straight ubuntu
>>> which I regularly do their updates - Can I advance the kernel beyond ?
>>
>>    Yes, there's a PPA[1] for it (documented in the Getting Started
>> page on the btrfs wiki at [2]).
>>
>> [1] http://kernel.ubuntu.com/~kernel-ppa/mainline/
>> [2] http://btrfs.ipv5.de/index.php?title=Getting_started#Ubuntu_Linux

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- The enemy have elected for Death by Powerpoint. That's what ---   
 they shall get.  -- gdb 




[PATCH 1/3] Btrfs: actually call btrfs_init_lockdep

2012-03-27 Thread Jan Schmidt
btrfs_init_lockdep only makes our lockdep class names look prettier, so
it never hurt that we forgot to actually call it. This turns our lockdep
identifier strings from lockdep's auto-set #[id] into really pretty
btrfs-fs-01 or btrfs-csum-03.

Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
---
 fs/btrfs/super.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 61717a4..5239003 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1404,6 +1404,8 @@ static int __init init_btrfs_fs(void)
 	if (err)
 		goto unregister_ioctl;
 
+	btrfs_init_lockdep();
+
 	printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
 	return 0;
 
-- 
1.7.3.4



[GIT PULL] Btrfs fixes for 3.4

2012-03-27 Thread Jan Schmidt
Hi Chris,

please pull my three current patches from my repo, based on your
for-linus branch (I can rebase them to the integration branch if that
helps):

git://git.jan-o-sch.net/btrfs-unstable for-chris

It's two really small fixes, both mentioned earlier, and a more or less
important fixup for scrub. While working fine in 3.2, name resolving can
deadlock since the first rc of 3.3. I suggest we queue that patch 3/3
for submission to 3.3-stable.

I'm passing xfstests just as well as for-linus does without my
patches (which is not really good). I also ran some manual error
insertion tests to verify that the scrub deadlock chance is really gone.

And, we really should have an xfstest for raid-repair and scrub-repair.
Anyone? :-)

-Jan

Jan Schmidt (3):
  Btrfs: actually call btrfs_init_lockdep
  Btrfs: check return value of btrfs_cow_block()
  Btrfs: fix regression in scrub path resolving

 fs/btrfs/backref.c |  115 +++
 fs/btrfs/backref.h |5 +-
 fs/btrfs/ioctl.c   |4 +-
 fs/btrfs/scrub.c   |4 +-
 fs/btrfs/super.c   |2 +
 fs/btrfs/transaction.c |6 ++-
 6 files changed, 79 insertions(+), 57 deletions(-)

-- 
1.7.3.4



[PATCH 2/3] Btrfs: check return value of btrfs_cow_block()

2012-03-27 Thread Jan Schmidt
The two helper functions commit_cowonly_roots() and
create_pending_snapshot() failed to check the return value from
btrfs_cow_block(), which could at least in theory fail with -ENOSPC from
btrfs_alloc_free_block(). This commit adds the missing checks.

Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
---
 fs/btrfs/transaction.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 04b77e3..cd220f2 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -718,7 +718,8 @@ static noinline int commit_cowonly_roots(struct 
btrfs_trans_handle *trans,
BUG_ON(ret);
 
eb = btrfs_lock_root_node(fs_info->tree_root);
-   btrfs_cow_block(trans, fs_info->tree_root, eb, NULL, 0, &eb);
+   ret = btrfs_cow_block(trans, fs_info->tree_root, eb, NULL, 0, &eb);
+   BUG_ON(ret);
btrfs_tree_unlock(eb);
free_extent_buffer(eb);
 
@@ -949,7 +950,8 @@ static noinline int create_pending_snapshot(struct 
btrfs_trans_handle *trans,
btrfs_set_root_flags(new_root_item, root_flags);
 
old = btrfs_lock_root_node(root);
-   btrfs_cow_block(trans, root, old, NULL, 0, &old);
+   ret = btrfs_cow_block(trans, root, old, NULL, 0, &old);
+   BUG_ON(ret);
btrfs_set_lock_blocking(old);
 
btrfs_copy_root(trans, root, old, &tmp, objectid);
-- 
1.7.3.4



[PATCH 3/3] Btrfs: fix regression in scrub path resolving

2012-03-27 Thread Jan Schmidt
In commit 4692cf58 we introduced new backref walking code for btrfs. This
assumes we're searching live roots, which requires a transaction context.
While scrubbing, however, we must not join a transaction because this could
deadlock with the commit path. Additionally, what scrub really wants to do
is resolve a logical address in the commit root it's currently checking.

This patch adds support for logical to path resolving on commit roots and
makes scrub use that.

Signed-off-by: Jan Schmidt list.bt...@jan-o-sch.net
---
I think we should queue this one for 3.3-stable
---
 fs/btrfs/backref.c |  115 ++--
 fs/btrfs/backref.h |5 +-
 fs/btrfs/ioctl.c   |4 +-
 fs/btrfs/scrub.c   |4 +-
 4 files changed, 73 insertions(+), 55 deletions(-)

diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 0436c12..56136d90 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -116,6 +116,7 @@ add_parent:
  * to a logical address
  */
 static int __resolve_indirect_ref(struct btrfs_fs_info *fs_info,
+   int search_commit_root,
struct __prelim_ref *ref,
struct ulist *parents)
 {
@@ -131,6 +132,7 @@ static int __resolve_indirect_ref(struct btrfs_fs_info 
*fs_info,
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
+   path->search_commit_root = !!search_commit_root;
 
root_key.objectid = ref->root_id;
root_key.type = BTRFS_ROOT_ITEM_KEY;
@@ -188,6 +190,7 @@ out:
  * resolve all indirect backrefs from the list
  */
 static int __resolve_indirect_refs(struct btrfs_fs_info *fs_info,
+  int search_commit_root,
   struct list_head *head)
 {
int err;
@@ -212,7 +215,8 @@ static int __resolve_indirect_refs(struct btrfs_fs_info 
*fs_info,
continue;
if (ref->count == 0)
continue;
-   err = __resolve_indirect_ref(fs_info, ref, parents);
+   err = __resolve_indirect_ref(fs_info, search_commit_root,
+ref, parents);
if (err) {
if (ret == 0)
ret = err;
@@ -586,6 +590,7 @@ static int find_parent_nodes(struct btrfs_trans_handle 
*trans,
struct btrfs_delayed_ref_head *head;
int info_level = 0;
int ret;
+   int search_commit_root = (trans == BTRFS_BACKREF_SEARCH_COMMIT_ROOT);
struct list_head prefs_delayed;
struct list_head prefs;
struct __prelim_ref *ref;
@@ -600,6 +605,7 @@ static int find_parent_nodes(struct btrfs_trans_handle 
*trans,
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
+   path->search_commit_root = !!search_commit_root;
 
/*
 * grab both a lock on the path and a lock on the delayed ref head.
@@ -614,35 +620,39 @@ again:
goto out;
BUG_ON(ret == 0);
 
-   /*
-* look if there are updates for this ref queued and lock the head
-*/
-   delayed_refs = &trans->transaction->delayed_refs;
-   spin_lock(&delayed_refs->lock);
-   head = btrfs_find_delayed_ref_head(trans, bytenr);
-   if (head) {
-   if (!mutex_trylock(&head->mutex)) {
-   atomic_inc(&head->node.refs);
-   spin_unlock(&delayed_refs->lock);
-
-   btrfs_release_path(path);
-
-   /*
-* Mutex was contended, block until it's
-* released and try again
-*/
-   mutex_lock(&head->mutex);
-   mutex_unlock(&head->mutex);
-   btrfs_put_delayed_ref(&head->node);
-   goto again;
-   }
-   ret = __add_delayed_refs(head, seq, &info_key, &prefs_delayed);
-   if (ret) {
-   spin_unlock(&delayed_refs->lock);
-   goto out;
+   if (trans != BTRFS_BACKREF_SEARCH_COMMIT_ROOT) {
+   /*
+* look if there are updates for this ref queued and lock the
+* head
+*/
+   delayed_refs = &trans->transaction->delayed_refs;
+   spin_lock(&delayed_refs->lock);
+   head = btrfs_find_delayed_ref_head(trans, bytenr);
+   if (head) {
+   if (!mutex_trylock(&head->mutex)) {
+   atomic_inc(&head->node.refs);
+   spin_unlock(&delayed_refs->lock);
+
+   btrfs_release_path(path);
+
+   /*
+* Mutex was contended, block until it's
+* released and try again
+   

[PATCH 0/8] Restriper fixes

2012-03-27 Thread Ilya Dryomov
Hi Chris,

The main one here is the improvement to btrfs_can_relocate(), which is
now a tiny bit smarter and does not return ENOSPC when there's plenty of
unallocated space for target chunks.  This, in addition to my patch
which disables silent profile upgrades, should reduce the number of
corner cases in profile changing.

The rest are a bunch of cleanups and some minor fixes.

Please pull from

git://github.com/idryomov/btrfs-unstable.git for-chris

top commit 213e64da90d14537cd63f7090d6c4d1fcc75d9f8

Thanks,

Ilya


Ilya Dryomov (8):
  Btrfs: add wrappers for working with alloc profiles
  Btrfs: make profile_is_valid() check more strict
  Btrfs: move alloc_profile_is_valid() to volumes.c
  Btrfs: add get_restripe_target() helper
  Btrfs: add __get_block_group_index() helper
  Btrfs: improve the logic in btrfs_can_relocate()
  Btrfs: validate target profiles only if we are going to use them
  Btrfs: allow dup for data chunks in mixed mode

 fs/btrfs/ctree.h   |   33 +--
 fs/btrfs/extent-tree.c |  158 ++--
 fs/btrfs/volumes.c |   88 ---
 3 files changed, 152 insertions(+), 127 deletions(-)

-- 
1.7.9.1



[PATCH 3/8] Btrfs: move alloc_profile_is_valid() to volumes.c

2012-03-27 Thread Ilya Dryomov
A header file is not a good place to define functions.  This also moves a
call to alloc_profile_is_valid() down the stack and removes a redundant
check from __btrfs_alloc_chunk() - alloc_profile_is_valid() takes it
into account.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/ctree.h   |   23 ---
 fs/btrfs/extent-tree.c |2 --
 fs/btrfs/volumes.c |   30 +-
 3 files changed, 25 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index f057e92..a56e1e0 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2734,29 +2734,6 @@ static inline void free_fs_info(struct btrfs_fs_info 
*fs_info)
kfree(fs_info->super_for_commit);
kfree(fs_info);
 }
-/**
- * alloc_profile_is_valid - see if a given profile is valid and reduced
- * @flags: profile to validate
- * @extended: if true @flags is treated as an extended profile
- */
-static inline int alloc_profile_is_valid(u64 flags, int extended)
-{
-   u64 mask = (extended ? BTRFS_EXTENDED_PROFILE_MASK :
-  BTRFS_BLOCK_GROUP_PROFILE_MASK);
-
-   flags &= ~BTRFS_BLOCK_GROUP_TYPE_MASK;
-
-   /* 1) check that all other bits are zeroed */
-   if (flags & ~mask)
-   return 0;
-
-   /* 2) see if profile is reduced */
-   if (flags == 0)
-   return !extended; /* 0 is valid for usual profiles */
-
-   /* true if exactly one bit set */
-   return (flags & (flags - 1)) == 0;
-}
 
 /* root-item.c */
 int btrfs_find_root_ref(struct btrfs_root *tree_root,
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 8c5bd8f..304710c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3400,8 +3400,6 @@ static int do_chunk_alloc(struct btrfs_trans_handle 
*trans,
int wait_for_alloc = 0;
int ret = 0;
 
-   BUG_ON(!alloc_profile_is_valid(flags, 0));
-
space_info = __find_space_info(extent_root->fs_info, flags);
if (!space_info) {
ret = update_space_info(extent_root->fs_info, flags,
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index e4ef0f2..def9e25 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2598,6 +2598,30 @@ error:
return ret;
 }
 
+/**
+ * alloc_profile_is_valid - see if a given profile is valid and reduced
+ * @flags: profile to validate
+ * @extended: if true @flags is treated as an extended profile
+ */
+static int alloc_profile_is_valid(u64 flags, int extended)
+{
+   u64 mask = (extended ? BTRFS_EXTENDED_PROFILE_MASK :
+  BTRFS_BLOCK_GROUP_PROFILE_MASK);
+
+   flags &= ~BTRFS_BLOCK_GROUP_TYPE_MASK;
+
+   /* 1) check that all other bits are zeroed */
+   if (flags & ~mask)
+   return 0;
+
+   /* 2) see if profile is reduced */
+   if (flags == 0)
+   return !extended; /* 0 is valid for usual profiles */
+
+   /* true if exactly one bit set */
+   return (flags & (flags - 1)) == 0;
+}
+
 static inline int balance_need_close(struct btrfs_fs_info *fs_info)
 {
/* cancel requested || normal exit path */
@@ -3124,11 +3148,7 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle 
*trans,
int i;
int j;
 
-   if ((type & BTRFS_BLOCK_GROUP_RAID1) &&
-   (type & BTRFS_BLOCK_GROUP_DUP)) {
-   WARN_ON(1);
-   type &= ~BTRFS_BLOCK_GROUP_DUP;
-   }
+   BUG_ON(!alloc_profile_is_valid(type, 0));
 
if (list_empty(&fs_devices->alloc_list))
return -ENOSPC;
-- 
1.7.9.1



[PATCH 1/8] Btrfs: add wrappers for working with alloc profiles

2012-03-27 Thread Ilya Dryomov
Add functions to abstract the conversion between chunk and extended
allocation profile formats and switch everybody to use them.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/ctree.h   |   15 +++
 fs/btrfs/extent-tree.c |   25 +++--
 fs/btrfs/volumes.c |   20 
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index c2e17cd..aba7832 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -849,6 +849,21 @@ struct btrfs_csum_item {
  */
 #define BTRFS_AVAIL_ALLOC_BIT_SINGLE   (1ULL << 48)
 
+#define BTRFS_EXTENDED_PROFILE_MASK    (BTRFS_BLOCK_GROUP_PROFILE_MASK | \
+BTRFS_AVAIL_ALLOC_BIT_SINGLE)
+
+static inline u64 chunk_to_extended(u64 flags)
+{
+   if ((flags & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0)
+   flags |= BTRFS_AVAIL_ALLOC_BIT_SINGLE;
+
+   return flags;
+}
+static inline u64 extended_to_chunk(u64 flags)
+{
+   return flags & ~BTRFS_AVAIL_ALLOC_BIT_SINGLE;
+}
+}
+
 struct btrfs_block_group_item {
__le64 used;
__le64 chunk_objectid;
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4269777..9f16fdb 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3098,11 +3098,8 @@ static int update_space_info(struct btrfs_fs_info *info, 
u64 flags,
 
 static void set_avail_alloc_bits(struct btrfs_fs_info *fs_info, u64 flags)
 {
-   u64 extra_flags = flags & BTRFS_BLOCK_GROUP_PROFILE_MASK;
-
-   /* chunk -> extended profile */
-   if (extra_flags == 0)
-   extra_flags = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
+   u64 extra_flags = chunk_to_extended(flags) &
+   BTRFS_EXTENDED_PROFILE_MASK;
 
if (flags & BTRFS_BLOCK_GROUP_DATA)
fs_info->avail_data_alloc_bits |= extra_flags;
@@ -3181,9 +3178,7 @@ u64 btrfs_reduce_alloc_profile(struct btrfs_root *root, 
u64 flags)
}
 
 out:
-   /* extended -> chunk profile */
-   flags &= ~BTRFS_AVAIL_ALLOC_BIT_SINGLE;
-   return flags;
+   return extended_to_chunk(flags);
 }
 
 static u64 get_alloc_profile(struct btrfs_root *root, u64 flags)
@@ -6914,11 +6909,8 @@ static u64 update_block_group_flags(struct btrfs_root 
*root, u64 flags)
tgt = BTRFS_BLOCK_GROUP_METADATA | bctl->meta.target;
}
 
-   if (tgt) {
-   /* extended -> chunk profile */
-   tgt &= ~BTRFS_AVAIL_ALLOC_BIT_SINGLE;
-   return tgt;
-   }
+   if (tgt)
+   return extended_to_chunk(tgt);
}
 
/*
@@ -7597,11 +7589,8 @@ int btrfs_make_block_group(struct btrfs_trans_handle 
*trans,
 
 static void clear_avail_alloc_bits(struct btrfs_fs_info *fs_info, u64 flags)
 {
-   u64 extra_flags = flags & BTRFS_BLOCK_GROUP_PROFILE_MASK;
-
-   /* chunk -> extended profile */
-   if (extra_flags == 0)
-   extra_flags = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
+   u64 extra_flags = chunk_to_extended(flags) &
+   BTRFS_EXTENDED_PROFILE_MASK;
 
if (flags & BTRFS_BLOCK_GROUP_DATA)
fs_info->avail_data_alloc_bits &= ~extra_flags;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 58aad63e..4b263a2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2250,15 +2250,13 @@ static void unset_balance_control(struct btrfs_fs_info 
*fs_info)
  * Balance filters.  Return 1 if chunk should be filtered out
  * (should not be balanced).
  */
-static int chunk_profiles_filter(u64 chunk_profile,
+static int chunk_profiles_filter(u64 chunk_type,
 struct btrfs_balance_args *bargs)
 {
-   chunk_profile &= BTRFS_BLOCK_GROUP_PROFILE_MASK;
+   chunk_type = chunk_to_extended(chunk_type) &
+   BTRFS_EXTENDED_PROFILE_MASK;
 
-   if (chunk_profile == 0)
-   chunk_profile = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
-
-   if (bargs->profiles & chunk_profile)
+   if (bargs->profiles & chunk_type)
return 0;
 
return 1;
@@ -2365,18 +2363,16 @@ static int chunk_vrange_filter(struct extent_buffer 
*leaf,
return 1;
 }
 
-static int chunk_soft_convert_filter(u64 chunk_profile,
+static int chunk_soft_convert_filter(u64 chunk_type,
 struct btrfs_balance_args *bargs)
 {
if (!(bargs->flags & BTRFS_BALANCE_ARGS_CONVERT))
return 0;
 
-   chunk_profile &= BTRFS_BLOCK_GROUP_PROFILE_MASK;
-
-   if (chunk_profile == 0)
-   chunk_profile = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
+   chunk_type = chunk_to_extended(chunk_type) &
+   BTRFS_EXTENDED_PROFILE_MASK;
 
-   if (bargs->target & chunk_profile)
+   if (bargs->target == chunk_type)
return 1;
 
return 0;
-- 
1.7.9.1
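
To illustrate the two wrappers added by this patch, here is a small
user-space sketch of the round trip between the chunk and extended profile
formats. Only the bit 48 value for the "single" marker is taken from the
patch; the DEMO_* names and the other bit positions are assumptions chosen
for the example and are not the definitions from fs/btrfs/ctree.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only (assumed for this sketch). */
#define DEMO_BLOCK_GROUP_DATA		(1ULL << 0)
#define DEMO_BLOCK_GROUP_RAID0		(1ULL << 3)
#define DEMO_BLOCK_GROUP_RAID1		(1ULL << 4)
#define DEMO_BLOCK_GROUP_DUP		(1ULL << 5)
#define DEMO_BLOCK_GROUP_RAID10		(1ULL << 6)
#define DEMO_PROFILE_MASK		(DEMO_BLOCK_GROUP_RAID0 | \
					 DEMO_BLOCK_GROUP_RAID1 | \
					 DEMO_BLOCK_GROUP_DUP | \
					 DEMO_BLOCK_GROUP_RAID10)
/* Bit 48 is the value shown in the patch for the "single" marker. */
#define DEMO_AVAIL_ALLOC_BIT_SINGLE	(1ULL << 48)

/* Chunk format: "single" is encoded as no profile bit at all. */
static uint64_t demo_chunk_to_extended(uint64_t flags)
{
	if ((flags & DEMO_PROFILE_MASK) == 0)
		flags |= DEMO_AVAIL_ALLOC_BIT_SINGLE;
	return flags;
}

/* Extended format: drop the explicit "single" bit again. */
static uint64_t demo_extended_to_chunk(uint64_t flags)
{
	return flags & ~DEMO_AVAIL_ALLOC_BIT_SINGLE;
}

/* Usage example: a single data chunk survives the round trip. */
void demo_profile_round_trip(void)
{
	uint64_t single_data = DEMO_BLOCK_GROUP_DATA;
	uint64_t raid1_data = DEMO_BLOCK_GROUP_DATA | DEMO_BLOCK_GROUP_RAID1;
	uint64_t ext = demo_chunk_to_extended(single_data);

	assert(ext & DEMO_AVAIL_ALLOC_BIT_SINGLE);
	assert(demo_extended_to_chunk(ext) == single_data);
	/* A chunk that already has a profile bit is left unchanged. */
	assert(demo_chunk_to_extended(raid1_data) == raid1_data);
}
```

Calling demo_profile_round_trip() from a test program's main() exercises
both directions. The point of the extended format is that "single" gets its
own bit, so it can be tested with a mask like any other profile.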


[PATCH 4/8] Btrfs: add get_restripe_target() helper

2012-03-27 Thread Ilya Dryomov
Add get_restripe_target() helper and switch everybody to use it.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/extent-tree.c |   94 +--
 1 files changed, 50 insertions(+), 44 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 304710c..faf52e0 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3110,6 +3110,35 @@ static void set_avail_alloc_bits(struct btrfs_fs_info 
*fs_info, u64 flags)
 }
 
 /*
+ * returns target flags in extended format or 0 if restripe for this
+ * chunk_type is not in progress
+ */
+static u64 get_restripe_target(struct btrfs_fs_info *fs_info, u64 flags)
+{
+   struct btrfs_balance_control *bctl = fs_info->balance_ctl;
+   u64 target = 0;
+
+   BUG_ON(!mutex_is_locked(&fs_info->volume_mutex) &&
+  !spin_is_locked(&fs_info->balance_lock));
+
+   if (!bctl)
+   return 0;
+
+   if (flags & BTRFS_BLOCK_GROUP_DATA &&
+   bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) {
+   target = BTRFS_BLOCK_GROUP_DATA | bctl->data.target;
+   } else if (flags & BTRFS_BLOCK_GROUP_SYSTEM &&
+  bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
+   target = BTRFS_BLOCK_GROUP_SYSTEM | bctl->sys.target;
+   } else if (flags & BTRFS_BLOCK_GROUP_METADATA &&
+  bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) {
+   target = BTRFS_BLOCK_GROUP_METADATA | bctl->meta.target;
+   }
+
+   return target;
+}
+
+/*
  * @flags: available profiles in extended format (see ctree.h)
  *
  * Returns reduced profile in chunk format.  If profile changing is in
@@ -3125,31 +3154,19 @@ u64 btrfs_reduce_alloc_profile(struct btrfs_root *root, 
u64 flags)
 */
u64 num_devices = root->fs_info->fs_devices->rw_devices +
root->fs_info->fs_devices->missing_devices;
+   u64 target;
 
-   /* pick restriper's target profile if it's available */
+   /*
+* see if restripe for this chunk_type is in progress, if so
+* try to reduce to the target profile
+*/
spin_lock(&root->fs_info->balance_lock);
-   if (root->fs_info->balance_ctl) {
-   struct btrfs_balance_control *bctl = root->fs_info->balance_ctl;
-   u64 tgt = 0;
-
-   if ((flags & BTRFS_BLOCK_GROUP_DATA) &&
-   (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
-   (flags & bctl->data.target)) {
-   tgt = BTRFS_BLOCK_GROUP_DATA | bctl->data.target;
-   } else if ((flags & BTRFS_BLOCK_GROUP_SYSTEM) &&
-  (bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
-  (flags & bctl->sys.target)) {
-   tgt = BTRFS_BLOCK_GROUP_SYSTEM | bctl->sys.target;
-   } else if ((flags & BTRFS_BLOCK_GROUP_METADATA) &&
-  (bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
-  (flags & bctl->meta.target)) {
-   tgt = BTRFS_BLOCK_GROUP_METADATA | bctl->meta.target;
-   }
-
-   if (tgt) {
+   target = get_restripe_target(root->fs_info, flags);
+   if (target) {
+   /* pick target profile only if it's already available */
+   if ((flags & target) & BTRFS_EXTENDED_PROFILE_MASK) {
spin_unlock(&root->fs_info->balance_lock);
-   flags = tgt;
-   goto out;
+   return extended_to_chunk(target);
}
}
spin_unlock(&root->fs_info->balance_lock);
@@ -3177,7 +3194,6 @@ u64 btrfs_reduce_alloc_profile(struct btrfs_root *root, 
u64 flags)
flags &= ~BTRFS_BLOCK_GROUP_RAID0;
}
 
-out:
return extended_to_chunk(flags);
 }
 
@@ -6888,28 +6904,15 @@ int btrfs_drop_subtree(struct btrfs_trans_handle *trans,
 static u64 update_block_group_flags(struct btrfs_root *root, u64 flags)
 {
u64 num_devices;
-   u64 stripped = BTRFS_BLOCK_GROUP_RAID0 |
-   BTRFS_BLOCK_GROUP_RAID1 | BTRFS_BLOCK_GROUP_RAID10;
-
-   if (root->fs_info->balance_ctl) {
-   struct btrfs_balance_control *bctl = root->fs_info->balance_ctl;
-   u64 tgt = 0;
-
-   /* pick restriper's target profile and return */
-   if (flags & BTRFS_BLOCK_GROUP_DATA &&
-   bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) {
-   tgt = BTRFS_BLOCK_GROUP_DATA | bctl->data.target;
-   } else if (flags & BTRFS_BLOCK_GROUP_SYSTEM &&
-  bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) {
-   tgt = BTRFS_BLOCK_GROUP_SYSTEM | bctl->sys.target;
-   } else if (flags & BTRFS_BLOCK_GROUP_METADATA &&
-  bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) {
-   tgt = BTRFS_BLOCK_GROUP_METADATA | bctl->meta.target;
-   }
+   u64 

[PATCH 5/8] Btrfs: add __get_block_group_index() helper

2012-03-27 Thread Ilya Dryomov
Add __get_block_group_index() helper to be able to derive block group
index from an arbitrary set of flags.  Implement get_block_group_index()
in terms of it.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/extent-tree.c |   17 -
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index faf52e0..c44aa96 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5248,22 +5248,29 @@ wait_block_group_cache_done(struct 
btrfs_block_group_cache *cache)
return 0;
 }
 
-static int get_block_group_index(struct btrfs_block_group_cache *cache)
+static int __get_block_group_index(u64 flags)
 {
int index;
-   if (cache->flags & BTRFS_BLOCK_GROUP_RAID10)
+
+   if (flags & BTRFS_BLOCK_GROUP_RAID10)
index = 0;
-   else if (cache->flags & BTRFS_BLOCK_GROUP_RAID1)
+   else if (flags & BTRFS_BLOCK_GROUP_RAID1)
index = 1;
-   else if (cache->flags & BTRFS_BLOCK_GROUP_DUP)
+   else if (flags & BTRFS_BLOCK_GROUP_DUP)
index = 2;
-   else if (cache->flags & BTRFS_BLOCK_GROUP_RAID0)
+   else if (flags & BTRFS_BLOCK_GROUP_RAID0)
index = 3;
else
index = 4;
+
return index;
 }
 
+static int get_block_group_index(struct btrfs_block_group_cache *cache)
+{
+   return __get_block_group_index(cache->flags);
+}
+
 enum btrfs_loop_type {
LOOP_CACHING_NOWAIT = 0,
LOOP_CACHING_WAIT = 1,
-- 
1.7.9.1



[PATCH 2/8] Btrfs: make profile_is_valid() check more strict

2012-03-27 Thread Ilya Dryomov
0 is a valid value for an on-disk chunk profile, but it is not a valid
extended profile.  (We have a separate bit for single chunks in extended
case)

Also rename it to alloc_profile_is_valid() for clarity.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/ctree.h   |   21 +
 fs/btrfs/extent-tree.c |2 +-
 fs/btrfs/volumes.c |6 +++---
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index aba7832..f057e92 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2735,22 +2735,27 @@ static inline void free_fs_info(struct btrfs_fs_info 
*fs_info)
kfree(fs_info);
 }
 /**
- * profile_is_valid - tests whether a given profile is valid and reduced
+ * alloc_profile_is_valid - see if a given profile is valid and reduced
  * @flags: profile to validate
  * @extended: if true @flags is treated as an extended profile
  */
-static inline int profile_is_valid(u64 flags, int extended)
+static inline int alloc_profile_is_valid(u64 flags, int extended)
 {
-   u64 mask = ~BTRFS_BLOCK_GROUP_PROFILE_MASK;
+   u64 mask = (extended ? BTRFS_EXTENDED_PROFILE_MASK :
+  BTRFS_BLOCK_GROUP_PROFILE_MASK);
 
flags &= ~BTRFS_BLOCK_GROUP_TYPE_MASK;
-   if (extended)
-   mask &= ~BTRFS_AVAIL_ALLOC_BIT_SINGLE;
 
-   if (flags & mask)
+   /* 1) check that all other bits are zeroed */
+   if (flags & ~mask)
return 0;
-   /* true if zero or exactly one bit set */
-   return (flags & (~flags + 1)) == flags;
+
+   /* 2) see if profile is reduced */
+   if (flags == 0)
+   return !extended; /* 0 is valid for usual profiles */
+
+   /* true if exactly one bit set */
+   return (flags & (flags - 1)) == 0;
 }
 
 /* root-item.c */
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 9f16fdb..8c5bd8f 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3400,7 +3400,7 @@ static int do_chunk_alloc(struct btrfs_trans_handle 
*trans,
int wait_for_alloc = 0;
int ret = 0;
 
-   BUG_ON(!profile_is_valid(flags, 0));
+   BUG_ON(!alloc_profile_is_valid(flags, 0));
 
space_info = __find_space_info(extent_root->fs_info, flags);
if (!space_info) {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4b263a2..e4ef0f2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2669,7 +2669,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1 |
BTRFS_BLOCK_GROUP_RAID10);
 
-   if (!profile_is_valid(bctl->data.target, 1) ||
+   if (!alloc_profile_is_valid(bctl->data.target, 1) ||
bctl->data.target & ~allowed) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "data profile %llu\n",
@@ -2677,7 +2677,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
ret = -EINVAL;
goto out;
}
-   if (!profile_is_valid(bctl->meta.target, 1) ||
+   if (!alloc_profile_is_valid(bctl->meta.target, 1) ||
bctl->meta.target & ~allowed) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "metadata profile %llu\n",
@@ -2685,7 +2685,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
ret = -EINVAL;
goto out;
}
-   if (!profile_is_valid(bctl->sys.target, 1) ||
+   if (!alloc_profile_is_valid(bctl->sys.target, 1) ||
bctl->sys.target & ~allowed) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "system profile %llu\n",
-- 
1.7.9.1
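
For readers who want to try the stricter check outside the kernel, the
following user-space sketch mirrors the logic of alloc_profile_is_valid()
as posted above. The DEMO_* bit positions (other than bit 48 for the
"single" marker) are assumptions made for the example, not the values from
fs/btrfs/ctree.h.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only (assumed for this sketch). */
#define DEMO_TYPE_MASK		((1ULL << 0) | (1ULL << 1) | (1ULL << 2))
#define DEMO_DATA		(1ULL << 0)
#define DEMO_RAID0		(1ULL << 3)
#define DEMO_RAID1		(1ULL << 4)
#define DEMO_DUP		(1ULL << 5)
#define DEMO_RAID10		(1ULL << 6)
#define DEMO_PROFILE_MASK	(DEMO_RAID0 | DEMO_RAID1 | DEMO_DUP | DEMO_RAID10)
#define DEMO_SINGLE		(1ULL << 48)
#define DEMO_EXTENDED_MASK	(DEMO_PROFILE_MASK | DEMO_SINGLE)

/* Returns 1 if flags hold a valid, reduced profile, else 0. */
int demo_alloc_profile_is_valid(uint64_t flags, int extended)
{
	uint64_t mask = extended ? DEMO_EXTENDED_MASK : DEMO_PROFILE_MASK;

	/* The block group type bits are not part of the profile. */
	flags &= ~DEMO_TYPE_MASK;

	/* 1) no bits outside the profile mask may be set */
	if (flags & ~mask)
		return 0;

	/* 2) zero means "single" in chunk format, invalid in extended */
	if (flags == 0)
		return !extended;

	/* 3) exactly one bit: clearing the lowest set bit leaves zero */
	return (flags & (flags - 1)) == 0;
}
```

With these definitions, demo_alloc_profile_is_valid(DEMO_DATA, 0) returns 1
(single, chunk format), while demo_alloc_profile_is_valid(0, 1) returns 0,
which is the case the old "zero or exactly one bit" test let through.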



[PATCH 6/8] Btrfs: improve the logic in btrfs_can_relocate()

2012-03-27 Thread Ilya Dryomov
Currently if we don't have enough space allocated we go ahead and loop
through devices in the hopes of finding enough space for a chunk of the
*same* type as the one we are trying to relocate.  The problem with that
is that if we are trying to restripe the chunk, its target type can be
more relaxed than the current one (e.g. require fewer devices or less
space).  So, when restriping, run checks against the target profile
instead of the current one.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/extent-tree.c |   24 ++--
 1 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index c44aa96..9454045 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7136,6 +7136,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
bytenr)
u64 min_free;
u64 dev_min = 1;
u64 dev_nr = 0;
+   u64 target;
int index;
int full = 0;
int ret = 0;
@@ -7176,13 +7177,11 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
bytenr)
/*
 * ok we don't have enough space, but maybe we have free space on our
 * devices to allocate new chunks for relocation, so loop through our
-* alloc devices and guess if we have enough space.  However, if we
-* were marked as full, then we know there aren't enough chunks, and we
-* can just return.
+* alloc devices and guess if we have enough space.  if this block
+* group is going to be restriped, run checks against the target
+* profile instead of the current one.
 */
ret = -1;
-   if (full)
-   goto out;
 
/*
 * index:
@@ -7192,7 +7191,20 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 
bytenr)
 *  3: raid0
 *  4: single
 */
-   index = get_block_group_index(block_group);
+   target = get_restripe_target(root->fs_info, block_group->flags);
+   if (target) {
+   index = __get_block_group_index(extended_to_chunk(target));
+   } else {
+   /*
+* this is just a balance, so if we were marked as full
+* we know there is no space for a new chunk
+*/
+   if (full)
+   goto out;
+
+   index = get_block_group_index(block_group);
+   }
+
if (index == 0) {
dev_min = 4;
/* Divide by 2 */
-- 
1.7.9.1
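
The decision this patch changes can be modelled in a few lines of
user-space C. This is only a sketch: the index values (0 for raid10 through
4 for single) follow the comment in the diff, but the DEMO_* names, the bit
positions and the -1 "give up" return value are assumptions made for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only (assumed for this sketch). */
#define DEMO_RAID0	(1ULL << 3)
#define DEMO_RAID1	(1ULL << 4)
#define DEMO_DUP	(1ULL << 5)
#define DEMO_RAID10	(1ULL << 6)
#define DEMO_SINGLE	(1ULL << 48)	/* extended-format "single" bit */

/* Same mapping as __get_block_group_index() in the patch series. */
int demo_block_group_index(uint64_t flags)
{
	if (flags & DEMO_RAID10)
		return 0;
	if (flags & DEMO_RAID1)
		return 1;
	if (flags & DEMO_DUP)
		return 2;
	if (flags & DEMO_RAID0)
		return 3;
	return 4;	/* single */
}

/*
 * Pick the profile index to run the free-space checks against.
 * restripe_target is in extended format, 0 if no restripe is running.
 * Returns -1 when a plain balance hits a space_info marked full.
 */
int demo_relocation_index(uint64_t current_flags, uint64_t restripe_target,
			  int full)
{
	if (restripe_target)
		return demo_block_group_index(restripe_target & ~DEMO_SINGLE);

	if (full)
		return -1;

	return demo_block_group_index(current_flags);
}
```

For example, a raid10 block group being converted to single is checked with
index 4 even when the space_info is full, so it only needs one device
rather than four.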



[PATCH 7/8] Btrfs: validate target profiles only if we are going to use them

2012-03-27 Thread Ilya Dryomov
Do not run sanity checks on all target profiles unless they all will be
used.  This came up because alloc_profile_is_valid() is now more strict
than it used to be.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/volumes.c |   27 +++
 1 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index def9e25..28addea 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2676,14 +2676,6 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
}
}
 
-   /*
-* Profile changing sanity checks.  Skip them if a simple
-* balance is requested.
-*/
-   if (!((bctl->data.flags | bctl->sys.flags | bctl->meta.flags) &
- BTRFS_BALANCE_ARGS_CONVERT))
-   goto do_balance;
-
allowed = BTRFS_AVAIL_ALLOC_BIT_SINGLE;
if (fs_info->fs_devices->num_devices == 1)
allowed |= BTRFS_BLOCK_GROUP_DUP;
@@ -2693,24 +2685,27 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
allowed |= (BTRFS_BLOCK_GROUP_RAID0 | BTRFS_BLOCK_GROUP_RAID1 |
BTRFS_BLOCK_GROUP_RAID10);
 
-   if (!alloc_profile_is_valid(bctl->data.target, 1) ||
-   bctl->data.target & ~allowed) {
+   if ((bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+   (!alloc_profile_is_valid(bctl->data.target, 1) ||
+(bctl->data.target & ~allowed))) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "data profile %llu\n",
   (unsigned long long)bctl->data.target);
ret = -EINVAL;
goto out;
}
-   if (!alloc_profile_is_valid(bctl->meta.target, 1) ||
-   bctl->meta.target & ~allowed) {
+   if ((bctl->meta.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+   (!alloc_profile_is_valid(bctl->meta.target, 1) ||
+(bctl->meta.target & ~allowed))) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "metadata profile %llu\n",
   (unsigned long long)bctl->meta.target);
ret = -EINVAL;
goto out;
}
-   if (!alloc_profile_is_valid(bctl->sys.target, 1) ||
-   bctl->sys.target & ~allowed) {
+   if ((bctl->sys.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+   (!alloc_profile_is_valid(bctl->sys.target, 1) ||
+(bctl->sys.target & ~allowed))) {
printk(KERN_ERR "btrfs: unable to start balance with target "
   "system profile %llu\n",
   (unsigned long long)bctl->sys.target);
@@ -2718,7 +2713,8 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
goto out;
}
 
-   if (bctl->data.target & BTRFS_BLOCK_GROUP_DUP) {
+   if ((bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+   (bctl->data.target & BTRFS_BLOCK_GROUP_DUP)) {
printk(KERN_ERR "btrfs: dup for data is not allowed\n");
ret = -EINVAL;
goto out;
@@ -2744,7 +2740,6 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
}
}
 
-do_balance:
ret = insert_balance_item(fs_info->tree_root, bctl);
if (ret && ret != -EEXIST)
goto out;
-- 
1.7.9.1
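
A reduced model of the per-target check may make the reason for this change
easier to see: once 0 is no longer a valid extended profile, an unused
(zeroed) target would be rejected unless the check is skipped for it. The
sketch below is user-space C; struct demo_balance_args, the DEMO_* values
and the simplified validity test are assumptions made for the example and
do not reproduce struct btrfs_balance_args.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative values only (assumed for this sketch). */
#define DEMO_BALANCE_ARGS_CONVERT	(1ULL << 8)
#define DEMO_RAID1			(1ULL << 4)
#define DEMO_DUP			(1ULL << 5)
#define DEMO_SINGLE			(1ULL << 48)

/* Hypothetical stand-in for the per-type balance arguments. */
struct demo_balance_args {
	uint64_t flags;
	uint64_t target;	/* extended format, used only with CONVERT */
};

/* A target is checked only if this chunk type will be converted. */
static int demo_target_ok(const struct demo_balance_args *args,
			  uint64_t allowed)
{
	uint64_t t = args->target;

	if (!(args->flags & DEMO_BALANCE_ARGS_CONVERT))
		return 1;	/* target is not going to be used */

	/* extended profile: exactly one bit, and it must be allowed */
	if (t == 0 || (t & (t - 1)) != 0)
		return 0;
	return (t & ~allowed) == 0;
}

/* Returns 0 if all targets that will be used are acceptable. */
int demo_check_targets(const struct demo_balance_args *data,
		       const struct demo_balance_args *meta,
		       const struct demo_balance_args *sys,
		       uint64_t allowed)
{
	if (!demo_target_ok(data, allowed) ||
	    !demo_target_ok(meta, allowed) ||
	    !demo_target_ok(sys, allowed))
		return -EINVAL;
	return 0;
}
```

Converting only data to raid1 while leaving the metadata and system
arguments zeroed passes in this model, whereas a zero target combined with
the convert flag is rejected.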



[PATCH 8/8] Btrfs: allow dup for data chunks in mixed mode

2012-03-27 Thread Ilya Dryomov
Generally we don't allow dup for data, but mixed chunks are special and
people seem to think this has its use cases.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/volumes.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 28addea..bcc0acd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2650,6 +2650,7 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
 {
struct btrfs_fs_info *fs_info = bctl->fs_info;
u64 allowed;
+   int mixed = 0;
int ret;
 
if (btrfs_fs_closing(fs_info) ||
@@ -2659,13 +2660,16 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
goto out;
}
 
+   allowed = btrfs_super_incompat_flags(fs_info->super_copy);
+   if (allowed & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS)
+   mixed = 1;
+
/*
 * In case of mixed groups both data and meta should be picked,
 * and identical options should be given for both of them.
 */
-   allowed = btrfs_super_incompat_flags(fs_info->super_copy);
-   if ((allowed & BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS) &&
-   (bctl->flags & (BTRFS_BALANCE_DATA | BTRFS_BALANCE_METADATA))) {
+   allowed = BTRFS_BALANCE_DATA | BTRFS_BALANCE_METADATA;
+   if (mixed && (bctl->flags & allowed)) {
if (!(bctl->flags & BTRFS_BALANCE_DATA) ||
!(bctl->flags & BTRFS_BALANCE_METADATA) ||
memcmp(&bctl->data, &bctl->meta, sizeof(bctl->data))) {
@@ -2713,7 +2717,8 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
goto out;
}
 
-   if ((bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
+   /* allow dup'ed data chunks only in mixed mode */
+   if (!mixed && (bctl->data.flags & BTRFS_BALANCE_ARGS_CONVERT) &&
(bctl->data.target & BTRFS_BLOCK_GROUP_DUP)) {
printk(KERN_ERR "btrfs: dup for data is not allowed\n");
ret = -EINVAL;
-- 
1.7.9.1
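
The rule this patch introduces can be sketched as a standalone predicate. This is a minimal sketch and not the kernel code: the function name check_data_target() and the SKETCH_* bit values are assumptions made for illustration, and the real constants should be taken from the btrfs headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative bit values for this sketch; check the kernel headers for the real ones. */
#define SKETCH_ARGS_CONVERT	(1ULL << 8)
#define SKETCH_GROUP_DUP	(1ULL << 5)

/*
 * Hypothetical helper: returns 0 if the requested data profile is
 * acceptable, -EINVAL if a conversion to dup is requested on a
 * filesystem without mixed block groups.
 */
int check_data_target(int mixed, uint64_t data_flags, uint64_t data_target)
{
	assert(mixed == 0 || mixed == 1);

	if (!mixed && (data_flags & SKETCH_ARGS_CONVERT) &&
	    (data_target & SKETCH_GROUP_DUP))
		return -EINVAL;
	return 0;
}
```

With mixed set to 0, a convert request targeting dup is rejected as before; with mixed set to 1 the same request passes, which is the only behavioural change the patch describes.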



[PATCH] Btrfs: fix infinite loop in btrfs_shrink_device()

2012-03-27 Thread Ilya Dryomov
If relocation of block group 0 fails with ENOSPC, we end up looping
infinitely because the key.offset -= 1 statement in that case brings us
back to where we started.

Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/volumes.c |5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bcc0acd..be2d4e0 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2987,7 +2987,7 @@ again:
key.offset = (u64)-1;
key.type = BTRFS_DEV_EXTENT_KEY;
 
-   while (1) {
+   do {
ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
if (ret < 0)
goto done;
@@ -3029,8 +3029,7 @@ again:
goto done;
if (ret == -ENOSPC)
failed++;
-   key.offset -= 1;
-   }
+   } while (key.offset-- > 0);
 
if (failed && !retried) {
failed = 0;
-- 
1.7.9.1
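
The loop change above can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel code: walk_old(), walk_new(), extent_offset and max_passes are hypothetical names, the model has a single device extent whose relocation always fails, and the pass cap exists only so that the old loop shape can be observed without hanging.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model: one device extent sits at extent_offset and its relocation
 * always fails, so every search for the highest extent at or below
 * key_offset finds it again. Both functions return the number of passes
 * made, capped at max_passes.
 */

/* Old shape: while (1) { ...; key.offset -= 1; } */
unsigned int walk_old(uint64_t extent_offset, unsigned int max_passes)
{
	uint64_t key_offset = UINT64_MAX;	/* (u64)-1, the starting key */
	unsigned int passes = 0;

	assert(max_passes > 0);
	while (passes < max_passes) {		/* cap stands in for while (1) */
		passes++;
		if (key_offset < extent_offset)	/* nothing at or below the key */
			break;
		key_offset = extent_offset;	/* search found the extent */
		key_offset -= 1;		/* 0 wraps back to UINT64_MAX */
	}
	return passes;
}

/* New shape: do { ... } while (key.offset-- > 0); */
unsigned int walk_new(uint64_t extent_offset, unsigned int max_passes)
{
	uint64_t key_offset = UINT64_MAX;
	unsigned int passes = 0;

	assert(max_passes > 0);
	do {
		passes++;
		if (key_offset < extent_offset)
			break;
		key_offset = extent_offset;
	} while (key_offset-- > 0 && passes < max_passes);
	return passes;
}
```

In this model an extent at offset 0 keeps walk_old() running until the cap, because the unsigned decrement wraps to the starting key, while walk_new() stops after the pass that reaches offset 0. For a non-zero offset both shapes behave the same.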



Re: btrfs: open_ctree failed

2012-03-27 Thread Not Zippy
I had found that note on restore, but my restore.c does not allow
that flag (it is missing the m flag as well). I used the branch
dangerousdonteveruse on
https://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
and then switched to the master branch to see if there was a difference,
but it does not appear to be any different. (I did find a btrfs-progs on
GitHub which appears to have those flags, but I thought the best one to
use would be the one on git.kernel.org.)

Assuming I can locate the correct restore.c, is there some other
software to determine the object id of the subvolume? The root
object id was 5.

thanks
Nz

On Tue, Mar 27, 2012 at 6:02 AM, Hugo Mills h...@carfax.org.uk wrote:
> On Tue, Mar 27, 2012 at 05:58:17AM -0700, Not Zippy wrote:
>> One entire subvolume was restored. But there were 4 subvolumes on that
>> partition. Is there a way to specify/force the restore of a different
>> subvolume ?
>>
>> find-root seems to only find a single root.
>
>    There is only a single root tree, so that's understandable. If you
> have a look at the documentation for restore[1], it mentions (right
> near the bottom of the page) that -r will allow you to select an
> alternative subvolume to recover from.



Re: btrfs csum failed, scrub ok

2012-03-27 Thread cwillu
On Tue, Mar 27, 2012 at 4:57 AM, Christoph Groth c...@falma.de wrote:
> I have a freshly installed system with btrfs as the root file system.
> The machine is running linux 3.2.  The raid1 btrfs file system lives on
> two new hard drives.
>
> About one day after installation the following message appeared in
> kern.log.  There were no other errors.
>
> root@mim:/var/log# grep 'btrfs.*fail' kern.log
> Mar 27 01:07:46 mim kernel: [ 6480.233861] btrfs csum failed ino 453509 off 1495040 csum 3301532933 private 4156998194
> Mar 27 01:07:46 mim kernel: [ 6480.234470] btrfs csum failed ino 453509 off 1499136 csum 1873118812 private 3512102188
> Mar 27 01:07:46 mim kernel: [ 6480.234572] btrfs csum failed ino 453509 off 1503232 csum 1034640717 private 2041007647
> Mar 27 01:07:46 mim kernel: [ 6480.234670] btrfs csum failed ino 453509 off 1507328 csum 889729013 private 2342095239
> Mar 27 01:07:46 mim kernel: [ 6480.237977] btrfs csum failed ino 453509 off 1503232 csum 1518679450 private 2041007647
> Mar 27 01:07:46 mim kernel: [ 6480.238149] btrfs csum failed ino 453509 off 1507328 csum 889729013 private 2342095239
> Mar 27 01:07:46 mim kernel: [ 6480.238330] btrfs csum failed ino 453509 off 1495040 csum 3234580989 private 4156998194
> Mar 27 01:07:46 mim kernel: [ 6480.238447] btrfs csum failed ino 453509 off 1499136 csum 1873118812 private 3512102188
> Mar 27 01:07:46 mim kernel: [ 6480.243873] btrfs csum failed ino 453509 off 1503232 csum 2184012753 private 2041007647
> Mar 27 01:07:46 mim kernel: [ 6480.243962] btrfs csum failed ino 453509 off 1507328 csum 240604621 private 2342095239
>
> inode 453509 belongs to a file installed by dpkg
>
> root@mim:/# find / -inum 453509 -ls
> 453509 1976 -rw-r--r--   1 root     root      2020832 Mar  7 21:11 /usr/lib/libreoffice/basis3.4/program/libsblx.so
>
> That file seems to be ok, there are no errors when re-reading it.
>
> A scrub done the morning after the incident also didn't find any
> problems:
>
> root@mim:/home/cwg# btrfs scrub status /
> scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686
>         scrub started at Tue Mar 27 10:37:49 2012 and finished after 3921 seconds
>         total bytes scrubbed: 550.20GB with 0 errors

If btrfs is able to find a good copy, it will fix the bad copy automatically.


Re: btrfs csum failed, scrub ok

2012-03-27 Thread Jan Schmidt
On 27.03.2012 18:24, cwillu wrote:
> On Tue, Mar 27, 2012 at 4:57 AM, Christoph Groth c...@falma.de wrote:
>> A scrub done the morning after the incident also didn't find any
>> problems:
>>
>> root@mim:/home/cwg# btrfs scrub status /
>> scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686
>>         scrub started at Tue Mar 27 10:37:49 2012 and finished after 3921 seconds
>>         total bytes scrubbed: 550.20GB with 0 errors
>
> If btrfs is able to find a good copy, it will fix the bad copy automatically.

It does mention this in your logs, though. Grep for "repair"; if it
doesn't occur, btrfs didn't repair any failures.

Scrub would normally find and count checksum errors, though.

-Jan


Re: btrfs csum failed, scrub ok

2012-03-27 Thread Christoph Groth
Jan Schmidt list.bt...@jan-o-sch.net writes:
> On 27.03.2012 18:24, cwillu wrote:
>> On Tue, Mar 27, 2012 at 4:57 AM, Christoph Groth c...@falma.de wrote:
>>> A scrub done the morning after the incident also didn't find any
>>> problems:
>>>
>>> root@mim:/home/cwg# btrfs scrub status /
>>> scrub status for 2da00153-f9ea-4d6c-a6cc-10c913d22686
>>>         scrub started at Tue Mar 27 10:37:49 2012 and finished after 3921 seconds
>>>         total bytes scrubbed: 550.20GB with 0 errors
>>
>> If btrfs is able to find a good copy, it will fix the bad copy automatically.
>
> It does mention this in your logs, though. Grep for repair, if it
> doesn't occur, btrfs didn't repair any failures.

"repair" doesn't occur in the logs.  Actually, there are no other
entries from btrfs.

So why didn't btrfs try to repair a block it believed to be bad?



Create subvolume from a directory?

2012-03-27 Thread Alex
Hi all,

Just a quick question but can't find an obvious answer.

Can I create/convert an existing (btrfs) directory into a subvolume?

It would be very helpful when transferring 'partitions' into btrfs.
I found a similar question way back in google, but that site is
down now generally.

Thanks in advance.







Re: btrfs: open_ctree failed

2012-03-27 Thread Not Zippy
Thought I would let you know I did get things figured out. I used
btrfs-progs from github
https://github.com/josefbacik/btrfs-progs

I also used the findroot function from there which generated more
possibilities for the root objectid.
By plugging the guesses from findroot into -r objectid for the
restore I was able to access the data from my subvolumes.

thanks
Nz

On Tue, Mar 27, 2012 at 8:21 AM, Not Zippy notzi...@gmail.com wrote:
> I had found that note on restore, but my restore.c does not allow
> that flag (it is missing the m flag as well). I used the branch
> dangerousdonteveruse on
> https://git.kernel.org/?p=linux/kernel/git/mason/btrfs-progs.git
> and then switched to the master branch to see if there was a difference,
> but it does not appear to be any different. (I did find a btrfs-progs on
> GitHub which appears to have those flags, but I thought the best one to
> use would be the one on git.kernel.org.)
>
> Assuming I can locate the correct restore.c, is there some other
> software to determine the object id of the subvolume? The root
> object id was 5.
>
> thanks
> Nz
>
> On Tue, Mar 27, 2012 at 6:02 AM, Hugo Mills h...@carfax.org.uk wrote:
>> On Tue, Mar 27, 2012 at 05:58:17AM -0700, Not Zippy wrote:
>>> One entire subvolume was restored. But there were 4 subvolumes on that
>>> partition. Is there a way to specify/force the restore of a different
>>> subvolume ?
>>>
>>> find-root seems to only find a single root.
>>
>>    There is only a single root tree, so that's understandable. If you
>> have a look at the documentation for restore[1], it mentions (right
>> near the bottom of the page) that -r will allow you to select an
>> alternative subvolume to recover from.



Re: Create subvolume from a directory?

2012-03-27 Thread Chester
On Tue, Mar 27, 2012 at 12:19 PM, Alex a...@bpmit.com wrote:
> Hi all,
>
> Just a quick question but can't find an obvious answer.
>
> Can I create/convert an existing (btrfs) directory into a subvolume?
>
> It would be very helpful when transferring 'partitions' into btrfs.
> I found a similar question way back in google, but that site is
> down now generally.
>
> Thanks in advance.



I don't think this is possible. The closest thing I can think of is to
take a snapshot of the volume, move the directory to the top of the
subvolume, then delete all other content..

... That seems like an awful lot of work, but it'll preserve the
contents of the directory without making duplicates.






Re: Create subvolume from a directory?

2012-03-27 Thread Matthias G. Eckermann
Hello Alex and all,

On 2012-03-27 T 17:19 + Alex wrote:

> Just a quick question but can't find an obvious answer.
>
> Can I create/convert an existing (btrfs) directory into a
> subvolume?
>
> It would be very helpful when transferring 'partitions'
> into btrfs.  I found a similar question way back in
> google, but that site is down now generally.

As far as I am aware, this is not possible directly. My
approach to this would be using copy with reflinks:

-- snip --

## migrate /var/lib/lxc/installserver 
## from directory to btrfs subvolume

# du -ks /var/lib/lxc/installserver
500332  /var/lib/lxc/installserver

# mv /var/lib/lxc/installserver /var/lib/lxc/installserver_tmp

# btrfs subvol create /var/lib/lxc/installserver
Create subvolume '/var/lib/lxc/installserver'

# time cp -a --reflink /var/lib/lxc/installserver_tmp/rootfs /var/lib/lxc/installserver

real    0m1.367s
user    0m0.148s
sys     0m1.108s

## Now remove /var/lib/lxc/installserver_tmp (or not)

-- snap --

Just to compare this with a mv:

-- snip --

## Go back to former state

# btrfs subvol delete /var/lib/lxc/installserver
Delete subvolume '/var/lib/lxc/installserver'

# btrfs subvol create /var/lib/lxc/installserver
Create subvolume '/var/lib/lxc/installserver'

# time mv /var/lib/lxc/installserver_tmp/rootfs /var/lib/lxc/installserver/

real    0m12.917s
user    0m0.208s
sys     0m2.508s

-- snap --

While the time measurement might be flawed due to the subvol
actions in between, caching etc.: I tried several times, and
cp --reflink is always multiple times faster than mv in
my environment.

Or did I misunderstand your question?

so long -
MgE

-- 
Matthias G. Eckermann Senior Product Manager   SUSE® Linux Enterprise
SUSE LINUX Products GmbH  Maxfeldstraße 5  90409 Nürnberg Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)


Re: btrfs crash after disk reconnect

2012-03-27 Thread Jan Engelhardt
On Monday 2012-03-26 03:42, Liu Bo wrote:

>On 03/23/2012 08:07 PM, Jan Engelhardt wrote:
>> Observed on Linux 3.2.9 after the controller/disk flaked in-out.
>> (The world still needs a SCSI error decoding tool to tell normal people
>> what cmd and res are about.)
>>
>
>I'm not that sure if your 3.2.9-jng4-default build contains this commit or not:
>
>commit 8bedd51b6121c4607784d75f852828d25d119c52
>(Btrfs: Check for NULL page in extent_range_uptodate)

8bedd isn't in 3.2.9; thanks for the hint, I will try that one.


Re: Create subvolume from a directory?

2012-03-27 Thread Fajar A. Nugraha
On Wed, Mar 28, 2012 at 5:24 AM, Matthias G. Eckermann m...@suse.com wrote:
> While the time measurement might be flawed due to the subvol
> actions in between, caching etc.: I tried several times, and
> cp --reflink is always multiple times faster than mv in
> my environment.

So this is cross-subvolume reflinks? I thought the code for that
wasn't merged yet?

-- 
Fajar


Re: Create subvolume from a directory?

2012-03-27 Thread Liu Bo
On 03/28/2012 06:24 AM, Matthias G. Eckermann wrote:
> Hello Alex and all,
>
> On 2012-03-27 T 17:19 + Alex wrote:
>
>> Just a quick question but can't find an obvious answer.
>>
>> Can I create/convert an existing (btrfs) directory into a
>> subvolume?
>>
>> It would be very helpful when transferring 'partitions'
>> into btrfs.  I found a similar question way back in
>> google, but that site is down now generally.
>
> As far as I am aware, this is not possible directly. My
> approach to this would be using copy with reflinks:
>
> -- snip --
>
> ## migrate /var/lib/lxc/installserver
> ## from directory to btrfs subvolume
>
> # du -ks /var/lib/lxc/installserver
> 500332  /var/lib/lxc/installserver
>
> # mv /var/lib/lxc/installserver /var/lib/lxc/installserver_tmp
>
> # btrfs subvol create /var/lib/lxc/installserver
> Create subvolume '/var/lib/lxc/installserver'
>
> # time cp -a --reflink /var/lib/lxc/installserver_tmp/rootfs /var/lib/lxc/installserver
>

This is really weird.

AFAIK, clone between different subvolumes should be forbidden.
So this would get an "Invalid cross-device link" error, because an
individual subvolume can be mounted directly.

thanks,
liubo

> real    0m1.367s
> user    0m0.148s
> sys     0m1.108s
>
> ## Now remove /var/lib/lxc/installserver_tmp (or not)
>
> -- snap --
>
> Just to compare this with a mv:
>
> -- snip --
>
> ## Go back to former state
>
> # btrfs subvol delete /var/lib/lxc/installserver
> Delete subvolume '/var/lib/lxc/installserver'
>
> # btrfs subvol create /var/lib/lxc/installserver
> Create subvolume '/var/lib/lxc/installserver'
>
> # time mv /var/lib/lxc/installserver_tmp/rootfs /var/lib/lxc/installserver/
>
> real    0m12.917s
> user    0m0.208s
> sys     0m2.508s
>
> -- snap --
>
> While the time measurement might be flawed due to the subvol
> actions in between, caching etc.: I tried several times, and
> cp --reflink is always multiple times faster than mv in
> my environment.
>
> Or did I misunderstand your question?
>
> so long -
>    MgE
>
