Re: snapshot space available

2013-08-27 Thread Duncan
Chris Murphy posted on Mon, 26 Aug 2013 09:15:46 -0600 as excerpted:

 I'm uncertain if autodefrag avoids this problem. It does seem like in
 certain instances, like this, the file system needs to be able to prune
 itself somehow, like a partial balance to consolidate data chunks and
 then release their space so they can become metadata chunks.

Intriguing question re defrag.  I hadn't thought of the possibility until 
you suggested it, but indeed, tracking hundreds of extents instead of one 
or a dozen, does sound like it could reduce metadata usage, reducing the 
chance of running into the issue in the first place as well as lessening 
the chance of a simple delete temporarily requiring significant new 
metadata resources in ordered to track all those extent frees before the 
final atomic root entry update.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: stop refusing the relocation of chunk 0

2013-08-27 Thread Ilya Dryomov
AFAICT chunk 0 is no longer special, and so it should be restriped just
like every other chunk.  One reason for this change is us refusing the
relocation can lead to filesystems that can only be mounted ro, and
never rw -- see the bugzilla [1] for details.  The other reason is that
device removal code is already doing this: it will happily relocate
chunk 0 is part of shrinking the device.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=60594

Reported-by: Xavier Bassery xav...@bartica.org
Signed-off-by: Ilya Dryomov idryo...@gmail.com
---
 fs/btrfs/volumes.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c9a0977..e8325de 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2991,10 +2991,6 @@ again:
if (found_key.objectid != key.objectid)
break;
 
-   /* chunk zero is special */
-   if (found_key.offset == 0)
-   break;
-
chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk);
 
if (!counting) {
@@ -3030,6 +3026,8 @@ again:
spin_unlock(fs_info-balance_lock);
}
 loop:
+   if (found_key.offset == 0)
+   break;
key.offset = found_key.offset - 1;
}
 
-- 
1.7.10.4

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: handle errors when doing slow caching

2013-08-27 Thread Josef Bacik
On Tue, Aug 27, 2013 at 01:26:36PM +0300, Alex Lyakas wrote:
 Hi Josef,
 thanks for addressing this.
 
 On Mon, Aug 5, 2013 at 6:19 PM, Josef Bacik jba...@fusionio.com wrote:
  Alex Lyakas reported a bug where wait_block_group_cache_progress() would 
  wait
  forever if a drive failed.  This is because we just bail out if there is an
  error while trying to cache a block group, we don't update anybody who may 
  be
  waiting.  So this introduces a new enum for the cache state in case of 
  error and
  makes everybody bail out if we have an error.  Alex tested and verified this
  patch fixed his problem.  This fixes bz 59431.  Thanks,
 
  Signed-off-by: Josef Bacik jba...@fusionio.com
  ---
   fs/btrfs/ctree.h   |1 +
   fs/btrfs/extent-tree.c |   27 ---
   2 files changed, 21 insertions(+), 7 deletions(-)
 
  diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
  index cbb1263..c17acbc 100644
  --- a/fs/btrfs/ctree.h
  +++ b/fs/btrfs/ctree.h
  @@ -1188,6 +1188,7 @@ enum btrfs_caching_type {
  BTRFS_CACHE_STARTED = 1,
  BTRFS_CACHE_FAST= 2,
  BTRFS_CACHE_FINISHED= 3,
  +   BTRFS_CACHE_ERROR   = 4,
   };
 
   enum btrfs_disk_cache_state {
  diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
  index e868c35..e6dfa7f 100644
  --- a/fs/btrfs/extent-tree.c
  +++ b/fs/btrfs/extent-tree.c
  @@ -113,7 +113,8 @@ static noinline int
   block_group_cache_done(struct btrfs_block_group_cache *cache)
   {
  smp_mb();
  -   return cache-cached == BTRFS_CACHE_FINISHED;
  +   return cache-cached == BTRFS_CACHE_FINISHED ||
  +   cache-cached == BTRFS_CACHE_ERROR;
   }
 
   static int block_group_bits(struct btrfs_block_group_cache *cache, u64 
  bits)
  @@ -389,7 +390,7 @@ static noinline void caching_thread(struct btrfs_work 
  *work)
  u64 total_found = 0;
  u64 last = 0;
  u32 nritems;
  -   int ret = 0;
  +   int ret = -ENOMEM;
 
  caching_ctl = container_of(work, struct btrfs_caching_control, 
  work);
  block_group = caching_ctl-block_group;
  @@ -517,6 +518,12 @@ err:
 
  mutex_unlock(caching_ctl-mutex);
   out:
  +   if (ret) {
  +   spin_lock(block_group-lock);
  +   block_group-caching_ctl = NULL;
  +   block_group-cached = BTRFS_CACHE_ERROR;
  +   spin_unlock(block_group-lock);
  +   }
  wake_up(caching_ctl-wait);
 
  put_caching_control(caching_ctl);
  @@ -6035,8 +6042,11 @@ static u64 stripe_align(struct btrfs_root *root,
* for our min num_bytes.  Another option is to have it go ahead
* and look in the rbtree for a free extent of a given size, but this
* is a good start.
  + *
  + * Callers of this must check if cache-cached == BTRFS_CACHE_ERROR before 
  using
  + * any of the information in this block group.
*/
  -static noinline int
  +static noinline void
   wait_block_group_cache_progress(struct btrfs_block_group_cache *cache,
  u64 num_bytes)
   {
  @@ -6044,28 +6054,29 @@ wait_block_group_cache_progress(struct 
  btrfs_block_group_cache *cache,
 
  caching_ctl = get_caching_control(cache);
  if (!caching_ctl)
  -   return 0;
  +   return;
 
  wait_event(caching_ctl-wait, block_group_cache_done(cache) ||
 (cache-free_space_ctl-free_space = num_bytes));
 
  put_caching_control(caching_ctl);
  -   return 0;
   }
 
   static noinline int
   wait_block_group_cache_done(struct btrfs_block_group_cache *cache)
   {
  struct btrfs_caching_control *caching_ctl;
  +   int ret = 0;
 
  caching_ctl = get_caching_control(cache);
  if (!caching_ctl)
  return 0;
 In case caching_thread completes with error for this block group,
 get_caching_control() will return NULL.
 So this function will return success, although the block group was not
 cached properly.
 Currently only btrfs_trim_fs() caller checks the return value of this
 function, although you didn't post the btrfs_trim_fs() change in this
 patch (but you posed it in the bugzilla). Still, should we check the
 cache-cached for ERROR even if there is no caching control?


Yeah I'll fix that up, sorry about that.  Thanks,

Josef 
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Josef Bacik
On Mon, Aug 26, 2013 at 02:53:26PM -0700, Zach Brown wrote:
  With this we can
  go through and convert any BUG_ON()'s that we have to catch actual 
  programming
  mistakes to the new ASSERT() and then fix everybody else to return errors.
 
 I like the sound of that!
 
  --- a/fs/btrfs/ctree.h
  +++ b/fs/btrfs/ctree.h
  @@ -3814,6 +3814,22 @@ void btrfs_printk(const struct btrfs_fs_info 
  *fs_info, const char *fmt, ...)
   #define btrfs_debug(fs_info, fmt, args...) \
  btrfs_printk(fs_info, KERN_DEBUG fmt, ##args)
   
  +#ifdef BTRFS_ASSERT
  +
  +static inline void assfail(char *expr, char *file, int lin)
  +{
  +   printk(KERN_ERR BTRFS assertion failed: %s, file: %s, line: %d,
  +  expr, file, line);
  +   BUG();
  +}
 
 I'm not sure why this is needed.
 
  +#define ASSERT(expr)   \
  +   (unlikely(expr) ? (void)0 : assfail(#expr, __FILE__, __LINE__))
 
 (Passing the assertion is unlikely()?  I know, this is from xfs...
 still.)


Yeah I copy+pasted and then thought about it after I sent it.  I will fix it up.
 
  +#else
  +#define ASSERT(expr)   ((void)0)
  +#endif
 
 Anyway, if you're going to do it this way, why not:
 
   #ifdef BTRFS_ASSERT
   #define btrfs_assert(cond)  BUG_ON(!(cond))
   #else
   #define btrfs_assert(cond)  do { if (cond) ; } while (0)
   #endif


I like the verbosity, especially with random kernel versions and such, it will
help me figure out where we BUG_ON()'ed without having to checkout a particular
version and go hunting.  Thanks,

Josef 
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Filipe David Borba Manana
If there's an ongoing transaction when the uuid scan kthread attempts
to create one, the kthread will block, waiting for that transaction to
finish while it's keeping locks on the tree root, and in turn the existing
transaction is waiting for those locks to be free.

The stack trace reported by the kernel follows.

[36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
[36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671602] btrfs-uuid  D  0 15480  2 0x
[36700.671604]  880710bd5b88 0046 8803d36ba850 
0003
[36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
880710bd5fd8
[36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
8805e4508e40
[36700.671608] Call Trace:
[36700.671610]  [816f36b9] schedule+0x29/0x70
[36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
[btrfs]
[36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
[36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
[36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
[36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
[36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
[36700.671657]  [81065fa0] kthread+0xc0/0xd0
[36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
[36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
[36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671665] btrfs   D  0 15481  15212 0x0004
[36700.671666]  880248cbf4c8 0086 8803d36ba700 
8801dbd5c280
[36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
880248cbffd8
[36700.671669]  8805e86a 880807815c40 880248cbf4d8 
8801dbd5c280
[36700.671670] Call Trace:
[36700.671672]  [816f36b9] schedule+0x29/0x70
[36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
[36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
[36700.671691]  [a05bd9de] ? 
btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
[36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
[36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
[36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
[36700.671716]  [a0594c82] btrfs_write_dirty_block_groups+0x562/0x650 
[btrfs]
[36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
[36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
[btrfs]
[36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
[36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
[36700.671747]  [a05d4bf2] 
btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
[36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
[36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
[36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
[36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
[36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
[36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
[36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
[36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
[36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
[36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/volumes.c |   25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f42e412..44cd21b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3465,7 +3465,7 @@ static int btrfs_uuid_scan_kthread(void *data)
int slot;
struct btrfs_root_item root_item;
u32 item_size;
-   struct btrfs_trans_handle *trans;
+   struct btrfs_trans_handle *trans = NULL;
 
path = btrfs_alloc_path();
if (!path) {
@@ -3509,7 +3509,13 @@ static int btrfs_uuid_scan_kthread(void *data)
   (int)sizeof(root_item));
if (btrfs_root_refs(root_item) == 0)
goto skip;
-   if (!btrfs_is_empty_uuid(root_item.uuid)) {
+
+   if (!btrfs_is_empty_uuid(root_item.uuid) ||
+   !btrfs_is_empty_uuid(root_item.received_uuid)) {
+   if (trans)
+   goto 

[PATCH] Btrfs: handle errors when doing slow caching V2

2013-08-27 Thread Josef Bacik
Alex Lyakas reported a bug where wait_block_group_cache_progress() would wait
forever if a drive failed.  This is because we just bail out if there is an
error while trying to cache a block group, we don't update anybody who may be
waiting.  So this introduces a new enum for the cache state in case of error and
makes everybody bail out if we have an error.  Alex tested and verified this
patch fixed his problem.  This fixes bz 59431.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
V1-V2: return an error in wait_block_group_cache_done if there is no caching
ctl and we have BTRFS_CACHE_ERROR set.

 fs/btrfs/ctree.h   |1 +
 fs/btrfs/extent-tree.c |   32 +++-
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index cbb1263..c17acbc 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1188,6 +1188,7 @@ enum btrfs_caching_type {
BTRFS_CACHE_STARTED = 1,
BTRFS_CACHE_FAST= 2,
BTRFS_CACHE_FINISHED= 3,
+   BTRFS_CACHE_ERROR   = 4,
 };
 
 enum btrfs_disk_cache_state {
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e868c35..a073f3e 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -113,7 +113,8 @@ static noinline int
 block_group_cache_done(struct btrfs_block_group_cache *cache)
 {
smp_mb();
-   return cache-cached == BTRFS_CACHE_FINISHED;
+   return cache-cached == BTRFS_CACHE_FINISHED ||
+   cache-cached == BTRFS_CACHE_ERROR;
 }
 
 static int block_group_bits(struct btrfs_block_group_cache *cache, u64 bits)
@@ -389,7 +390,7 @@ static noinline void caching_thread(struct btrfs_work *work)
u64 total_found = 0;
u64 last = 0;
u32 nritems;
-   int ret = 0;
+   int ret = -ENOMEM;
 
caching_ctl = container_of(work, struct btrfs_caching_control, work);
block_group = caching_ctl-block_group;
@@ -517,6 +518,12 @@ err:
 
mutex_unlock(caching_ctl-mutex);
 out:
+   if (ret) {
+   spin_lock(block_group-lock);
+   block_group-caching_ctl = NULL;
+   block_group-cached = BTRFS_CACHE_ERROR;
+   spin_unlock(block_group-lock);
+   }
wake_up(caching_ctl-wait);
 
put_caching_control(caching_ctl);
@@ -6035,8 +6042,11 @@ static u64 stripe_align(struct btrfs_root *root,
  * for our min num_bytes.  Another option is to have it go ahead
  * and look in the rbtree for a free extent of a given size, but this
  * is a good start.
+ *
+ * Callers of this must check if cache-cached == BTRFS_CACHE_ERROR before 
using
+ * any of the information in this block group.
  */
-static noinline int
+static noinline void
 wait_block_group_cache_progress(struct btrfs_block_group_cache *cache,
u64 num_bytes)
 {
@@ -6044,28 +6054,29 @@ wait_block_group_cache_progress(struct 
btrfs_block_group_cache *cache,
 
caching_ctl = get_caching_control(cache);
if (!caching_ctl)
-   return 0;
+   return;
 
wait_event(caching_ctl-wait, block_group_cache_done(cache) ||
   (cache-free_space_ctl-free_space = num_bytes));
 
put_caching_control(caching_ctl);
-   return 0;
 }
 
 static noinline int
 wait_block_group_cache_done(struct btrfs_block_group_cache *cache)
 {
struct btrfs_caching_control *caching_ctl;
+   int ret = 0;
 
caching_ctl = get_caching_control(cache);
if (!caching_ctl)
-   return 0;
+   return (cache-cached == BTRFS_CACHE_ERROR) ? -EIO : 0;
 
wait_event(caching_ctl-wait, block_group_cache_done(cache));
-
+   if (cache-cached == BTRFS_CACHE_ERROR)
+   ret = -EIO;
put_caching_control(caching_ctl);
-   return 0;
+   return ret;
 }
 
 int __get_raid_index(u64 flags)
@@ -6248,6 +6259,8 @@ have_block_group:
ret = 0;
}
 
+   if (unlikely(block_group-cached == BTRFS_CACHE_ERROR))
+   goto loop;
if (unlikely(block_group-ro))
goto loop;
 
@@ -8230,7 +8243,8 @@ int btrfs_free_block_groups(struct btrfs_fs_info *info)
 * We haven't cached this block group, which means we could
 * possibly have excluded extents on this block group.
 */
-   if (block_group-cached == BTRFS_CACHE_NO)
+   if (block_group-cached == BTRFS_CACHE_NO ||
+   block_group-cached == BTRFS_CACHE_ERROR)
free_excluded_extents(info-extent_root, block_group);
 
btrfs_remove_free_space_cache(block_group);
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Josef Bacik
On Tue, Aug 27, 2013 at 04:51:55PM +0100, Filipe David Borba Manana wrote:
 If there's an ongoing transaction when the uuid scan kthread attempts
 to create one, the kthread will block, waiting for that transaction to
 finish while it's keeping locks on the tree root, and in turn the existing
 transaction is waiting for those locks to be free.
 
 The stack trace reported by the kernel follows.
 
 [36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
 [36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [36700.671602] btrfs-uuid  D  0 15480  2 
 0x
 [36700.671604]  880710bd5b88 0046 8803d36ba850 
 0003
 [36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
 880710bd5fd8
 [36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
 8805e4508e40
 [36700.671608] Call Trace:
 [36700.671610]  [816f36b9] schedule+0x29/0x70
 [36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
 [btrfs]
 [36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
 [36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
 [36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
 [36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
 [36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 
 [btrfs]
 [36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 
 [btrfs]
 [36700.671657]  [81065fa0] kthread+0xc0/0xd0
 [36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
 [36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
 [36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
 [36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
 [36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables 
 this message.
 [36700.671665] btrfs   D  0 15481  15212 
 0x0004
 [36700.671666]  880248cbf4c8 0086 8803d36ba700 
 8801dbd5c280
 [36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
 880248cbffd8
 [36700.671669]  8805e86a 880807815c40 880248cbf4d8 
 8801dbd5c280
 [36700.671670] Call Trace:
 [36700.671672]  [816f36b9] schedule+0x29/0x70
 [36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
 [36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
 [36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
 [36700.671691]  [a05bd9de] ? 
 btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
 [36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 
 [btrfs]
 [36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
 [36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
 [36700.671716]  [a0594c82] 
 btrfs_write_dirty_block_groups+0x562/0x650 [btrfs]
 [36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
 [36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
 [btrfs]
 [36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
 [36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 
 [btrfs]
 [36700.671747]  [a05d4bf2] 
 btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
 [36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 
 [btrfs]
 [36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
 [36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
 [36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
 [36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
 [36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
 [36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
 [36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
 [36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
 [36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b
 
 Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
 ---
  fs/btrfs/volumes.c |   25 ++---
  1 file changed, 14 insertions(+), 11 deletions(-)
 
 diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
 index f42e412..44cd21b 100644
 --- a/fs/btrfs/volumes.c
 +++ b/fs/btrfs/volumes.c
 @@ -3465,7 +3465,7 @@ static int btrfs_uuid_scan_kthread(void *data)
   int slot;
   struct btrfs_root_item root_item;
   u32 item_size;
 - struct btrfs_trans_handle *trans;
 + struct btrfs_trans_handle *trans = NULL;
  
   path = btrfs_alloc_path();
   if (!path) {
 @@ -3509,7 +3509,13 @@ static int btrfs_uuid_scan_kthread(void *data)
  (int)sizeof(root_item));
   if (btrfs_root_refs(root_item) == 0)
   goto skip;
 - if (!btrfs_is_empty_uuid(root_item.uuid)) {
 +
 + if 

[PATCH v2] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Filipe David Borba Manana
If there's an ongoing transaction when the uuid scan kthread attempts
to create one, the kthread will block, waiting for that transaction to
finish while it's keeping locks on the tree root, and in turn the existing
transaction is waiting for those locks to be free.

The stack trace reported by the kernel follows.

[36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
[36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671602] btrfs-uuid  D  0 15480  2 0x
[36700.671604]  880710bd5b88 0046 8803d36ba850 
0003
[36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
880710bd5fd8
[36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
8805e4508e40
[36700.671608] Call Trace:
[36700.671610]  [816f36b9] schedule+0x29/0x70
[36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
[btrfs]
[36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
[36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
[36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
[36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
[36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
[36700.671657]  [81065fa0] kthread+0xc0/0xd0
[36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
[36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
[36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671665] btrfs   D  0 15481  15212 0x0004
[36700.671666]  880248cbf4c8 0086 8803d36ba700 
8801dbd5c280
[36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
880248cbffd8
[36700.671669]  8805e86a 880807815c40 880248cbf4d8 
8801dbd5c280
[36700.671670] Call Trace:
[36700.671672]  [816f36b9] schedule+0x29/0x70
[36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
[36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
[36700.671691]  [a05bd9de] ? 
btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
[36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
[36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
[36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
[36700.671716]  [a0594c82] btrfs_write_dirty_block_groups+0x562/0x650 
[btrfs]
[36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
[36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
[btrfs]
[36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
[36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
[36700.671747]  [a05d4bf2] 
btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
[36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
[36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
[36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
[36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
[36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
[36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
[36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
[36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
[36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
[36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Removed wrong assignment of NULL to transaction pointer, and
addressed code style comment from Josef.

 fs/btrfs/volumes.c |   27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f42e412..3cb9649 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3465,7 +3465,7 @@ static int btrfs_uuid_scan_kthread(void *data)
int slot;
struct btrfs_root_item root_item;
u32 item_size;
-   struct btrfs_trans_handle *trans;
+   struct btrfs_trans_handle *trans = NULL;
 
path = btrfs_alloc_path();
if (!path) {
@@ -3503,13 +3503,18 @@ static int btrfs_uuid_scan_kthread(void *data)
if (item_size  sizeof(root_item))
goto skip;
 
-   trans = NULL;
read_extent_buffer(eb, root_item,
   btrfs_item_ptr_offset(eb, slot),
   (int)sizeof(root_item));
 

[PATCH v4] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Filipe David Borba Manana
If there's an ongoing transaction when the uuid scan kthread attempts
to create one, the kthread will block, waiting for that transaction to
finish while it's keeping locks on the tree root, and in turn the existing
transaction is waiting for those locks to be free.

The stack trace reported by the kernel follows.

[36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
[36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671602] btrfs-uuid  D  0 15480  2 0x
[36700.671604]  880710bd5b88 0046 8803d36ba850 
0003
[36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
880710bd5fd8
[36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
8805e4508e40
[36700.671608] Call Trace:
[36700.671610]  [816f36b9] schedule+0x29/0x70
[36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
[btrfs]
[36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
[36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
[36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
[36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
[36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
[36700.671657]  [81065fa0] kthread+0xc0/0xd0
[36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
[36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
[36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671665] btrfs   D  0 15481  15212 0x0004
[36700.671666]  880248cbf4c8 0086 8803d36ba700 
8801dbd5c280
[36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
880248cbffd8
[36700.671669]  8805e86a 880807815c40 880248cbf4d8 
8801dbd5c280
[36700.671670] Call Trace:
[36700.671672]  [816f36b9] schedule+0x29/0x70
[36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
[36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
[36700.671691]  [a05bd9de] ? 
btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
[36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
[36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
[36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
[36700.671716]  [a0594c82] btrfs_write_dirty_block_groups+0x562/0x650 
[btrfs]
[36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
[36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
[btrfs]
[36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
[36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
[36700.671747]  [a05d4bf2] 
btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
[36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
[36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
[36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
[36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
[36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
[36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
[36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
[36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
[36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
[36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Removed wrong assignment of NULL to transaction pointer, and
addressed code style comment from Josef.
V3: Removed unnecessary if statement to check if trans is NULL,
as it can't be NULL anymore in that section of the loop.
V4: Ensure that we don't leak a transaction handle if after looking
again for items we don't end up updating anything in the uuid
tree. Before we create a transaction, we release the path we
found and then repeat the key search and check if the item
still needs to be updated - if not we don't use the transaction
for that iteration of the loop - if this is true for all iterations,
we would leak the transaction handle.

 fs/btrfs/volumes.c |   39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index f42e412..ec1cf0e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3465,7 +3465,7 @@ static int btrfs_uuid_scan_kthread(void *data)
int slot;
   

Re: [PATCH v2] xfstests/btrfs/308: add snapshot-aware defrag for partial extents testcases

2013-08-27 Thread Rich Johnston

Thanks for the patch Liu Bo, it has been committed as test number 10.

commit d943515bbcb8893c5d4ae21340a6022235d9e643
Author: Liu Bo bo.li@oracle.com
Date:   Thu Jul 25 08:15:02 2013 +

xfstests/btrfs/010: add snapshot-aware defrag for partial extents 
testcases


This is to test whether snapshot-aware defrag can work well on 
partial extents.


Reviewed-by: Ben Myers b...@sgi.com
Signed-off-by: Liu Bo bo.li@oracle.com
Signed-off-by: Rich Johnston rjohns...@sgi.com

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: do not complete ordered extents that are truncated

2013-08-27 Thread Josef Bacik
Currently we will do the normal btrfs_finish_ordered_io if we invalidate a dirty
page that is part of an ordered extent but hasn't yet been submitted for IO.
The side effect of this is that when the rest of the ordered extent is completed
it will still add its extent to the file and it's csums to the tree.  This
usually isn't a problem since this occurs right before a truncate, so these
csums and extents are going to be cleaned up.  However if there is an error
doing the truncate we can leave extents that point to invalid data, and as such
you'll get csum errors trying to read back this extent.  So to handle this just
set a flag on the ordered extent if we are going to invalidate one of its dirty
pages, and then once we go to do the completion stuff we skip updating the file
extents and adding our csums, we just free everything up and exit.  This is
preliminary work we need to have better orphan entry error handling.  Thanks,

Signed-off-by: Josef Bacik jba...@fusionio.com
---
 fs/btrfs/inode.c|   17 +++--
 fs/btrfs/ordered-data.h |3 +++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1c86f9d..d212b36 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2562,8 +2562,9 @@ static int btrfs_finish_ordered_io(struct 
btrfs_ordered_extent *ordered_extent)
struct extent_state *cached_state = NULL;
struct new_sa_defrag_extent *new = NULL;
int compress_type = 0;
-   int ret;
+   int ret = 0;
bool nolock;
+   bool truncated = false;
 
nolock = btrfs_is_free_space_inode(inode);
 
@@ -2572,6 +2573,11 @@ static int btrfs_finish_ordered_io(struct 
btrfs_ordered_extent *ordered_extent)
goto out;
}
 
+   if (test_bit(BTRFS_ORDERED_TRUNCATED, ordered_extent-flags)) {
+   truncated = true;
+   goto out;
+   }
+
if (test_bit(BTRFS_ORDERED_NOCOW, ordered_extent-flags)) {
BUG_ON(!list_empty(ordered_extent-list)); /* Logic error */
btrfs_ordered_update_i_size(inode, 0, ordered_extent);
@@ -2667,7 +2673,7 @@ out:
if (trans)
btrfs_end_transaction(trans, root);
 
-   if (ret) {
+   if (ret || truncated) {
clear_extent_uptodate(io_tree, ordered_extent-file_offset,
  ordered_extent-file_offset +
  ordered_extent-len - 1, NULL, GFP_NOFS);
@@ -7348,6 +7354,13 @@ static void btrfs_invalidatepage(struct page *page, 
unsigned long offset)
 EXTENT_DIRTY | EXTENT_DELALLOC |
 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
 EXTENT_DEFRAG, 1, 0, cached_state, GFP_NOFS);
+
+   /*
+* Set this so that we don't actually do the extent update work
+* for this ordered extent.
+*/
+   set_bit(BTRFS_ORDERED_TRUNCATED, ordered-flags);
+
/*
 * whoever cleared the private bit is responsible
 * for the finish_ordered_io
diff --git a/fs/btrfs/ordered-data.h b/fs/btrfs/ordered-data.h
index 68844d5..3d7fe72 100644
--- a/fs/btrfs/ordered-data.h
+++ b/fs/btrfs/ordered-data.h
@@ -69,6 +69,9 @@ struct btrfs_ordered_sum {
   * the isize. */
 #define BTRFS_ORDERED_LOGGED_CSUM 8 /* We've logged the csums on this ordered
   ordered extent */
+#define BTRFS_ORDERED_TRUNCATED 9 /* We truncated a page in this ordered
+extent, do not allow the completion stuff
+to occur */
 
 struct btrfs_ordered_extent {
/* logical offset in the file */
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: do not complete ordered extents that are truncated

2013-08-27 Thread Josef Bacik
On Tue, Aug 27, 2013 at 01:20:14PM -0400, Josef Bacik wrote:
 Currently we will do the normal btrfs_finish_ordered_io if we invalidate a 
 dirty
 page that is part of an ordered extent but hasn't yet been submitted for IO.
 The side effect of this is that when the rest of the ordered extent is 
 completed
 it will still add its extent to the file and it's csums to the tree.  This
 usually isn't a problem since this occurs right before a truncate, so these
 csums and extents are going to be cleaned up.  However if there is an error
 doing the truncate we can leave extents that point to invalid data, and as 
 such
 you'll get csum errors trying to read back this extent.  So to handle this 
 just
 set a flag on the ordered extent if we are going to invalidate one of its 
 dirty
 pages, and then once we go to do the completion stuff we skip updating the 
 file
 extents and adding our csums, we just free everything up and exit.  This is
 preliminary work we need to have better orphan entry error handling.  Thanks,
 
 Signed-off-by: Josef Bacik jba...@fusionio.com
 ---

Ignore this, it is wrong.

Josef
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] utils: add support for getting/changing file system, features

2013-08-27 Thread Jeff Mahoney
This patch adds support for getting and changing file system
feature bits. It supports both unmounted and mounted operation via
a new set of ioctls.

Setting bits not supported by the running kernel, if the file system is
mounted, or the tools, if the file system is unmounted, can be forced with
a -f option.

It supports clearing of feature bits, but I've left that undocumented
intentionally.

Signed-off-by: Jeff Mahoney je...@suse.com
---
 cmds-filesystem.c |  42 +
 ioctl.h   |  12 ++
 man/btrfs.8.in|  28 
 utils.c   | 478 +-
 utils.h   |  10 ++
 5 files changed, 569 insertions(+), 1 deletion(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index f41a72a..ca096e0 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -515,6 +515,47 @@ static int cmd_label(int argc, char **argv)
return get_label(argv[1]);
 }
 
+static const char * const cmd_features_usage[] = {
+   btrfs filesystem features [device|mountpoint] [[-f] features],
+   Get or change the list of features currently enabled by a filesystem,
+   With one argument, get the currently enabled features of filesystem,
+   on device or mountpoint., ,
+   If features is passed, add new features to the file system.,
+   The format of new features can be a comma separated list of names or,
+   or a comma-separated list of specifiers of the following format: A,
+   prefix of compat, compat_ro, or incompat and a decimal number, ,
+   separated by a colon: (e.g. compat:10), ,
+   Features to be enabled will be confirmed against the supported,
+   features of the kernel (if mounted) or the progs (if unmounted). This,
+   behavior can be overridden when operating on unmounted,
+   filesystems by using the -f (force) flag., ,
+   A list of features supported by the progs can be found in the manual.,
+   NULL
+};
+
+static int cmd_features(int argc, char **argv)
+{
+   if (check_argc_min(argc, 2) || check_argc_max(argc, 4))
+   usage(cmd_features_usage);
+
+   if (argc  2) {
+   char *features = argv[2];
+   int force = 0;
+   if (argc  3) {
+   if (!strcmp(argv[3], -f))
+   force = 1;
+   else if (!strcmp(argv[2], -f)) {
+   force = 1;
+   features = argv[3];
+   } else
+   usage(cmd_features_usage);
+   }
+
+   return parse_and_set_features(argv[1], features, force);
+   } else
+   return get_features(argv[1]);
+}
+
 const struct cmd_group filesystem_cmd_group = {
filesystem_cmd_group_usage, NULL, {
{ df, cmd_df, cmd_df_usage, NULL, 0 },
@@ -524,6 +565,7 @@ const struct cmd_group filesystem_cmd_group = {
{ balance, cmd_balance, NULL, balance_cmd_group, 1 },
{ resize, cmd_resize, cmd_resize_usage, NULL, 0 },
{ label, cmd_label, cmd_label_usage, NULL, 0 },
+   { features, cmd_features, cmd_features_usage, NULL, 0 },
{ 0, 0, 0, 0, 0 },
}
 };
diff --git a/ioctl.h b/ioctl.h
index abe6dd4..44483d1 100644
--- a/ioctl.h
+++ b/ioctl.h
@@ -172,6 +172,12 @@ struct btrfs_ioctl_fs_info_args {
__u64 reserved[124];/* pad to 1k */
 };
 
+struct btrfs_ioctl_feature_flags {
+   __u64 compat_flags;
+   __u64 compat_ro_flags;
+   __u64 incompat_flags;
+};
+
 /* balance control ioctl modes */
 #define BTRFS_BALANCE_CTL_PAUSE1
 #define BTRFS_BALANCE_CTL_CANCEL   2
@@ -537,6 +543,12 @@ struct btrfs_ioctl_clone_range_args {
  struct btrfs_ioctl_get_dev_stats)
 #define BTRFS_IOC_DEV_REPLACE _IOWR(BTRFS_IOCTL_MAGIC, 53, \
struct btrfs_ioctl_dev_replace_args)
+#define BTRFS_IOC_GET_FEATURES _IOR(BTRFS_IOCTL_MAGIC, 54, \
+   struct btrfs_ioctl_feature_flags)
+#define BTRFS_IOC_ADD_FEATURES _IOW(BTRFS_IOCTL_MAGIC, 55, \
+   struct btrfs_ioctl_feature_flags)
+#define BTRFS_IOC_GET_SUPPORTED_FEATURES _IOR(BTRFS_IOCTL_MAGIC, 56, \
+   struct btrfs_ioctl_feature_flags[2])
 
 #ifdef __cplusplus
 }
diff --git a/man/btrfs.8.in b/man/btrfs.8.in
index af7df4d..9c5b61e 100644
--- a/man/btrfs.8.in
+++ b/man/btrfs.8.in
@@ -31,6 +31,8 @@ btrfs \- control a btrfs filesystem
 .PP
 \fBbtrfs\fP \fBfilesystem label\fP\fI dev [newlabel]\fP
 .PP
+\fBbtrfs\fP \fBfilesystem features\fP\fI dev|mountpoint [[-f ]newlabel]\fP
+.PP
 \fBbtrfs\fP \fBfilesystem balance\fP\fI path \fP
 .PP
 \fBbtrfs\fP \fBdevice scan\fP\fI [--all-devices|device [device...]]\fP
@@ -280,6 +282,32 @@ NOTE: Currently there are the following limitations:
 - the filesystem should 

[PATCH] btrfs: add ability to query/change feature bits online

2013-08-27 Thread Jeff Mahoney
There are some feature bits that require no offline setup and can
be enabled online. I've only reviewed extended irefs, but there will
probably be more.

We introduce three new ioctls:
- BTRFS_IOC_GET_SUPPORTED_FEATURES: query the kernel for supported features
- BTRFS_IOC_GET_FEATURES: query the kernel for enabled features on a per-fs
  basis.
- BTRFS_IOC_ADD_FEATURES: enable new features on a per-fs basis.

We introduce a new mask _SAFE_ONLINE that allows us to define which
features are safe to enable at runtime.

The failure modes for BTRFS_IOC_ADD_FEATURES are as follows:
- Enabling a completely unsupported feature: warns and returns -ENOTSUPP
- Enabling a feature that can only be done offline: warns and returns -EPERM

The structure passed in is not modified on return since it is possible
to isolate which features would bounce a -EPERM by subtracting the features
provided by a GET_SUPPORTED_FEATURES call.

Signed-off-by: Jeff Mahoney je...@suse.com
---
 fs/btrfs/ctree.h   |6 ++
 fs/btrfs/ioctl.c   |  127 +
 include/uapi/linux/btrfs.h |   12 
 3 files changed, 145 insertions(+)

--- a/fs/btrfs/ctree.h  2013-08-27 11:02:35.062626912 -0400
+++ b/fs/btrfs/ctree.h  2013-08-27 11:03:32.002352821 -0400
@@ -512,7 +512,10 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL  8)
 
 #define BTRFS_FEATURE_COMPAT_SUPP  0ULL
+#define BTRFS_FEATURE_COMPAT_ONLINE_SAFE   0ULL
 #define BTRFS_FEATURE_COMPAT_RO_SUPP   0ULL
+#define BTRFS_FEATURE_COMPAT_RO_ONLINE_SAFE0ULL
+
 #define BTRFS_FEATURE_INCOMPAT_SUPP\
(BTRFS_FEATURE_INCOMPAT_MIXED_BACKREF | \
 BTRFS_FEATURE_INCOMPAT_DEFAULT_SUBVOL |\
@@ -523,6 +526,9 @@ struct btrfs_super_block {
 BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF | \
 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA)
 
+#define BTRFS_FEATURE_INCOMPAT_ONLINE_SAFE \
+   (BTRFS_FEATURE_INCOMPAT_EXTENDED_IREF)
+
 /*
  * A leaf is full of items. offset and size tell us where to find
  * the item in the leaf (relative to the start of the data area)
--- a/fs/btrfs/ioctl.c  2013-08-27 11:02:35.062626912 -0400
+++ b/fs/btrfs/ioctl.c  2013-08-27 15:08:06.900740843 -0400
@@ -4098,6 +4098,127 @@ out_unlock:
return ret;
 }
 
+static int btrfs_ioctl_get_supported_features(struct file *file,
+ void __user *arg)
+{
+   struct btrfs_ioctl_feature_flags features[2];
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EPERM;
+
+   features[0].compat_flags = BTRFS_FEATURE_COMPAT_SUPP;
+   features[0].compat_ro_flags = BTRFS_FEATURE_COMPAT_RO_SUPP;
+   features[0].incompat_flags = BTRFS_FEATURE_INCOMPAT_SUPP;
+
+   features[1].compat_flags = BTRFS_FEATURE_COMPAT_ONLINE_SAFE;
+   features[1].compat_ro_flags = BTRFS_FEATURE_COMPAT_RO_ONLINE_SAFE;
+   features[1].incompat_flags = BTRFS_FEATURE_INCOMPAT_ONLINE_SAFE;
+
+   if (copy_to_user(arg, features, sizeof(features)))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int btrfs_ioctl_get_features(struct file *file, void __user *arg)
+{
+   struct btrfs_root *root = BTRFS_I(file_inode(file))-root;
+   struct btrfs_super_block *super_block = root-fs_info-super_copy;
+   struct btrfs_ioctl_feature_flags features;
+
+   if (!capable(CAP_SYS_ADMIN))
+   return -EPERM;
+
+   features.compat_flags = btrfs_super_compat_flags(super_block);
+   features.compat_ro_flags = btrfs_super_compat_ro_flags(super_block);
+   features.incompat_flags = btrfs_super_incompat_flags(super_block);
+
+   if (copy_to_user(arg, features, sizeof(features)))
+   return -EFAULT;
+
+   return 0;
+}
+
+static int check_feature_bits(struct btrfs_root *root, const char *type,
+ u64 flags, u64 supported_flags,
+ u64 allowed_flags)
+{
+   u64 disallowed, unsupported;
+
+   unsupported = flags  ~supported_flags;
+   if (unsupported) {
+   btrfs_warn(root-fs_info,
+  this kernel does not support %s bits %llx, type,
+  unsupported);
+   return -EOPNOTSUPP;
+   }
+
+   disallowed = flags  ~allowed_flags;
+   if (disallowed) {
+   btrfs_warn(root-fs_info,
+  can't enable %s bits %llx while mounted, type,
+  disallowed);
+   return -EPERM;
+   }
+
+   return 0;
+}
+
+#define check_feature(root, flags, mask_base)  \
+check_feature_bits(root, # mask_base, flags,   \
+   BTRFS_FEATURE_ ## mask_base ## _SUPP,   \
+   BTRFS_FEATURE_ ## mask_base ## _ONLINE_SAFE)
+
+static int btrfs_ioctl_add_features(struct file *file, void __user *arg)
+{
+   struct btrfs_root *root = 

[PATCH 1/2] ctree: fix broken macros for compat_ro

2013-08-27 Thread Jeff Mahoney
The macro for compat_ro had the right name but was modifying compat instead.

Signed-off-by: Jeff Mahoney je...@suse.com
---
 ctree.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ctree.h b/ctree.h
index 0b0d701..8ef45f3 100644
--- a/ctree.h
+++ b/ctree.h
@@ -1912,7 +1912,7 @@ BTRFS_SETGET_STACK_FUNCS(super_num_devices, struct 
btrfs_super_block,
 BTRFS_SETGET_STACK_FUNCS(super_compat_flags, struct btrfs_super_block,
 compat_flags, 64);
 BTRFS_SETGET_STACK_FUNCS(super_compat_ro_flags, struct btrfs_super_block,
-compat_flags, 64);
+compat_ro_flags, 64);
 BTRFS_SETGET_STACK_FUNCS(super_incompat_flags, struct btrfs_super_block,
 incompat_flags, 64);
 BTRFS_SETGET_STACK_FUNCS(super_csum_type, struct btrfs_super_block,
-- 
1.8.1.4


-- 
Jeff Mahoney
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Jeff Mahoney
On 8/27/13 9:47 AM, Josef Bacik wrote:
 On Mon, Aug 26, 2013 at 02:53:26PM -0700, Zach Brown wrote:
 With this we can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return errors.

 I like the sound of that!

 --- a/fs/btrfs/ctree.h
 +++ b/fs/btrfs/ctree.h
 @@ -3814,6 +3814,22 @@ void btrfs_printk(const struct btrfs_fs_info 
 *fs_info, const char *fmt, ...)
  #define btrfs_debug(fs_info, fmt, args...) \
 btrfs_printk(fs_info, KERN_DEBUG fmt, ##args)
  
 +#ifdef BTRFS_ASSERT
 +
 +static inline void assfail(char *expr, char *file, int lin)
 +{
 +   printk(KERN_ERR BTRFS assertion failed: %s, file: %s, line: %d,
 +  expr, file, line);
 +   BUG();
 +}

 I'm not sure why this is needed.

 +#define ASSERT(expr)   \
 +   (unlikely(expr) ? (void)0 : assfail(#expr, __FILE__, __LINE__))

 (Passing the assertion is unlikely()?  I know, this is from xfs...
 still.)

 
 Yeah I copy+pasted and then thought about it after I sent it.  I will fix it 
 up.
  
 +#else
 +#define ASSERT(expr)   ((void)0)
 +#endif

 Anyway, if you're going to do it this way, why not:

  #ifdef BTRFS_ASSERT
  #define btrfs_assert(cond)  BUG_ON(!(cond))
  #else
  #define btrfs_assert(cond)  do { if (cond) ; } while (0)
  #endif

 
 I like the verbosity, especially with random kernel versions and such, it will
 help me figure out where we BUG_ON()'ed without having to checkout a 
 particular
 version and go hunting.  Thanks,

Agreed. One of the positives of the obnoxious reiserfs warning IDs is
that it uniquely identifies a call site across kernel versions. You can
tell at a glance that it's the same failure you may have been chasing
for a while. Anything to make the ID-at-a-glance easy is worth it.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Jeff Mahoney
On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So to help
 with this I'm introducing a kconfig option to enable/disable a new ASSERT()
 mechanism much like what XFS does.  This will allow us developers to still get
 our nice panics but allow users/distros to compile them out.  With this we can
 go through and convert any BUG_ON()'s that we have to catch actual programming
 mistakes to the new ASSERT() and then fix everybody else to return errors.  
 This
 will also allow developers to leave sanity checks in their new code to make 
 sure
 we don't trip over problems while testing stuff and vetting new features.
 Thanks,

I don't think the complaint is so much about the number of BUG_ONs, but
that there's no distinction between something that is supposed to be
impossible and something that is improbable. The BUG_ONs to keep code
correctness are good and are littered all over the kernel with positive
results. The BUG_ONs that are there in place of real error handling
served their purpose and need to be replaced.

So, I don't know if it's a net win to compile the good BUG_ONs out of
the code. Especially if a user runs into something strange yet familiar
and the first response is oh, huh, can you rebuild with asserts enabled?

For the call sites that are unimplemented error handling, something to
annotate those so that we can keep track of them and gradually eliminate
those too would be good, though.

-Jeff

 Signed-off-by: Josef Bacik jba...@fusionio.com
 ---
  fs/btrfs/Kconfig |9 +
  fs/btrfs/ctree.h |   16 
  2 files changed, 25 insertions(+), 0 deletions(-)
 
 diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
 index 2b3b832..398cbd5 100644
 --- a/fs/btrfs/Kconfig
 +++ b/fs/btrfs/Kconfig
 @@ -72,3 +72,12 @@ config BTRFS_DEBUG
 performance, or export extra information via sysfs.
  
 If unsure, say N.
 +
 +config BTRFS_ASSERT
 + bool Btrfs assert support
 + depends on BTRFS_FS
 + help
 +   Enable run-time assertion checking.  This will result in panics if
 +   any of the assertions trip.  This is meant for btrfs developers only.
 +
 +   If unsure, say N.
 diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
 index c90be01..8278a3f 100644
 --- a/fs/btrfs/ctree.h
 +++ b/fs/btrfs/ctree.h
 @@ -3814,6 +3814,22 @@ void btrfs_printk(const struct btrfs_fs_info *fs_info, 
 const char *fmt, ...)
  #define btrfs_debug(fs_info, fmt, args...) \
   btrfs_printk(fs_info, KERN_DEBUG fmt, ##args)
  
 +#ifdef BTRFS_ASSERT
 +
 +static inline void assfail(char *expr, char *file, int lin)
 +{
 + printk(KERN_ERR BTRFS assertion failed: %s, file: %s, line: %d,
 +expr, file, line);
 + BUG();
 +}
 +
 +#define ASSERT(expr) \
 + (unlikely(expr) ? (void)0 : assfail(#expr, __FILE__, __LINE__))
 +#else
 +#define ASSERT(expr) ((void)0)
 +#endif
 +
 +#define btrfs_assert()
  __printf(5, 6)
  void __btrfs_std_error(struct btrfs_fs_info *fs_info, const char *function,
unsigned int line, int errno, const char *fmt, ...);
 


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


[PATCH v5] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Filipe David Borba Manana
If there's an ongoing transaction when the uuid scan kthread attempts
to create one, the kthread will block, waiting for that transaction to
finish while it's keeping locks on the tree root, and in turn the existing
transaction is waiting for those locks to be free.

The stack trace reported by the kernel follows.

[36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
[36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671602] btrfs-uuid  D  0 15480  2 0x
[36700.671604]  880710bd5b88 0046 8803d36ba850 
0003
[36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
880710bd5fd8
[36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
8805e4508e40
[36700.671608] Call Trace:
[36700.671610]  [816f36b9] schedule+0x29/0x70
[36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
[btrfs]
[36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
[36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
[36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
[36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
[36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
[36700.671657]  [81065fa0] kthread+0xc0/0xd0
[36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
[36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
[36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671665] btrfs   D  0 15481  15212 0x0004
[36700.671666]  880248cbf4c8 0086 8803d36ba700 
8801dbd5c280
[36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
880248cbffd8
[36700.671669]  8805e86a 880807815c40 880248cbf4d8 
8801dbd5c280
[36700.671670] Call Trace:
[36700.671672]  [816f36b9] schedule+0x29/0x70
[36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
[36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
[36700.671691]  [a05bd9de] ? 
btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
[36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
[36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
[36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
[36700.671716]  [a0594c82] btrfs_write_dirty_block_groups+0x562/0x650 
[btrfs]
[36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
[36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
[btrfs]
[36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
[36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
[36700.671747]  [a05d4bf2] 
btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
[36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
[36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
[36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
[36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
[36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
[36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
[36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
[36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
[36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
[36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Removed wrong assignment of NULL to transaction pointer, and
addressed code style comment from Josef.
V3: Removed unnecessary if statement to check if trans is NULL,
as it can't be NULL anymore in that section of the loop.
V4: Ensure that we don't leak a transaction handle if after looking
again for items we don't end up updating anything in the uuid
tree. Before we create a transaction, we release the path we
found and then repeat the key search and check if the item
still needs to be updated - if not we don't use the transaction
for that iteration of the loop - if this is true for all iterations,
we would leak the transaction handle.
V5: If a transaction is not used in a loop iteration, end it before
we call cond_resched() at the end of the loop body. This is more
friendly for other tasks.

 fs/btrfs/volumes.c |   33 -
 1 file changed, 20 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 

[PATCH v6] Btrfs: fix deadlock in uuid scan kthread

2013-08-27 Thread Filipe David Borba Manana
If there's an ongoing transaction when the uuid scan kthread attempts
to create one, the kthread will block, waiting for that transaction to
finish while it's keeping locks on the tree root, and in turn the existing
transaction is waiting for those locks to be free.

The stack trace reported by the kernel follows.

[36700.671601] INFO: task btrfs-uuid:15480 blocked for more than 120 seconds.
[36700.671602] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671602] btrfs-uuid  D  0 15480  2 0x
[36700.671604]  880710bd5b88 0046 8803d36ba850 
0003
[36700.671605]  8806d76dc530 880710bd5fd8 880710bd5fd8 
880710bd5fd8
[36700.671607]  8808098ac530 8806d76dc530 880710bd5b98 
8805e4508e40
[36700.671608] Call Trace:
[36700.671610]  [816f36b9] schedule+0x29/0x70
[36700.671620]  [a05a3bdf] wait_current_trans.isra.33+0xbf/0x120 
[btrfs]
[36700.671623]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671629]  [a05a5b06] start_transaction+0x3d6/0x530 [btrfs]
[36700.671636]  [a05bb1f4] ? btrfs_get_token_32+0x64/0xf0 [btrfs]
[36700.671642]  [a05a5fbb] btrfs_start_transaction+0x1b/0x20 [btrfs]
[36700.671649]  [a05c8a81] btrfs_uuid_scan_kthread+0x211/0x3d0 [btrfs]
[36700.671655]  [a05c8870] ? __btrfs_open_devices+0x2a0/0x2a0 [btrfs]
[36700.671657]  [81065fa0] kthread+0xc0/0xd0
[36700.671659]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671661]  [816fcd1c] ret_from_fork+0x7c/0xb0
[36700.671662]  [81065ee0] ? flush_kthread_worker+0xb0/0xb0
[36700.671663] INFO: task btrfs:15481 blocked for more than 120 seconds.
[36700.671664] echo 0  /proc/sys/kernel/hung_task_timeout_secs disables this 
message.
[36700.671665] btrfs   D  0 15481  15212 0x0004
[36700.671666]  880248cbf4c8 0086 8803d36ba700 
8801dbd5c280
[36700.671668]  880807815c40 880248cbffd8 880248cbffd8 
880248cbffd8
[36700.671669]  8805e86a 880807815c40 880248cbf4d8 
8801dbd5c280
[36700.671670] Call Trace:
[36700.671672]  [816f36b9] schedule+0x29/0x70
[36700.671679]  [a05d9b0d] btrfs_tree_lock+0x6d/0x230 [btrfs]
[36700.671680]  [81066760] ? add_wait_queue+0x60/0x60
[36700.671685]  [a0582829] btrfs_search_slot+0x999/0xb00 [btrfs]
[36700.671691]  [a05bd9de] ? 
btrfs_lookup_first_ordered_extent+0x5e/0xb0 [btrfs]
[36700.671698]  [a05e3e54] __btrfs_write_out_cache+0x8c4/0xa80 [btrfs]
[36700.671704]  [a05e4362] btrfs_write_out_cache+0xb2/0xf0 [btrfs]
[36700.671710]  [a05c4441] ? free_extent_buffer+0x61/0xc0 [btrfs]
[36700.671716]  [a0594c82] btrfs_write_dirty_block_groups+0x562/0x650 
[btrfs]
[36700.671723]  [a0610092] commit_cowonly_roots+0x171/0x24b [btrfs]
[36700.671729]  [a05a4dde] btrfs_commit_transaction+0x4fe/0xa10 
[btrfs]
[36700.671735]  [a0610af3] create_subvol+0x5c0/0x636 [btrfs]
[36700.671742]  [a05d49ff] btrfs_mksubvol.isra.60+0x33f/0x3f0 [btrfs]
[36700.671747]  [a05d4bf2] 
btrfs_ioctl_snap_create_transid+0x142/0x190 [btrfs]
[36700.671752]  [a05d4c6c] ? btrfs_ioctl_snap_create+0x2c/0x80 [btrfs]
[36700.671757]  [a05d4c9e] btrfs_ioctl_snap_create+0x5e/0x80 [btrfs]
[36700.671759]  [8113a764] ? handle_pte_fault+0x84/0x920
[36700.671764]  [a05d87eb] btrfs_ioctl+0xf0b/0x1d00 [btrfs]
[36700.671766]  [8113c120] ? handle_mm_fault+0x210/0x310
[36700.671768]  [816f83a4] ? __do_page_fault+0x284/0x4e0
[36700.671770]  [81180aa6] do_vfs_ioctl+0x96/0x550
[36700.671772]  [81170fe3] ? __sb_end_write+0x33/0x70
[36700.671774]  [81180ff1] SyS_ioctl+0x91/0xb0
[36700.671775]  [816fcdc2] system_call_fastpath+0x16/0x1b

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Removed wrong assignment of NULL to transaction pointer, and
addressed code style comment from Josef.
V3: Removed unnecessary if statement to check if trans is NULL,
as it can't be NULL anymore in that section of the loop.
V4: Ensure that we don't leak a transaction handle if after looking
again for items we don't end up updating anything in the uuid
tree. Before we create a transaction, we release the path we
found and then repeat the key search and check if the item
still needs to be updated - if not we don't use the transaction
for that iteration of the loop - if this is true for all iterations,
we would leak the transaction handle.
V5: If a transaction is not used in a loop iteration, end it before
we call cond_resched() at the end of the loop body. This is more
friendly for other tasks.
V6: Removed redundant btrfs_end_transaction() calls on error, because
this is already performed after we exit the loop.

 fs/btrfs/volumes.c |   35 

Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Josef Bacik
On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
  One of the complaints we get a lot is how many BUG_ON()'s we have.  So to 
  help
  with this I'm introducing a kconfig option to enable/disable a new ASSERT()
  mechanism much like what XFS does.  This will allow us developers to still 
  get
  our nice panics but allow users/distros to compile them out.  With this we 
  can
  go through and convert any BUG_ON()'s that we have to catch actual 
  programming
  mistakes to the new ASSERT() and then fix everybody else to return errors.  
  This
  will also allow developers to leave sanity checks in their new code to make 
  sure
  we don't trip over problems while testing stuff and vetting new features.
  Thanks,
 
 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.
 
 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts enabled?
 

Either I provide an option for it or distros do it themselves, this cuts out the
middle man.  I'd really rather they just be on all the time since they aren't
things we should hit anyway, but at least this way people have a choice.
Thanks,

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Jeff Mahoney
On 8/27/13 4:56 PM, Josef Bacik wrote:
 On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So to 
 help
 with this I'm introducing a kconfig option to enable/disable a new ASSERT()
 mechanism much like what XFS does.  This will allow us developers to still 
 get
 our nice panics but allow users/distros to compile them out.  With this we 
 can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return errors.  
 This
 will also allow developers to leave sanity checks in their new code to make 
 sure
 we don't trip over problems while testing stuff and vetting new features.
 Thanks,

 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.

 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts enabled?

 
 Either I provide an option for it or distros do it themselves, this cuts out 
 the
 middle man.  I'd really rather they just be on all the time since they aren't
 things we should hit anyway, but at least this way people have a choice.

Ok. With my distro hat on, I can tell you I'll be leaving them on. :)

-Jeff


-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Eric Sandeen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 8/27/13 4:07 PM, Jeff Mahoney wrote:
 On 8/27/13 4:56 PM, Josef Bacik wrote:
 On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So to 
 help
 with this I'm introducing a kconfig option to enable/disable a new ASSERT()
 mechanism much like what XFS does.  This will allow us developers to still 
 get
 our nice panics but allow users/distros to compile them out.  With this we 
 can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return errors. 
  This
 will also allow developers to leave sanity checks in their new code to 
 make sure
 we don't trip over problems while testing stuff and vetting new features.
 Thanks,

 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.

 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts enabled?


 Either I provide an option for it or distros do it themselves, this cuts out 
 the
 middle man.  I'd really rather they just be on all the time since they aren't
 things we should hit anyway, but at least this way people have a choice.
 
 Ok. With my distro hat on, I can tell you I'll be leaving them on. :)
 
 -Jeff

XFS also has XFS_WARN as a config option, which keeps all the assertions
in place, but printk's  backtraces w/o the icky BUG().  That might be
good to add as well, and perhaps best for a shipping distro (vs. a developer
debugging who might want to drop a core file when the assert trips).

- -Eric

-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJSHRhOAAoJECCuFpLhPd7gYbcP/034ADG3dwTa83FaAWuAurg7
byKWG4EwRqt3PYjUgruxBJAc426O7tz6j1NNTrAwZys9/GJOsisPShA8gO0f+W/A
+bQZJlXoUMbbwVPMcCqsnKMKlXNyKoqgME9AUQOrzMB/SgDtC9Y/OgdqgWF/58UV
X1KC3OOtcfQr/1t19AZuNhJ5oHfytoscv3nnnW5872t1JtL8daomak4fyDuRKgRV
45kQ726nafUlXNmi1TG8GadlcmKxxbBm0vt2ui6RtZWVauPE4Gej+iEUux9WtwSc
48eOQ5iqbFVzC8v++Rc1eT28mBIjSetr+O/Tk+VL4TvYCKA2trMAltNAFinv9AB0
Q+Z9F1K26aFe/Z/gcM57j+c0VOkv1tvSElF1iJcVHPuRvV7k+548g+KVzbXNDPBP
vuV2fnUCpw/XHQlrI+efYLs7Ies0TuV2eGPhmbKWjhossPwOeng71zxuiXSbNMBE
gVcHg6idXjCdaCCIYuJr8+5K4ngnpTEbAUs4C2x6iHzuHZcXScEHWYU/nHvizElL
bCZ162QSeQZAd+NgSzoZSmv4XqFMj6c4q60XvhAuu3fpkVPVY4GshcFhT14Onhfl
/054HqdQIXjUGOdbeuUwmXoaqzpSKDhBmZ0G+ykarD1KRCaEW61JrnFepRPO7G69
Q3oiPIbvdvCw3BZAJaGL
=7vvX
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Jeff Mahoney
On 8/27/13 5:21 PM, Eric Sandeen wrote:
 On 8/27/13 4:07 PM, Jeff Mahoney wrote:
 On 8/27/13 4:56 PM, Josef Bacik wrote:
 On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So to 
 help
 with this I'm introducing a kconfig option to enable/disable a new 
 ASSERT()
 mechanism much like what XFS does.  This will allow us developers to 
 still get
 our nice panics but allow users/distros to compile them out.  With this 
 we can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return 
 errors.  This
 will also allow developers to leave sanity checks in their new code to 
 make sure
 we don't trip over problems while testing stuff and vetting new features.
 Thanks,

 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.

 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts enabled?


 Either I provide an option for it or distros do it themselves, this cuts 
 out the
 middle man.  I'd really rather they just be on all the time since they 
 aren't
 things we should hit anyway, but at least this way people have a choice.
 
 Ok. With my distro hat on, I can tell you I'll be leaving them on. :)
 
 -Jeff
 
 XFS also has XFS_WARN as a config option, which keeps all the assertions
 in place, but printk's  backtraces w/o the icky BUG().  That might be
 good to add as well, and perhaps best for a shipping distro (vs. a developer
 debugging who might want to drop a core file when the assert trips).

Isn't that the distinction between BUG_ON and WARN_ON? If it's worth a
BUG_ON, things should be bad enough (or could result in being bad
enough) that we want to bail out.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Eric Sandeen
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 8/27/13 4:25 PM, Jeff Mahoney wrote:
 On 8/27/13 5:21 PM, Eric Sandeen wrote:
 On 8/27/13 4:07 PM, Jeff Mahoney wrote:
 On 8/27/13 4:56 PM, Josef Bacik wrote:
 On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So 
 to help
 with this I'm introducing a kconfig option to enable/disable a new 
 ASSERT()
 mechanism much like what XFS does.  This will allow us developers to 
 still get
 our nice panics but allow users/distros to compile them out.  With this 
 we can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return 
 errors.  This
 will also allow developers to leave sanity checks in their new code to 
 make sure
 we don't trip over problems while testing stuff and vetting new features.
 Thanks,

 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.

 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts enabled?


 Either I provide an option for it or distros do it themselves, this cuts 
 out the
 middle man.  I'd really rather they just be on all the time since they 
 aren't
 things we should hit anyway, but at least this way people have a choice.

 Ok. With my distro hat on, I can tell you I'll be leaving them on. :)

 -Jeff

 XFS also has XFS_WARN as a config option, which keeps all the assertions
 in place, but printk's  backtraces w/o the icky BUG().  That might be
 good to add as well, and perhaps best for a shipping distro (vs. a developer
 debugging who might want to drop a core file when the assert trips).
 
 Isn't that the distinction between BUG_ON and WARN_ON? If it's worth a
 BUG_ON, things should be bad enough (or could result in being bad
 enough) that we want to bail out.
 
 -Jeff

Maybe; just FWIW here was Dave's rationale for xfs.  Right now btrfs
doesn't have the behavior-changing side effect (no BTRFS_DEBUG config)
though, so maybe the distinction is less important...

xfs: introduce CONFIG_XFS_WARN

Running a CONFIG_XFS_DEBUG kernel in production environments is not
the best idea as it introduces significant overhead, can change
the behaviour of algorithms (such as allocation) to improve test
coverage, and (most importantly) panic the machine on non-fatal
errors.

There are many cases where all we want to do is run a
kernel with more bounds checking enabled, such as is provided by the
ASSERT() statements throughout the code, but without all the
potential overhead and drawbacks.

This patch converts all the ASSERT statements to evaluate as
WARN_ON(1) statements and hence if they fail dump a warning and a
stack trace to the log. This has minimal overhead and does not
change any algorithms, and will allow us to find strange out of
bounds problems more easily on production machines.

There are a few places where assert statements contain debug only
code. These are converted to be debug-or-warn only code so that we
still get all the assert checks in the code.

Signed-off-by: Dave Chinner dchin...@redhat.com


-BEGIN PGP SIGNATURE-
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJSHRn5AAoJECCuFpLhPd7gpsQQAJlFGX/t9b/LASxSisiv/wdL
ZoQXHhCzxpzVVdsWstzOY0bVXw8vdsG2E+nmih2S6T7AzkoqPDoEnYE9CqpNQFFy
Ca/kJOcfE1T4mIwKZwLHATkJX0V/S6nY7jPa7xdcseie+1H7ldSPaM5Jb6fkvXg/
8lNPTikGeoRJdUwQN4xxNgsivITfJpl65Z+AVg5UAUqqUKZtYZLfVeAlyQFKvOyl
/am80yLLzhFODtV3GcWkaYcInBaB2AaVlqHrpTnf53gG9JGynyFjnZGlysz0flSs
wstNKLOon+wNBg1Dz0HrUSVma87g5hc1WtaZFC/qI3uHuoatsxOWxG6+LXZlr2CN
Jsq3ZwHHxOs4MLgyEYlSirpgKqn/aKA+J8O0mNlltBj2lpU2hKgPS7dmMw5o8VAM
1uei1er15eBlCY0uBncRXIcLEcXfRXo9b69ErQBIbCN7xrGyWdbZ/DVtElaFeImh
Lw+iBXebBbw6SCqCMZFc3vpYdF+9RP6shImBlsqxTzKs5M1gISrFtCF0GqOuPrWt
7jyrredhpKACAaOpxPW8UWh2vL+q51JWzzZKYE35Gy4M/8E64TQ0rYhLGj7x+TYU
FWYzpONK0x7XbmgtEKTutwi9w+vfSlMzFNpUavwFeTZIh8Dw1tEO3dBn59Rs9Oz8
Widxpe+hqz/qK/0O4rTb
=4nmO
-END PGP SIGNATURE-
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: add support for asserts

2013-08-27 Thread Jeff Mahoney
On 8/27/13 5:28 PM, Eric Sandeen wrote:
 On 8/27/13 4:25 PM, Jeff Mahoney wrote:
 On 8/27/13 5:21 PM, Eric Sandeen wrote:
 On 8/27/13 4:07 PM, Jeff Mahoney wrote:
 On 8/27/13 4:56 PM, Josef Bacik wrote:
 On Tue, Aug 27, 2013 at 03:28:24PM -0400, Jeff Mahoney wrote:
 On 8/26/13 4:56 PM, Josef Bacik wrote:
 One of the complaints we get a lot is how many BUG_ON()'s we have.  So 
 to help
 with this I'm introducing a kconfig option to enable/disable a new 
 ASSERT()
 mechanism much like what XFS does.  This will allow us developers to 
 still get
 our nice panics but allow users/distros to compile them out.  With this 
 we can
 go through and convert any BUG_ON()'s that we have to catch actual 
 programming
 mistakes to the new ASSERT() and then fix everybody else to return 
 errors.  This
 will also allow developers to leave sanity checks in their new code to 
 make sure
 we don't trip over problems while testing stuff and vetting new 
 features.
 Thanks,

 I don't think the complaint is so much about the number of BUG_ONs, but
 that there's no distinction between something that is supposed to be
 impossible and something that is improbable. The BUG_ONs to keep code
 correctness are good and are littered all over the kernel with positive
 results. The BUG_ONs that are there in place of real error handling
 served their purpose and need to be replaced.

 So, I don't know if it's a net win to compile the good BUG_ONs out of
 the code. Especially if a user runs into something strange yet familiar
 and the first response is oh, huh, can you rebuild with asserts 
 enabled?


 Either I provide an option for it or distros do it themselves, this cuts 
 out the
 middle man.  I'd really rather they just be on all the time since they 
 aren't
 things we should hit anyway, but at least this way people have a choice.

 Ok. With my distro hat on, I can tell you I'll be leaving them on. :)

 -Jeff

 XFS also has XFS_WARN as a config option, which keeps all the assertions
 in place, but printk's  backtraces w/o the icky BUG().  That might be
 good to add as well, and perhaps best for a shipping distro (vs. a developer
 debugging who might want to drop a core file when the assert trips).
 
 Isn't that the distinction between BUG_ON and WARN_ON? If it's worth a
 BUG_ON, things should be bad enough (or could result in being bad
 enough) that we want to bail out.
 
 -Jeff
 
 Maybe; just FWIW here was Dave's rationale for xfs.  Right now btrfs
 doesn't have the behavior-changing side effect (no BTRFS_DEBUG config)
 though, so maybe the distinction is less important...

Yeah, I'd agree with the distinction not being there in btrfs (yet).
ReiserFS has a similar mode where there are a ton of checks that are
optionally enabled and does invasive things that can slow things down.
It's disabled pretty much universally AFAIK. One of the things (low) on
my TODO list is to go through all of those and move them into regular
checks since some of them are the types of things fsfuzzer likes to trip
over.

-Jeff

 xfs: introduce CONFIG_XFS_WARN
 
 Running a CONFIG_XFS_DEBUG kernel in production environments is not
 the best idea as it introduces significant overhead, can change
 the behaviour of algorithms (such as allocation) to improve test
 coverage, and (most importantly) panic the machine on non-fatal
 errors.
 
 There are many cases where all we want to do is run a
 kernel with more bounds checking enabled, such as is provided by the
 ASSERT() statements throughout the code, but without all the
 potential overhead and drawbacks.
 
 This patch converts all the ASSERT statements to evaluate as
 WARN_ON(1) statements and hence if they fail dump a warning and a
 stack trace to the log. This has minimal overhead and does not
 change any algorithms, and will allow us to find strange out of
 bounds problems more easily on production machines.
 
 There are a few places where assert statements contain debug only
 code. These are converted to be debug-or-warn only code so that we
 still get all the assert checks in the code.
 
 Signed-off-by: Dave Chinner dchin...@redhat.com
 
 
 

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


Re: bad unlock balance in btrfs_commit_transaction_async

2013-08-27 Thread Dan Mick

Another developer just noticed this in testing; anyone have any ideas?

On 08/22/2013 05:40 PM, Sage Weil wrote:

I just noticed that there is a locking imbalance warning with sb_internal
in the transaction commit code.  I believe this has only started appearing
recently (after I merged -rc5 into my testing tree), but I'm working on
confirming that. The error is

4[27034.835134] =
4[27034.839854] [ BUG: bad unlock balance detected! ]
4[27034.844576] 3.11.0-rc5-ceph-00061-g546140d #1 Not tainted
4[27034.849992] -
4[27034.854713] ceph-osd/30797 is trying to release lock (sb_internal) at:
4[27034.861304] [a0148fd8] 
btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
4[27034.868994] but there are no more locks to release!
4[27034.873887]
4[27034.873887] other info that might help us debug this:
4[27034.880448] no locks held by ceph-osd/30797.
4[27034.884733]
4[27034.884733] stack backtrace:
4[27034.889123] CPU: 0 PID: 30797 Comm: ceph-osd Not tainted 
3.11.0-rc5-ceph-00061-g546140d #1
4[27034.897421] Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 
02/07/2011
4[27034.904938]  a0148fd8 88020baf9c68 81642d85 
0007
4[27034.912411]  88021b32deb0 88020baf9c98 810ab89e 
88020cff8000
4[27034.919883]  0246 88020aaeddd0 a0148fd8 
88020baf9ce8
4[27034.927358] Call Trace:
4[27034.929836]  [a0148fd8] ? 
btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
4[27034.937790]  [81642d85] dump_stack+0x46/0x58
4[27034.942951]  [810ab89e] print_unlock_imbalance_bug+0xfe/0x110
4[27034.949599]  [a0148fd8] ? 
btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
4[27034.957552]  [810aeafe] lock_release+0x15e/0x220
4[27034.963069]  [a0148fff] 
btrfs_commit_transaction_async+0x1ef/0x2c0 [btrfs]
4[27034.970850]  [810af6f5] ? trace_hardirqs_on_caller+0x105/0x1d0
4[27034.977587]  [a0177907] btrfs_ioctl_start_sync+0x47/0xc0 [btrfs]
4[27034.984499]  [a017c575] btrfs_ioctl+0xe55/0x1af0 [btrfs]
4[27034.990700]  [8164ab2b] ? _raw_spin_unlock+0x2b/0x40
4[27034.996552]  [810b40b5] ? do_futex+0xa45/0xbb0
4[27035.001885]  [8119cf0c] ? fget_light+0x3c/0x130
4[27035.007302]  [811922a6] do_vfs_ioctl+0x96/0x560
4[27035.012720]  [8119cf6e] ? fget_light+0x9e/0x130
4[27035.018137]  [8119cf0c] ? fget_light+0x3c/0x130
4[27035.023556]  [81192801] SyS_ioctl+0x91/0xb0
4[27035.028628]  [813338fe] ? trace_hardirqs_on_thunk+0x3a/0x3f
4[27035.035089]  [81653782] system_call_fastpath+0x16/0x1b

This is presumably some breakage in the freeze locking dance that goes on
with async commits (btrfs_commit_transaction_async caller takes the freeze
semaphore, do_async_commit releases it), but it's not obvious to me what
broke yet.  Unless this rings any bells for anyone, I'll go ahead and
bisect.

Thanks!
sage
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bad unlock balance in btrfs_commit_transaction_async

2013-08-27 Thread Yan, Zheng
On Wed, Aug 28, 2013 at 8:56 AM, Dan Mick dan.m...@inktank.com wrote:
 Another developer just noticed this in testing; anyone have any ideas?


btrfs_ioctl_start_sync() calls btrfs_attach_transaction_barrier() which further
calls start_transaction() with type == TRANS_ATTACH.  In start_transaction(),
sb_start_intwrite() is called when (type  __TRANS_FREEZABLE) is true. but
(TRANS_ATTACH  __TRANS_FREEZABLE) is false. So we see the bad
bad unlock balance bug.

Yan, Zheng

 On 08/22/2013 05:40 PM, Sage Weil wrote:

 I just noticed that there is a locking imbalance warning with sb_internal
 in the transaction commit code.  I believe this has only started appearing
 recently (after I merged -rc5 into my testing tree), but I'm working on
 confirming that. The error is

 4[27034.835134] =
 4[27034.839854] [ BUG: bad unlock balance detected! ]
 4[27034.844576] 3.11.0-rc5-ceph-00061-g546140d #1 Not tainted
 4[27034.849992] -
 4[27034.854713] ceph-osd/30797 is trying to release lock (sb_internal)
 at:
 4[27034.861304] [a0148fd8]
 btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
 4[27034.868994] but there are no more locks to release!
 4[27034.873887]
 4[27034.873887] other info that might help us debug this:
 4[27034.880448] no locks held by ceph-osd/30797.
 4[27034.884733]
 4[27034.884733] stack backtrace:
 4[27034.889123] CPU: 0 PID: 30797 Comm: ceph-osd Not tainted
 3.11.0-rc5-ceph-00061-g546140d #1
 4[27034.897421] Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS
 1.6.3 02/07/2011
 4[27034.904938]  a0148fd8 88020baf9c68 81642d85
 0007
 4[27034.912411]  88021b32deb0 88020baf9c98 810ab89e
 88020cff8000
 4[27034.919883]  0246 88020aaeddd0 a0148fd8
 88020baf9ce8
 4[27034.927358] Call Trace:
 4[27034.929836]  [a0148fd8] ?
 btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
 4[27034.937790]  [81642d85] dump_stack+0x46/0x58
 4[27034.942951]  [810ab89e]
 print_unlock_imbalance_bug+0xfe/0x110
 4[27034.949599]  [a0148fd8] ?
 btrfs_commit_transaction_async+0x1c8/0x2c0 [btrfs]
 4[27034.957552]  [810aeafe] lock_release+0x15e/0x220
 4[27034.963069]  [a0148fff]
 btrfs_commit_transaction_async+0x1ef/0x2c0 [btrfs]
 4[27034.970850]  [810af6f5] ?
 trace_hardirqs_on_caller+0x105/0x1d0
 4[27034.977587]  [a0177907] btrfs_ioctl_start_sync+0x47/0xc0
 [btrfs]
 4[27034.984499]  [a017c575] btrfs_ioctl+0xe55/0x1af0 [btrfs]
 4[27034.990700]  [8164ab2b] ? _raw_spin_unlock+0x2b/0x40
 4[27034.996552]  [810b40b5] ? do_futex+0xa45/0xbb0
 4[27035.001885]  [8119cf0c] ? fget_light+0x3c/0x130
 4[27035.007302]  [811922a6] do_vfs_ioctl+0x96/0x560
 4[27035.012720]  [8119cf6e] ? fget_light+0x9e/0x130
 4[27035.018137]  [8119cf0c] ? fget_light+0x3c/0x130
 4[27035.023556]  [81192801] SyS_ioctl+0x91/0xb0
 4[27035.028628]  [813338fe] ?
 trace_hardirqs_on_thunk+0x3a/0x3f
 4[27035.035089]  [81653782] system_call_fastpath+0x16/0x1b

 This is presumably some breakage in the freeze locking dance that goes on
 with async commits (btrfs_commit_transaction_async caller takes the freeze
 semaphore, do_async_commit releases it), but it's not obvious to me what
 broke yet.  Unless this rings any bells for anyone, I'll go ahead and
 bisect.

 Thanks!
 sage
 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

 --
 To unsubscribe from this list: send the line unsubscribe linux-btrfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/2] btrfs-progs: prevent mkfs from aborting with small volume

2013-08-27 Thread Hidetoshi Seto
(2013/08/26 23:23), Eric Sandeen wrote:
 Thanks for looking into this - how small of a device did you test?
 
 I tried a 2MB device w/ these 2 patches and still got:
 
 [btrfs-progs]# truncate --size=2m testfile
 [btrfs-progs]# ./mkfs.btrfs testfile
 
 WARNING! - Btrfs v0.20-rc1-360-geeeb4e9 IS EXPERIMENTAL
 WARNING! - see http://btrfs.wiki.kernel.org before using
 
 SMALL VOLUME: forcing mixed metadata/data groups
 mkfs.btrfs: volumes.c:857: btrfs_alloc_chunk: Assertion `!(ret)' failed.
 Aborted (core dumped)
 
 which was at:
 
 ret = btrfs_alloc_dev_extent(trans, device,
  info-chunk_root-root_key.objectid,
  BTRFS_FIRST_CHUNK_TREE_OBJECTID, key.offset,
  calc_size, dev_offset);
 BUG_ON(ret);
 
 :(

Wow...
It seems that this abort is different problem from the bug which
my patches are going to fix.  I'll try to make new patch to fix this
problem.

 
 Also, I'm curious - I know the code existed before your patch 2/2,
 but I don't understand why it reserves 1MB for the first superblock 
 when the first superblock is actually at 64k.  Any idea?
 
 -Eric

I'm not sure... According to the git-log, this 1M trick is in 
the following old commit by Chris:

  commit a6de0bd778475504f42a142c83b8077993cbddfe
  Author: Chris Mason chris.ma...@oracle.com
  Date:   Thu Apr 3 16:35:48 2008 -0400

 Add mirroring support across multiple drives


Thanks,
H.Seto

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html