Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-11 Thread Chris Murphy
On Mon, Apr 11, 2016 at 3:51 PM, lenovomi  wrote:
> Hi,
>
> I didn't try mount -o ro. When I tried to mount it via eSATA I got a
> kernel panic immediately. Then I connected the enclosure with the drives via
> USB and tried to mount it:

OK so try '-o ro,recovery' and report back what you get.



>
> https://bpaste.net/show/641ab9172539
> plugged via USB -> mounted one of the drives at random: mount /dev/sda /mnt/brtfs
>
> I was told on irc channel that i should not run btrfs check and if so
> i should run it as
> btrfs check --repair --init-extent-tree
>
>
> Also there was recommendation to run btrfs restore before repair.

Did you use btrfs restore?
https://btrfs.wiki.kernel.org/index.php/Restore

And did you use --repair --init-extent-tree? I don't recommend it
until you use restore as well.



> Still not clear what should i do as next step.

1. mount with -o ro,recovery and get important data backed up. It
sounds like you don't have a backup?

2. If that doesn't work, use btrfs restore. It's tedious but at least
you can update your backup.

3. Next try btrfs check without repair. There's some nuance whether
it's better to use init-extent-tree or try zeroing the log. But don't
use repair until there's a current backup with 1 or 2.

-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume

2016-04-11 Thread Qu Wenruo



Mark Fasheh wrote on 2016/04/11 11:09 -0700:
> On Mon, Apr 11, 2016 at 09:05:47AM +0800, Qu Wenruo wrote:
>>
>>
>> Mark Fasheh wrote on 2016/04/08 12:18 -0700:
>>> On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
>>>> [cc: Mark and Qu]
>>>>
>>>> On 04/08/16 13:51, Holger Hoffstätte wrote:
>>>>> On 04/08/16 13:14, Filipe Manana wrote:
>>>>>> Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
>>>>>> patches, it didn't reproduce here:
>>>>>
>>>>> Great, that's good to know (sort of :). Thanks also to Liu Bo.
>>>>>
>>>>>> Are you sure that you are not using some patches not in 4.6?
>>>>
>>>> We have a bingo!
>>>>
>>>> Reverting "qgroup: Fix qgroup accounting when creating snapshot"
>>>> from last Wednesday immediately fixes the problem.
>>>
>>> Not surprising, I had some issues testing it out too. I'm pretty sure this
>>> patch is corrupting memory, I just haven't found where yet though my
>>> educated guess is that the transaction is being reused improperly.
>>> --Mark
>>>
>>> --
>>> Mark Fasheh
>>>
>>>
>> Still digging the bug Mark has reported about the patch.
>>
>> Good to have another report, as I can't always reproduce the soft
>> lockup from Mark.
>>
>> It seems that the WARN_ON will bring another clue to fix it.
>>
>> BTW, the memory corruption assumption seems to be quite helpful.
>> I didn't consider in that way, but it seems to be the only reason
>> causing dead spinlock while no other thread spinning and no lockdep
>> warning.
>
> It seems to be the call to commit_cowonly_roots() in your patch which sets
> everything off. If I remove that call I can run all day without a crash.
>
> Btw, I'm not convinced this fixes the qgroup numbers anyway - we are still
> inconsistent even if I don't get a crash.
>
> Have you tested that the actual numbers on your end are coming out ok?
> --Mark

Yes, my initial test shows that the snapshot of fs tree doesn't break
the number anymore.

And commit_cowonly_roots() is the core of the fix, without it the bug
won't be fixed.

I'm still digging but it seems to be related to missing
switch_commit_roots() call after commit_cowonly_roots(), but still
uncertain, as I'm not familiar with the commit codes.

Thanks,
Qu

>
> --
> Mark Fasheh





Re: [PATCH 1/2] btrfs: use dynamic allocation for root item in create_subvol

2016-04-11 Thread Tsutomu Itoh
On 2016/04/12 3:04, David Sterba wrote:
> The size of root item is more than 400 bytes, which is quite a lot of
> stack space. As we do IO from inside the subvolume ioctls, we should
> keep the stack usage low in case the filesystem is on top of other
> layers (NFS, device mapper, iscsi, etc).
> 
> Signed-off-by: David Sterba 
> ---
>   fs/btrfs/ioctl.c | 49 ++---
>   1 file changed, 26 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 053e677839fe..0be13b9c53d9 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -439,7 +439,7 @@ static noinline int create_subvol(struct inode *dir,
>   {
>   struct btrfs_trans_handle *trans;
>   struct btrfs_key key;
> - struct btrfs_root_item root_item;
> + struct btrfs_root_item *root_item;
>   struct btrfs_inode_item *inode_item;
>   struct extent_buffer *leaf;
>   struct btrfs_root *root = BTRFS_I(dir)->root;
> @@ -455,6 +455,10 @@ static noinline int create_subvol(struct inode *dir,
>   u64 qgroup_reserved;
>   uuid_le new_uuid;
>   
> + root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
> + if (!root_item)
> + return -ENOMEM;
> +
>   ret = btrfs_find_free_objectid(root->fs_info->tree_root, &objectid);
>   if (ret)
>   return ret;

'kfree(root_item)' is necessary here and before every other early 'return'.

Thanks,
Tsutomu
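
The point above can be illustrated with a small user-space sketch. This is
not the kernel code: struct root_item_sketch, find_free_objectid_sketch()
and live_allocs are hypothetical names made up for the example, and calloc()
and free() stand in for kzalloc() and kfree(). It only shows the usual
pattern of routing every exit after the allocation through one label.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_root_item; the size is illustrative. */
struct root_item_sketch {
	unsigned char payload[400];
};

/* Counts live allocations so a caller can check that nothing leaked. */
static int live_allocs;

/* Hypothetical step that can fail, standing in for btrfs_find_free_objectid(). */
static int find_free_objectid_sketch(int fail)
{
	return fail ? -ENOSPC : 0;
}

/*
 * Allocate the item on the heap and send every exit that follows the
 * allocation through a single label that frees it.
 */
int create_subvol_sketch(int fail_early)
{
	struct root_item_sketch *item;
	int ret;

	item = calloc(1, sizeof(*item));
	if (!item)
		return -ENOMEM;	/* nothing allocated yet, plain return is fine */
	live_allocs++;

	ret = find_free_objectid_sketch(fail_early);
	if (ret)
		goto out;	/* a bare "return ret" here would leak item */

	item->payload[0] = 1;	/* ... fill in and use the item ... */

out:
	free(item);
	live_allocs--;
	assert(live_allocs >= 0);
	return ret;
}
```

Calling create_subvol_sketch(1) exercises the early-failure path; live_allocs
stays at zero afterwards because that path also goes through the label.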

> @@ -509,47 +513,45 @@ static noinline int create_subvol(struct inode *dir,
>   BTRFS_UUID_SIZE);
>   btrfs_mark_buffer_dirty(leaf);
>   
> - memset(&root_item, 0, sizeof(root_item));
> -
> - inode_item = &root_item.inode;
> + inode_item = &root_item->inode;
>   btrfs_set_stack_inode_generation(inode_item, 1);
>   btrfs_set_stack_inode_size(inode_item, 3);
>   btrfs_set_stack_inode_nlink(inode_item, 1);
>   btrfs_set_stack_inode_nbytes(inode_item, root->nodesize);
>   btrfs_set_stack_inode_mode(inode_item, S_IFDIR | 0755);
>   
> - btrfs_set_root_flags(&root_item, 0);
> - btrfs_set_root_limit(&root_item, 0);
> + btrfs_set_root_flags(root_item, 0);
> + btrfs_set_root_limit(root_item, 0);
>   btrfs_set_stack_inode_flags(inode_item, BTRFS_INODE_ROOT_ITEM_INIT);
>   
> - btrfs_set_root_bytenr(&root_item, leaf->start);
> - btrfs_set_root_generation(&root_item, trans->transid);
> - btrfs_set_root_level(&root_item, 0);
> - btrfs_set_root_refs(&root_item, 1);
> - btrfs_set_root_used(&root_item, leaf->len);
> - btrfs_set_root_last_snapshot(&root_item, 0);
> + btrfs_set_root_bytenr(root_item, leaf->start);
> + btrfs_set_root_generation(root_item, trans->transid);
> + btrfs_set_root_level(root_item, 0);
> + btrfs_set_root_refs(root_item, 1);
> + btrfs_set_root_used(root_item, leaf->len);
> + btrfs_set_root_last_snapshot(root_item, 0);
>   
> - btrfs_set_root_generation_v2(&root_item,
> - btrfs_root_generation(&root_item));
> + btrfs_set_root_generation_v2(root_item,
> + btrfs_root_generation(root_item));
>   uuid_le_gen(&new_uuid);
> - memcpy(root_item.uuid, new_uuid.b, BTRFS_UUID_SIZE);
> - btrfs_set_stack_timespec_sec(&root_item.otime, cur_time.tv_sec);
> - btrfs_set_stack_timespec_nsec(&root_item.otime, cur_time.tv_nsec);
> - root_item.ctime = root_item.otime;
> - btrfs_set_root_ctransid(&root_item, trans->transid);
> - btrfs_set_root_otransid(&root_item, trans->transid);
> + memcpy(root_item->uuid, new_uuid.b, BTRFS_UUID_SIZE);
> + btrfs_set_stack_timespec_sec(&root_item->otime, cur_time.tv_sec);
> + btrfs_set_stack_timespec_nsec(&root_item->otime, cur_time.tv_nsec);
> + root_item->ctime = root_item->otime;
> + btrfs_set_root_ctransid(root_item, trans->transid);
> + btrfs_set_root_otransid(root_item, trans->transid);
>   
>   btrfs_tree_unlock(leaf);
>   free_extent_buffer(leaf);
>   leaf = NULL;
>   
> - btrfs_set_root_dirid(&root_item, new_dirid);
> + btrfs_set_root_dirid(root_item, new_dirid);
>   
>   key.objectid = objectid;
>   key.offset = 0;
>   key.type = BTRFS_ROOT_ITEM_KEY;
>   ret = btrfs_insert_root(trans, root->fs_info->tree_root, &key,
> - &root_item);
> + root_item);
>   if (ret)
>   goto fail;
>   
> @@ -601,12 +603,13 @@ static noinline int create_subvol(struct inode *dir,
>   BUG_ON(ret);
>   
>   ret = btrfs_uuid_tree_add(trans, root->fs_info->uuid_root,
> -   root_item.uuid, BTRFS_UUID_KEY_SUBVOL,
> +   root_item->uuid, BTRFS_UUID_KEY_SUBVOL,
> objectid);
>   if (ret)
>   btrfs_abort_transaction(trans, root, ret);
>   
>   fail:
> + kfree(root_item);
>   trans->block_rsv = NULL;
>   trans->bytes_reserved = 0;
>   btrfs_subvolume_release_metadata(root, &block_rsv, qgroup_reserved);
> 


Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-11 Thread lenovomi
Hi,

I didn't try mount -o ro. When I tried to mount it via eSATA I got a
kernel panic immediately. Then I connected the enclosure with the drives via
USB and tried to mount it:

https://bpaste.net/show/641ab9172539
plugged via USB -> mounted one of the drives at random: mount /dev/sda /mnt/brtfs

I was told on the IRC channel that I should not run btrfs check, and that
if I do, I should run it as
btrfs check --repair --init-extent-tree


There was also a recommendation to run btrfs restore before repair.


It was mounted and everything worked fine. Then I noticed that one NFS
mount was missing and found that the cubox was dead, so I rebooted it, and
now I can't mount that btrfs volume (data r0, metadata in raid1).

I was told that both metadata copies were corrupted?


It's still not clear what I should do as the next step.


thanks

On Mon, Apr 11, 2016 at 11:16 PM, Chris Murphy  wrote:
> On Mon, Apr 11, 2016 at 2:08 PM, lenovomi  wrote:
>> Hi,
>>
>>
>> I was running cubox with kernel 4.4.0 and with btrfs raid1 ...
>>
>> now it failed somehow and I'm getting a kernel panic during boot...
>>
>> https://bpaste.net/show/0455daa876de
>>
>> i tried to connect the esata box with 3.x
>>
>> https://bpaste.net/show/98732bc6ce49
>>
>>
>> Any idea? Does it mean that whole volume is dead and i lost ALL THE DATA?
>
> Probably not, but there isn't enough information.
>
> Try 'mount -o ro,recovery' and post all kernel messages for it. If
> that fails, then run 'btrfs check' using a recent btrfs-progs (4.4.1
> or newer hopefully) without --repair, and post those results.
>
> What happened between the last time it mounted OK and when it fails?
> Power failure or just kernel panic and reboot? I don't see the
> complete kernel panic in either of your pastes. Actually the minute
> before the kernel panic might be useful also so I suggest not trimming
> the messages so much.
>
>
> --
> Chris Murphy


[PATCH] Btrfs: track transid for delayed ref flushing

2016-04-11 Thread Josef Bacik
Using the offwakecputime bpf script I noticed most of our time was spent waiting
on the delayed ref throttling.  This is what is supposed to happen, but
sometimes the transaction can commit and then we're waiting for throttling that
doesn't matter anymore.  So change this stuff to be a little smarter by tracking
the transid we were in when we initiated the throttling.  If the transaction we
get is different then we can just bail out.  This resulted in a 50% speedup in
my fs_mark test, and reduced the amount of time spent throttling by 60 seconds
over the entire run (which is about 30 minutes).  Thanks,

Signed-off-by: Josef Bacik 
---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/extent-tree.c | 15 ---
 fs/btrfs/inode.c   |  1 +
 fs/btrfs/transaction.c |  3 ++-
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 55a24c5..4222936 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3505,7 +3505,7 @@ void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
 int btrfs_run_delayed_refs(struct btrfs_trans_handle *trans,
   struct btrfs_root *root, unsigned long count);
 int btrfs_async_run_delayed_refs(struct btrfs_root *root,
-unsigned long count, int wait);
+unsigned long count, u64 transid, int wait);
 int btrfs_lookup_data_extent(struct btrfs_root *root, u64 start, u64 len);
 int btrfs_lookup_extent_info(struct btrfs_trans_handle *trans,
 struct btrfs_root *root, u64 bytenr,
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 4b5a517..f23f426 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2839,6 +2839,7 @@ int btrfs_should_throttle_delayed_refs(struct btrfs_trans_handle *trans,
 
 struct async_delayed_refs {
struct btrfs_root *root;
+   u64 transid;
int count;
int error;
int sync;
@@ -2854,9 +2855,16 @@ static void delayed_ref_async_start(struct btrfs_work *work)
 
async = container_of(work, struct async_delayed_refs, work);
 
-   trans = btrfs_join_transaction(async->root);
+   trans = btrfs_attach_transaction(async->root);
if (IS_ERR(trans)) {
-   async->error = PTR_ERR(trans);
+   if (PTR_ERR(trans) != -ENOENT)
+   async->error = PTR_ERR(trans);
+   goto done;
+   }
+
+   /* Don't bother flushing if we got into a different transaction */
+   if (trans->transid != async->transid) {
+   btrfs_end_transaction(trans, async->root);
goto done;
}
 
@@ -2880,7 +2888,7 @@ done:
 }
 
 int btrfs_async_run_delayed_refs(struct btrfs_root *root,
-unsigned long count, int wait)
+unsigned long count, u64 transid, int wait)
 {
struct async_delayed_refs *async;
int ret;
@@ -2892,6 +2900,7 @@ int btrfs_async_run_delayed_refs(struct btrfs_root *root,
async->root = root->fs_info->tree_root;
async->count = count;
async->error = 0;
+   async->transid = transid;
if (wait)
async->sync = 1;
else
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 723e4bb..e6dd4cc 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4534,6 +4534,7 @@ delete:
BUG_ON(ret);
if (btrfs_should_throttle_delayed_refs(trans, root))
btrfs_async_run_delayed_refs(root,
+trans->transid,
trans->delayed_ref_updates * 2, 0);
if (be_nice) {
if (truncate_space_check(trans, root,
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 43885e5..7c7671d 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -817,6 +817,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 {
struct btrfs_transaction *cur_trans = trans->transaction;
struct btrfs_fs_info *info = root->fs_info;
+   u64 transid = trans->transid;
unsigned long cur = trans->delayed_ref_updates;
int lock = (trans->type != TRANS_JOIN_NOLOCK);
int err = 0;
@@ -904,7 +905,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 
kmem_cache_free(btrfs_trans_handle_cachep, trans);
if (must_run_delayed_refs) {
-   btrfs_async_run_delayed_refs(root, cur,
+   btrfs_async_run_delayed_refs(root, cur, transid,
 must_run_delayed_refs == 1);
}
return err;
-- 
2.5.0
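
The idea described in the changelog can be sketched in plain user-space C.
This is only an illustration under stated assumptions, not the kernel
implementation: current_transid, queue_flush(), run_flush() and
flush_after_commits() are hypothetical names, and a simple counter stands in
for the running transaction. The queued work remembers the transaction id it
was created in and is skipped if a newer transaction is running when it
finally executes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the id of the currently running transaction. */
static uint64_t current_transid = 1;

/* Counts how often the (expensive) flush really ran. */
static int flushes_run;

struct async_refs_sketch {
	uint64_t transid;	/* transaction in which throttling was requested */
	unsigned long count;	/* number of delayed refs to flush */
};

/* Record the id of the transaction that asked for the flush. */
static struct async_refs_sketch queue_flush(unsigned long count)
{
	struct async_refs_sketch work = { current_transid, count };

	return work;
}

/* Returns 1 if the flush ran, 0 if it was skipped as stale. */
static int run_flush(const struct async_refs_sketch *work)
{
	assert(work != NULL);

	/* Don't bother flushing if we got into a different transaction. */
	if (work->transid != current_transid)
		return 0;

	flushes_run++;
	return 1;
}

/* Queue a flush, let 'commits' transactions commit, then run the work. */
int flush_after_commits(unsigned int commits)
{
	struct async_refs_sketch work = queue_flush(32);

	current_transid += commits;
	return run_flush(&work);
}
```

With no commit in between, flush_after_commits(0) runs the flush; with one
commit in between, flush_after_commits(1) bails out, which is the case the
patch is meant to avoid waiting on.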


Re: KERNEL PANIC + CORRUPTED BTRFS?

2016-04-11 Thread Chris Murphy
On Mon, Apr 11, 2016 at 2:08 PM, lenovomi  wrote:
> Hi,
>
>
> I was running cubox with kernel 4.4.0 and with btrfs raid1 ...
>
> now it failed somehow and I'm getting a kernel panic during boot...
>
> https://bpaste.net/show/0455daa876de
>
> i tried to connect the esata box with 3.x
>
> https://bpaste.net/show/98732bc6ce49
>
>
> Any idea? Does it mean that whole volume is dead and i lost ALL THE DATA?

Probably not, but there isn't enough information.

Try 'mount -o ro,recovery' and post all kernel messages for it. If
that fails, then run 'btrfs check' using a recent btrfs-progs (4.4.1
or newer hopefully) without --repair, and post those results.

What happened between the last time it mounted OK and when it fails?
Power failure or just kernel panic and reboot? I don't see the
complete kernel panic in either of your pastes. Actually the minute
before the kernel panic might be useful also so I suggest not trimming
the messages so much.


-- 
Chris Murphy


KERNEL PANIC + CORRUPTED BTRFS?

2016-04-11 Thread lenovomi
Hi,


I was running a cubox with kernel 4.4.0 and with btrfs raid1 ...

Now it failed somehow and I'm getting a kernel panic during boot...

https://bpaste.net/show/0455daa876de

I tried to connect the eSATA box with 3.x

https://bpaste.net/show/98732bc6ce49


Any idea? Does it mean that the whole volume is dead and I lost ALL THE DATA?


Thanks


Re: WARN_ON in record_root_in_trans() when deleting freshly renamed subvolume

2016-04-11 Thread Mark Fasheh
On Mon, Apr 11, 2016 at 09:05:47AM +0800, Qu Wenruo wrote:
> 
> 
> Mark Fasheh wrote on 2016/04/08 12:18 -0700:
> >On Fri, Apr 08, 2016 at 03:10:35PM +0200, Holger Hoffstätte wrote:
> >>[cc: Mark and Qu]
> >>
> >>On 04/08/16 13:51, Holger Hoffstätte wrote:
> >>>On 04/08/16 13:14, Filipe Manana wrote:
> >>>>Using Chris' for-linus-4.6 branch, which is 4.5-rc6 + all 4.6 btrfs
> >>>>patches, it didn't reproduce here:
> >>>
> >>>Great, that's good to know (sort of :). Thanks also to Liu Bo.
> >>>
> >>>>Are you sure that you are not using some patches not in 4.6?
> >>
> >>We have a bingo!
> >>
> >>Reverting "qgroup: Fix qgroup accounting when creating snapshot"
> >>from last Wednesday immediately fixes the problem.
> >
> >Not surprising, I had some issues testing it out too. I'm pretty sure this
> >patch is corrupting memory, I just haven't found where yet though my
> >educated guess is that the transaction is being reused improperly.
> > --Mark
> >
> >--
> >Mark Fasheh
> >
> >
> Still digging the bug Mark has reported about the patch.
> 
> Good to have another report, as I can't always reproduce the soft
> lockup from Mark.
> 
> It seems that the WARN_ON will bring another clue to fix it.
> 
> BTW, the memory corruption assumption seems to be quite helpful.
> I didn't consider in that way, but it seems to be the only reason
> causing dead spinlock while no other thread spinning and no lockdep
> warning.

It seems to be the call to commit_cowonly_roots() in your patch which sets
everything off. If I remove that call I can run all day without a crash.

Btw, I'm not convinced this fixes the qgroup numbers anyway - we are still
inconsistent even if I don't get a crash.

Have you tested that the actual numbers on your end are coming out ok?
--Mark

--
Mark Fasheh


[PATCH 1/2] btrfs: use dynamic allocation for root item in create_subvol

2016-04-11 Thread David Sterba
The size of root item is more than 400 bytes, which is quite a lot of
stack space. As we do IO from inside the subvolume ioctls, we should
keep the stack usage low in case the filesystem is on top of other
layers (NFS, device mapper, iscsi, etc).

Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 49 ++---
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 053e677839fe..0be13b9c53d9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -439,7 +439,7 @@ static noinline int create_subvol(struct inode *dir,
 {
struct btrfs_trans_handle *trans;
struct btrfs_key key;
-   struct btrfs_root_item root_item;
+   struct btrfs_root_item *root_item;
struct btrfs_inode_item *inode_item;
struct extent_buffer *leaf;
struct btrfs_root *root = BTRFS_I(dir)->root;
@@ -455,6 +455,10 @@ static noinline int create_subvol(struct inode *dir,
u64 qgroup_reserved;
uuid_le new_uuid;
 
+   root_item = kzalloc(sizeof(*root_item), GFP_KERNEL);
+   if (!root_item)
+   return -ENOMEM;
+
ret = btrfs_find_free_objectid(root->fs_info->tree_root, &objectid);
if (ret)
return ret;
@@ -509,47 +513,45 @@ static noinline int create_subvol(struct inode *dir,
BTRFS_UUID_SIZE);
btrfs_mark_buffer_dirty(leaf);
 
-   memset(&root_item, 0, sizeof(root_item));
-
-   inode_item = &root_item.inode;
+   inode_item = &root_item->inode;
btrfs_set_stack_inode_generation(inode_item, 1);
btrfs_set_stack_inode_size(inode_item, 3);
btrfs_set_stack_inode_nlink(inode_item, 1);
btrfs_set_stack_inode_nbytes(inode_item, root->nodesize);
btrfs_set_stack_inode_mode(inode_item, S_IFDIR | 0755);
 
-   btrfs_set_root_flags(&root_item, 0);
-   btrfs_set_root_limit(&root_item, 0);
+   btrfs_set_root_flags(root_item, 0);
+   btrfs_set_root_limit(root_item, 0);
btrfs_set_stack_inode_flags(inode_item, BTRFS_INODE_ROOT_ITEM_INIT);
 
-   btrfs_set_root_bytenr(&root_item, leaf->start);
-   btrfs_set_root_generation(&root_item, trans->transid);
-   btrfs_set_root_level(&root_item, 0);
-   btrfs_set_root_refs(&root_item, 1);
-   btrfs_set_root_used(&root_item, leaf->len);
-   btrfs_set_root_last_snapshot(&root_item, 0);
+   btrfs_set_root_bytenr(root_item, leaf->start);
+   btrfs_set_root_generation(root_item, trans->transid);
+   btrfs_set_root_level(root_item, 0);
+   btrfs_set_root_refs(root_item, 1);
+   btrfs_set_root_used(root_item, leaf->len);
+   btrfs_set_root_last_snapshot(root_item, 0);
 
-   btrfs_set_root_generation_v2(&root_item,
-   btrfs_root_generation(&root_item));
+   btrfs_set_root_generation_v2(root_item,
+   btrfs_root_generation(root_item));
uuid_le_gen(&new_uuid);
-   memcpy(root_item.uuid, new_uuid.b, BTRFS_UUID_SIZE);
-   btrfs_set_stack_timespec_sec(&root_item.otime, cur_time.tv_sec);
-   btrfs_set_stack_timespec_nsec(&root_item.otime, cur_time.tv_nsec);
-   root_item.ctime = root_item.otime;
-   btrfs_set_root_ctransid(&root_item, trans->transid);
-   btrfs_set_root_otransid(&root_item, trans->transid);
+   memcpy(root_item->uuid, new_uuid.b, BTRFS_UUID_SIZE);
+   btrfs_set_stack_timespec_sec(&root_item->otime, cur_time.tv_sec);
+   btrfs_set_stack_timespec_nsec(&root_item->otime, cur_time.tv_nsec);
+   root_item->ctime = root_item->otime;
+   btrfs_set_root_ctransid(root_item, trans->transid);
+   btrfs_set_root_otransid(root_item, trans->transid);
 
btrfs_tree_unlock(leaf);
free_extent_buffer(leaf);
leaf = NULL;
 
-   btrfs_set_root_dirid(&root_item, new_dirid);
+   btrfs_set_root_dirid(root_item, new_dirid);
 
key.objectid = objectid;
key.offset = 0;
key.type = BTRFS_ROOT_ITEM_KEY;
ret = btrfs_insert_root(trans, root->fs_info->tree_root, &key,
-   &root_item);
+   root_item);
if (ret)
goto fail;
 
@@ -601,12 +603,13 @@ static noinline int create_subvol(struct inode *dir,
BUG_ON(ret);
 
ret = btrfs_uuid_tree_add(trans, root->fs_info->uuid_root,
- root_item.uuid, BTRFS_UUID_KEY_SUBVOL,
+ root_item->uuid, BTRFS_UUID_KEY_SUBVOL,
  objectid);
if (ret)
btrfs_abort_transaction(trans, root, ret);
 
 fail:
+   kfree(root_item);
trans->block_rsv = NULL;
trans->bytes_reserved = 0;
btrfs_subvolume_release_metadata(root, &block_rsv, qgroup_reserved);
-- 
2.7.1



[PATCH 0/2] Stack usage reduction

2016-04-11 Thread David Sterba
Hi,

using the gcc option -fstack-usage I measured the stack usage and tried to get
rid of the worst offenders. The best improvement was in create_subvol where we
stored a 400B+ structure on stack, but otherwise it's not always clear why the
stack usage is that high. Most functions consume less than 300 bytes, and this
number can account for inlined functions or other invisible compiler
optimization magic.

A few samples of what's left:

scrub.c:3056:31:scrub_stripe                     568  static
volumes.c:3984:12:btrfs_uuid_scan_kthread        536  static
scrub.c:2834:31:scrub_raid56_parity              384  static
ioctl.c:5225:12:btrfs_ioctl_set_fslabel          304  static
ioctl.c:1261:5:btrfs_defrag_file                 304  static
tree-log.c:3593:21:copy_items                    288  dynamic,bounded
ioctl.c:5202:12:btrfs_ioctl_get_fslabel          288  static
ioctl.c:3457:12:btrfs_clone                      288  static
extent_io.c:2873:12:__do_readpage                288  static
file.c:691:5:__btrfs_drop_extents                272  static
file.c:2646:13:btrfs_fallocate                   272  static
extent-tree.c:2469:21:__btrfs_run_delayed_refs   272  static
extent_io.c:3779:5:btree_write_cache_pages       272  static
extent_io.c:1730:6:extent_clear_unlock_delalloc  272  static
tree-log.c:4432:12:btrfs_log_inode               264  dynamic,bounded
tree-log.c:578:21:replay_one_extent              256  static
transaction.c:1322:21:create_pending_snapshot    256  static
ioctl.c:434:21:create_subvol                     256  static
inode.c:1221:21:run_delalloc_nocow               256  static
extent_io.c:4145:5:extent_readpages              256  static
tree-log.c:4126:12:btrfs_log_changed_extents     248  dynamic,bounded
relocation.c:680:22:build_backref_tree           248  static
inode.c:4287:5:btrfs_truncate_inode_items        248  static
extent_io.c:3312:31:__extent_writepage_io        248  static
volumes.c:2948:12:insert_balance_item            240  static
relocation.c:1762:5:replace_path                 240  static
...

More detailed info would be needed to decide whether it's worth reshuffling
the stack variables. From what I've seen it would make the code less
readable, so I've stopped.


The following changes since commit bb7ab3b92e46da06b580c6f83abe7894dc449cca:

  btrfs: Fix misspellings in comments. (2016-03-14 15:05:02 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git dev/stack-reduce

for you to fetch changes up to 355ef15a6eafc0cafd49b049d7173040adb44f61:

  btrfs: reuse existing variable in scrub_stripe, reduce stack usage (2016-03-24 18:06:34 +0100)


David Sterba (2):
  btrfs: use dynamic allocation for root item in create_subvol
  btrfs: reuse existing variable in scrub_stripe, reduce stack usage

 fs/btrfs/ioctl.c | 49 ++---
 fs/btrfs/scrub.c | 19 +--
 2 files changed, 35 insertions(+), 33 deletions(-)


[PATCH 2/2] btrfs: reuse existing variable in scrub_stripe, reduce stack usage

2016-04-11 Thread David Sterba
The key variable occupies 17 bytes and key_start is used only once, so we
can simply reuse the existing 'key' for that purpose. As the key is not a
simple type, the compiler does not do this on its own.

Signed-off-by: David Sterba 
---
 fs/btrfs/scrub.c | 19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 39dbdcbf4d13..07db452e4a15 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -3070,7 +3070,6 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
int slot;
u64 nstripes;
struct extent_buffer *l;
-   struct btrfs_key key;
u64 physical;
u64 logical;
u64 logic_end;
@@ -3079,7 +3078,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
int mirror_num;
struct reada_control *reada1;
struct reada_control *reada2;
-   struct btrfs_key key_start;
+   struct btrfs_key key;
struct btrfs_key key_end;
u64 increment = map->stripe_len;
u64 offset;
@@ -3158,21 +3157,21 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx,
scrub_blocked_if_needed(fs_info);
 
/* FIXME it might be better to start readahead at commit root */
-   key_start.objectid = logical;
-   key_start.type = BTRFS_EXTENT_ITEM_KEY;
-   key_start.offset = (u64)0;
+   key.objectid = logical;
+   key.type = BTRFS_EXTENT_ITEM_KEY;
+   key.offset = (u64)0;
key_end.objectid = logic_end;
key_end.type = BTRFS_METADATA_ITEM_KEY;
key_end.offset = (u64)-1;
-   reada1 = btrfs_reada_add(root, &key_start, &key_end);
+   reada1 = btrfs_reada_add(root, &key, &key_end);
 
-   key_start.objectid = BTRFS_EXTENT_CSUM_OBJECTID;
-   key_start.type = BTRFS_EXTENT_CSUM_KEY;
-   key_start.offset = logical;
+   key.objectid = BTRFS_EXTENT_CSUM_OBJECTID;
+   key.type = BTRFS_EXTENT_CSUM_KEY;
+   key.offset = logical;
key_end.objectid = BTRFS_EXTENT_CSUM_OBJECTID;
key_end.type = BTRFS_EXTENT_CSUM_KEY;
key_end.offset = logic_end;
-   reada2 = btrfs_reada_add(csum_root, &key_start, &key_end);
+   reada2 = btrfs_reada_add(csum_root, &key, &key_end);
 
if (!IS_ERR(reada1))
btrfs_reada_wait(reada1);
-- 
2.7.1



[PATCH 1/6] btrfs: send: use vmalloc only as fallback for send_buf

2016-04-11 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/send.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 19b7bf4284ee..8f6f9d6d14df 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6022,10 +6022,13 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
sctx->clone_roots_cnt = arg->clone_sources_count;
 
sctx->send_max_size = BTRFS_SEND_BUF_SIZE;
-   sctx->send_buf = vmalloc(sctx->send_max_size);
+   sctx->send_buf = kmalloc(sctx->send_max_size, GFP_KERNEL | __GFP_NOWARN);
if (!sctx->send_buf) {
-   ret = -ENOMEM;
-   goto out;
+   sctx->send_buf = vmalloc(sctx->send_max_size);
+   if (!sctx->send_buf) {
+   ret = -ENOMEM;
+   goto out;
+   }
}
 
sctx->read_buf = vmalloc(BTRFS_SEND_READ_SIZE);
@@ -6214,7 +6217,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
fput(sctx->send_filp);
 
vfree(sctx->clone_roots);
-   vfree(sctx->send_buf);
+   kvfree(sctx->send_buf);
vfree(sctx->read_buf);
 
name_cache_free(sctx);
-- 
2.7.1
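
The allocation pattern in this patch (try the cheaper allocator quietly, fall
back to the general one, and free through one routine that can tell the two
apart) can be sketched in user space as follows. This is an analogy under
stated assumptions, not kernel code: fast_alloc(), is_fast_addr(),
alloc_with_fallback(), free_any() and roundtrip() are hypothetical names, a
single static slot plays the part of kmalloc() and malloc() plays the part of
vmalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOT_SIZE 4096

/*
 * Hypothetical "cheap but limited" allocator: one static slot that can be
 * handed out to a single user at a time.
 */
static union {
	max_align_t align;		/* forces suitable alignment */
	unsigned char bytes[SLOT_SIZE];
} slot;
static int slot_busy;

static void *fast_alloc(size_t size)
{
	if (slot_busy || size > SLOT_SIZE)
		return NULL;		/* fail quietly, the caller falls back */
	slot_busy = 1;
	return slot.bytes;
}

/* Tells the two allocators apart by looking at the address. */
static int is_fast_addr(const void *p)
{
	return p == (const void *)slot.bytes;
}

/* Try the cheap allocator first, use malloc() only as a fallback. */
void *alloc_with_fallback(size_t size)
{
	void *p = fast_alloc(size);

	if (!p)
		p = malloc(size);
	return p;
}

/* One free routine for both cases, in the spirit of kvfree(). */
void free_any(void *p)
{
	if (is_fast_addr(p)) {
		assert(slot_busy);	/* catches a double free of the slot */
		slot_busy = 0;
	} else {
		free(p);
	}
}

/* Returns 1 if the fast path served the request, 0 for the fallback, -1 on failure. */
int roundtrip(size_t size)
{
	void *p = alloc_with_fallback(size);
	int fast;

	if (!p)
		return -1;
	fast = is_fast_addr(p);
	free_any(p);
	return fast;
}
```

A request that fits the slot is served by the fast path, and an oversized one
silently takes the fallback; the caller frees both the same way, which is why
the patch switches the matching vfree() to kvfree().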



[PATCH 5/6] btrfs: send: use vmalloc only as fallback for clone_sources_tmp

2016-04-11 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/send.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 02967374d0d9..53a40a7077a2 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6059,10 +6059,13 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
alloc_size = arg->clone_sources_count * sizeof(*arg->clone_sources);
 
if (arg->clone_sources_count) {
-   clone_sources_tmp = vmalloc(alloc_size);
+   clone_sources_tmp = kmalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN);
if (!clone_sources_tmp) {
-   ret = -ENOMEM;
-   goto out;
+   clone_sources_tmp = vmalloc(alloc_size);
+   if (!clone_sources_tmp) {
+   ret = -ENOMEM;
+   goto out;
+   }
}
 
ret = copy_from_user(clone_sources_tmp, arg->clone_sources,
@@ -6100,7 +6103,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
sctx->clone_roots[i].root = clone_root;
clone_sources_to_rollback = i + 1;
}
-   vfree(clone_sources_tmp);
+   kvfree(clone_sources_tmp);
clone_sources_tmp = NULL;
}
 
@@ -6218,7 +6221,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
btrfs_root_dec_send_in_progress(sctx->parent_root);
 
kfree(arg);
-   vfree(clone_sources_tmp);
+   kvfree(clone_sources_tmp);
 
if (sctx) {
if (sctx->send_filp)
-- 
2.7.1



[PATCH 3/6] btrfs: send: use temporary variable to store allocation size

2016-04-11 Thread David Sterba
We're going to use the argument multiple times later.

Signed-off-by: David Sterba 
---
 fs/btrfs/send.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index fc9d7f6212c1..ab1b4d259836 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -5939,6 +5939,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
u32 i;
u64 *clone_sources_tmp = NULL;
int clone_sources_to_rollback = 0;
+   unsigned alloc_size;
int sort_clone_roots = 0;
int index;
 
@@ -6044,24 +6045,25 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
sctx->waiting_dir_moves = RB_ROOT;
sctx->orphan_dirs = RB_ROOT;
 
-   sctx->clone_roots = vzalloc(sizeof(struct clone_root) *
-   (arg->clone_sources_count + 1));
+   alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1);
+
+   sctx->clone_roots = vzalloc(alloc_size);
if (!sctx->clone_roots) {
ret = -ENOMEM;
goto out;
}
 
+   alloc_size = arg->clone_sources_count * sizeof(*arg->clone_sources);
+
if (arg->clone_sources_count) {
-   clone_sources_tmp = vmalloc(arg->clone_sources_count *
-   sizeof(*arg->clone_sources));
+   clone_sources_tmp = vmalloc(alloc_size);
if (!clone_sources_tmp) {
ret = -ENOMEM;
goto out;
}
 
ret = copy_from_user(clone_sources_tmp, arg->clone_sources,
-   arg->clone_sources_count *
-   sizeof(*arg->clone_sources));
+   alloc_size);
if (ret) {
ret = -EFAULT;
goto out;
-- 
2.7.1



[PATCH 4/6] btrfs: send: use vmalloc only as fallback for clone_roots

2016-04-11 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/send.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index ab1b4d259836..02967374d0d9 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6047,10 +6047,13 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
 
alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1);
 
-   sctx->clone_roots = vzalloc(alloc_size);
+   sctx->clone_roots = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN);
if (!sctx->clone_roots) {
-   ret = -ENOMEM;
-   goto out;
+   sctx->clone_roots = vzalloc(alloc_size);
+   if (!sctx->clone_roots) {
+   ret = -ENOMEM;
+   goto out;
+   }
}
 
alloc_size = arg->clone_sources_count * sizeof(*arg->clone_sources);
@@ -6221,7 +6224,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
if (sctx->send_filp)
fput(sctx->send_filp);
 
-   vfree(sctx->clone_roots);
+   kvfree(sctx->clone_roots);
kvfree(sctx->send_buf);
kvfree(sctx->read_buf);
 
-- 
2.7.1



[PATCH 6/6] btrfs: clone: use vmalloc only as fallback for nodesize bufer

2016-04-11 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/ioctl.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 21423dd15da4..0cb80379e6f6 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3468,13 +3468,16 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
u64 last_dest_end = destoff;
 
ret = -ENOMEM;
-   buf = vmalloc(root->nodesize);
-   if (!buf)
-   return ret;
+   buf = kmalloc(root->nodesize, GFP_KERNEL | __GFP_NOWARN);
+   if (!buf) {
+   buf = vmalloc(root->nodesize);
+   if (!buf)
+   return ret;
+   }
 
path = btrfs_alloc_path();
if (!path) {
-   vfree(buf);
+   kvfree(buf);
return ret;
}
 
@@ -3775,7 +3778,7 @@ static int btrfs_clone(struct inode *src, struct inode *inode,
 
 out:
btrfs_free_path(path);
-   vfree(buf);
+   kvfree(buf);
return ret;
 }
 
-- 
2.7.1



[PATCH 2/6] btrfs: send: use vmalloc only as fallback for read_buf

2016-04-11 Thread David Sterba
Signed-off-by: David Sterba 
---
 fs/btrfs/send.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 8f6f9d6d14df..fc9d7f6212c1 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -6031,10 +6031,13 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
}
}
 
-   sctx->read_buf = vmalloc(BTRFS_SEND_READ_SIZE);
+   sctx->read_buf = kmalloc(BTRFS_SEND_READ_SIZE, GFP_KERNEL | __GFP_NOWARN);
if (!sctx->read_buf) {
-   ret = -ENOMEM;
-   goto out;
+   sctx->read_buf = vmalloc(BTRFS_SEND_READ_SIZE);
+   if (!sctx->read_buf) {
+   ret = -ENOMEM;
+   goto out;
+   }
}
 
sctx->pending_dir_moves = RB_ROOT;
@@ -6218,7 +6221,7 @@ long btrfs_ioctl_send(struct file *mnt_file, void __user *arg_)
 
vfree(sctx->clone_roots);
kvfree(sctx->send_buf);
-   vfree(sctx->read_buf);
+   kvfree(sctx->read_buf);
 
name_cache_free(sctx);
 
-- 
2.7.1



[PATCH 0/6] Add more vmalloc fallbacks to memory allocations

2016-04-11 Thread David Sterba
Hi,

this series is inspired by a recent fix for a case where we tried to kmalloc
a 64k nodesize buffer without a vmalloc fallback, and the allocation failed.
It adds the "kmalloc-first and vmalloc-fallback" logic to more places, namely
to the buffers used during send and to the nodesize buffer used by clone. If
the memory is not fragmented, kmalloc succeeds and does not take the
resources required for the mappings.
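
The shape of the change at each call site can be illustrated in userspace. This is only a sketch, not kernel code: contig_alloc() and mapped_alloc() are hypothetical stand-ins for kmalloc() and vmalloc(), the 4096-byte limit is an arbitrary way to make the first attempt fail, and buf_free() plays the role of kvfree(), which releases memory from either allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Arbitrary limit above which the stand-in "contiguous" allocator fails. */
#define CONTIG_LIMIT 4096

enum alloc_kind { ALLOC_NONE, ALLOC_CONTIG, ALLOC_MAPPED };

struct buf {
	void *ptr;
	enum alloc_kind kind;
};

/* Hypothetical stand-in for kmalloc(): cheap, but may refuse large sizes. */
static void *contig_alloc(size_t size)
{
	return size <= CONTIG_LIMIT ? malloc(size) : NULL;
}

/* Hypothetical stand-in for vmalloc(): costlier, used only as a fallback. */
static void *mapped_alloc(size_t size)
{
	return malloc(size);
}

/* Try the cheap allocator first and fall back only when it fails. */
static struct buf alloc_with_fallback(size_t size)
{
	struct buf b = { contig_alloc(size), ALLOC_CONTIG };

	if (!b.ptr) {
		b.ptr = mapped_alloc(size);
		b.kind = b.ptr ? ALLOC_MAPPED : ALLOC_NONE;
	}
	return b;
}

/* One free routine for both origins, in the spirit of kvfree(). */
static void buf_free(struct buf *b)
{
	free(b->ptr);
	b->ptr = NULL;
	b->kind = ALLOC_NONE;
}
```

With these stand-ins, a 512-byte request is served by the first allocator and a 64k request falls through to the second; the caller frees both the same way, which is why the patches switch vfree() to kvfree().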


The following changes since commit 56f23fdbb600e6087db7b009775b95ce07cc3195:

  Btrfs: fix file/data loss caused by fsync after rename and new inode (2016-04-06 17:01:44 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git dev/kvalloc

for you to fetch changes up to c3e3930516c4d14ed1d6d70964fbc4f3faa36844:

  btrfs: clone: use vmalloc only as fallback for nodesize bufer (2016-04-11 19:06:39 +0200)


David Sterba (6):
  btrfs: send: use vmalloc only as fallback for send_buf
  btrfs: send: use vmalloc only as fallback for read_buf
  btrfs: send: use temporary variable to store allocation size
  btrfs: send: use vmalloc only as fallback for clone_roots
  btrfs: send: use vmalloc only as fallback for clone_sources_tmp
  btrfs: clone: use vmalloc only as fallback for nodesize bufer

 fs/btrfs/ioctl.c | 13 -
 fs/btrfs/send.c  | 56 +++-
 2 files changed, 43 insertions(+), 26 deletions(-)


Re: [PATCH] Btrfs: do not create empty block group if we have allocated data

2016-04-11 Thread David Sterba
On Mon, Dec 14, 2015 at 06:29:32PM -0800, Liu Bo wrote:
> Currently we force the creation of an empty block group to keep the data
> profile alive. However, in the example below, we end up with an empty
> block group while we're trying to get more space for other types
> (metadata/system):
> 
> - Before,
> block group "A": size=2G, used=1.2G
> block group "B": size=2G, used=512M
> 
> - After "btrfs balance start -dusage=50 mount_point",
> block group "A": size=2G, used=(1.2+0.5)G
> block group "C": size=2G, used=0
> 
> Since there is no data in block group C, it won't be deleted
> automatically and we are stuck with the unused 2G until the next mount.
> 
> Balance itself just moves data and doesn't remove data, so it's safe
> to not create such an empty block group if we already have data
> allocated in other block groups.
> 
> Signed-off-by: Liu Bo 

I'm adding the patch to my for-next.
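
For reference, the arithmetic in the quoted example can be sketched as below. This is a simplified model, not btrfs code: the struct and function names are hypothetical, sizes are in MiB with 1.2G rounded to 1229 MiB, and it assumes a usage=N filter selects block groups whose usage is below N percent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a data block group (sizes in MiB). */
struct block_group {
	uint64_t size_mib;
	uint64_t used_mib;
};

/* Integer usage percentage of a block group, rounded down. */
static unsigned usage_percent(const struct block_group *bg)
{
	return (unsigned)(bg->used_mib * 100 / bg->size_mib);
}

/*
 * Assumption: a usage=N filter picks block groups whose usage is below
 * N percent. Whether the real comparison is strict does not matter for
 * the numbers in this example.
 */
static int selected_by_usage(const struct block_group *bg, unsigned limit)
{
	return usage_percent(bg) < limit;
}

/* Move all data from src into dst if it fits; returns 0 on success. */
static int relocate(struct block_group *src, struct block_group *dst)
{
	if (dst->used_mib + src->used_mib > dst->size_mib)
		return -1;
	dst->used_mib += src->used_mib;
	src->used_mib = 0;
	return 0;
}
```

With A at 1229 of 2048 MiB (60%) and B at 512 of 2048 MiB (25%), only B passes a usage=50 filter, and its data fits into A, which ends up at about 1.7G. Data already lives in A at that point, which is the case where the patch skips creating the empty block group "C".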


Re: csum failed on innexistent inode

2016-04-11 Thread Henk Slager
On Mon, Apr 11, 2016 at 3:48 AM, Jérôme Poulin  wrote:
> Sorry for the confusion, allow me to clarify and I will summarize with
> what I learned since I now understand that corruption was present
> before disk went bad.
>
> Note that this BTRFS was once on a MD RAID5 on LVM on LUKS before
> being moved in-place to LVM on LUKS on BTRFS RAID10. But since balance
> worked at the time.

I haven't used LVM for years, but those in-place actions normally work
if size calculations etc are correct. Otherwise you would know
immediately.

> Also note that this computer was booted twice for about 30 minutes
> period with bad ram before it was replaced.

This is very important info. It is clear now that there was bad memory
and that the machine ran with it for only about half an hour.

> I think my checksums errors were present, but unknown to me, before
> the hardware disk failure. The bad memory might be the root cause of
> this problem but I can't be sure.

When I look at all the info now and also think of my own experience
with a bad RAM module and btrfs, I think this bad memory is the root
cause. I have seen btrfs RAID10 correct a few errors (likely coming
from earlier crashes with btrfs RAID5 on older disks). If it can't
correct them, something else is wrong and is likely affecting more
devices than the RAID profile is able to correct.

> On Sun, Apr 10, 2016 at 1:25 PM, Henk Slager  wrote:
>> It was not fully clear what the sequence of events was:
>> - HW problem
>> - btrfs SW problem
>> - 1st scrub
>> - the --repair-sector with hdparm
>> - 2nd scrub
>> - 3rd scrub?
>>
>
> 1. Errors in dmesg and confirmation from smartd that hardware problems
> were present.
> 2. Attempt to repair sector using --repair-sector which reset the
> sector to zeroes.
> 3. Scrub detected errors and fixed some but there were 18 uncorrectable.
> 4. Disk has been changed using btrfs replace. Corruption still present.
> 5. Balance attempted but aborts when encountering the first uncorrectable
> error.
> 6. Attempt to locate the bad sector/inode without success, leading to
> another scrub with the same errors.
> 7. Attempt to reset stats and scrub again. Still getting the same errors.
> 8. New disk added and data profile converted from RAID10 to RAID1;
> balance aborts on the first uncorrectable error.
>
>
>> There is also DM between the harddisk and btrfs and I am not sure
>> whether the hdparm action did repair or further corrupt things.
>>
>
> I confirmed after using --repair-sector that the sector has been reset
> to zeroes using --read-sector. I also tried read-sector first which
> failed and added an entry to the SMART log. After repair-sector,
> read-sector returned the zeroed sector.
>
>> How do you know for sure that the contents of the 'logical blocks' are
>> the same on both devices?
>>
>
> After a balance, here is what dmesg shows (complete warning output):
> BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
> 4147641019 expected csum 1755301217
> BTRFS warning (device dm-36): csum failed ino 330 off 1809195008 csum
> 1515428513 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809199104 csum
> 1927504681 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809211392 csum
> 3086571080 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809149952 csum
> 3254083717 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809162240 csum
> 3157020538 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809166336 csum
> 1092724678 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809178624 csum
> 4235459038 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809182720 csum
> 1764946502 expected csum 2566472073
> BTRFS warning (device dm-36): csum failed ino 330 off 1809084416 csum
> 4147641019 expected csum 1755301217
>
>
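
The csum warnings quoted above follow a fixed format, so they can be pulled apart mechanically, for example to group them by offset or by expected checksum (eight of the ten quoted lines expect 2566472073). A minimal sketch, assuming each warning has been joined back into a single unwrapped line; the struct and function names are hypothetical:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Fields of one "csum failed" warning; the struct name is hypothetical. */
struct csum_fail {
	uint64_t ino;      /* inode number reported in the warning */
	uint64_t off;      /* byte offset within that inode */
	uint32_t csum;     /* checksum computed from the data read */
	uint32_t expected; /* checksum stored by the filesystem */
};

/* Parse one unwrapped dmesg line; returns 0 on success, -1 otherwise. */
static int parse_csum_warning(const char *line, struct csum_fail *out)
{
	const char *p = strstr(line, "csum failed ino ");

	if (!p)
		return -1;
	if (sscanf(p, "csum failed ino %" SCNu64 " off %" SCNu64
		      " csum %" SCNu32 " expected csum %" SCNu32,
		   &out->ino, &out->off, &out->csum, &out->expected) != 4)
		return -1;
	return 0;
}
```

Lines that do not contain the "csum failed ino" marker, or that are cut short, are rejected rather than partially filled in.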
> After a scrub (complete error output):
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 2, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334876672 on dev /dev/dm-32
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334987264 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 3, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296334991360 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-32 errs: wr 0, rd 0, flush 0,
> corrupt 4, gen 0
> BTRFS error (device dm-36): unable to fixup (regular) error at logical
> 1296335003648 on dev /dev/dm-32
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, flush 0,
> corrupt 1, gen 0
> BTRFS error (device dm-36): bdev /dev/dm-36 errs: wr 0, rd 0, 

refcount overflow in 4.4.6-grsec kernel

2016-04-11 Thread Tobias Hunger
Hi,

I updated my Arch Linux system to use a grsec kernel (version 4.4.6). Now I
get lots of errors from PAX, and all backtraces mention btrfs.

Is this a known problem? Is there anything I can do to help debug this?

This is the dump from the logs:

Apr 11 07:43:36 kernel: PAX: refcount overflow detected in:
pacman:11700, uid/euid: 0/0
Apr 11 07:43:36 kernel: CPU: 1 PID: 11700 Comm: pacman Not tainted
4.4.6.201604021734-1-grsec #1
Apr 11 07:43:36 kernel: Hardware name: LENOVO, BIOS 1.08 03/09/2016
Apr 11 07:43:36 kernel: task: 880524c28a80 ti: 880524c294a8
task.ti: 880524c294a8
Apr 11 07:43:36 kernel: RIP: 0010:[]
[] btrfs_qgroup_reserve_meta+0x73/0x90 [btrfs]
Apr 11 07:43:36 kernel: RSP: 0018:c9000d6e3a90  EFLAGS: 0a06
Apr 11 07:43:36 kernel: RAX:  RBX: 8804fecc5050
RCX: 
Apr 11 07:43:36 kernel: RDX: 880524e708c8 RSI: c9000d6e3a48
RDI: 880524c12d70
Apr 11 07:43:36 kernel: RBP: c9000d6e3aa0 R08: 
R09: 88036cf5d048
Apr 11 07:43:36 kernel: R10: 8803739a0410 R11: 
R12: 00014000
Apr 11 07:43:36 kernel: R13: 8804fecc5050 R14: 0005
R15: 00014000
Apr 11 07:43:36 kernel: FS:  03f0e634c740()
GS:88054144() knlGS:
Apr 11 07:43:36 kernel: CS:  0010 DS:  ES:  CR0: 80050033
Apr 11 07:43:36 kernel: CR2: 0074a6a8 CR3: 0660c000
CR4: 003606f0
Apr 11 07:43:36 kernel: DR0:  DR1: 
DR2: 
Apr 11 07:43:36 kernel: DR3:  DR6: fffe0ff0
DR7: 0400
Apr 11 07:43:36 kernel: Stack:
Apr 11 07:43:36 kernel:  0002 0201
c9000d6e3af8 c025ab06
Apr 11 07:43:36 kernel:  861be853 4111
880373ccfa00 8805250d0620
Apr 11 07:43:36 kernel:  8804fecc5050 0005
880448a4db88 0001
Apr 11 07:43:36 kernel: Call Trace:
Apr 11 07:43:36 kernel:  []
start_transaction+0x346/0x430 [btrfs]
Apr 11 07:43:36 kernel:  [] ? lookup_fast+0x53/0x350
Apr 11 07:43:36 kernel:  []
btrfs_start_transaction+0x22/0x30 [btrfs]
Apr 11 07:43:36 kernel:  [] btrfs_create+0x46/0x250 [btrfs]
Apr 11 07:43:36 kernel:  [] ? __inode_permission+0x3c/0xc0
Apr 11 07:43:36 kernel:  [] vfs_create+0xa5/0xe0
Apr 11 07:43:36 kernel:  [] path_openat+0x13c3/0x1400
Apr 11 07:43:36 kernel:  [] do_filp_open+0xb6/0x130
Apr 11 07:43:36 kernel:  [] do_sys_open+0x151/0x230
Apr 11 07:43:36 kernel:  [] SyS_open+0x28/0x40
Apr 11 07:43:36 kernel:  []
entry_SYSCALL_64_fastpath+0x12/0x86
Apr 11 07:43:36 kernel:  [] ?
entry_SYSCALL_64_fastpath+0x45/0x86
Apr 11 07:43:36 kernel:  [] ?
entry_SYSCALL_64_fastpath+0x45/0x86
Apr 11 07:43:36 kernel: Code: 44 21 e0 41 39 c4 75 32 49 63 f4 48 89
df e8 b5 cb ff ff 85 c0 78 18 f0 44 01 a3 fc 04 00 00 71 0a f0 44 29
a3 fc 04 00 00 cd 04  02 31 c0 5b 41 5c 5d 48 0f ba 2c 24 3f c3
A

Best Regards,
Tobias