Re: [PATCH] fstests: common: Enhance _exclude_scratch_mount_option to handle multiply options and generic fs type

2016-09-06 Thread Dave Chinner
On Tue, Sep 06, 2016 at 01:06:39PM +0800, Qu Wenruo wrote:
> Considering not every contributor will add comment about excluded
> mount options, and in case generic test cases needs to exclude one
> mount option for given fstype, it will be quite hard to find the
> reason.

That is why we review changes. If it's not obvious to the reviewer
why the mount option is excluded, or it's not documented in the
commit message, then the reviewer should be asking for it to be
added.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fstests: common: Enhance _exclude_scratch_mount_option to handle multiply options and generic fs type

2016-09-06 Thread Dave Chinner
On Tue, Sep 06, 2016 at 12:20:39PM +0800, Eryu Guan wrote:
> On Mon, Sep 05, 2016 at 03:13:33PM +0800, Qu Wenruo wrote:
> > Enhance _exclude_scratch_mount_option() function to get real mount
> > options from $MOUNT_OPTIONS.
> 
> This seems unnecessarily complex to me.
> 
> > 
> > Now it can understand and extract real mount option from string like
> > "-o opt1,opt2 -oopt3".
> > Furthermore, it doesn't use grep method, which can lead to false alert
> > for options like inode_cache and noinode_cache.
> > It now do compare with the first n characters of the prohibited list,
> > so it can handle "data=" and above "no" prefix well.
> 
> I think we can fix it by adding "-w" option to grep, and replacing
> "data=" with "data", "=" seems not necessary.
> 
> > 
> > And add a new parameter, 'fstype' for _exclude_scratch_mount_option().
> > So for generic test cases, it can still prohibit mount options for given
> > fs(mainly for btrfs though)
> 
> This requires every caller of this helper provides an additional fstype
> argument, and in most cases this argument is not useful (generic or
> current FSTYP). If btrfs needs to be handled differently, how about
> checking the fstype in the test and adding additional mount option rules
> if fstype is btrfs?
> 
> > 
> > Finally, allow it to accept multiple options at the same time.
> > No need for multiple _exclude_scratch_mount_option lines now
> 
> So _exclude_scratch_mount_option is simply:
> 
> # skip test if MOUNT_OPTIONS contains the given mount options
> _exclude_scratch_mount_option()
> {
> for opt in $*; do
> if echo $MOUNT_OPTIONS | grep -qw "$opt"; then
> _notrun "mount option \"$opt\" not allowed in this 
> test"
> fi
> done
> }
> 
> (Note that the comment in current code is wrong, MKFS_OPTIONS should be
> MOUNT_OPTIONS)
> 
> What do you and/or other people think?

Much simpler, easier to understand and tell why the test did not
run.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] fstests: common: Enhance _exclude_scratch_mount_option to handle multiply options and generic fs type

2016-09-06 Thread Eryu Guan
On Tue, Sep 06, 2016 at 01:06:39PM +0800, Qu Wenruo wrote:
> 
> 
> At 09/06/2016 12:20 PM, Eryu Guan wrote:
> > On Mon, Sep 05, 2016 at 03:13:33PM +0800, Qu Wenruo wrote:
> > > Enhance _exclude_scratch_mount_option() function to get real mount
> > > options from $MOUNT_OPTIONS.
> > 
> > This seems unnecessarily complex to me.
> 
> Considering the fact that we didn't specify any other options except -o,
> yes, it's a quite complex than original one.
> 
> But, it's still needed for case like "-oopt3" can't be handled by "grep -w",
> as there is no space between "-o" and "opt3".
> 
> > 
> > > 
> > > Now it can understand and extract real mount option from string like
> > > "-o opt1,opt2 -oopt3".
> > > Furthermore, it doesn't use grep method, which can lead to false alert
> > > for options like inode_cache and noinode_cache.
> > > It now do compare with the first n characters of the prohibited list,
> > > so it can handle "data=" and above "no" prefix well.
> > 
> > I think we can fix it by adding "-w" option to grep, and replacing
> > "data=" with "data", "=" seems not necessary.
> 
> In that case, "-oopt3" can't be handled by "-w" option.

Ah, I missed that. Then "normalize" MOUNT_OPTIONS first should be OK,
like what you do in this patch.

> 
> > 
> > > 
> > > And add a new parameter, 'fstype' for _exclude_scratch_mount_option().
> > > So for generic test cases, it can still prohibit mount options for given
> > > fs(mainly for btrfs though)
> > 
> > This requires every caller of this helper provides an additional fstype
> > argument, and in most cases this argument is not useful (generic or
> > current FSTYP). If btrfs needs to be handled differently, how about
> > checking the fstype in the test and adding additional mount option rules
> > if fstype is btrfs?
> 
> Currently, only 2 callers in fact. ext4/271 and xfs/134.
> So the cost is still quite low to add a 'fstype'.

I mean in most cases the fstype is not useful, not only to existing
callers but also to future callers. We can always add special-case when
needed for a certain fstype.

And there're other callers outside of test case, e.g.
_require_metadata_journaling()
_require_atime()
where you have no idea which filesystem is under testing.

> 
> The main object is, to info reviewer or testcase writer which fs type needs
> special handling.
> 
> Considering not every contributor will add comment about excluded mount
> options, and in case generic test cases needs to exclude one mount option
> for given fstype, it will be quite hard to find the reason.
> 
> So with new fstype parameter, at least we known which fstype is to be blame.
> (Although, it will be btrfs for most of time)

Reviewers and maintainers should ask for comments to explain the reason
why a certain mount option is excluded. So I'd suggest something
like(not tested)

# skip test if MOUNT_OPTIONS contains the given mount options
_exclude_scratch_mount_option()
{
local opts=`echo $MOUNT_OPTIONS | sed 's/-o/ /g'`
for opt in $*; do
if echo $opts | grep -qw "$opt"; then
_notrun "mount option \"$opt\" not allowed in this test"
fi
done
}

And in normal case, we do

# exclude mount option xxx because ...
_exclude_scratch_mount_option "opt1" "opt2"

And when btrfs needs special handling

# comments to explain why btrfs needs special handling
if [ "$FSTYP" == "btrfs" ]; then
_exclude_scratch_mount_option "opt1" "opt2"
fi

Thanks,
Eryu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] btrfs-progs: qgroup: Fix regression leads to corrupted qgroup status

2016-09-06 Thread Qu Wenruo
Commit 93dabf211d74daf6e3de642bdd887a90a00f7b49
Author: Mark Fasheh 
Date:   Fri Jun 17 13:37:48 2016 -0700

btrfs-progs: check: verify qgroups above level 0

This commit introduced a new regression which corrupts
read_qgroup_status, since it iterate leaf with manually specified slot,
not correct path->slot[0].

This leads to wrong slot[0] and read_qgroup_status() will read out wrong
flags, leading to regression.

Fix read_qgroup_status() by using eb and slot instread of wrong path
strucutre.

Reported-by: Tsutomu Itoh 
Cc: Mark Fasheh 
Signed-off-by: Qu Wenruo 
---
 qgroup-verify.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/qgroup-verify.c b/qgroup-verify.c
index 66eb870..f6df12d 100644
--- a/qgroup-verify.c
+++ b/qgroup-verify.c
@@ -874,15 +874,14 @@ static int add_qgroup_relation(u64 memberid, u64 parentid)
return 0;
 }
 
-static void read_qgroup_status(struct btrfs_path *path,
+static void read_qgroup_status(struct extent_buffer *eb, int slot,
  struct counts_tree *counts)
 {
struct btrfs_qgroup_status_item *status_item;
u64 flags;
 
-   status_item = btrfs_item_ptr(path->nodes[0], path->slots[0],
-struct btrfs_qgroup_status_item);
-   flags = btrfs_qgroup_status_flags(path->nodes[0], status_item);
+   status_item = btrfs_item_ptr(eb, slot, struct btrfs_qgroup_status_item);
+   flags = btrfs_qgroup_status_flags(eb, status_item);
/*
 * Since qgroup_inconsist/rescan_running is just one bit,
 * assign value directly won't work.
@@ -946,7 +945,7 @@ loop:
}
 
if (key.type == BTRFS_QGROUP_STATUS_KEY) {
-   read_qgroup_status(, );
+   read_qgroup_status(leaf, i, );
continue;
}
 
-- 
2.9.3



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Qu Wenruo



At 09/07/2016 09:38 AM, Sean Fu wrote:

On Mon, Sep 05, 2016 at 03:56:41PM +0800, Qu Wenruo wrote:



At 09/05/2016 09:19 AM, Zhao Lei wrote:

Hi, Sean Fu


From: Sean Fu [mailto:fxinr...@gmail.com]
Sent: Sunday, September 04, 2016 7:54 PM
To: dste...@suse.com
Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
zhao...@cn.fujitsu.com; linux-btrfs@vger.kernel.org;
linux-ker...@vger.kernel.org; Sean Fu 
Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in
btrfs_read_chunk_tree.

The input argument root is already set with "fs_info->chunk_root".
"chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
"open_ctree".
“root->fs_info = fs_info” in "btrfs_alloc_root".


The root argument of this function means "any root".
And the function is designed getting chunk root from
"any root" in head.

Since there is only one caller of this function,
and the caller always send chunk_root as root argument in
current code, we can remove above conversion,
and I suggest renaming root to chunk_root to make it clear,
something like:

- btrfs_read_chunk_tree(struct btrfs_root *root)
+ btrfs_read_chunk_tree(struct btrfs_root *chunk_root)


Since root is only used to get fs_info->chunk_root, why not use fs_info
directly?

Sorry for late reply.
chunk_root is processed in btrfs_read_chunk_tree.
Why should we pass fs_info directly to btrfs_read_chunk_tree?
Could you give me more detail?

Many thanks


Normally we should only pass btrfs_root as parameter if it's a 
file/log/relocation tree which can't be grabbed directly from fs_info.


For system wide trees, which are already in fs_info, like 
fs_info->extent_root/chunk_root/..., we should pass fs_info.


Which is much much safer than passing a btrfs_root.
Careless caller can pass wrong tree and cause undefined behavior.

And such behavior makes caller more aware of what they really want to do.
Cases like just to grab sectorsize/nodesize shouldn't need a full 
btrfs_root.

(Jeff's patchset has already done such things quite well)

Thanks,
Qu



Thanks,
Qu



Thanks
Zhaolei


Signed-off-by: Sean Fu 
---
fs/btrfs/volumes.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 366b335..384a6d2 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
int ret;
int slot;

-   root = root->fs_info->chunk_root;
-
path = btrfs_alloc_path();
if (!path)
return -ENOMEM;
--
2.6.2






--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Sean Fu
On Tue, Sep 06, 2016 at 11:12:20AM -0400, Jeff Mahoney wrote:
> On 9/6/16 5:58 AM, David Sterba wrote:
> > On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
>  Since root is only used to get fs_info->chunk_root, why not use fs_info
>  directly?
> >>>
> >>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
> >>> to go back and check what else is missing.
> >>
> >> Actually, most of this didn't land.  Pretty much anything that's a root
> >> ->fs_info conversion is in there.
> > 
> > Only half of the patchset has been merged so far because it did not pass
> > testing, so I bisected to some point. I was about to let you know once
> > most of 4.9 patches are prepared so there are less merge conflicts.
> 
> Ok, thanks.  I was going to start the rebase today but I'll hold off
> until you're set for 4.9.
Sorry for late reply.
Many thanks to all of you.
> 
> -Jeff
> 
> -- 
> Jeff Mahoney
> SUSE Labs
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Sean Fu
On Tue, Sep 06, 2016 at 01:19:39PM +0800, Zhao Lei wrote:
> Hi, Sean Fu
> 
> > -Original Message-
> > From: Sean Fu [mailto:fxinr...@gmail.com]
> > Sent: Tuesday, September 06, 2016 11:51 AM
> > To: dste...@suse.com
> > Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
> > zhao...@cn.fujitsu.com; linux-ker...@vger.kernel.org;
> > linux-btrfs@vger.kernel.org; Sean Fu 
> > Subject: [PATCH v2] Btrfs: remove unnecessary code of chunk_root assignment
> > in btrfs_read_chunk_tree.
> > 
> > The input argument root is already set with "fs_info->chunk_root".
> > "chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
> > "open_ctree".
> > "root->fs_info = fs_info” in "btrfs_alloc_root".
> > 
> > Signed-off-by: Sean Fu 
> > ---
> > Changes in v2:
> >   - Renaming root to chunk_root to make it clear.
> > 
> >  fs/btrfs/volumes.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 366b335..eb3d04a 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -6591,7 +6591,7 @@ out_short_read:
> > return -EIO;
> >  }
> > 
> > -int btrfs_read_chunk_tree(struct btrfs_root *root)
> > +int btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
> 
> Maybe you also need to modify function body to make it
> pass compile, and the header file also need to be modified.
Sorry for my mistake. I have released v3 for it.
> 
> BTW, Qu Wenruo give us a better way in reply, we
> can use fs_info directly.
Sorry, I don't know why we should use fs_info directly.
I have asked Qu for more detail.
Thanks for your review.
> 
> Thanks
> Zhaolei
> 
> >  {
> > struct btrfs_path *path;
> > struct extent_buffer *leaf;
> > @@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
> > int ret;
> > int slot;
> > 
> > -   root = root->fs_info->chunk_root;
> > -
> > path = btrfs_alloc_path();
> > if (!path)
> > return -ENOMEM;
> > --
> > 2.6.2
> > 
> 
> 
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Sean Fu
On Mon, Sep 05, 2016 at 03:56:41PM +0800, Qu Wenruo wrote:
> 
> 
> At 09/05/2016 09:19 AM, Zhao Lei wrote:
> >Hi, Sean Fu
> >
> >>From: Sean Fu [mailto:fxinr...@gmail.com]
> >>Sent: Sunday, September 04, 2016 7:54 PM
> >>To: dste...@suse.com
> >>Cc: c...@fb.com; anand.j...@oracle.com; fdman...@suse.com;
> >>zhao...@cn.fujitsu.com; linux-btrfs@vger.kernel.org;
> >>linux-ker...@vger.kernel.org; Sean Fu 
> >>Subject: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in
> >>btrfs_read_chunk_tree.
> >>
> >>The input argument root is already set with "fs_info->chunk_root".
> >>"chunk_root = fs_info->chunk_root = btrfs_alloc_root(fs_info)" in caller
> >>"open_ctree".
> >>“root->fs_info = fs_info” in "btrfs_alloc_root".
> >>
> >The root argument of this function means "any root".
> >And the function is designed getting chunk root from
> >"any root" in head.
> >
> >Since there is only one caller of this function,
> >and the caller always send chunk_root as root argument in
> >current code, we can remove above conversion,
> >and I suggest renaming root to chunk_root to make it clear,
> >something like:
> >
> >- btrfs_read_chunk_tree(struct btrfs_root *root)
> >+ btrfs_read_chunk_tree(struct btrfs_root *chunk_root)
> 
> Since root is only used to get fs_info->chunk_root, why not use fs_info
> directly?
Sorry for late reply.
chunk_root is processed in btrfs_read_chunk_tree.
Why should we pass fs_info directly to btrfs_read_chunk_tree?
Could you give me more detail?

Many thanks
> 
> Thanks,
> Qu
> 
> >
> >Thanks
> >Zhaolei
> >
> >>Signed-off-by: Sean Fu 
> >>---
> >> fs/btrfs/volumes.c | 2 --
> >> 1 file changed, 2 deletions(-)
> >>
> >>diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >>index 366b335..384a6d2 100644
> >>--- a/fs/btrfs/volumes.c
> >>+++ b/fs/btrfs/volumes.c
> >>@@ -6600,8 +6600,6 @@ int btrfs_read_chunk_tree(struct btrfs_root *root)
> >>int ret;
> >>int slot;
> >>
> >>-   root = root->fs_info->chunk_root;
> >>-
> >>path = btrfs_alloc_path();
> >>if (!path)
> >>return -ENOMEM;
> >>--
> >>2.6.2
> >>
> >
> >
> >
> >
> >--
> >To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >the body of a message to majord...@vger.kernel.org
> >More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Recommendation on raid5 drive error resolution

2016-09-06 Thread Gareth Pye
Things have been copying off really well.

I'm starting to suspect the issue was the PSU which I've swapped out.
What is the line I should see in dmesg if the degraded option was
actually used when mounting the file system?

On Thu, Sep 1, 2016 at 9:25 PM, Austin S. Hemmelgarn
 wrote:
> On 2016-08-31 19:04, Gareth Pye wrote:
>>
>> ro,degraded has mounted it nicely and my rsync of the more useful data
>> is progressing at the speed of WiFi.
>>
>> There are repeated read errors from one drive still but the rsync
>> hasn't bailed yet, which I think means there isn't any overlapping
>> errors in any of the files it has touched thus far. Am I right or is
>> their likely to be corrupt data in the files I've synced off?
>
> Unless you've been running with nocow or nodatasum in your mount options,
> then what you've concluded should be correct.  I would still suggest
> verifying the data by some external means if possible, this type of
> situation is not something that's well tested, and TBH I'm amazed that
> things are working to the degree that they are.
>
>>
>> On Wed, Aug 31, 2016 at 7:46 AM, Gareth Pye  wrote:
>>>
>>> Or I could just once again select the right boot device in the bios. I
>>> think I want some new hardware :)
>>>
>>> On Wed, Aug 31, 2016 at 7:23 AM, Gareth Pye 
>>> wrote:

 On Wed, Aug 31, 2016 at 4:28 AM, Chris Murphy 
 wrote:
>
> But I'd try a newer kernel before you
> give up on it.



 Any recommendations on liveCDs that have recent kernels & btrfs tools?
 For no apparent reason system isn't booting normally either, and I'm
 reluctant to fix that before at least confirming the things I at least
 partially care about have a recent backup.

 --
 Gareth Pye - blog.cerberos.id.au
 Level 2 MTG Judge, Melbourne, Australia
>>>
>>>
>>>
>>>
>>> --
>>> Gareth Pye - blog.cerberos.id.au
>>> Level 2 MTG Judge, Melbourne, Australia
>>
>>
>>
>>
>



-- 
Gareth Pye - blog.cerberos.id.au
Level 2 MTG Judge, Melbourne, Australia
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] btrfs-progs: Introduce new send-dump object

2016-09-06 Thread Qu Wenruo
Introduce send-dump.[ch] which implements a new btrfs_send_ops to
exam and output all operations inside a send stream.

It has a better output format than the old and no longer compilable
send-test tool, but still tries to be script friendly.

Provides the basis for later "inspect-internal dump-send" command.

Signed-off-by: Qu Wenruo 
---
 Makefile.in |   2 +-
 send-dump.c | 367 
 send-dump.h |  24 
 3 files changed, 392 insertions(+), 1 deletion(-)
 create mode 100644 send-dump.c
 create mode 100644 send-dump.h

diff --git a/Makefile.in b/Makefile.in
index ac6b353..97caf95 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -80,7 +80,7 @@ objects = ctree.o disk-io.o radix-tree.o extent-tree.o 
print-tree.o \
  extent-cache.o extent_io.o volumes.o utils.o repair.o \
  qgroup.o raid6.o free-space-cache.o list_sort.o props.o \
  ulist.o qgroup-verify.o backref.o string-table.o task-utils.o \
- inode.o file.o find-root.o free-space-tree.o help.o
+ inode.o file.o find-root.o free-space-tree.o help.o send-dump.o
 cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
   cmds-inspect.o cmds-balance.o cmds-send.o cmds-receive.o \
   cmds-quota.o cmds-qgroup.o cmds-replace.o cmds-check.o \
diff --git a/send-dump.c b/send-dump.c
new file mode 100644
index 000..bf451c7
--- /dev/null
+++ b/send-dump.c
@@ -0,0 +1,367 @@
+/*
+ * Copyright (C) 2016 Fujitsu.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "utils.h"
+#include "commands.h"
+#include "send-utils.h"
+#include "send-stream.h"
+#include "send-dump.h"
+
+#define path_cat_out_with_error(function_name, out_path, path1, path2, ret) \
+ret = path_cat_out(out_path, path1, path2);\
+if (ret < 0) { \
+   error("%s: path invalid: %s\n", function_name, path2);  \
+   return ret; \
+}
+
+#define TITLE_WIDTH16
+#define PATH_WIDTH 32
+
+static void print_dump(const char *title, const char *path,
+  const char *fmt, ...)
+{
+   va_list args;
+   char real_title[TITLE_WIDTH + 1];
+
+   real_title[0]='\0';
+   /* Append ':' to title*/
+   strncpy(real_title, title, TITLE_WIDTH - 1);
+   strncat(real_title, ":", TITLE_WIDTH);
+
+   /* Unified output */
+   printf("%-*s%-*s", TITLE_WIDTH, real_title, PATH_WIDTH, path);
+   va_start(args, fmt);
+   /* Command specified output */
+   vprintf(fmt, args);
+   va_end(args);
+   printf("\n");
+}
+
+static int print_subvol(const char *path, const u8 *uuid, u64 ctransid,
+   void *user)
+{
+   struct btrfs_dump_send_args *r = user;
+   char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
+   int ret;
+
+   path_cat_out_with_error("subvol", r->full_subvol_path, r->root_path,
+   path, ret);
+   uuid_unparse(uuid, uuid_str);
+
+   print_dump("subvol", r->full_subvol_path, "uuid: %s, transid: %llu",
+  uuid_str, ctransid);
+   return 0;
+}
+
+static int print_snapshot(const char *path, const u8 *uuid, u64 ctransid,
+ const u8 *parent_uuid, u64 parent_ctransid,
+ void *user)
+{
+   struct btrfs_dump_send_args *r = user;
+   char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
+   char parent_uuid_str[BTRFS_UUID_UNPARSED_SIZE];
+   int ret;
+
+   path_cat_out_with_error("snapshot", r->full_subvol_path, r->root_path,
+   path, ret);
+   uuid_unparse(uuid, uuid_str);
+   uuid_unparse(parent_uuid, parent_uuid_str);
+
+   print_dump("snapshot", r->full_subvol_path,
+  "uuid: %s, transid: %llu, parent_uuid: %s, parent_transid: 
%llu",
+  uuid_str, ctransid, parent_uuid_str, parent_ctransid);
+   return 0;
+}
+
+static int print_mkfile(const char *path, void *user)
+{
+   struct btrfs_dump_send_args *r = user;
+   char full_path[PATH_MAX];
+   int ret;
+
+   path_cat_out_with_error("mkfile", full_path, r->full_subvol_path, path,
+   ret);

[PATCH 3/3] btrfs-progs: Remove send-test tool

2016-09-06 Thread Qu Wenruo
Since new "receive --dump" has better output and structure, it's time
to remove old and function-weak send-test tool.

Signed-off-by: Qu Wenruo 
---
 Makefile.in |   6 +-
 send-test.c | 447 
 2 files changed, 1 insertion(+), 452 deletions(-)
 delete mode 100644 send-test.c

diff --git a/Makefile.in b/Makefile.in
index 97caf95..5ae860d 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -346,10 +346,6 @@ ioctl-test: $(objects) $(libs) ioctl-test.o
@echo "[LD] $@"
$(Q)$(CC) $(CFLAGS) -o ioctl-test $(objects) $(libs) ioctl-test.o 
$(LDFLAGS) $(LIBS)
 
-send-test: $(objects) $(libs) send-test.o
-   @echo "[LD] $@"
-   $(Q)$(CC) $(CFLAGS) -o send-test $(objects) $(libs) send-test.o 
$(LDFLAGS) $(LIBS)
-
 library-test: $(libs_shared) library-test.o
@echo "[LD] $@"
$(Q)$(CC) $(CFLAGS) -o library-test library-test.o $(LDFLAGS) -lbtrfs
@@ -381,7 +377,7 @@ clean-all: clean clean-doc clean-gen
 clean: $(CLEANDIRS)
@echo "Cleaning"
$(Q)$(RM) -f $(progs) cscope.out *.o *.o.d \
- dir-test ioctl-test quick-test send-test library-test 
library-test-static \
+ dir-test ioctl-test quick-test library-test library-test-static \
  btrfs.static mkfs.btrfs.static \
  $(check_defs) \
  $(libs) $(lib_links) \
diff --git a/send-test.c b/send-test.c
deleted file mode 100644
index 4645b89..000
--- a/send-test.c
+++ /dev/null
@@ -1,447 +0,0 @@
-/*
- * Copyright (C) 2013 SUSE.  All rights reserved.
- *
- * This code is adapted from cmds-send.c and cmds-receive.c,
- * Both of which are:
- *
- * Copyright (C) 2012 Alexander Block.  All rights reserved.
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public
- * License v2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * You should have received a copy of the GNU General Public
- * License along with this program; if not, write to the
- * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
- * Boston, MA 021110-1307, USA.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-/*
- * This should be compilable without the rest of the btrfs-progs
- * source distribution.
- */
-#if BTRFS_FLAT_INCLUDES
-#include "send-utils.h"
-#include "send-stream.h"
-#else
-#include 
-#include 
-#endif /* BTRFS_FLAT_INCLUDES */
-
-static int pipefd[2];
-struct btrfs_ioctl_send_args io_send = {0, };
-static char *subvol_path;
-static char *root_path;
-
-struct recv_args {
-   char *full_subvol_path;
-   char *root_path;
-};
-
-void usage(int error)
-{
-   printf("send-test  \n");
-   if (error)
-   exit(error);
-}
-
-static int print_subvol(const char *path, const u8 *uuid, u64 ctransid,
-   void *user)
-{
-   struct recv_args *r = user;
-   char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
-
-   r->full_subvol_path = path_cat(r->root_path, path);
-   uuid_unparse(uuid, uuid_str);
-
-   printf("subvol\t%s\t%llu\t%s\n", uuid_str,
-  (unsigned long long)ctransid, r->full_subvol_path);
-
-   return 0;
-}
-
-static int print_snapshot(const char *path, const u8 *uuid, u64 ctransid,
- const u8 *parent_uuid, u64 parent_ctransid,
- void *user)
-{
-   struct recv_args *r = user;
-   char uuid_str[BTRFS_UUID_UNPARSED_SIZE];
-   char parent_uuid_str[BTRFS_UUID_UNPARSED_SIZE];
-
-   r->full_subvol_path = path_cat(r->root_path, path);
-   uuid_unparse(uuid, uuid_str);
-   uuid_unparse(parent_uuid, parent_uuid_str);
-
-   printf("snapshot\t%s\t%llu\t%s\t%llu\t%s\n", uuid_str,
-  (unsigned long long)ctransid, parent_uuid_str,
-  (unsigned long long)parent_ctransid, r->full_subvol_path);
-
-   return 0;
-}
-
-static int print_mkfile(const char *path, void *user)
-{
-   struct recv_args *r = user;
-   char *full_path = path_cat(r->full_subvol_path, path);
-
-   printf("mkfile\t%s\n", full_path);
-
-   free(full_path);
-   return 0;
-}
-
-static int print_mkdir(const char *path, void *user)
-{
-   struct recv_args *r = user;
-   char *full_path = path_cat(r->full_subvol_path, path);
-
-   printf("mkdir\t%s\n", full_path);
-
-   free(full_path);
-   return 0;
-}
-
-static int print_mknod(const char *path, u64 mode, u64 dev, void *user)
-{
-   struct recv_args *r = user;
-   char *full_path = path_cat(r->full_subvol_path, path);
-
- 

[PATCH v2 2/3] btrfs-progs: receive: Introduce option to exam and dump send stream

2016-09-06 Thread Qu Wenruo
Introduce new option, '--dump' for receive subcommand.

With this command, user can exam and dump the metadata info of a send
stream.
Which is quite useful for education purpose or bug reporting.

Signed-off-by: Qu Wenruo 
---
 Documentation/btrfs-receive.asciidoc | 17 +--
 cmds-receive.c   | 41 ++--
 2 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/Documentation/btrfs-receive.asciidoc 
b/Documentation/btrfs-receive.asciidoc
index e246603..e4e3ba4 100644
--- a/Documentation/btrfs-receive.asciidoc
+++ b/Documentation/btrfs-receive.asciidoc
@@ -3,18 +3,25 @@ btrfs-receive(8)
 
 NAME
 
-btrfs-receive - receive subvolumes from send stream
+btrfs-receive - receive/exam subvolumes from send stream
 
 SYNOPSIS
 
 *btrfs receive* [options] 
 
+or
+
+*btrfs receive* --dump [options]
+
 DESCRIPTION
 ---
 
 Receive a stream of changes and replicate one or more subvolumes that were
 previously used with *btrfs send* The received subvolumes are stored to
-'path'.
+'path', if '--dump' option is not given.
+
+If '--dump' option is given, *btrfs receive* will only exam the validation of
+the stream, and print the metadata info for debug usage.
 
 *btrfs receive* will fail int the following cases:
 
@@ -56,6 +63,12 @@ By default the mountpoint is searched in '/proc/self/mounts'.
 If you do not have '/proc', eg. in a chroot environment, use this option to 
tell
 us where this filesystem is mounted.
 
+--dump::
+exam the stream and print metadata info for debug/education purpose.
++
+Does not accept 'path' parameter. So with this option, *btrfs receive* won't
+modify your filesystem, and can be run by non-privileged users.
+
 EXIT STATUS
 ---
 *btrfs receive* returns a zero exit status if it succeeds. Non zero is
diff --git a/cmds-receive.c b/cmds-receive.c
index f4a3a4f..32bf086 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -49,6 +49,7 @@
 #include "send.h"
 #include "send-stream.h"
 #include "send-utils.h"
+#include "send-dump.h"
 
 static int g_verbose = 0;
 
@@ -1194,6 +1195,7 @@ int cmd_receive(int argc, char **argv)
struct btrfs_receive r;
int receive_fd = fileno(stdin);
u64 max_errors = 1;
+   int dump = 0;
int ret = 0;
 
memset(, 0, sizeof(r));
@@ -1206,9 +1208,11 @@ int cmd_receive(int argc, char **argv)
 
while (1) {
int c;
+   enum { GETOPT_VAL_DUMP = 257 };
static const struct option long_opts[] = {
{ "max-errors", required_argument, NULL, 'E' },
{ "chroot", no_argument, NULL, 'C' },
+   { "dump", no_argument, NULL, GETOPT_VAL_DUMP },
{ NULL, 0, NULL, 0 }
};
 
@@ -1245,6 +1249,9 @@ int cmd_receive(int argc, char **argv)
goto out;
}
break;
+   case GETOPT_VAL_DUMP:
+   dump = 1;
+   break;
case '?':
default:
error("receive args invalid");
@@ -1252,7 +1259,9 @@ int cmd_receive(int argc, char **argv)
}
}
 
-   if (check_argc_exact(argc - optind, 1))
+   if (dump && check_argc_exact(argc - optind, 0))
+   usage(cmd_receive_usage);
+   if (!dump && check_argc_exact(argc - optind, 1))
usage(cmd_receive_usage);
 
tomnt = argv[optind];
@@ -1265,19 +1274,37 @@ int cmd_receive(int argc, char **argv)
}
}
 
-   ret = do_receive(, tomnt, realmnt, receive_fd, max_errors);
+   if (dump) {
+   struct btrfs_dump_send_args dump_args;
+
+   dump_args.root_path = malloc(PATH_MAX);
+   dump_args.root_path[0] = '.';
+   dump_args.root_path[1] = '\0';
+   dump_args.full_subvol_path = malloc(PATH_MAX);
+   dump_args.full_subvol_path[0] = '.';
+   dump_args.full_subvol_path[1] = '\0';
+   ret = btrfs_read_and_process_send_stream(receive_fd,
+   _print_send_ops, _args, 0, 0);
+   if (ret < 0)
+   error("failed to dump the send stream: %s",
+ strerror(-ret));
+   } else {
+   ret = do_receive(, tomnt, realmnt, receive_fd, max_errors);
+   }
+
if (receive_fd != fileno(stdin))
close(receive_fd);
-
 out:
 
return !!ret;
 }
 
 const char * const cmd_receive_usage[] = {
-   "btrfs receive [-ve] [-f ] [--max-errors ] ",
-   "Receive subvolumes from stdin.",
-   "Receives one or more subvolumes that were previously",
+   "btrfs receive [options] ",
+   "or"
+   "btrfs receive --dump [options]",
+   "Receive/exam subvolumes from stdin.",
+   "Receives/exams one 

[PATCH 0/3] Introduce dump option for btrfs-receive

2016-09-06 Thread Qu Wenruo
Introduce new "--dump" option for btrfs-receive, which will exam and dump
metadata info of a send stream.
This is quite handy to debug send stream.

Since such function is provided by old send-test tool, which doesn't even
compile now, remove the old send-test tool.

changelog:
v2:
  Move from inspect subcommand to receive subcommand.

Qu Wenruo (3):
  btrfs-progs: Introduce new send-dump object
  btrfs-progs: receive: Introduce option to exam and dump send stream
  btrfs-progs: Remove send-test tool

 Documentation/btrfs-receive.asciidoc |  17 +-
 Makefile.in  |   8 +-
 cmds-receive.c   |  41 +++-
 send-dump.c  | 367 
 send-dump.h  |  24 ++
 send-test.c  | 447 ---
 6 files changed, 443 insertions(+), 461 deletions(-)
 create mode 100644 send-dump.c
 create mode 100644 send-dump.h
 delete mode 100644 send-test.c

-- 
2.9.3



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs send extremely slow (almost stuck)

2016-09-06 Thread Oliver Freyermuth
Duncan wrote on Mon, 05 Sep 2016 19:14:30 -0700: 
> I had something very similar happen here a few weeks ago, except with my 
> firefox profile dir (I don't run thunderbird, preferring claws-mail, but 
> I do run firefox as my browser).

Indeed, I also note Firefox doing a lot of IO especially if session recovery is 
enabled,
so I can totally imagine this causing similar issues... 

> My use-case does neither snapshots nor send/receive, however, so it was 
> just the single root subvolume (5).  But there was supposedly a file in 
> that dir according to bash's tab-completion, that would neither list, nor 
> rm, which meant the dir couldn't rm -r either.  (Interestingly enough, rm 
> -i asked if I wanted to rm "weird file" whatever, and weird it indeed was!
> )

Sadly, for me there is / was no file at all "visible", neither via tab nor via 
'rm -i'. 

> So I immediately copied all the normal files to a new dir, and deleted 
> the normal files from the problem dir, leaving only the weird one.
> Then I renamed the problem dir in ordered to be able to rename the new 
> dir (with the good files) back to the name firefox expected.

That was exactly my "backup plan" I applied yesterday. In my case, luckily,
I even had a full backup of the profile just a few hours old, so I just took 
that
to replace the folder with a fresh one after renaming it. 

> Then I decided to see what I could do with the renamed dir.  I believe I 
> rebooted (or umount/mount cycled the filesystem) as well.  I think I had 
> to use the magic-sysrq m/remount-ro key as it refused to umount even from 
> systemd emergency mode.  But here's the interesting part.  At least after 
> the rename and a reboot, it *DID* let me delete (using mc) the dir!  I 
> honestly didn't expect it'd let me, but it did.

For me all the shutdowns went fine (the problem must / may have been present 
for weeks, 
I only noticed now that btrfs send finally did something - and errored out) 
- and the problem, sadly, was not fixed after any reboot. 
I guess after all for me it was corruption on the directory itself
(or rather its isize), 
while for you it was some other sort of metadata corruption causing 
a "weirdly behaving" file. 

> The difference, however, is that I didn't have any snapshots/subvolumes 
> or other reflinks to the "weird" file, only the one normal hardlink.  So 
> even if it's the same thing, I'm not sure if it'll work for you given the 
> multiple snapshot reflinks to the file, as it did for me with just the 
> one.

I did at least try to delete all snapshots which could reference that file - 
did not help. 
I also tried running 'btrfs defrag' on that folder, which should have broken up 
any reflinks, 
this also did not help. 

But luckily (as you can see from my other mail) two "btrfs check --repair" 
iterations finally
fixed my issue. I hope the experts can figure out something from my uploaded 
debug info
to prevent such things in the future. 

Thanks a lot in any case for your experience report! 

I hope my "repair experience" from my other mail made from my user's 
perspective may at some point
of time also be of help to you (even though, I hope, you'll never need it). 

Cheers and thanks again, 
Oliver
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] Btrfs: fix memory leak due to invalid btree height

2016-09-06 Thread Liu Bo
On Tue, Sep 06, 2016 at 06:50:19PM +0200, David Sterba wrote:
> On Fri, May 13, 2016 at 05:07:02PM -0700, Liu Bo wrote:
> > Thanks to fuzz testing, we can have invalid btree root node height.
> 
> Shouldn't we do this kind of sanity checks earlier? Not at the search
> slot time but when it's read from disk. The check that you're adding can
> stay, but without the early check we could hit it very often thus making
> it very noisy.

We do have such an early check when it's read from disk
(btree_readpage_end_io_hook) and this can protect us from 99.9% cases,
the only corner case is that the fuzz image changes our chunk root node
to superblock bytenr, so we firstly reads superblock into a dummy eb, and when
we get to read chunk root, we firstly search eb tree and find one eb
matching the bytenr, then we take this invalid eb to do
btrfs_search_slot() and we come cross this surprise.

Anyway, this patch was made before I found we could actually free
superblock's eb immediately after use.  Now with freeing that eb I don't
think we can have the above problem.

Thanks,

-liubo

> 
> > Btrfs limits btree height to 7 and if the given height is 9, then btrfs
> > will have problems in both releasing root node's lock and freeing the node.
> 
> 
> > 
> > Signed-off-by: Liu Bo 
> > ---
> >  fs/btrfs/ctree.c | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> > index ec7928a..3fccbcc 100644
> > --- a/fs/btrfs/ctree.c
> > +++ b/fs/btrfs/ctree.c
> > @@ -2756,6 +2756,13 @@ again:
> > }
> > }
> > }
> > +   if (level > BTRFS_MAX_LEVEL - 1 || level < 0) {
> > +   WARN_ONCE(1, KERN_WARNING "Invalid btree height %d\n", level);
> > +   if (!p->skip_locking)
> > +   btrfs_tree_unlock_rw(b, root_lock);
> > +   free_extent_buffer(b);
> > +   return -EINVAL;
> > +   }
> > p->nodes[level] = b;
> > if (!p->skip_locking)
> > p->locks[level] = root_lock;
> > -- 
> > 2.5.5
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majord...@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs send extremely slow (almost stuck)

2016-09-06 Thread Oliver Freyermuth
Am 06.09.2016 um 04:46 schrieb Qu Wenruo:
> But your idea to locate the inode seems good enough for debugging though.

Based on this I even had another idea which seems to have worked well - and I 
am now also able to provide any additional debug output you may need. 


Since my procedure may be interesting / helpful for other "debugging users", 
I'll shortly outline it here. 
I had enough extra space on an external HDD. I cloned the full btrfs partition 
with 'dd' to an image on this HDD.
I loop'ed that image read-only on another machine, created an overlay-file and 
used device mapper
to get a read-write block device for any experiment (based on the read-only 
image).
Details on that e.g. at 
https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file
 . 


> In this case, before reverting to a backup, would you please run a "btrfsck 
> check" and paste the output?

Now, I ran 'btrfs check' on that device. I'm using the very fresh btrfs-progs 
4.7.2. 
The output is here:
http://pastebin.com/rMrW40RU
Notably, it claims to have found some other issues, mainly wrong link counts 
and dir isizes, but for various inodes...

Now, I could also safely run 'btrfs check --repair' on this device without any 
risks.
The output from that is here:
http://pastebin.com/XW9ChuqU

Another 'btrfs check' run afterwards now reveals different issues:
http://pastebin.com/TFKJa81e

Now, another repair:
http://pastebin.com/33iqaE9E

Now, finally, btrfs check is happy:
http://pastebin.com/izkERtKp

After mounting, finally (kernel 4.7.2) I see in kernel log:
[12108.696912] BTRFS info (device dm-0): disk space caching is enabled
[12108.713176] BTRFS info (device dm-0): checking UUID tree

I can now delete the "broken" .thunderbird folder on this "repaired" fs.
I can also mount it and write data on it.

Concluding from these results that it should be safe to do the same to my 
original block device with the same btrfs-progs version
I did just that (check, repair, check, repair, check) from a live system 
directly on the machine. 
Up until now, the FS seems to be doing well again - I took the chance to enable 
skinny extents and am now doing a full metadata balance,
saving me about 0.25 % of metadata space. 
So finally, first time in my life, 'btrfs check --repair' did not eat my data! 
:-) 


The cool thing is that now I still have the broken image (extracted with dd) 
around and can play with it to provide you with any debug-info
without having to work directly with the broken FS on the machine itself. 


Now, let's get started on that.

ls -aldi .thunderbird-broken/p6bm45oa.default/
162786 drwx-- 1 olifre olifre 2482  5. Sep 23:07 
.thunderbird-broken/p6bm45oa.default/
As you can see, I had renamed .thunderbird to .thunderbird-broken. The real 
issue is in any case the profile-subfolder within.
So the affected ino is indeed 162786 which also shows up (as one of several 
issues...) in the btrfs check (and repair) output.

> Further more, your btrfs-debug-tree dump should provide more help for this 
> case.

Just to make sure the debug-tree output matches the rest of all the information 
I'm giving you, I re-ran that on the dd'ed image from the broken FS like so:
btrfs-debug-tree -t 442 xmg13.img | sed "s/name:.*//" > debug-tree

I ran the output through xz (or rather, pixz) and here it is:
https://cernbox.cern.ch/index.php/s/imjwqsOFerUklqr/download
I'll probably not keep the file up there forever, but at least for quite some 
days.

If you can think of any other information which may be useful to diagnose the 
underlying issue which caused that corruption
just let me know. I'll keep the image of the broken FS around for a few weeks.

Cheers, 
Oliver
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: fix BUG_ON in btrfs_mark_buffer_dirty

2016-09-06 Thread Liu Bo
Hi Filipe,

On Mon, Sep 05, 2016 at 04:28:09PM +0100, Filipe Manana wrote:
> On Fri, Sep 2, 2016 at 8:35 PM, Liu Bo  wrote:
> > This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y.
> >
> > Commit 1ba98d0 ("Btrfs: detect corruption when non-root leaf has zero item")
> > assumes that a leaf is its root when leaf->bytenr == 
> > btrfs_root_bytenr(root),
> > however, we should not use btrfs_root_bytenr(root) since it's mainly got
> > updated during committing transaction.  So the check can fail when doing
> > COW on this leaf while it is a root.
> >
> > This changes to use "if (leaf == btrfs_root_node(root))" instead, just like
> > how we check whether leaf is a root in __btrfs_cow_block().
> >
> > Reported-by: Jeff Mahoney 
> > Signed-off-by: Liu Bo 
> 
> Hi Bo,
> 
> Even with this patch applied against latest branch for-linus-4.8, at
> least on a build with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y,
> the issue still happens for me when running fsstress with balance in parallel:

Thanks for the report.

This panic shows that we can have non-root btree leaf with 0 nritems during
split_leaf(), but a btrfs_search_slot which calls split_leaf() like this is
inserting an item, and while we set @right's nritems to 0, we also assign 
@disk_key
associated with @right in the parent node, so I think we're actually having
 nritem 0 temporarily and we can remove this btrfs_mark_buffer_dirty() like the
following quick patch.

Thanks,

-liubo

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index d1c56c9..5e5ceb5 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -4341,7 +4341,11 @@ again:
if (path->slots[1] == 0)
fixup_low_keys(fs_info, path, _key, 1);
}
-   btrfs_mark_buffer_dirty(right);
+   /*
+* We create a new leaf 'right' for the required ins_len and
+* we'll do btrfs_mark_buffer_dirty() on this leaf after copying
+* the content of ins_len to 'right'.
+*/
return ret;
}
 


> 
> [  366.998044] BTRFS: device fsid 69759b3a-96ae-467d-aa63-364144e73a28
> devid 1 transid 3 /dev/sdc
> [  367.023652] BTRFS info (device sdc): turning on discard
> [  367.025036] BTRFS info (device sdc): disk space caching is enabled
> [  367.026360] BTRFS info (device sdc): has skinny extents
> [  367.069415] BTRFS info (device sdc): creating UUID tree
> [  367.133704] BTRFS info (device sdc): relocating block group 29360128 flags 
> 36
> [  367.142196] BTRFS info (device sdc): found 2 extents
> [  367.147932] BTRFS info (device sdc): relocating block group 20971520 flags 
> 34
> [  367.157947] BTRFS info (device sdc): found 1 extents
> [  367.162649] BTRFS info (device sdc): relocating block group 12582912 flags 
> 1
> [  367.182872] BTRFS info (device sdc): found 1 extents
> [  367.189932] BTRFS info (device sdc): found 1 extents
> [  367.200983] BTRFS info (device sdc): relocating block group
> 1270874112 flags 1
> [  367.235862] BTRFS critical (device sdc): corrupt leaf, non-root
> leaf's nritems is 0: block=1103937536,root=5, slot=0
> [  367.238223] BTRFS info (device sdc): leaf 1103937536 total ptrs 0
> free space 3995
> [  367.239371] BTRFS: assertion failed: 0, file: fs/btrfs/disk-io.c, line: 
> 4059
> [  367.240321] [ cut here ]
> [  367.241245] kernel BUG at fs/btrfs/ctree.h:3367!
> [  367.241961] invalid opcode:  [#1] PREEMPT SMP
> [  367.242685] Modules linked in: btrfs crc32c_generic xor raid6_pq
> acpi_cpufreq tpm_tis tpm sg i2c_piix4 i2c_core psmouse ppdev processor
> evdev parport_pc parport serio_raw pcspkr button loop autofs4 ext4
> crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi
> ata_piix libata virtio_pci virtio_ring e1000 virtio scsi_mod floppy
> [last unloaded: btrfs]
> [  367.244302] CPU: 11 PID: 3451 Comm: fdm-stress Not tainted
> 4.7.0-rc6-btrfs-next-34+ #1
> [  367.244302] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
> [  367.244302] task: 880183ef8bc0 ti: 880183fe task.ti:
> 880183fe
> [  367.244302] RIP: 0010:[]  []
> assfail.constprop.42+0x1c/0x1e [btrfs]
> [  367.244302] RSP: 0018:880183fe3c78  EFLAGS: 00010296
> [  367.244302] RAX: 0040 RBX: 880222a66ab0 RCX: 
> 0001
> [  367.244302] RDX: 810a0a23 RSI: 814a82cb RDI: 
> 
> [  367.244302] RBP: 880183fe3c78 R08: 0001 R09: 
> 
> [  367.244302] R10: 880183fe3b70 R11: 82f3650d R12: 
> 880183e8e000
> [  367.244302] R13: 8800b3e7d000 R14: 0a59 R15: 
> 0017
> [  367.244302] FS:  7f0b85310700() GS:88023f4c()
> knlGS:
> [  367.244302] CS:  0010 DS:  ES:  CR0: 80050033
> [  367.244302] CR2: 

State of the fuzzer

2016-09-06 Thread Lukas Lueg
Hi,

I'm currently fuzzing rev 2076992 and things start to slowly, slowly
quiet down. We will probably run out of steam at the end of the week
when a total of (roughly) half a billion BTRFS-images have passed by.
I will switch revisions to current HEAD and restart the whole process
then. A few things:

* There are a couple of crashes (mostly segfaults) I have not reported
yet. I'll report them if they show up again with the latest revision.
* The coverage-analysis shows assertion failures which are currently
silenced. An assertion failure is technically a worse disaster
successfully prevented, it still constitutes unexpected/unusable
behaviour, though. Do you want assertions to be enabled and images
triggering those assertions reported? This is basically the same
conundrum as with BUG_ON and abort().
* A few endless loops entered into by btrfsck are currently
unmitigated (see bugs 155621, 155571, 11 and 155151). It would be
nice if those had been taken care of by next week if possible.


Best regards
Lukas
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: don't leak reloc root nodes on errorg

2016-09-06 Thread Liu Bo
On Mon, Sep 05, 2016 at 05:20:53PM +0200, David Sterba wrote:
> On Fri, Sep 02, 2016 at 06:08:35PM -0700, Liu Bo wrote:
> > On Fri, Sep 02, 2016 at 03:25:43PM -0400, Josef Bacik wrote:
> > > We don't track the reloc roots in any sort of normal way, so the only way 
> > > the
> > > root/commit_root nodes get free'd is if the relocation finishes 
> > > successfully and
> > > the reloc root is deleted.  Fix this by free'ing them in free_reloc_roots.
> > > Thanks,
> > 
> > Looks good.
> > 
> > > 
> > > Signed-off-by: Josef Bacik 
> > > ---
> > >  fs/btrfs/relocation.c | 4 
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> > > index 7fc6ea7..62dfc2c 100644
> > > --- a/fs/btrfs/relocation.c
> > > +++ b/fs/btrfs/relocation.c
> > > @@ -2349,6 +2349,10 @@ void free_reloc_roots(struct list_head *list)
> > >   while (!list_empty(list)) {
> > >   reloc_root = list_entry(list->next, struct btrfs_root,
> > >   root_list);
> > > + free_extent_buffer(reloc_root->node);
> > > + free_extent_buffer(reloc_root->commit_root);
> > > + reloc_root->node = NULL;
> > > + reloc_root->commit_root = NULL;
> > 
> > What about reloc_root itself?
> 
> > >   __del_reloc_root(reloc_root);
> 
> It's deleted at the end of __del_reloc_root

__del_reloc_root() is dealing with the associated rb_node, not
reloc_root.

I posted a patch to free reloc_root while free'ing its corresponding fs
root, but it'd be good if we can free it here.

Thanks,

-liubo
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [OT] ccache and tmpfs builds Was: Balancing subvolume on a specific device

2016-09-06 Thread Duncan
Austin S. Hemmelgarn posted on Tue, 06 Sep 2016 08:32:02 -0400 as
excerpted:

> On 2016-09-02 06:55, Duncan wrote:
>> Kai Krakow posted on Thu, 01 Sep 2016 21:45:19 +0200 as excerpted:
>>
>>> Off topic: Is ccache really that helpful? I dumped it a few years ago

>>> What would help a whole lot more would be to cache this really really
>>> inefficient configure tool of hell which often runs much longer than
>>> the build phase of the whole source itself.
>>
>> IDK if you were around back then, but some time ago there was a
>> confcache project that tried to do just that.  But unfortunately, it
>> was enough of a niche use-case (most folks just run binary distros and
>> don't care, and others have switched to cmake or the like and don't
>> care) and came with enough problem corner-cases that required upstream
>> cooperation that wasn't coming as they didn't care, to fix, that the
>> project was eventually given up.  =:^(

> Alternatively, there's a mechanism (I think it's called something like
> config.local) that lets you pre-load certain things in the configure
> script.  Some things aren't safe to load here, but many package
> invariant things (like if you have working headers, or how malloc(0)
> returns, or similar things that are never going to change unless you
> change your CHOST) can be put here.  I don't know how much it may help
> speed things up (I don't use it right now myself), but it may be worth
> trying.

Good point.  I'm not far enough to the developer side to know the 
details, but IIRC confcache actually worked by prepopulating 
config.local, thereby bypassing various tests where the result was 
already cached.  The problem was, however, that there were too many 
corner-cases that should have invalidated the cache or part of it that 
were too hard to catch reliably, making it a maintainability nightmare as 
code was added to take care of all these corner-cases.  I strongly 
believe that's what ultimately did the project in.  Well, that and the 
primary guy behind it graduating or whatever and moving on, I think, but 
it made enough difference that were it /easy/ to maintain, someone would 
have definitely done so.

[On parallel builds as a work-around]

> Pretty much the same here, I just don't use parallel builds ATM (I've
> had lots of ebuild bugs hit me recently, and it's a bit easier to debug
> such things if everything is serialized)

I've pretty much gone the other way.  Parallel by default (I have 
scriptlets that handle the emerge commandline details and all the default 
ones add the --jobs and --load-average options), but with builds in tmpfs 
and ccache, when something fails, 90%+ of the time I simply rebuild it 
individually instead of finding the log.  In fact, with the defaults 
including continue as well, so portage continues on to build other 
packages even if one fails, most of the time it's continuing on after 
I've caught the failed build and am rerunning it in a different terminal 
window with full output to the terminal.

Say there's 50 packages to update and #13 fails.  But by the time it 
fails 14-16 are already building and complete just fine, and while it 
won't schedule any new builds after a failure until it recalculates the 
build graph and removes deps of the failed, once it does that it 
continues on.  So in our example original #20 and 21 are dropped as 
reverse deps of the failed, but of the first 16, 15 succeeded and one 
failed, and of the entire 50, two more are dropped.  After the 
recalulation it then has (original) #17-19 and 22-50 to build.  So then 
it starts the recalculated set with 32 (if I did my math right), and 
continues building from there.

But by this time I've noticed the failed #13 and set it rebuilding with 
output to a different terminal.  Because of the load-average limits that 
simply slows up the continuing rebuild of the 32 a bit, but with the 
builds in tmpfs, the rebuild now ccached, and the rebuild's reads all in 
system cache, the rebuild will go quite fast even with a heavy load from 
the continuing build in the other terminal, and it's simply easier to 
redo the single failed build and get the log on-terminal this time, than 
it is to load the failed build log in a text editor.

So I rebuild the failed one by itself, and sometimes it'll build fine the 
second time around -- maybe it was just an undeclared dep that's built 
now or something.  (Since I'm running live-git kde, sometimes those live-
build deps in particular have changed and it simply hasn't been reflected 
in the ebuild yet.)

Or with a live-git package sometimes the dev has already caught and fixed 
the problem and a rebuild "just works".

Or it may be something easily solved by a simple patch, often quickly 
googled from bugzie or whatever, that I can simply drop in /etc/portage/
patches, and again rebuild.  Or I can bypass the problem with a quick 
entry in package.use or package.env.

In these cases I'll often have the problem solved and the package and any 

Re: [PATCH 15/31] btrfs: call functions that overwrite their root parameter with fs_info

2016-09-06 Thread David Sterba
Hi,

On Fri, Jun 24, 2016 at 06:15:08PM -0400, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> There are 11 functions that accept a root parameter and immediately
> overwrite it.  We can pass those an fs_info pointer instead.
> 
> Signed-off-by: Jeff Mahoney 

including this patch, the rest of the series hasn't been merged in the
last cycle.

The first batch of 4.9 patches is done, please refresh the series on top
of that. Branch misc-4.9 in
http://repo.or.cz/linux-2.6/btrfs-unstable.git or
https://github.com/kdave/btrfs-devel . Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: fix WARNING in btrfs_select_ref_head()

2016-09-06 Thread David Sterba
Josef, can you please review this?

On Mon, Jun 20, 2016 at 09:18:52AM +0800, Wang Xiaoguang wrote:
> This issue was found when testing in-band dedupe enospc behaviour,
> sometimes run_one_delayed_ref() may fail for enospc reason, then
> __btrfs_run_delayed_refs()will return, but forget to add num_heads_read
> back, which will trigger "WARN_ON(delayed_refs->num_heads_ready == 0)" in
> btrfs_select_ref_head().
> 
> Signed-off-by: Wang Xiaoguang 
> ---
>  fs/btrfs/extent-tree.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 6146729..eeedff3 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -2665,7 +2665,10 @@ static noinline int __btrfs_run_delayed_refs(struct 
> btrfs_trans_handle *trans,
>  
>   btrfs_free_delayed_extent_op(extent_op);
>   if (ret) {
> + spin_lock(_refs->lock);
>   locked_ref->processing = 0;
> + delayed_refs->num_heads_ready++;
> + spin_unlock(_refs->lock);
>   btrfs_delayed_ref_unlock(locked_ref);
>   btrfs_put_delayed_ref(ref);
>   btrfs_debug(fs_info, "run_one_delayed_ref returned %d", 
> ret);
> -- 
> 1.8.3.1
> 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Security implications of btrfs receive?

2016-09-06 Thread Graham Cobb
Thanks to Austin and Duncan for their replies.

On 06/09/16 13:15, Austin S. Hemmelgarn wrote:
> On 2016-09-05 05:59, Graham Cobb wrote:
>> Does the "path" argument of btrfs-receive mean that *all* operations are
>> confined to that path?  For example, if a UUID or transid is sent which
>> refers to an entity outside the path will that other entity be affected
>> or used?
> As far as I know, no, it won't be affected.
>> Is it possible for a file to be created containing shared
>> extents from outside the path?
> As far as I know, the only way for this to happen is if you're
> referencing a parent subvolume for a relative send that is itself
> sharing extents outside of the path.  From a practical perspective,
> unless you're doing deduplication on the receiving end, the this
> shouldn't be possible.

Unfortunately that is not the case.  I decided to do some tests to see
what happens.  It is possible for a receive into one path to reference
and access a subvolume from a different path on the same btrfs disk.  I
have created a bash script to demonstrate this at:

https://gist.github.com/GrahamCobb/c7964138057e4e092a75319c9fb240a3

This does require the attacker to know the (source) subvolume UUID they
want to copy.  I am not sure how hard UUIDs are to guess.

By the way, this is exactly the same whether or not the --chroot option
is specified on the "btrfs receive" command.

The next question this raises for me is whether this means that
processes in a chroot or in a container (or in a mandatory access
controls environment) can access files outside the chroot/container if
they know the UUID of the subvolume?  After all, btrfs-receive uses
IOCTLs that any program running as root can use.

Graham
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] Btrfs: fix memory leak due to invalid btree height

2016-09-06 Thread David Sterba
On Fri, May 13, 2016 at 05:07:02PM -0700, Liu Bo wrote:
> Thanks to fuzz testing, we can have invalid btree root node height.

Shouldn't we do this kind of sanity checks earlier? Not at the search
slot time but when it's read from disk. The check that you're adding can
stay, but without the early check we could hit it very often thus making
it very noisy.

> Btrfs limits btree height to 7 and if the given height is 9, then btrfs
> will have problems in both releasing root node's lock and freeing the node.


> 
> Signed-off-by: Liu Bo 
> ---
>  fs/btrfs/ctree.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index ec7928a..3fccbcc 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2756,6 +2756,13 @@ again:
>   }
>   }
>   }
> + if (level > BTRFS_MAX_LEVEL - 1 || level < 0) {
> + WARN_ONCE(1, KERN_WARNING "Invalid btree height %d\n", level);
> + if (!p->skip_locking)
> + btrfs_tree_unlock_rw(b, root_lock);
> + free_extent_buffer(b);
> + return -EINVAL;
> + }
>   p->nodes[level] = b;
>   if (!p->skip_locking)
>   p->locks[level] = root_lock;
> -- 
> 2.5.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] btrfs-progs: Doc: Add warning for build RAID btrfs on partions from the same device

2016-09-06 Thread David Sterba
On Fri, Sep 02, 2016 at 09:41:45AM +0800, Qu Wenruo wrote:
> Quite a common sense for any RAID-like multi-device setup, just in case.
> 
> Signed-off-by: Qu Wenruo 

Both applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Some help with the code.

2016-09-06 Thread Tomasz Kusmierz
This is predominantly for maintainers:

I've noticed that there is a lot of code for btrfs ... and after few
glimpses I've noticed that there are occurrences which beg for some
refactoring to make it less of a pain to maintain.

I'm speaking of occurrences where:
- within a function there are multiple checks for null pointer and
then whenever there is anything hanging on the end of that pointer to
finally call the function, pass the pointer to it and watch it perform
same checks to finally deallocate stuff on the end of a pointer.
- single line functions ... called only in two places

and so on.

I know that you guys are busy, but maintaining code that is only
growing must be a pain.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH]btrfs-progs: Post btrfs-convert verify permissions and acls

2016-09-06 Thread David Sterba
On Mon, Sep 05, 2016 at 09:27:36PM +0200, Lakshmipathi.G wrote:
> Signed-off-by: Lakshmipathi.G 

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread Jeff Mahoney
On 9/6/16 5:58 AM, David Sterba wrote:
> On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
 Since root is only used to get fs_info->chunk_root, why not use fs_info
 directly?
>>>
>>> Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
>>> to go back and check what else is missing.
>>
>> Actually, most of this didn't land.  Pretty much anything that's a root
>> ->fs_info conversion is in there.
> 
> Only half of the patchset has been merged so far because it did not pass
> testing, so I bisected to some point. I was about to let you know once
> most of 4.9 patches are prepared so there are less merge conflicts.

Ok, thanks.  I was going to start the rebase today but I'll hold off
until you're set for 4.9.

-Jeff

-- 
Jeff Mahoney
SUSE Labs



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 02/13] btrfs-progs: check: introduce function to find dir_item

2016-09-06 Thread David Sterba
On Thu, Jul 28, 2016 at 03:08:14PM +0800, Lu Fengqi wrote:
> +static int find_dir_item(struct btrfs_root *root, struct btrfs_key *ref_key,
> +  struct btrfs_key *key, u64 index, char *name,
> +  u32 namelen, u32 mode)
> +{
> + struct btrfs_path *path;
> + struct extent_buffer *node;
> + struct btrfs_dir_item *di;
> + struct btrfs_key location;
> + char namebuf[BTRFS_NAME_LEN] = {0};
> + u32 total;
> + u32 cur = 0;
> + u32 len;
> + u32 name_len;
> + u32 data_len;
> + u8 filetype;
> + int slot;
> + int ret;
> +
> + path = btrfs_alloc_path();

Unhandled error (and there's one more in check_root_ref). Also, the
function can return negative values, but the caller does not propagate
the falue as it rather expects the bitmask of the specific errors.

Fixing that seems to be more intrusive, as the whole callchain does not
expect that consistently.

Please fix that and resend the whole patchset.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns

2016-09-06 Thread Chris Mason
We use a btrfs_log_ctx structure to pass information into the
tree log commit, and get error values out.  It gets added to a per
log-transaction list which we walk when things go bad.

Commit d1433debe added an optimization to skip waiting for the log
commit, but didn't take root_log_ctx out of the list.  This
patch makes sure we remove things before exiting.

Signed-off-by: Chris Mason 
Fixes: d1433debe7f4346cf9fc0dafc71c3137d2a97bc4
cc: sta...@vger.kernel.org # 3.15+
---
 fs/btrfs/tree-log.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index e935035..ef9c55b 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2867,6 +2867,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 
if (log_root_tree->log_transid_committed >= root_log_ctx.log_transid) {
blk_finish_plug();
+   list_del_init(_log_ctx.list);
mutex_unlock(_root_tree->log_mutex);
ret = root_log_ctx.log_ret;
goto out;
-- 
2.8.0.rc2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] Kill the btree inode

2016-09-06 Thread Josef Bacik

On 09/05/2016 12:31 PM, David Sterba wrote:

On Fri, Sep 02, 2016 at 03:39:59PM -0400, Josef Bacik wrote:

In order to provide a better way to do subpage blocksizes we need to stop
allocating pages from a per fs btree inode and instead allocate our own pages.
This work depends on 3 generic patches that I've sent previously

remove mapping from balance_dirty_pages*()
writeback: allow for dirty metadata accounting
writeback: introduce super_operations->write_metadata


What's the status of the patches? I don't see them in current
linux-next. I can put them to our for-next for the time being so we can
test the btree_inode removal.


Still waiting on reviews.  The first pass had no big objections, just need to 
get people to look at the next pass.





This is a pretty big change but ultimately makes extent_buffer reclaim much
cleaner and will make the sub-pagesize blocksize work significantly cleaner.
I've been hammering on this for a few weeks now and seems to be pretty solid.


The preparatory patches are ok. The core patch is quite large, although
there are simple transformations to the new eb_info, the remaining
changes are still quite big for a review. But I don't see a way how to
split it, it's basically an all-or-nothing change. Overall it looks ok
so I'll put it to for-next.



Yeah I broke out what I could but unfortunately ripping out the btree_inode 
means we have to make a bunch of other changes all at once, or we wouldn't be 
able to bisect across it without things breaking horribly.  Thanks,


Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: let btrfs_delete_unused_bgs() to clean relocated bgs

2016-09-06 Thread Josef Bacik

On 09/05/2016 12:32 AM, Naohiro Aota wrote:

2016-09-02 (金) の 09:35 -0400 に Josef Bacik さんは書きました:

On 09/02/2016 03:46 AM, Naohiro Aota wrote:


Currently, btrfs_relocate_chunk() is removing relocated BG by
itself. But
the work can be done by btrfs_delete_unused_bgs() (and it's better
since it
trim the BG). Let's dedupe the code.

While btrfs_delete_unused_bgs() is already hitting the relocated
BG, it
skip the BG since the BG has "ro" flag set (to keep balancing BG
intact).
On the other hand, btrfs cannot drop "ro" flag here to prevent
additional
writes. So this patch make use of "removed" flag.
btrfs_delete_unused_bgs() now detect the flag to distinguish
whether a
read-only BG is relocating or not.



This seems racey to me.  We remove the last part of the block group,
it ends up
on the unused_bgs_list, we process this list, see that removed isn't
set and we
skip it, then later we set removed, but it's too late.  I think the
right way is
to actually do a transaction, set ->removed, manually add it to the
unused_bgs_list if it's not already, then end the transaction.  This
way we are
guaranteed to have the bg on the list when it is ready to be
removed.  This is
my analysis after looking at it for 10 seconds after being awake for
like 30
minutes so if I'm missing something let me know.  Thanks,


I don't think a race will happen. Since we are holding
delete_unused_bgs_mutex here, btrfs_delte_unused_bgs() checks ->removed
flag after we unlock the mutex i.e. we setup the flag properly. For a
case btrfs_delete_usused_bgs() checks the BG before we hold
delte_unused_bgs_mutex, then that BG is removed by it (if it's empty)
and btrfs_relocate_chunk() should never see it.



Ok that's what I was missing, thanks

Reviewed-by: Josef Bacik 

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress

2016-09-06 Thread Josef Bacik

On 09/06/2016 06:18 AM, Wang Xiaoguang wrote:

hello,

On 09/02/2016 09:28 PM, Josef Bacik wrote:

On 09/01/2016 10:58 PM, Wang Xiaoguang wrote:

In btrfs_async_reclaim_metadata_space(), we use ticket's address to
determine whether asynchronous metadata reclaim work is making progress.

ticket = list_first_entry(_info->tickets,
  struct reserve_ticket, list);
if (last_ticket == ticket) {
flush_state++;
} else {
last_ticket = ticket;
flush_state = FLUSH_DELAYED_ITEMS_NR;
if (commit_cycles)
commit_cycles--;
}

But indeed it's wrong, we should not rely on local variable's address to
do this check, because addresses may be same. In my test environment, I
dd one 168MB file in a 256MB fs, found that for this file, every time
wait_reserve_ticket() called, local variable ticket's address is same,

For above codes, assume a previous ticket's address is addrA, last_ticket
is addrA. Btrfs_async_reclaim_metadata_space() finished this ticket and
wake up it, then another ticket is added, but with the same address addrA,
now last_ticket will be same to current ticket, then current ticket's flush
work will start from current flush_state, not initial FLUSH_DELAYED_ITEMS_NR,
which may result in some enospc issues(I have seen this in my test machine).

Signed-off-by: Wang Xiaoguang 


We want to track the progress of the individual tickets, not whether or not we
make progress on the space_info, so instead store the global ticket id in
space_info and store the individual ticket_id in the ticket itself, and use
that as the last_tickets_id.  Thanks,

Sorry for being late.

In btrfs_async_reclaim_metadata_space(), there is codes:
if (flush_state > COMMIT_TRANS) {
commit_cycles++;
if (commit_cycles > 2) {
wake_all_tickets(_info->tickets);
space_info->flush = 0;
} else {
flush_state = FLUSH_DELAYED_ITEMS_NR;
}
}
From above codes, if it can not satisfy one current ticket, it'll discard all
current
queued tickets. So I think your original codes is to track whether or not we
make progress on the space_info, and this patch can fix the issue which is
described
in commit message.
Also I'm not sure whether I have understood your design completely. If I'm
wrong, sorry.


Sorry I misread your patch, I thought you were doing space_info->ticket_id++ 
when you added a ticket to the list, not when you removed it.  You can add


Reviewed-by: Josef Bacik 

Thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [OT] Re: Balancing subvolume on a specific device

2016-09-06 Thread Austin S. Hemmelgarn

On 2016-09-02 06:55, Duncan wrote:

Kai Krakow posted on Thu, 01 Sep 2016 21:45:19 +0200 as excerpted:


Am Sat, 20 Aug 2016 06:30:11 + (UTC)
schrieb Duncan <1i5t5.dun...@cox.net>:


There's at least three other options to try to get what you mention,
however.  FWIW, I'm a gentooer and thus build everything from sources
here, and use ccache myself.  What I do is put all my build stuff, the
gentoo git and assorted overlay git trees, ccache, kernel sources, the
binpkg cache, etc, all on a separate "build" btrfs on normal
partitions, / not/ a subvolume.  That way it can go wherever I want,
and it, along with the main system (/) and /home, but /not/ my media
partition (all of which are fully independent filesystems on their own
partitions, most of them btrfs raid1 on a parallel set of partitions on
a pair of ssds), on ssd. Works great. =:^)


Off topic: Is ccache really that helpful? I dumped it a few years ago
after getting some build errors and/or packaging bugs with it (software
that would just segfault when built with ccache), and in the end it
didn't give a serious speed boost anyways after comparing the genlop
results.


Two comments on ccache...

1) ccache hasn't caused me any serious issues in over a decade of gentoo
usage, including some periods with various hardware issues.  The few
problems I /did/ have at one point were related to crashes while building
and thus corruption of the ccache, but those errors were pretty easily
identified as ccache errors (IDR what the specifics were, but something
about corrupted input files that really made no sense /except/ in the
context of ccache or serious hardware error and I wasn't seeing anything
else related to the latter, so it was pretty clear) and easily enough
fixed by setting CCACHE_RECACHE=1 (write-only mode, basically) for the
specific packages in question, to flush out the corruption by writing
uncorrupted new copies of the files in question.

2a) ccache won't help a lot with ordinary new-version upgrade-cycle
builds, at least with portage, because the build-path is part of the
hash, and portage's default build path includes the package and version
number, so for upgrades, the path and therefore the hash will be
different, resulting in a ccache miss on a new version build, even if
it's the exact same command building the exact same sources.

Similarly, rebuilds of the same sources using the same commands but after
tool (gcc, etc) upgrades won't hash-match (nor would you want them to as
rebuilding with the new version is usually the point), because the hashes
on the tools themselves don't match.

This is why ccache is no longer recommended for ordinary gentooers -- the
hit rate simply doesn't justify it.

2b) cache *does*, however, help in two types of circumstances:

2bi) In ordinary usage, particularly during test compiles in the
configure step, some limited code (here test code) is repeatedly built
with identical commands and paths.  This is where the hits that /are/
generated during normal upgrade usage normally come from, and they can
speed things up somewhat.  However, it's a pretty limited effect and this
by itself doesn't really justify usage.

More measurably practical would be rebuilds of existing versions with
existing tools, perhaps because a library dep upgrade forces it
(intermediate objects involving that library will hash-fail and be
rebuilt, but objects internal to the package itself or only involving
other libs should hash-check and cache-hit), or due to some ebuild change
(like a USE flag change with --newuse) not involving a version bump.
There is often a rather marked ccache related speedup in this sort of
rebuild, but again, while it does happen for most users, it normally
doesn't happen /enough/ to be worth the trouble.

But some users do still run ccache for this case, particularly if like me
they really REALLY hate to see a big build like firefox taking the same
long time it did before, just to change a single USE flag or something.

2bii) Where ccache makes the MOST sense is where people are running large
numbers of live-vcs builds with unchanging () version numbers,
probably via smart-live-rebuild checking to see what packages actually
have new commits since the last build.

I'm running live-git kde, tho a relatively lite version without packages
I don't use and with (my own) patches to kill the core plasma semantic-
desktop (baloo and friends) dependencies, since in my experience semantic-
desktop and its deps simply *are* *not* *worth* *it* !!  That's 170+ kde-
related packages, plus a few misc others, all live-git  version,
which means they build with the same version path and the upstream commit
changes may be as small/simple as some minversion dep bump or l10n
changes to some *.desktop file, neither of which change the code at all,
so in those cases rebuilds should be 100% ccache hits, provided the
ccache is big enough, of course.

Again, live-git (or other live-vcs) rebuilds are where ccache REALLY
shines, 

Re: Security implications of btrfs receive?

2016-09-06 Thread Austin S. Hemmelgarn

On 2016-09-05 05:59, Graham Cobb wrote:

Does anyone know of a security analysis of btrfs receive?
I'm not a developer, and definitely not a security specialist, just a 
security minded sysadmin who has some idea what's going on, but I can at 
least try and answer this.


I assume that just using btrfs receive requires root (is that so?).  But
I was thinking of setting up a backup server which would receive
snapshots from various client systems, each in their own path, and I
wondered how much the security of the backup server (and other clients'
backups) was dependent on the security of the client.
In an ideal situation, I'd suggest setting this up with everything 
running on behalf of the client in it's own container (probably using 
LXC, but lmctfy or similar would work too).  You do need root for 
receive to work (because it has to set ownership of files), but you can 
still sandbox it to a reasonable degree.


Does the "path" argument of btrfs-receive mean that *all* operations are
confined to that path?  For example, if a UUID or transid is sent which
refers to an entity outside the path will that other entity be affected
or used?

As far as I know, no, it won't be affected.

Is it possible for a file to be created containing shared
extents from outside the path?
As far as I know, the only way for this to happen is if you're 
referencing a parent subvolume for a relative send that is itself 
sharing extents outside of the path.  From a practical perspective, 
unless you're doing deduplication on the receiving end, the this 
shouldn't be possible.

Is it possible to confuse/affect
filesystem metadata which would affect the integrity of subvolumes or
files outside the path or prevent other clients from doing something
legitimate?

Do the answers change if the --chroot option is given?
This will give you no more or less security than a chroot would give on 
the system,   Pretty much, if there aren't any bugs, this will prevent 
things from getting written outside of the path specified.

I am confused
about the -m option -- does that mean that the root mount point has to
be visible in the chroot?
In this case, 'root' refers to the top level of the filesystem that the 
path resides in.  So, if you have the filesystem your putting the data 
on mounted at /mnt/backups, and the data for this particular client goes 
in /mnt/backups/client, the appropriate value for the -m option would be 
/mnt/backups.


Lastly, even if receive is designed to be very secure, it is possible
that it could trigger/use code paths in the btrfs kernel code which are
not normally used during normal file operations and so could trigger
bugs not normally seen.  Has any work been done on testing for that (for
example tests using malicious streams, including ones which btrfs-send
cannot generate)?
In general, most of what btrfs receive is doing is just regular file 
operations (open, write, chown, chmod, rename, and similar things), it 
only really gets into the ioctls when dealing with shared extents, and 
when initially setting up the target subvolume, and there really isn't 
anything run on the kernel side that wouldn't be done during normal 
filesystem operation (except possibly some of the extent sharing 
things).  I haven't done any testing myself (I don't use send-receive 
except for cloning VM's, so I don't have much incentive to worry about 
it), but my guess would be that a bogus data stream is most likely to 
crash the receive process in userspace, and has relatively little risk 
of trashing the filesystem beyond what would normally happen for 
something like a bogus rsync transfer.


I am just wondering whether any work has been done/published on this area.
There isn't any that I know of personally, but in theory it shouldn't be 
hard to test.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress

2016-09-06 Thread Wang Xiaoguang

hello,

On 09/02/2016 09:28 PM, Josef Bacik wrote:

On 09/01/2016 10:58 PM, Wang Xiaoguang wrote:

In btrfs_async_reclaim_metadata_space(), we use ticket's address to
determine whether asynchronous metadata reclaim work is making progress.

ticket = list_first_entry(_info->tickets,
  struct reserve_ticket, list);
if (last_ticket == ticket) {
flush_state++;
} else {
last_ticket = ticket;
flush_state = FLUSH_DELAYED_ITEMS_NR;
if (commit_cycles)
commit_cycles--;
}

But indeed it's wrong, we should not rely on local variable's address to
do this check, because addresses may be same. In my test environment, I
dd one 168MB file in a 256MB fs, found that for this file, every time
wait_reserve_ticket() called, local variable ticket's address is same,

For above codes, assume a previous ticket's address is addrA, 
last_ticket

is addrA. Btrfs_async_reclaim_metadata_space() finished this ticket and
wake up it, then another ticket is added, but with the same address 
addrA,
now last_ticket will be same to current ticket, then current ticket's 
flush
work will start from current flush_state, not initial 
FLUSH_DELAYED_ITEMS_NR,
which may result in some enospc issues(I have seen this in my test 
machine).


Signed-off-by: Wang Xiaoguang 


We want to track the progress of the individual tickets, not whether 
or not we make progress on the space_info, so instead store the global 
ticket id in space_info and store the individual ticket_id in the 
ticket itself, and use that as the last_tickets_id.  Thanks,

Sorry for being late.

In btrfs_async_reclaim_metadata_space(), there is codes:
if (flush_state > COMMIT_TRANS) {
commit_cycles++;
if (commit_cycles > 2) {
wake_all_tickets(_info->tickets);
space_info->flush = 0;
} else {
flush_state = FLUSH_DELAYED_ITEMS_NR;
}
}
From above codes, if it can not satisfy one current ticket, it'll 
discard all current

queued tickets. So I think your original codes is to track whether or not we
make progress on the space_info, and this patch can fix the issue which 
is described

in commit message.
Also I'm not sure whether I have understood your design completely. If 
I'm wrong, sorry.


Regards,
Xiaoguang Wang



Josef






--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: remove unnecessary code of chunk_root assignment in btrfs_read_chunk_tree.

2016-09-06 Thread David Sterba
On Mon, Sep 05, 2016 at 11:13:40PM -0400, Jeff Mahoney wrote:
> >> Since root is only used to get fs_info->chunk_root, why not use fs_info
> >> directly?
> > 
> > Weird.  Exactly this was a part of my fs_info patchset.  I guess I need
> > to go back and check what else is missing.
> 
> Actually, most of this didn't land.  Pretty much anything that's a root
> ->fs_info conversion is in there.

Only half of the patchset has been merged so far because it did not pass
testing, so I bisected to some point. I was about to let you know once
most of 4.9 patches are prepared so there are less merge conflicts.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html