Re: [PATCH] Improve error stats message

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 19:37, Diego wrote:
> A typical notification of filesystem errors looks like this:
> 
> BTRFS error (device sda2): bdev /dev/sda2 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
> 
> The device name is being printed twice. Also, these abbreviations
> feel unnecessary. Make the message look like this instead:
> 
> BTRFS error (device sda2): errors: write 0, read 1, flush 0, corrupt 0, generation 0
> 
> 
> Signed-off-by: Diego Calleja 
> ---
>  fs/btrfs/volumes.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 2ceb924ca0d6..52fee5bb056f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7238,9 +7238,8 @@ static void btrfs_dev_stat_print_on_error(struct btrfs_device *dev)
>  {
>   if (!dev->dev_stats_valid)
>   return;
> - btrfs_err_rl_in_rcu(dev->fs_info,
> - "bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
> -rcu_str_deref(dev->name),
> + btrfs_err_rl(dev->fs_info,
> + "errors: write %u, read %u, flush %u, corrupt %u, generation %u",
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_READ_ERRS),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_FLUSH_ERRS),

I think it would be better to expose the btrfs_dev_name function in a
header file and use it instead of open-coding rcu_str_deref. I also
agree that write/read are better than wr/rd.
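
For illustration only, the proposed message format can be sketched in
plain userspace C. The struct and function names below (dev_stats,
format_dev_stats) are hypothetical stand-ins and not kernel API; the
sketch assumes the device name is left to the "BTRFS error (device ...)"
prefix, as the patch description suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the per-device error counters. */
struct dev_stats {
	unsigned int write_errs;
	unsigned int read_errs;
	unsigned int flush_errs;
	unsigned int corruption_errs;
	unsigned int generation_errs;
};

/*
 * Format the counters with spelled-out names. The device name is not
 * repeated here because the message prefix is assumed to carry it.
 * Returns the snprintf() result (length of the formatted text).
 */
int format_dev_stats(char *buf, size_t len, const struct dev_stats *s)
{
	assert(buf != NULL && s != NULL);
	return snprintf(buf, len,
			"errors: write %u, read %u, flush %u, corrupt %u, generation %u",
			s->write_errs, s->read_errs, s->flush_errs,
			s->corruption_errs, s->generation_errs);
}
```

Calling format_dev_stats() with one read error and all other counters
at zero yields the text shown in the patch description after the prefix.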

> 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] btrfs-progs: free-space-cache: Enhance free space cache free space check

2018-03-07 Thread Qu Wenruo
When we find a free space difference between the free space cache and
the block group item, we just discard this free space cache.

Normally such a difference is caused by btrfs_reserve_extent() being
called by delalloc outside of a transaction.
Since btrfs_release_extent() is always called within a transaction,
under a heavy race the free space cache can have less free space than
the block group item.

Normally the kernel will detect such a difference and just discard that
cache.

However, we must be more careful if the free space cache has more free
space than the block group item. If that happens, paired with the above
race, an invalid free space cache can be loaded into the kernel.

So if we find any free space cache that has more free space than the
block group item, we report it as an error rather than ignoring it.

Signed-off-by: Qu Wenruo 
---
v2:
  Fix the timing of free space output.
---
 check/main.c   |  4 +++-
 free-space-cache.c | 32 
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/check/main.c b/check/main.c
index 97baae583f04..bc31f7e32061 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5339,7 +5339,9 @@ static int check_space_cache(struct btrfs_root *root)
error += ret;
} else {
ret = load_free_space_cache(root->fs_info, cache);
-   if (!ret)
+   if (ret < 0)
+   error++;
+   if (ret <= 0)
continue;
}
 
diff --git a/free-space-cache.c b/free-space-cache.c
index f933f9f1cf3f..9b83a71ca59a 100644
--- a/free-space-cache.c
+++ b/free-space-cache.c
@@ -438,7 +438,8 @@ int load_free_space_cache(struct btrfs_fs_info *fs_info,
struct btrfs_path *path;
u64 used = btrfs_block_group_used(&block_group->item);
int ret = 0;
-   int matched;
+   u64 bg_free;
+   s64 diff;
 
path = btrfs_alloc_path();
if (!path)
@@ -448,18 +449,33 @@ int load_free_space_cache(struct btrfs_fs_info *fs_info,
  block_group->key.objectid);
btrfs_free_path(path);
 
-   matched = (ctl->free_space == (block_group->key.offset - used -
-  block_group->bytes_super));
-   if (ret == 1 && !matched) {
-   __btrfs_remove_free_space_cache(ctl);
+   bg_free = block_group->key.offset - used - block_group->bytes_super;
+   diff = ctl->free_space - bg_free;
+   if (ret == 1 && diff) {
fprintf(stderr,
-  "block group %llu has wrong amount of free space\n",
-  block_group->key.objectid);
+  "block group %llu has wrong amount of free space, free space cache has %llu block group has %llu\n",
+  block_group->key.objectid, ctl->free_space, bg_free);
+   __btrfs_remove_free_space_cache(ctl);
+   /*
+* Due to btrfs_reserve_extent() can happen out of a
+* transaction, but all btrfs_release_extent() happens inside
+* a transaction, so under heavy race it's possible that free
+* space cache has less free space, and both kernel just discard
+* such cache. But if we find some case where free space cache
+* has more free space, this means under certain case such
+* cache can be loaded and cause double allocate.
+*
+* Detect such possibility here.
+*/
+   if (diff > 0)
+   error(
+"free space cache has more free space than block group item, this could leads to serious corruption, please contact btrfs developers");
ret = -1;
}
 
if (ret < 0) {
-   ret = 0;
+   if (diff <= 0)
+   ret = 0;
 
fprintf(stderr,
   "failed to load free space cache for block group %llu\n",
-- 
2.16.2
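
As a rough sketch of the decision described in the commit message
above, the three outcomes can be written as a small classifier. The
names (cache_verdict, bg_free_space, check_cache_free_space) are
hypothetical and this is not the btrfs-progs code; it compares the two
unsigned values directly instead of taking a signed difference.

```c
#include <assert.h>
#include <stdint.h>

enum cache_verdict {
	CACHE_OK,	/* cache agrees with the block group item */
	CACHE_DISCARD,	/* cache has less free space: drop it, no error */
	CACHE_ERROR,	/* cache has more free space: drop it and report */
};

/* Free space implied by the block group item. */
uint64_t bg_free_space(uint64_t length, uint64_t used, uint64_t bytes_super)
{
	assert(used + bytes_super <= length);
	return length - used - bytes_super;
}

/*
 * Classify a loaded free space cache against the block group item.
 * Only the "more free space than the item" case is treated as an error,
 * since that is the case that could lead to double allocation.
 */
enum cache_verdict check_cache_free_space(uint64_t cache_free, uint64_t bg_free)
{
	if (cache_free == bg_free)
		return CACHE_OK;
	if (cache_free > bg_free)
		return CACHE_ERROR;
	return CACHE_DISCARD;
}
```

Comparing the unsigned values directly sidesteps any question about the
sign of a 64-bit difference; that is a choice made for this sketch, not
something taken from the patch.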



Re: How to change/fix 'Received UUID'

2018-03-07 Thread Andrei Borzenkov
08.03.2018 09:06, Marc MERLIN wrote:
> On Tue, Mar 06, 2018 at 12:02:47PM -0800, Marc MERLIN wrote:
>>> https://github.com/knorrie/python-btrfs/commit/1ace623f95300ecf581b1182780fd6432a46b24d
>>
>> Well, I had never heard about it until now, thank you.
>>
>> I'll see if I can make it work when I get a bit of time.
> 
> Sorry, I missed the fact that there was no code to write at all.
> gargamel:/var/local/src/python-btrfs/examples# ./set_received_uuid.py 
> 2afc7a5e-107f-d54b-8929-197b80b70828 31337 1234.5678 
> /mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41
> Current subvolume information:
>   subvol_id: 94887
>   received_uuid: ----
>   stime: 0.0 (1970-01-01T00:00:00)
>   stransid: 0  
>   rtime: 0.0 (1970-01-01T00:00:00)
>   rtransid: 0  
> 
> Setting received subvolume...
> 
> Resulting subvolume information:
>   subvol_id: 94887
>   received_uuid: 2afc7a5e-107f-d54b-8929-197b80b70828
>   stime: 1234.5678 (1970-01-01T00:20:34.567800)
>   stransid: 31337
>   rtime: 1520488877.415709329 (2018-03-08T06:01:17.415709)
>   rtransid: 255755
> 
> gargamel:/var/local/src/python-btrfs/examples# btrfs property set -ts 
> /mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41 ro true
> 
> 
> ABORT: btrfs send -p /mnt/btrfs_pool1/Video_ro.20180205_21:05:15 
> Video_ro.20180307_22:03:03 |  btrfs receive /mnt/btrfs_bigbackup/DS1//. failed
> At subvol Video_ro.20180307_22:03:03
> At snapshot Video_ro.20180307_22:03:03
> ERROR: cannot find parent subvolume
> 
> gargamel:/mnt/btrfs_pool1# btrfs subvolume show 
> /mnt/btrfs_pool1/Video_ro.20180220_21\:03\:41/
> Video_ro.20180220_21:03:41

I'm not sure I understand how this subvolume is related. You are sending
the differences between Video_ro.20180205_21:05:15 and
Video_ro.20180307_22:03:03, so you need to have (a replica of)
Video_ro.20180205_21:05:15 on the destination. How exactly does
Video_ro.20180220_21:03:41 come into the picture here?
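
A simplified model of the parent lookup may help here. The sketch below
assumes the receiver identifies the parent replica by comparing each
destination subvolume's Received UUID against the parent UUID carried
in the stream; the real btrfs receive logic may consider more than
that. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, minimal view of a subvolume on the destination. */
struct subvol {
	const char *name;
	const char *uuid;
	const char *received_uuid;	/* NULL if never received */
};

/*
 * Return the index of the destination subvolume whose Received UUID
 * matches the parent UUID named by the send stream, or -1 if there is
 * none (the "cannot find parent subvolume" case).
 */
int find_parent_replica(const struct subvol *dest, size_t n,
			const char *parent_uuid)
{
	assert(parent_uuid != NULL);
	for (size_t i = 0; i < n; i++) {
		if (dest[i].received_uuid != NULL &&
		    strcmp(dest[i].received_uuid, parent_uuid) == 0)
			return (int)i;
	}
	return -1;
}
```

Under this model, setting a Received UUID on the replica of one
snapshot does not help when the stream names a different snapshot as
its parent.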

> Name:   Video_ro.20180220_21:03:41
> UUID:   2afc7a5e-107f-d54b-8929-197b80b70828
> Parent UUID:e5ec5c1e-6b49-084e-8820-5a8cfaa1b089
> Received UUID:  0e220a4f-6426-4745-8399-0da0084f8b23
> Creation time:  2018-02-20 21:03:42 -0800
> Subvolume ID:   11228
> Generation: 4174
> Gen at creation:4150
> Parent ID:  5
> Top level ID:   5
> Flags:  readonly
> Snapshot(s):
> Video_rw.20180220_21:03:41
> Video
> 
> 
> Wasn't I supposed to set 2afc7a5e-107f-d54b-8929-197b80b70828 onto the 
> destination?
> 
> Doesn't that look ok now? Is there something else I'm missing?
> gargamel:/mnt/btrfs_pool1# btrfs subvolume show 
> /mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41
> DS1/Video_ro.20180220_21:03:41
> Name:   Video_ro.20180220_21:03:41
> UUID:   cb4f343c-5e79-7f49-adf0-7ce0b29f23b3
> Parent UUID:0e220a4f-6426-4745-8399-0da0084f8b23
> Received UUID:  2afc7a5e-107f-d54b-8929-197b80b70828
> Creation time:  2018-02-20 21:13:36 -0800
> Subvolume ID:   94887
> Generation: 250689
> Gen at creation:250689
> Parent ID:  89160
> Top level ID:   89160
> Flags:  readonly
> Snapshot(s):
> 
> Thanks,
> Marc
> 



[PATCH] btrfs-progs: free-space-cache: Enhance free space cache free space check

2018-03-07 Thread Qu Wenruo
When we find a free space difference between the free space cache and
the block group item, we just discard this free space cache.

Normally such a difference is caused by btrfs_reserve_extent() being
called by delalloc outside of a transaction.
Since btrfs_release_extent() is always called within a transaction,
under a heavy race the free space cache can have less free space than
the block group item.

Normally the kernel will detect such a difference and just discard that
cache.

However, we must be more careful if the free space cache has more free
space than the block group item. If that happens, paired with the above
race, an invalid free space cache can be loaded into the kernel.

So if we find any free space cache that has more free space than the
block group item, we report it as an error rather than ignoring it.

Signed-off-by: Qu Wenruo 
---
BTW, invalid free space cache found in dm-log-writes test case just
has 0 free space, so it won't trigger such error.
---
 check/main.c   |  4 +++-
 free-space-cache.c | 30 +++---
 2 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/check/main.c b/check/main.c
index 97baae583f04..bc31f7e32061 100644
--- a/check/main.c
+++ b/check/main.c
@@ -5339,7 +5339,9 @@ static int check_space_cache(struct btrfs_root *root)
error += ret;
} else {
ret = load_free_space_cache(root->fs_info, cache);
-   if (!ret)
+   if (ret < 0)
+   error++;
+   if (ret <= 0)
continue;
}
 
diff --git a/free-space-cache.c b/free-space-cache.c
index f933f9f1cf3f..777c5fc682c5 100644
--- a/free-space-cache.c
+++ b/free-space-cache.c
@@ -438,7 +438,8 @@ int load_free_space_cache(struct btrfs_fs_info *fs_info,
struct btrfs_path *path;
u64 used = btrfs_block_group_used(&block_group->item);
int ret = 0;
-   int matched;
+   u64 bg_free;
+   s64 diff;
 
path = btrfs_alloc_path();
if (!path)
@@ -448,18 +449,33 @@ int load_free_space_cache(struct btrfs_fs_info *fs_info,
  block_group->key.objectid);
btrfs_free_path(path);
 
-   matched = (ctl->free_space == (block_group->key.offset - used -
-  block_group->bytes_super));
-   if (ret == 1 && !matched) {
+   bg_free = block_group->key.offset - used - block_group->bytes_super;
+   diff = ctl->free_space - bg_free;
+   if (ret == 1 && diff) {
__btrfs_remove_free_space_cache(ctl);
fprintf(stderr,
-  "block group %llu has wrong amount of free space\n",
-  block_group->key.objectid);
+  "block group %llu has wrong amount of free space, free space cache has %llu block group has %llu\n",
+  block_group->key.objectid, ctl->free_space, bg_free);
+   /*
+* Due to btrfs_reserve_extent() can happen out of a
+* transaction, but all btrfs_release_extent() happens inside
+* a transaction, so under heavy race it's possible that free
+* space cache has less free space, and both kernel just discard
+* such cache. But if we find some case where free space cache
+* has more free space, this means under certain case such
+* cache can be loaded and cause double allocate.
+*
+* Detect such possibility here.
+*/
+   if (diff > 0)
+   error(
+"free space cache has more free space than block group item, this could leads to serious corruption, please contact btrfs developers");
ret = -1;
}
 
if (ret < 0) {
-   ret = 0;
+   if (diff <= 0)
+   ret = 0;
 
fprintf(stderr,
   "failed to load free space cache for block group %llu\n",
-- 
2.16.2



Re: Inconsistence between sender and receiver

2018-03-07 Thread Andrei Borzenkov
07.03.2018 21:49, Liu Bo wrote:
> Hi,
> 
> In the following steps[1], if  on receiver side has got
> changed via 'btrfs property set', then after doing incremental
> updates, receiver gets a different snapshot from what sender has sent.
> 
> The reason behind it is that there is no change about file 'foo' in
> the send stream, such that receiver simply creates a snapshot of
>  on its side with nothing to apply from the send stream.
> 
> A possible way to avoid this is to check rtransid and ctransid of
>  on receiver side, but I'm not very sure whether the current
> behavior is made deliberately, does anyone have an idea? 
> 
> Thanks,
> 
> -liubo
> 
> [1]:
> $ btrfs sub create /mnt/send/sub
> $ touch /mnt/send/sub/foo
> $ btrfs sub snap -r /mnt/send/sub /mnt/send/parent
> 
> # send parent out
> $ btrfs send /mnt/send/parent | btrfs receive /mnt/recv/
> 
> # change parent and file under it
> $ btrfs property set -t subvol /mnt/recv/parent ro false

Is removing the ability to modify the read-only property an option? What
are the use cases for it? What can it do that "btrfs sub snap read-only
writable" cannot?

> $ truncate -s 4096 /mnt/recv/parent/foo
> 
> $ btrfs sub snap -r /mnt/send/sub /mnt/send/update
> $ btrfs send -p /mnt/send/parent /mnt/send/update | btrfs receive /mnt/recv
> 

This should fail right away because /mnt/send/parent is not read-only.
If it does not, this is really a bug.

Of course one may go one step further and set /mnt/send/parent read-only
again, then we get this issue.

> $ ls -l /mnt/send/update
> total 0
> -rw-r--r-- 1 root root 0 Mar  6 11:13 foo
> 
> $ ls -l /mnt/recv/update
> total 0
> -rw-r--r-- 1 root root 4096 Mar  6 11:14 foo
> 
> 



Re: [PATCH v2] fstests: btrfs/146: make sure hit all stripes in the case of compression

2018-03-07 Thread Eryu Guan
On Thu, Mar 08, 2018 at 01:56:45PM +0800, Lu Fengqi wrote:
> In the case of compression, each 128K input data chunk will be compressed
> to 4K (because the characters written are all duplicates). Therefore we have
> to write (128K * 16) to make sure every stripe can be hit.
> 
> Signed-off-by: Lu Fengqi 
> ---
> 
> V2: Modify the regular expression to ensure that it matches various
> compress mount options.
> 
>  tests/btrfs/146 | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/btrfs/146 b/tests/btrfs/146
> index 7071c128ca0a..a51eda68eaf3 100755
> --- a/tests/btrfs/146
> +++ b/tests/btrfs/146
> @@ -74,9 +74,16 @@ _scratch_pool_mkfs "-d raid0 -m raid1" > $seqres.full 2>&1
>  _scratch_mount
>  
>  # How much do we need to write? We need to hit all of the stripes. btrfs uses
> -# a fixed 64k stripesize, so write enough to hit each one
> +# a fixed 64k stripesize, so write enough to hit each one. In the case of
> +# compression, each 128K input data chunk will be compressed to 4K (because of
> +# the characters written are duplicate). Therefore we have to write (128K * 16)
> +# to make sure every stripe can be hit.
>  number_of_devices=`echo $SCRATCH_DEV_POOL | wc -w`
> -write_kb=$(($number_of_devices * 64))
> +if ! echo $MOUNT_OPTIONS | grep -qoP 'compress(-force)?(=(?!no)|,|$)'; then

I'm wondering if we could just write ($number_of_devices * 2048)K data
unconditionally, so we could get rid of this case switch and the fancy
perl style regexp?

Thanks,
Eryu
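
The arithmetic behind the 64 and 2048 factors can be sketched as
follows. The constants mirror the numbers in the patch description
(64K stripe, 128K input chunk compressed to 4K); the function name
write_kb_needed is hypothetical and the test itself is a shell script,
so this C version is only a worked example.

```c
#include <assert.h>
#include <stdbool.h>

#define STRIPE_KB		64	/* fixed stripe size per the test comment */
#define COMPRESS_INPUT_KB	128	/* input chunk handed to compression */
#define COMPRESS_OUTPUT_KB	4	/* on-disk size assumed for duplicate bytes */

/*
 * Amount of data (in KB) to write so that every device's stripe is hit.
 * With compression, 64K / 4K = 16 input chunks of 128K are needed to
 * fill one stripe, i.e. 2048K per device.
 */
unsigned long write_kb_needed(unsigned int ndevs, bool compressed)
{
	unsigned long per_stripe_kb = STRIPE_KB;

	assert(ndevs > 0);
	if (compressed)
		per_stripe_kb = (STRIPE_KB / COMPRESS_OUTPUT_KB) *
				COMPRESS_INPUT_KB;
	return ndevs * per_stripe_kb;
}
```

Writing the larger amount unconditionally, as suggested above, would
correspond to always taking the compressed branch.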

> + write_kb=$(($number_of_devices * 64))
> +else
> + write_kb=$(($number_of_devices * 2048))
> +fi
>  _require_fs_space $SCRATCH_MNT $write_kb
>  
>  testfile=$SCRATCH_MNT/fsync-err-test
> -- 
> 2.16.2
> 
> 
> 


Re: How to change/fix 'Received UUID'

2018-03-07 Thread Marc MERLIN
On Tue, Mar 06, 2018 at 12:02:47PM -0800, Marc MERLIN wrote:
> > https://github.com/knorrie/python-btrfs/commit/1ace623f95300ecf581b1182780fd6432a46b24d
> 
> Well, I had never heard about it until now, thank you.
> 
> I'll see if I can make it work when I get a bit of time.

Sorry, I missed the fact that there was no code to write at all.
gargamel:/var/local/src/python-btrfs/examples# ./set_received_uuid.py 
2afc7a5e-107f-d54b-8929-197b80b70828 31337 1234.5678 
/mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41
Current subvolume information:
  subvol_id: 94887
  received_uuid: ----
  stime: 0.0 (1970-01-01T00:00:00)
  stransid: 0  
  rtime: 0.0 (1970-01-01T00:00:00)
  rtransid: 0  

Setting received subvolume...

Resulting subvolume information:
  subvol_id: 94887
  received_uuid: 2afc7a5e-107f-d54b-8929-197b80b70828
  stime: 1234.5678 (1970-01-01T00:20:34.567800)
  stransid: 31337
  rtime: 1520488877.415709329 (2018-03-08T06:01:17.415709)
  rtransid: 255755

gargamel:/var/local/src/python-btrfs/examples# btrfs property set -ts 
/mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41 ro true


ABORT: btrfs send -p /mnt/btrfs_pool1/Video_ro.20180205_21:05:15 
Video_ro.20180307_22:03:03 |  btrfs receive /mnt/btrfs_bigbackup/DS1//. failed
At subvol Video_ro.20180307_22:03:03
At snapshot Video_ro.20180307_22:03:03
ERROR: cannot find parent subvolume

gargamel:/mnt/btrfs_pool1# btrfs subvolume show 
/mnt/btrfs_pool1/Video_ro.20180220_21\:03\:41/
Video_ro.20180220_21:03:41
Name:   Video_ro.20180220_21:03:41
UUID:   2afc7a5e-107f-d54b-8929-197b80b70828
Parent UUID:e5ec5c1e-6b49-084e-8820-5a8cfaa1b089
Received UUID:  0e220a4f-6426-4745-8399-0da0084f8b23
Creation time:  2018-02-20 21:03:42 -0800
Subvolume ID:   11228
Generation: 4174
Gen at creation:4150
Parent ID:  5
Top level ID:   5
Flags:  readonly
Snapshot(s):
Video_rw.20180220_21:03:41
Video


Wasn't I supposed to set 2afc7a5e-107f-d54b-8929-197b80b70828 onto the 
destination?

Doesn't that look ok now? Is there something else I'm missing?
gargamel:/mnt/btrfs_pool1# btrfs subvolume show 
/mnt/btrfs_bigbackup/DS1/Video_ro.20180220_21:03:41
DS1/Video_ro.20180220_21:03:41
Name:   Video_ro.20180220_21:03:41
UUID:   cb4f343c-5e79-7f49-adf0-7ce0b29f23b3
Parent UUID:0e220a4f-6426-4745-8399-0da0084f8b23
Received UUID:  2afc7a5e-107f-d54b-8929-197b80b70828
Creation time:  2018-02-20 21:13:36 -0800
Subvolume ID:   94887
Generation: 250689
Gen at creation:250689
Parent ID:  89160
Top level ID:   89160
Flags:  readonly
Snapshot(s):

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


[PATCH v2] fstests: btrfs/146: make sure hit all stripes in the case of compression

2018-03-07 Thread Lu Fengqi
In the case of compression, each 128K input data chunk will be compressed
to 4K (because the characters written are all duplicates). Therefore we have
to write (128K * 16) to make sure every stripe can be hit.

Signed-off-by: Lu Fengqi 
---

V2: Modify the regular expression to ensure that it matches various
compress mount options.

 tests/btrfs/146 | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tests/btrfs/146 b/tests/btrfs/146
index 7071c128ca0a..a51eda68eaf3 100755
--- a/tests/btrfs/146
+++ b/tests/btrfs/146
@@ -74,9 +74,16 @@ _scratch_pool_mkfs "-d raid0 -m raid1" > $seqres.full 2>&1
 _scratch_mount
 
 # How much do we need to write? We need to hit all of the stripes. btrfs uses
-# a fixed 64k stripesize, so write enough to hit each one
+# a fixed 64k stripesize, so write enough to hit each one. In the case of
+# compression, each 128K input data chunk will be compressed to 4K (because of
+# the characters written are duplicate). Therefore we have to write (128K * 16)
+# to make sure every stripe can be hit.
 number_of_devices=`echo $SCRATCH_DEV_POOL | wc -w`
-write_kb=$(($number_of_devices * 64))
+if ! echo $MOUNT_OPTIONS | grep -qoP 'compress(-force)?(=(?!no)|,|$)'; then
+   write_kb=$(($number_of_devices * 64))
+else
+   write_kb=$(($number_of_devices * 2048))
+fi
 _require_fs_space $SCRATCH_MNT $write_kb
 
 testfile=$SCRATCH_MNT/fsync-err-test
-- 
2.16.2





Re: [PATCH 10/20] btrfs-progs: help: convert ints used as bools to bool

2018-03-07 Thread Qu Wenruo


On 2018-03-08 10:40, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> We use an int for 'full', 'all', and 'err' when we really mean a boolean.
> 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: Qu Wenruo 

Thanks,
Qu

> ---
>  btrfs.c | 14 +++---
>  help.c  | 25 +
>  help.h  |  4 ++--
>  3 files changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/btrfs.c b/btrfs.c
> index 2d39f2ce..fec1a135 100644
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -109,7 +109,7 @@ static void handle_help_options_next_level(const struct 
> cmd_struct *cmd,
>   argv++;
>   help_command_group(cmd->next, argc, argv);
>   } else {
> - usage_command(cmd, 1, 0);
> + usage_command(cmd, true, false);
>   }
>  
>   exit(0);
> @@ -125,7 +125,7 @@ int handle_command_group(const struct cmd_group *grp, int 
> argc,
>   argc--;
>   argv++;
>   if (argc < 1) {
> - usage_command_group(grp, 0, 0);
> + usage_command_group(grp, false, false);
>   exit(1);
>   }
>  
> @@ -212,20 +212,20 @@ static int handle_global_options(int argc, char **argv)
>  
>  void handle_special_globals(int shift, int argc, char **argv)
>  {
> - int has_help = 0;
> - int has_full = 0;
> + bool has_help = false;
> + bool has_full = false;
>   int i;
>  
>   for (i = 0; i < shift; i++) {
>   if (strcmp(argv[i], "--help") == 0)
> - has_help = 1;
> + has_help = true;
>   else if (strcmp(argv[i], "--full") == 0)
> - has_full = 1;
> + has_full = true;
>   }
>  
>   if (has_help) {
>   if (has_full)
> - usage_command_group(&btrfs_cmd_group, 1, 0);
> + usage_command_group(&btrfs_cmd_group, true, false);
>   else
>   cmd_help(argc, argv);
>   exit(0);
> diff --git a/help.c b/help.c
> index 311a4320..ef7986b4 100644
> --- a/help.c
> +++ b/help.c
> @@ -196,8 +196,8 @@ static int do_usage_one_command(const char * const 
> *usagestr,
>  }
>  
>  static int usage_command_internal(const char * const *usagestr,
> -   const char *token, int full, int lst,
> -   int alias, FILE *outf)
> +   const char *token, bool full, bool lst,
> +   bool alias, FILE *outf)
>  {
>   unsigned int flags = 0;
>   int ret;
> @@ -223,17 +223,17 @@ static int usage_command_internal(const char * const 
> *usagestr,
>  }
>  
>  static void usage_command_usagestr(const char * const *usagestr,
> -const char *token, int full, int err)
> +const char *token, bool full, bool err)
>  {
>   FILE *outf = err ? stderr : stdout;
>   int ret;
>  
> - ret = usage_command_internal(usagestr, token, full, 0, 0, outf);
> + ret = usage_command_internal(usagestr, token, full, false, false, outf);
>   if (!ret)
>   fputc('\n', outf);
>  }
>  
> -void usage_command(const struct cmd_struct *cmd, int full, int err)
> +void usage_command(const struct cmd_struct *cmd, bool full, bool err)
>  {
>   usage_command_usagestr(cmd->usagestr, cmd->token, full, err);
>  }
> @@ -241,11 +241,11 @@ void usage_command(const struct cmd_struct *cmd, int 
> full, int err)
>  __attribute__((noreturn))
>  void usage(const char * const *usagestr)
>  {
> - usage_command_usagestr(usagestr, NULL, 1, 1);
> + usage_command_usagestr(usagestr, NULL, true, true);
>   exit(1);
>  }
>  
> -static void usage_command_group_internal(const struct cmd_group *grp, int 
> full,
> +static void usage_command_group_internal(const struct cmd_group *grp, bool 
> full,
>FILE *outf)
>  {
>   const struct cmd_struct *cmd = grp->commands;
> @@ -265,7 +265,8 @@ static void usage_command_group_internal(const struct 
> cmd_group *grp, int full,
>   }
>  
>   usage_command_internal(cmd->usagestr, cmd->token, full,
> -1, cmd->flags & CMD_ALIAS, outf);
> +true, cmd->flags & CMD_ALIAS,
> +outf);
>   if (cmd->flags & CMD_ALIAS)
>   putchar('\n');
>   continue;
> @@ -327,7 +328,7 @@ void usage_command_group_short(const struct cmd_group 
> *grp)
>   fprintf(stderr, "All command groups have their manual page named 
> 'btrfs-'.\n");
>  }
>  
> -void usage_command_group(const struct cmd_group *grp, int full, int err)
> +void usage_command_group(const struct cmd_group *grp, bool full, bool err)
>  {
>   const char * const 

Re: [PATCH 08/20] btrfs-progs: qgroups: introduce btrfs_qgroup_query

2018-03-07 Thread Qu Wenruo


On 2018-03-08 10:40, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> The only mechanism we have in the progs for searching qgroups is to load
> all of them and filter the results.  This works for qgroup show but
> to add quota information to 'btrfs subvolume show' it's pretty wasteful.
> 
> This patch splits out setting up the search and performing the search so
> we can search for a single qgroupid more easily.  Since TREE_SEARCH
> will give results that don't strictly match the search terms, we add
> a filter to match only the results we care about.
> 
> Signed-off-by: Jeff Mahoney 
> ---
>  qgroup.c | 143 
> ---
>  qgroup.h |   7 
>  2 files changed, 116 insertions(+), 34 deletions(-)
> 
> diff --git a/qgroup.c b/qgroup.c
> index 57815718..d076b1de 100644
> --- a/qgroup.c
> +++ b/qgroup.c
> @@ -1165,11 +1165,30 @@ static inline void print_status_flag_warning(u64 
> flags)
>   warning("qgroup data inconsistent, rescan recommended");
>  }
>  
> -static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
> +static bool key_in_range(const struct btrfs_key *key,
> +  const struct btrfs_ioctl_search_key *sk)
> +{
> + if (key->objectid < sk->min_objectid ||
> + key->objectid > sk->max_objectid)
> + return false;
> +
> + if (key->type < sk->min_type ||
> + key->type > sk->max_type)
> + return false;
> +
> + if (key->offset < sk->min_offset ||
> + key->offset > sk->max_offset)
> + return false;
> +
> + return true;
> +}

Even with the key_in_range() check here, we are still following the tree
search slice behavior:

The tree search will still give us all the items in the key range from
(min_objectid, min_type, min_offset) to
(max_objectid, max_type, max_offset).

I don't see much difference between the tree search ioctl and this one.
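
To make the two readings of a search key concrete, here is a small
sketch that contrasts a "slice" (one contiguous range of compound keys)
with a per-field range check like the quoted key_in_range(). The types
and the names in_slice and in_box are hypothetical, and the sketch does
not claim which behavior the ioctl or the final patch should have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compound key: ordered by objectid, then type, then offset. */
struct key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Three-way comparison of compound keys. */
int key_cmp(const struct key *a, const struct key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Slice reading: min <= key <= max as whole compound keys. */
bool in_slice(const struct key *k, const struct key *min,
	      const struct key *max)
{
	assert(key_cmp(min, max) <= 0);
	return key_cmp(k, min) >= 0 && key_cmp(k, max) <= 0;
}

/* Per-field reading: every field must sit inside its own range. */
bool in_box(const struct key *k, const struct key *min,
	    const struct key *max)
{
	return k->objectid >= min->objectid && k->objectid <= max->objectid &&
	       k->type >= min->type && k->type <= max->type &&
	       k->offset >= min->offset && k->offset <= max->offset;
}
```

For example, with min = (1, 10, 0) and max = (3, 20, 0), the key
(2, 30, 5) lies inside the slice but outside the per-field box, so the
two checks can disagree once the objectid range spans more than one
value.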

> +
> +static int __qgroups_search(int fd, struct btrfs_ioctl_search_args *args,
> + struct qgroup_lookup *qgroup_lookup)
>  {
>   int ret;
> - struct btrfs_ioctl_search_args args;
> - struct btrfs_ioctl_search_key *sk = &args.key;
> + struct btrfs_ioctl_search_key *sk = &args->key;
> + struct btrfs_ioctl_search_key filter_key = args->key;
>   struct btrfs_ioctl_search_header *sh;
>   unsigned long off = 0;
>   unsigned int i;
> @@ -1180,30 +1199,15 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   u64 qgroupid;
>   u64 qgroupid1;
>  
> - memset(&args, 0, sizeof(args));
> -
> - sk->tree_id = BTRFS_QUOTA_TREE_OBJECTID;
> - sk->max_type = BTRFS_QGROUP_RELATION_KEY;
> - sk->min_type = BTRFS_QGROUP_STATUS_KEY;
> - sk->max_objectid = (u64)-1;
> - sk->max_offset = (u64)-1;
> - sk->max_transid = (u64)-1;
> - sk->nr_items = 4096;
> -
>   qgroup_lookup_init(qgroup_lookup);
>  
>   while (1) {
> - ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> + ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, args);
>   if (ret < 0) {
> - if (errno == ENOENT) {
> - error("can't list qgroups: quotas not enabled");
> + if (errno == ENOENT)
>   ret = -ENOTTY;
> - } else {
> - error("can't list qgroups: %s",
> -strerror(errno));
> + else
>   ret = -errno;
> - }
> -
>   break;
>   }
>  
> @@ -1217,37 +1221,46 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>* read the root_ref item it contains
>*/
>   for (i = 0; i < sk->nr_items; i++) {
> - sh = (struct btrfs_ioctl_search_header *)(args.buf +
> + struct btrfs_key key;
> +
> + sh = (struct btrfs_ioctl_search_header *)(args->buf +
> off);
>   off += sizeof(*sh);
>  
> - switch (btrfs_search_header_type(sh)) {
> + key.objectid = btrfs_search_header_objectid(sh);
> + key.type = btrfs_search_header_type(sh);
> + key.offset = btrfs_search_header_offset(sh);
> +
> + if (!key_in_range(&key, &filter_key))
> + goto next;

It still looks like most other qgroup info will get calculated.

> +
> + switch (key.type) {
>   case BTRFS_QGROUP_STATUS_KEY:
>   si = (struct btrfs_qgroup_status_item *)
> -  (args.buf + off);
> +  (args->buf + off);
>   flags = 

Re: [PATCH 07/20] btrfs-progs: qgroups: introduce and use info and limit structures

2018-03-07 Thread Qu Wenruo


On 2018-03-08 10:40, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> We use structures to pass the info and limit from the kernel as items
> but store the individual values separately in btrfs_qgroup.  We already
> have a btrfs_qgroup_limit structure that's used for setting the limit.
> 
> This patch introduces a btrfs_qgroup_info structure and uses that and
> btrfs_qgroup_limit in btrfs_qgroup.
> 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: Qu Wenruo 

Thanks,
Qu

> ---
>  qgroup.c | 82 
> ++--
>  qgroup.h |  8 +++
>  2 files changed, 52 insertions(+), 38 deletions(-)
> 
> diff --git a/qgroup.c b/qgroup.c
> index 5600da99..57815718 100644
> --- a/qgroup.c
> +++ b/qgroup.c
> @@ -46,20 +46,12 @@ struct btrfs_qgroup {
>   /*
>* info_item
>*/
> - u64 generation;
> - u64 rfer;   /*referenced*/
> - u64 rfer_cmpr;  /*referenced compressed*/
> - u64 excl;   /*exclusive*/
> - u64 excl_cmpr;  /*exclusive compressed*/
> + struct btrfs_qgroup_info info;
>  
>   /*
>*limit_item
>*/
> - u64 flags;  /*which limits are set*/
> - u64 max_rfer;
> - u64 max_excl;
> - u64 rsv_rfer;
> - u64 rsv_excl;
> + struct btrfs_qgroup_limit limit;
>  
>   /*qgroups this group is member of*/
>   struct list_head qgroups;
> @@ -260,6 +252,11 @@ void print_pathname_column(struct btrfs_qgroup *qgroup, 
> bool verbose)
>   fputs("", stdout);
>  }
>  
> +static int print_u64(u64 value, int unit_mode, int max_len)
> +{
> + return printf("%*s", max_len, pretty_size_mode(value, unit_mode));
> +}
> +
>  static void print_qgroup_column(struct btrfs_qgroup *qgroup,
>   enum btrfs_qgroup_column_enum column,
>   bool verbose)
> @@ -279,24 +276,26 @@ static void print_qgroup_column(struct btrfs_qgroup 
> *qgroup,
>   print_qgroup_column_add_blank(BTRFS_QGROUP_QGROUPID, len);
>   break;
>   case BTRFS_QGROUP_RFER:
> - len = printf("%*s", max_len, pretty_size_mode(qgroup->rfer, 
> unit_mode));
> + len = print_u64(qgroup->info.referenced, unit_mode, max_len);
>   break;
>   case BTRFS_QGROUP_EXCL:
> - len = printf("%*s", max_len, pretty_size_mode(qgroup->excl, 
> unit_mode));
> + len = print_u64(qgroup->info.exclusive, unit_mode, max_len);
>   break;
>   case BTRFS_QGROUP_PARENT:
>   len = print_parent_column(qgroup);
>   print_qgroup_column_add_blank(BTRFS_QGROUP_PARENT, len);
>   break;
>   case BTRFS_QGROUP_MAX_RFER:
> - if (qgroup->flags & BTRFS_QGROUP_LIMIT_MAX_RFER)
> - len = printf("%*s", max_len, 
> pretty_size_mode(qgroup->max_rfer, unit_mode));
> + if (qgroup->limit.flags & BTRFS_QGROUP_LIMIT_MAX_RFER)
> + len = print_u64(qgroup->limit.max_referenced,
> + unit_mode, max_len);
>   else
>   len = printf("%*s", max_len, "none");
>   break;
>   case BTRFS_QGROUP_MAX_EXCL:
> - if (qgroup->flags & BTRFS_QGROUP_LIMIT_MAX_EXCL)
> - len = printf("%*s", max_len, 
> pretty_size_mode(qgroup->max_excl, unit_mode));
> + if (qgroup->limit.flags & BTRFS_QGROUP_LIMIT_MAX_EXCL)
> + len = print_u64(qgroup->limit.max_exclusive,
> + unit_mode, max_len);
>   else
>   len = printf("%*s", max_len, "none");
>   break;
> @@ -439,9 +438,9 @@ static int comp_entry_with_rfer(struct btrfs_qgroup 
> *entry1,
>  {
>   int ret;
>  
> - if (entry1->rfer > entry2->rfer)
> + if (entry1->info.referenced > entry2->info.referenced)
>   ret = 1;
> - else if (entry1->rfer < entry2->rfer)
> + else if (entry1->info.referenced < entry2->info.referenced)
>   ret = -1;
>   else
>   ret = 0;
> @@ -455,9 +454,9 @@ static int comp_entry_with_excl(struct btrfs_qgroup 
> *entry1,
>  {
>   int ret;
>  
> - if (entry1->excl > entry2->excl)
> + if (entry1->info.exclusive > entry2->info.exclusive)
>   ret = 1;
> - else if (entry1->excl < entry2->excl)
> + else if (entry1->info.exclusive < entry2->info.exclusive)
>   ret = -1;
>   else
>   ret = 0;
> @@ -471,9 +470,9 @@ static int comp_entry_with_max_rfer(struct btrfs_qgroup 
> *entry1,
>  {
>   int ret;
>  
> - if (entry1->max_rfer > entry2->max_rfer)
> + if (entry1->limit.max_referenced > entry2->limit.max_referenced)
>   ret = 1;
> - else if (entry1->max_rfer < entry2->max_rfer)
> + else if (entry1->limit.max_referenced < entry2->limit.max_referenced)
>   ret = 

Re: [PATCH 06/20] btrfs-progs: qgroups: add pathname to show output

2018-03-07 Thread Qu Wenruo


On 2018-03-08 10:40, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> The btrfs qgroup show command currently only exports qgroup IDs,
> forcing the user to resolve which subvolume each corresponds to.
> 
> This patch adds pathname resolution to qgroup show so that when
> the -P option is used, the last column contains the pathname of
> the root of the subvolume it describes.  In the case of nested
> qgroups, it will show the number of member qgroups or the paths
> of the members if the -v option is used.
> 
> Pathname can also be used as a sort parameter.
> 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: Qu Wenruo 

Except one nitpick inlined below.

[snip]
>   }
> + if (bq->pathname)
> + free((void *)bq->pathname);

What about just free(bq->pathname);?

Is this (void *) cast used to get around the const qualifier?

Thanks,
Qu

>   free(bq);
>  }
>  
> @@ -1107,7 +1228,7 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   info = (struct btrfs_qgroup_info_item *)
>  (args.buf + off);
>  
> - ret = update_qgroup_info(qgroup_lookup,
> + ret = update_qgroup_info(fd, qgroup_lookup,
>qgroupid, info);
>   break;
>   case BTRFS_QGROUP_LIMIT_KEY:
> @@ -1115,7 +1236,7 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   limit = (struct btrfs_qgroup_limit_item *)
>   (args.buf + off);
>  
> - ret = update_qgroup_limit(qgroup_lookup,
> + ret = update_qgroup_limit(fd, qgroup_lookup,
> qgroupid, limit);
>   break;
>   case BTRFS_QGROUP_RELATION_KEY:
> @@ -1159,7 +1280,7 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   return ret;
>  }
>  
> -static void print_all_qgroups(struct qgroup_lookup *qgroup_lookup)
> +static void print_all_qgroups(struct qgroup_lookup *qgroup_lookup, bool 
> verbose)
>  {
>  
>   struct rb_node *n;
> @@ -1170,14 +1291,15 @@ static void print_all_qgroups(struct qgroup_lookup 
> *qgroup_lookup)
>   n = rb_first(_lookup->root);
>   while (n) {
>   entry = rb_entry(n, struct btrfs_qgroup, sort_node);
> - print_single_qgroup_table(entry);
> + print_single_qgroup_table(entry, verbose);
>   n = rb_next(n);
>   }
>  }
>  
>  int btrfs_show_qgroups(int fd,
>  struct btrfs_qgroup_filter_set *filter_set,
> -struct btrfs_qgroup_comparer_set *comp_set)
> +struct btrfs_qgroup_comparer_set *comp_set,
> +bool verbose)
>  {
>  
>   struct qgroup_lookup qgroup_lookup;
> @@ -1189,7 +1311,7 @@ int btrfs_show_qgroups(int fd,
>   return ret;
>   __filter_and_sort_qgroups(_lookup, _tree,
> filter_set, comp_set);
> - print_all_qgroups(_tree);
> + print_all_qgroups(_tree, verbose);
>  
>   __free_all_qgroups(_lookup);
>   return ret;
> diff --git a/qgroup.h b/qgroup.h
> index 875fbdf3..f7ab7de5 100644
> --- a/qgroup.h
> +++ b/qgroup.h
> @@ -59,11 +59,13 @@ enum btrfs_qgroup_column_enum {
>   BTRFS_QGROUP_MAX_EXCL,
>   BTRFS_QGROUP_PARENT,
>   BTRFS_QGROUP_CHILD,
> + BTRFS_QGROUP_PATHNAME,
>   BTRFS_QGROUP_ALL,
>  };
>  
>  enum btrfs_qgroup_comp_enum {
>   BTRFS_QGROUP_COMP_QGROUPID,
> + BTRFS_QGROUP_COMP_PATHNAME,
>   BTRFS_QGROUP_COMP_RFER,
>   BTRFS_QGROUP_COMP_EXCL,
>   BTRFS_QGROUP_COMP_MAX_RFER,
> @@ -80,7 +82,7 @@ enum btrfs_qgroup_filter_enum {
>  int btrfs_qgroup_parse_sort_string(const char *opt_arg,
>   struct btrfs_qgroup_comparer_set **comps);
>  int btrfs_show_qgroups(int fd, struct btrfs_qgroup_filter_set *,
> -struct btrfs_qgroup_comparer_set *);
> +struct btrfs_qgroup_comparer_set *, bool verbose);
>  void btrfs_qgroup_setup_print_column(enum btrfs_qgroup_column_enum column);
>  void btrfs_qgroup_setup_units(unsigned unit_mode);
>  struct btrfs_qgroup_filter_set *btrfs_qgroup_alloc_filter_set(void);
> diff --git a/utils.c b/utils.c
> index e9cb3a82..7b7f87f1 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -2556,15 +2556,9 @@ out:
>   return ret;
>  }
>  
> -int get_subvol_info_by_rootid(const char *mnt, struct root_info *get_ri, u64 
> r_id)
> +int get_subvol_info_by_rootid_fd(int fd, struct root_info *get_ri, u64 r_id)
>  {
> - int fd;
>   int ret;
> - DIR *dirstream = NULL;
> -
> - fd = btrfs_open_dir(mnt, &dirstream, 1);
> - if (fd < 0)
> - return -EINVAL;
>  
>  

[PATCH 2/3] net: Remove accidental VLAs from proc buffers

2018-03-07 Thread Kees Cook
In the quest to remove all stack VLAs from the kernel[1], this refactors
the stack array size calculation to avoid using max(), which makes the
compiler think the size isn't fixed.

[1] https://lkml.org/lkml/2018/3/7/621
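
As an illustration only (not part of the patch), the following user-space sketch shows the two idioms the diff relies on: nesting a two-argument constant maximum to cover four values, and clearing the buffer with sizeof(buff). The DEMO_* names and their values are made up for the example; the real MIB sizes come from kernel headers.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as SIMPLE_MAX() from patch 1/3 of this series. */
#define SIMPLE_MAX(x, y)	((size_t)(x) > (size_t)(y) ? (size_t)(x) \
							   : (size_t)(y))

/* Hypothetical MIB sizes, chosen only for this example. */
enum {
	DEMO_UDP_MAX = 9,
	DEMO_TCP_MAX = 16,
	DEMO_IP_MAX = 37,
	DEMO_ICMP_MAX = 6
};

/* Nesting the two-argument macro covers four values. */
#define DEMO_MIB_MAX	SIMPLE_MAX(SIMPLE_MAX(DEMO_UDP_MAX, DEMO_TCP_MAX), \
				   SIMPLE_MAX(DEMO_IP_MAX, DEMO_ICMP_MAX))

/* Zeroes a fixed-size stack buffer and returns its element count. */
static size_t demo_clear_buff(void)
{
	unsigned long buff[DEMO_MIB_MAX];

	/*
	 * sizeof(buff) follows the declaration, so neither the element
	 * count nor the element type is repeated at the memset() call.
	 */
	memset(buff, 0, sizeof(buff));
	return sizeof(buff) / sizeof(buff[0]);
}
```

Because every operand is a constant, the array size is an integer constant expression and the buffer is an ordinary fixed-size array.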

Signed-off-by: Kees Cook 
---
 net/ipv4/proc.c | 10 --
 net/ipv6/proc.c | 10 --
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index dc5edc8f7564..c23c43803435 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -46,8 +46,6 @@
 #include 
 #include 
 
-#define TCPUDP_MIB_MAX max_t(u32, UDP_MIB_MAX, TCP_MIB_MAX)
-
 /*
  * Report socket allocation statistics [m...@utu.fi]
  */
@@ -400,11 +398,11 @@ static int snmp_seq_show_ipstats(struct seq_file *seq, 
void *v)
 
 static int snmp_seq_show_tcp_udp(struct seq_file *seq, void *v)
 {
-   unsigned long buff[TCPUDP_MIB_MAX];
+   unsigned long buff[SIMPLE_MAX(UDP_MIB_MAX, TCP_MIB_MAX)];
struct net *net = seq->private;
int i;
 
-   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
+   memset(buff, 0, sizeof(buff));
 
seq_puts(seq, "\nTcp:");
for (i = 0; snmp4_tcp_list[i].name; i++)
@@ -421,7 +419,7 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, void 
*v)
seq_printf(seq, " %lu", buff[i]);
}
 
-   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
+   memset(buff, 0, sizeof(buff));
 
snmp_get_cpu_field_batch(buff, snmp4_udp_list,
 net->mib.udp_statistics);
@@ -432,7 +430,7 @@ static int snmp_seq_show_tcp_udp(struct seq_file *seq, void 
*v)
for (i = 0; snmp4_udp_list[i].name; i++)
seq_printf(seq, " %lu", buff[i]);
 
-   memset(buff, 0, TCPUDP_MIB_MAX * sizeof(unsigned long));
+   memset(buff, 0, sizeof(buff));
 
/* the UDP and UDP-Lite MIBs are the same */
seq_puts(seq, "\nUdpLite:");
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index b67814242f78..5b0874c26802 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -30,10 +30,8 @@
 #include 
 #include 
 
-#define MAX4(a, b, c, d) \
-   max_t(u32, max_t(u32, a, b), max_t(u32, c, d))
-#define SNMP_MIB_MAX MAX4(UDP_MIB_MAX, TCP_MIB_MAX, \
-   IPSTATS_MIB_MAX, ICMP_MIB_MAX)
+#define SNMP_MIB_MAX SIMPLE_MAX(SIMPLE_MAX(UDP_MIB_MAX, TCP_MIB_MAX), \
+   SIMPLE_MAX(IPSTATS_MIB_MAX, ICMP_MIB_MAX))
 
 static int sockstat6_seq_show(struct seq_file *seq, void *v)
 {
@@ -199,7 +197,7 @@ static void snmp6_seq_show_item(struct seq_file *seq, void 
__percpu *pcpumib,
int i;
 
if (pcpumib) {
-   memset(buff, 0, sizeof(unsigned long) * SNMP_MIB_MAX);
+   memset(buff, 0, sizeof(buff));
 
snmp_get_cpu_field_batch(buff, itemlist, pcpumib);
for (i = 0; itemlist[i].name; i++)
@@ -218,7 +216,7 @@ static void snmp6_seq_show_item64(struct seq_file *seq, 
void __percpu *mib,
u64 buff64[SNMP_MIB_MAX];
int i;
 
-   memset(buff64, 0, sizeof(u64) * SNMP_MIB_MAX);
+   memset(buff64, 0, sizeof(buff64));
 
snmp_get_cpu_field64_batch(buff64, itemlist, mib, syncpoff);
for (i = 0; itemlist[i].name; i++)
-- 
2.7.4



[PATCH 3/3] btrfs: tree-checker: Avoid accidental stack VLA

2018-03-07 Thread Kees Cook
In the quest to remove all stack VLAs from the kernel[1], this refactors
the stack array size calculation to avoid using max(), which makes the
compiler think the size isn't fixed.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Kees Cook 
---
 fs/btrfs/tree-checker.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index c3c8d48f6618..59bd07694118 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -341,7 +341,8 @@ static int check_dir_item(struct btrfs_root *root,
 */
if (key->type == BTRFS_DIR_ITEM_KEY ||
key->type == BTRFS_XATTR_ITEM_KEY) {
-   char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+   char namebuf[SIMPLE_MAX(BTRFS_NAME_LEN,
+   XATTR_NAME_MAX)];
 
read_extent_buffer(leaf, namebuf,
(unsigned long)(di + 1), name_len);
-- 
2.7.4



[PATCH v2 1/3] vsprintf: Remove accidental VLA usage

2018-03-07 Thread Kees Cook
In the quest to remove all stack VLAs from the kernel[1], this introduces
a new "simple max" macro, and changes the "sym" array size calculation to
use it. The value is actually a fixed size, but since the max() macro uses
some extensive tricks for safety, it ends up looking like a variable size
to the compiler.

[1] https://lkml.org/lkml/2018/3/7/621
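
A minimal user-space sketch of the idea, for illustration only: a plain ternary over constants stays an integer constant expression, so it can size a stack array and even feed a static assertion. The macro body is copied from the patch; DECODED_LEN, RAW_LEN and sym_buf_size() are hypothetical names used just for this example.

```c
#include <stddef.h>

/* No type checking and no statement expressions: constant in, constant out. */
#define SIMPLE_MAX(x, y)	((size_t)(x) > (size_t)(y) ? (size_t)(x) \
							   : (size_t)(y))

/* Hypothetical sizes, modelled on the string sizes used in lib/vsprintf.c. */
#define DECODED_LEN	sizeof("[mem - 64bit pref window disabled]")
#define RAW_LEN		sizeof("[mem - flags 0x]")

/* This compiles only if SIMPLE_MAX() yields an integer constant expression. */
_Static_assert(SIMPLE_MAX(DECODED_LEN, RAW_LEN) == DECODED_LEN,
	       "the decoded form is the longer string");

/* Returns the size of a stack buffer declared with SIMPLE_MAX(). */
static size_t sym_buf_size(void)
{
	char sym[SIMPLE_MAX(DECODED_LEN, RAW_LEN)];	/* fixed size, not a VLA */

	return sizeof(sym);
}
```

The trade-off is the one the kernel-doc comment states: the macro gives up the type checking of max(), so it is meant for array sizes only.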

Signed-off-by: Kees Cook 
---
 include/linux/kernel.h | 11 +++
 lib/vsprintf.c |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 3fd291503576..1da554e9997f 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -820,6 +820,17 @@ static inline void ftrace_dump(enum ftrace_dump_mode 
oops_dump_mode) { }
  x, y)
 
 /**
+ * SIMPLE_MAX - return maximum of two values without any type checking
+ * @x: first value
+ * @y: second value
+ *
+ * This should only be used in stack array sizes, since the type-checking
+ * from max() confuses the compiler into thinking a VLA is being used.
+ */
+#define SIMPLE_MAX(x, y)   ((size_t)(x) > (size_t)(y) ? (size_t)(x) \
+  : (size_t)(y))
+
+/**
  * min3 - return minimum of three values
  * @x: first value
  * @y: second value
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index d7a708f82559..50cce36e1cdc 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -744,8 +744,8 @@ char *resource_string(char *buf, char *end, struct resource 
*res,
 #define FLAG_BUF_SIZE  (2 * sizeof(res->flags))
 #define DECODED_BUF_SIZE   sizeof("[mem - 64bit pref window disabled]")
 #define RAW_BUF_SIZE   sizeof("[mem - flags 0x]")
-   char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
-2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
+   char sym[SIMPLE_MAX(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
+   2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
 
char *p = sym, *pend = sym + sizeof(sym);
int decode = (fmt[0] == 'R') ? 1 : 0;
-- 
2.7.4



[PATCH 0/3] Remove accidental VLA usage

2018-03-07 Thread Kees Cook
This series adds SIMPLE_MAX() to be used in places where a stack array
is actually fixed, but the compiler still warns about VLA usage due to
confusion caused by the safety checks in the max() macro.

I'm sending these via -mm since that's where I've introduced SIMPLE_MAX(),
and they should all have no operational differences.

-Kees



[PATCH 06/20] btrfs-progs: qgroups: add pathname to show output

2018-03-07 Thread jeffm
From: Jeff Mahoney 

The btrfs qgroup show command currently only exports qgroup IDs,
forcing the user to resolve which subvolume each corresponds to.

This patch adds pathname resolution to qgroup show so that when
the -P option is used, the last column contains the pathname of
the root of the subvolume it describes.  In the case of nested
qgroups, it will show the number of member qgroups or the paths
of the members if the -v option is used.

Pathname can also be used as a sort parameter.
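
For readers unfamiliar with the option handling, here is a reduced sketch of how the two new switches could be parsed with getopt_long(). It is not the code from cmds-qgroup.c: struct show_opts and parse_show_opts() are hypothetical, and only -P and -v/--verbose are handled.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the two switches this patch adds. */
struct show_opts {
	bool pathname;	/* -P: print the pathname column */
	bool verbose;	/* -v or --verbose: list member pathnames */
};

/* Parses -P and -v; returns 0 on success, -1 on an unknown option. */
static int parse_show_opts(int argc, char **argv, struct show_opts *opts)
{
	static const struct option long_options[] = {
		{ "verbose", no_argument, NULL, 'v' },
		{ NULL, 0, NULL, 0 }
	};
	int c;

	opts->pathname = false;
	opts->verbose = false;
	optind = 1;	/* start scanning at the first argument */
	opterr = 0;	/* keep getopt from printing its own messages */

	while ((c = getopt_long(argc, argv, "Pv", long_options, NULL)) != -1) {
		switch (c) {
		case 'P':
			opts->pathname = true;
			break;
		case 'v':
			opts->verbose = true;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

In the patch itself the verbose flag is then passed down to btrfs_show_qgroups() as an extra argument.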

Signed-off-by: Jeff Mahoney 
---
 Documentation/btrfs-qgroup.asciidoc |   4 +
 cmds-qgroup.c   |  18 -
 kerncompat.h|   1 +
 qgroup.c| 152 
 qgroup.h|   4 +-
 utils.c |  23 --
 utils.h |   2 +
 7 files changed, 178 insertions(+), 26 deletions(-)

diff --git a/Documentation/btrfs-qgroup.asciidoc 
b/Documentation/btrfs-qgroup.asciidoc
index 3108457c..360b3269 100644
--- a/Documentation/btrfs-qgroup.asciidoc
+++ b/Documentation/btrfs-qgroup.asciidoc
@@ -97,10 +97,14 @@ print child qgroup id.
 print limit of referenced size of qgroup.
 -e
 print limit of exclusive size of qgroup.
+-P
+print pathname to the root of the subvolume managed by qgroup.  For nested 
qgroups, the number of members will be printed unless -v is specified.
 -F
 list all qgroups which impact the given path(include ancestral qgroups)
 -f
 list all qgroups which impact the given path(exclude ancestral qgroups)
+-v
+Be more verbose.  Print pathnames of member qgroups when nested.
 --raw
 raw numbers in bytes, without the 'B' suffix.
 --human-readable
diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 48686436..d704aeaf 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -280,8 +280,11 @@ static const char * const cmd_qgroup_show_usage[] = {
"   (including ancestral qgroups)",
"-f list all qgroups which impact the given path",
"   (excluding ancestral qgroups)",
+   "-P print first-level qgroups using pathname",
+   "   - nested qgroups will be reported as a count",
+   "-v verbose, prints pathnames for all nested qgroups",
HELPINFO_UNITS_LONG,
-   "--sort=qgroupid,rfer,excl,max_rfer,max_excl",
+   "--sort=qgroupid,rfer,excl,max_rfer,max_excl,pathname",
"   list qgroups sorted by specified items",
"   you can use '+' or '-' in front of each item.",
"   (+:ascending, -:descending, ascending default)",
@@ -299,6 +302,7 @@ static int cmd_qgroup_show(int argc, char **argv)
int filter_flag = 0;
unsigned unit_mode;
int sync = 0;
+   bool verbose = false;
 
struct btrfs_qgroup_comparer_set *comparer_set;
struct btrfs_qgroup_filter_set *filter_set;
@@ -316,10 +320,11 @@ static int cmd_qgroup_show(int argc, char **argv)
static const struct option long_options[] = {
{"sort", required_argument, NULL, GETOPT_VAL_SORT},
{"sync", no_argument, NULL, GETOPT_VAL_SYNC},
+   {"verbose", no_argument, NULL, 'v'},
{ NULL, 0, NULL, 0 }
};
 
-   c = getopt_long(argc, argv, "pcreFf", long_options, NULL);
+   c = getopt_long(argc, argv, "pPcreFfv", long_options, NULL);
if (c < 0)
break;
switch (c) {
@@ -327,6 +332,10 @@ static int cmd_qgroup_show(int argc, char **argv)
btrfs_qgroup_setup_print_column(
BTRFS_QGROUP_PARENT);
break;
+   case 'P':
+   btrfs_qgroup_setup_print_column(
+   BTRFS_QGROUP_PATHNAME);
+   break;
case 'c':
btrfs_qgroup_setup_print_column(
BTRFS_QGROUP_CHILD);
@@ -354,6 +363,9 @@ static int cmd_qgroup_show(int argc, char **argv)
case GETOPT_VAL_SYNC:
sync = 1;
break;
+   case 'v':
+   verbose = true;
+   break;
default:
usage(cmd_qgroup_show_usage);
}
@@ -394,7 +406,7 @@ static int cmd_qgroup_show(int argc, char **argv)
BTRFS_QGROUP_FILTER_PARENT,
qgroupid);
}
-   ret = btrfs_show_qgroups(fd, filter_set, comparer_set);
+   ret = btrfs_show_qgroups(fd, filter_set, comparer_set, verbose);
close_file_or_dir(fd, dirstream);
free(filter_set);
free(comparer_set);
diff --git 

[PATCH 02/20] btrfs-progs: qgroups: fix misleading index check

2018-03-07 Thread jeffm
From: Jeff Mahoney 

In print_single_qgroup_table we check the loop index against
BTRFS_QGROUP_CHILD, but what we really mean is "last column."  Since
we have an enum value to indicate the last value, use that instead
of assuming that BTRFS_QGROUP_CHILD is always last.
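
A small stand-alone sketch of the pattern, with made-up names: the enum ends in a sentinel, and "last column" is derived from that sentinel, so appending a column needs no change to the separator logic.

```c
#include <stdio.h>

/* Hypothetical column list; DEMO_COL_ALL is a sentinel, not a column. */
enum demo_column {
	DEMO_COL_ID,
	DEMO_COL_RFER,
	DEMO_COL_CHILD,
	DEMO_COL_PATHNAME,	/* appended later; the loop below is unchanged */
	DEMO_COL_ALL,
};

static const char * const demo_names[DEMO_COL_ALL] = {
	"id", "rfer", "child", "path",
};

/*
 * Writes the column names into buf, separated by single spaces, and
 * returns the string length. Returns 0 with an empty string if buf
 * (which must hold at least one byte) is too small.
 */
static size_t demo_format_row(char *buf, size_t size)
{
	size_t used = 0;
	int i;

	buf[0] = '\0';
	for (i = 0; i < DEMO_COL_ALL; i++) {
		/* the separator is skipped after the last real column */
		const char *sep = (i != DEMO_COL_ALL - 1) ? " " : "";
		int n = snprintf(buf + used, size - used, "%s%s",
				 demo_names[i], sep);

		if (n < 0 || (size_t)n >= size - used) {
			buf[0] = '\0';
			return 0;
		}
		used += (size_t)n;
	}
	return used;
}
```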

Reviewed-by: Qu Wenruo 
Reviewed-by: Nikolay Borisov 
Signed-off-by: Jeff Mahoney 
---
 qgroup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qgroup.c b/qgroup.c
index 11659e83..67bc0738 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -267,7 +267,7 @@ static void print_single_qgroup_table(struct btrfs_qgroup 
*qgroup)
continue;
print_qgroup_column(qgroup, i);
 
-   if (i != BTRFS_QGROUP_CHILD)
+   if (i != BTRFS_QGROUP_ALL - 1)
printf(" ");
}
printf("\n");
-- 
2.12.3



[PATCH 05/20] btrfs-progs: btrfs-list: add btrfs_cleanup_root_info

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Currently we can pass back root_info structures to callers but
have to free the strings manually.  This adds a helper to do it
and uses it in cmd_subvol_show.
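
The shape of such a helper can be sketched in isolation as follows. struct demo_root_info is a hypothetical stand-in that carries only the three strings; the real struct root_info has more members.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct root_info: three owned strings. */
struct demo_root_info {
	char *name;
	char *path;
	char *full_path;
};

/*
 * Releases the strings but not the struct itself, so it also works
 * for instances that live on the caller's stack.
 */
static void demo_cleanup_root_info(struct demo_root_info *ri)
{
	free(ri->name);
	free(ri->path);
	free(ri->full_path);
	/*
	 * Resetting the pointers makes a second cleanup call harmless,
	 * because free(NULL) is a no-op.
	 */
	ri->name = NULL;
	ri->path = NULL;
	ri->full_path = NULL;
}
```

A caller on an error path can therefore invoke the helper unconditionally, provided the struct was zero-initialized before it was filled in.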

Signed-off-by: Jeff Mahoney 
---
 btrfs-list.c | 18 +++---
 btrfs-list.h |  1 +
 cmds-subvolume.c |  5 +
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index 90c98be1..2fe31e9c 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -533,15 +533,27 @@ static int add_root_backref(struct root_lookup 
*root_lookup, u64 root_id,
name_len, 0, 0, 0, NULL, NULL, NULL);
 }
 
+static void __btrfs_free_root_info_strings(struct root_info *ri)
+{
+   free(ri->name);
+   free(ri->path);
+   free(ri->full_path);
+}
+
+void btrfs_cleanup_root_info(struct root_info *ri)
+{
+   __btrfs_free_root_info_strings(ri);
+   ri->name = NULL;
+   ri->path = NULL;
+   ri->full_path = NULL;
+}
 
 static void free_root_info(struct rb_node *node)
 {
struct root_info *ri;
 
ri = to_root_info(node);
-   free(ri->name);
-   free(ri->path);
-   free(ri->full_path);
+   __btrfs_free_root_info_strings(ri);
free(ri);
 }
 
diff --git a/btrfs-list.h b/btrfs-list.h
index 6e5fc778..9d0478b8 100644
--- a/btrfs-list.h
+++ b/btrfs-list.h
@@ -176,5 +176,6 @@ char *btrfs_list_path_for_root(int fd, u64 root);
 int btrfs_list_get_path_rootid(int fd, u64 *treeid);
 int btrfs_get_subvol(int fd, struct root_info *the_ri);
 int btrfs_get_toplevel_subvol(int fd, struct root_info *the_ri);
+void btrfs_cleanup_root_info(struct root_info *ri);
 
 #endif
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 8a473f7a..769d2a76 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -1113,10 +1113,7 @@ static int cmd_subvol_show(int argc, char **argv)
1, raw_prefix);
 
 out:
-   /* clean up */
-   free(get_ri.path);
-   free(get_ri.name);
-   free(get_ri.full_path);
+   btrfs_cleanup_root_info(&get_ri);
free(filter_set);
 
close_file_or_dir(fd, dirstream1);
-- 
2.12.3



[PATCH 12/20] btrfs-progs: filesystem balance: split out special handling

2018-03-07 Thread jeffm
From: Jeff Mahoney 

In preparation to use cmd_struct as the command entry point, we need
to split out the 'filesystem balance' handling to not call cmd_balance
directly.  The reason is that the flags that indicate a command is
hidden are a part of cmd_struct and so we can't use a cmd_struct as a
direct alias in another command group and ALSO have it be hidden
without declaring another cmd_struct.

This change has no immediate impact since cmd_balance will still
use its usage information directly from cmds-balance.c.  It will
take effect once we start passing cmd_structs around for usage
information.
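
To make the arrangement concrete, here is a reduced sketch with hypothetical demo_* names; the real cmd_struct has more fields. The alias gets its own thin handler and its own table entry, which is where its usage text and hidden flag live.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_CMD_HIDDEN	(1U << 0)	/* hypothetical, mirrors CMD_HIDDEN */

/* Hypothetical, reduced command entry: name, handler, usage, flags. */
struct demo_cmd {
	const char *token;
	int (*fn)(int argc, char **argv);
	const char * const *usage;
	unsigned int flags;
};

/* Placeholder for the real command: reports how many args it saw. */
static int demo_cmd_balance(int argc, char **argv)
{
	(void)argv;
	return argc;
}

static const char * const demo_fs_balance_usage[] = {
	"demo filesystem balance [args...] (alias of \"demo balance\")",
	NULL
};

/* Thin wrapper: the alias forwards to the real command unchanged. */
static int demo_fs_balance(int argc, char **argv)
{
	return demo_cmd_balance(argc, argv);
}

static const struct demo_cmd demo_fs_cmds[] = {
	{ "balance", demo_fs_balance, demo_fs_balance_usage, DEMO_CMD_HIDDEN },
	{ NULL, NULL, NULL, 0 }
};

/* Looks up a command by name; hidden entries still resolve. */
static const struct demo_cmd *demo_find_cmd(const struct demo_cmd *cmds,
					    const char *token)
{
	for (; cmds->token; cmds++)
		if (strcmp(cmds->token, token) == 0)
			return cmds;
	return NULL;
}
```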

Signed-off-by: Jeff Mahoney 
---
 cmds-filesystem.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 467aff11..62112705 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1184,6 +1184,18 @@ static int cmd_filesystem_label(int argc, char **argv)
}
 }
 
+static const char * const cmd_filesystem_balance_usage[] = {
+   "btrfs filesystem balance [args...] (alias of \"btrfs balance\")",
+   "Please see \"btrfs balance --help\" for more information.",
+   NULL
+};
+
+/* Compatible old "btrfs filesystem balance" command */
+static int cmd_filesystem_balance(int argc, char **argv)
+{
+   return cmd_balance(argc, argv);
+}
+
 static const char filesystem_cmd_group_info[] =
 "overall filesystem tasks and information";
 
@@ -1197,8 +1209,9 @@ const struct cmd_group filesystem_cmd_group = {
0 },
{ "defragment", cmd_filesystem_defrag,
cmd_filesystem_defrag_usage, NULL, 0 },
-   { "balance", cmd_balance, NULL, _cmd_group,
-   CMD_HIDDEN },
+   { "balance", cmd_filesystem_balance,
+  cmd_filesystem_balance_usage, _cmd_group,
+  CMD_HIDDEN },
{ "resize", cmd_filesystem_resize, cmd_filesystem_resize_usage,
NULL, 0 },
{ "label", cmd_filesystem_label, cmd_filesystem_label_usage,
-- 
2.12.3



[PATCH 11/20] btrfs-progs: reorder placement of help declarations for send/receive

2018-03-07 Thread jeffm
From: Jeff Mahoney 

The usage definitions for send and receive follow the command
definitions, which use them.  This works because we declare them
in commands.h.  When we move to using cmd_struct as the entry point,
these declarations will be removed, breaking the commands.  Since
that would be an otherwise unrelated change, this patch reorders
them separately.
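
The ordering constraint is plain C: an identifier must be declared before it is used. A tiny sketch with made-up names shows the layout the patch moves to, where the usage array is defined ahead of the function that reads it and no header declaration is required.

```c
#include <stddef.h>

/*
 * Defined ahead of the function that reads it, so this file needs no
 * separate forward declaration for the array.
 */
static const char * const demo_receive_usage[] = {
	"demo receive [options] <mount>",
	"Receive subvolumes from a stream",
	NULL
};

/* Counts the usage lines up to the NULL terminator. */
static size_t demo_usage_lines(void)
{
	size_t n = 0;

	while (demo_receive_usage[n])
		n++;
	return n;
}
```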

Signed-off-by: Jeff Mahoney 
---
 cmds-receive.c | 62 ++--
 cmds-send.c| 69 +-
 2 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 68123a31..b3709f36 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -1248,6 +1248,37 @@ out:
return ret;
 }
 
+const char * const cmd_receive_usage[] = {
+   "btrfs receive [options] \n"
+   "btrfs receive --dump [options]",
+   "Receive subvolumes from a stream",
+   "Receives one or more subvolumes that were previously",
+   "sent with btrfs send. The received subvolumes are stored",
+   "into MOUNT.",
+   "The receive will fail in case the receiving subvolume",
+   "already exists. It will also fail in case a previously",
+   "received subvolume has been changed after it was received.",
+   "After receiving a subvolume, it is immediately set to",
+   "read-only.",
+   "",
+   "-v   increase verbosity about performed actions",
+   "-f FILE  read the stream from FILE instead of stdin",
+   "-e   terminate after receiving an  marker in the 
stream.",
+   " Without this option the receiver side terminates only 
in case",
+   " of an error on end of file.",
+   "-C|--chroot  confine the process to  using chroot",
+   "-E|--max-errors NERR",
+   " terminate as soon as NERR errors occur while",
+   " stream processing commands from the stream.",
+   " Default value is 1. A value of 0 means no limit.",
+   "-m ROOTMOUNT the root mount point of the destination filesystem.",
+   " If /proc is not accessible, use this to tell us 
where",
+   " this file system is mounted.",
+   "--dump   dump stream metadata, one line per operation,",
+   " does not require the MOUNT parameter",
+   NULL
+};
+
 int cmd_receive(int argc, char **argv)
 {
char *tomnt = NULL;
@@ -1357,34 +1388,3 @@ out:
 
return !!ret;
 }
-
-const char * const cmd_receive_usage[] = {
-   "btrfs receive [options] \n"
-   "btrfs receive --dump [options]",
-   "Receive subvolumes from a stream",
-   "Receives one or more subvolumes that were previously",
-   "sent with btrfs send. The received subvolumes are stored",
-   "into MOUNT.",
-   "The receive will fail in case the receiving subvolume",
-   "already exists. It will also fail in case a previously",
-   "received subvolume has been changed after it was received.",
-   "After receiving a subvolume, it is immediately set to",
-   "read-only.",
-   "",
-   "-v   increase verbosity about performed actions",
-   "-f FILE  read the stream from FILE instead of stdin",
-   "-e   terminate after receiving an  marker in the 
stream.",
-   " Without this option the receiver side terminates only 
in case",
-   " of an error on end of file.",
-   "-C|--chroot  confine the process to  using chroot",
-   "-E|--max-errors NERR",
-   " terminate as soon as NERR errors occur while",
-   " stream processing commands from the stream.",
-   " Default value is 1. A value of 0 means no limit.",
-   "-m ROOTMOUNT the root mount point of the destination filesystem.",
-   " If /proc is not accessible, use this to tell us 
where",
-   " this file system is mounted.",
-   "--dump   dump stream metadata, one line per operation,",
-   " does not require the MOUNT parameter",
-   NULL
-};
diff --git a/cmds-send.c b/cmds-send.c
index c5ecdaa1..8365e9c9 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -489,6 +489,41 @@ static void free_send_info(struct btrfs_send *sctx)
        subvol_uuid_search_finit(&sctx->sus);
 }
 
+
+const char * const cmd_send_usage[] = {
+   "btrfs send [-ve] [-p ] [-c ] [-f ] 
 [...]",
+   "Send the subvolume(s) to stdout.",
+   "Sends the subvolume(s) specified by  to stdout.",
+   " should be read-only here.",
+   "By default, this will send the whole subvolume. To do an incremental",
+   "send, use '-p '. If you want to allow btrfs to clone from",
+   "any additional local snapshots, use '-c ' (multiple times",
+   

[PATCH 03/20] btrfs-progs: constify pathnames passed as arguments

2018-03-07 Thread jeffm
From: Jeff Mahoney 

It's unlikely we're going to modify a pathname argument, so codify that
and use const.
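
As a generic illustration (demo_path_depth() is a hypothetical helper, not a btrfs-progs function), a const-qualified parameter documents that the callee only reads the string, and lets callers pass literals or other const data without a cast.

```c
#include <stddef.h>

/*
 * Counts the components of a path. The const qualifier makes the
 * compiler reject any accidental write through the pointer.
 */
static size_t demo_path_depth(const char *path)
{
	size_t depth = 0;
	const char *p;

	for (p = path; *p; p++) {
		/* a component starts at a non-slash that follows a slash */
		if (*p != '/' && (p == path || p[-1] == '/'))
			depth++;
	}
	return depth;
}
```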

Reviewed-by: Qu Wenruo 
Signed-off-by: Jeff Mahoney 
---
 chunk-recover.c | 4 ++--
 cmds-device.c   | 2 +-
 cmds-fi-usage.c | 6 +++---
 cmds-rescue.c   | 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/chunk-recover.c b/chunk-recover.c
index 705bcf52..1d30db51 100644
--- a/chunk-recover.c
+++ b/chunk-recover.c
@@ -1492,7 +1492,7 @@ out:
return ERR_PTR(ret);
 }
 
-static int recover_prepare(struct recover_control *rc, char *path)
+static int recover_prepare(struct recover_control *rc, const char *path)
 {
int ret;
int fd;
@@ -2296,7 +2296,7 @@ static void validate_rebuild_chunks(struct 
recover_control *rc)
 /*
  * Return 0 when successful, < 0 on error and > 0 if aborted by user
  */
-int btrfs_recover_chunk_tree(char *path, int verbose, int yes)
+int btrfs_recover_chunk_tree(const char *path, int verbose, int yes)
 {
int ret = 0;
struct btrfs_root *root = NULL;
diff --git a/cmds-device.c b/cmds-device.c
index 86459d1b..a49c9d9d 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -526,7 +526,7 @@ static const char * const cmd_device_usage_usage[] = {
NULL
 };
 
-static int _cmd_device_usage(int fd, char *path, unsigned unit_mode)
+static int _cmd_device_usage(int fd, const char *path, unsigned unit_mode)
 {
int i;
int ret = 0;
diff --git a/cmds-fi-usage.c b/cmds-fi-usage.c
index de7ad668..9a1c76ab 100644
--- a/cmds-fi-usage.c
+++ b/cmds-fi-usage.c
@@ -227,7 +227,7 @@ static int cmp_btrfs_ioctl_space_info(const void *a, const 
void *b)
 /*
  * This function load all the information about the space usage
  */
-static struct btrfs_ioctl_space_args *load_space_info(int fd, char *path)
+static struct btrfs_ioctl_space_args *load_space_info(int fd, const char *path)
 {
struct btrfs_ioctl_space_args *sargs = NULL, *sargs_orig = NULL;
int ret, count;
@@ -305,7 +305,7 @@ static void get_raid56_used(struct chunk_info *chunks, int 
chunkcount,
 #define MIN_UNALOCATED_THRESH   SZ_16M
 static int print_filesystem_usage_overall(int fd, struct chunk_info *chunkinfo,
int chunkcount, struct device_info *devinfo, int devcount,
-   char *path, unsigned unit_mode)
+   const char *path, unsigned unit_mode)
 {
struct btrfs_ioctl_space_args *sargs = NULL;
int i;
@@ -931,7 +931,7 @@ static void _cmd_filesystem_usage_linear(unsigned unit_mode,
 static int print_filesystem_usage_by_chunk(int fd,
struct chunk_info *chunkinfo, int chunkcount,
struct device_info *devinfo, int devcount,
-   char *path, unsigned unit_mode, int tabular)
+   const char *path, unsigned unit_mode, int tabular)
 {
struct btrfs_ioctl_space_args *sargs;
int ret = 0;
diff --git a/cmds-rescue.c b/cmds-rescue.c
index c40088ad..c61145bc 100644
--- a/cmds-rescue.c
+++ b/cmds-rescue.c
@@ -32,8 +32,8 @@ static const char * const rescue_cmd_group_usage[] = {
NULL
 };
 
-int btrfs_recover_chunk_tree(char *path, int verbose, int yes);
-int btrfs_recover_superblocks(char *path, int verbose, int yes);
+int btrfs_recover_chunk_tree(const char *path, int verbose, int yes);
+int btrfs_recover_superblocks(const char *path, int verbose, int yes);
 
 static const char * const cmd_rescue_chunk_recover_usage[] = {
"btrfs rescue chunk-recover [options] ",
-- 
2.12.3



[PATCH 07/20] btrfs-progs: qgroups: introduce and use info and limit structures

2018-03-07 Thread jeffm
From: Jeff Mahoney 

We use structures to pass the info and limit from the kernel as items
but store the individual values separately in btrfs_qgroup.  We already
have a btrfs_qgroup_limit structure that's used for setting the limit.

This patch introduces a btrfs_qgroup_info structure and uses that and
btrfs_qgroup_limit in btrfs_qgroup.
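
The resulting layout can be sketched as below. The member names follow those visible in the diff (info.referenced, limit.max_referenced and so on), but the demo_* types, the exact field lists and the DEMO_LIMIT_MAX_RFER value are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LIMIT_MAX_RFER	(1ULL << 0)	/* hypothetical flag value */

/* Hypothetical field lists, named after the members used in the diff. */
struct demo_qgroup_info {
	uint64_t generation;
	uint64_t referenced;
	uint64_t exclusive;
};

struct demo_qgroup_limit {
	uint64_t flags;
	uint64_t max_referenced;
	uint64_t max_exclusive;
};

/* The qgroup embeds both groups instead of ten loose u64 members. */
struct demo_qgroup {
	struct demo_qgroup_info info;
	struct demo_qgroup_limit limit;
};

/* Three-way comparison on referenced bytes, usable for sorting. */
static int demo_comp_rfer(const struct demo_qgroup *a,
			  const struct demo_qgroup *b)
{
	if (a->info.referenced > b->info.referenced)
		return 1;
	if (a->info.referenced < b->info.referenced)
		return -1;
	return 0;
}

/* True if a referenced-size limit is set for this qgroup. */
static bool demo_has_max_rfer(const struct demo_qgroup *q)
{
	return (q->limit.flags & DEMO_LIMIT_MAX_RFER) != 0;
}
```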

Signed-off-by: Jeff Mahoney 
---
 qgroup.c | 82 ++--
 qgroup.h |  8 +++
 2 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/qgroup.c b/qgroup.c
index 5600da99..57815718 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -46,20 +46,12 @@ struct btrfs_qgroup {
/*
 * info_item
 */
-   u64 generation;
-   u64 rfer;   /*referenced*/
-   u64 rfer_cmpr;  /*referenced compressed*/
-   u64 excl;   /*exclusive*/
-   u64 excl_cmpr;  /*exclusive compressed*/
+   struct btrfs_qgroup_info info;
 
/*
 *limit_item
 */
-   u64 flags;  /*which limits are set*/
-   u64 max_rfer;
-   u64 max_excl;
-   u64 rsv_rfer;
-   u64 rsv_excl;
+   struct btrfs_qgroup_limit limit;
 
/*qgroups this group is member of*/
struct list_head qgroups;
@@ -260,6 +252,11 @@ void print_pathname_column(struct btrfs_qgroup *qgroup, 
bool verbose)
fputs("", stdout);
 }
 
+static int print_u64(u64 value, int unit_mode, int max_len)
+{
+   return printf("%*s", max_len, pretty_size_mode(value, unit_mode));
+}
+
 static void print_qgroup_column(struct btrfs_qgroup *qgroup,
enum btrfs_qgroup_column_enum column,
bool verbose)
@@ -279,24 +276,26 @@ static void print_qgroup_column(struct btrfs_qgroup 
*qgroup,
print_qgroup_column_add_blank(BTRFS_QGROUP_QGROUPID, len);
break;
case BTRFS_QGROUP_RFER:
-   len = printf("%*s", max_len, pretty_size_mode(qgroup->rfer, 
unit_mode));
+   len = print_u64(qgroup->info.referenced, unit_mode, max_len);
break;
case BTRFS_QGROUP_EXCL:
-   len = printf("%*s", max_len, pretty_size_mode(qgroup->excl, unit_mode));
+   len = print_u64(qgroup->info.exclusive, unit_mode, max_len);
break;
case BTRFS_QGROUP_PARENT:
len = print_parent_column(qgroup);
print_qgroup_column_add_blank(BTRFS_QGROUP_PARENT, len);
break;
case BTRFS_QGROUP_MAX_RFER:
-   if (qgroup->flags & BTRFS_QGROUP_LIMIT_MAX_RFER)
-   len = printf("%*s", max_len, pretty_size_mode(qgroup->max_rfer, unit_mode));
+   if (qgroup->limit.flags & BTRFS_QGROUP_LIMIT_MAX_RFER)
+   len = print_u64(qgroup->limit.max_referenced,
+   unit_mode, max_len);
else
len = printf("%*s", max_len, "none");
break;
case BTRFS_QGROUP_MAX_EXCL:
-   if (qgroup->flags & BTRFS_QGROUP_LIMIT_MAX_EXCL)
-   len = printf("%*s", max_len, pretty_size_mode(qgroup->max_excl, unit_mode));
+   if (qgroup->limit.flags & BTRFS_QGROUP_LIMIT_MAX_EXCL)
+   len = print_u64(qgroup->limit.max_exclusive,
+   unit_mode, max_len);
else
len = printf("%*s", max_len, "none");
break;
@@ -439,9 +438,9 @@ static int comp_entry_with_rfer(struct btrfs_qgroup *entry1,
 {
int ret;
 
-   if (entry1->rfer > entry2->rfer)
+   if (entry1->info.referenced > entry2->info.referenced)
ret = 1;
-   else if (entry1->rfer < entry2->rfer)
+   else if (entry1->info.referenced < entry2->info.referenced)
ret = -1;
else
ret = 0;
@@ -455,9 +454,9 @@ static int comp_entry_with_excl(struct btrfs_qgroup *entry1,
 {
int ret;
 
-   if (entry1->excl > entry2->excl)
+   if (entry1->info.exclusive > entry2->info.exclusive)
ret = 1;
-   else if (entry1->excl < entry2->excl)
+   else if (entry1->info.exclusive < entry2->info.exclusive)
ret = -1;
else
ret = 0;
@@ -471,9 +470,9 @@ static int comp_entry_with_max_rfer(struct btrfs_qgroup *entry1,
 {
int ret;
 
-   if (entry1->max_rfer > entry2->max_rfer)
+   if (entry1->limit.max_referenced > entry2->limit.max_referenced)
ret = 1;
-   else if (entry1->max_rfer < entry2->max_rfer)
+   else if (entry1->limit.max_referenced < entry2->limit.max_referenced)
ret = -1;
else
ret = 0;
@@ -487,9 +486,9 @@ static int comp_entry_with_max_excl(struct btrfs_qgroup *entry1,
 {
int ret;
 
-   if (entry1->max_excl > entry2->max_excl)
+

[PATCH 01/20] btrfs-progs: quota: Add -W option to rescan to wait without starting rescan

2018-03-07 Thread jeffm
From: Jeff Mahoney 

This patch adds a new -W option to wait for a rescan without starting a
new operation.  This is useful for things like xfstests where we want
to do a "btrfs quota enable" and not continue until the subsequent
rescan has finished.

In addition to documenting the new option in the man page, I've cleaned
up the rescan entry to document the -w option a bit better.
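
The option handling can be modelled roughly as follows. This is a sketch only: the enum, struct and function names are hypothetical, and options are passed as a plain string of letters rather than through getopt.

```c
#include <stdbool.h>

/* Hypothetical model of what the command will do once parsing is done. */
enum rescan_action_sketch {
	RESCAN_START,	/* issue the rescan ioctl */
	RESCAN_STATUS,	/* only query status (-s) */
	RESCAN_NONE	/* issue nothing, just wait (-W) */
};

struct rescan_plan_sketch {
	enum rescan_action_sketch action;
	bool wait;
};

/*
 * Process option letters in order.  Returns 0 on success, -1 for an
 * unknown letter or for waiting combined with a status query.
 */
static int plan_rescan_sketch(const char *opts,
			      struct rescan_plan_sketch *plan)
{
	plan->action = RESCAN_START;
	plan->wait = false;

	for (; *opts; opts++) {
		switch (*opts) {
		case 's':
			plan->action = RESCAN_STATUS;
			break;
		case 'W':
			plan->action = RESCAN_NONE;
			plan->wait = true;
			break;
		case 'w':
			/* Reset in case both -W and -w were given. */
			plan->action = RESCAN_START;
			plan->wait = true;
			break;
		default:
			return -1;
		}
	}

	if (plan->action == RESCAN_STATUS && plan->wait)
		return -1;
	return 0;
}
```

Note that in this model the result depends on option order: "ws" is rejected, while "sw" ends up behaving like -w alone. That seems to mirror the switch statement in this patch, but treat it as an observation about the sketch rather than a statement about intended behaviour.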

Reviewed-by: Qu Wenruo 
Signed-off-by: Jeff Mahoney 
---
 Documentation/btrfs-quota.asciidoc | 10 +++---
 cmds-quota.c   | 20 ++--
 2 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/Documentation/btrfs-quota.asciidoc b/Documentation/btrfs-quota.asciidoc
index 85ebf729..0b64a69b 100644
--- a/Documentation/btrfs-quota.asciidoc
+++ b/Documentation/btrfs-quota.asciidoc
@@ -238,15 +238,19 @@ Disable subvolume quota support for a filesystem.
 *enable* ::
 Enable subvolume quota support for a filesystem.
 
-*rescan* [-s] ::
+*rescan* [-s|-w|-W] ::
 Trash all qgroup numbers and scan the metadata again with the current config.
 +
 `Options`
 +
 -s
-show status of a running rescan operation.
+Show status of a running rescan operation.
+
 -w
-wait for rescan operation to finish(can be already in progress).
+Start rescan operation and wait until it has finished before exiting.  If a rescan is already running, wait until it finishes and then exit without starting a new one.
+
+-W
+Wait for rescan operation to finish and then exit.  If a rescan is not already running, exit silently.
 
 EXIT STATUS
 ---
diff --git a/cmds-quota.c b/cmds-quota.c
index 745889d1..7f933495 100644
--- a/cmds-quota.c
+++ b/cmds-quota.c
@@ -120,14 +120,20 @@ static int cmd_quota_rescan(int argc, char **argv)
int wait_for_completion = 0;
 
while (1) {
-   int c = getopt(argc, argv, "sw");
+   int c = getopt(argc, argv, "swW");
if (c < 0)
break;
switch (c) {
case 's':
ioctlnum = BTRFS_IOC_QUOTA_RESCAN_STATUS;
break;
+   case 'W':
+   ioctlnum = 0;
+   wait_for_completion = 1;
+   break;
case 'w':
+   /* Reset it in case the user did both -W and -w */
+   ioctlnum = BTRFS_IOC_QUOTA_RESCAN;
wait_for_completion = 1;
break;
default:
@@ -135,8 +141,8 @@ static int cmd_quota_rescan(int argc, char **argv)
}
}
 
-   if (ioctlnum != BTRFS_IOC_QUOTA_RESCAN && wait_for_completion) {
-   error("switch -w cannot be used with -s");
+   if (ioctlnum == BTRFS_IOC_QUOTA_RESCAN_STATUS && wait_for_completion) {
+   error("switches -w/-W cannot be used with -s");
return 1;
}
 
@@ -150,8 +156,10 @@ static int cmd_quota_rescan(int argc, char **argv)
if (fd < 0)
return 1;
 
-   ret = ioctl(fd, ioctlnum, );
-   e = errno;
+   if (ioctlnum) {
+   ret = ioctl(fd, ioctlnum, );
+   e = errno;
+   }
 
if (ioctlnum == BTRFS_IOC_QUOTA_RESCAN_STATUS) {
close_file_or_dir(fd, dirstream);
@@ -167,7 +175,7 @@ static int cmd_quota_rescan(int argc, char **argv)
return 0;
}
 
-   if (ret == 0) {
+   if (ioctlnum == BTRFS_IOC_QUOTA_RESCAN && ret == 0) {
printf("quota rescan started\n");
fflush(stdout);
} else if (ret < 0 && (!wait_for_completion || e != EINPROGRESS)) {
-- 
2.12.3



[PATCH 09/20] btrfs-progs: subvolume: add quota info to btrfs sub show

2018-03-07 Thread jeffm
From: Jeff Mahoney 

This patch reports on the first-level qgroup, if any, associated with
a particular subvolume.  It displays the usage and limit, subject
to the usual unit parameters.
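
The usage and limit lines share one column width so the two rows line up. A minimal sketch of that formatting idea, with hypothetical helper names and output written to a buffer instead of stdout:

```c
#include <stdio.h>
#include <string.h>

/* Width of the wider of two already-formatted size strings. */
static int shared_width_sketch(const char *a, const char *b)
{
	size_t la = strlen(a), lb = strlen(b);

	return (int)(la > lb ? la : lb);
}

/*
 * Format one row.  The referenced column is left-justified and padded
 * to 'width' so that consecutive rows align.
 */
static int format_quota_line_sketch(char *out, size_t outlen,
				    const char *label,
				    const char *referenced,
				    const char *exclusive, int width)
{
	return snprintf(out, outlen, "%s %-*s referenced, %s exclusive",
			label, width, referenced, exclusive);
}
```

A caller would compute the width once from the usage string and the limit string (or "-" when no limit is set) and pass it to both rows.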

Signed-off-by: Jeff Mahoney 
---
 cmds-subvolume.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 769d2a76..96fb7b06 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -972,6 +972,7 @@ static const char * const cmd_subvol_show_usage[] = {
"Show more information about the subvolume",
"-r|--rootid   rootid of the subvolume",
"-u|--uuid uuid of the subvolume",
+   HELPINFO_UNITS_SHORT_LONG,
"",
"If no option is specified,  will be shown, otherwise",
"the rootid or uuid are resolved relative to the  path.",
@@ -993,6 +994,13 @@ static int cmd_subvol_show(int argc, char **argv)
int by_uuid = 0;
u64 rootid_arg;
u8 uuid_arg[BTRFS_UUID_SIZE];
+   struct btrfs_qgroup_stats stats;
+   unsigned int unit_mode;
+   const char *referenced_size;
+   const char *referenced_limit_size = "-";
+   unsigned int field_width = 0;
+
+   unit_mode = get_unit_mode_from_arg(&argc, argv, 1);
 
while (1) {
int c;
@@ -1112,6 +1120,48 @@ static int cmd_subvol_show(int argc, char **argv)
btrfs_list_subvols_print(fd, filter_set, NULL, BTRFS_LIST_LAYOUT_RAW,
1, raw_prefix);
 
+   ret = btrfs_qgroup_query(fd, get_ri.root_id, &stats);
+   if (ret && ret != -ENOTTY && ret != -ENODATA) {
+   fprintf(stderr,
+   "\nERROR: BTRFS_IOC_QUOTA_QUERY failed: %s\n",
+   strerror(-ret));
+   goto out;
+   }
+
+   printf("\tQuota Usage:\t\t");
+   fflush(stdout);
+   if (ret) {
+   if (ret == -ENOTTY)
+   printf("quotas not enabled\n");
+   else
+   printf("quotas not available\n");
+   goto out;
+   }
+
+   referenced_size = pretty_size_mode(stats.info.referenced, unit_mode);
+   if (stats.limit.max_referenced)
+   referenced_limit_size = pretty_size_mode(
+   stats.limit.max_referenced,
+   unit_mode);
+   field_width = max(strlen(referenced_size),
+ strlen(referenced_limit_size));
+
+   printf("%-*s referenced, %s exclusive\n ", field_width,
+  referenced_size,
+  pretty_size_mode(stats.info.exclusive, unit_mode));
+
+   printf("\tQuota Limits:\t\t");
+   if (stats.limit.max_referenced || stats.limit.max_exclusive) {
+   const char *excl = "-";
+
+   if (stats.limit.max_exclusive)
+   excl = pretty_size_mode(stats.limit.max_exclusive,
+   unit_mode);
+   printf("%-*s referenced, %s exclusive\n", field_width,
+  referenced_limit_size, excl);
+   } else
+   printf("None\n");
+
 out:
btrfs_cleanup_root_info(&get_ri);
free(filter_set);
-- 
2.12.3



[PATCH 04/20] btrfs-progs: btrfs-list: add rb_entry helpers for root_info

2018-03-07 Thread jeffm
From: Jeff Mahoney 

We use rb_entry all over the place for the root_info pointers.  Add
a helper to make the code more readable.
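
For anyone unfamiliar with the pattern, the helpers are thin wrappers around a container_of style cast. The sketch below uses made-up stand-in types (node_sketch, root_info_sketch) and a local macro instead of the real rb_entry, purely to show the idea.

```c
#include <stddef.h>

/* Stand-in for struct rb_node. */
struct node_sketch {
	struct node_sketch *left, *right;
};

/* Stand-in for struct root_info: one object linked into two trees. */
struct root_info_sketch {
	unsigned long long root_id;
	struct node_sketch rb_node;	/* lookup tree linkage */
	struct node_sketch sort_node;	/* sorted tree linkage */
};

/* Recover the enclosing object from a pointer to one of its members. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static inline struct root_info_sketch *to_root_info(struct node_sketch *node)
{
	return container_of_sketch(node, struct root_info_sketch, rb_node);
}

static inline struct root_info_sketch *
to_root_info_sorted(struct node_sketch *node)
{
	return container_of_sketch(node, struct root_info_sketch, sort_node);
}
```

Having two named helpers also makes it harder to pass the wrong member name at a call site, which is easy to do when rb_entry is open-coded.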

Signed-off-by: Jeff Mahoney 
---
 btrfs-list.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/btrfs-list.c b/btrfs-list.c
index e01c5899..90c98be1 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -44,6 +44,16 @@ struct root_lookup {
struct rb_root root;
 };
 
+static inline struct root_info *to_root_info(struct rb_node *node)
+{
+   return rb_entry(node, struct root_info, rb_node);
+}
+
+static inline struct root_info *to_root_info_sorted(struct rb_node *node)
+{
+   return rb_entry(node, struct root_info, sort_node);
+}
+
 static struct {
char*name;
char*column_name;
@@ -309,7 +319,7 @@ static int sort_tree_insert(struct root_lookup *sort_tree,
 
while (*p) {
parent = *p;
-   curr = rb_entry(parent, struct root_info, sort_node);
+   curr = to_root_info_sorted(parent);
 
ret = sort_comp(ins, curr, comp_set);
if (ret < 0)
@@ -340,7 +350,7 @@ static int root_tree_insert(struct root_lookup *root_tree,
 
while(*p) {
parent = *p;
-   curr = rb_entry(parent, struct root_info, rb_node);
+   curr = to_root_info(parent);
 
ret = comp_entry_with_rootid(ins, curr, 0);
if (ret < 0)
@@ -371,7 +381,7 @@ static struct root_info *root_tree_search(struct root_lookup *root_tree,
tmp.root_id = root_id;
 
while(n) {
-   entry = rb_entry(n, struct root_info, rb_node);
+   entry = to_root_info(n);
 
ret = comp_entry_with_rootid(&tmp, entry, 0);
if (ret < 0)
@@ -528,7 +538,7 @@ static void free_root_info(struct rb_node *node)
 {
struct root_info *ri;
 
-   ri = rb_entry(node, struct root_info, rb_node);
+   ri = to_root_info(node);
free(ri->name);
free(ri->path);
free(ri->full_path);
@@ -1268,7 +1278,7 @@ static void filter_and_sort_subvol(struct root_lookup *all_subvols,
 
n = rb_last(&all_subvols->root);
while (n) {
-   entry = rb_entry(n, struct root_info, rb_node);
+   entry = to_root_info(n);
 
ret = resolve_root(all_subvols, entry, top_id);
if (ret == -ENOENT) {
@@ -1300,7 +1310,7 @@ static int list_subvol_fill_paths(int fd, struct root_lookup *root_lookup)
while (n) {
struct root_info *entry;
int ret;
-   entry = rb_entry(n, struct root_info, rb_node);
+   entry = to_root_info(n);
ret = lookup_ino_path(fd, entry);
if (ret && ret != -ENOENT)
return ret;
@@ -1467,7 +1477,7 @@ static void print_all_subvol_info(struct root_lookup *sorted_tree,
 
n = rb_first(&sorted_tree->root);
while (n) {
-   entry = rb_entry(n, struct root_info, sort_node);
+   entry = to_root_info_sorted(n);
 
/* The toplevel subvolume is not listed by default */
if (entry->root_id == BTRFS_FS_TREE_OBJECTID)
@@ -1558,7 +1568,7 @@ int btrfs_get_toplevel_subvol(int fd, struct root_info *the_ri)
return ret;
 
rbn = rb_first();
-   ri = rb_entry(rbn, struct root_info, rb_node);
+   ri = to_root_info(rbn);
 
if (ri->root_id != BTRFS_FS_TREE_OBJECTID)
return -ENOENT;
@@ -1590,7 +1600,7 @@ int btrfs_get_subvol(int fd, struct root_info *the_ri)
 
rbn = rb_first();
while(rbn) {
-   ri = rb_entry(rbn, struct root_info, rb_node);
+   ri = to_root_info(rbn);
rr = resolve_root(, ri, root_id);
if (rr == -ENOENT) {
ret = -ENOENT;
@@ -1814,7 +1824,7 @@ char *btrfs_list_path_for_root(int fd, u64 root)
while (n) {
struct root_info *entry;
 
-   entry = rb_entry(n, struct root_info, rb_node);
+   entry = to_root_info(n);
ret = resolve_root(_lookup, entry, top_id);
if (ret == -ENOENT && entry->root_id == root) {
ret_path = NULL;
-- 
2.12.3



[PATCH 08/20] btrfs-progs: qgroups: introduce btrfs_qgroup_query

2018-03-07 Thread jeffm
From: Jeff Mahoney 

The only mechanism we have in the progs for searching qgroups is to load
all of them and filter the results.  This works for qgroup show, but
it's pretty wasteful for adding quota information to 'btrfs subvolume show'.

This patch splits out setting up the search and performing the search so
we can search for a single qgroupid more easily.  Since TREE_SEARCH
will give results that don't strictly match the search terms, we add
a filter to match only the results we care about.
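
To illustrate why the extra filter is needed: as I understand it, the search walks items in compound (objectid, type, offset) order, so an item can fall between the minimum and maximum keys while one of its individual fields is outside the requested range. The sketch below uses hypothetical struct names and illustrative key values; it only shows the per-field check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for a decoded item key. */
struct key_sketch {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Stand-in for the min/max fields of the search key. */
struct key_range_sketch {
	uint64_t min_objectid, max_objectid;
	uint8_t min_type, max_type;
	uint64_t min_offset, max_offset;
};

/* Accept a key only if every field is inside its own [min, max] range. */
static bool key_in_range_sketch(const struct key_sketch *key,
				const struct key_range_sketch *r)
{
	return key->objectid >= r->min_objectid &&
	       key->objectid <= r->max_objectid &&
	       key->type >= r->min_type && key->type <= r->max_type &&
	       key->offset >= r->min_offset && key->offset <= r->max_offset;
}
```

For example, with a range of types 242..244 and offset 261..261, the key (0, 242, 300) sorts between (0, 242, 261) and (0, 244, 261) in compound order, yet its offset does not match, so the per-field check drops it.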

Signed-off-by: Jeff Mahoney 
---
 qgroup.c | 143 ---
 qgroup.h |   7 
 2 files changed, 116 insertions(+), 34 deletions(-)

diff --git a/qgroup.c b/qgroup.c
index 57815718..d076b1de 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -1165,11 +1165,30 @@ static inline void print_status_flag_warning(u64 flags)
warning("qgroup data inconsistent, rescan recommended");
 }
 
-static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
+static bool key_in_range(const struct btrfs_key *key,
+const struct btrfs_ioctl_search_key *sk)
+{
+   if (key->objectid < sk->min_objectid ||
+   key->objectid > sk->max_objectid)
+   return false;
+
+   if (key->type < sk->min_type ||
+   key->type > sk->max_type)
+   return false;
+
+   if (key->offset < sk->min_offset ||
+   key->offset > sk->max_offset)
+   return false;
+
+   return true;
+}
+
+static int __qgroups_search(int fd, struct btrfs_ioctl_search_args *args,
+   struct qgroup_lookup *qgroup_lookup)
 {
int ret;
-   struct btrfs_ioctl_search_args args;
-   struct btrfs_ioctl_search_key *sk = &args.key;
+   struct btrfs_ioctl_search_key *sk = &args->key;
+   struct btrfs_ioctl_search_key filter_key = args->key;
struct btrfs_ioctl_search_header *sh;
unsigned long off = 0;
unsigned int i;
@@ -1180,30 +1199,15 @@ static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
u64 qgroupid;
u64 qgroupid1;
 
-   memset(&args, 0, sizeof(args));
-
-   sk->tree_id = BTRFS_QUOTA_TREE_OBJECTID;
-   sk->max_type = BTRFS_QGROUP_RELATION_KEY;
-   sk->min_type = BTRFS_QGROUP_STATUS_KEY;
-   sk->max_objectid = (u64)-1;
-   sk->max_offset = (u64)-1;
-   sk->max_transid = (u64)-1;
-   sk->nr_items = 4096;
-
qgroup_lookup_init(qgroup_lookup);
 
while (1) {
-   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+   ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, args);
if (ret < 0) {
-   if (errno == ENOENT) {
-   error("can't list qgroups: quotas not enabled");
+   if (errno == ENOENT)
ret = -ENOTTY;
-   } else {
-   error("can't list qgroups: %s",
-  strerror(errno));
+   else
ret = -errno;
-   }
-
break;
}
 
@@ -1217,37 +1221,46 @@ static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
 * read the root_ref item it contains
 */
for (i = 0; i < sk->nr_items; i++) {
-   sh = (struct btrfs_ioctl_search_header *)(args.buf +
+   struct btrfs_key key;
+
+   sh = (struct btrfs_ioctl_search_header *)(args->buf +
  off);
off += sizeof(*sh);
 
-   switch (btrfs_search_header_type(sh)) {
+   key.objectid = btrfs_search_header_objectid(sh);
+   key.type = btrfs_search_header_type(sh);
+   key.offset = btrfs_search_header_offset(sh);
+
+   if (!key_in_range(&key, &filter_key))
+   goto next;
+
+   switch (key.type) {
case BTRFS_QGROUP_STATUS_KEY:
si = (struct btrfs_qgroup_status_item *)
-(args.buf + off);
+(args->buf + off);
flags = btrfs_stack_qgroup_status_flags(si);
 
print_status_flag_warning(flags);
break;
case BTRFS_QGROUP_INFO_KEY:
-   qgroupid = btrfs_search_header_offset(sh);
+   qgroupid = key.offset;
info = (struct btrfs_qgroup_info_item *)
-  (args.buf + off);
+  (args->buf + off);
 
  

[PATCH 17/20] btrfs-progs: add support for output formats

2018-03-07 Thread jeffm
From: Jeff Mahoney 

This adds a global --format option to request extended output formats
from each command.  Most of it is plumbing a new cmd_context structure
that's established at the beginning of argument parsing into the command
callbacks.  That structure currently only contains the output mode enum.

We currently only support text mode.  Command help reports which
output formats are available for each command.  Global help reports
which formats are valid.

If an invalid format is requested, an error is reported and the global
usage, which lists the valid formats, is dumped.

Each command sets a bitmask that describes which formats it is capable
of outputting.  If a globally valid format is requested of a command
that doesn't support it, an error is reported and command usage dumped.

Commands don't need to specify that they support text output.  All
commands are required to output text.
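
The capability check can be sketched like this. The enum values, macro and struct names here are hypothetical stand-ins, not the definitions from commands.h.

```c
#include <stdbool.h>

/* Hypothetical output modes; text is mode 0 and always supported. */
enum output_mode_sketch {
	OUTPUT_TEXT_SKETCH = 0,
	OUTPUT_JSON_SKETCH,
	OUTPUT_JSON_COMPAT_SKETCH,
	OUTPUT_MAX_SKETCH
};

/* One bit per non-text output mode. */
#define OUTPUT_FLAG_SKETCH(mode) (1U << (mode))

struct fmt_command_sketch {
	const char *token;
	unsigned int format_flags;	/* formats beyond plain text */
};

/* Text needs no flag; anything else must be advertised in the bitmask. */
static bool command_provides_format_sketch(const struct fmt_command_sketch *cmd,
					   enum output_mode_sketch mode)
{
	if (mode == OUTPUT_TEXT_SKETCH)
		return true;
	return (OUTPUT_FLAG_SKETCH(mode) & cmd->format_flags) != 0;
}
```

Because text is mode 0 and short-circuits the check, a command that sets no flags at all still passes for the default output.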

Signed-off-by: Jeff Mahoney 
---
 btrfs-debug-tree.c|   3 +-
 btrfs.c   | 110 ++
 check/main.c  |   3 +-
 cmds-balance.c|  16 +--
 cmds-device.c |  31 +
 cmds-fi-du.c  |   1 +
 cmds-fi-usage.c   |   1 +
 cmds-filesystem.c |  14 --
 cmds-inspect-dump-super.c |   1 +
 cmds-inspect-dump-tree.c  |   1 +
 cmds-inspect-tree-stats.c |   1 +
 cmds-inspect.c|  10 -
 cmds-property.c   |   6 ++-
 cmds-qgroup.c |  17 +--
 cmds-quota.c  |  14 --
 cmds-receive.c|   3 +-
 cmds-replace.c|   8 +++-
 cmds-rescue.c |   9 +++-
 cmds-restore.c|   3 +-
 cmds-scrub.c  |  21 ++---
 cmds-send.c   |   3 +-
 cmds-subvolume.c  |  24 +++---
 commands.h|  32 +++---
 help.c|  51 -
 help.h|   2 +
 25 files changed, 301 insertions(+), 84 deletions(-)

diff --git a/btrfs-debug-tree.c b/btrfs-debug-tree.c
index 49a2e949..8cd05d53 100644
--- a/btrfs-debug-tree.c
+++ b/btrfs-debug-tree.c
@@ -26,6 +26,7 @@ int main(int argc, char **argv)
 {
const struct cmd_struct *cmd = &cmd_struct_inspect_dump_tree;
int ret;
+   struct cmd_context cmdcxt = {};
 
set_argv0(argv);
 
@@ -34,7 +35,7 @@ int main(int argc, char **argv)
 
radix_tree_init();
 
-   ret = cmd_execute(cmd, argc, argv);
+   ret = cmd_execute(cmd, &cmdcxt, argc, argv);
 
btrfs_close_all_devices();
 
diff --git a/btrfs.c b/btrfs.c
index 49128182..32b8e090 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -26,7 +26,7 @@
 #include "help.h"
 
 static const char * const btrfs_cmd_group_usage[] = {
-   "btrfs [--help] [--version]  [...]  []",
+   "btrfs [--help] [--version] [--format ]  [...] 
 []",
NULL
 };
 
@@ -98,13 +98,36 @@ parse_command_token(const char *arg, const struct cmd_group *grp)
return cmd;
 }
 
+static bool cmd_provides_output_format(const struct cmd_struct *cmd,
+  const struct cmd_context *cmdcxt)
+{
+   if (!cmdcxt->output_mode)
+   return true;
+
+   return (1 << cmdcxt->output_mode) & cmd->cmd_format_flags;
+}
+
 static void handle_help_options_next_level(const struct cmd_struct *cmd,
-   int argc, char **argv)
+  const struct cmd_context *cmdcxt,
+  int argc, char **argv)
 {
+   int err = 0;
+
if (argc < 2)
return;
 
-   if (!strcmp(argv[1], "--help")) {
+   /* Check if the command can provide the requested output format */
+   if (!cmd->next && !cmd_provides_output_format(cmd, cmdcxt)) {
+   ASSERT(cmdcxt->output_mode >= 0);
+   ASSERT(cmdcxt->output_mode < CMD_OUTPUT_MAX);
+   fprintf(stderr,
+   "error: %s output is unsupported for this command.\n\n",
+   cmd_outputs[cmdcxt->output_mode]);
+
+   err = 1;
+   }
+
+   if (!strcmp(argv[1], "--help") || err) {
if (cmd->next) {
argc--;
argv++;
@@ -113,12 +136,13 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
usage_command(cmd, true, false);
}
 
-   exit(0);
+   exit(err);
}
 }
 
-int handle_command_group(const struct cmd_group *grp, int argc,
-char **argv)
+int handle_command_group(const struct cmd_group *grp,
+const struct cmd_context *cmdcxt,
+int argc, char **argv)
 
 {
const struct cmd_struct *cmd;
@@ -132,10 +156,10 @@ int handle_command_group(const struct cmd_group *grp, int argc,
 
cmd = parse_command_token(argv[0], grp);
 
-   

[PATCH 19/20] btrfs-progs: qgroups: add json output for usage command

2018-03-07 Thread jeffm
From: Jeff Mahoney 

One of the common requests I receive is for 'df' like facilities
for subvolume usage.  Really, the request is for monitoring tools to be
able to understand when subvolumes may be approaching quota in the same
manner traditional file systems approach ENOSPC.

This patch allows us to export the qgroups data in a machine-readable
format so that monitoring tools can parse it easily.
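
The json:compat mode appears to emit each 64-bit value as a two-element array of 32-bit halves, presumably for consumers that cannot represent 64-bit integers exactly. The arithmetic is sketched below without any JSON library; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* High and low 32-bit halves of a 64-bit counter. */
struct u64_halves_sketch {
	uint32_t high;
	uint32_t low;
};

/* Split a value the way a [high, low] compat array would carry it. */
static struct u64_halves_sketch split_u64_sketch(uint64_t value)
{
	struct u64_halves_sketch h = {
		.high = (uint32_t)(value >> 32),
		.low = (uint32_t)(value & 0xffffffffULL),
	};

	return h;
}

/* What a consumer would do to rebuild the original value. */
static uint64_t join_u64_sketch(struct u64_halves_sketch h)
{
	return ((uint64_t)h.high << 32) | h.low;
}
```

A monitoring tool reading the compat output would apply the join step to each pair before comparing usage against limits.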

Signed-off-by: Jeff Mahoney 
---
 Documentation/btrfs-qgroup.asciidoc |   3 +
 cmds-qgroup.c   |  22 +++-
 qgroup.c| 222 
 qgroup.h|   8 +-
 4 files changed, 251 insertions(+), 4 deletions(-)

diff --git a/Documentation/btrfs-qgroup.asciidoc b/Documentation/btrfs-qgroup.asciidoc
index 360b3269..7863a4d9 100644
--- a/Documentation/btrfs-qgroup.asciidoc
+++ b/Documentation/btrfs-qgroup.asciidoc
@@ -87,6 +87,9 @@ the btrfs filesystem identified by .
 *show* [options] ::
 Show all qgroups in the btrfs filesystem identified by .
 +
+If enabled, this command supports extended output in json and json:compat modes.
+Use of either json mode implies -P and --raw.
++
 `Options`
 +
 -p
diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index f9e81e30..fd637a45 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -301,6 +301,10 @@ static const char * const cmd_qgroup_show_usage[] = {
"   you can use '+' or '-' in front of each item.",
"   (+:ascending, -:descending, ascending default)",
"--sync force sync of the filesystem before getting info",
+   "",
+#ifdef HAVE_JSON
+   "json and json:compat output implies -P and --raw.",
+#endif
NULL
 };
 
@@ -386,6 +390,10 @@ static int cmd_qgroup_show(const struct cmd_struct *cmd,
}
btrfs_qgroup_setup_units(unit_mode);
 
+   if (cmdcxt->output_mode == CMD_OUTPUT_JSON ||
+   cmdcxt->output_mode == CMD_OUTPUT_JSON_COMPAT)
+   unit_mode = UNITS_RAW;
+
if (check_argc_exact(argc - optind, 1))
usage(cmd);
 
@@ -420,7 +428,15 @@ static int cmd_qgroup_show(const struct cmd_struct *cmd,
BTRFS_QGROUP_FILTER_PARENT,
qgroupid);
}
-   ret = btrfs_show_qgroups(fd, filter_set, comparer_set, verbose);
+
+   if (cmdcxt->output_mode == CMD_OUTPUT_JSON ||
+   cmdcxt->output_mode == CMD_OUTPUT_JSON_COMPAT) {
+   bool compat = (cmdcxt->output_mode == CMD_OUTPUT_JSON_COMPAT);
+
+   ret = btrfs_qgroup_output_json(fd, filter_set, comparer_set,
+  compat);
+   } else
+   ret = btrfs_show_qgroups(fd, filter_set, comparer_set, verbose);
close_file_or_dir(fd, dirstream);
free(filter_set);
free(comparer_set);
@@ -428,7 +444,9 @@ static int cmd_qgroup_show(const struct cmd_struct *cmd,
 out:
return !!ret;
 }
-static DEFINE_SIMPLE_COMMAND(qgroup_show, "show");
+static DEFINE_COMMAND(qgroup_show, "show", cmd_qgroup_show,
+ cmd_qgroup_show_usage, NULL, 0,
+ CMD_OUTPUT_FLAG(JSON)|CMD_OUTPUT_FLAG(JSON_COMPAT));
 
 static const char * const cmd_qgroup_limit_usage[] = {
"btrfs qgroup limit [options] |none [] ",
diff --git a/qgroup.c b/qgroup.c
index d076b1de..95d443db 100644
--- a/qgroup.c
+++ b/qgroup.c
@@ -16,12 +16,16 @@
  * Boston, MA 021110-1307, USA.
  */
 
+#include "version.h"
 #include "qgroup.h"
 #include 
 #include "ctree.h"
 #include "ioctl.h"
 #include "utils.h"
 #include 
+#ifdef HAVE_JSON
+#include 
+#endif
 
 #define BTRFS_QGROUP_NFILTERS_INCREASE (2 * BTRFS_QGROUP_FILTER_MAX)
 #define BTRFS_QGROUP_NCOMPS_INCREASE (2 * BTRFS_QGROUP_COMP_MAX)
@@ -1398,6 +1402,224 @@ int btrfs_show_qgroups(int fd,
return ret;
 }
 
+#ifdef HAVE_JSON
+#define QGROUPID_FORMAT_BUF_LEN (20 + 20 + 1 + 1)
+static void format_qgroupid(char *buf, size_t size, u64 qgroupid)
+{
+   int ret;
+
+   ret = snprintf(buf, size, "%llu/%llu",
+  btrfs_qgroup_level(qgroupid),
+  btrfs_qgroup_subvid(qgroupid));
+   ASSERT(ret < size);
+}
+
+static json_object *export_one_u64(u64 value, bool compat)
+{
+   json_object *array, *tmp;
+
+   if (!compat)
+   return json_object_new_int64(value);
+
+   array = json_object_new_array();
+   if (!array)
+   return NULL;
+
+   tmp = json_object_new_int(value >> 32);
+   if (!tmp)
+   goto failure;
+   json_object_array_add(array, tmp);
+
+   tmp = json_object_new_int(value & 0xffffffff);
+   if (!tmp)
+   goto failure;
+   json_object_array_add(array, tmp);
+
+   return array;
+failure:
+   json_object_put(array);
+   return NULL;
+}
+
+static bool export_one_qgroup(json_object *container,
+

[PATCH 20/20] btrfs-progs: handle command groups directly for common case

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Most command groups just pass their own command group to
handle_command_group.  We can remove the explicit definitions
of command group callbacks by passing the cmd_struct to
handle_command_group and allowing it to resolve the group from it.
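
A rough sketch of the idea follows, using made-up type and function names. The group command's callback is the shared dispatcher itself, and it finds its subcommands through the next pointer of the cmd_struct it was handed.

```c
#include <stddef.h>
#include <string.h>

struct grp_sketch;

/* Stand-in for cmd_struct: 'next' is non-NULL for command groups. */
struct gcmd_sketch {
	const char *token;
	int (*fn)(const struct gcmd_sketch *cmd, int argc, char **argv);
	const struct grp_sketch *next;
};

/* Stand-in for cmd_group: NULL-terminated array of commands. */
struct grp_sketch {
	const struct gcmd_sketch * const *commands;
};

/* Shared dispatcher: resolve the group from the command itself. */
static int dispatch_group_sketch(const struct gcmd_sketch *cmd,
				 int argc, char **argv)
{
	size_t i;

	argc--;
	argv++;
	if (argc < 1 || !cmd->next)
		return -1;

	for (i = 0; cmd->next->commands[i]; i++) {
		const struct gcmd_sketch *sub = cmd->next->commands[i];

		if (!strcmp(argv[0], sub->token))
			return sub->fn(sub, argc, argv);
	}
	return -1;
}

/* Example leaf command: returns the argc it was given. */
static int show_fn_sketch(const struct gcmd_sketch *cmd, int argc, char **argv)
{
	(void)cmd;
	(void)argv;
	return argc;
}

static const struct gcmd_sketch show_cmd_sketch = { "show", show_fn_sketch, NULL };
static const struct gcmd_sketch * const quota_cmds_sketch[] = {
	&show_cmd_sketch, NULL
};
static const struct grp_sketch quota_group_sketch = { quota_cmds_sketch };
/* No per-group wrapper is needed: the dispatcher is the callback. */
static const struct gcmd_sketch quota_cmd_sketch = {
	"quota", dispatch_group_sketch, &quota_group_sketch
};
```

Groups that need special handling before dispatch, such as balance in this patch, can still supply their own callback and call the dispatcher at the end.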

Signed-off-by: Jeff Mahoney 
---
 btrfs.c   | 14 +++---
 cmds-balance.c|  6 +++---
 cmds-device.c |  5 -
 cmds-filesystem.c |  6 --
 cmds-inspect.c|  5 -
 cmds-property.c   |  6 --
 cmds-qgroup.c |  5 -
 cmds-quota.c  |  5 -
 cmds-replace.c|  5 -
 cmds-rescue.c |  5 -
 cmds-scrub.c  |  6 --
 cmds-subvolume.c  |  5 -
 commands.h|  7 ---
 13 files changed, 14 insertions(+), 66 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 32b8e090..427e14c8 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -140,26 +140,26 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
}
 }
 
-int handle_command_group(const struct cmd_group *grp,
+int handle_command_group(const struct cmd_struct *cmd,
 const struct cmd_context *cmdcxt,
 int argc, char **argv)
 
 {
-   const struct cmd_struct *cmd;
+   const struct cmd_struct *subcmd;
 
argc--;
argv++;
if (argc < 1) {
-   usage_command_group(grp, false, false);
+   usage_command_group(cmd->next, false, false);
exit(1);
}
 
-   cmd = parse_command_token(argv[0], grp);
+   subcmd = parse_command_token(argv[0], cmd->next);
 
-   handle_help_options_next_level(cmd, cmdcxt, argc, argv);
+   handle_help_options_next_level(subcmd, cmdcxt, argc, argv);
 
-   fixup_argv0(argv, cmd->token);
-   return cmd_execute(cmd, cmdcxt, argc, argv);
+   fixup_argv0(argv, subcmd->token);
+   return cmd_execute(subcmd, cmdcxt, argc, argv);
 }
 
 static const struct cmd_group btrfs_cmd_group;
diff --git a/cmds-balance.c b/cmds-balance.c
index c17b9ee3..e414ca27 100644
--- a/cmds-balance.c
+++ b/cmds-balance.c
@@ -943,7 +943,7 @@ static const struct cmd_group balance_cmd_group = {
}
 };
 
-static int cmd_balance(const struct cmd_struct *unused,
+static int cmd_balance(const struct cmd_struct *cmd,
   const struct cmd_context *cmdcxt, int argc, char **argv)
 {
if (argc == 2 && strcmp("start", argv[1]) != 0) {
@@ -956,7 +956,7 @@ static int cmd_balance(const struct cmd_struct *unused,
return do_balance(argv[1], , 0);
}
 
-   return handle_command_group(&balance_cmd_group, cmdcxt, argc, argv);
+   return handle_command_group(cmd, cmdcxt, argc, argv);
 }
 
-DEFINE_GROUP_COMMAND(balance, "balance");
+DEFINE_COMMAND(balance, "balance", cmd_balance, NULL, &balance_cmd_group, 0, 0);
diff --git a/cmds-device.c b/cmds-device.c
index ac9e82b1..f8c0ff20 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -630,9 +630,4 @@ static const struct cmd_group device_cmd_group = {
}
 };
 
-static int cmd_device(const struct cmd_struct *unused,
- const struct cmd_context *cmdcxt, int argc, char **argv)
-{
-   return handle_command_group(&device_cmd_group, cmdcxt, argc, argv);
-}
 DEFINE_GROUP_COMMAND_TOKEN(device);
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 27e0865c..e3e54864 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -1242,10 +1242,4 @@ static const struct cmd_group filesystem_cmd_group = {
}
 };
 
-static int cmd_filesystem(const struct cmd_struct *unused,
- const struct cmd_context *cmdcxt,
- int argc, char **argv)
-{
-   return handle_command_group(&filesystem_cmd_group, cmdcxt, argc, argv);
-}
 DEFINE_GROUP_COMMAND_TOKEN(filesystem);
diff --git a/cmds-inspect.c b/cmds-inspect.c
index ade9db7e..561a0fbd 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -658,9 +658,4 @@ static const struct cmd_group inspect_cmd_group = {
}
 };
 
-static int cmd_inspect(const struct cmd_struct *unused,
-  const struct cmd_context *cmdcxt, int argc, char **argv)
-{
-   return handle_command_group(&inspect_cmd_group, cmdcxt, argc, argv);
-}
 DEFINE_GROUP_COMMAND(inspect, "inspect");
diff --git a/cmds-property.c b/cmds-property.c
index 498fa456..58f6c48a 100644
--- a/cmds-property.c
+++ b/cmds-property.c
@@ -422,10 +422,4 @@ static const struct cmd_group property_cmd_group = {
}
 };
 
-static int cmd_property(const struct cmd_struct *unused,
-   const struct cmd_context *cmdcxt,
-   int argc, char **argv)
-{
-   return handle_command_group(&property_cmd_group, cmdcxt, argc, argv);
-}
 DEFINE_GROUP_COMMAND_TOKEN(property);
diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index fd637a45..e20c1159 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -558,9 +558,4 @@ static const struct cmd_group qgroup_cmd_group = {
}
 };
 
-static int cmd_qgroup(const struct 

[PATCH 13/20] btrfs-progs: use cmd_struct as command entry point

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Rather than having global command usage and callbacks used to create
cmd_structs in the command array, establish the cmd_struct structures
separately and use those.  The next commit in the series passes the
cmd_struct to the command callbacks such that we can access flags
and determine which of several potential commands we were called as.

This establishes several macros to more easily define the commands
within each command's source.
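
As a simplified illustration of the macro approach (not the actual definitions from commands.h; every name ending in _sketch is hypothetical), a definition macro can paste the command name into the callback, usage array and resulting structure names:

```c
#include <stddef.h>

/* Simplified stand-in for cmd_struct at this point in the series. */
struct scmd_sketch {
	const char *token;
	int (*fn)(int argc, char **argv);
	const char * const *usagestr;
};

/* Build cmd_struct_<name> from cmd_<name> and cmd_<name>_usage. */
#define DEFINE_SIMPLE_COMMAND_SKETCH(name, tok)			\
	const struct scmd_sketch cmd_struct_##name = {		\
		.token = (tok),					\
		.fn = cmd_##name,				\
		.usagestr = cmd_##name##_usage,			\
	}

/* Single entry point, so callers never touch ->fn directly. */
static inline int cmd_execute_sketch(const struct scmd_sketch *cmd,
				     int argc, char **argv)
{
	return cmd->fn(argc, argv);
}

/* Example command defined with the macro. */
static const char * const cmd_hello_usage[] = {
	"hello",
	"Do nothing and return argc",
	NULL
};

static int cmd_hello(int argc, char **argv)
{
	(void)argv;
	return argc;
}

static DEFINE_SIMPLE_COMMAND_SKETCH(hello, "hello");
```

Routing every call through one execute helper is what lets a later patch change the callback signature without touching each caller again.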

Signed-off-by: Jeff Mahoney 
---
 btrfs-calc-size.c |   5 +-
 btrfs-debug-tree.c|   5 +-
 btrfs-show-super.c|   6 +--
 btrfs.c   |  48 ++
 check/main.c  |   5 +-
 cmds-balance.c|  27 ++
 cmds-device.c |  31 ++-
 cmds-fi-du.c  |   5 +-
 cmds-fi-usage.c   |   5 +-
 cmds-filesystem.c |  52 ++-
 cmds-inspect-dump-super.c |   5 +-
 cmds-inspect-dump-tree.c  |   5 +-
 cmds-inspect-tree-stats.c |   5 +-
 cmds-inspect.c|  36 +++--
 cmds-property.c   |  19 +++
 cmds-qgroup.c |  31 +--
 cmds-quota.c  |  17 ---
 cmds-receive.c|   5 +-
 cmds-replace.c|  19 +++
 cmds-rescue.c |  22 
 cmds-restore.c|   5 +-
 cmds-scrub.c  |  20 +---
 cmds-send.c   |   6 +--
 cmds-subvolume.c  |  38 --
 commands.h| 127 +++---
 help.c|  21 +---
 26 files changed, 326 insertions(+), 244 deletions(-)

diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
index d2d68ab2..908e830f 100644
--- a/btrfs-calc-size.c
+++ b/btrfs-calc-size.c
@@ -23,15 +23,16 @@
 
 int main(int argc, char **argv)
 {
+   const struct cmd_struct *cmd = &cmd_struct_inspect_tree_stats;
int ret;
 
warning(
 "\nthe tool has been deprecated, please use 'btrfs inspect-internal 
tree-stats' instead\n");
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd_inspect_tree_stats_usage);
+   usage(cmd->usagestr);
 
-   ret = cmd_inspect_tree_stats(argc, argv);
+   ret = cmd_execute(cmd, argc, argv);
 
btrfs_close_all_devices();
 
diff --git a/btrfs-debug-tree.c b/btrfs-debug-tree.c
index 7bee018f..7f254519 100644
--- a/btrfs-debug-tree.c
+++ b/btrfs-debug-tree.c
@@ -24,16 +24,17 @@
 
 int main(int argc, char **argv)
 {
+   const struct cmd_struct *cmd = &cmd_struct_inspect_dump_tree;
int ret;
 
set_argv0(argv);
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd_inspect_dump_tree_usage);
+   usage(cmd->usagestr);
 
radix_tree_init();
 
-   ret = cmd_inspect_dump_tree(argc, argv);
+   ret = cmd_execute(cmd, argc, argv);
 
btrfs_close_all_devices();
 
diff --git a/btrfs-show-super.c b/btrfs-show-super.c
index 4273e42d..ee717c33 100644
--- a/btrfs-show-super.c
+++ b/btrfs-show-super.c
@@ -22,7 +22,7 @@
 
 int main(int argc, char **argv)
 {
-
+   const struct cmd_struct *cmd = &cmd_struct_inspect_dump_super;
int ret;
 
set_argv0(argv);
@@ -31,9 +31,9 @@ int main(int argc, char **argv)
 "\nthe tool has been deprecated, please use 'btrfs inspect-internal 
dump-super' instead\n");
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd_inspect_dump_super_usage);
+   usage(cmd->usagestr);
 
-   ret = cmd_inspect_dump_super(argc, argv);
+   ret = cmd_execute(cmd, argc, argv);
 
return ret;
 }
diff --git a/btrfs.c b/btrfs.c
index fec1a135..1e68b0c0 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -42,10 +42,11 @@ static inline const char *skip_prefix(const char *str, const char *prefix)
 static int parse_one_token(const char *arg, const struct cmd_group *grp,
   const struct cmd_struct **cmd_ret)
 {
-   const struct cmd_struct *cmd = grp->commands;
const struct cmd_struct *abbrev_cmd = NULL, *ambiguous_cmd = NULL;
+   int i = 0;
 
-   for (; cmd->token; cmd++) {
+   for (i = 0; grp->commands[i]; i++) {
+   const struct cmd_struct *cmd = grp->commands[i];
const char *rest;
 
rest = skip_prefix(arg, cmd->token);
@@ -134,7 +135,7 @@ int handle_command_group(const struct cmd_group *grp, int argc,
handle_help_options_next_level(cmd, argc, argv);
 
fixup_argv0(argv, cmd->token);
-   return cmd->fn(argc, argv);
+   return cmd_execute(cmd, argc, argv);
 }
 
 static const struct cmd_group btrfs_cmd_group;
@@ -153,6 +154,8 @@ static int cmd_help(int argc, char **argv)
return 0;
 }
 
+static DEFINE_SIMPLE_COMMAND(help, "help");
+
 static const char * const cmd_version_usage[] = {
"btrfs version",
"Display btrfs-progs version",
@@ -164,6 +167,7 @@ static int cmd_version(int 

[PATCH 14/20] btrfs-progs: pass cmd_struct to command callback function

2018-03-07 Thread jeffm
From: Jeff Mahoney 

This patch passes the cmd_struct to the command callback function.  This
has several purposes: It allows the command callback to identify which
command was used to call it.  It also gives us direct access to the
usage associated with that command.
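
As a rough illustration of the calling convention described above, the following is a minimal sketch, not the actual btrfs-progs code. The struct layout is a simplified stand-in, `cmd_demo`, `demo_usage` and `cmd_struct_demo` are hypothetical names, and the `cmd_execute()` signature is assumed from how the call sites in this series use it.

```c
#include <stdio.h>

/* Simplified, hypothetical stand-in for the cmd_struct in commands.h. */
struct cmd_struct {
	const char *token;
	int (*fn)(const struct cmd_struct *cmd, int argc, char **argv);
	const char * const *usagestr;
};

/* Hypothetical usage array for an illustrative command. */
static const char * const demo_usage[] = {
	"demo <path>",
	"Illustrative command, not part of btrfs-progs",
	NULL
};

/*
 * The callback receives its own cmd_struct, so it can tell which
 * command invoked it and reach the usage text without a global.
 */
static int cmd_demo(const struct cmd_struct *cmd, int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s\n", cmd->usagestr[0]);
		return 1;
	}
	printf("%s: %s\n", cmd->token, argv[1]);
	return 0;
}

static const struct cmd_struct cmd_struct_demo = {
	.token = "demo",
	.fn = cmd_demo,
	.usagestr = demo_usage,
};

/* Assumed shape of the dispatch helper: forward the struct to its callback. */
static int cmd_execute(const struct cmd_struct *cmd, int argc, char **argv)
{
	return cmd->fn(cmd, argc, argv);
}
```

With this shape, a caller only needs a pointer to the command's struct; the callback and the usage text travel together.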

Signed-off-by: Jeff Mahoney 
---
 btrfs.c   |  8 
 check/main.c  |  2 +-
 cmds-balance.c| 19 ---
 cmds-device.c | 35 +++
 cmds-fi-du.c  |  3 ++-
 cmds-fi-usage.c   |  3 ++-
 cmds-filesystem.c | 24 
 cmds-inspect-dump-super.c |  3 ++-
 cmds-inspect-dump-tree.c  |  3 ++-
 cmds-inspect-tree-stats.c |  3 ++-
 cmds-inspect.c| 17 +++--
 cmds-property.c   | 12 
 cmds-qgroup.c | 18 +++---
 cmds-quota.c  |  9 +
 cmds-receive.c|  2 +-
 cmds-replace.c| 11 +++
 cmds-rescue.c | 14 +-
 cmds-restore.c|  2 +-
 cmds-scrub.c  | 23 +++
 cmds-send.c   |  2 +-
 cmds-subvolume.c  | 27 +--
 commands.h|  4 ++--
 22 files changed, 146 insertions(+), 98 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 1e68b0c0..49128182 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -148,7 +148,7 @@ static const char * const cmd_help_usage[] = {
NULL
 };
 
-static int cmd_help(int argc, char **argv)
+static int cmd_help(const struct cmd_struct *unused, int argc, char **argv)
 {
help_command_group(_cmd_group, argc, argv);
return 0;
@@ -162,7 +162,7 @@ static const char * const cmd_version_usage[] = {
NULL
 };
 
-static int cmd_version(int argc, char **argv)
+static int cmd_version(const struct cmd_struct *unused, int argc, char **argv)
 {
printf("%s\n", PACKAGE_STRING);
return 0;
@@ -231,13 +231,13 @@ void handle_special_globals(int shift, int argc, char **argv)
if (has_full)
usage_command_group(_cmd_group, true, false);
else
-   cmd_help(argc, argv);
+   cmd_execute(_struct_help, argc, argv);
exit(0);
}
 
for (i = 0; i < shift; i++)
if (strcmp(argv[i], "--version") == 0) {
-   cmd_version(argc, argv);
+   cmd_execute(_struct_version, argc, argv);
exit(0);
}
 }
diff --git a/check/main.c b/check/main.c
index 4b8f7678..bd31fb9f 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9440,7 +9440,7 @@ static const char * const cmd_check_usage[] = {
NULL
 };
 
-static int cmd_check(int argc, char **argv)
+static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
 {
struct cache_tree root_cache;
struct btrfs_root *root;
diff --git a/cmds-balance.c b/cmds-balance.c
index 7a60be61..1bd7b3ce 100644
--- a/cmds-balance.c
+++ b/cmds-balance.c
@@ -515,7 +515,8 @@ static const char * const cmd_balance_start_usage[] = {
NULL
 };
 
-static int cmd_balance_start(int argc, char **argv)
+static int cmd_balance_start(const struct cmd_struct *cmd,
+int argc, char **argv)
 {
struct btrfs_ioctl_balance_args args;
struct btrfs_balance_args *ptrs[] = { , ,
@@ -680,7 +681,8 @@ static const char * const cmd_balance_pause_usage[] = {
NULL
 };
 
-static int cmd_balance_pause(int argc, char **argv)
+static int cmd_balance_pause(const struct cmd_struct *cmd,
+int argc, char **argv)
 {
const char *path;
int fd;
@@ -719,7 +721,8 @@ static const char * const cmd_balance_cancel_usage[] = {
NULL
 };
 
-static int cmd_balance_cancel(int argc, char **argv)
+static int cmd_balance_cancel(const struct cmd_struct *cmd,
+ int argc, char **argv)
 {
const char *path;
int fd;
@@ -758,7 +761,8 @@ static const char * const cmd_balance_resume_usage[] = {
NULL
 };
 
-static int cmd_balance_resume(int argc, char **argv)
+static int cmd_balance_resume(const struct cmd_struct *cmd,
+ int argc, char **argv)
 {
struct btrfs_ioctl_balance_args args;
const char *path;
@@ -826,7 +830,8 @@ static const char * const cmd_balance_status_usage[] = {
  *   1 : Successful to know status of a pending balance
  *   0 : When there is no pending balance or completed
  */
-static int cmd_balance_status(int argc, char **argv)
+static int cmd_balance_status(const struct cmd_struct *cmd,
+ int argc, char **argv)
 {
struct btrfs_ioctl_balance_args args;
const char *path;
@@ -904,7 +909,7 @@ out:
 }
 static DEFINE_SIMPLE_COMMAND(balance_status, "status");
 
-static int 

[PATCH 16/20] btrfs-progs: pass cmd_struct to usage()

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Now that every call site has a cmd_struct, we can just pass the cmd_struct
to usage() to print the usage information.  This allows us to interpret
the format flags we'll add later in this series to inform the user of
which output formats any given command supports.

Signed-off-by: Jeff Mahoney 
---
 btrfs-calc-size.c |  2 +-
 btrfs-debug-tree.c|  2 +-
 btrfs-show-super.c|  2 +-
 check/main.c  |  4 ++--
 cmds-balance.c| 14 +++---
 cmds-device.c | 19 +--
 cmds-fi-du.c  |  4 ++--
 cmds-fi-usage.c   |  4 ++--
 cmds-filesystem.c | 18 +-
 cmds-inspect-dump-super.c |  4 ++--
 cmds-inspect-dump-tree.c  |  4 ++--
 cmds-inspect-tree-stats.c |  4 ++--
 cmds-inspect.c| 16 
 cmds-property.c   | 22 +-
 cmds-qgroup.c | 18 +-
 cmds-quota.c  |  8 
 cmds-receive.c|  4 ++--
 cmds-replace.c| 12 ++--
 cmds-rescue.c | 12 ++--
 cmds-restore.c|  8 
 cmds-scrub.c  | 25 ++---
 cmds-send.c   |  2 +-
 cmds-subvolume.c  | 30 +++---
 help.c|  6 +++---
 help.h|  2 +-
 25 files changed, 118 insertions(+), 128 deletions(-)

diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
index 908e830f..4c8fcc19 100644
--- a/btrfs-calc-size.c
+++ b/btrfs-calc-size.c
@@ -30,7 +30,7 @@ int main(int argc, char **argv)
 "\nthe tool has been deprecated, please use 'btrfs inspect-internal 
tree-stats' instead\n");
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd->usagestr);
+   usage(cmd);
 
ret = cmd_execute(cmd, argc, argv);
 
diff --git a/btrfs-debug-tree.c b/btrfs-debug-tree.c
index 7f254519..49a2e949 100644
--- a/btrfs-debug-tree.c
+++ b/btrfs-debug-tree.c
@@ -30,7 +30,7 @@ int main(int argc, char **argv)
set_argv0(argv);
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd->usagestr);
+   usage(cmd);
 
radix_tree_init();
 
diff --git a/btrfs-show-super.c b/btrfs-show-super.c
index ee717c33..b4a5b693 100644
--- a/btrfs-show-super.c
+++ b/btrfs-show-super.c
@@ -31,7 +31,7 @@ int main(int argc, char **argv)
 "\nthe tool has been deprecated, please use 'btrfs inspect-internal 
dump-super' instead\n");
 
if (argc > 1 && !strcmp(argv[1], "--help"))
-   usage(cmd->usagestr);
+   usage(cmd);
 
ret = cmd_execute(cmd, argc, argv);
 
diff --git a/check/main.c b/check/main.c
index bd31fb9f..0bb8633a 100644
--- a/check/main.c
+++ b/check/main.c
@@ -9530,7 +9530,7 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
break;
case '?':
case 'h':
-   usage(cmd_check_usage);
+   usage(cmd);
case GETOPT_VAL_REPAIR:
printf("enabling repair mode\n");
repair = 1;
@@ -9581,7 +9581,7 @@ static int cmd_check(const struct cmd_struct *cmd, int argc, char **argv)
}
 
if (check_argc_exact(argc - optind, 1))
-   usage(cmd_check_usage);
+   usage(cmd);
 
if (ctx.progress_enabled) {
ctx.tp = TASK_NOTHING;
diff --git a/cmds-balance.c b/cmds-balance.c
index 488fffcc..c639459f 100644
--- a/cmds-balance.c
+++ b/cmds-balance.c
@@ -585,12 +585,12 @@ static int cmd_balance_start(const struct cmd_struct *cmd,
background = 1;
break;
default:
-   usage(cmd_balance_start_usage);
+   usage(cmd);
}
}
 
if (check_argc_exact(argc - optind, 1))
-   usage(cmd_balance_start_usage);
+   usage(cmd);
 
/*
 * allow -s only under --force, otherwise do with system chunks
@@ -692,7 +692,7 @@ static int cmd_balance_pause(const struct cmd_struct *cmd,
clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
-   usage(cmd_balance_pause_usage);
+   usage(cmd);
 
path = argv[optind];
 
@@ -732,7 +732,7 @@ static int cmd_balance_cancel(const struct cmd_struct *cmd,
clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
-   usage(cmd_balance_cancel_usage);
+   usage(cmd);
 
path = argv[optind];
 
@@ -773,7 +773,7 @@ static int cmd_balance_resume(const struct cmd_struct *cmd,
clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
-

[PATCH 18/20] btrfs-progs: add generic support for json output

2018-03-07 Thread jeffm
From: Jeff Mahoney 

This patch adds support for JSON and JSON-compat output.  The latter is
intended to be compatible with JavaScript, where integers are represented
as 64-bit floats with only 53 bits usable for the integer component.
Compat mode output will emit 64-bit integers as an array of two 32-bit
integers in [high, low] format.

This patch also adds support for reporting which output formats a
command supports as well as detection of the json-c library.
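
To make the [high, low] encoding concrete, here is a minimal sketch of the arithmetic, assuming nothing about how the patch itself implements it. The helper names `split_u64`, `join_u64` and `format_u64_compat` are hypothetical and do not come from btrfs-progs or json-c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Split a 64-bit value into the two 32-bit halves used by compat mode. */
static void split_u64(uint64_t v, uint32_t *high, uint32_t *low)
{
	*high = (uint32_t)(v >> 32);
	*low = (uint32_t)(v & 0xffffffffu);
}

/* Inverse operation, as a consumer of the compat output would do it. */
static uint64_t join_u64(uint32_t high, uint32_t low)
{
	return ((uint64_t)high << 32) | low;
}

/*
 * Render the value as a JSON array "[high, low]".  Returns the number
 * of characters snprintf() would have written, as snprintf() does.
 */
static int format_u64_compat(char *buf, size_t len, uint64_t v)
{
	uint32_t high, low;

	split_u64(v, &high, &low);
	return snprintf(buf, len, "[%" PRIu32 ", %" PRIu32 "]", high, low);
}
```

Each half fits comfortably within the 53-bit integer range of a double, so a JavaScript consumer can recombine the two halves with its own big-integer handling without losing precision.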

Signed-off-by: Jeff Mahoney 
---
 Makefile.inc.in |  4 ++--
 commands.h  | 13 +
 configure.ac|  6 ++
 help.c  | 25 ++---
 4 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/Makefile.inc.in b/Makefile.inc.in
index 56271903..68bddbed 100644
--- a/Makefile.inc.in
+++ b/Makefile.inc.in
@@ -18,9 +18,9 @@ BTRFSRESTORE_ZSTD = @BTRFSRESTORE_ZSTD@
 SUBST_CFLAGS = @CFLAGS@
 SUBST_LDFLAGS = @LDFLAGS@
 
-LIBS_BASE = @UUID_LIBS@ @BLKID_LIBS@ -L. -pthread
+LIBS_BASE = @UUID_LIBS@ @BLKID_LIBS@ @JSON_LIBS@ -L. -pthread
 LIBS_COMP = @ZLIB_LIBS@ @LZO2_LIBS@ @ZSTD_LIBS@
-STATIC_LIBS_BASE = @UUID_LIBS_STATIC@ @BLKID_LIBS_STATIC@ -L. -pthread
+STATIC_LIBS_BASE = @UUID_LIBS_STATIC@ @BLKID_LIBS_STATIC@ @JSON_LIBS_STATIC@ -L. -pthread
 STATIC_LIBS_COMP = @ZLIB_LIBS_STATIC@ @LZO2_LIBS_STATIC@ @ZSTD_LIBS_STATIC@
 
 prefix ?= @prefix@
diff --git a/commands.h b/commands.h
index 83316c6d..bf74eaf8 100644
--- a/commands.h
+++ b/commands.h
@@ -19,9 +19,22 @@
 
 enum cmd_output {
CMD_OUTPUT_TEXT = 0,
+#ifdef HAVE_JSON
+   CMD_OUTPUT_JSON,
+   CMD_OUTPUT_JSON_COMPAT,
+#endif
CMD_OUTPUT_MAX,
 };
 
+/*
+ * If we don't have the JSON library, map the flags to text to avoid
+ * more ifdefs elsewhere.
+ */
+#ifndef HAVE_JSON
+#define CMD_OUTPUT_JSON CMD_OUTPUT_TEXT
+#define CMD_OUTPUT_JSON_COMPAT CMD_OUTPUT_TEXT
+#endif
+
 #define CMD_OUTPUT_FLAG(x) (1 << (CMD_OUTPUT_##x))
 
 struct cmd_context {
diff --git a/configure.ac b/configure.ac
index 56d17c3a..6aec672a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -197,6 +197,12 @@ PKG_STATIC(UUID_LIBS_STATIC, [uuid])
 PKG_CHECK_MODULES(ZLIB, [zlib])
 PKG_STATIC(ZLIB_LIBS_STATIC, [zlib])
 
+PKG_CHECK_MODULES(JSON, [json-c], [
+   AC_DEFINE(HAVE_JSON, [1], [Have JSON]),
+   PKG_STATIC(JSON_LIBS_STATIC, [json-c], [
+   AC_DEFINE(HAVE_JSON_STATIC, [1], [Have JSON static])], [true])
+   ], [true])
+
 AC_ARG_ENABLE([zstd],
AS_HELP_STRING([--disable-zstd], [build without zstd support]),
[], [enable_zstd=yes]
diff --git a/help.c b/help.c
index 063e9740..f1710621 100644
--- a/help.c
+++ b/help.c
@@ -32,6 +32,10 @@
 
 const char *cmd_outputs[CMD_OUTPUT_MAX] = {
"text",
+#ifdef HAVE_JSON
+   [CMD_OUTPUT_JSON] = "json",
+   [CMD_OUTPUT_JSON_COMPAT] = "json:compat",
+#endif
 };
 
 static char argv0_buf[ARGV0_BUF_SIZE] = "btrfs";
@@ -186,18 +190,17 @@ static int do_usage_one_command(const char * const *usagestr,
fprintf(outf, "%*s%s\n", pad, "", *usagestr++);
 
/* options (optional) */
-   if (!*usagestr || ((flags & USAGE_OPTIONS) == 0))
-   return 0;
-
-   /*
-* options (if present) should always (even if there is no long
-* description) be prepended with an empty line, skip it
-*/
-   usagestr++;
+   if (*usagestr && (flags & USAGE_OPTIONS)) {
+   /*
+* options (if present) should always (even if there is no long
+* description) be prepended with an empty line, skip it
+*/
+   usagestr++;
 
-   fputc('\n', outf);
-   while (*usagestr)
-   fprintf(outf, "%*s%s\n", pad, "", *usagestr++);
+   fputc('\n', outf);
+   while (*usagestr)
+   fprintf(outf, "%*s%s\n", pad, "", *usagestr++);
+   }
 
if (flags & USAGE_FORMAT) {
/* We always support text */
-- 
2.12.3



[PATCH 15/20] btrfs-progs: pass cmd_struct to clean_args_no_options{,_relaxed}

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Now that we have a cmd_struct everywhere, we can pass it to
clean_args_no_options and have it resolve the usage string from
it there.  This is necessary for it to pass the cmd_struct to
usage() in the next patch.

Signed-off-by: Jeff Mahoney 
---
 cmds-balance.c|  6 +++---
 cmds-device.c |  9 +
 cmds-filesystem.c |  8 
 cmds-inspect.c|  4 ++--
 cmds-qgroup.c | 16 
 cmds-quota.c  |  4 ++--
 cmds-rescue.c |  4 ++--
 cmds-scrub.c  | 15 ---
 cmds-subvolume.c  |  6 +++---
 help.c|  9 +
 help.h|  7 ---
 11 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/cmds-balance.c b/cmds-balance.c
index 1bd7b3ce..488fffcc 100644
--- a/cmds-balance.c
+++ b/cmds-balance.c
@@ -689,7 +689,7 @@ static int cmd_balance_pause(const struct cmd_struct *cmd,
int ret;
DIR *dirstream = NULL;
 
-   clean_args_no_options(argc, argv, cmd_balance_pause_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_balance_pause_usage);
@@ -729,7 +729,7 @@ static int cmd_balance_cancel(const struct cmd_struct *cmd,
int ret;
DIR *dirstream = NULL;
 
-   clean_args_no_options(argc, argv, cmd_balance_cancel_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_balance_cancel_usage);
@@ -770,7 +770,7 @@ static int cmd_balance_resume(const struct cmd_struct *cmd,
int fd;
int ret;
 
-   clean_args_no_options(argc, argv, cmd_balance_resume_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_balance_resume_usage);
diff --git a/cmds-device.c b/cmds-device.c
index feb53f68..5be748f7 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -149,11 +149,12 @@ static int _cmd_device_remove(const struct cmd_struct *cmd,
char*mntpnt;
int i, fdmnt, ret = 0;
DIR *dirstream = NULL;
+   const char * const *usagestr = cmd->usagestr;
 
-   clean_args_no_options(argc, argv, cmd->usagestr);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_min(argc - optind, 2))
-   usage(cmd->usagestr);
+   usage(usagestr);
 
mntpnt = argv[argc - 1];
 
@@ -347,7 +348,7 @@ static int cmd_device_ready(const struct cmd_struct *cmd, int argc, char **argv)
int ret;
char*path;
 
-   clean_args_no_options(argc, argv, cmd->usagestr);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_device_ready_usage);
@@ -573,7 +574,7 @@ static int cmd_device_usage(const struct cmd_struct *cmd, int argc, char **argv)
 
unit_mode = get_unit_mode_from_arg(, argv, 1);
 
-   clean_args_no_options(argc, argv, cmd->usagestr);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_min(argc - optind, 1))
usage(cmd_device_usage_usage);
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index c2ee8595..b793532b 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -127,7 +127,7 @@ static int cmd_filesystem_df(const struct cmd_struct *cmd,
 
unit_mode = get_unit_mode_from_arg(, argv, 1);
 
-   clean_args_no_options(argc, argv, cmd_filesystem_df_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_filesystem_df_usage);
@@ -822,7 +822,7 @@ static int cmd_filesystem_sync(const struct cmd_struct *cmd,
char*path;
DIR *dirstream = NULL;
 
-   clean_args_no_options(argc, argv, cmd_filesystem_sync_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 1))
usage(cmd_filesystem_sync_usage);
@@ -1102,7 +1102,7 @@ static int cmd_filesystem_resize(const struct cmd_struct *cmd,
DIR *dirstream = NULL;
struct stat st;
 
-   clean_args_no_options_relaxed(argc, argv, cmd_filesystem_resize_usage);
+   clean_args_no_options_relaxed(cmd, argc, argv);
 
if (check_argc_exact(argc - optind, 2))
usage(cmd_filesystem_resize_usage);
@@ -1175,7 +1175,7 @@ static const char * const cmd_filesystem_label_usage[] = {
 static int cmd_filesystem_label(const struct cmd_struct *cmd,
int argc, char **argv)
 {
-   clean_args_no_options(argc, argv, cmd_filesystem_label_usage);
+   clean_args_no_options(cmd, argc, argv);
 
if (check_argc_min(argc - optind, 1) ||
check_argc_max(argc - optind, 2))
diff --git a/cmds-inspect.c b/cmds-inspect.c
index 1bdc8bd9..ece8c8d4 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -277,7 +277,7 @@ static int 

[PATCH v2 00/20] btrfs-progs: qgroups usability

2018-03-07 Thread jeffm
From: Jeff Mahoney 

Thanks to Qu Wenruo, Nikolay Borisov, and Tomohiro Misono for taking
the time to review my previous patchset.  I've incorporated your
suggestions into this version.

Obviously this one is quite a bit longer than the first version.  After
I posted it, Dave and I talked offline about whether it would make sense
to add the ability to output in JSON to other commands.  If it does,
and we agreed that it did, it would make sense for the choice of format
to be a global option.  In order to do that, I've had to rework some of
how we handle command definition and execution.  Mostly that is around
how to pass format flags and how to inform the user about which output
formats are available for each command.
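
As a sketch of how per-command format flags could be turned into a user-visible list, the block below reuses the `cmd_output` enum, `CMD_OUTPUT_FLAG()` macro and `cmd_outputs[]` names that appear in the json patch of this series (shown here as if the JSON library were available). The helper `format_supported_outputs()` is hypothetical and only illustrates the idea.

```c
#include <stdio.h>

enum cmd_output {
	CMD_OUTPUT_TEXT = 0,
	CMD_OUTPUT_JSON,
	CMD_OUTPUT_JSON_COMPAT,
	CMD_OUTPUT_MAX,
};

#define CMD_OUTPUT_FLAG(x) (1 << (CMD_OUTPUT_##x))

static const char *cmd_outputs[CMD_OUTPUT_MAX] = {
	[CMD_OUTPUT_TEXT] = "text",
	[CMD_OUTPUT_JSON] = "json",
	[CMD_OUTPUT_JSON_COMPAT] = "json:compat",
};

/*
 * Hypothetical helper: write a comma-separated list of the formats
 * enabled in 'flags' into 'buf'.  Returns the length written, or -1
 * if the buffer is too small.
 */
static int format_supported_outputs(char *buf, size_t len, unsigned int flags)
{
	size_t used = 0;
	int i;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < CMD_OUTPUT_MAX; i++) {
		int n;

		if (!(flags & (1U << i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? ", " : "", cmd_outputs[i]);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

A command that supports text and JSON would carry `CMD_OUTPUT_FLAG(TEXT) | CMD_OUTPUT_FLAG(JSON)` in its flags, and the help output could print the resulting list.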

So, here's the updated series:

* btrfs-progs: quota: Add -W option to rescan to wait without starting
  rescan
  - unchanged

* btrfs-progs: qgroups: fix misleading index check
  - unchanged

* btrfs-progs: constify pathnames passed as arguments
  - removed stray formatting change in send-utils.c

* btrfs-progs: btrfs-list: add rb_entry helpers for root_info
  - new patch, accessors for root_info rb_nodes

* btrfs-progs: btrfs-list: add btrfs_cleanup_root_info
  - new patch, adds a helper to clean up strings attached to root_info

* btrfs-progs: qgroups: add pathname to show output
  - Fixed help text to be more accurate
  - Fixed coding style issues
  - Added checks for NULL pathname
  - Free root_info strings after looking up pathname
  - Free pathname during teardown

* btrfs-progs: qgroups: introduce and use info and limit structures
  - edited to 80 columns

* btrfs-progs: qgroups: introduce btrfs_qgroup_query
  - Fixed issue with ENOENT vs ENOTTY
  - Added filter for search results
  - Use temporary key for search header
  - Cache passed search key for comparison since the loop modifies it

* btrfs-progs: subvolume: add quota info to btrfs sub show
  - Fixed/improved error reporting

* btrfs-progs: help: convert ints used as bools to bool
  - new patch

* btrfs-progs: reorder placement of help declarations for send/receive
  - new patch, required to remove usage declarations from commands.h

* btrfs-progs: filesystem balance: split out special handling
  - new patch, stop directly aliasing 'filesystem balance' to
'balance' -- it still does the right thing for normal execution
but help says "go read the balance help instead"

* btrfs-progs: use cmd_struct as command entry point
  - new patch, removes most command callback and usage declarations
  - replaces with cmd_struct declarations that are used in command
group arrays directly, similar to how sysfs attributes are
defined

* btrfs-progs: pass cmd_struct to command callback function
  - new patch, required to pass flags and have access to usage array

* btrfs-progs: pass cmd_struct to clean_args_no_options{,_relaxed}
  - new patch, required to pass cmd_struct to usage()

* btrfs-progs: pass cmd_struct to usage()
  - new patch, required to dynamically print what output formats
are available based on flags defined in command

* btrfs-progs: add support for output formats
  - new patch, adds infrastructure for output formats, including passing
caller context to commands

* btrfs-progs: add generic support for json output
  - new patch, split out JSON library detection from qgroups patch

* btrfs-progs: handle command groups directly for common case
  - new patch, remove most simple command group callbacks

* btrfs-progs: qgroups: add json output for usage command
  - remove -j and --compat-json options in favor of global --format json
or --format json:compat
  - added macro for qgroupid format buffer length
  - handle NULL pathnames better
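
The command-group layout mentioned in the list above (cmd_struct declarations referenced directly from group arrays) can be sketched roughly as follows. This is a simplified illustration under assumptions: the struct layouts are reduced to the fields needed here, the `alpha`/`beta` commands and `find_command()` are hypothetical, and only exact token matches are handled, whereas the real parse_one_token() also deals with abbreviations.

```c
#include <stddef.h>
#include <string.h>

/* Reduced, hypothetical versions of the real structures. */
struct cmd_struct {
	const char *token;
};

struct cmd_group {
	const char *name;
	/* NULL-terminated array of pointers to command definitions. */
	const struct cmd_struct * const *commands;
};

static const struct cmd_struct cmd_struct_alpha = { "alpha" };
static const struct cmd_struct cmd_struct_beta = { "beta" };

static const struct cmd_struct * const demo_commands[] = {
	&cmd_struct_alpha,
	&cmd_struct_beta,
	NULL
};

static const struct cmd_group demo_group = { "demo", demo_commands };

/* Walk the pointer array until the NULL terminator, matching on token. */
static const struct cmd_struct *find_command(const struct cmd_group *grp,
					     const char *arg)
{
	int i;

	for (i = 0; grp->commands[i]; i++) {
		const struct cmd_struct *cmd = grp->commands[i];

		if (strcmp(arg, cmd->token) == 0)
			return cmd;
	}
	return NULL;
}
```

Because the array holds pointers rather than struct copies, the same cmd_struct can be referenced from more than one group or from a standalone tool.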

-Jeff

Jeff Mahoney (20):
  btrfs-progs: quota: Add -W option to rescan to wait without starting
rescan
  btrfs-progs: qgroups: fix misleading index check
  btrfs-progs: constify pathnames passed as arguments
  btrfs-progs: btrfs-list: add rb_entry helpers for root_info
  btrfs-progs: btrfs-list: add btrfs_cleanup_root_info
  btrfs-progs: qgroups: add pathname to show output
  btrfs-progs: qgroups: introduce and use info and limit structures
  btrfs-progs: qgroups: introduce btrfs_qgroup_query
  btrfs-progs: subvolume: add quota info to btrfs sub show
  btrfs-progs: help: convert ints used as bools to bool
  btrfs-progs: reorder placement of help declarations for send/receive
  btrfs-progs: filesystem balance: split out special handling
  btrfs-progs: use cmd_struct as command entry point
  btrfs-progs: pass cmd_struct to command callback function
  btrfs-progs: pass cmd_struct to clean_args_no_options{,_relaxed}
  btrfs-progs: pass cmd_struct to usage()
  btrfs-progs: add support for output formats
  btrfs-progs: add generic support for json output
  btrfs-progs: qgroups: add json output for usage command
  btrfs-progs: handle command groups directly for common case

 Documentation/btrfs-qgroup.asciidoc |   7 +
 Documentation/btrfs-quota.asciidoc  |  

[PATCH 10/20] btrfs-progs: help: convert ints used as bools to bool

2018-03-07 Thread jeffm
From: Jeff Mahoney 

We use an int for 'full', 'all', and 'err' when we really mean a boolean.

Signed-off-by: Jeff Mahoney 
---
 btrfs.c | 14 +++---
 help.c  | 25 +
 help.h  |  4 ++--
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 2d39f2ce..fec1a135 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -109,7 +109,7 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
argv++;
help_command_group(cmd->next, argc, argv);
} else {
-   usage_command(cmd, 1, 0);
+   usage_command(cmd, true, false);
}
 
exit(0);
@@ -125,7 +125,7 @@ int handle_command_group(const struct cmd_group *grp, int argc,
argc--;
argv++;
if (argc < 1) {
-   usage_command_group(grp, 0, 0);
+   usage_command_group(grp, false, false);
exit(1);
}
 
@@ -212,20 +212,20 @@ static int handle_global_options(int argc, char **argv)
 
 void handle_special_globals(int shift, int argc, char **argv)
 {
-   int has_help = 0;
-   int has_full = 0;
+   bool has_help = false;
+   bool has_full = false;
int i;
 
for (i = 0; i < shift; i++) {
if (strcmp(argv[i], "--help") == 0)
-   has_help = 1;
+   has_help = true;
else if (strcmp(argv[i], "--full") == 0)
-   has_full = 1;
+   has_full = true;
}
 
if (has_help) {
if (has_full)
-   usage_command_group(_cmd_group, 1, 0);
+   usage_command_group(_cmd_group, true, false);
else
cmd_help(argc, argv);
exit(0);
diff --git a/help.c b/help.c
index 311a4320..ef7986b4 100644
--- a/help.c
+++ b/help.c
@@ -196,8 +196,8 @@ static int do_usage_one_command(const char * const *usagestr,
 }
 
 static int usage_command_internal(const char * const *usagestr,
- const char *token, int full, int lst,
- int alias, FILE *outf)
+ const char *token, bool full, bool lst,
+ bool alias, FILE *outf)
 {
unsigned int flags = 0;
int ret;
@@ -223,17 +223,17 @@ static int usage_command_internal(const char * const *usagestr,
 }
 
 static void usage_command_usagestr(const char * const *usagestr,
-  const char *token, int full, int err)
+  const char *token, bool full, bool err)
 {
FILE *outf = err ? stderr : stdout;
int ret;
 
-   ret = usage_command_internal(usagestr, token, full, 0, 0, outf);
+   ret = usage_command_internal(usagestr, token, full, false, false, outf);
if (!ret)
fputc('\n', outf);
 }
 
-void usage_command(const struct cmd_struct *cmd, int full, int err)
+void usage_command(const struct cmd_struct *cmd, bool full, bool err)
 {
usage_command_usagestr(cmd->usagestr, cmd->token, full, err);
 }
@@ -241,11 +241,11 @@ void usage_command(const struct cmd_struct *cmd, int full, int err)
 __attribute__((noreturn))
 void usage(const char * const *usagestr)
 {
-   usage_command_usagestr(usagestr, NULL, 1, 1);
+   usage_command_usagestr(usagestr, NULL, true, true);
exit(1);
 }
 
-static void usage_command_group_internal(const struct cmd_group *grp, int full,
+static void usage_command_group_internal(const struct cmd_group *grp, bool full,
 FILE *outf)
 {
const struct cmd_struct *cmd = grp->commands;
@@ -265,7 +265,8 @@ static void usage_command_group_internal(const struct cmd_group *grp, int full,
}
 
usage_command_internal(cmd->usagestr, cmd->token, full,
-  1, cmd->flags & CMD_ALIAS, outf);
+  true, cmd->flags & CMD_ALIAS,
+  outf);
if (cmd->flags & CMD_ALIAS)
putchar('\n');
continue;
@@ -327,7 +328,7 @@ void usage_command_group_short(const struct cmd_group *grp)
fprintf(stderr, "All command groups have their manual page named 
'btrfs-'.\n");
 }
 
-void usage_command_group(const struct cmd_group *grp, int full, int err)
+void usage_command_group(const struct cmd_group *grp, bool full, bool err)
 {
const char * const *usagestr = grp->usagestr;
FILE *outf = err ? stderr : stdout;
@@ -350,7 +351,7 @@ __attribute__((noreturn))
 void help_unknown_token(const char *arg, const struct cmd_group *grp)
 {
fprintf(stderr, "%s: unknown token '%s'\n", 

Re: [PATCH v2 00/12] mkfs: Quota support through -R|--runtime quota

2018-03-07 Thread Qu Wenruo
Ping again.

Since David is planning to merge the qgroup patchset, this feature would
greatly improve test coverage.

Thanks,
Qu

On 2018-01-11 14:04, Qu Wenruo wrote:
> Ping?
> 
> Or do I need to rebase the patchset?
> 
> Thanks,
> Qu
> 
> On 2017-11-07 16:42, Qu Wenruo wrote:
>> Can be fetched from github:
>> https://github.com/adam900710/btrfs-progs/tree/mkfs_qgroup
>>
>> This patchset adds quota support, which means the resulting fs will have
>> quota enabled by default and its accounting will already be consistent; no
>> manual rescan or quota enable is needed.
>>
>> The overall design of such support is:
>> 1) Create the needed tree
>>This covers the btrfs_root, the real root item, and the tree root leaf.
>>A new piece of infrastructure, btrfs_create_tree(), is added for
>>this.
>>
>> 2) Fill quota root with basic skeleton
>>Only 3 items are really needed
>>a) global quota status item
>>b) quota info for specified qgroup
>>c) quota limit for specified qgroup
>>
>>Currently only 0/5 qgroup is passed.
>>If we're going to support extra subvolume at mkfs time, just pass the
>>subvolume id into insert_qgroup_items().
>>
>>The content doesn't matter at all.
>>
>> 3) Repair qgroups using infrastructure from qgroup-verify
>>In fact, qgroup repair is just an offline rescan.
>>The original qgroup-verify infrastructure is mostly noisy, so
>>modify it a little to make it silent so it can function as an offline
>>quota rescan.
>>
>> And such support is mainly designed for developers and QA guys.
>>
>> Until now, to enable quota we had to mount the fs normally and then
>> enable quota (and rescan if needed).
>> This ioctl based procedure is not common, and fstests doesn't provide
>> such support.
>>
>> There have been several attempts to make fstests support it, but for
>> different reasons, all these attempts failed.
>>
>> To make it easier to test all existing test cases with btrfs quota
>> enabled, the current best method is to support quota at mkfs time, and
>> here comes the patchset.
>>
>>
>> BTW, with -R|--runtime-features there are several possible targets to add,
>> not limited to such ioctl based operations but also mount option based
>> ones, like space-cache-tree (space_cache=v2).
>>
>>
>>
>> Qu Wenruo (12):
>>   btrfs-progs: qgroup-verify: Also repair qgroup status version
>>   btrfs-progs: qgroup-verify: Use fs_info->readonly to check if we
>> should repair qgroups
>>   btrfs-progs: qgroup-verify: Move qgroup classification out of
>> report_qgroups
>>   btrfs-progs: qgroup-verify: Allow repair_qgroups function to do silent
>> repair
>>   btrfs-progs: ctree: Introduce function to create an empty tree
>>   btrfs-progs: mkfs: Introduce function to insert qgroup info and limit
>> items
>>
>>   ^^^ Above patches are not modified at all ^^^
>>   vvv Modification starts below vvv
>>
>>   btrfs-progs: mkfs: Introduce function to setup quota root and rescan
>>   btrfs-progs: fsfeatures: Introduce a new set of features,
>> runtime_features
>>   btrfs-progs: mkfs: Introduce --runtime-features option
>>   btrfs-progs: mkfs: Introduce quota runtime feature
>>   btrfs-progs: test/mkfs: Add test case for -R quota option
>>   btrfs-progs: test/mkfs: Add test case for --rootdir and -R quota
>>
>>  Documentation/mkfs.btrfs.asciidoc  |  23 +++
>>  cmds-check.c   |   2 +-
>>  convert/main.c |   4 +-
>>  ctree.c| 109 ++
>>  ctree.h|   3 +
>>  fsfeatures.c   | 131 ++---
>>  fsfeatures.h   |  10 +-
>>  mkfs/main.c| 194 ++---
>>  qgroup-verify.c|  51 +--
>>  qgroup-verify.h|   2 +-
>>  tests/mkfs-tests/001-basic-profiles/test.sh|  10 ++
>>  tests/mkfs-tests/010-rootdir-and-quota/test.sh |  51 +++
>>  12 files changed, 529 insertions(+), 61 deletions(-)
>>  create mode 100755 tests/mkfs-tests/010-rootdir-and-quota/test.sh
>>
> 





[PATCH] fstests: test regression of -EEXIST on creating new file after log replay

2018-03-07 Thread Liu Bo
The regression was introduced to btrfs in Linux v4.4: after log replay, btrfs
refuses to create new files and returns -EEXIST.

Although the problem affects btrfs only, the test itself uses nothing
btrfs-specific, so it is made generic.

The kernel fix is
  Btrfs: fix unexpected -EEXIST when creating new inode

Signed-off-by: Liu Bo 
---
 tests/generic/481 | 75 +++
 tests/generic/481.out |  5 
 tests/generic/group   |  1 +
 3 files changed, 81 insertions(+)
 create mode 100755 tests/generic/481
 create mode 100644 tests/generic/481.out

diff --git a/tests/generic/481 b/tests/generic/481
new file mode 100755
index 000..8d8bb2b
--- /dev/null
+++ b/tests/generic/481
@@ -0,0 +1,75 @@
+#! /bin/bash
+# FSQA Test No. 481
+#
+# Reproduce a regression of btrfs that leads to -EEXIST on creating new files
+# after log replay.
+#
+# The kernel fix is
+#   Btrfs: fix unexpected -EEXIST when creating new inode
+#
+#---
+#
+# Copyright (C) 2018 Oracle. All Rights Reserved.
+# Author: Bo Liu 
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   _cleanup_flakey
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+. ./common/dmflakey
+
+# real QA test starts here
+_supported_fs generic
+_supported_os Linux
+_require_scratch
+_require_dm_target flakey
+
+rm -f $seqres.full
+
+_scratch_mkfs >>$seqres.full 2>&1
+_init_flakey
+_mount_flakey
+
+# create a file and keep it in write ahead log
+$XFS_IO_PROG -f -c "pwrite 0 4k" -c "fsync" $SCRATCH_MNT/foo | _filter_xfs_io
+
+# fail this filesystem so that remount can replay the write ahead log
+_flakey_drop_and_remount
+
+# see if we can create a new file successfully
+touch $SCRATCH_MNT/bar
+
+_unmount_flakey
+
+echo "Silence is golden"
+
+status=0
+exit
diff --git a/tests/generic/481.out b/tests/generic/481.out
new file mode 100644
index 000..66a6345
--- /dev/null
+++ b/tests/generic/481.out
@@ -0,0 +1,5 @@
+QA output created by 481
+wrote 4096/4096 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+touch: cannot touch '/mnt/scratch/bar': File exists
+Silence is golden
diff --git a/tests/generic/group b/tests/generic/group
index ea2056b..05f60f2 100644
--- a/tests/generic/group
+++ b/tests/generic/group
@@ -483,3 +483,4 @@
 478 auto quick
 479 auto quick metadata
 480 auto quick metadata
+481 auto quick
-- 
2.5.0



Re: [PATCH] btrfs: qgroups, properly handle no reservations

2018-03-07 Thread Qu Wenruo


On 2018-03-08 00:02, David Sterba wrote:
> On Thu, Feb 22, 2018 at 10:05:36AM +0800, Qu Wenruo wrote:
>>
>>
>> On 2018-02-22 09:50, Jeff Mahoney wrote:
>>> On 2/21/18 8:36 PM, Qu Wenruo wrote:


>>>> On 2018-02-22 04:19, je...@suse.com wrote:
>>>>> From: Jeff Mahoney 
>>>>>
>>>>> There are several places where we call btrfs_qgroup_reserve_meta and
>>>>> assume that a return value of 0 means that the reservation was successful.
>>>>>
>>>>> Later, we use the original bytes value passed to that call to free
>>>>> bytes during error handling or to pass the number of bytes reserved to
>>>>> the caller.
>>>>>
>>>>> This patch returns -ENODATA when we don't perform a reservation so that
>>>>> callers can make the distinction.  This also lets call sites not
>>>>> necessarily care whether qgroups are enabled.
>>>>
>>>> IMHO if we don't need to reserve, returning 0 seems good enough.
>>>> Caller doesn't really need to care if it has reserved some bytes.
>>>>
>>>> Or is there any special case where we need to distinguish such case?
>>>
>>> Anywhere where the reservation takes place prior to the transaction
>>> starting, which is pretty much everywhere.  We wait until transaction
>>> commit to flip the bit to turn on quotas, which means that if a
>>> transaction commit that enables quotas lands in between the reservation
>>> being taken and any error handling that involves freeing the reservation,
>>> we'll end up with an underflow.
>>
>> So the same case as btrfs_qgroup_reserve_data().
>>
>> In that case we could use ret > 0 to indicate the real bytes we
>> reserved, instead of -ENODATA which normally means error.
>>
>>>
>>> This is the first patch of a series I'm working on, but it can stand
>>> alone.  The rest is the patch set I mentioned when we talked a few
>>> months ago where the lifetimes of reservations are incorrect.  We can't
>>> just drop all the reservations at the end of the transaction because 1)
>>> the lifetime of some reservations can cross transactions and 2) because
>>> especially in the start_transaction case, we do the reservation prior to
>>> waiting to join the transaction.  So if the transaction we're waiting on
>>> commits, our reservation goes away with it but we continue on as if we
>>> still have it.
>>
>> Right, the same problem I also addressed in patchset "[PATCH v2 00/10]
>> Use split qgroup rsv type".
>>
>> In 6th patch, "[PATCH v2 06/10] btrfs: qgroup: Use
>> root->qgroup_meta_rsv_* to record qgroup meta reserved space" qgroup
>> meta reserve will only be increased if we succeeded in reserving
>> metadata, so later free won't underflow that number.
> 
> What should we do now when there are 2 different fixes? Applying Jeff's
> patch on top of your qgroup-types patches causes some conflicts that do
> not seem to be difficult, but the end result might not work as expected.

Jeff's patch is a more pinpoint solution to handle metadata reservation
error, while mine is a generic one which adds another layer to catch
possible underflow.

I think both could co-exist.

The only thing I'm not a fan of is the return value of -ENODATA.
Despite that it should be fine.
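
To make the "ret > 0 means bytes actually reserved" alternative concrete,
here is a minimal userspace sketch. Everything named demo_* is hypothetical
and only illustrates the convention; it is not the kernel's qgroup code. The
point is that the error path frees exactly what the reserve call reported, so
quotas being switched on between reserve and free cannot cause an underflow.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-root qgroup state, not the kernel struct. */
struct demo_root {
	bool quota_enabled;      /* in btrfs this flips at transaction commit */
	uint64_t meta_reserved;  /* bytes currently reserved */
	uint64_t limit;          /* maximum bytes that may be reserved */
};

/*
 * Returns the number of bytes actually reserved (> 0), 0 if nothing was
 * reserved because quotas are disabled, or -EDQUOT if the limit is hit.
 */
static int64_t demo_reserve_meta(struct demo_root *root, uint64_t bytes)
{
	assert(bytes > 0 && bytes <= INT64_MAX);
	if (!root->quota_enabled)
		return 0;
	if (root->meta_reserved + bytes > root->limit)
		return -EDQUOT;
	root->meta_reserved += bytes;
	return (int64_t)bytes;
}

/* Frees only what the reserve call reported; 0 or an error is a no-op. */
static void demo_free_meta(struct demo_root *root, int64_t reserved)
{
	if (reserved <= 0)
		return;
	assert(root->meta_reserved >= (uint64_t)reserved);
	root->meta_reserved -= (uint64_t)reserved;
}
```

A caller would keep the return value of demo_reserve_meta() and hand it to
demo_free_meta() in its error path instead of the byte count it asked for.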

Thanks,
Qu

> 
> The changes do not seem to be fundamentally conflicting, would it make
> sense to merge both?
> The patchset has been in for-next for a while but I
> don't run qgroup specific tests besides what's in fstests. Also the
> patchset fixes more problems so I think we need to merge it at some
> point and now it's a good time about deciding whether it'd go to 4.17.
> 
> I did a pass through the patches, there are some minor things to fix but
> a review from somebody with qgroup knowledge would be still desirable.
> 





Re: [PATCH 3/8] btrfs-progs: constify pathnames passed as arguments

2018-03-07 Thread Jeff Mahoney
On 3/7/18 3:17 AM, Nikolay Borisov wrote:
> 
> 
> On  2.03.2018 20:46, je...@suse.com wrote:
>> From: Jeff Mahoney 
>>
>> It's unlikely we're going to modify a pathname argument, so codify that
>> and use const.
>>
>> Signed-off-by: Jeff Mahoney 
>> ---
>>  chunk-recover.c | 4 ++--
>>  cmds-device.c   | 2 +-
>>  cmds-fi-usage.c | 6 +++---
>>  cmds-rescue.c   | 4 ++--
>>  send-utils.c| 4 ++--
>>  5 files changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/chunk-recover.c b/chunk-recover.c
>> index 705bcf52..1d30db51 100644
>> --- a/chunk-recover.c
>> +++ b/chunk-recover.c
>> @@ -1492,7 +1492,7 @@ out:
>>  return ERR_PTR(ret);
>>  }
>>  
>> -static int recover_prepare(struct recover_control *rc, char *path)
>> +static int recover_prepare(struct recover_control *rc, const char *path)
>>  {
>>  int ret;
>>  int fd;
>> @@ -2296,7 +2296,7 @@ static void validate_rebuild_chunks(struct 
>> recover_control *rc)
>>  /*
>>   * Return 0 when successful, < 0 on error and > 0 if aborted by user
>>   */
>> -int btrfs_recover_chunk_tree(char *path, int verbose, int yes)
>> +int btrfs_recover_chunk_tree(const char *path, int verbose, int yes)
>>  {
>>  int ret = 0;
>>  struct btrfs_root *root = NULL;
>> diff --git a/cmds-device.c b/cmds-device.c
>> index 86459d1b..a49c9d9d 100644
>> --- a/cmds-device.c
>> +++ b/cmds-device.c
>> @@ -526,7 +526,7 @@ static const char * const cmd_device_usage_usage[] = {
>>  NULL
>>  };
>>  
>> -static int _cmd_device_usage(int fd, char *path, unsigned unit_mode)
>> +static int _cmd_device_usage(int fd, const char *path, unsigned unit_mode)
> 
> Actually the path parameter is not used in this function at all, I'd say
> just remove it.

Yep, it's unused, but that's a different project.  Add
-Wunused-parameter and see what shakes out. :)

>>  {
>>  int i;
>>  int ret = 0;
>> diff --git a/cmds-fi-usage.c b/cmds-fi-usage.c
>> index de7ad668..9a1c76ab 100644
>> --- a/cmds-fi-usage.c
>> +++ b/cmds-fi-usage.c
>> @@ -227,7 +227,7 @@ static int cmp_btrfs_ioctl_space_info(const void *a, 
>> const void *b)
>>  /*
>>   * This function load all the information about the space usage
>>   */
>> -static struct btrfs_ioctl_space_args *load_space_info(int fd, char *path)
>> +static struct btrfs_ioctl_space_args *load_space_info(int fd, const char 
>> *path)
>>  {
>>  struct btrfs_ioctl_space_args *sargs = NULL, *sargs_orig = NULL;
>>  int ret, count;
>> @@ -305,7 +305,7 @@ static void get_raid56_used(struct chunk_info *chunks, 
>> int chunkcount,
>>  #define MIN_UNALOCATED_THRESH   SZ_16M
>>  static int print_filesystem_usage_overall(int fd, struct chunk_info 
>> *chunkinfo,
>>  int chunkcount, struct device_info *devinfo, int devcount,
>> -char *path, unsigned unit_mode)
>> +const char *path, unsigned unit_mode)
>>  {
>>  struct btrfs_ioctl_space_args *sargs = NULL;
>>  int i;
>> @@ -931,7 +931,7 @@ static void _cmd_filesystem_usage_linear(unsigned 
>> unit_mode,
>>  static int print_filesystem_usage_by_chunk(int fd,
>>  struct chunk_info *chunkinfo, int chunkcount,
>>  struct device_info *devinfo, int devcount,
>> -char *path, unsigned unit_mode, int tabular)
>> +const char *path, unsigned unit_mode, int tabular)
>>  {
>>  struct btrfs_ioctl_space_args *sargs;
>>  int ret = 0;
>> diff --git a/cmds-rescue.c b/cmds-rescue.c
>> index c40088ad..c61145bc 100644
>> --- a/cmds-rescue.c
>> +++ b/cmds-rescue.c
>> @@ -32,8 +32,8 @@ static const char * const rescue_cmd_group_usage[] = {
>>  NULL
>>  };
>>  
>> -int btrfs_recover_chunk_tree(char *path, int verbose, int yes);
>> -int btrfs_recover_superblocks(char *path, int verbose, int yes);
>> +int btrfs_recover_chunk_tree(const char *path, int verbose, int yes);
> 
> That path argument is being passed to recover_prepare which can also use
> a const for its path parameter

Yep, and it was in the first chunk.

>> +int btrfs_recover_superblocks(const char *path, int verbose, int yes);
>>  
>>  static const char * const cmd_rescue_chunk_recover_usage[] = {
>>  "btrfs rescue chunk-recover [options] ",
>> diff --git a/send-utils.c b/send-utils.c
>> index b5289e76..8ce94de1 100644
>> --- a/send-utils.c
>> +++ b/send-utils.c
>> @@ -28,8 +28,8 @@
>>  #include "ioctl.h"
>>  #include "btrfs-list.h"
>>  
>> -static int btrfs_subvolid_resolve_sub(int fd, char *path, size_t *path_len,
>> -  u64 subvol_id);
>> +static int btrfs_subvolid_resolve_sub(int fd, char *path,
>> +  size_t *path_len, u64 subvol_id);
> 
> This seems like an unrelated change. As a matter of fact
> btrfs_subvolid_resolve_sub is used only by btrfs_subvolid_resolve. So if
> you move the latter after the former then you can drop the declaration
> at the beginning of the file altogether.

Ah, yep.  That's the fallout from adding const, reformatting, and

Re: [PATCH] Improve error stats message

2018-03-07 Thread Hugo Mills
On Wed, Mar 07, 2018 at 08:02:51PM +0100, Diego wrote:
> On Wednesday, 7 March 2018 19:24:53 (CET) Hugo Mills wrote:
> >On multi-device filesystems, the two are not necessarily the same.
> 
> Ouch. FWIW, I was moved to do this because I saw this conversation on
> IRC which made me think that people aren't understanding what the
> message means:
> 
>hi! I noticed bdev rd 13  as a kernel message
>what does it mean
>Well, that's not the whole message.
>Can you paste the whole line in here? (Just one line)
   ^^ nick2... that would be me. :)

>[3.404959] BTRFS info (device sda4): bdev /dev/sda4 errs: 
> wr 0, rd 13, flush 0, corrupt 0, gen 0
> 
> 
> Maybe something like this would be better:
> 
> BTRFS info (device sda4): disk /dev/sda4 errors: write 0, read 13, flush 0, 
> corrupt 0, generation 0

   I think the single most helpful modification here would be to
change "device" to "fs on", to show that it's only an indicator of the
filesystem ID, rather than actually the device on which the errors
occurred. The others I'm not really bothered about, personally.

   Hugo.

> ---
>  fs/btrfs/volumes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 2ceb924ca0d6..cfa029468585 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7239,7 +7239,7 @@ static void btrfs_dev_stat_print_on_error(struct 
> btrfs_device *dev)
>   if (!dev->dev_stats_valid)
>   return;
>   btrfs_err_rl_in_rcu(dev->fs_info,
> - "bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
> + "disk %s errors: write %u, read %u, flush %u, corrupt %u, 
> generation %u",
>  rcu_str_deref(dev->name),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_READ_ERRS),

-- 
Hugo Mills | Q: What goes, "Pieces of seven! Pieces of seven!"?
hugo@... carfax.org.uk | A: A parroty error.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Re: [PATCH 6/8] btrfs-progs: qgroups: introduce btrfs_qgroup_query

2018-03-07 Thread Jeff Mahoney
On 3/7/18 3:02 AM, Misono, Tomohiro wrote:
> On 2018/03/03 3:47, je...@suse.com wrote:
>> From: Jeff Mahoney 
>>
>> The only mechanism we have in the progs for searching qgroups is to load
>> all of them and filter the results.  This works for qgroup show but
>> to add quota information to 'btrfs subvolume show' it's pretty wasteful.
>>
>> This patch splits out setting up the search and performing the search so
>> we can search for a single qgroupid more easily.
>>
>> Signed-off-by: Jeff Mahoney 
>> ---
>>  qgroup.c | 98 
>> +---
>>  qgroup.h |  7 +
>>  2 files changed, 77 insertions(+), 28 deletions(-)
>>
>> diff --git a/qgroup.c b/qgroup.c
>> index b1be3311..2d0a6947 100644
>> --- a/qgroup.c
>> +++ b/qgroup.c
>> @@ -1146,11 +1146,11 @@ static inline void print_status_flag_warning(u64 
>> flags)
>>  warning("qgroup data inconsistent, rescan recommended");
>>  }
>>  
>> -static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
>> +static int __qgroups_search(int fd, struct btrfs_ioctl_search_args *args,
>> +struct qgroup_lookup *qgroup_lookup)
>>  {
>>  int ret;
>> -struct btrfs_ioctl_search_args args;
>> -struct btrfs_ioctl_search_key *sk = &args.key;
>> +struct btrfs_ioctl_search_key *sk = &args->key;
>>  struct btrfs_ioctl_search_header *sh;
>>  unsigned long off = 0;
>>  unsigned int i;
>> @@ -1161,30 +1161,12 @@ static int __qgroups_search(int fd, struct 
>> qgroup_lookup *qgroup_lookup)
>>  u64 qgroupid;
>>  u64 qgroupid1;
>>  
>> -memset(&args, 0, sizeof(args));
>> -
>> -sk->tree_id = BTRFS_QUOTA_TREE_OBJECTID;
>> -sk->max_type = BTRFS_QGROUP_RELATION_KEY;
>> -sk->min_type = BTRFS_QGROUP_STATUS_KEY;
>> -sk->max_objectid = (u64)-1;
>> -sk->max_offset = (u64)-1;
>> -sk->max_transid = (u64)-1;
>> -sk->nr_items = 4096;
>> -
>>  qgroup_lookup_init(qgroup_lookup);
>>  
>>  while (1) {
>> -ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
>> +ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, args);
>>  if (ret < 0) {
>> -if (errno == ENOENT) {
>> -error("can't list qgroups: quotas not enabled");
>> -ret = -ENOTTY;
>> -} else {
>> -error("can't list qgroups: %s",
>> -   strerror(errno));
>> -ret = -errno;
>> -}
>> -
>> +ret = -errno;
> 
> Originally, -ENOTTY would be returned when qgroup is disabled,
> but this change returns -ENOENT instead. So it seems that the error check
> in the 7th patch would not work correctly when qgroup is disabled.
> 

Good catch.
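
For reference, the mapping the removed lines performed can be kept in one
place. This is only a sketch with a hypothetical helper name
(demo_search_errno_to_ret); it mirrors the removed branch, where ENOENT from
the tree search means "quotas not enabled" and is reported as -ENOTTY, and any
other errno is passed through negated.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: translate the errno left by a failed
 * BTRFS_IOC_TREE_SEARCH on the quota tree into the return value the
 * callers check. ENOENT (no quota tree) becomes -ENOTTY, as in the code
 * the patch removes; everything else is returned as -errno.
 */
static int demo_search_errno_to_ret(int err)
{
	assert(err > 0);
	return err == ENOENT ? -ENOTTY : -err;
}
```

With something like this called right after the failing ioctl, a check for
-ENOTTY in the callers would keep meaning "quotas disabled".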

Thanks,

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH 7/8] btrfs-progs: subvolume: add quota info to btrfs sub show

2018-03-07 Thread Jeff Mahoney
On 3/7/18 1:09 AM, Qu Wenruo wrote:
> 
> 
> On 2018-03-03 02:47, je...@suse.com wrote:
>> From: Jeff Mahoney 
>>
>> This patch reports on the first-level qgroup, if any, associated with
>> a particular subvolume.  It displays the usage and limit, subject
>> to the usual unit parameters.
>>
>> Signed-off-by: Jeff Mahoney 
>> ---
>>  cmds-subvolume.c | 46 ++
>>  1 file changed, 46 insertions(+)
>>
>> diff --git a/cmds-subvolume.c b/cmds-subvolume.c
>> index 8a473f7a..29d0e0e5 100644
>> --- a/cmds-subvolume.c
>> +++ b/cmds-subvolume.c
>> @@ -972,6 +972,7 @@ static const char * const cmd_subvol_show_usage[] = {
>>  "Show more information about the subvolume",
>>  "-r|--rootid   rootid of the subvolume",
>>  "-u|--uuid uuid of the subvolume",
>> +HELPINFO_UNITS_SHORT_LONG,
>>  "",
>>  "If no option is specified,  will be shown, otherwise",
>>  "the rootid or uuid are resolved relative to the  path.",
>> @@ -993,6 +994,13 @@ static int cmd_subvol_show(int argc, char **argv)
>>  int by_uuid = 0;
>>  u64 rootid_arg;
>>  u8 uuid_arg[BTRFS_UUID_SIZE];
>> +struct btrfs_qgroup_stats stats;
>> +unsigned int unit_mode;
>> +const char *referenced_size;
>> +const char *referenced_limit_size = "-";
>> +unsigned field_width = 0;
>> +
>> +unit_mode = get_unit_mode_from_arg(&argc, argv, 1);
>>  
>>  while (1) {
>>  int c;
>> @@ -1112,6 +1120,44 @@ static int cmd_subvol_show(int argc, char **argv)
>>  btrfs_list_subvols_print(fd, filter_set, NULL, BTRFS_LIST_LAYOUT_RAW,
>>  1, raw_prefix);
>>  
>> +ret = btrfs_qgroup_query(fd, get_ri.root_id, &stats);
>> +if (ret < 0) {
>> +if (ret == -ENODATA)
>> +printf("Quotas must be enabled for per-subvolume 
>> usage\n");
> 
> This seems a little confusing.
> If quota is disabled, we get ENOTTY not ENODATA.
> 
> For ENODATA we know quota is enabled but just no info for this qgroup.

Yep.

Thanks,

-Jeff


> Thanks,
> Qu
> 
>> +else if (ret != -ENOTTY)
>> +fprintf(stderr,
>> +"\nERROR: BTRFS_IOC_QUOTA_QUERY failed: %s\n",
>> +strerror(errno));
>> +goto out;
>> +}
>> +
>> +printf("\tQuota Usage:\t\t");
>> +fflush(stdout);
>> +
>> +referenced_size = pretty_size_mode(stats.info.referenced, unit_mode);
>> +if (stats.limit.max_referenced)
>> +   referenced_limit_size = pretty_size_mode(
>> +stats.limit.max_referenced,
>> +unit_mode);
>> +field_width = max(strlen(referenced_size),
>> +  strlen(referenced_limit_size));
>> +
>> +printf("%-*s referenced, %s exclusive\n ", field_width,
>> +   referenced_size,
>> +   pretty_size_mode(stats.info.exclusive, unit_mode));
>> +
>> +printf("\tQuota Limits:\t\t");
>> +if (stats.limit.max_referenced || stats.limit.max_exclusive) {
>> +const char *excl = "-";
>> +
>> +if (stats.limit.max_exclusive)
>> +   excl = pretty_size_mode(stats.limit.max_exclusive,
>> +   unit_mode);
>> +printf("%-*s referenced, %s exclusive\n", field_width,
>> +   referenced_limit_size, excl);
>> +} else
>> +printf("None\n");
>> +
>>  out:
>>  /* clean up */
>>  free(get_ri.path);
>>
> 


-- 
Jeff Mahoney
SUSE Labs





[PATCH v2] Btrfs: scrub: batch rebuild for raid56

2018-03-07 Thread Liu Bo
In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN (64K)
as the unit. However, scrub_extent() uses blocksize as the unit, so the
rebuild process may be triggered for every block of the same stripe.

A typical example is replacing a disappeared disk: all reads on the disks
get -EIO, and every block (4K if blocksize is 4K) goes through these steps:

scrub_handle_errored_block
  scrub_recheck_block # re-read pages one by one
  scrub_recheck_block # rebuild by calling raid56_parity_recover()
page by page

Although with raid56 stripe cache most of reads during rebuild can be
avoided, the parity recover calculation(xor or raid6 algorithms) needs to
be done $(BTRFS_STRIPE_LEN / blocksize) times.

This makes it smarter by doing raid56 scrub/replace on stripe length.
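
As a worked example of the cost described above, the following sketch counts
parity-recover calculations per stripe. The demo_* names are hypothetical; the
64K value is taken from the description, and "one calculation per stripe when
batched" is an assumption about the intended effect, not a measurement.

```c
#include <assert.h>
#include <stdint.h>

/* BTRFS_STRIPE_LEN as stated in the description above (64K). */
#define DEMO_STRIPE_LEN (64U * 1024U)

/*
 * Number of parity-recover calculations needed to rebuild one stripe.
 * Per-block handling repeats the calculation for every block; batched
 * handling (assumed here) does it once per stripe length.
 */
static uint32_t demo_recover_calls_per_stripe(uint32_t blocksize, int batched)
{
	assert(blocksize > 0 && DEMO_STRIPE_LEN % blocksize == 0);
	return batched ? 1 : DEMO_STRIPE_LEN / blocksize;
}
```

With a 4K blocksize this gives 16 calculations per stripe for the per-block
path, against a single one when the whole stripe length is handled at once.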

Signed-off-by: Liu Bo 
---
v2: - Place bio allocation in code statement.
- Get rid of bio_set_op_attrs.
- Add SOB.

 fs/btrfs/scrub.c | 79 +++-
 1 file changed, 61 insertions(+), 18 deletions(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index ec56f33..3ccabad 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -1718,6 +1718,45 @@ static int scrub_submit_raid56_bio_wait(struct 
btrfs_fs_info *fs_info,
return blk_status_to_errno(bio->bi_status);
 }
 
+static void scrub_recheck_block_on_raid56(struct btrfs_fs_info *fs_info,
+ struct scrub_block *sblock)
+{
+   struct scrub_page *first_page = sblock->pagev[0];
+   struct bio *bio;
+   int page_num;
+
+   /* All pages in sblock belong to the same stripe on the same device. */
+   ASSERT(first_page->dev);
+   if (!first_page->dev->bdev)
+   goto out;
+
+   bio = btrfs_io_bio_alloc(BIO_MAX_PAGES);
+   bio_set_dev(bio, first_page->dev->bdev);
+
+   for (page_num = 0; page_num < sblock->page_count; page_num++) {
+   struct scrub_page *page = sblock->pagev[page_num];
+
+   WARN_ON(!page->page);
+   bio_add_page(bio, page->page, PAGE_SIZE, 0);
+   }
+
+   if (scrub_submit_raid56_bio_wait(fs_info, bio, first_page)) {
+   bio_put(bio);
+   goto out;
+   }
+
+   bio_put(bio);
+
+   scrub_recheck_block_checksum(sblock);
+
+   return;
+out:
+   for (page_num = 0; page_num < sblock->page_count; page_num++)
+   sblock->pagev[page_num]->io_error = 1;
+
+   sblock->no_io_error_seen = 0;
+}
+
 /*
  * this function will check the on disk data for checksum errors, header
  * errors and read I/O errors. If any I/O errors happen, the exact pages
@@ -1733,6 +1772,10 @@ static void scrub_recheck_block(struct btrfs_fs_info 
*fs_info,
 
sblock->no_io_error_seen = 1;
 
+   /* short cut for raid56 */
+   if (!retry_failed_mirror && scrub_is_page_on_raid56(sblock->pagev[0]))
+   return scrub_recheck_block_on_raid56(fs_info, sblock);
+
for (page_num = 0; page_num < sblock->page_count; page_num++) {
struct bio *bio;
struct scrub_page *page = sblock->pagev[page_num];
@@ -1748,19 +1791,12 @@ static void scrub_recheck_block(struct btrfs_fs_info 
*fs_info,
bio_set_dev(bio, page->dev->bdev);
 
bio_add_page(bio, page->page, PAGE_SIZE, 0);
-   if (!retry_failed_mirror && scrub_is_page_on_raid56(page)) {
-   if (scrub_submit_raid56_bio_wait(fs_info, bio, page)) {
-   page->io_error = 1;
-   sblock->no_io_error_seen = 0;
-   }
-   } else {
-   bio->bi_iter.bi_sector = page->physical >> 9;
-   bio_set_op_attrs(bio, REQ_OP_READ, 0);
+   bio->bi_iter.bi_sector = page->physical >> 9;
+   bio->bi_opf = REQ_OP_READ;
 
-   if (btrfsic_submit_bio_wait(bio)) {
-   page->io_error = 1;
-   sblock->no_io_error_seen = 0;
-   }
+   if (btrfsic_submit_bio_wait(bio)) {
+   page->io_error = 1;
+   sblock->no_io_error_seen = 0;
}
 
bio_put(bio);
@@ -2728,7 +2764,8 @@ static int scrub_find_csum(struct scrub_ctx *sctx, u64 
logical, u8 *csum)
 }
 
 /* scrub extent tries to collect up to 64 kB for each bio */
-static int scrub_extent(struct scrub_ctx *sctx, u64 logical, u64 len,
+static int scrub_extent(struct scrub_ctx *sctx, struct map_lookup *map,
+   u64 logical, u64 len,
u64 physical, struct btrfs_device *dev, u64 flags,
u64 gen, int mirror_num, u64 physical_for_dev_replace)
 {
@@ -2737,13 +2774,19 @@ static int scrub_extent(struct scrub_ctx *sctx, u64 
logical, u64 len,
u32 blocksize;
 

Inconsistence between sender and receiver

2018-03-07 Thread Liu Bo
Hi,

In the following steps[1], if  on receiver side has got
changed via 'btrfs property set', then after doing incremental
updates, receiver gets a different snapshot from what sender has sent.

The reason behind it is that there is no change about file 'foo' in
the send stream, such that receiver simply creates a snapshot of
 on its side with nothing to apply from the send stream.

A possible way to avoid this is to check rtransid and ctransid of
 on receiver side, but I'm not very sure whether the current
behavior is made deliberately, does anyone have an idea? 

Thanks,

-liubo

[1]:
$ btrfs sub create /mnt/send/sub
$ touch /mnt/send/sub/foo
$ btrfs sub snap -r /mnt/send/sub /mnt/send/parent

# send parent out
$ btrfs send /mnt/send/parent | btrfs receive /mnt/recv/

# change parent and file under it
$ btrfs property set -t subvol /mnt/recv/parent ro false
$ truncate -s 4096 /mnt/recv/parent/foo

$ btrfs sub snap -r /mnt/send/sub /mnt/send/update
$ btrfs send -p /mnt/send/parent /mnt/send/update | btrfs receive /mnt/recv

$ ls -l /mnt/send/update
total 0
-rw-r--r-- 1 root root 0 Mar  6 11:13 foo

$ ls -l /mnt/recv/update
total 0
-rw-r--r-- 1 root root 4096 Mar  6 11:14 foo

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] btrfs-progs: qgroups: introduce btrfs_qgroup_query

2018-03-07 Thread Jeff Mahoney
On 3/7/18 12:58 AM, Qu Wenruo wrote:
> 
> 
> On 2018-03-03 02:47, je...@suse.com wrote:
>> diff --git a/qgroup.c b/qgroup.c
>> index b1be3311..2d0a6947 100644
>> --- a/qgroup.c
>> +++ b/qgroup.c
>> @@ -1267,6 +1249,66 @@ static int __qgroups_search(int fd, struct 
>> qgroup_lookup *qgroup_lookup)
>>  return ret;
>>  }
>>  
>> +static int qgroups_search_all(int fd, struct qgroup_lookup *qgroup_lookup)
>> +{
>> +struct btrfs_ioctl_search_args args = {
>> +.key = {
>> +.tree_id = BTRFS_QUOTA_TREE_OBJECTID,
>> +.max_type = BTRFS_QGROUP_RELATION_KEY,
>> +.min_type = BTRFS_QGROUP_STATUS_KEY,
>> +.max_objectid = (u64)-1,
>> +.max_offset = (u64)-1,
>> +.max_transid = (u64)-1,
>> +.nr_items = 4096,
>> +},
>> +};
>> +int ret;
>> +
>> +ret = __qgroups_search(fd, &args, qgroup_lookup);
>> +if (ret == -ENOTTY)
>> +error("can't list qgroups: quotas not enabled");
>> +else if (ret < 0)
>> +error("can't list qgroups: %s", strerror(-ret));
>> +return ret;
>> +}
>> +
>> +int btrfs_qgroup_query(int fd, u64 qgroupid, struct btrfs_qgroup_stats 
>> *stats)
>> +{
>> +struct btrfs_ioctl_search_args args = {
>> +.key = {
>> +.tree_id = BTRFS_QUOTA_TREE_OBJECTID,
>> +.min_type = BTRFS_QGROUP_INFO_KEY,
>> +.max_type = BTRFS_QGROUP_LIMIT_KEY,
>> +.max_objectid = 0,
>> +.max_offset = qgroupid,
>> +.max_transid = (u64)-1,
>> +.nr_items = 4096, /* should be 2, i think */
> 
> 2 is not correct in fact.
> 
> As QGROUP_INFO is smaller than QGROUP_LIMIT, to get a slice of all what
> we need, we need to include all other unrelated items.
> 
> One example will be:
>   item 1 key (0 QGROUP_INFO 0/5) itemoff 16211 itemsize 40
>   item 2 key (0 QGROUP_INFO 0/257) itemoff 16171 itemsize 40
>   item 3 key (0 QGROUP_INFO 1/1) itemoff 16131 itemsize 40
>   item 4 key (0 QGROUP_LIMIT 0/5) itemoff 16091 itemsize 40
>   item 5 key (0 QGROUP_LIMIT 0/257) itemoff 16051 itemsize 40
>   item 6 key (0 QGROUP_LIMIT 1/1) itemoff 16011 itemsize 40
> 
> To query qgroup info about 0/257, above setup will get the following slice:
>   item 1 key (0 QGROUP_INFO 0/5) itemoff 16211 itemsize 40
>   item 2 key (0 QGROUP_INFO 0/257) itemoff 16171 itemsize 40
>   item 3 key (0 QGROUP_INFO 1/1) itemoff 16131 itemsize 40
>   item 4 key (0 QGROUP_LIMIT 0/5) itemoff 16091 itemsize 40
>   item 5 key (0 QGROUP_LIMIT 0/257) itemoff 16051 itemsize 40
> So we still need that large @nr_items.
> 
> Despite this comment it looks good.

Of course.  I use TREE_SEARCH so infrequently that I forget about this
every time so the pain is always fresh.

It should be .min_offset = qgroupid, .nr_items = 2, but of course that
doesn't work either for different reasons.  __qgroups_search's loop will
loop until it comes back with no more results and it sets the nr_items
itself to 4096 at the end of the loop.  The key comparison in the ioctl
only does a regular key comparison and offset doesn't get evaluated if
the types aren't equal.  That works fine when doing tree insertion or
searches for a single key but is wrong for searching for a range.  I
have a TREE_SEARCH_V3 lying around somewhere to address this ridiculous
behavior and should probably finish it up at some point.

This hasn't mattered for __qgroups_search until now since it hasn't used
anything other than -1 for the offset and objectid so I'll just add a
filter there.
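
The behaviour discussed above can be reproduced in userspace with a plain
compound key comparison. This is a sketch only: the demo_* names are
hypothetical, the key type values (242 for QGROUP_INFO, 244 for QGROUP_LIMIT)
and the level/id packing of a qgroupid are assumptions taken from memory of
the btrfs headers, and the range check is a simplified model rather than the
ioctl's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed key type values; check the btrfs headers before relying on them. */
#define DEMO_QGROUP_INFO_KEY  242
#define DEMO_QGROUP_LIMIT_KEY 244

struct demo_key {
	uint64_t objectid;
	uint8_t type;
	uint64_t offset;
};

/* Assumed packing: qgroup level in the top 16 bits, id in the low 48. */
static uint64_t demo_qgroupid(uint64_t level, uint64_t id)
{
	assert(level < (1ULL << 16) && id < (1ULL << 48));
	return (level << 48) | id;
}

/* Compound comparison: offset only matters when objectid and type match. */
static int demo_key_cmp(const struct demo_key *a, const struct demo_key *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Range search as a pair of compound comparisons against min and max. */
static bool demo_key_in_range(const struct demo_key *key,
			      const struct demo_key *min,
			      const struct demo_key *max)
{
	return demo_key_cmp(key, min) >= 0 && demo_key_cmp(key, max) <= 0;
}

/* Caller-side filter: keep only items that belong to the wanted qgroup. */
static bool demo_key_wanted(const struct demo_key *key, uint64_t qgroupid)
{
	return key->offset == qgroupid;
}
```

With min = (0, INFO, 0) and max = (0, LIMIT, 0/257), the INFO item for 1/1
still falls inside the range because its type sorts below LIMIT, which is why
a filter on the offset is needed on the caller side.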

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH] Improve error stats message

2018-03-07 Thread Diego
On Wednesday, 7 March 2018 19:24:53 (CET) Hugo Mills wrote:
>On multi-device filesystems, the two are not necessarily the same.

Ouch. FWIW, I was moved to do this because I saw this conversation on
IRC which made me think that people aren't understanding what the
message means:

   hi! I noticed bdev rd 13  as a kernel message
   what does it mean
   Well, that's not the whole message.
   Can you paste the whole line in here? (Just one line)
   [3.404959] BTRFS info (device sda4): bdev /dev/sda4 errs: wr 
0, rd 13, flush 0, corrupt 0, gen 0


Maybe something like this would be better:

BTRFS info (device sda4): disk /dev/sda4 errors: write 0, read 13, flush 0, 
corrupt 0, generation 0


---
 fs/btrfs/volumes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ceb924ca0d6..cfa029468585 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7239,7 +7239,7 @@ static void btrfs_dev_stat_print_on_error(struct 
btrfs_device *dev)
if (!dev->dev_stats_valid)
return;
btrfs_err_rl_in_rcu(dev->fs_info,
-   "bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
+   "disk %s errors: write %u, read %u, flush %u, corrupt %u, 
generation %u",
   rcu_str_deref(dev->name),
   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_READ_ERRS),
-- 
2.16.2






Re: [PATCH 1/2] btrfs: Add unprivileged subvolume search ioctl

2018-03-07 Thread Goffredo Baroncelli
On 03/07/2018 01:40 AM, Misono, Tomohiro wrote:
> On 2018/03/07 5:29, Goffredo Baroncelli wrote:
>> On 03/06/2018 09:30 AM, Misono, Tomohiro wrote:
>>> Add new unprivileged ioctl (BTRFS_IOC_GET_SUBVOL_INFO) which searches
>>> and returns only subvolume related item (ROOT_ITEM/ROOT_BACKREF/ROOT_REF)
>>> from root tree. The arguments of this ioctl are the same as treesearch
>>> ioctl and can be used like treesearch ioctl.
>>
>> Is it a pro ? The treesearch ioctl is tightly coupled to the btrfs internal 
>> structure, this means that if we would change the btrfs internal structure, 
>> we have to update all the clients of this api. For the treesearch it is an 
>> acceptable compromise between flexibility and speed of developing. But for a 
>> more specialized API, I think that it would make sense to provide a less 
>> coupled api to the internal structure.
> 
> Thanks for the comments.
> 
> The reason I choose the same api is just to minimize the code change in 
> btrfs-progs.
> For tree search part, it works just switching the ioctl number from 
> BTRFS_IOC_TREE_SEARCH
> to BTRFS_IOC_GET_SUBVOL_INFO in list_subvol_search()[1].
> 
> [1] https://marc.info/?l=linux-btrfs=152032537911218=2

I suggest avoiding this approach. An API is forever; we already have a 
"root-only" one which is quite unfriendly and error prone; I think that we 
should put all our energy into making a better one. 

I think that the major weaknesses of this api are:
- it exports the data in "le" format (see struct btrfs_root_item as an 
example); 
- it requires the user to increase the key for the next ioctl call. This could 
be done in kernel space before returning;
- this ioctl exports both the ROOT_BACKREF and ROOT_ITEM info. Why not make two 
separate (simpler) ioctl(s)?

>>
>> Below some comments
[...]

>>> +   if ((key->objectid == BTRFS_FS_TREE_OBJECTID ||
>>> +   (key->objectid >= BTRFS_FIRST_FREE_OBJECTID &&
>>> +key->objectid <= BTRFS_LAST_FREE_OBJECTID)) &&
>>> +   key->type >= BTRFS_ROOT_ITEM_KEY &&
>>> +   key->type <= BTRFS_ROOT_BACKREF_KEY)
>>
>> Why returning all the range [BTRFS_ROOT_ITEM_KEY...BTRFS_ROOT_BACKREF_KEY]. 
>> It would be sufficient to return only
>>
>>   +  (key->type == BTRFS_ROOT_ITEM_KEY ||
>>   +   key->type == BTRFS_ROOT_BACKREF_KEY))
> 
> Sorry, it is a mistake, I mean "key->type <= BTRFS_ROOTREF_KEY".
> Although btrfs-progs only uses BTRFS_ROOT_BACKREF_KEY, I notice libbtrfs 
> uses BTRFS_ROOT_REF_KEY instead. So, I think it is better to return both
> ROOT_BACKREF_KEY and ROOT_REF_KEY. I will fix this in v2.
> 

I was referring to the '>=' and '<=' instead of '=='. If another type is added 
in the middle, it would be returned. I find it a bit error prone.
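
A small sketch of the difference, with hypothetical demo_* names. The numeric
key type values (132, 144) are assumptions taken from memory of the btrfs
headers, and type 140 stands in for an imaginary item type that might be added
between them later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values of the btrfs key types; verify against the headers. */
#define DEMO_ROOT_ITEM_KEY    132
#define DEMO_ROOT_BACKREF_KEY 144

static_assert(DEMO_ROOT_ITEM_KEY < DEMO_ROOT_BACKREF_KEY,
	      "range check below assumes ITEM sorts before BACKREF");

/* Range check: silently accepts any type that lands between the bounds. */
static bool demo_type_in_range(uint8_t type)
{
	return type >= DEMO_ROOT_ITEM_KEY && type <= DEMO_ROOT_BACKREF_KEY;
}

/* Explicit check: accepts only the types that are meant to be exported. */
static bool demo_type_listed(uint8_t type)
{
	return type == DEMO_ROOT_ITEM_KEY || type == DEMO_ROOT_BACKREF_KEY;
}
```

Both checks agree on the two known types, but only the range check would let
the imaginary type 140 through to an unprivileged caller.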

BR
G.Baroncelli
-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: [PATCH] Improve error stats message

2018-03-07 Thread Hugo Mills
On Wed, Mar 07, 2018 at 06:37:29PM +0100, Diego wrote:
> A typical notification of filesystem errors looks like this:
> 
> BTRFS error (device sda2): bdev /dev/sda2 errs: wr 0, rd 1, flush 0, corrupt 
> 0, gen 0
> 
> The device name is being printed twice.

   For good reason -- the first part ("device sda2") indicates the
filesystem, and is the arbitrarily-selected device used by the kernel
to represent the FS. The second part ("bdev /dev/sda2") indicates the
_actual_ device for which the errors are being reported.

   On multi-device filesystems, the two are not necessarily the same.

   Hugo.

> Also, these abbreviations
> feel unnecessary. Make the message look like this instead:
> 
> BTRFS error (device sda2): errors: write 0, read 1, flush 0, corrupt 0, 
> generation 0
> 
> 
> Signed-off-by: Diego Calleja 
> ---
>  fs/btrfs/volumes.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 2ceb924ca0d6..52fee5bb056f 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7238,9 +7238,8 @@ static void btrfs_dev_stat_print_on_error(struct 
> btrfs_device *dev)
>  {
>   if (!dev->dev_stats_valid)
>   return;
> - btrfs_err_rl_in_rcu(dev->fs_info,
> - "bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
> -rcu_str_deref(dev->name),
> + btrfs_err_rl(dev->fs_info,
> + "errors: write %u, read %u, flush %u, corrupt %u, generation 
> %u",
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_READ_ERRS),
>  btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_FLUSH_ERRS),

-- 
Hugo Mills | Would you like an ocelot with that non-sequitur?
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




[PATCH] Improve error stats message

2018-03-07 Thread Diego
A typical notification of filesystem errors looks like this:

BTRFS error (device sda2): bdev /dev/sda2 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0

The device name is being printed twice. Also, these abbreviations
feel unnecessary. Make the message look like this instead:

BTRFS error (device sda2): errors: write 0, read 1, flush 0, corrupt 0, generation 0


Signed-off-by: Diego Calleja 
---
 fs/btrfs/volumes.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 2ceb924ca0d6..52fee5bb056f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7238,9 +7238,8 @@ static void btrfs_dev_stat_print_on_error(struct btrfs_device *dev)
 {
 	if (!dev->dev_stats_valid)
 		return;
-	btrfs_err_rl_in_rcu(dev->fs_info,
-		"bdev %s errs: wr %u, rd %u, flush %u, corrupt %u, gen %u",
-			   rcu_str_deref(dev->name),
+	btrfs_err_rl(dev->fs_info,
+		"errors: write %u, read %u, flush %u, corrupt %u, generation %u",
 			   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_WRITE_ERRS),
 			   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_READ_ERRS),
 			   btrfs_dev_stat_read(dev, BTRFS_DEV_STAT_FLUSH_ERRS),
-- 
2.16.2






Re: [PATCH 1/2] btrfs: Add unprivileged subvolume search ioctl

2018-03-07 Thread David Sterba
On Wed, Mar 07, 2018 at 09:40:18AM +0900, Misono, Tomohiro wrote:
> On 2018/03/07 5:29, Goffredo Baroncelli wrote:
> > On 03/06/2018 09:30 AM, Misono, Tomohiro wrote:
> >> Add new unprivileged ioctl (BTRFS_IOC_GET_SUBVOL_INFO) which searches
> >> and returns only subvolume related item (ROOT_ITEM/ROOT_BACKREF/ROOT_REF)
> >> from root tree. The arguments of this ioctl are the same as treesearch
> >> ioctl and can be used like treesearch ioctl.
> > 
> > Is it a pro ? The treesearch ioctl is tightly coupled to the btrfs internal 
> > structure, this means that if we would change the btrfs internal structure, 
> > we have to update all the clients of this api. For the treesearch it is an 
> > acceptable compromise between flexibility and speed of developing. But for 
> > a more specialized API, I think that it would make sense to provide a less 
> > coupled api to the internal structure.
> 
> Thanks for the comments.
> 
> The reason I choose the same api is just to minimize the code change in 
> btrfs-progs.

That's not IMO a good reason. We can change the code in btrfs-progs, and
that's not going to be the only user of the ioctl, so the interface (i.e.
the structures) should be adapted for the needs of the ioctl.


Re: [PATCH 4/8] btrfs-progs: qgroups: add pathname to show output

2018-03-07 Thread Jeff Mahoney
On 3/7/18 12:45 AM, Qu Wenruo wrote:
> 
> 
> On 2018年03月03日 02:47, je...@suse.com wrote:
>> diff --git a/cmds-qgroup.c b/cmds-qgroup.c
>> index 48686436..94cd0fd3 100644
>> --- a/cmds-qgroup.c
>> +++ b/cmds-qgroup.c
>> @@ -280,8 +280,10 @@ static const char * const cmd_qgroup_show_usage[] = {
>>  "   (including ancestral qgroups)",
>>  "-f list all qgroups which impact the given path",
>>  "   (excluding ancestral qgroups)",
>> +"-P print first-level qgroups using pathname",
>> +"-v verbose, prints all nested subvolumes",
> 
> Did you mean the subvolume paths of all children qgroups?

Yes.  I'll make that clearer.

>>  HELPINFO_UNITS_LONG,
>> -"--sort=qgroupid,rfer,excl,max_rfer,max_excl",
>> +"--sort=qgroupid,rfer,excl,max_rfer,max_excl,pathname",
>>  "   list qgroups sorted by specified items",
>>  "   you can use '+' or '-' in front of each item.",
>>  "   (+:ascending, -:descending, ascending default)",

>> diff --git a/qgroup.c b/qgroup.c
>> index 67bc0738..83918134 100644
>> --- a/qgroup.c
>> +++ b/qgroup.c
>> @@ -210,8 +220,42 @@ static void print_qgroup_column_add_blank(enum 
>> btrfs_qgroup_column_enum column,
>>  printf(" ");
>>  }
>>  
>> +void print_pathname_column(struct btrfs_qgroup *qgroup, bool verbose)
>> +{
>> +struct btrfs_qgroup_list *list = NULL;
>> +
>> +fputs("  ", stdout);
>> +if (btrfs_qgroup_level(qgroup->qgroupid) > 0) {
>> +int count = 0;
> 
> Newline after declaration please.

Ack.

> And a declaration in an if() {} block doesn't pass checkpatch IIRC.

Declarations in if () {} are fine.

>> +list_for_each_entry(list, &qgroup->qgroups,
>> +next_qgroup) {
>> +if (verbose) {
>> +struct btrfs_qgroup *member = list->qgroup;
> 
> Same coding style problem here.

Ack.

>> +u64 level = 
>> btrfs_qgroup_level(member->qgroupid);
>> +u64 sid = btrfs_qgroup_subvid(member->qgroupid);
>> +if (count)
>> +fputs(" ", stdout);
>> +if (btrfs_qgroup_level(member->qgroupid) == 0)
>> +fputs(member->pathname, stdout);
> 
> What about checking member->pathname before outputting?
> As it could be missing.

Yep.

>> +static const char *qgroup_pathname(int fd, u64 qgroupid)
>> +{
>> +struct root_info root_info;
>> +int ret;
>> +char *pathname;
>> +
>> +ret = get_subvol_info_by_rootid_fd(fd, &root_info, qgroupid);

This is a leak too.  Callers are responsible for freeing the root_info
paths.  With this and your review fixed, valgrind passes with 0 leaks
for btrfs qgroups show -P.

>> +if (ret)
>> +return NULL;
>> +
>> +ret = asprintf(&pathname, "%s%s",
>> +   root_info.full_path[0] == '/' ? "" : "/",
>> +   root_info.full_path);
>> +if (ret < 0)
>> +return NULL;
>> +
>> +return pathname;
>> +}
>> +
>>  /*
>>   * Lookup or insert btrfs_qgroup into qgroup_lookup.
>>   *
>> @@ -588,7 +697,7 @@ static struct btrfs_qgroup *qgroup_tree_search(struct 
>> qgroup_lookup *root_tree,
>>   * Return the pointer to the btrfs_qgroup if found or if inserted 
>> successfully.
>>   * Return ERR_PTR if any error occurred.
>>   */
>> -static struct btrfs_qgroup *get_or_add_qgroup(
>> +static struct btrfs_qgroup *get_or_add_qgroup(int fd,
>>  struct qgroup_lookup *qgroup_lookup, u64 qgroupid)
>>  {
>>  struct btrfs_qgroup *bq;
>> @@ -608,6 +717,8 @@ static struct btrfs_qgroup *get_or_add_qgroup(
>>  INIT_LIST_HEAD(&bq->qgroups);
>>  INIT_LIST_HEAD(&bq->members);
>>  
>> +bq->pathname = qgroup_pathname(fd, qgroupid);
>> +
> 
> Here qgroup_pathname() will allocate memory, but no one is freeing this
> memory.
> 
> The cleaner should be in __free_btrfs_qgroup() but there is no
> modification to that function.

Ack.
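
For reference, the ownership rule discussed here could look roughly like the userspace sketch below. struct sketch_qgroup, sketch_pathname() and sketch_qgroup_free() are hypothetical names, and plain malloc/snprintf stand in for asprintf; the actual fix would go into __free_btrfs_qgroup() as noted in the review.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct btrfs_qgroup; only the owned string matters. */
struct sketch_qgroup {
	char *pathname;	/* heap-allocated, owned by the qgroup; may be NULL */
};

/* Build a path with a leading '/' from full_path. The caller owns the result. */
char *sketch_pathname(const char *full_path)
{
	const char *prefix = full_path[0] == '/' ? "" : "/";
	size_t len = strlen(prefix) + strlen(full_path) + 1;
	char *out = malloc(len);

	if (!out)
		return NULL;
	snprintf(out, len, "%s%s", prefix, full_path);
	return out;
}

/* Counterpart of the allocation: release what the qgroup owns. */
void sketch_qgroup_free(struct sketch_qgroup *q)
{
	free(q->pathname);	/* free(NULL) is a no-op */
	q->pathname = NULL;
}
```

Every successful sketch_pathname() call is paired with exactly one free, either directly or through sketch_qgroup_free().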

Thanks for the review,

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH] btrfs: qgroup: Fix root item corruption when multiple same source snapshots are created with quota enabled

2018-03-07 Thread David Sterba
On Fri, Feb 02, 2018 at 11:45:46AM +0000, Filipe Manana wrote:
> On Tue, Dec 19, 2017 at 7:44 AM, Qu Wenruo  wrote:
> > When multiple pending snapshots referring the same source subvolume are
> > executed, enabled quota will cause root item corruption, where root
> > items are using old bytenr (no backref in extent tree).
> >
> > This can be triggered by fstests btrfs/152.
> >
> > The cause is when source subvolume is still dirty, extra commit
> > (simplified transaction commit) of qgroup_account_snapshot() can skip
> > dirty roots not recorded in current transaction, making root item of
> > source subvolume not updated.
> >
> > Fix it by forcing recording source subvolume in current transaction
> > before qgroup sub-transaction commit.
> >
> > Reported-by: Justin Maggard 
> > Signed-off-by: Qu Wenruo 
> Reviewed-by: Filipe Manana 

I overlooked the patch, sorry. Added to next now, thanks.


Re: [PATCH v2] Btrfs: fix unexpected cow in run_delalloc_nocow

2018-03-07 Thread David Sterba
On Wed, Jan 31, 2018 at 05:09:13PM -0700, Liu Bo wrote:
> Fstests generic/475 provides a way to fail metadata reads while
> checking if checksum exists for the inode inside run_delalloc_nocow(),
> and csum_exist_in_range() interprets error (-EIO) as inode having
> checksum and makes its caller enters the cow path.
> 
> In case of free space inode, this ends up with a warning in
> cow_file_range().
> 
> The same problem applies for btrfs_cross_ref_exist() since it may also
> read metadata in between.
> 
> With this, run_delalloc_nocow() bails out when errors occur at the two
> places.
> 
> cc:  v2.6.28+
> Fixes: 17d217fe970d ("Btrfs: fix nodatasum handling in balancing code")
> Signed-off-by: Liu Bo 

For the record, this has been added to next some time ago and testing
was ok.


Re: [PATCH v3] btrfs: fix bare unsigned declarations

2018-03-07 Thread David Sterba
On Mon, Feb 26, 2018 at 04:46:05PM +0800, Anand Jain wrote:
> We have btrfs_fs_info::data_chunk_allocations and
> btrfs_fs_info::metadata_ratio declared as unsigned which would
> be unsigned int and kernel style prefers unsigned int over bare
> unsigned. So this patch changes them to u32.
> 
> Signed-off-by: Anand Jain 
> ---
> v2->v3: @old_metadata_ratio change it to u32
> v1->v2: Update change log.

Added to next, thanks.


Re: [PATCH] btrfs: Relax memory barrier in btrfs_tree_unlock

2018-03-07 Thread David Sterba
On Wed, Feb 14, 2018 at 02:37:26PM +0200, Nikolay Borisov wrote:
> When performing an unlock on an extent buffer we'd like to order the
> decrement of extent_buffer::blocking_writers with waking up any
> waiters. In such situations it's sufficient to use smp_mb__after_atomic
> rather than the heavy smp_mb. On architectures where atomic operations
> are fully ordered (such as x86 or s390) unconditionally executing
> a heavyweight smp_mb instruction causes a severe hit to performance
> while bringing no improvements in terms of correctness.
> 
> The better thing is to use the appropriate smp_mb__after_atomic routine
> which will do the correct thing (invoke a full smp_mb or in the case
> of ordered atomics insert a compiler barrier). Put another way,
> an RMW atomic op + smp_mb__after_atomic is equivalent, in terms of
> semantics, to a full smp_mb. This ensures that none of the problems
> described in the accompanying comment of waitqueue_active occur.
> No functional changes.
> 
> Signed-off-by: Nikolay Borisov 

Reviewed-by: David Sterba 
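
As a rough userspace analogue of the pattern described in the commit message, the sketch below uses C11 atomics: a relaxed read-modify-write followed by a sequentially consistent fence plays the role of atomic_dec() + smp_mb__after_atomic(). The struct and function names are hypothetical, and the mapping from kernel barriers to C11 fences is only an approximation, not the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant extent_buffer fields. */
struct sketch_lock {
	atomic_int blocking_writers;
	atomic_bool waiter_present;	/* stands in for waitqueue_active() */
};

/*
 * Drop one blocking writer, then check for waiters.
 * Returns true if the caller should wake somebody up.
 */
bool sketch_unlock(struct sketch_lock *l)
{
	/* RMW atomic op without ordering of its own */
	atomic_fetch_sub_explicit(&l->blocking_writers, 1,
				  memory_order_relaxed);
	/* full fence: orders the decrement before the waiter check */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&l->waiter_present,
				    memory_order_relaxed);
}
```

The point of the fence is the same as in the quoted text: the waiter check must not be reordered before the decrement, otherwise a sleeper could be missed.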


Re: [PATCH] btrfs: qgroups, properly handle no reservations

2018-03-07 Thread David Sterba
On Thu, Feb 22, 2018 at 10:05:36AM +0800, Qu Wenruo wrote:
> 
> 
> On 2018年02月22日 09:50, Jeff Mahoney wrote:
> > On 2/21/18 8:36 PM, Qu Wenruo wrote:
> >>
> >>
> >> On 2018年02月22日 04:19, je...@suse.com wrote:
> >>> From: Jeff Mahoney 
> >>>
> >>> There are several places where we call btrfs_qgroup_reserve_meta and
> >>> assume that a return value of 0 means that the reservation was successful.
> >>>
> >>> Later, we use the original bytes value passed to that call to free
> >>> bytes during error handling or to pass the number of bytes reserved to
> >>> the caller.
> >>>
> >>> This patch returns -ENODATA when we don't perform a reservation so that
> >>> callers can make the distinction.  This also lets call sites not
> >>> necessarily care whether qgroups are enabled.
> >>
> >> IMHO if we don't need to reserve, returning 0 seems good enough.
> >> Caller doesn't really need to care if it has reserved some bytes.
> >>
> >> Or is there any special case where we need to distinguish such case?
> > 
> > Anywhere where the reservation takes place prior to the transaction
> > starting, which is pretty much everywhere.  We wait until transaction
> > commit to flip the bit to turn on quotas, which means that if a
> > > transaction commit that enables quotas lands in between the reservation
> > > being taken and any error handling that involves freeing the reservation,
> > we'll end up with an underflow.
> 
> So the same case as btrfs_qgroup_reserve_data().
> 
> In that case we could use ret > 0 to indicates the real bytes we
> reserved, instead of -ENODATA which normally means error.
> 
> > 
> > This is the first patch of a series I'm working on, but it can stand
> > alone.  The rest is the patch set I mentioned when we talked a few
> > months ago where the lifetimes of reservations are incorrect.  We can't
> > just drop all the reservations at the end of the transaction because 1)
> > the lifetime of some reservations can cross transactions and 2) because
> > especially in the start_transaction case, we do the reservation prior to
> > waiting to join the transaction.  So if the transaction we're waiting on
> > commits, our reservation goes away with it but we continue on as if we
> > still have it.
> 
> Right, the same problem I also addressed in patchset "[PATCH v2 00/10]
> Use split qgroup rsv type".
> 
> In 6th patch, "[PATCH v2 06/10] btrfs: qgroup: Use
> root->qgroup_meta_rsv_* to record qgroup meta reserved space" qgroup
> meta reserve will only be increased if we succeeded in reserving
> metadata, so later free won't underflow that number.

What should we do now when there are 2 different fixes? Applying Jeff's
patch on top of your qgroup-types patches causes some conflicts that do
not seem to be difficult, but the end result might not work as expected.

The changes do not seem to be fundamentally conflicting; would it make
sense to merge both? The patchset has been in for-next for a while, but I
don't run qgroup-specific tests besides what's in fstests. Also, the
patchset fixes more problems, so I think we need to merge it at some
point, and now is a good time to decide whether it should go into 4.17.

I did a pass through the patches, there are some minor things to fix but
a review from somebody with qgroup knowledge would be still desirable.
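
For readers following along, the underflow described in the quoted exchange, and the idea of reporting how much was actually reserved, can be sketched in a few lines of userspace C. All names here are hypothetical and the accounting is deliberately simplified; it is neither of the two proposed fixes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-root qgroup accounting. */
struct sketch_root {
	bool quota_enabled;
	uint64_t meta_reserved;
};

/* Returns the number of bytes actually reserved (0 if quotas are off). */
uint64_t sketch_reserve_meta(struct sketch_root *root, uint64_t bytes)
{
	if (!root->quota_enabled)
		return 0;
	root->meta_reserved += bytes;
	return bytes;
}

/* Frees exactly what the caller was told had been reserved. */
void sketch_free_meta(struct sketch_root *root, uint64_t reserved)
{
	root->meta_reserved -= reserved;
}
```

If quotas get enabled between the reserve and the free, a caller that frees the value returned by sketch_reserve_meta() subtracts 0, whereas a caller that frees the byte count it originally asked for would wrap the unsigned counter.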


Re: [PATCH] btrfs: Add nossd_spread mount option

2018-03-07 Thread David Sterba
On Wed, Feb 21, 2018 at 03:31:40PM -0800, Howard McLauchlan wrote:
> Btrfs has two mount options for SSD optimizations: ssd and ssd_spread.
> Presently there is an option to disable all SSD optimizations, but there
> isn't an option to disable just ssd_spread.
> 
> This patch adds a mount option nossd_spread that disables ssd_spread
> only.
> 
> Signed-off-by: Howard McLauchlan 
> ---
>  fs/btrfs/super.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 6e71a2a78363..4c0fcf5b3e7e 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -310,10 +310,10 @@ static void btrfs_put_super(struct super_block *sb)
>  enum {
>   Opt_degraded, Opt_subvol, Opt_subvolid, Opt_device, Opt_nodatasum,
>   Opt_nodatacow, Opt_max_inline, Opt_alloc_start, Opt_nobarrier, Opt_ssd,
> - Opt_nossd, Opt_ssd_spread, Opt_thread_pool, Opt_noacl, Opt_compress,
> - Opt_compress_type, Opt_compress_force, Opt_compress_force_type,
> - Opt_notreelog, Opt_ratio, Opt_flushoncommit, Opt_discard,
> - Opt_space_cache, Opt_space_cache_version, Opt_clear_cache,
> + Opt_nossd, Opt_ssd_spread, Opt_nossd_spread, Opt_thread_pool, Opt_noacl,
> + Opt_compress, Opt_compress_type, Opt_compress_force,
> + Opt_compress_force_type, Opt_notreelog, Opt_ratio, Opt_flushoncommit,
> + Opt_discard, Opt_space_cache, Opt_space_cache_version, Opt_clear_cache,
>   Opt_user_subvol_rm_allowed, Opt_enospc_debug, Opt_subvolrootid,
>   Opt_defrag, Opt_inode_cache, Opt_no_space_cache, Opt_recovery,
>   Opt_skip_balance, Opt_check_integrity,
> @@ -353,6 +353,7 @@ static const match_table_t tokens = {
>   {Opt_ssd, "ssd"},
>   {Opt_ssd_spread, "ssd_spread"},
>   {Opt_nossd, "nossd"},
> + {Opt_nossd_spread, "nossd_spread"},
>   {Opt_acl, "acl"},
>   {Opt_noacl, "noacl"},
>   {Opt_notreelog, "notreelog"},
> @@ -582,6 +583,10 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char 
> *options,
>   btrfs_clear_and_info(info, SSD_SPREAD,
>"not using spread ssd allocation 
> scheme");
>   break;
> + case Opt_nossd_spread:
> + btrfs_clear_and_info(info, SSD_SPREAD,
> +  "not using spread ssd allocation 
> scheme");

The message is the same as above and the 2 cases can be merged.
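
Something along the lines of the following standalone sketch, where one case falls through into the other so the shared action exists only once. The SKETCH_* names and sketch_apply_opt() are made up for illustration and are not the kernel's option parser.

```c
/* Hypothetical option tokens and flag bits, for illustration only. */
enum sketch_opt {
	SKETCH_OPT_SSD,
	SKETCH_OPT_SSD_SPREAD,
	SKETCH_OPT_NOSSD,
	SKETCH_OPT_NOSSD_SPREAD,
};

#define SKETCH_SSD		(1u << 0)
#define SKETCH_SSD_SPREAD	(1u << 1)

/* Apply one mount option to a flag word and return the new flags. */
unsigned int sketch_apply_opt(unsigned int flags, enum sketch_opt opt)
{
	switch (opt) {
	case SKETCH_OPT_SSD:
		flags |= SKETCH_SSD;
		break;
	case SKETCH_OPT_SSD_SPREAD:
		flags |= SKETCH_SSD | SKETCH_SSD_SPREAD;
		break;
	case SKETCH_OPT_NOSSD:
		flags &= ~SKETCH_SSD;
		/* nossd also has to clear ssd_spread, shared with the next case */
		/* fall through */
	case SKETCH_OPT_NOSSD_SPREAD:
		flags &= ~SKETCH_SSD_SPREAD;
		break;
	}
	return flags;
}
```
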

> + break;
>   case Opt_barrier:
>   btrfs_clear_and_info(info, NOBARRIER,
>"turning on barriers");
> -- 
> 2.14.1
> 


Re: [PATCH 8/8] btrfs-progs: qgroups: export qgroups usage information as JSON

2018-03-07 Thread Jeff Mahoney
On 3/7/18 1:34 AM, Qu Wenruo wrote:
> 
> 
> On 2018年03月03日 02:47, je...@suse.com wrote:
>> diff --git a/configure.ac b/configure.ac
>> index 56d17c3a..6aec672a 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -197,6 +197,12 @@ PKG_STATIC(UUID_LIBS_STATIC, [uuid])
>>  PKG_CHECK_MODULES(ZLIB, [zlib])
>>  PKG_STATIC(ZLIB_LIBS_STATIC, [zlib])
>>  
>> +PKG_CHECK_MODULES(JSON, [json-c], [
> 
> Json-c is quite common and also used by cryptsetup, so pretty good
> library choice.

Yep, so that puts it in the base system packages of most distros.

>> diff --git a/qgroup.c b/qgroup.c
>> index 2d0a6947..f632a45c 100644
>> --- a/qgroup.c
>> +++ b/qgroup.c
>>  return ret;
>>  }
>>  
>> +#ifdef HAVE_JSON
>> +static void format_qgroupid(char *buf, size_t size, u64 qgroupid)
>> +{
>> +int ret;
>> +
>> +ret = snprintf(buf, size, "%llu/%llu",
>> +   btrfs_qgroup_level(qgroupid),
>> +   btrfs_qgroup_subvid(qgroupid));
>> +ASSERT(ret < sizeof(buf));
> 
> This is designed to catch truncated snprintf(), right?
> This can be addressed by setting up the @buf properly.
> (See below)
> 
> And in fact, due to that super magic number, we won't hit this ASSERT()
> anyway.

Yep, but ASSERTs are there to detect/prevent developer mistakes.  This
should only trigger due to a simple bug, so we ASSERT rather than handle
the error gracefully.

[...]

>> +static bool export_one_qgroup(json_object *container,
>> + const struct btrfs_qgroup *qgroup, bool compat)
>> +{
>> +json_object *obj = json_object_new_object();
>> +json_object *tmp;
>> +char buf[42];
> 
> Answer to the ultimate question of life, the universe, and everything. :)
> 
> Although according to the format level/subvolid, it should be
> count_digits(MAX_U16) + 1 + count_digits(MAX_U48) + 1. (1 for '/' and 1
> for '\n')
> 
> Could be defined as a macro as:
> #define QGROUP_FORMAT_BUF_LEN (count_digits(1ULL<<16) + 1 + \
>count_digits(1ULL<<48) + 1)

Which would mean we'd execute count_digits twice for every qgroup to
resolve a constant.  I'll replace the magic number with a define though.

> BTW, the result is just 22.
It's a worst-case.  We're using %llu, so 42 is the length of two 64-bit
numbers in base ten, plus the slash and nul terminator.  In practice we
won't hit the limit, but it doesn't hurt.
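
To spell out the arithmetic: a 64-bit value needs at most 20 decimal digits, so 20 + 1 + 20 + 1 = 42, while the 16-bit level (5 digits) and 48-bit subvolume id (15 digits) give 5 + 1 + 15 + 1 = 22. A minimal sketch with a named constant follows; SKETCH_QGROUPID_BUF_LEN and the sketch_* helpers are hypothetical names, not the ones that will end up in the patch. Note that the truncation check compares against the size argument, since sizeof(buf) on a char * parameter would only give the pointer size.

```c
#include <stdint.h>
#include <stdio.h>

/* Worst case for "%llu/%llu": 20 digits, '/', 20 digits, NUL terminator. */
#define SKETCH_QGROUPID_BUF_LEN (20 + 1 + 20 + 1)

/* qgroupid layout assumed here: level in the top 16 bits, subvolume id below. */
uint64_t sketch_qgroup_level(uint64_t qgroupid)
{
	return qgroupid >> 48;
}

uint64_t sketch_qgroup_subvid(uint64_t qgroupid)
{
	return qgroupid & ((1ULL << 48) - 1);
}

/* Returns 0 on success, -1 if the output was truncated or snprintf failed. */
int sketch_format_qgroupid(char *buf, size_t size, uint64_t qgroupid)
{
	int ret = snprintf(buf, size, "%llu/%llu",
			   (unsigned long long)sketch_qgroup_level(qgroupid),
			   (unsigned long long)sketch_qgroup_subvid(qgroupid));

	/* compare against the caller-provided size, not sizeof(buf) */
	if (ret < 0 || (size_t)ret >= size)
		return -1;
	return 0;
}
```
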

Thanks for the review.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





[RFC PATCH] btrfs: Fix memory ordering of unlocked dio reads vs truncate

2018-03-07 Thread Nikolay Borisov
Signed-off-by: Nikolay Borisov 
---

Hello, 

Sending it as an RFC for the time being to see how people are going to react;
I'd also like some feedback on the mb semantics. For this purpose I've
CC'ed some memory ordering people :)

 fs/btrfs/btrfs_inode.h | 17 -----------------
 fs/btrfs/inode.c       | 40 +++++++++++++++++++++++++++++++++-------
 2 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index f527e99c9f8d..3519e49d4ef0 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -329,23 +329,6 @@ struct btrfs_dio_private {
blk_status_t);
 };
 
-/*
- * Disable DIO read nolock optimization, so new dio readers will be forced
- * to grab i_mutex. It is used to avoid the endless truncate due to
- * nonlocked dio read.
- */
-static inline void btrfs_inode_block_unlocked_dio(struct btrfs_inode *inode)
-{
-   set_bit(BTRFS_INODE_READDIO_NEED_LOCK, &inode->runtime_flags);
-   smp_mb();
-}
-
-static inline void btrfs_inode_resume_unlocked_dio(struct btrfs_inode *inode)
-{
-   smp_mb__before_atomic();
-   clear_bit(BTRFS_INODE_READDIO_NEED_LOCK, &inode->runtime_flags);
-}
-
 static inline void btrfs_print_data_csum_error(struct btrfs_inode *inode,
u64 logical_start, u32 csum, u32 csum_expected, int mirror_num)
 {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f7aebb8424b1..7ded6808a0f6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -5148,10 +5148,28 @@ static int btrfs_setsize(struct inode *inode, struct 
iattr *attr)
/* we don't support swapfiles, so vmtruncate shouldn't fail */
truncate_setsize(inode, newsize);
 
-   /* Disable nonlocked read DIO to avoid the end less truncate */
-   btrfs_inode_block_unlocked_dio(BTRFS_I(inode));
+   /*
+* This code is very subtle. It is essentially a lock of its
+* own type. BTRFS allows multiple DIO readers to race with
+* writers so long as they don't read beyond EOF of an inode.
+* However, if we have a pending truncate we'd like to signal
+* DIO readers they should fall back to DIO_LOCKING semantics.
+* This ensures that multiple aggressive DIO readers cannot
+* starve the truncating thread.
+*
+* This semantics is achieved by the use of the below flag. If
+* new readers come after the flag has been cleared then the
+* state is still consistent, since the RELEASE semantics of
+* clear_bit_unlock ensure the truncate inode size will be
+* visible and DIO readers will bail out.
+*
+* The implied memory barrier by inode_dio_wait is paired with
+* smp_mb__before_atomic in btrfs_direct_IO.
+*/
+   set_bit(BTRFS_INODE_READDIO_NEED_LOCK, >runtime_flags);
inode_dio_wait(inode);
-   btrfs_inode_resume_unlocked_dio(BTRFS_I(inode));
+   clear_bit_unlock(BTRFS_INODE_READDIO_NEED_LOCK,
+>runtime_flags);
 
ret = btrfs_truncate(inode, newsize == oldsize);
if (ret && inode->i_nlink) {
@@ -8716,11 +8734,19 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, 
struct iov_iter *iter)
dio_data.unsubmitted_oe_range_end = (u64)offset;
current->journal_info = &dio_data;
down_read(&BTRFS_I(inode)->dio_sem);
-   } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
+   } else {
+   /*
+* This barrier is paired with the implied barrier in
+* inode_dio_wait. It ensures that READDIO_NEED_LOCK is
+* visible if we have a pending truncate.
+*/
+   smp_mb__before_atomic();
+   if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
 &BTRFS_I(inode)->runtime_flags)) {
-   inode_dio_end(inode);
-   flags = DIO_LOCKING | DIO_SKIP_HOLES;
-   wakeup = false;
+   inode_dio_end(inode);
+   flags = DIO_LOCKING | DIO_SKIP_HOLES;
+   wakeup = false;
+   }
}
 
ret = __blockdev_direct_IO(iocb, inode,
-- 
2.7.4



Re: [PATCH 1/5] btrfs: Parse options after node/sector size initialized

2018-03-07 Thread David Sterba
On Fri, Mar 02, 2018 at 01:22:50PM +0800, Qu Wenruo wrote:
> This provides the basis for later max_inline enhancement, which needs to
> access fs_info->nodesize.

I've checked if this patch can be applied independently, but no, see the
comment below.

> Signed-off-by: Qu Wenruo 
> ---
>  fs/btrfs/disk-io.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index a8ecccfc36de..f7f985ed5af9 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -2644,12 +2644,6 @@ int open_ctree(struct super_block *sb,
>*/
>   fs_info->compress_type = BTRFS_COMPRESS_ZLIB;
>  
> - ret = btrfs_parse_options(fs_info, options, sb->s_flags);
> - if (ret) {
> - err = ret;
> - goto fail_alloc;
> - }
> -
>   features = btrfs_super_incompat_flags(disk_super) &

We cannot swap the order of btrfs_parse_options and
btrfs_super_incompat_flags as the incompat flags can be set by mount
options. Currently that applies to lzo and zstd, which are supported, but in
the future this can be anything and such a bug would be hard to catch. If the
nodesize is required, it would need to be obtained in another way.

>   ~BTRFS_FEATURE_INCOMPAT_SUPP;
>   if (features) {
> @@ -2692,6 +2686,13 @@ int open_ctree(struct super_block *sb,
>   fs_info->sectorsize = sectorsize;
>   fs_info->stripesize = stripesize;
>  
> + /* Only parse options after node/sector size initialized */
> + ret = btrfs_parse_options(fs_info, options, sb->s_flags);
> + if (ret) {
> + err = ret;
> + goto fail_alloc;
> + }
> +
>   /*
>* mixed block groups end up with duplicate but slightly offset
>* extent buffers for the same range.  It leads to corruptions
> -- 
> 2.16.2
> 


Re: [PATCH] Btrfs: scrub: batch rebuild for raid56

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 16:43, David Sterba wrote:
> On Tue, Mar 06, 2018 at 11:22:21AM -0700, Liu Bo wrote:
>> On Tue, Mar 06, 2018 at 11:47:47AM +0100, David Sterba wrote:
>>> On Fri, Mar 02, 2018 at 04:10:37PM -0700, Liu Bo wrote:
>>>> In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN(64K)
>>>> as unit, however, scrub_extent() sets blocksize as unit, so rebuild
>>>> process may be triggered on every block on a same stripe.
>>>>
>>>> A typical example would be that when we're replacing a disappeared disk,
>>>> all reads on the disks get -EIO, every block (size is 4K if blocksize is
>>>> 4K) would go thru these,
>>>>
>>>> scrub_handle_errored_block
>>>>   scrub_recheck_block # re-read pages one by one
>>>>   scrub_recheck_block # rebuild by calling raid56_parity_recover()
>>>> page by page
>>>>
>>>> Although with raid56 stripe cache most of reads during rebuild can be
>>>> avoided, the parity recover calculation(xor or raid6 algorithms) needs to
>>>> be done $(BTRFS_STRIPE_LEN / blocksize) times.
>>>>
>>>> This makes it less stupid by doing raid56 scrub/replace on stripe length.
>>>
>>> missing s-o-b
>>
>> I'm surprised that checkpatch.pl didn't complain.

I have written a Python script that can scrape the mailing list, run
checkpatch (and any other software deemed appropriate) on posted patches,
and reply with the results. However, I haven't activated it yet. If
people think there is merit in it, I could hook it up to the
mailing list :)

> 
> Never mind, I'm your true checkpatch.cz
> 
> (http://checkpatch.pl actually exists)


Re: [PATCH v2] btrfs: drop nonvarying variable, instead define it

2018-03-07 Thread David Sterba
On Wed, Mar 07, 2018 at 05:29:18PM +0800, Anand Jain wrote:
> btrfs_defrag_leaves() declares min_trans = 0; as variable, but
> doesn't vary it, so define it.
> 
> Signed-off-by: Anand Jain 

Reviewed-by: David Sterba 

> ---
> v1->v2: Use BTRFS_OLDEST_GENERATION at more places where needed.

I've updated to reflect the change as the new define is the interesting
part, removing min_trans in defrag_leaf is only collateral in this case.


Re: [PATCH] Btrfs: scrub: batch rebuild for raid56

2018-03-07 Thread David Sterba
On Tue, Mar 06, 2018 at 11:22:21AM -0700, Liu Bo wrote:
> On Tue, Mar 06, 2018 at 11:47:47AM +0100, David Sterba wrote:
> > On Fri, Mar 02, 2018 at 04:10:37PM -0700, Liu Bo wrote:
> > > In case of raid56, writes and rebuilds always take BTRFS_STRIPE_LEN(64K)
> > > as unit, however, scrub_extent() sets blocksize as unit, so rebuild
> > > process may be triggered on every block on a same stripe.
> > > 
> > > A typical example would be that when we're replacing a disappeared disk,
> > > all reads on the disks get -EIO, every block (size is 4K if blocksize is
> > > 4K) would go thru these,
> > > 
> > > scrub_handle_errored_block
> > >   scrub_recheck_block # re-read pages one by one
> > >   scrub_recheck_block # rebuild by calling raid56_parity_recover()
> > > page by page
> > > 
> > > Although with raid56 stripe cache most of reads during rebuild can be
> > > avoided, the parity recover calculation(xor or raid6 algorithms) needs to
> > > be done $(BTRFS_STRIPE_LEN / blocksize) times.
> > > 
> > > This makes it less stupid by doing raid56 scrub/replace on stripe length.
> > 
> > missing s-o-b
> 
> I'm surprised that checkpatch.pl didn't complain.

Never mind, I'm your true checkpatch.cz

(http://checkpatch.pl actually exists)
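
As a side note on the numbers in the quoted commit message: with a 64K stripe length and a 4K block size, the per-block approach repeats the parity recovery 64K / 4K = 16 times per stripe. A tiny sketch of that arithmetic, with hypothetical names:

```c
#include <stdint.h>

/* 64K, the stripe length quoted in the commit message. */
#define SKETCH_STRIPE_LEN (64u * 1024u)

/* Number of per-block recoveries needed to cover one stripe. */
uint32_t sketch_recoveries_per_stripe(uint32_t blocksize)
{
	return SKETCH_STRIPE_LEN / blocksize;
}
```

Batching the rebuild on stripe length reduces that count to one recovery per stripe.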


Re: [PATCH v8 55/63] btrfs: Convert page cache to XArray

2018-03-07 Thread David Sterba
On Tue, Mar 06, 2018 at 11:24:05AM -0800, Matthew Wilcox wrote:
> From: Matthew Wilcox 
> 
> Signed-off-by: Matthew Wilcox 

Acked-by: David Sterba 


Re: [PATCH v8 06/63] btrfs: Use filemap_range_has_page()

2018-03-07 Thread David Sterba
On Tue, Mar 06, 2018 at 11:23:16AM -0800, Matthew Wilcox wrote:
> From: Matthew Wilcox 
> 
> The current implementation of btrfs_page_exists_in_range() gives the
> wrong answer if the workingset code has stored a shadow entry in the
> page cache.  The filemap_range_has_page() function does not have this
> problem, and it's shared code, so use it instead.

I'm going to merge this patch. btrfs_page_exists_in_range was full of
bugs from the beginning so I'm more than happy to use the shared one.
Thanks.


Re: [PATCH] fstests: btrfs/004: increase the buffer size of logical-resolve to the maximum value 64K

2018-03-07 Thread Filipe Manana
On Wed, Mar 7, 2018 at 7:07 AM, Eryu Guan  wrote:
> On Tue, Mar 06, 2018 at 03:02:31PM +0800, Lu Fengqi wrote:
>> Because of commit e76e13ce8c0b ("fsstress: implement the
>> clonerange/deduperange ioctls"), dedupe makes the number of references to
>> the same extent item increase so much that the default 4K buffer of
>> logical-resolve is no longer sufficient.
>>
>> Signed-off-by: Lu Fengqi 
>
> This looks fine to me. But I'd like an explicit ack from btrfs
> developers.

Reviewed-by: Filipe Manana 

Looks good to me.

>
> Thanks,
> Eryu
>
>> ---
>>  tests/btrfs/004 | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/btrfs/004 b/tests/btrfs/004
>> index de583cc355d4..0d2efb91dba7 100755
>> --- a/tests/btrfs/004
>> +++ b/tests/btrfs/004
>> @@ -103,7 +103,7 @@ _btrfs_inspect_addr()
>>   expect_addr=$3
>>   expect_inum=$4
>>   file=$5
>> - cmd="$BTRFS_UTIL_PROG inspect-internal logical-resolve -P $addr $mp"
>> + cmd="$BTRFS_UTIL_PROG inspect-internal logical-resolve -s 65536 -P $addr $mp"
>>   echo "# $cmd" >> $seqres.full
>>   out=`$cmd`
>>   echo "$out" >> $seqres.full
>> --
>> 2.16.2
>>
>>
>>



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”


Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 20:17, Nikolay Borisov wrote:
> 
> 
> On  7.03.2018 14:14, Qu Wenruo wrote:
>>
>>
> 
> 
> 

>>>> SHARED flag is determined after extent map merge, so here we can't rely
>>>> on em here.
>>>
>>> Shouldn't extent maps correspond to 1:1 disk-state. I.e. they are just
>>> the memory cache of the extent state. So if we merge them, shouldn't we
>>> also merge the on-disk extents as well ?
>>
>> Not 1:1.
>>
>> In memory one is merged maybe to save memory.
>>
>> But on-disk file extents has size limit.
>> For compressed one it's 128K and 128M for uncompressed one.
> 
> Fair enough, however 4 extents, 16k each should warrant merging on-disk
> as well, no ?

dd oflag=dsync
That ensures we won't merge on-disk extents.

Thanks,
Qu
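
To make the size limits mentioned in this exchange concrete, here is a minimal
sketch that computes the smallest number of on-disk file extents a range of a
given length can occupy. The limits (128K compressed, 128M uncompressed) come
from the discussion; the macro and function names are hypothetical. It gives
only a lower bound: synchronous writes such as dd oflag=dsync can still leave
more, smaller extents on disk.

```c
#include <stdint.h>

/* Limits as quoted in the thread; the macro names are hypothetical. */
#define MAX_COMPRESSED_EXTENT   (128ULL * 1024)
#define MAX_UNCOMPRESSED_EXTENT (128ULL * 1024 * 1024)

/*
 * Lower bound on the number of on-disk file extents needed to hold
 * `len` bytes, rounding up to whole extents.
 */
uint64_t min_file_extents(uint64_t len, int compressed)
{
	uint64_t max = compressed ? MAX_COMPRESSED_EXTENT
				  : MAX_UNCOMPRESSED_EXTENT;

	return (len + max - 1) / max;
}
```

For the 4 x 16K case raised in the question, the lower bound is a single
extent, which is why in-memory merging and the on-disk layout can differ.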






Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 14:14, Qu Wenruo wrote:
> 
> 



>>>
>>> SHARED flag is determined after extent map merge, so here we can't rely
>>> on em here.
>>
>> Shouldn't extent maps correspond to 1:1 disk-state. I.e. they are just
>> the memory cache of the extent state. So if we merge them, shouldn't we
>> also merge the on-disk extents as well ?
> 
> Not 1:1.
> 
> In memory one is merged maybe to save memory.
> 
> But on-disk file extents has size limit.
> For compressed one it's 128K and 128M for uncompressed one.

Fair enough, however 4 extents, 16k each should warrant merging on-disk
as well, no ?




Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 19:27, Nikolay Borisov wrote:
> 
> 
> On  7.03.2018 13:18, Qu Wenruo wrote:
>>
>>
>> On 2018-03-07 19:01, robbieko wrote:
>>> Qu Wenruo wrote on 2018-03-07 18:42:
>>>> On 2018-03-07 18:33, Qu Wenruo wrote:
>
>
> On 2018-03-07 16:20, robbieko wrote:
>> From: Robbie Ko 
>>
>> [BUG]
>> Range clone can cause fiemap to return error result.
>> Like:
>>  # mount /dev/vdb5 /mnt/btrfs
>>  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64   0x1
>>
>>  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64   0x1
>>  If we clone second file extent, we will get error FLAGS,
>>  SHARED bit is not set.
>>
>> [REASON]
>> Btrfs only checks if the first extent in extent map is shared,
>> but extent may merge.
>>
>> [FIX]
>> Here we will check each extent with extent map range,
>> if one of them is shared, extent map is shared.
>>
>> [PATCH RESULT]
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64 0x2001
>>
>> Signed-off-by: Robbie Ko 
>> ---
>>  fs/btrfs/extent_io.c | 146
>> +--
>>  1 file changed, 131 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 066b6df..5c6dca9 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct
>> fiemap_extent_info *fieinfo,
>>   */
>>  if (cache->offset + cache->len  == offset &&
>>  cache->phys + cache->len == phys  &&
>> -    (cache->flags & ~FIEMAP_EXTENT_LAST) ==
>> -    (flags & ~FIEMAP_EXTENT_LAST)) {
>> +    (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
>> +    (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
>>  cache->len += len;
>>  cache->flags |= flags;
>>  goto try_submit_last;
>> @@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct
>> btrfs_fs_info *fs_info,
>>  return ret;
>>  }
>>
>> +/*
>> + * Helper to check the file range is shared.
>> + *
>> + * Fiemap extent will be combined with many extents, so we need to examine
>> + * each extent, and if shared, the results are shared.
>> + *
>> + * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
>> + */
>> +static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
>> +{
>> +    struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>> +    struct btrfs_root *root = BTRFS_I(inode)->root;
>> +    int ret = 0;
>> +    struct extent_buffer *leaf;
>> +    struct btrfs_path *path;
>> +    struct btrfs_file_extent_item *fi;
>> +    struct btrfs_key found_key;
>> +    int check_prev = 1;
>> +    int extent_type;
>> +    int shared = 0;
>> +    u64 cur_offset;
>> +    u64 extent_end;
>> +    u64 ino = btrfs_ino(BTRFS_I(inode));
>> +    u64 disk_bytenr;
>> +
>> +    path = btrfs_alloc_path();
>> +    if (!path) {
>> +    return -ENOMEM;
>> +    }
>> +
>> +    cur_offset = start;
>> +    while (1) {
>> +    ret = btrfs_lookup_file_extent(NULL, root, path, ino,
>> +   cur_offset, 0);
>> +    if (ret < 0)
>> +    goto error;
>> +    if (ret > 0 && path->slots[0] > 0 && check_prev) {
>> +    leaf = path->nodes[0];
>> +    btrfs_item_key_to_cpu(leaf, &found_key,
>> +  path->slots[0] - 1);
>> +    if (found_key.objectid == ino &&
>> +    found_key.type == BTRFS_EXTENT_DATA_KEY)
>> +    path->slots[0]--;
>> +    }
>> +    check_prev = 0;
>> +next_slot:
>> +    leaf = path->nodes[0];
>> +    if (path->slots[0] >= btrfs_header_nritems(leaf)) {
>> +    ret = btrfs_next_leaf(root, path);
>> +    if (ret < 0)
>> +    goto error;
>> +    if (ret > 0)
>> +    break;
>> +    leaf = path->nodes[0];
>> +    }
>> +
>> +    disk_bytenr = 0;
>> +    btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> +
>> +    if (found_key.objectid > ino)

Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 13:18, Qu Wenruo wrote:
> 
> 
> On 2018-03-07 19:01, robbieko wrote:
>> Qu Wenruo wrote on 2018-03-07 18:42:
>>> On 2018-03-07 18:33, Qu Wenruo wrote:
>>>>
>>>>
>>>> On 2018-03-07 16:20, robbieko wrote:
> From: Robbie Ko 
>
> [BUG]
> Range clone can cause fiemap to return error result.
> Like:
>  # mount /dev/vdb5 /mnt/btrfs
>  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64   0x1
>
>  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64   0x1
>  If we clone second file extent, we will get error FLAGS,
>  SHARED bit is not set.
>
> [REASON]
> Btrfs only checks if the first extent in extent map is shared,
> but extent may merge.
>
> [FIX]
> Here we will check each extent with extent map range,
> if one of them is shared, extent map is shared.
>
> [PATCH RESULT]
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64 0x2001
>
> Signed-off-by: Robbie Ko 
> ---
>  fs/btrfs/extent_io.c | 146
> +--
>  1 file changed, 131 insertions(+), 15 deletions(-)
>
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 066b6df..5c6dca9 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct
> fiemap_extent_info *fieinfo,
>   */
>  if (cache->offset + cache->len  == offset &&
>  cache->phys + cache->len == phys  &&
> -    (cache->flags & ~FIEMAP_EXTENT_LAST) ==
> -    (flags & ~FIEMAP_EXTENT_LAST)) {
> +    (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
> +    (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
>  cache->len += len;
>  cache->flags |= flags;
>  goto try_submit_last;
> @@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct
> btrfs_fs_info *fs_info,
>  return ret;
>  }
>
> +/*
> + * Helper to check the file range is shared.
> + *
> + * Fiemap extent will be combined with many extents, so we need to examine
> + * each extent, and if shared, the results are shared.
> + *
> + * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
> + */
> +static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
> +{
> +    struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> +    struct btrfs_root *root = BTRFS_I(inode)->root;
> +    int ret = 0;
> +    struct extent_buffer *leaf;
> +    struct btrfs_path *path;
> +    struct btrfs_file_extent_item *fi;
> +    struct btrfs_key found_key;
> +    int check_prev = 1;
> +    int extent_type;
> +    int shared = 0;
> +    u64 cur_offset;
> +    u64 extent_end;
> +    u64 ino = btrfs_ino(BTRFS_I(inode));
> +    u64 disk_bytenr;
> +
> +    path = btrfs_alloc_path();
> +    if (!path) {
> +    return -ENOMEM;
> +    }
> +
> +    cur_offset = start;
> +    while (1) {
> +    ret = btrfs_lookup_file_extent(NULL, root, path, ino,
> +   cur_offset, 0);
> +    if (ret < 0)
> +    goto error;
> +    if (ret > 0 && path->slots[0] > 0 && check_prev) {
> +    leaf = path->nodes[0];
> +    btrfs_item_key_to_cpu(leaf, &found_key,
> +  path->slots[0] - 1);
> +    if (found_key.objectid == ino &&
> +    found_key.type == BTRFS_EXTENT_DATA_KEY)
> +    path->slots[0]--;
> +    }
> +    check_prev = 0;
> +next_slot:
> +    leaf = path->nodes[0];
> +    if (path->slots[0] >= btrfs_header_nritems(leaf)) {
> +    ret = btrfs_next_leaf(root, path);
> +    if (ret < 0)
> +    goto error;
> +    if (ret > 0)
> +    break;
> +    leaf = path->nodes[0];
> +    }
> +
> +    disk_bytenr = 0;
> +    btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> +
> +    if (found_key.objectid > ino)
> +    break;
> +    if (WARN_ON_ONCE(found_key.objectid < ino) ||
> +    found_key.type < BTRFS_EXTENT_DATA_KEY) {
> 

Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 19:01, robbieko wrote:
> Qu Wenruo wrote on 2018-03-07 18:42:
>> On 2018-03-07 18:33, Qu Wenruo wrote:
>>>
>>>
>>> On 2018-03-07 16:20, robbieko wrote:
 From: Robbie Ko 

 [BUG]
 Range clone can cause fiemap to return error result.
 Like:
  # mount /dev/vdb5 /mnt/btrfs
  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
  # xfs_io -c "fiemap -v" /mnt/btrfs/file
  /mnt/btrfs/file:
  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
    0: [0..63]: 4241424..4241487    64   0x1

  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
  # xfs_io -c "fiemap -v" /mnt/btrfs/file
  /mnt/btrfs/file:
  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
    0: [0..63]: 4241424..4241487    64   0x1
  If we clone second file extent, we will get error FLAGS,
  SHARED bit is not set.

 [REASON]
 Btrfs only checks if the first extent in extent map is shared,
 but extent may merge.

 [FIX]
 Here we will check each extent with extent map range,
 if one of them is shared, extent map is shared.

 [PATCH RESULT]
  # xfs_io -c "fiemap -v" /mnt/btrfs/file
  /mnt/btrfs/file:
  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
    0: [0..63]: 4241424..4241487    64 0x2001

 Signed-off-by: Robbie Ko 
 ---
  fs/btrfs/extent_io.c | 146
 +--
  1 file changed, 131 insertions(+), 15 deletions(-)

 diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
 index 066b6df..5c6dca9 100644
 --- a/fs/btrfs/extent_io.c
 +++ b/fs/btrfs/extent_io.c
 @@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct
 fiemap_extent_info *fieinfo,
   */
  if (cache->offset + cache->len  == offset &&
  cache->phys + cache->len == phys  &&
 -    (cache->flags & ~FIEMAP_EXTENT_LAST) ==
 -    (flags & ~FIEMAP_EXTENT_LAST)) {
 +    (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
 +    (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
  cache->len += len;
  cache->flags |= flags;
  goto try_submit_last;
 @@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct
 btrfs_fs_info *fs_info,
  return ret;
  }

 +/*
 + * Helper to check the file range is shared.
 + *
 + * Fiemap extent will be combined with many extents, so we need to examine
 + * each extent, and if shared, the results are shared.
 + *
 + * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
 + */
 +static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
 +{
 +    struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 +    struct btrfs_root *root = BTRFS_I(inode)->root;
 +    int ret = 0;
 +    struct extent_buffer *leaf;
 +    struct btrfs_path *path;
 +    struct btrfs_file_extent_item *fi;
 +    struct btrfs_key found_key;
 +    int check_prev = 1;
 +    int extent_type;
 +    int shared = 0;
 +    u64 cur_offset;
 +    u64 extent_end;
 +    u64 ino = btrfs_ino(BTRFS_I(inode));
 +    u64 disk_bytenr;
 +
 +    path = btrfs_alloc_path();
 +    if (!path) {
 +    return -ENOMEM;
 +    }
 +
 +    cur_offset = start;
 +    while (1) {
 +    ret = btrfs_lookup_file_extent(NULL, root, path, ino,
 +   cur_offset, 0);
 +    if (ret < 0)
 +    goto error;
 +    if (ret > 0 && path->slots[0] > 0 && check_prev) {
 +    leaf = path->nodes[0];
 +    btrfs_item_key_to_cpu(leaf, &found_key,
 +  path->slots[0] - 1);
 +    if (found_key.objectid == ino &&
 +    found_key.type == BTRFS_EXTENT_DATA_KEY)
 +    path->slots[0]--;
 +    }
 +    check_prev = 0;
 +next_slot:
 +    leaf = path->nodes[0];
 +    if (path->slots[0] >= btrfs_header_nritems(leaf)) {
 +    ret = btrfs_next_leaf(root, path);
 +    if (ret < 0)
 +    goto error;
 +    if (ret > 0)
 +    break;
 +    leaf = path->nodes[0];
 +    }
 +
 +    disk_bytenr = 0;
 +    btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
 +
 +    if (found_key.objectid > ino)
 +    break;
 +    if (WARN_ON_ONCE(found_key.objectid < ino) ||
 +    found_key.type < BTRFS_EXTENT_DATA_KEY) {
 +    path->slots[0]++;
 +    goto next_slot;
 +    }
 +    if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
 +    found_key.offset > 

Re: [PATCH 1/2] Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 12:27, robbieko wrote:
> Nikolay Borisov wrote on 2018-03-07 18:19:
>> On  7.03.2018 10:20, robbieko wrote:
>>> From: Robbie Ko 
>>>
>>>  # mount /dev/vdb5 /mnt/btrfs
>>>  # dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
>>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>>  /mnt/btrfs/file:
>>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>>    0: [0..127]:    25088..25215   128   0x1
>>>
>>> Run fiemap with fm_extent_count set to 0, we'll get wrong value 4
>>> instead of 1.
>>
>> Wrong value 4 instead of 1 for which exact column, the flags? State this
>> explicitly.
>>
>> Also this seems a bit bogus since fiemap's documentation states:
>>
>> If fm_extent_count is zero, then the fm_extents[] array is ignored (no
>> extents will be returned), and the fm_mapped_extents count will hold the
>> number of extents needed in fm_extents[] to hold the file's current
>> mapping.
>>
>> So when fm_extent_count is zero we shouldn't really be returning anything from
>> kernel.
>>
> 
> Sorry I did not explain clearly.
> The value is fm_mapped_extents.

But fm_mapped_extents is tagged as an OUT member, meaning the user has
no job writing to it.
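
As an illustration of the fm_extent_count == 0 case discussed here, a
userspace caller might count extents roughly as follows. This is a sketch
assuming a Linux system with <linux/fiemap.h>; count_mapped_extents() is a
hypothetical helper name.

```c
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

/*
 * Hypothetical helper: ask the kernel how many extents are needed to map
 * the whole file behind `fd`, without fetching any extent records.
 * Returns the count on success, or -1 if the ioctl fails.
 */
long count_mapped_extents(int fd)
{
	struct fiemap fm;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;
	fm.fm_length = FIEMAP_MAX_OFFSET;	/* map the entire file */
	fm.fm_flags = FIEMAP_FLAG_SYNC;		/* flush dirty data first */
	fm.fm_extent_count = 0;			/* count only, fm_extents[] ignored */

	if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
		return -1;

	/* fm_mapped_extents is an OUT member, filled in by the kernel. */
	return fm.fm_mapped_extents;
}
```

The caller only reads fm_mapped_extents after the ioctl returns; it never
writes to it.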

> If fm_extent_count  is zero, the fm_mapped_extents count will hold the
> number of extents needed.
> 
> 
>>
>>>
>>> [REASON]
>>> When fm_extent_count is 0, disko is not initialized correctly,
>>> The value is 0 in this case, not the right bytenr.
>>
>> This is too sparse, be more explicit i.e. that disko=0 is passed to
>> emit_fiemap_extent which then leads to issues.
>>
>>>
>>> [FIX]
>>> Use correct disko.
>>>
>>> Signed-off-by: Robbie Ko 
>>> ---
>>>  fs/btrfs/extent_io.c | 4 +---
>>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>>> index 012d638..066b6df 100644
>>> --- a/fs/btrfs/extent_io.c
>>> +++ b/fs/btrfs/extent_io.c
>>> @@ -4567,7 +4567,7 @@ int extent_fiemap(struct inode *inode, struct
>>> fiemap_extent_info *fieinfo,
>>>  offset_in_extent = em_start - em->start;
>>>  em_end = extent_map_end(em);
>>>  em_len = em_end - em_start;
>>> -    disko = 0;
>>> +    disko = em->block_start + offset_in_extent;
>>>  flags = 0;
>>>
>>>  /*
>>> @@ -4590,8 +4590,6 @@ int extent_fiemap(struct inode *inode, struct
>>> fiemap_extent_info *fieinfo,
>>>  u64 bytenr = em->block_start -
>>>  (em->start - em->orig_start);
>>>
>>> -    disko = em->block_start + offset_in_extent;
>>> -
>>>  /*
>>>   * As btrfs supports shared space, this information
>>>   * can be exported to userspace tools via
>>> -- 
>>> 1.9.1
>>>


Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread robbieko

Qu Wenruo wrote on 2018-03-07 18:42:

On 2018-03-07 18:33, Qu Wenruo wrote:

On 2018-03-07 16:20, robbieko wrote:

From: Robbie Ko 

[BUG]
Range clone can cause fiemap to return error result.
Like:
 # mount /dev/vdb5 /mnt/btrfs
 # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487    64   0x1

 # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487    64   0x1
 If we clone second file extent, we will get error FLAGS,
 SHARED bit is not set.

[REASON]
Btrfs only checks if the first extent in extent map is shared,
but extent may merge.

[FIX]
Here we will check each extent with extent map range,
if one of them is shared, extent map is shared.

[PATCH RESULT]
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487    64 0x2001

Signed-off-by: Robbie Ko 
---
 fs/btrfs/extent_io.c | 146 
+--

 1 file changed, 131 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 066b6df..5c6dca9 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct 
fiemap_extent_info *fieinfo,

 */
if (cache->offset + cache->len  == offset &&
cache->phys + cache->len == phys  &&
-   (cache->flags & ~FIEMAP_EXTENT_LAST) ==
-   (flags & ~FIEMAP_EXTENT_LAST)) {
+   (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
+   (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
cache->len += len;
cache->flags |= flags;
goto try_submit_last;
@@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct 
btrfs_fs_info *fs_info,

return ret;
 }

+/*
+ * Helper to check the file range is shared.
+ *
+ * Fiemap extent will be combined with many extents, so we need to examine
+ * each extent, and if shared, the results are shared.
+ *
+ * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
+ */
+static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+   struct btrfs_root *root = BTRFS_I(inode)->root;
+   int ret = 0;
+   struct extent_buffer *leaf;
+   struct btrfs_path *path;
+   struct btrfs_file_extent_item *fi;
+   struct btrfs_key found_key;
+   int check_prev = 1;
+   int extent_type;
+   int shared = 0;
+   u64 cur_offset;
+   u64 extent_end;
+   u64 ino = btrfs_ino(BTRFS_I(inode));
+   u64 disk_bytenr;
+
+   path = btrfs_alloc_path();
+   if (!path) {
+   return -ENOMEM;
+   }
+
+   cur_offset = start;
+   while (1) {
+   ret = btrfs_lookup_file_extent(NULL, root, path, ino,
+  cur_offset, 0);
+   if (ret < 0)
+   goto error;
+   if (ret > 0 && path->slots[0] > 0 && check_prev) {
+   leaf = path->nodes[0];
+   btrfs_item_key_to_cpu(leaf, &found_key,
+ path->slots[0] - 1);
+   if (found_key.objectid == ino &&
+   found_key.type == BTRFS_EXTENT_DATA_KEY)
+   path->slots[0]--;
+   }
+   check_prev = 0;
+next_slot:
+   leaf = path->nodes[0];
+   if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+   ret = btrfs_next_leaf(root, path);
+   if (ret < 0)
+   goto error;
+   if (ret > 0)
+   break;
+   leaf = path->nodes[0];
+   }
+
+   disk_bytenr = 0;
+   btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+
+   if (found_key.objectid > ino)
+   break;
+   if (WARN_ON_ONCE(found_key.objectid < ino) ||
+   found_key.type < BTRFS_EXTENT_DATA_KEY) {
+   path->slots[0]++;
+   goto next_slot;
+   }
+   if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
+   found_key.offset > end)
+   break;
+
+   fi = btrfs_item_ptr(leaf, path->slots[0],
+   struct btrfs_file_extent_item);
+   extent_type = btrfs_file_extent_type(leaf, fi);
+
+  

Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 18:33, Qu Wenruo wrote:
> 
> 
> On 2018-03-07 16:20, robbieko wrote:
>> From: Robbie Ko 
>>
>> [BUG]
>> Range clone can cause fiemap to return error result.
>> Like:
>>  # mount /dev/vdb5 /mnt/btrfs
>>  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64   0x1
>>
>>  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64   0x1
>>  If we clone second file extent, we will get error FLAGS,
>>  SHARED bit is not set.
>>
>> [REASON]
>> Btrfs only checks if the first extent in extent map is shared,
>> but extent may merge.
>>
>> [FIX]
>> Here we will check each extent with extent map range,
>> if one of them is shared, extent map is shared.
>>
>> [PATCH RESULT]
>>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>>  /mnt/btrfs/file:
>>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>>    0: [0..63]: 4241424..4241487    64 0x2001
>>
>> Signed-off-by: Robbie Ko 
>> ---
>>  fs/btrfs/extent_io.c | 146 
>> +--
>>  1 file changed, 131 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
>> index 066b6df..5c6dca9 100644
>> --- a/fs/btrfs/extent_io.c
>> +++ b/fs/btrfs/extent_io.c
>> @@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct 
>> fiemap_extent_info *fieinfo,
>>   */
>>  if (cache->offset + cache->len  == offset &&
>>  cache->phys + cache->len == phys  &&
>> -(cache->flags & ~FIEMAP_EXTENT_LAST) ==
>> -(flags & ~FIEMAP_EXTENT_LAST)) {
>> +(cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
>> +(flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
>>  cache->len += len;
>>  cache->flags |= flags;
>>  goto try_submit_last;
>> @@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct 
>> btrfs_fs_info *fs_info,
>>  return ret;
>>  }
>>
>> +/*
>> + * Helper to check the file range is shared.
>> + *
>> + * Fiemap extent will be combined with many extents, so we need to examine
>> + * each extent, and if shared, the results are shared.
>> + *
>> + * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
>> + */
>> +static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
>> +{
>> +struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>> +struct btrfs_root *root = BTRFS_I(inode)->root;
>> +int ret = 0;
>> +struct extent_buffer *leaf;
>> +struct btrfs_path *path;
>> +struct btrfs_file_extent_item *fi;
>> +struct btrfs_key found_key;
>> +int check_prev = 1;
>> +int extent_type;
>> +int shared = 0;
>> +u64 cur_offset;
>> +u64 extent_end;
>> +u64 ino = btrfs_ino(BTRFS_I(inode));
>> +u64 disk_bytenr;
>> +
>> +path = btrfs_alloc_path();
>> +if (!path) {
>> +return -ENOMEM;
>> +}
>> +
>> +cur_offset = start;
>> +while (1) {
>> +ret = btrfs_lookup_file_extent(NULL, root, path, ino,
>> +   cur_offset, 0);
>> +if (ret < 0)
>> +goto error;
>> +if (ret > 0 && path->slots[0] > 0 && check_prev) {
>> +leaf = path->nodes[0];
>> +btrfs_item_key_to_cpu(leaf, &found_key,
>> +  path->slots[0] - 1);
>> +if (found_key.objectid == ino &&
>> +found_key.type == BTRFS_EXTENT_DATA_KEY)
>> +path->slots[0]--;
>> +}
>> +check_prev = 0;
>> +next_slot:
>> +leaf = path->nodes[0];
>> +if (path->slots[0] >= btrfs_header_nritems(leaf)) {
>> +ret = btrfs_next_leaf(root, path);
>> +if (ret < 0)
>> +goto error;
>> +if (ret > 0)
>> +break;
>> +leaf = path->nodes[0];
>> +}
>> +
>> +disk_bytenr = 0;
>> +btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
>> +
>> +if (found_key.objectid > ino)
>> +break;
>> +if (WARN_ON_ONCE(found_key.objectid < ino) ||
>> +found_key.type < BTRFS_EXTENT_DATA_KEY) {
>> +path->slots[0]++;
>> +goto next_slot;
>> +}
>> +if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
>> +found_key.offset > end)
>> +break;
>> +
>> +fi = btrfs_item_ptr(leaf, 

Re: [PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 16:20, robbieko wrote:
> From: Robbie Ko 
> 
> [BUG]
> Range clone can cause fiemap to return error result.
> Like:
>  # mount /dev/vdb5 /mnt/btrfs
>  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64   0x1
> 
>  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64   0x1
>  If we clone second file extent, we will get error FLAGS,
>  SHARED bit is not set.
> 
> [REASON]
> Btrfs only checks if the first extent in extent map is shared,
> but extent may merge.
> 
> [FIX]
> Here we will check each extent with extent map range,
> if one of them is shared, extent map is shared.
> 
> [PATCH RESULT]
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487    64 0x2001
> 
> Signed-off-by: Robbie Ko 
> ---
>  fs/btrfs/extent_io.c | 146 
> +--
>  1 file changed, 131 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 066b6df..5c6dca9 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct fiemap_extent_info 
> *fieinfo,
>*/
>   if (cache->offset + cache->len  == offset &&
>   cache->phys + cache->len == phys  &&
> - (cache->flags & ~FIEMAP_EXTENT_LAST) ==
> - (flags & ~FIEMAP_EXTENT_LAST)) {
> + (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
> + (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
>   cache->len += len;
>   cache->flags |= flags;
>   goto try_submit_last;
> @@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct 
> btrfs_fs_info *fs_info,
>   return ret;
>  }
> 
> +/*
> + * Helper to check the file range is shared.
> + *
> + * Fiemap extent will be combined with many extents, so we need to examine
> + * each extent, and if shared, the results are shared.
> + *
> + * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
> + */
> +static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
> +{
> + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> + struct btrfs_root *root = BTRFS_I(inode)->root;
> + int ret = 0;
> + struct extent_buffer *leaf;
> + struct btrfs_path *path;
> + struct btrfs_file_extent_item *fi;
> + struct btrfs_key found_key;
> + int check_prev = 1;
> + int extent_type;
> + int shared = 0;
> + u64 cur_offset;
> + u64 extent_end;
> + u64 ino = btrfs_ino(BTRFS_I(inode));
> + u64 disk_bytenr;
> +
> + path = btrfs_alloc_path();
> + if (!path) {
> + return -ENOMEM;
> + }
> +
> + cur_offset = start;
> + while (1) {
> + ret = btrfs_lookup_file_extent(NULL, root, path, ino,
> +cur_offset, 0);
> + if (ret < 0)
> + goto error;
> + if (ret > 0 && path->slots[0] > 0 && check_prev) {
> + leaf = path->nodes[0];
> + btrfs_item_key_to_cpu(leaf, &found_key,
> +   path->slots[0] - 1);
> + if (found_key.objectid == ino &&
> + found_key.type == BTRFS_EXTENT_DATA_KEY)
> + path->slots[0]--;
> + }
> + check_prev = 0;
> +next_slot:
> + leaf = path->nodes[0];
> + if (path->slots[0] >= btrfs_header_nritems(leaf)) {
> + ret = btrfs_next_leaf(root, path);
> + if (ret < 0)
> + goto error;
> + if (ret > 0)
> + break;
> + leaf = path->nodes[0];
> + }
> +
> + disk_bytenr = 0;
> + btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
> +
> + if (found_key.objectid > ino)
> + break;
> + if (WARN_ON_ONCE(found_key.objectid < ino) ||
> + found_key.type < BTRFS_EXTENT_DATA_KEY) {
> + path->slots[0]++;
> + goto next_slot;
> + }
> + if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
> + found_key.offset > end)
> + break;
> +
> + fi = btrfs_item_ptr(leaf, path->slots[0],
> + struct btrfs_file_extent_item);
> + 

Re: [PATCH 1/2] Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero

2018-03-07 Thread robbieko

Nikolay Borisov wrote on 2018-03-07 18:19:
> On  7.03.2018 10:20, robbieko wrote:
> > From: Robbie Ko 
> > 
> >  # mount /dev/vdb5 /mnt/btrfs
> >  # dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
> >  # xfs_io -c "fiemap -v" /mnt/btrfs/file
> >  /mnt/btrfs/file:
> >  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
> >    0: [0..127]:25088..25215   128   0x1
> > 
> > Run fiemap with fm_extent_count set to 0, we'll get wrong value 4
> > instead of 1.
> 
> Wrong value 4 instead of 1 for which exact column, the flags? State this
> explicitly.
> 
> Also this seems a bit bogus since fiemap's documentation states:
> 
> If fm_extent_count is zero, then the fm_extents[] array is ignored (no
> extents will be returned), and the fm_mapped_extents count will hold the
> number of extents needed in fm_extents[] to hold the file's current mapping.
> 
> So when fm_extent_count is zero we shouldn't really be returning anything
> from the kernel.

Sorry, I did not explain this clearly.
The value in question is fm_mapped_extents.
If fm_extent_count is zero, the fm_mapped_extents count will hold the
number of extents needed.
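
For reference, the counting mode discussed here can be exercised from
userspace. The following is a minimal sketch, not part of the patch: it
issues FS_IOC_FIEMAP with fm_extent_count set to 0 and reads back
fm_mapped_extents. The helper name count_mapped_extents is made up for
illustration; struct fiemap, FS_IOC_FIEMAP, FIEMAP_FLAG_SYNC and
FIEMAP_MAX_OFFSET are taken from the Linux UAPI headers.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>		/* FS_IOC_FIEMAP */
#include <linux/fiemap.h>	/* struct fiemap, FIEMAP_* constants */

/*
 * Ask the kernel how many extents are needed to map the whole file.
 * Returns fm_mapped_extents on success, or -1 on any error.
 * (Hypothetical helper, for illustration only.)
 */
static long count_mapped_extents(const char *path)
{
	struct fiemap fm;
	long count = -1;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;
	fm.fm_length = FIEMAP_MAX_OFFSET;	/* cover the entire file */
	fm.fm_flags = FIEMAP_FLAG_SYNC;		/* flush dirty data first */
	fm.fm_extent_count = 0;			/* count only, return no extents */

	if (ioctl(fd, FS_IOC_FIEMAP, &fm) == 0)
		count = (long)fm.fm_mapped_extents;

	close(fd);
	return count;
}
```

On the test file from the commit message, the report above implies that
this would return 4 without the patch and 1 with it.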

> > [REASON]
> > When fm_extent_count is 0, disko is not initialized correctly,
> > The value is 0 in this case, not the right bytenr.
> 
> This is too sparse, be more explicit i.e. that disko=0 is passed to
> emit_fiemap_extent which then leads to issues.
> 
> > [FIX]
> > Use correct disko.
> > 
> > Signed-off-by: Robbie Ko 
> > ---
> >  fs/btrfs/extent_io.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> > index 012d638..066b6df 100644
> > --- a/fs/btrfs/extent_io.c
> > +++ b/fs/btrfs/extent_io.c
> > @@ -4567,7 +4567,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
> >   offset_in_extent = em_start - em->start;
> >   em_end = extent_map_end(em);
> >   em_len = em_end - em_start;
> > - disko = 0;
> > + disko = em->block_start + offset_in_extent;
> >   flags = 0;
> > 
> >   /*
> > @@ -4590,8 +4590,6 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
> >   u64 bytenr = em->block_start -
> >   (em->start - em->orig_start);
> > 
> > - disko = em->block_start + offset_in_extent;
> > -
> >   /*
> >    * As btrfs supports shared space, this information
> >    * can be exported to userspace tools via
> > --
> > 1.9.1



Re: [PATCH 1/2] Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 10:20, robbieko wrote:
> From: Robbie Ko 
> 
>  # mount /dev/vdb5 /mnt/btrfs
>  # dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>0: [0..127]:25088..25215   128   0x1
> 
> Run fiemap with fm_extent_count set to 0, we'll get wrong value 4
> instead of 1.

Wrong value 4 instead of 1 for which exact column, the flags? State this
explicitly.

Also this seems a bit bogus since fiemap's documentation states:

If fm_extent_count is zero, then the fm_extents[] array is ignored (no
extents will be returned), and the fm_mapped_extents count will hold the
number of extents needed in fm_extents[] to hold the file's current mapping.

So when fm_extent_count is zero we shouldn't really be returning anything
from the kernel.


> 
> [REASON]
> When fm_extent_count is 0, disko is not initialized correctly,
> The value is 0 in this case, not the right bytenr.

This is too sparse, be more explicit i.e. that disko=0 is passed to
emit_fiemap_extent which then leads to issues.
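
To make that chain of events concrete, here is a small standalone model
(a sketch under assumptions, not the kernel code). It assumes the merge
rule quoted in patch 2/2, where an extent is folded into the cached one
only if it is contiguous both logically and physically, and it ignores
flags. The struct and function names are invented for illustration, and
the physical base address is derived from the fiemap output above
(block 25088 in 512-byte units).

```c
#include <stddef.h>
#include <stdint.h>

struct extent {
	uint64_t offset;	/* logical offset in the file, bytes */
	uint64_t phys;		/* physical address on disk, bytes */
	uint64_t len;		/* length, bytes */
};

/* Count the extents left after merging contiguous neighbours. */
static unsigned int count_after_merge(const struct extent *e, size_t n)
{
	struct extent cache = { 0, 0, 0 };
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (count > 0 &&
		    cache.offset + cache.len == e[i].offset &&
		    cache.phys + cache.len == e[i].phys) {
			cache.len += e[i].len;	/* contiguous: extend the cache */
			continue;
		}
		cache = e[i];			/* not mergeable: new extent */
		count++;
	}
	return count;
}

/*
 * Model the four 16K dsync writes from the commit message. With
 * zero_phys set, every extent is reported with phys == 0, which is
 * what an uninitialized disko would amount to.
 */
static unsigned int count_four_dsync_writes(int zero_phys)
{
	const uint64_t len = 16 * 1024;
	const uint64_t base = 25088ULL * 512;
	struct extent e[4];
	size_t i;

	for (i = 0; i < 4; i++) {
		e[i].offset = i * len;
		e[i].phys = zero_phys ? 0 : base + i * len;
		e[i].len = len;
	}
	return count_after_merge(e, 4);
}
```

With real physical addresses the four extents collapse into one. With
phys stuck at 0 the physical contiguity test never holds, so the model
counts 4, which would explain the "4 instead of 1" in the report.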

> 
> [FIX]
> Use correct disko.
> 
> Signed-off-by: Robbie Ko 
> ---
>  fs/btrfs/extent_io.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index 012d638..066b6df 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -4567,7 +4567,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>   offset_in_extent = em_start - em->start;
>   em_end = extent_map_end(em);
>   em_len = em_end - em_start;
> - disko = 0;
> + disko = em->block_start + offset_in_extent;
>   flags = 0;
> 
>   /*
> @@ -4590,8 +4590,6 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>   u64 bytenr = em->block_start -
>   (em->start - em->orig_start);
> 
> - disko = em->block_start + offset_in_extent;
> -
>   /*
>* As btrfs supports shared space, this information
>* can be exported to userspace tools via
> --
> 1.9.1
> 


Re: [PATCH 0/2] btrfs fiemap related BUG fix.

2018-03-07 Thread robbieko

Qu Wenruo wrote on 2018-03-07 17:27:
> On 2018-03-07 16:20, robbieko wrote:
> > From: Robbie Ko 
> > 
> > This patchset intends to fix btrfs fiemap related bug.
> > 
> > The fiemap has the following problems:
> > 
> > 1) Wrong extent count when fm_extent_count is zero.
> > 
> > 
> > 2) SHARED bit is not correct
> > I have two ideas, but I do not know which one is the best.
> > 
> > Like:
> >  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
> >  # xfs_io -c "fiemap -v" /mnt/btrfs/file
> >  /mnt/btrfs/file:
> >  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
> >    0: [0..63]: 4241424..4241487   64   0x1
> >  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
> > 
> > 1. When any extent is shared in extent map, the entire extent map is shared
> >  # xfs_io -c "fiemap -v" /mnt/btrfs/file
> >  /mnt/btrfs/file:
> >  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
> >    0: [0..63]: 4241424..4241487   64  0x2001
> 
> I think this is what btrfs is doing right now.
> Although I'm not sure if this is the best solution.
> 
> BTW, I just did the same operation, and just get the SHARED flag on both
> source and destination.
> 
> Is there something wrong?


Currently, only the first extent in the extent_map is checked for sharing.
For details, see "[PATCH 2/2] Btrfs: fix fiemap extent SHARED flag
error with range clone."
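
As an illustration of idea 1, the sketch below models the relaxed merge
condition from patch 2/2: FIEMAP_EXTENT_LAST and FIEMAP_EXTENT_SHARED
are masked out of the comparison, and the flags of both extents are
OR-ed together after a merge. This is a simplified standalone model, not
the kernel function. The struct layout and helper names are assumptions
made for the example; the two flag values mirror linux/fiemap.h.

```c
#include <stdint.h>

#define FIEMAP_EXTENT_LAST	0x00000001u	/* last extent in the file */
#define FIEMAP_EXTENT_SHARED	0x00002000u	/* space shared with other files */

struct fiemap_cache {
	uint64_t offset;	/* logical offset, bytes */
	uint64_t phys;		/* physical address, bytes */
	uint64_t len;		/* length, bytes */
	uint32_t flags;		/* FIEMAP_EXTENT_* bits */
};

/* Merge the new extent into the cache if possible; 1 if merged, else 0. */
static int try_merge(struct fiemap_cache *cache, uint64_t offset,
		     uint64_t phys, uint64_t len, uint32_t flags)
{
	const uint32_t ignore = FIEMAP_EXTENT_LAST | FIEMAP_EXTENT_SHARED;

	if (cache->offset + cache->len == offset &&
	    cache->phys + cache->len == phys &&
	    (cache->flags & ~ignore) == (flags & ~ignore)) {
		cache->len += len;
		cache->flags |= flags;	/* one shared part marks the whole */
		return 1;
	}
	return 0;
}

/* Two adjacent 16K extents; only the second one is shared and last. */
static uint32_t merged_flags_example(void)
{
	struct fiemap_cache cache = { 0, 4241424ULL * 512, 16 * 1024, 0 };

	try_merge(&cache, 16 * 1024, 4241424ULL * 512 + 16 * 1024,
		  16 * 1024, FIEMAP_EXTENT_LAST | FIEMAP_EXTENT_SHARED);
	return cache.flags;
}
```

In this model the merged extent ends up with flags 0x2001, matching the
single-extent output shown for idea 1.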

> > 2. Split into different extent
> >  # xfs_io -c "fiemap -v" /mnt/btrfs/file
> >  /mnt/btrfs/file:
> >  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
> >    0: [0..31]: 4241424..4241455   32  0x0
> >    1: [32..63]:4241456..4241487   32  0x2001
> 
> This is what XFS does.
> 
> Thanks,
> Qu
> 
> > Robbie Ko (2):
> >   Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero
> >   Btrfs: fix fiemap extent SHARED flag error with range clone.
> > 
> >  fs/btrfs/extent_io.c | 150 ---
> >  1 file changed, 132 insertions(+), 18 deletions(-)
> > 
> > --
> > 1.9.1



Re: [PATCH 0/2] btrfs fiemap related BUG fix.

2018-03-07 Thread Qu Wenruo


On 2018-03-07 16:20, robbieko wrote:
> From: Robbie Ko 
> 
> This patchset intends to fix btrfs fiemap related bug.
> 
> The fiemap has the following problems:
> 
> 1) Wrong extent count when fm_extent_count is zero.
> 
> 
> 2) SHARED bit is not correct
> I have two ideas, but I do not know which one is the best.
> 
> Like:
>  # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>    0: [0..63]: 4241424..4241487   64   0x1
>  # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
> 
> 1. When any extent is shared in extent map, the entire extent map is shared
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>0: [0..63]: 4241424..4241487   64  0x2001

I think this is what btrfs is doing right now.
Although I'm not sure if this is the best solution.

BTW, I just did the same operation, and just get the SHARED flag on both
source and destination.

Is there something wrong?

> 
> 2. Split into different extent
>  # xfs_io -c "fiemap -v" /mnt/btrfs/file
>  /mnt/btrfs/file:
>  EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
>0: [0..31]: 4241424..4241455   32  0x0
>1: [32..63]:4241456..4241487   32  0x2001

This is what XFS does.

Thanks,
Qu

> 
> Robbie Ko (2):
>   Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero
>   Btrfs: fix fiemap extent SHARED flag error with range clone.
> 
>  fs/btrfs/extent_io.c | 150 ---
>  1 file changed, 132 insertions(+), 18 deletions(-)
> 
> --
> 1.9.1
> 

[PATCH v2] btrfs: drop nonvaring variable, instead define it

2018-03-07 Thread Anand Jain
btrfs_defrag_leaves() declares min_trans = 0 as a variable but never
changes it, so replace it with a define.

Signed-off-by: Anand Jain 
---
v1->v2: Use BTRFS_OLDEST_GENERATION at more places where needed.

 fs/btrfs/ctree.h   | 2 ++
 fs/btrfs/ioctl.c   | 2 +-
 fs/btrfs/tree-defrag.c | 5 ++---
 fs/btrfs/uuid-tree.c   | 2 +-
 fs/btrfs/volumes.c | 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d688430f9be1..7c408062b6a5 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -66,6 +66,8 @@ struct btrfs_ordered_sum;
 
 #define BTRFS_MAX_LEVEL 8
 
+#define BTRFS_OLDEST_GENERATION	0ULL
+
 #define BTRFS_COMPAT_EXTENT_TREE_V0
 
 /*
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index c5a559105949..ba403c00982c 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2600,7 +2600,7 @@ static int btrfs_ioctl_defrag(struct file *file, void __user *argp)
range->len = (u64)-1;
}
ret = btrfs_defrag_file(file_inode(file), file,
-   range, 0, 0);
+   range, BTRFS_OLDEST_GENERATION, 0);
if (ret > 0)
ret = 0;
kfree(range);
diff --git a/fs/btrfs/tree-defrag.c b/fs/btrfs/tree-defrag.c
index cb65089127cc..c09dbe4bd6e7 100644
--- a/fs/btrfs/tree-defrag.c
+++ b/fs/btrfs/tree-defrag.c
@@ -39,7 +39,6 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
int level;
int next_key_ret = 0;
u64 last_ret = 0;
-   u64 min_trans = 0;
 
if (root->fs_info->extent_root == root) {
/*
@@ -81,7 +80,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 
path->keep_locks = 1;
 
-   ret = btrfs_search_forward(root, &key, path, min_trans);
+   ret = btrfs_search_forward(root, &key, path, BTRFS_OLDEST_GENERATION);
if (ret < 0)
goto out;
if (ret > 0) {
@@ -130,7 +129,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 */
path->slots[1] = btrfs_header_nritems(path->nodes[1]);
next_key_ret = btrfs_find_next_key(root, path, &key, 1,
-  min_trans);
+  BTRFS_OLDEST_GENERATION);
if (next_key_ret == 0) {
memcpy(&root->defrag_progress, &key, sizeof(key));
ret = -EAGAIN;
diff --git a/fs/btrfs/uuid-tree.c b/fs/btrfs/uuid-tree.c
index 726f928238d0..9916f03430bc 100644
--- a/fs/btrfs/uuid-tree.c
+++ b/fs/btrfs/uuid-tree.c
@@ -282,7 +282,7 @@ int btrfs_uuid_tree_iterate(struct btrfs_fs_info *fs_info,
key.offset = 0;
 
 again_search_slot:
-   ret = btrfs_search_forward(root, &key, path, 0);
+   ret = btrfs_search_forward(root, &key, path, BTRFS_OLDEST_GENERATION);
if (ret) {
if (ret > 0)
ret = 0;
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index b4ada8f50f3c..328b97801836 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -4206,7 +4206,7 @@ static int btrfs_uuid_scan_kthread(void *data)
key.offset = 0;
 
while (1) {
-   ret = btrfs_search_forward(root, &key, path, 0);
+   ret = btrfs_search_forward(root, &key, path, BTRFS_OLDEST_GENERATION);
if (ret) {
if (ret > 0)
ret = 0;
-- 
2.15.0



Re: [PATCH 5/8] btrfs-progs: qgroups: introduce and use info and limit structures

2018-03-07 Thread Nikolay Borisov


On  2.03.2018 20:47, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> We use structures to pass the info and limit from the kernel as items
> but store the individual values separately in btrfs_qgroup.  We already
> have a btrfs_qgroup_limit structure that's used for setting the limit.
> 
> This patch introduces a btrfs_qgroup_info structure and uses that and
> btrfs_qgroup_limit in btrfs_qgroup.
> 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: Nikolay Borisov 



[PATCH] fstests: btrfs/146: make sure hit all stripes in the case of compression

2018-03-07 Thread Lu Fengqi
With compression enabled, each 128K chunk of input data is compressed to
4K (because the characters written are all identical). Therefore we have
to write (128K * 16) per device to make sure every stripe is hit.

Signed-off-by: Lu Fengqi 
---
 tests/btrfs/146 | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tests/btrfs/146 b/tests/btrfs/146
index 7071c128..b5d17c49 100755
--- a/tests/btrfs/146
+++ b/tests/btrfs/146
@@ -74,9 +74,16 @@ _scratch_pool_mkfs "-d raid0 -m raid1" > $seqres.full 2>&1
 _scratch_mount
 
 # How much do we need to write? We need to hit all of the stripes. btrfs uses
-# a fixed 64k stripesize, so write enough to hit each one
+# a fixed 64k stripesize, so write enough to hit each one. In the case of
+# compression, each 128K input data chunk will be compressed to 4K (because of
+# the characters written are duplicate). Therefore we have to write (128K * 16)
+# to make sure every stripe can be hit.
 number_of_devices=`echo $SCRATCH_DEV_POOL | wc -w`
-write_kb=$(($number_of_devices * 64))
+if ! echo $MOUNT_OPTIONS | grep -qoP 'compress=\K(?!no)\w+'; then
+   write_kb=$(($number_of_devices * 64))
+else
+   write_kb=$(($number_of_devices * 2048))
+fi
 _require_fs_space $SCRATCH_MNT $write_kb
 
 testfile=$SCRATCH_MNT/fsync-err-test
-- 
2.14.3
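
The sizing in the patch above can be checked with a little arithmetic. A
minimal sketch follows, assuming the figures stated in the commit
message (64K stripes, 128K compression input chunks that shrink to 4K on
disk); the function name is hypothetical. Filling one 64K stripe takes
64K / 4K = 16 compressed chunks, which means 16 * 128K = 2048K of input
per device.

```c
/*
 * Return how many KiB must be written so that every stripe is hit.
 * All sizes are in KiB. The 4K compressed size is an assumption taken
 * from the commit message, not a guaranteed compression ratio.
 */
static unsigned int required_write_kb(unsigned int devices, int compressed)
{
	const unsigned int stripe_kb = 64;	/* fixed btrfs stripe size */
	const unsigned int chunk_kb = 128;	/* compression input chunk */
	const unsigned int on_disk_kb = 4;	/* assumed size after compression */

	if (!compressed)
		return devices * stripe_kb;

	/* Chunks needed to fill one stripe, times the input size of each. */
	return devices * (stripe_kb / on_disk_kb) * chunk_kb;
}
```

For a pool of four devices this gives 256K without compression and
8192K with it, which agrees with the 64 and 2048 multipliers in the test.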





[PATCH 0/2] btrfs fiemap related BUG fix.

2018-03-07 Thread robbieko
From: Robbie Ko 

This patchset intends to fix btrfs fiemap related bug.

The fiemap has the following problems:

1) Wrong extent count when fm_extent_count is zero.


2) SHARED bit is not correct
I have two ideas, but I do not know which one is the best.

Like:
 # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487   64   0x1
 # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone

1. When any extent is shared in extent map, the entire extent map is shared
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487   64  0x2001

2. Split into different extent
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..31]: 4241424..4241455   32  0x0
   1: [32..63]:4241456..4241487   32  0x2001

Robbie Ko (2):
  Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero
  Btrfs: fix fiemap extent SHARED flag error with range clone.

 fs/btrfs/extent_io.c | 150 ---
 1 file changed, 132 insertions(+), 18 deletions(-)

--
1.9.1



[PATCH 1/2] Btrfs: fiemap: pass correct bytenr when fm_extent_count is zero

2018-03-07 Thread robbieko
From: Robbie Ko 

 # mount /dev/vdb5 /mnt/btrfs
 # dd if=/dev/zero bs=16K count=4 oflag=dsync of=/mnt/btrfs/file
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..127]:25088..25215   128   0x1

Run fiemap with fm_extent_count set to 0, we'll get wrong value 4
instead of 1.

[REASON]
When fm_extent_count is 0, disko is not initialized correctly,
The value is 0 in this case, not the right bytenr.

[FIX]
Use correct disko.

Signed-off-by: Robbie Ko 
---
 fs/btrfs/extent_io.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 012d638..066b6df 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4567,7 +4567,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
offset_in_extent = em_start - em->start;
em_end = extent_map_end(em);
em_len = em_end - em_start;
-   disko = 0;
+   disko = em->block_start + offset_in_extent;
flags = 0;

/*
@@ -4590,8 +4590,6 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 bytenr = em->block_start -
(em->start - em->orig_start);

-   disko = em->block_start + offset_in_extent;
-
/*
 * As btrfs supports shared space, this information
 * can be exported to userspace tools via
--
1.9.1



[PATCH 2/2] Btrfs: fix fiemap extent SHARED flag error with range clone.

2018-03-07 Thread robbieko
From: Robbie Ko 

[BUG]
Range clone can cause fiemap to return error result.
Like:
 # mount /dev/vdb5 /mnt/btrfs
 # dd if=/dev/zero bs=16K count=2 oflag=dsync of=/mnt/btrfs/file
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487   64   0x1

 # cloner -s $((16*1024)) /mnt/btrfs/file /mnt/btrfs/file_clone
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487   64   0x1
 If we clone second file extent, we will get error FLAGS,
 SHARED bit is not set.

[REASON]
Btrfs only checks if the first extent in extent map is shared,
but extent may merge.

[FIX]
Here we will check each extent with extent map range,
if one of them is shared, extent map is shared.

[PATCH RESULT]
 # xfs_io -c "fiemap -v" /mnt/btrfs/file
 /mnt/btrfs/file:
 EXT: FILE-OFFSET  BLOCK-RANGE  TOTAL FLAGS
   0: [0..63]: 4241424..4241487   64  0x2001

Signed-off-by: Robbie Ko 
---
 fs/btrfs/extent_io.c | 146 +--
 1 file changed, 131 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 066b6df..5c6dca9 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4394,8 +4394,8 @@ static int emit_fiemap_extent(struct fiemap_extent_info *fieinfo,
 */
if (cache->offset + cache->len  == offset &&
cache->phys + cache->len == phys  &&
-   (cache->flags & ~FIEMAP_EXTENT_LAST) ==
-   (flags & ~FIEMAP_EXTENT_LAST)) {
+   (cache->flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED)) ==
+   (flags & ~(FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_SHARED))) {
cache->len += len;
cache->flags |= flags;
goto try_submit_last;
@@ -4450,6 +4450,134 @@ static int emit_last_fiemap_cache(struct btrfs_fs_info *fs_info,
return ret;
 }

+/*
+ * Helper to check the file range is shared.
+ *
+ * Fiemap extent will be combined with many extents, so we need to examine
+ * each extent, and if shared, the results are shared.
+ *
+ * Return: 0 if file range is not shared, 1 if it is shared, < 0 on error.
+ */
+static int extent_map_check_shared(struct inode *inode, u64 start, u64 end)
+{
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+   struct btrfs_root *root = BTRFS_I(inode)->root;
+   int ret = 0;
+   struct extent_buffer *leaf;
+   struct btrfs_path *path;
+   struct btrfs_file_extent_item *fi;
+   struct btrfs_key found_key;
+   int check_prev = 1;
+   int extent_type;
+   int shared = 0;
+   u64 cur_offset;
+   u64 extent_end;
+   u64 ino = btrfs_ino(BTRFS_I(inode));
+   u64 disk_bytenr;
+
+   path = btrfs_alloc_path();
+   if (!path) {
+   return -ENOMEM;
+   }
+
+   cur_offset = start;
+   while (1) {
+   ret = btrfs_lookup_file_extent(NULL, root, path, ino,
+  cur_offset, 0);
+   if (ret < 0)
+   goto error;
+   if (ret > 0 && path->slots[0] > 0 && check_prev) {
+   leaf = path->nodes[0];
+   btrfs_item_key_to_cpu(leaf, &found_key,
+ path->slots[0] - 1);
+   if (found_key.objectid == ino &&
+   found_key.type == BTRFS_EXTENT_DATA_KEY)
+   path->slots[0]--;
+   }
+   check_prev = 0;
+next_slot:
+   leaf = path->nodes[0];
+   if (path->slots[0] >= btrfs_header_nritems(leaf)) {
+   ret = btrfs_next_leaf(root, path);
+   if (ret < 0)
+   goto error;
+   if (ret > 0)
+   break;
+   leaf = path->nodes[0];
+   }
+
+   disk_bytenr = 0;
+   btrfs_item_key_to_cpu(leaf, &found_key, path->slots[0]);
+
+   if (found_key.objectid > ino)
+   break;
+   if (WARN_ON_ONCE(found_key.objectid < ino) ||
+   found_key.type < BTRFS_EXTENT_DATA_KEY) {
+   path->slots[0]++;
+   goto next_slot;
+   }
+   if (found_key.type > BTRFS_EXTENT_DATA_KEY ||
+   found_key.offset > end)
+   break;
+
+   fi = btrfs_item_ptr(leaf, path->slots[0],
+   struct btrfs_file_extent_item);
+   extent_type = btrfs_file_extent_type(leaf, fi);
+
+   if (extent_type == BTRFS_FILE_EXTENT_REG ||
+   extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
+   

Re: [PATCH 3/8] btrfs-progs: constify pathnames passed as arguments

2018-03-07 Thread Nikolay Borisov


On  2.03.2018 20:46, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> It's unlikely we're going to modify a pathname argument, so codify that
> and use const.
> 
> Signed-off-by: Jeff Mahoney 
> ---
>  chunk-recover.c | 4 ++--
>  cmds-device.c   | 2 +-
>  cmds-fi-usage.c | 6 +++---
>  cmds-rescue.c   | 4 ++--
>  send-utils.c| 4 ++--
>  5 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/chunk-recover.c b/chunk-recover.c
> index 705bcf52..1d30db51 100644
> --- a/chunk-recover.c
> +++ b/chunk-recover.c
> @@ -1492,7 +1492,7 @@ out:
>   return ERR_PTR(ret);
>  }
>  
> -static int recover_prepare(struct recover_control *rc, char *path)
> +static int recover_prepare(struct recover_control *rc, const char *path)
>  {
>   int ret;
>   int fd;
> @@ -2296,7 +2296,7 @@ static void validate_rebuild_chunks(struct recover_control *rc)
>  /*
>   * Return 0 when successful, < 0 on error and > 0 if aborted by user
>   */
> -int btrfs_recover_chunk_tree(char *path, int verbose, int yes)
> +int btrfs_recover_chunk_tree(const char *path, int verbose, int yes)
>  {
>   int ret = 0;
>   struct btrfs_root *root = NULL;
> diff --git a/cmds-device.c b/cmds-device.c
> index 86459d1b..a49c9d9d 100644
> --- a/cmds-device.c
> +++ b/cmds-device.c
> @@ -526,7 +526,7 @@ static const char * const cmd_device_usage_usage[] = {
>   NULL
>  };
>  
> -static int _cmd_device_usage(int fd, char *path, unsigned unit_mode)
> +static int _cmd_device_usage(int fd, const char *path, unsigned unit_mode)

Actually the path parameter is not used in this function at all, I'd say
just remove it.

>  {
>   int i;
>   int ret = 0;
> diff --git a/cmds-fi-usage.c b/cmds-fi-usage.c
> index de7ad668..9a1c76ab 100644
> --- a/cmds-fi-usage.c
> +++ b/cmds-fi-usage.c
> @@ -227,7 +227,7 @@ static int cmp_btrfs_ioctl_space_info(const void *a, const void *b)
>  /*
>   * This function load all the information about the space usage
>   */
> -static struct btrfs_ioctl_space_args *load_space_info(int fd, char *path)
> +static struct btrfs_ioctl_space_args *load_space_info(int fd, const char *path)
>  {
>   struct btrfs_ioctl_space_args *sargs = NULL, *sargs_orig = NULL;
>   int ret, count;
> @@ -305,7 +305,7 @@ static void get_raid56_used(struct chunk_info *chunks, int chunkcount,
>  #define  MIN_UNALOCATED_THRESH   SZ_16M
>  static int print_filesystem_usage_overall(int fd, struct chunk_info *chunkinfo,
>   int chunkcount, struct device_info *devinfo, int devcount,
> - char *path, unsigned unit_mode)
> + const char *path, unsigned unit_mode)
>  {
>   struct btrfs_ioctl_space_args *sargs = NULL;
>   int i;
> @@ -931,7 +931,7 @@ static void _cmd_filesystem_usage_linear(unsigned unit_mode,
>  static int print_filesystem_usage_by_chunk(int fd,
>   struct chunk_info *chunkinfo, int chunkcount,
>   struct device_info *devinfo, int devcount,
> - char *path, unsigned unit_mode, int tabular)
> + const char *path, unsigned unit_mode, int tabular)
>  {
>   struct btrfs_ioctl_space_args *sargs;
>   int ret = 0;
> diff --git a/cmds-rescue.c b/cmds-rescue.c
> index c40088ad..c61145bc 100644
> --- a/cmds-rescue.c
> +++ b/cmds-rescue.c
> @@ -32,8 +32,8 @@ static const char * const rescue_cmd_group_usage[] = {
>   NULL
>  };
>  
> -int btrfs_recover_chunk_tree(char *path, int verbose, int yes);
> -int btrfs_recover_superblocks(char *path, int verbose, int yes);
> +int btrfs_recover_chunk_tree(const char *path, int verbose, int yes);

That path argument is being passed to recover_prepare, which can also
take a const path parameter.

> +int btrfs_recover_superblocks(const char *path, int verbose, int yes);
>  
>  static const char * const cmd_rescue_chunk_recover_usage[] = {
>   "btrfs rescue chunk-recover [options] <device>",
> diff --git a/send-utils.c b/send-utils.c
> index b5289e76..8ce94de1 100644
> --- a/send-utils.c
> +++ b/send-utils.c
> @@ -28,8 +28,8 @@
>  #include "ioctl.h"
>  #include "btrfs-list.h"
>  
> -static int btrfs_subvolid_resolve_sub(int fd, char *path, size_t *path_len,
> -   u64 subvol_id);
> +static int btrfs_subvolid_resolve_sub(int fd, char *path,
> +   size_t *path_len, u64 subvol_id);

This seems like an unrelated change. As a matter of fact
btrfs_subvolid_resolve_sub is used only by btrfs_subvolid_resolve. So if
you move the latter after the former then you can drop the declaration
at the beginning of the file altogether.

>  
>  static int btrfs_get_root_id_by_sub_path(int mnt_fd, const char *sub_path,
>u64 *root_id)
> 


Re: [PATCH 2/8] btrfs-progs: qgroups: fix misleading index check

2018-03-07 Thread Nikolay Borisov


On  2.03.2018 20:46, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> In print_single_qgroup_table we check the loop index against
> BTRFS_QGROUP_CHILD, but what we really mean is "last column."  Since
> we have an enum value to indicate the last value, use that instead
> of assuming that BTRFS_QGROUP_CHILD is always last.
> 
> Signed-off-by: Jeff Mahoney 

Reviewed-by: Nikolay Borisov 

> ---
>  qgroup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/qgroup.c b/qgroup.c
> index 11659e83..67bc0738 100644
> --- a/qgroup.c
> +++ b/qgroup.c
> @@ -267,7 +267,7 @@ static void print_single_qgroup_table(struct btrfs_qgroup *qgroup)
>   continue;
>   print_qgroup_column(qgroup, i);
>  
> - if (i != BTRFS_QGROUP_CHILD)
> + if (i != BTRFS_QGROUP_ALL - 1)
>   printf(" ");
>   }
>   printf("\n");
> 
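
The idea behind the change quoted above can be shown in isolation. This
is a minimal sketch with a made-up enum and helper, not the btrfs-progs
code: the last enumerator acts as a sentinel that counts the columns, so
"last column" is written as sentinel - 1 and stays correct if a column
is appended later.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical column list; COL_ALL is a sentinel, not a real column. */
enum column {
	COL_ID,
	COL_RFER,
	COL_EXCL,
	COL_ALL
};

/* Join the cells with single spaces, with no trailing separator. */
static void format_row(char *buf, size_t size, const char *const cells[COL_ALL])
{
	int i;

	buf[0] = '\0';
	for (i = 0; i < COL_ALL; i++) {
		strncat(buf, cells[i], size - strlen(buf) - 1);
		/* Compare against the sentinel, not a specific column. */
		if (i != COL_ALL - 1)
			strncat(buf, " ", size - strlen(buf) - 1);
	}
}
```

Had the check been written against COL_EXCL directly, adding a fourth
column would silently drop the separator before it.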


Re: [PATCH 6/8] btrfs-progs: qgroups: introduce btrfs_qgroup_query

2018-03-07 Thread Misono, Tomohiro
On 2018/03/03 3:47, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> The only mechanism we have in the progs for searching qgroups is to load
> all of them and filter the results.  This works for qgroup show but
> to add quota information to 'btrfs subvolume show' it's pretty wasteful.
> 
> This patch splits out setting up the search and performing the search so
> we can search for a single qgroupid more easily.
> 
> Signed-off-by: Jeff Mahoney 
> ---
>  qgroup.c | 98 +---
>  qgroup.h |  7 +
>  2 files changed, 77 insertions(+), 28 deletions(-)
> 
> diff --git a/qgroup.c b/qgroup.c
> index b1be3311..2d0a6947 100644
> --- a/qgroup.c
> +++ b/qgroup.c
> @@ -1146,11 +1146,11 @@ static inline void print_status_flag_warning(u64 flags)
>   warning("qgroup data inconsistent, rescan recommended");
>  }
>  
> -static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
> +static int __qgroups_search(int fd, struct btrfs_ioctl_search_args *args,
> + struct qgroup_lookup *qgroup_lookup)
>  {
>   int ret;
> - struct btrfs_ioctl_search_args args;
> - struct btrfs_ioctl_search_key *sk = &args.key;
> + struct btrfs_ioctl_search_key *sk = &args->key;
>   struct btrfs_ioctl_search_header *sh;
>   unsigned long off = 0;
>   unsigned int i;
> @@ -1161,30 +1161,12 @@ static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
>   u64 qgroupid;
>   u64 qgroupid1;
>  
> - memset(&args, 0, sizeof(args));
> -
> - sk->tree_id = BTRFS_QUOTA_TREE_OBJECTID;
> - sk->max_type = BTRFS_QGROUP_RELATION_KEY;
> - sk->min_type = BTRFS_QGROUP_STATUS_KEY;
> - sk->max_objectid = (u64)-1;
> - sk->max_offset = (u64)-1;
> - sk->max_transid = (u64)-1;
> - sk->nr_items = 4096;
> -
>   qgroup_lookup_init(qgroup_lookup);
>  
>   while (1) {
> - ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
> + ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, args);
>   if (ret < 0) {
> - if (errno == ENOENT) {
> - error("can't list qgroups: quotas not enabled");
> - ret = -ENOTTY;
> - } else {
> - error("can't list qgroups: %s",
> -strerror(errno));
> - ret = -errno;
> - }
> -
> + ret = -errno;

Originally, -ENOTTY was returned when qgroup is disabled,
but this change returns -ENOENT instead, so it seems that the error check
in 7th patch would not work correctly when qgroup is disabled.
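
One way to keep the old behaviour would be to translate the error in the
caller that lists qgroups. The following is only a sketch of that
mapping, with a hypothetical helper name; it assumes, as in the removed
lines, that errno is ENOENT when quotas are not enabled.

```c
#include <errno.h>

/*
 * Map the errno left by a failed tree search to the value the old
 * code returned: -ENOTTY when quotas are not enabled (ENOENT), and
 * the negated errno otherwise. Hypothetical helper, for illustration.
 */
static int qgroup_search_error(int err)
{
	if (err == ENOENT)
		return -ENOTTY;
	return -err;
}
```

A caller that wants the new single-qgroup lookup to see the raw -ENOENT
could skip this mapping, so the translation would stay local to the
listing path.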

>   break;
>   }
>  
> @@ -1198,14 +1180,14 @@ static int __qgroups_search(int fd, struct qgroup_lookup *qgroup_lookup)
>* read the root_ref item it contains
>*/
>   for (i = 0; i < sk->nr_items; i++) {
> - sh = (struct btrfs_ioctl_search_header *)(args.buf +
> + sh = (struct btrfs_ioctl_search_header *)(args->buf +
> off);
>   off += sizeof(*sh);
>  
>   switch (btrfs_search_header_type(sh)) {
>   case BTRFS_QGROUP_STATUS_KEY:
>   si = (struct btrfs_qgroup_status_item *)
> -  (args.buf + off);
> +  (args->buf + off);
>   flags = btrfs_stack_qgroup_status_flags(si);
>  
>   print_status_flag_warning(flags);
> @@ -1213,7 +1195,7 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   case BTRFS_QGROUP_INFO_KEY:
>   qgroupid = btrfs_search_header_offset(sh);
>   info = (struct btrfs_qgroup_info_item *)
> -(args.buf + off);
> +(args->buf + off);
>  
>   ret = update_qgroup_info(fd, qgroup_lookup,
>qgroupid, info);
> @@ -1221,7 +1203,7 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   case BTRFS_QGROUP_LIMIT_KEY:
>   qgroupid = btrfs_search_header_offset(sh);
>   limit = (struct btrfs_qgroup_limit_item *)
> - (args.buf + off);
> + (args->buf + off);
>  
>   ret = update_qgroup_limit(fd, qgroup_lookup,
> qgroupid, limit);
> @@ -1267,6 +1249,66 @@ static int __qgroups_search(int fd, struct 
> qgroup_lookup *qgroup_lookup)
>   

Re: [PATCH 4.4 1/2] btrfs: Don't clear SGID when inheriting ACLs

2018-03-07 Thread Nikolay Borisov


On  7.03.2018 09:57, Nikolay Borisov wrote:
> From: Jan Kara 
> 
> When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
> set, DIR1 is expected to have SGID bit set (and owning group equal to
> the owning group of 'DIR0'). However when 'DIR0' also has some default
> ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
> 'DIR1' to get cleared if user is not member of the owning group.
> 
> Fix the problem by moving posix_acl_update_mode() out of
> __btrfs_set_acl() into btrfs_set_acl(). That way the function will not be
> called when inheriting ACLs which is what we want as it prevents SGID
> bit clearing and the mode has been properly set by posix_acl_create()
> anyway.
> 
> Fixes: 073931017b49d9458aa351605b43a7e34598caef
> CC: sta...@vger.kernel.org
> CC: linux-btrfs@vger.kernel.org
> CC: David Sterba 
> Signed-off-by: Jan Kara 
> Signed-off-by: David Sterba 
> Signed-off-by: Nikolay Borisov 
> ---

Geez, forgot to add the in-reply-to for proper threading. Anyways, those
are the two backports I emailed you yesterday about.


>  fs/btrfs/acl.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/btrfs/acl.c b/fs/btrfs/acl.c
> index fb3e64d37cb4..233bbc8789e0 100644
> --- a/fs/btrfs/acl.c
> +++ b/fs/btrfs/acl.c
> @@ -82,12 +82,6 @@ static int __btrfs_set_acl(struct btrfs_trans_handle 
> *trans,
>   switch (type) {
>   case ACL_TYPE_ACCESS:
>   name = POSIX_ACL_XATTR_ACCESS;
> - if (acl) {
> - ret = posix_acl_update_mode(inode, &inode->i_mode, &acl);
> - if (ret)
> - return ret;
> - }
> - ret = 0;
>   break;
>   case ACL_TYPE_DEFAULT:
>   if (!S_ISDIR(inode->i_mode))
> @@ -123,6 +117,13 @@ static int __btrfs_set_acl(struct btrfs_trans_handle 
> *trans,
>  
>  int btrfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>  {
> + int ret;
> +
> + if (type == ACL_TYPE_ACCESS && acl) {
> + ret = posix_acl_update_mode(inode, &inode->i_mode, &acl);
> + if (ret)
> + return ret;
> + }
>   return __btrfs_set_acl(NULL, inode, acl, type);
>  }
>  
> 