[PATCH] btrfs-progs: Fix NULL pointer when receive clone operation

2016-12-14 Thread Qu Wenruo
The subvol_info returned from subvol_uuid_search() can be NULL.
So the branch checking IS_ERR(si) should also check if it's NULL.

Reported-by: Tsutomu Itoh 
Signed-off-by: Qu Wenruo 
---
 cmds-receive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index cb42aa2..c8f2fff 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -750,7 +750,7 @@ static int process_clone(const char *path, u64 offset, u64 len,
 	si = subvol_uuid_search(&rctx->sus, 0, clone_uuid, clone_ctransid,
 				NULL,
 				subvol_search_by_received_uuid);
-	if (IS_ERR(si)) {
+	if (IS_ERR(si) || !si) {
 		if (memcmp(clone_uuid, rctx->cur_subvol.received_uuid,
 				BTRFS_UUID_SIZE) == 0) {
 			/* TODO check generation of extent */
-- 
2.10.2



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
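
A note on why the extra check in the patch above matters: under the usual kernel-style error-pointer convention, IS_ERR() only recognizes pointers in the top error range, so a NULL return slips past it. The following is a minimal userspace sketch of that convention, not the btrfs-progs code. The helper definitions, demo_search() and demo_lookup_failed() are assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Stand-ins modeled on the kernel-style error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Only the last MAX_ERRNO addresses count as encoded errors. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical record type, standing in for a subvolume lookup result. */
struct demo_subvol_info {
	const char *path;
};

/*
 * Hypothetical lookup with three outcomes:
 * mode 0: found, mode 1: not found (NULL), mode 2: encoded error.
 */
static struct demo_subvol_info *demo_search(int mode)
{
	static struct demo_subvol_info found = { "snap" };

	if (mode == 0)
		return &found;
	if (mode == 1)
		return NULL;
	return ERR_PTR(-ENOENT);
}

/* The combined check from the patch: an error pointer or NULL both fail. */
static int demo_lookup_failed(const struct demo_subvol_info *si)
{
	return IS_ERR(si) || !si;
}
```

With these stand-ins, IS_ERR(demo_search(1)) is false even though the result is NULL, which is the case that would be dereferenced without the added `!si` test.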


[GIT PULL] btrfs fixes and cleanups

2016-12-14 Thread Liu Bo
Hi David,

This is the collection of my patches targeting 4.10. I've
dropped the patch "Btrfs: adjust len of writes if following a
preallocated extent" because of the deadlock caused by that
commit.

The patches are based on v4.9-rc8 and have been tested against
fstests with default mount options to make sure they don't
break anything.

I haven't got a kernel.org git repo, so this is mainly for
tracking purposes and for testing the git flow.

(cherry-pick patches might be the only way at this moment...sorry
for the inconvenience.)

Anyway, patches can be found at

https://github.com/liubogithub/btrfs-work.git for-dave

Thanks,
liubo

Liu Bo (9):
  Btrfs: add 'inode' for extent map tracepoint
  Btrfs: add truncated_len for ordered extent tracepoints
  Btrfs: use down_read_nested to make lockdep silent
  Btrfs: fix lockdep warning about log_mutex
  Btrfs: fix truncate down when no_holes feature is enabled
  Btrfs: fix btrfs_ordered_update_i_size to update disk_i_size properly
  Btrfs: fix comment in btrfs_page_mkwrite
  Btrfs: clean up btrfs_ordered_update_i_size
  Btrfs: fix another race between truncate and lockless dio write

 fs/btrfs/extent-tree.c   |  3 ++-
 fs/btrfs/inode.c | 43 +++
 fs/btrfs/ordered-data.c  | 42 --
 fs/btrfs/tree-log.c  | 13 ++---
 include/trace/events/btrfs.h | 16 
 5 files changed, 83 insertions(+), 34 deletions(-)

-- 
2.5.5



[PATCH] Btrfs: fix another race between truncate and lockless dio write

2016-12-14 Thread Liu Bo
Dio writes can update i_size in btrfs_get_blocks_direct when they
write to an offset beyond EOF, so that endio can update disk_i_size
correctly (because we don't update disk_i_size beyond i_size).

However, when truncating down a file, we first update i_size
and then wait for in-flight lockless dio reads/writes. As described
above, i_size may be changed back by those dio writes, and
file extents then don't get truncated.

Since lockless dio writes are always overwrites, i_size is not
supposed to be changed by them, so this adds a check to filter out
this case.

The race could be reproduced by fstests/generic/299 with patch
"Btrfs: fix btrfs_ordered_update_i_size to update disk_i_size properly"
 applied.

Signed-off-by: Liu Bo 
---
 fs/btrfs/inode.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index c9973e5..171d8e8 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -72,6 +72,7 @@ struct btrfs_dio_data {
u64 reserve;
u64 unsubmitted_oe_range_start;
u64 unsubmitted_oe_range_end;
+   int overwrite;
 };
 
 static const struct inode_operations btrfs_dir_inode_operations;
@@ -7833,7 +7834,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 * Need to update the i_size under the extent lock so buffered
 * readers will get the updated i_size when we unlock.
 */
-   if (start + len > i_size_read(inode))
+   if (!dio_data->overwrite && start + len > i_size_read(inode))
i_size_write(inode, start + len);
 
adjust_dio_outstanding_extents(inode, dio_data, len);
@@ -8715,6 +8716,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 * not unlock the i_mutex at this case.
 */
if (offset + count <= inode->i_size) {
+   dio_data.overwrite = 1;
inode_unlock(inode);
relock = true;
}
-- 
2.5.5
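
The effect of the new flag can be modeled outside the kernel. The sketch below is a simplified illustration of the two conditions touched by the patch, assuming plain integers in place of the inode; demo_dio_data, demo_is_overwrite() and demo_isize_after_get_blocks() are hypothetical names and not kernel functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-request dio bookkeeping. */
struct demo_dio_data {
	int overwrite;
};

/*
 * Mirrors the submission-time decision: a write that ends at or before
 * the current i_size is an overwrite and may run without the inode lock.
 */
static int demo_is_overwrite(uint64_t offset, uint64_t count, uint64_t i_size)
{
	return offset + count <= i_size;
}

/*
 * Mirrors the patched condition in the block-mapping step: only a write
 * that is not an overwrite is allowed to push i_size forward.
 * Returns the i_size that would be in effect afterwards.
 */
static uint64_t demo_isize_after_get_blocks(uint64_t i_size,
					    const struct demo_dio_data *d,
					    uint64_t start, uint64_t len)
{
	if (!d->overwrite && start + len > i_size)
		return start + len;
	return i_size;
}
```

In this model, a write of 4096 bytes at offset 4096 into an 8192-byte file is classified as an overwrite. If a truncate then sets i_size to 0 before the mapping step runs, the flag keeps i_size at 0, whereas the unpatched condition (overwrite left at 0) would raise it back to 8192.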



Re: Btrfs progs pre-release 4.9-rc1

2016-12-14 Thread Tsutomu Itoh
On 2016/12/14 23:42, David Sterba wrote:
> Hi,
> 
> a pre-release has been tagged. Contains almost the entire devel branch from
> today. There are small fixes, the lowmem mode of check gets more updates but
> still does not work in the --repair mode and is considered experimental.
> 
> ETA for 4.9 is in +6 days (2016-12-20).
> 
> Minor fixes, docs improvements or more testcases will be still considered for
> 4.9 release.

xfstests btrfs/{108,109,117}, which worked with 4.8.5, no longer work properly.

+ ./check btrfs/108
FSTYP -- btrfs
PLATFORM  -- Linux/x86_64 luna 4.9.0
MKFS_OPTIONS  -- /dev/sdb3
MOUNT_OPTIONS -- /dev/sdb3 /test6

btrfs/108 1s ... [failed, exit status 1] - output mismatch (see /xfstests/results//btrfs/108.out.bad)
--- tests/btrfs/108.out 2015-10-19 09:55:52.0 +0900
+++ /xfstests/results//btrfs/108.out.bad 2016-12-15 15:41:43.771411349 +0900
@@ -8,6 +8,6 @@
 File digests in the original filesystem:
 fbf36a062ffcbd644b5739c4d683ccc7  SCRATCH_MNT/snap/foo
 5d2c92827a70aad932cfe7363105c55e  SCRATCH_MNT/snap/bar
-File digests in the new filesystem:
-fbf36a062ffcbd644b5739c4d683ccc7  SCRATCH_MNT/snap/foo
-5d2c92827a70aad932cfe7363105c55e  SCRATCH_MNT/snap/bar
+./common/rc: line 2784: 22352 Segmentation fault  (core dumped) "$@" >> $seqres.full 2>&1
...
(Run 'diff -u tests/btrfs/108.out /xfstests/results//btrfs/108.out.bad'  to see the entire diff)
Ran: btrfs/108
Failures: btrfs/108
Failed 1 of 1 tests

Thanks,
Tsutomu

> 
> Changes:
>   * check: many lowmem mode updates
>   * send: use splice syscall to copy buffer from kernel
>   * receive: new option to dump the stream in textual form
>   * convert:
> * move sources to own directory
> * prevent accounting of blocks beyond end of the device
> * make it work with 64k sectorsize
>   * mkfs: move sources to own directory
>   * defrag: warns if directory used without -r
>   * dev stats:
> * new option to check stats for non-zero values
> * add long option for -z
>   * library: version bump to 0.1.2, added subvol_uuid_search2
>   * other:
> * cleanups
> * docs updates
> 
> Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
> Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
> 
> Shortlog:
> 
> Adam Borowski (1):
>   btrfs-progs: man mkfs: warn about RAID5/6 being experimental
> 
> Anand Jain (1):
>   btrfs-progs: recursive defrag cleanup duplicate code
> 
> Austin S. Hemmelgarn (1):
>   btrfs-progs: dev stats: add dev stats returncode option
> 
> Chandan Rajendra (3):
>   btrfs-progs: Use helper function to access 
> btrfs_super_block->sys_chunk_array_size
>   btrfs-progs: convert: Prevent accounting blocks beyond end of device
>   btrfs-progs: convert: Fix migrate_super_block() to work with 64k 
> sectorsize
> 
> David Sterba (35):
>   btrfs-progs: remove extra newline from messages
>   btrfs-progs: use symbolic name for first inode number when searching
>   btrfs-progs: send: use splice syscall instead of read/write to transfer 
> buffer
>   btrfs-progs: send: rename thread callback to read data from kernel
>   btrfs-progs: make incompat bit wrappers more compact
>   btrfs-progs: receive: rename receive context variable
>   btrfs-progs: check: use on-stack path buffer in check_fs_first_inode
>   btrfs-progs: check: use on-stack path buffer in check_fs_root_v2
>   btrfs-progs: check: use on-stack path buffer in check_fs_roots_v2
>   btrfs-progs: send dump: introduce helper for printing escaped path
>   btrfs-progs: send dump: print escaped path
>   btrfs-progs: send dump: use reentrant variant of localtime
>   btrfs-progs: tests: add more gobal option to test 001-btrfs
>   btrfs-progs: docs: update receive help and manual page
>   btrfs-progs: build: extend pattern rules for standalone directories
>   btrfs-progs: move btrfs-convert to own directory
>   btrfs-progs: move mkfs.btrfs sources to own directory
>   btrfs-progs: tests: check for partscan support in 
> misc/006-partitioned-loopdev
>   btrfs-progs: run mkfs tests in CI
>   btrfs-progs: mkfs: annotation of a case
>   btrfs-progs: docs: clarify trim after mkfs -K
>   btrfs-progs: docs: make documentation updates workflow more clear
>   btrfs-progs: dev stats: adjust some error messages
>   btrfs-progs: dev stats: use char type path
>   btrfs-progs: dev stats: use table based printing of items
>   btrfs-progs: dev stats: add long option for -z
>   btrfs-progs: docs: update dev stats help and manual page
>   btrfs-progs: help: fix printing of aliased commands
>   btrfs-progs: fixup API after change in subvol_uuid_search
>   btrfs-progs: library: bump to 0.1.2
>   btrfs-progs: handle failed strdup in subvol_uuid_search2
>   btrfs-progs: dev stats: update option name for checking non-zero 

[PATCH v2] btrfs-progs: tests: add test for --sync option of qgroup show

2016-12-14 Thread Tsutomu Itoh
Simple test script for the following patch.

   btrfs-progs: qgroup: add sync option to 'qgroup show'

Signed-off-by: Tsutomu Itoh 
---
v2: dropped the test of --no-sync
---
 tests/cli-tests/005-qgroup-show-sync/test.sh | 30 
 1 file changed, 30 insertions(+)
 create mode 100755 tests/cli-tests/005-qgroup-show-sync/test.sh

diff --git a/tests/cli-tests/005-qgroup-show-sync/test.sh 
b/tests/cli-tests/005-qgroup-show-sync/test.sh
new file mode 100755
index 000..a325b48
--- /dev/null
+++ b/tests/cli-tests/005-qgroup-show-sync/test.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+#
+# simple test of qgroup show --sync option
+
+source $TOP/tests/common
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+
+setup_root_helper
+prepare_test_dev 1g
+
+run_check $TOP/mkfs.btrfs -f $IMAGE
+run_check_mount_test_dev
+
+run_check $SUDO_HELPER $TOP/btrfs subvolume create $TEST_MNT/Sub
+run_check $SUDO_HELPER $TOP/btrfs quota enable $TEST_MNT/Sub
+
+for opt in '' '--' '--sync'; do
+   run_check $SUDO_HELPER $TOP/btrfs qgroup limit 300M $TEST_MNT/Sub
+   run_check $SUDO_HELPER dd if=/dev/zero of=$TEST_MNT/Sub/file bs=1M count=200
+
+   run_check $SUDO_HELPER $TOP/btrfs qgroup show -re $opt $TEST_MNT/Sub
+
+   run_check $SUDO_HELPER $TOP/btrfs qgroup limit none $TEST_MNT/Sub
+   run_check rm -f $TEST_MNT/Sub/file
+   run_check $TOP/btrfs filesystem sync $TEST_MNT/Sub
+done
+
+run_check_umount_test_dev
-- 
2.9.3




[PATCH v3 2/2] btrfs-progs: qgroup: change the value of sort option

2016-12-14 Thread Tsutomu Itoh
The value of the sort option ('S') is not used as a short option letter.
Therefore, change it from a single letter to a non-character value.

Signed-off-by: Tsutomu Itoh 
---
This patch is separated from patch of --sync option.
---
 cmds-qgroup.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index 2a10c97..34e3bcc 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -313,10 +313,11 @@ static int cmd_qgroup_show(int argc, char **argv)
while (1) {
int c;
enum {
-   GETOPT_VAL_SYNC = 256
+   GETOPT_VAL_SORT = 256,
+   GETOPT_VAL_SYNC
};
static const struct option long_options[] = {
-   {"sort", required_argument, NULL, 'S'},
+   {"sort", required_argument, NULL, GETOPT_VAL_SORT},
{"sync", no_argument, NULL, GETOPT_VAL_SYNC},
{ NULL, 0, NULL, 0 }
};
@@ -347,7 +348,7 @@ static int cmd_qgroup_show(int argc, char **argv)
case 'f':
filter_flag |= 0x2;
break;
-   case 'S':
+   case GETOPT_VAL_SORT:
ret = btrfs_qgroup_parse_sort_string(optarg,
 &comparer_set);
if (ret)
-- 
2.9.3
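
The idea behind the change, giving long-only options values that cannot collide with any short option character, can be sketched as a small standalone parser. This is an illustration under stated assumptions, not the cmd_qgroup_show() code: demo_show_opts and demo_parse_show_opts() are hypothetical names, and only one short option ('p') is kept for brevity.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical result structure for the sketch. */
struct demo_show_opts {
	int print_parent;	/* set by -p */
	int sync;		/* set by --sync */
	const char *sort;	/* argument of --sort, or NULL */
};

/*
 * Parse argv with getopt_long(). Long-only options get values starting
 * at 256, outside the range of any single option character.
 * Returns the index of the first non-option argument, or -1 on an
 * unknown option.
 */
static int demo_parse_show_opts(int argc, char **argv,
				struct demo_show_opts *out)
{
	enum {
		GETOPT_VAL_SORT = 256,
		GETOPT_VAL_SYNC
	};
	static const struct option long_options[] = {
		{"sort", required_argument, NULL, GETOPT_VAL_SORT},
		{"sync", no_argument, NULL, GETOPT_VAL_SYNC},
		{ NULL, 0, NULL, 0 }
	};
	int c;

	out->print_parent = 0;
	out->sync = 0;
	out->sort = NULL;
	optind = 1;	/* restart scanning so the function can be reused */

	while ((c = getopt_long(argc, argv, "p", long_options, NULL)) != -1) {
		switch (c) {
		case 'p':
			out->print_parent = 1;
			break;
		case GETOPT_VAL_SORT:
			out->sort = optarg;
			break;
		case GETOPT_VAL_SYNC:
			out->sync = 1;
			break;
		default:
			return -1;
		}
	}
	return optind;
}
```

Because 'S' no longer appears anywhere, passing -S on the command line is rejected as an unknown option instead of being silently treated as --sort.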


[PATCH v3 1/2] btrfs-progs: qgroup: add sync option to 'qgroup show'

2016-12-14 Thread Tsutomu Itoh
The 'qgroup show' command does not sync the filesystem.
Therefore, 'qgroup show' may not display correct values unless the
filesystem has been synced first, e.g. with the 'filesystem sync' command.

So add the '--sync' option so that we can choose whether or not
to sync when executing the command.

Signed-off-by: Tsutomu Itoh 
---
v2: use getopt_long with enum instead of single letter (suggested by Qu)
v3: dropped the --no-sync option and separated the patch of sort
option (suggested by David)
---
 Documentation/btrfs-qgroup.asciidoc |  4 
 cmds-qgroup.c   | 22 --
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/Documentation/btrfs-qgroup.asciidoc 
b/Documentation/btrfs-qgroup.asciidoc
index 438dbc7..3053f2e 100644
--- a/Documentation/btrfs-qgroup.asciidoc
+++ b/Documentation/btrfs-qgroup.asciidoc
@@ -126,6 +126,10 @@ Prefix \'+' means ascending order and \'-' means descending order of <attr>.
 If no prefix is given, use ascending order by default.
 +
 If multiple <attr>s is given, use comma to separate.
++
+--sync
+To retrieve information after updating the state of qgroups,
+force sync of the filesystem identified by <path> before getting information.
 
 EXIT STATUS
 ---
diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index bc15077..2a10c97 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -272,8 +272,7 @@ static int cmd_qgroup_destroy(int argc, char **argv)
 }
 
 static const char * const cmd_qgroup_show_usage[] = {
-   "btrfs qgroup show -pcreFf "
-   "[--sort=qgroupid,rfer,excl,max_rfer,max_excl] <path>",
+   "btrfs qgroup show [options] <path>",
"Show subvolume quota groups.",
"-p print parent qgroup id",
"-c print child qgroup id",
@@ -288,6 +287,7 @@ static const char * const cmd_qgroup_show_usage[] = {
"   list qgroups sorted by specified items",
"   you can use '+' or '-' in front of each item.",
"   (+:ascending, -:descending, ascending default)",
+   "--sync force sync of the filesystem before getting info",
NULL
 };
 
@@ -301,6 +301,7 @@ static int cmd_qgroup_show(int argc, char **argv)
u64 qgroupid;
int filter_flag = 0;
unsigned unit_mode;
+   int sync = 0;
 
struct btrfs_qgroup_comparer_set *comparer_set;
struct btrfs_qgroup_filter_set *filter_set;
@@ -311,8 +312,12 @@ static int cmd_qgroup_show(int argc, char **argv)
 
while (1) {
int c;
+   enum {
+   GETOPT_VAL_SYNC = 256
+   };
static const struct option long_options[] = {
{"sort", required_argument, NULL, 'S'},
+   {"sync", no_argument, NULL, GETOPT_VAL_SYNC},
{ NULL, 0, NULL, 0 }
};
 
@@ -348,6 +353,9 @@ static int cmd_qgroup_show(int argc, char **argv)
if (ret)
usage(cmd_qgroup_show_usage);
break;
+   case GETOPT_VAL_SYNC:
+   sync = 1;
+   break;
default:
usage(cmd_qgroup_show_usage);
}
@@ -365,6 +373,16 @@ static int cmd_qgroup_show(int argc, char **argv)
return 1;
}
 
+   if (sync) {
+   ret = ioctl(fd, BTRFS_IOC_SYNC);
+   if (ret < 0) {
+   error("sync ioctl failed on '%s': %s", path,
+ strerror(errno));
+   close_file_or_dir(fd, dirstream);
+   goto out;
+   }
+   }
+
if (filter_flag) {
ret = lookup_path_rootid(fd, &qgroupid);
if (ret < 0) {
-- 
2.9.3


Re: [PATCH v2] btrfs-progs: qgroup: add sync option to 'qgroup show'

2016-12-14 Thread Tsutomu Itoh
Hi David,

Thanks for the review.

On 2016/12/14 19:54, David Sterba wrote:
> On Wed, Dec 07, 2016 at 04:55:15PM +0900, Tsutomu Itoh wrote:
>> The 'qgroup show' command does not synchronize filesystem.
>> Therefore, 'qgroup show' may not display the correct value unless
>> synchronized with 'filesystem sync' command etc.
>>
>> So add the '--sync' and '--no-sync' options so that we can choose
>> whether or not to synchronize when executing the command.
>>
>> Signed-off-by: Tsutomu Itoh 
>> ---
>> v2: use getopt_long with enum instead of single letter (suggested by Qu)
>> ---
>>  Documentation/btrfs-qgroup.asciidoc |  6 ++
>>  cmds-qgroup.c   | 33 +
>>  2 files changed, 35 insertions(+), 4 deletions(-)
>>
>> diff --git a/Documentation/btrfs-qgroup.asciidoc 
>> b/Documentation/btrfs-qgroup.asciidoc
>> index 438dbc7..9c65795 100644
>> --- a/Documentation/btrfs-qgroup.asciidoc
>> +++ b/Documentation/btrfs-qgroup.asciidoc
>> @@ -126,6 +126,12 @@ Prefix \'+' means ascending order and \'-' means descending order of <attr>.
>>  If no prefix is given, use ascending order by default.
>>  +
>>  If multiple <attr>s is given, use comma to separate.
>> ++
>> +--sync
>> +To retrieve information after updating the status of qgroups,
>> +invoke sync before getting information.
> 
> This could be more specific, that it's a filesystem sync.
> 
>> +--no-sync
>> +Do not invoke sync before getting information (default).
> 
> I'm not sure we need this option, how is it supposed to be used?

I made it to pair with --sync, but there is no use case in particular.
So, I would like to drop this with the next patch.

> 
>> @@ -311,8 +313,15 @@ static int cmd_qgroup_show(int argc, char **argv)
>>  
>>  while (1) {
>>  int c;
>> +enum {
>> +GETOPT_VAL_SORT = 256,
>> +GETOPT_VAL_SYNC,
>> +GETOPT_VAL_NO_SYNC
>> +};
>>  static const struct option long_options[] = {
>> -{"sort", required_argument, NULL, 'S'},
>> +{"sort", required_argument, NULL, GETOPT_VAL_SORT},
> 
> This change is unrelated to the patch, please make a separate patch for
> that.

OK. I'll separate this with the next patch.

Thanks,
Tsutomu

> 
> Otherwise looks good.
> 
>> +{"sync", no_argument, NULL, GETOPT_VAL_SYNC},
>> +{"no-sync", no_argument, NULL, GETOPT_VAL_NO_SYNC},
>>  { NULL, 0, NULL, 0 }
>>  };
>>  




Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-14 Thread Xin Zhou
Hi,

There is a large amount of dirty data, which probably cannot be committed to disk.
And this seems to happen when copying from 7200rpm to 5600rpm disks, according 
to a previous post.

The I/Os are probably buffered and pending, unable to finish in time.
It would be helpful to know whether this only happens with specific types of 5600 rpm 
disks.

And are these disks in RAID groups? Thanks.
Xin
 
 

Sent: Wednesday, December 14, 2016 at 3:38 AM
From: admin 
To: "Michal Hocko" 
Cc: linux-btrfs@vger.kernel.org, linux-ker...@vger.kernel.org, "David Sterba" 
, "Chris Mason" 
Subject: Re: page allocation stall in kernel 4.9 when copying files from one 
btrfs hdd to another
Hi,

I verified the log files and see no prior oom killer invocation. Unfortunately 
the machine has been rebooted since. Next time it happens, I will also look in 
dmesg.

Thanks,
David Arendt


Michal Hocko – Wed., 14. December 2016 11:31
> Btw. the stall should be preceded by the OOM killer invocation. Could
> you share the OOM report please. I am asking because such an OOM killer
> would be clearly pre-mature as per your meminfo. I am trying to change
> that code and seeing your numbers might help me.
>
> Thanks!
>
> On Wed 14-12-16 11:17:43, Michal Hocko wrote:
> > On Tue 13-12-16 18:11:01, David Arendt wrote:
> > > Hi,
> > >
> > > I receive the following page allocation stall while copying lots of
> > > large files from one btrfs hdd to another.
> > >
> > > Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for 
> > > 12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
> > > Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8 
> > > Tainted: P O 4.9.0 #1
> > [...]
> > > Dec 13 13:04:29 server kernel: Call Trace:
> > > Dec 13 13:04:29 server kernel: [] ? dump_stack+0x46/0x5d
> > > Dec 13 13:04:29 server kernel: [] ? 
> > > warn_alloc+0x111/0x130
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __alloc_pages_nodemask+0xbe8/0xd30
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > pagecache_get_page+0xe4/0x230
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > alloc_extent_buffer+0x10b/0x400
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_alloc_tree_block+0x125/0x560
> >
> > OK, so this is
> > find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL)
> >
> > The main question is whether this really needs to be NOFS request...
> >
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > read_extent_buffer_pages+0x21f/0x280
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __btrfs_cow_block+0x141/0x580
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_cow_block+0x100/0x150
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > btrfs_search_slot+0x1e9/0x9c0
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > __set_extent_bit+0x512/0x550
> > > Dec 13 13:04:33 server kernel: [] ? 
> > > lookup_inline_extent_backref+0xf5/0x5e0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > set_extent_bit+0x24/0x30
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > update_block_group.isra.34+0x114/0x380
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > __btrfs_free_extent.isra.35+0xf4/0xd20
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > btrfs_merge_delayed_refs+0x61/0x5d0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > __btrfs_run_delayed_refs+0x902/0x10a0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > btrfs_run_delayed_refs+0x90/0x2a0
> > > Dec 13 13:04:34 server kernel: [] ? 
> > > delayed_ref_async_start+0x84/0xa0
> >
> > What would cause the reclaim recursion?
> >
> > > Dec 13 13:04:34 server kernel: Mem-Info:
> > > Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
> > > isolated_anon:0\x0a active_file:7370032 inactive_file:450105
> > > isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
> > > unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
> > > mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
> > > free_cma:0
> >
> > This speaks for itself. There is a lot of dirty data, basically no
> > anonymous memory and GFP_NOFS cannot do much to reclaim obviously. This
> > > is either a configuration bug as somebody noted down the thread (setting
> > the dirty_ratio) or suboptimality of the btrfs code which might request
> > NOFS even though it is not strictly necessary. This would be more for
> > btrfs developers.
> > --
> > Michal Hocko
> > SUSE Labs
>
> --
> Michal Hocko
> SUSE Labs


Re: [PATCH] duperemove: test presence of dedupe ioctl

2016-12-14 Thread Christoph Hellwig
On Wed, Dec 14, 2016 at 10:38:45AM -0800, Darrick J. Wong wrote:
> > > +struct fake_btrfs_ioctl_same_args {
> > > + struct btrfs_ioctl_same_args args;
> > > + struct btrfs_ioctl_same_extent_info info;
> > > +};
> > 
> > Why does this need a fake structure here?
> 
> In order to test the ioctl we have to fill out at least one
> btrfs_ioctl_same_extent_info so that we get far enough into the fs-specific
> dedupe_range handler that we've verified that the fs is capable of dedupe and
> that the fs is willing to try to satisfy the request.

Oh, got it, it's just the fake that tripped me up.

> We could just malloc sizeof(_same_args) + sizeof(_same_extent_info)...

Either that, or more simply just don't give the structure a name
by just declaring it locally on the stack:

struct {
struct btrfs_ioctl_same_args args;
struct btrfs_ioctl_same_extent_info info;
} sa = { 0 };
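
The suggestion above can be tried out in isolation. The following sketch uses stand-in structures rather than the real btrfs ioctl header, so the field layout shown is an assumption for illustration; demo_same_args, demo_same_extent_info and demo_build_probe() are hypothetical names. It shows how an unnamed local struct places exactly one extent-info record directly behind the request header, zero-initialized, without introducing a named "fake" type.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the request header (layout assumed for this sketch). */
struct demo_same_args {
	uint64_t logical_offset;
	uint64_t length;
	uint16_t dest_count;
	uint16_t reserved1;
	uint32_t reserved2;
};

/* Stand-in for one per-destination record (layout assumed as well). */
struct demo_same_extent_info {
	int64_t fd;
	uint64_t logical_offset;
	uint64_t bytes_deduped;
	int32_t status;
	uint32_t reserved;
};

/*
 * Build a zero-length request with a single destination into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t demo_build_probe(unsigned char *buf, size_t buflen,
			       int64_t dest_fd)
{
	/* Unnamed struct declared on the stack, as suggested in the mail. */
	struct {
		struct demo_same_args args;
		struct demo_same_extent_info info;
	} sa = { 0 };

	if (buflen < sizeof(sa))
		return 0;

	sa.args.length = 0;	/* zero-length request used as a probe */
	sa.args.dest_count = 1;	/* exactly one record follows the header */
	sa.info.fd = dest_fd;

	memcpy(buf, &sa, sizeof(sa));
	return sizeof(sa);
}
```

In real code the address of `sa` would be handed to the ioctl directly; copying into a buffer is only done here so the layout can be inspected.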



Re: [PATCH] Btrfs: fix btrfs_ordered_update_i_size to update disk_i_size properly

2016-12-14 Thread Liu Bo
On Thu, Dec 01, 2016 at 01:46:10PM -0800, Liu Bo wrote:
> btrfs_ordered_update_i_size can be called by truncate and endio, but only 
> endio
> takes ordered_extent which contains the completed IO.
> 
> while truncating down a file, if there are some in-flight IOs,
> btrfs_ordered_update_i_size in endio will set disk_i_size to @orig_offset that
> is zero.  If truncating-down fails somehow, we try to recover in memory isize
> with this zero'd disk_i_size.
> 
> Fix it by only updating disk_i_size with @orig_offset when
> btrfs_ordered_update_i_size is not called from endio while truncating down and
> waiting for in-flight IOs completing their work before recover in-memory size.
> 
> Besides fixing the above issue, add an assertion for last_size to double check
> we truncate down to the desired size.
> 
> Signed-off-by: Liu Bo 
> ---
>  fs/btrfs/inode.c| 14 ++
>  fs/btrfs/ordered-data.c |  9 +++--
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 09157dd..ef3594d 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -4682,6 +4682,13 @@ int btrfs_truncate_inode_items(struct 
> btrfs_trans_handle *trans,
>  
>   btrfs_free_path(path);
>  
> + if (err == 0) {
> + /* only inline file may have last_size != new_size */
> + if (new_size >= root->sectorsize ||
> + new_size > root->fs_info->max_inline)
> + ASSERT(last_size == new_size);
> + }
> +

This ASSERT has been hit by fstests/generic/299, although it didn't show up
the first time I tested. I'm trying to figure out whether the problem is
in the code or in this ASSERT.

Thanks,

-liubo

>   if (be_nice && bytes_deleted > SZ_32M) {
>   unsigned long updates = trans->delayed_ref_updates;
>   if (updates) {
> @@ -5064,6 +5071,13 @@ static int btrfs_setsize(struct inode *inode, struct 
> iattr *attr)
>   if (ret && inode->i_nlink) {
>   int err;
>  
> + /* To get a stable disk_i_size */
> + err = btrfs_wait_ordered_range(inode, 0, (u64)-1);
> + if (err) {
> + btrfs_orphan_del(NULL, inode);
> + return err;
> + }
> +
>   /*
>* failed to truncate, disk_i_size is only adjusted down
>* as we remove extents, so it should represent the true
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index b2d1e95..5eaa25a 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -982,8 +982,13 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 
> offset,
>   }
>   disk_i_size = BTRFS_I(inode)->disk_i_size;
>  
> - /* truncate file */
> - if (disk_i_size > i_size) {
> + /*
> +  * truncate file.
> +  * If ordered is not NULL, then this is called from endio and
> +  * disk_i_size will be updated by either truncate itself or any
> +  * in-flight IOs which are inside the disk_i_size.
> +  */
> + if (!ordered && disk_i_size > i_size) {
>   BTRFS_I(inode)->disk_i_size = orig_offset;
>   ret = 0;
>   goto out;
> -- 
> 2.5.5
> 


Re: [PATCH] duperemove: test presence of dedupe ioctl

2016-12-14 Thread Darrick J. Wong
On Wed, Dec 14, 2016 at 02:44:36AM -0800, Christoph Hellwig wrote:
> On Fri, Dec 09, 2016 at 09:56:45AM -0800, Darrick J. Wong wrote:
> > Since a zero-length dedupe operation is guaranteed to succeed, use that
> > to test whether or not this filesystem supports dedupe.
> > 
> > Signed-off-by: Darrick J. Wong 
> > ---
> >  file_scan.c |   47 +--
> >  1 file changed, 37 insertions(+), 10 deletions(-)
> > 
> > diff --git a/file_scan.c b/file_scan.c
> > index 617f166..a34453e 100644
> > --- a/file_scan.c
> > +++ b/file_scan.c
> > @@ -45,11 +45,7 @@
> >  #include "file_scan.h"
> >  #include "dbfile.h"
> >  #include "util.h"
> > -
> > -/* This is not in linux/magic.h */
> > > -#ifndef XFS_SB_MAGIC
> > > -#define XFS_SB_MAGIC 0x58465342  /* 'XFSB' */
> > -#endif
> > +#include "btrfs-ioctl.h"
> >  
> >  static char path[PATH_MAX] = { 0, };
> >  static char *pathp = path;
> > @@ -189,6 +185,39 @@ static int walk_dir(const char *name)
> > return ret;
> >  }
> >  
> > +struct fake_btrfs_ioctl_same_args {
> > +   struct btrfs_ioctl_same_args args;
> > +   struct btrfs_ioctl_same_extent_info info;
> > +};
> 
> Why does this need a fake structure here?

In order to test the ioctl we have to fill out at least one
btrfs_ioctl_same_extent_info so that we get far enough into the fs-specific
dedupe_range handler that we've verified that the fs is capable of dedupe and
that the fs is willing to try to satisfy the request.

We could just malloc sizeof(_same_args) + sizeof(_same_extent_info)...

--D



Btrfs progs pre-release 4.9-rc1

2016-12-14 Thread David Sterba
Hi,

a pre-release has been tagged. Contains almost the entire devel branch from
today. There are small fixes, the lowmem mode of check gets more updates but
still does not work in the --repair mode and is considered experimental.

ETA for 4.9 is in +6 days (2016-12-20).

Minor fixes, docs improvements or more testcases will be still considered for
4.9 release.

Changes:
  * check: many lowmem mode updates
  * send: use splice syscall to copy buffer from kernel
  * receive: new option to dump the stream in textual form
  * convert:
* move sources to own directory
* prevent accounting of blocks beyond end of the device
* make it work with 64k sectorsize
  * mkfs: move sources to own directory
  * defrag: warns if directory used without -r
  * dev stats:
* new option to check stats for non-zero values
* add long option for -z
  * library: version bump to 0.1.2, added subvol_uuid_search2
  * other:
* cleanups
* docs updates

Tarballs: https://www.kernel.org/pub/linux/kernel/people/kdave/btrfs-progs/
Git: git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git

Shortlog:

Adam Borowski (1):
  btrfs-progs: man mkfs: warn about RAID5/6 being experimental

Anand Jain (1):
  btrfs-progs: recursive defrag cleanup duplicate code

Austin S. Hemmelgarn (1):
  btrfs-progs: dev stats: add dev stats returncode option

Chandan Rajendra (3):
  btrfs-progs: Use helper function to access 
btrfs_super_block->sys_chunk_array_size
  btrfs-progs: convert: Prevent accounting blocks beyond end of device
  btrfs-progs: convert: Fix migrate_super_block() to work with 64k 
sectorsize

David Sterba (35):
  btrfs-progs: remove extra newline from messages
  btrfs-progs: use symbolic name for first inode number when searching
  btrfs-progs: send: use splice syscall instead of read/write to transfer 
buffer
  btrfs-progs: send: rename thread callback to read data from kernel
  btrfs-progs: make incompat bit wrappers more compact
  btrfs-progs: receive: rename receive context variable
  btrfs-progs: check: use on-stack path buffer in check_fs_first_inode
  btrfs-progs: check: use on-stack path buffer in check_fs_root_v2
  btrfs-progs: check: use on-stack path buffer in check_fs_roots_v2
  btrfs-progs: send dump: introduce helper for printing escaped path
  btrfs-progs: send dump: print escaped path
  btrfs-progs: send dump: use reentrant variant of localtime
  btrfs-progs: tests: add more gobal option to test 001-btrfs
  btrfs-progs: docs: update receive help and manual page
  btrfs-progs: build: extend pattern rules for standalone directories
  btrfs-progs: move btrfs-convert to own directory
  btrfs-progs: move mkfs.btrfs sources to own directory
  btrfs-progs: tests: check for partscan support in misc/006-partitioned-loopdev
  btrfs-progs: run mkfs tests in CI
  btrfs-progs: mkfs: annotation of a case
  btrfs-progs: docs: clarify trim after mkfs -K
  btrfs-progs: docs: make documentation updates workflow more clear
  btrfs-progs: dev stats: adjust some error messages
  btrfs-progs: dev stats: use char type path
  btrfs-progs: dev stats: use table based printing of items
  btrfs-progs: dev stats: add long option for -z
  btrfs-progs: docs: update dev stats help and manual page
  btrfs-progs: help: fix printing of aliased commands
  btrfs-progs: fixup API after change in subvol_uuid_search
  btrfs-progs: library: bump to 0.1.2
  btrfs-progs: handle failed strdup in subvol_uuid_search2
  btrfs-progs: dev stats: update option name for checking non-zero status
  btrfs-progs: defrag: cleanup temporary errno value
  btrfs-progs: defrag: warn when deframgenting directories without -r
  btrfs-progs: update CHANGES for v4.9

Goldwyn Rodrigues (5):
  btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON
  btrfs-progs: Remove duplicate printfs in warning_trace()/assert_trace()
  btrfs-progs: check: fix extents after finding all errors
  btrfs-progs: Initialize ret to suppress compiler warning
  btrfs-progs: find_free_dev_extent() closer to kernel code

Lu Fengqi (11):
  btrfs-progs: check: introduce function to find dir_item
  btrfs-progs: check: introduce function to check inode_ref
  btrfs-progs: check: introduce function to check inode_extref
  btrfs-progs: check: introduce function to find inode_ref
  btrfs-progs: check: introduce function to check dir_item
  btrfs-progs: check: introduce function to check file extent
  btrfs-progs: check: introduce function to check inode item
  btrfs-progs: check: introduce function to check fs root
  btrfs-progs: check: introduce function to check root ref
  btrfs-progs: check: introduce low_memory mode fs_tree check
  btrfs-progs: check: fix the return value bug of cmd_check()

Noah Massey (1):
  btrfs-progs: docs: fix typo in 

[RFC] btrfs: lockdep says "possible recursive locking detected" in btrfs_clear_lock_blocking_rw()

2016-12-14 Thread Sebastian Andrzej Siewior
With lockdep enabled I managed to trigger the following lockdep splat:
| =
| [ INFO: possible recursive locking detected ]
| 4.9.0-rt0 #804 Tainted: GW  
| -
| kworker/u16:4/154 is trying to acquire lock:
|  (btrfs-fs-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x71/0x120
| 
| but task is already holding lock:
|  (btrfs-fs-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x71/0x120
| 
| other info that might help us debug this:
|  Possible unsafe locking scenario:
|
|CPU0
|
|   lock(btrfs-fs-00);
|   lock(btrfs-fs-00);
| 
|  *** DEADLOCK ***
|
|  May be due to missing lock nesting notation
|
| 6 locks held by kworker/u16:4/154:
|  #0:  ("%s-%s""btrfs", name){.+.+.+}, at: [] process_one_work+0x1f3/0x7b0
|  #1:  ((&work->normal_work)){+.+.+.}, at: [] process_one_work+0x1f3/0x7b0
|  #2:  (sb_internal){.+.+..}, at: [] start_transaction+0x2f1/0x590
|  #3:  (btrfs-fs-02){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x71/0x120
|  #4:  (btrfs-fs-01){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x71/0x120
|  #5:  (btrfs-fs-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x71/0x120
| 
| stack backtrace:
| CPU: 1 PID: 154 Comm: kworker/u16:4 Tainted: GW   4.9.0-rt1+ #804
| Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
| Workqueue: btrfs-delalloc btrfs_delalloc_helper
|  c9000123b7d0 8141a2a5 829d6db0 829d6db0
|  c9000123b890 810c19dd 02fe 0006
|  3c272f80 82308200 ce68145f590e60eb 880039c108c0
| Call Trace:
|  [] dump_stack+0x86/0xc1
|  [] __lock_acquire+0x6dd/0x11d0
|  [] lock_acquire+0x116/0x240
|  [] rt_read_lock+0x45/0x60
|  [] btrfs_clear_lock_blocking_rw+0x71/0x120
|  [] btrfs_clear_path_blocking+0x94/0xb0
|  [] btrfs_next_old_leaf+0x3df/0x420
|  [] btrfs_next_leaf+0xb/0x10
|  [] __btrfs_drop_extents+0x1cb/0xd50
|  [] cow_file_range_inline+0x191/0x6c0
|  [] compress_file_range.constprop.68+0x314/0x710
|  [] async_cow_start+0x30/0x50
|  [] btrfs_scrubparity_helper+0xfd/0x620
|  [] btrfs_delalloc_helper+0x9/0x10
|  [] process_one_work+0x26e/0x7b0
|  [] worker_thread+0x46/0x560
|  [] kthread+0xee/0x110
|  [] ret_from_fork+0x2a/0x40

I can trigger it on -RT but it won't show up on a vanilla kernel. I
don't see an obvious difference here (between RT and !RT). We do have more
preemption points, and a spin_lock() does not disable preemption (so any
assumption that spin_lock() disables preemption will fail).
With all btrfs events enabled, this did not trigger. With the following
patch

--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -41,6 +41,7 @@ void btrfs_set_lock_blocking_rw(struct extent_buffer *eb, int rw)
 	 */
 	if (eb->lock_nested && current->pid == eb->lock_owner)
 		return;
+	trace_printk("eb %p rw %d\n", eb, rw);
 	if (rw == BTRFS_WRITE_LOCK) {
 		if (atomic_read(&eb->blocking_writers) == 0) {
 			WARN_ON(atomic_read(&eb->spinning_writers) != 1);
@@ -73,6 +74,7 @@ void btrfs_clear_lock_blocking_rw(struct extent_buffer *eb, int rw)
 	if (eb->lock_nested && current->pid == eb->lock_owner)
 		return;
 
+	trace_printk("eb %p rw %d\n", eb, rw);
 	if (rw == BTRFS_WRITE_LOCK_BLOCKING) {
 		BUG_ON(atomic_read(&eb->blocking_writers) != 1);
 		write_lock(&eb->lock);

I managed to collect this (the last few lines from the kworker):

#  _-=> irqs-off
# / _=> need-resched
#|/  _-=> need-resched_lazy
#|| / _---=> hardirq/softirq
#||| / _--=> preempt-depth
# / _-=> preempt-lazy-depth
#| / _-=> migrate-disable   
#|| /delay
#   TASK-PID   CPU#  |||   TIMESTAMP  FUNCTION
#  | |   |   |||  | |
   kworker/u16:4-154   [001] .1160.632361: btrfs_set_lock_blocking_rw: 
eb 880039ebac00 rw 1
   kworker/u16:4-154   [001] ...60.632362: 
btrfs_clear_lock_blocking_rw: eb 880039ebac00 rw 3
   kworker/u16:4-154   [001] .1160.632366: btrfs_set_lock_blocking_rw: 
eb 880039ebac00 rw 1
   kworker/u16:4-154   [001] ...60.632367: 
btrfs_clear_lock_blocking_rw: eb 880039ebac00 rw 3
   kworker/u16:4-154   [001] .1160.632367: btrfs_set_lock_blocking_rw: 
eb 880039ebac00 rw 1
   kworker/u16:4-154   [001] ...60.632368: btrfs_set_lock_blocking_rw: 
eb 880039ebac00 rw 3
   kworker/u16:4-154   [001] ...60.632369: 
btrfs_clear_lock_blocking_rw: eb 880039ebac00 rw 3
   kworker/u16:4-154   [001] .1260.632371: btrfs_set_lock_blocking_rw: 
eb 880039ebb000 rw 1
   kworker/u16:4-154   [001] 

[PATCH 2/2] btrfs: swap free() and trace point in run_ordered_work()

2016-12-14 Thread Sebastian Andrzej Siewior
The previous patch removed a trace point due to a use after free problem
with tracing enabled. While looking at the backtrace it took me a while
to find the right spot. While doing so I noticed that this trace point
could be used after one of two clean-up functions was invoked:
- run_one_async_free()
- async_cow_free()

Both of them free the `work' item so a later use in the tracepoint is
not possible.
This patch swaps the order so we first have the trace point and then
free the struct.

Signed-off-by: Sebastian Andrzej Siewior 
---
 fs/btrfs/async-thread.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index d0dfc3d2e199..6f4631bf74f8 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -288,8 +288,8 @@ static void run_ordered_work(struct __btrfs_workqueue *wq)
 		 * we don't want to call the ordered free functions
 		 * with the lock held though
 		 */
-		work->ordered_free(work);
 		trace_btrfs_all_work_done(work);
+		work->ordered_free(work);
 	}
 	spin_unlock_irqrestore(lock, flags);
 }
-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
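
The ordering problem described in this patch can be illustrated outside the kernel. The following is a minimal user-space sketch, not the btrfs code: `struct work_item`, `trace_work_done()`, `finish_work()` and `alloc_work()` are hypothetical stand-ins. It only shows that anything which reads the object has to run before the callback that frees it.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a work item that carries its own free callback. */
struct work_item {
	int id;
	void (*ordered_free)(struct work_item *work);
};

/* Stands in for a trace point: remembers the id of the last traced item. */
static int last_traced_id = -1;

static void trace_work_done(const struct work_item *work)
{
	last_traced_id = work->id;	/* dereferences the item */
}

static void free_work(struct work_item *work)
{
	free(work);
}

/* Trace first, then free: after ordered_free() the pointer is dead. */
static void finish_work(struct work_item *work)
{
	trace_work_done(work);
	work->ordered_free(work);
}

/* Allocate an item whose free callback releases the item itself. */
static struct work_item *alloc_work(int id)
{
	struct work_item *work = malloc(sizeof(*work));

	if (!work)
		return NULL;
	work->id = id;
	work->ordered_free = free_work;
	return work;
}
```

Swapping the two calls in `finish_work()` would make `trace_work_done()` read freed memory, which is the situation the patch avoids.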


[PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-14 Thread Sebastian Andrzej Siewior
For btrfs_scrubparity_helper() the ->func() is set to
scrub_parity_bio_endio_worker(). This function invokes
scrub_free_parity() which kfree()s the `work' object. All is good as
long as trace events are not enabled; with them enabled we crash with a
backtrace like this:
| Workqueue: btrfs-endio btrfs_endio_helper
| RIP: 0010:[]  [] 
trace_event_raw_event_btrfs__work__done+0x4e/0xa0
| Call Trace:
|  [] btrfs_scrubparity_helper+0x59d/0x780
|  [] btrfs_endio_helper+0x9/0x10
|  [] process_one_work+0x26e/0x7b0
|  [] worker_thread+0x46/0x560
|  [] kthread+0xee/0x110
|  [] ret_from_fork+0x2a/0x40

So in order to avoid this, I remove the trace point.

Signed-off-by: Sebastian Andrzej Siewior 
---
 fs/btrfs/async-thread.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index e0f071f6b5a7..d0dfc3d2e199 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
 		set_bit(WORK_DONE_BIT, &work->flags);
 		run_ordered_work(wq);
 	}
-	if (!need_order)
-		trace_btrfs_all_work_done(work);
 }
 
 void btrfs_init_work(struct btrfs_work *work, btrfs_work_func_t uniq_func,
-- 
2.11.0



Re: [PATCH 0/4] Lowmem fsck false alert fixes

2016-12-14 Thread David Sterba
On Mon, Dec 05, 2016 at 05:07:52PM +0800, Qu Wenruo wrote:
> Btrfs-progs test case 023 will cause assert and a lot of false alerts
> for lowmem mode.
> 
> The problems are caused by several reasons, from bad handling of the tree
> reloc root (calling btrfs_read_fs_root on the tree reloc tree) to overly
> restrictive checks.
> 
> Fix the lowmem mode bugs.
> 
> There is another bug which affects both original mode and lowmem mode,
> it seems to be caused by this commit:
> commit 00e769d04c2c83029d6c71fbded133597d93ad55
> Author: Goldwyn Rodrigues 
> Date:   Tue Nov 29 10:24:52 2016 -0600
> 
> btrfs-progs: Correct value printed by assertions/BUG_ON/WARN_ON
> 
> Informed Goldwyn to fix it.
> So the fix for the common assert is not included in this patchset.
> 
> Qu Wenruo (4):
>   btrfs-progs: check: Fix assert when using lowmem on fs with tree reloc
> tree
>   btrfs-progs: check: Fix lowmem mode stack overflow caused by fsck/023
>   btrfs-progs: check: Fix lowmem false alert on tree reloc tree
>   btrfs-progs: check: Fix false alert on generation mismatch for tree
> reloc tree

1-4 applied, thanks.


Re: [PATCH 1/2] btrfs-progs: btrfs-convert: Prevent accounting blocks beyond end of device

2016-12-14 Thread David Sterba
On Fri, Dec 09, 2016 at 09:03:57AM +0800, Qu Wenruo wrote:
> Hi Chandan,
> 
> Thanks for the patch.
> 
> At 12/08/2016 09:56 PM, Chandan Rajendra wrote:
> > When looping across data block bitmap, __ext2_add_one_block() may add
> > blocks which do not exist on the underlying disk. This commit prevents
> > this from happening by checking the block index against the maximum
> > block count that was present in the ext4 filesystem instance that is
> > being converted.
> 
> The patch looks good to me.
> 
> Reviewed-by: Qu Wenruo 

Applied, thanks.
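
The check described in the commit message can be sketched as follows. This is a simplified illustration under stated assumptions, not the btrfs-convert code: `count_used_blocks()` and its parameters are hypothetical, and the bitmap layout (one bit per block, least significant bit first) is assumed for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the used blocks described by one bitmap, ignoring every bit whose
 * block number lies at or beyond total_blocks.  All names are hypothetical.
 */
static uint64_t count_used_blocks(const uint8_t *bitmap, size_t nbits,
				  uint64_t first_block, uint64_t total_blocks)
{
	uint64_t used = 0;
	size_t i;

	for (i = 0; i < nbits; i++) {
		uint64_t block = first_block + i;

		/* Bits from here on refer to blocks past the end of the fs. */
		if (block >= total_blocks)
			break;
		if (bitmap[i / 8] & (1u << (i % 8)))
			used++;
	}
	return used;
}
```

Without the comparison against `total_blocks`, a bitmap that is fully set would account for blocks the device does not have.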


Re: [PATCH 2/2] btrfs-convert: Fix migrate_super_block() to work with 64k sectorsize

2016-12-14 Thread David Sterba
On Fri, Dec 09, 2016 at 09:09:29AM +0800, Qu Wenruo wrote:
> 
> 
> At 12/08/2016 09:56 PM, Chandan Rajendra wrote:
> > migrate_super_block() uses sectorsize to refer to the size of the
> > superblock. Hence on 64k sectorsize filesystems, it ends up computing
> > checksum beyond the super block length (i.e.
> > BTRFS_SUPER_INFO_SIZE). This commit fixes the bug by using
> > BTRFS_SUPER_INFO_SIZE instead of sectorsize of the underlying
> > filesystem.
> >
> > Signed-off-by: Chandan Rajendra 
> 
> Reviewed-by: Qu Wenruo 

Applied, thanks.
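
A small sketch of why the checksum length matters, assuming a 4096-byte superblock (the value of BTRFS_SUPER_INFO_SIZE) stored at the start of a 64k sector. `toy_csum()` and `super_csum()` are hypothetical helpers; the additive checksum only stands in for the real one.

```c
#include <stddef.h>
#include <stdint.h>

#define SUPER_INFO_SIZE 4096	/* fixed superblock size, independent of sectorsize */

/* Toy additive checksum standing in for the real checksum function. */
static uint32_t toy_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/*
 * Checksum only the superblock bytes, however large the sector buffer
 * that contains them may be.
 */
static uint32_t super_csum(const uint8_t *sector_buf)
{
	return toy_csum(sector_buf, SUPER_INFO_SIZE);
}
```

With a 64k sectorsize, a checksum taken over the whole sector would also depend on the 61440 bytes that follow the superblock, so it would not match a checksum computed over the superblock alone.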


Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-14 Thread admin
Hi,

I verified the log files and see no prior oom killer invocation. Unfortunately 
the machine has been rebooted since. Next time it happens, I will also look in 
dmesg.

Thanks,
David Arendt 


Michal Hocko – Wed., 14. December 2016 11:31
> Btw. the stall should be preceded by the OOM killer invocation. Could
> you share the OOM report please? I am asking because such an OOM killer
> would be clearly premature as per your meminfo. I am trying to change
> that code and seeing your numbers might help me.
> 
> Thanks!
> 
> On Wed 14-12-16 11:17:43, Michal Hocko wrote:
> > On Tue 13-12-16 18:11:01, David Arendt wrote:
> > > Hi,
> > > 
> > > I receive the following page allocation stall while copying lots of
> > > large files from one btrfs hdd to another.
> > > 
> > > Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for 
> > > 12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
> > > Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8 
> > > Tainted: P   O4.9.0 #1
> > [...]
> > > Dec 13 13:04:29 server kernel: Call Trace:
> > > Dec 13 13:04:29 server kernel:  [] ? 
> > > dump_stack+0x46/0x5d
> > > Dec 13 13:04:29 server kernel:  [] ? 
> > > warn_alloc+0x111/0x130
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > __alloc_pages_nodemask+0xbe8/0xd30
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > pagecache_get_page+0xe4/0x230
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > alloc_extent_buffer+0x10b/0x400
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > btrfs_alloc_tree_block+0x125/0x560
> > 
> > OK, so this is
> > find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL)
> > 
> > The main question is whether this really needs to be NOFS request...
> > 
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > read_extent_buffer_pages+0x21f/0x280
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > __btrfs_cow_block+0x141/0x580
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > btrfs_cow_block+0x100/0x150
> > > Dec 13 13:04:33 server kernel:  [] ?  
> > > btrfs_search_slot+0x1e9/0x9c0
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > __set_extent_bit+0x512/0x550
> > > Dec 13 13:04:33 server kernel:  [] ? 
> > > lookup_inline_extent_backref+0xf5/0x5e0
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > set_extent_bit+0x24/0x30
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > update_block_group.isra.34+0x114/0x380
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > __btrfs_free_extent.isra.35+0xf4/0xd20
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > btrfs_merge_delayed_refs+0x61/0x5d0
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > __btrfs_run_delayed_refs+0x902/0x10a0
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > btrfs_run_delayed_refs+0x90/0x2a0
> > > Dec 13 13:04:34 server kernel:  [] ? 
> > > delayed_ref_async_start+0x84/0xa0
> > 
> > What would cause the reclaim recursion?
> > 
> > > Dec 13 13:04:34 server kernel: Mem-Info:
> > > Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
> > > isolated_anon:0\x0a active_file:7370032 inactive_file:450105
> > > isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
> > > unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
> > > mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
> > > free_cma:0
> > 
> > This speaks for itself. There is a lot of dirty data, basically no
> > anonymous memory and GFP_NOFS cannot do much to reclaim obviously. This
> > is either a configuration bug as somebody noted down the thread (setting
> > the dirty_ratio) or suboptimality of the btrfs code which might request
> > NOFS even though it is not strictly necessary. This would be more for
> > btrfs developers.
> > -- 
> > Michal Hocko
> > SUSE Labs
> 
> -- 
> Michal Hocko
> SUSE Labs


Re: [PATCH v2] btrfs-progs: qgroup: add sync option to 'qgroup show'

2016-12-14 Thread David Sterba
On Wed, Dec 07, 2016 at 04:55:15PM +0900, Tsutomu Itoh wrote:
> The 'qgroup show' command does not synchronize filesystem.
> Therefore, 'qgroup show' may not display the correct value unless
> synchronized with 'filesystem sync' command etc.
> 
> So add the '--sync' and '--no-sync' options so that we can choose
> whether or not to synchronize when executing the command.
> 
> Signed-off-by: Tsutomu Itoh 
> ---
> v2: use getopt_long with enum instead of single letter (suggested by Qu)
> ---
>  Documentation/btrfs-qgroup.asciidoc |  6 ++
>  cmds-qgroup.c   | 33 +
>  2 files changed, 35 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/btrfs-qgroup.asciidoc 
> b/Documentation/btrfs-qgroup.asciidoc
> index 438dbc7..9c65795 100644
> --- a/Documentation/btrfs-qgroup.asciidoc
> +++ b/Documentation/btrfs-qgroup.asciidoc
> @@ -126,6 +126,12 @@ Prefix \'+' means ascending order and \'-' means 
> descending order of <attr>.
>  If no prefix is given, use ascending order by default.
>  +
>  If multiple <attr>s is given, use comma to separate.
> ++
> +--sync
> +To retrieve information after updating the status of qgroups,
> +invoke sync before getting information.

This could be more specific, that it's a filesystem sync.

> +--no-sync
> +Do not invoke sync before getting information (default).

I'm not sure we need this option, how is it supposed to be used?

> @@ -311,8 +313,15 @@ static int cmd_qgroup_show(int argc, char **argv)
>  
>   while (1) {
>   int c;
> + enum {
> + GETOPT_VAL_SORT = 256,
> + GETOPT_VAL_SYNC,
> + GETOPT_VAL_NO_SYNC
> + };
>   static const struct option long_options[] = {
> - {"sort", required_argument, NULL, 'S'},
> + {"sort", required_argument, NULL, GETOPT_VAL_SORT},

This change is unrelated to the patch, please make a separate patch for
that.

Otherwise looks good.

> + {"sync", no_argument, NULL, GETOPT_VAL_SYNC},
> + {"no-sync", no_argument, NULL, GETOPT_VAL_NO_SYNC},
>   { NULL, 0, NULL, 0 }
>   };
>  
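
The long-option pattern used in the patch (enum values starting at 256, so they cannot collide with any single-character option) can be tried in isolation. The sketch below is a standalone illustration, not the cmds-qgroup.c code: `parse_sync()` is a hypothetical helper, and resetting `optind` to 0 to rescan is behaviour assumed from glibc.

```c
#include <getopt.h>
#include <stddef.h>

/* Values above the char range cannot collide with short options. */
enum {
	GETOPT_VAL_SYNC = 256,
	GETOPT_VAL_NO_SYNC
};

/*
 * Return 1 if the last of --sync/--no-sync given was --sync, 0 otherwise
 * (no sync is the default), or -1 on an unknown option.
 */
static int parse_sync(int argc, char **argv)
{
	static const struct option long_options[] = {
		{ "sync", no_argument, NULL, GETOPT_VAL_SYNC },
		{ "no-sync", no_argument, NULL, GETOPT_VAL_NO_SYNC },
		{ NULL, 0, NULL, 0 }
	};
	int do_sync = 0;
	int c;

	optind = 0;	/* assumption: 0 requests a full rescan (glibc) */
	opterr = 0;	/* keep getopt from printing its own diagnostics */
	while ((c = getopt_long(argc, argv, "", long_options, NULL)) != -1) {
		switch (c) {
		case GETOPT_VAL_SYNC:
			do_sync = 1;
			break;
		case GETOPT_VAL_NO_SYNC:
			do_sync = 0;
			break;
		default:
			return -1;
		}
	}
	return do_sync;
}
```

Because the default is already "no sync", the `--no-sync` case only matters when both options appear on one command line, in which case the last one wins in this sketch.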


Re: [PATCH] duperemove: test presence of dedupe ioctl

2016-12-14 Thread Christoph Hellwig
On Fri, Dec 09, 2016 at 09:56:45AM -0800, Darrick J. Wong wrote:
> Since a zero-length dedupe operation is guaranteed to succeed, use that
> to test whether or not this filesystem supports dedupe.
> 
> Signed-off-by: Darrick J. Wong 
> ---
>  file_scan.c |   47 +--
>  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/file_scan.c b/file_scan.c
> index 617f166..a34453e 100644
> --- a/file_scan.c
> +++ b/file_scan.c
> @@ -45,11 +45,7 @@
>  #include "file_scan.h"
>  #include "dbfile.h"
>  #include "util.h"
> -
> -/* This is not in linux/magic.h */
> -#ifndef  XFS_SB_MAGIC
> -#define  XFS_SB_MAGIC	0x58465342  /* 'XFSB' */
> -#endif
> +#include "btrfs-ioctl.h"
>  
>  static char path[PATH_MAX] = { 0, };
>  static char *pathp = path;
> @@ -189,6 +185,39 @@ static int walk_dir(const char *name)
>   return ret;
>  }
>  
> +struct fake_btrfs_ioctl_same_args {
> + struct btrfs_ioctl_same_args args;
> + struct btrfs_ioctl_same_extent_info info;
> +};

Why does this need a fake structure here?


Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-14 Thread Michal Hocko
Btw. the stall should be preceded by the OOM killer invocation. Could
you share the OOM report please? I am asking because such an OOM killer
would be clearly premature as per your meminfo. I am trying to change
that code and seeing your numbers might help me.

Thanks!

On Wed 14-12-16 11:17:43, Michal Hocko wrote:
> On Tue 13-12-16 18:11:01, David Arendt wrote:
> > Hi,
> > 
> > I receive the following page allocation stall while copying lots of
> > large files from one btrfs hdd to another.
> > 
> > Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for 
> > 12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
> > Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8 
> > Tainted: P   O4.9.0 #1
> [...]
> > Dec 13 13:04:29 server kernel: Call Trace:
> > Dec 13 13:04:29 server kernel:  [] ? dump_stack+0x46/0x5d
> > Dec 13 13:04:29 server kernel:  [] ? 
> > warn_alloc+0x111/0x130
> > Dec 13 13:04:33 server kernel:  [] ? 
> > __alloc_pages_nodemask+0xbe8/0xd30
> > Dec 13 13:04:33 server kernel:  [] ? 
> > pagecache_get_page+0xe4/0x230
> > Dec 13 13:04:33 server kernel:  [] ? 
> > alloc_extent_buffer+0x10b/0x400
> > Dec 13 13:04:33 server kernel:  [] ? 
> > btrfs_alloc_tree_block+0x125/0x560
> 
> OK, so this is
>   find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL)
> 
> The main question is whether this really needs to be NOFS request...
> 
> > Dec 13 13:04:33 server kernel:  [] ? 
> > read_extent_buffer_pages+0x21f/0x280
> > Dec 13 13:04:33 server kernel:  [] ? 
> > __btrfs_cow_block+0x141/0x580
> > Dec 13 13:04:33 server kernel:  [] ? 
> > btrfs_cow_block+0x100/0x150
> > Dec 13 13:04:33 server kernel:  [] ?  
> > btrfs_search_slot+0x1e9/0x9c0
> > Dec 13 13:04:33 server kernel:  [] ? 
> > __set_extent_bit+0x512/0x550
> > Dec 13 13:04:33 server kernel:  [] ? 
> > lookup_inline_extent_backref+0xf5/0x5e0
> > Dec 13 13:04:34 server kernel:  [] ? 
> > set_extent_bit+0x24/0x30
> > Dec 13 13:04:34 server kernel:  [] ? 
> > update_block_group.isra.34+0x114/0x380
> > Dec 13 13:04:34 server kernel:  [] ? 
> > __btrfs_free_extent.isra.35+0xf4/0xd20
> > Dec 13 13:04:34 server kernel:  [] ? 
> > btrfs_merge_delayed_refs+0x61/0x5d0
> > Dec 13 13:04:34 server kernel:  [] ? 
> > __btrfs_run_delayed_refs+0x902/0x10a0
> > Dec 13 13:04:34 server kernel:  [] ? 
> > btrfs_run_delayed_refs+0x90/0x2a0
> > Dec 13 13:04:34 server kernel:  [] ? 
> > delayed_ref_async_start+0x84/0xa0
> 
> What would cause the reclaim recursion?
> 
> > Dec 13 13:04:34 server kernel: Mem-Info:
> > Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
> > isolated_anon:0\x0a active_file:7370032 inactive_file:450105
> > isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
> > unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
> > mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
> > free_cma:0
> 
> This speaks for itself. There is a lot of dirty data, basically no
> anonymous memory and GFP_NOFS cannot do much to reclaim obviously. This
> is either a configuration bug as somebody noted down the thread (setting
> the dirty_ratio) or suboptimality of the btrfs code which might request
> NOFS even though it is not strictly necessary. This would be more for
> btrfs developers.
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs


Re: page allocation stall in kernel 4.9 when copying files from one btrfs hdd to another

2016-12-14 Thread Michal Hocko
On Tue 13-12-16 18:11:01, David Arendt wrote:
> Hi,
> 
> I receive the following page allocation stall while copying lots of
> large files from one btrfs hdd to another.
> 
> Dec 13 13:04:29 server kernel: kworker/u16:8: page allocation stalls for 
> 12260ms, order:0, mode:0x2400840(GFP_NOFS|__GFP_NOFAIL)
> Dec 13 13:04:29 server kernel: CPU: 0 PID: 24959 Comm: kworker/u16:8 Tainted: 
> P   O4.9.0 #1
[...]
> Dec 13 13:04:29 server kernel: Call Trace:
> Dec 13 13:04:29 server kernel:  [] ? dump_stack+0x46/0x5d
> Dec 13 13:04:29 server kernel:  [] ? warn_alloc+0x111/0x130
> Dec 13 13:04:33 server kernel:  [] ? 
> __alloc_pages_nodemask+0xbe8/0xd30
> Dec 13 13:04:33 server kernel:  [] ? 
> pagecache_get_page+0xe4/0x230
> Dec 13 13:04:33 server kernel:  [] ? 
> alloc_extent_buffer+0x10b/0x400
> Dec 13 13:04:33 server kernel:  [] ? 
> btrfs_alloc_tree_block+0x125/0x560

OK, so this is
find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL)

The main question is whether this really needs to be NOFS request...

> Dec 13 13:04:33 server kernel:  [] ? 
> read_extent_buffer_pages+0x21f/0x280
> Dec 13 13:04:33 server kernel:  [] ? 
> __btrfs_cow_block+0x141/0x580
> Dec 13 13:04:33 server kernel:  [] ? 
> btrfs_cow_block+0x100/0x150
> Dec 13 13:04:33 server kernel:  [] ?  
> btrfs_search_slot+0x1e9/0x9c0
> Dec 13 13:04:33 server kernel:  [] ? 
> __set_extent_bit+0x512/0x550
> Dec 13 13:04:33 server kernel:  [] ? 
> lookup_inline_extent_backref+0xf5/0x5e0
> Dec 13 13:04:34 server kernel:  [] ? 
> set_extent_bit+0x24/0x30
> Dec 13 13:04:34 server kernel:  [] ? 
> update_block_group.isra.34+0x114/0x380
> Dec 13 13:04:34 server kernel:  [] ? 
> __btrfs_free_extent.isra.35+0xf4/0xd20
> Dec 13 13:04:34 server kernel:  [] ? 
> btrfs_merge_delayed_refs+0x61/0x5d0
> Dec 13 13:04:34 server kernel:  [] ? 
> __btrfs_run_delayed_refs+0x902/0x10a0
> Dec 13 13:04:34 server kernel:  [] ? 
> btrfs_run_delayed_refs+0x90/0x2a0
> Dec 13 13:04:34 server kernel:  [] ? 
> delayed_ref_async_start+0x84/0xa0

What would cause the reclaim recursion?

> Dec 13 13:04:34 server kernel: Mem-Info:
> Dec 13 13:04:34 server kernel: active_anon:20 inactive_anon:34
> isolated_anon:0\x0a active_file:7370032 inactive_file:450105
> isolated_file:320\x0a unevictable:0 dirty:522748 writeback:189
> unstable:0\x0a slab_reclaimable:178255 slab_unreclaimable:124617\x0a
> mapped:4236 shmem:0 pagetables:1163 bounce:0\x0a free:38224 free_pcp:241
> free_cma:0

This speaks for itself. There is a lot of dirty data, basically no
anonymous memory and GFP_NOFS cannot do much to reclaim obviously. This
is either a configuration bug as somebody noted down the thread (setting
the dirty_ratio) or suboptimality of the btrfs code which might request
NOFS even though it is not strictly necessary. This would be more for
btrfs developers.
-- 
Michal Hocko
SUSE Labs
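
To put the Mem-Info counters quoted above into more familiar units, the page counts can be converted to MiB. This is a small sketch that assumes 4 KiB pages (typical for x86-64; the report does not state the page size), and `pages_to_mib()` is a hypothetical helper.

```c
#include <stdint.h>

#define PAGE_SIZE_KIB 4	/* assumption: 4 KiB pages */

/* Convert a page count from a Mem-Info dump to whole MiB (rounded down). */
static uint64_t pages_to_mib(uint64_t pages)
{
	return pages * PAGE_SIZE_KIB / 1024;
}
```

Under that assumption, dirty:522748 is about 2041 MiB, free:38224 is about 149 MiB, and active_file:7370032 is about 28789 MiB, while active_anon:20 is only 80 KiB.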