Re: [PATCH 0/6] btrfs dax IO

2016-12-07 Thread Xin Zhou
Hi Liu,
 
From the patch, is snapshot disabled by disabling COW in the mount path?
It seems create_snapshot() in ioctl.c is not changed.

I have worked with a similar system but am a bit new to the btrfs code.
  
Thanks, 
Xin
 
 

Subject: [PATCH 0/6] btrfs dax IO
From: Liu Bo
Date: Wed, 7 Dec 2016 13:45:04 -0800
Cc: Chris Mason, Jan Kara, David Sterba

This is a preliminary patch set to add dax support for btrfs. With
this we can do normal read/write on dax files and can mmap dax files
to userspace so that applications can access persistent memory
directly.

Please note that currently this is limited to nocow, i.e. all dax
inodes do not have COW behaviour.

COW:             no
multiple device: no
clone/reflink:   no
snapshot:        no
compression:     no
checksum:        no

Right now snapshot creation is disabled when mounting with -odax, but a
snapshot can be created without -odax, and writing to a dax file in a
snapshot will return -EIO.

Clone/reflink is handled the same way as snapshot: -EIO will be returned
when writing to shared extents.

This has adopted the latest iomap framework for dax read/write
and dax mmap.

 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code

2016-12-07 Thread Qu Wenruo



At 12/08/2016 01:27 PM, Eryu Guan wrote:
> On Thu, Dec 08, 2016 at 10:04:56AM +0800, Qu Wenruo wrote:
> > Introduce new _require_btrfs_qgroup_report function, which will check
> > the accessibility to "btrfs check --qgroup-report", then set a global
> > flag to inform _check_scratch_fs() to do extra qgroup check.
> >
> > Signed-off-by: Qu Wenruo
> > ---
> >  common/rc   | 22 ++
>
> This needs rebase too.
>
> >  tests/btrfs/022 |  5 +
> >  tests/btrfs/028 |  5 ++---
> >  tests/btrfs/042 |  6 ++
> >  tests/btrfs/099 |  1 +
> >  tests/btrfs/104 | 20 +---
> >  tests/btrfs/122 | 10 +++---
> >  tests/btrfs/123 |  5 ++---
> >  8 files changed, 42 insertions(+), 32 deletions(-)
> >
> > diff --git a/common/rc b/common/rc
> > index 1703232..bce3a09 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -2624,6 +2624,20 @@ _check_btrfs_filesystem()
> >  mountpoint=`_umount_or_remount_ro $device`
> >  fi
> >
> > +# Check qgroup numbers
> > +if [ "$BTRFS_NEED_QGROUP_REPORT" == "yes" ];then
>
> So we can bypass the _require_btrfs_qgroup_report check if we set
> BTRFS_NEED_QGROUP_REPORT to "yes" directly, right? How about doing
> something like _require_scratch do, e.g. touching some signal file in
> $RESULT_DIR and only do qgroup check if that file exists?

Nice idea.

> > +   btrfsck $device --qgroup-report > $tmp.qgroup_report 2>&1
>
> Shouldn't "$BTRFS_UTIL_PROG check $device ..." be used for new code? I
> might be wrong on this, I think btrfsck is deprecated.

Oh, the code in common/btrfs is just too old.
I'll update them together in next version.

Thanks,
Qu

> Thanks,
> Eryu







Re: [PATCH 1/2] fstests: common: rename and enhance _require_btrfs to _require_btrfs_subcommand

2016-12-07 Thread Qu Wenruo



At 12/08/2016 12:00 PM, Eryu Guan wrote:
> On Thu, Dec 08, 2016 at 10:04:55AM +0800, Qu Wenruo wrote:
> > Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
> > confusion, as all other _require_btrfs_* have a quite clear suffix, like
> > _require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().
> >
> > Also enhance _require_btrfs_subcommand() to accept 2nd level commands or
> > options.
> > Options will be determined by the first "-" char.
> > This is quite useful for cases like "btrfs inspect-internal dump-tree"
> > and "btrfs check --qgroup-report".
> >
> > Signed-off-by: Qu Wenruo
> > ---
> >  common/rc   | 29 -
>
> Can you rebase on top of current master please? We've moved
> btrfs-specific functions to common/btrfs

Finally!
Good news.

> >  tests/btrfs/004 |  3 ++-
> >  tests/btrfs/048 |  2 +-
> >  tests/btrfs/059 |  2 +-
> >  tests/btrfs/131 |  2 +-
> >  5 files changed, 29 insertions(+), 9 deletions(-)
> >
> > diff --git a/common/rc b/common/rc
> > index 8c99306..1703232 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -3019,15 +3019,34 @@ _require_deletable_scratch_dev_pool()
> >  }
> >
> >  # We check for btrfs and (optionally) features of the btrfs command
> > -_require_btrfs()
> > +# _require_btrfs_subcommand  [|]
> > +# It can handle both subfunction like "inspect-internal dump-tree"
> > +# and options like "check --qgroup-report"
> > +_require_btrfs_subcommand()
>
> I'd prefer a name similar to _require_xfs_io_command, e.g.
> _require_btrfs_command, "subcommand" seems not necessary to me.

Right, the subcommand seems not that handy compared to other _require_*.

> >  {
> > -   cmd=$1
> > -   _require_command "$BTRFS_UTIL_PROG" btrfs
> >     if [ -z "$1" ]; then
> > -   return 1;
> > +   echo "Usage: _require_btrfs_subcommand command [subcommand]" 1>&2
> > +   exit 1
> >     fi
> > -   $BTRFS_UTIL_PROG $cmd --help >/dev/null 2>&1
> > +   cmd=$1
> > +   param=$2
> > +
> > +   _require_command "$BTRFS_UTIL_PROG" btrfs
> > +   $BTRFS_UTIL_PROG $cmd --help &>/dev/null
> >     [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd)"
> > +
> > +   test -z "$param" && return
> > +
> > +   # if $param is an option, replace leading "-"s for grep
> > +   if [ ${param:0:1} == "-" ]; then
> > +   param=$(echo $param | sed 's/^-*//')
> > +   $BTRFS_UTIL_PROG $cmd --help | grep $param > /dev/null || \
>
> Use "grep -w" to be safer? And "-q" instead of "> /dev/null"

Right, -w is much safer. I'll use "-q" in next version.

> > +   _not_run "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
>
> $param here is without leading "-", so the _notrun message is kind of
> misleading. And _notrun not _not_run :)

Right, I'll use safe_param.

Thanks,
Qu

> Thanks,
> Eryu







[PATCH] btrfs-progs: tests: add test for --sync option of qgroup show

2016-12-07 Thread Tsutomu Itoh
Simple test script for the following patch.

   btrfs-progs: qgroup: add sync option to 'qgroup show'

Signed-off-by: Tsutomu Itoh 
---
 tests/cli-tests/005-qgroup-show-sync/test.sh | 30 
 1 file changed, 30 insertions(+)
 create mode 100755 tests/cli-tests/005-qgroup-show-sync/test.sh

diff --git a/tests/cli-tests/005-qgroup-show-sync/test.sh 
b/tests/cli-tests/005-qgroup-show-sync/test.sh
new file mode 100755
index 000..2be684d
--- /dev/null
+++ b/tests/cli-tests/005-qgroup-show-sync/test.sh
@@ -0,0 +1,30 @@
+#!/bin/bash
+#
+# simple test of qgroup show --sync and --no-sync options
+
+source $TOP/tests/common
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+
+setup_root_helper
+prepare_test_dev 1g
+
+run_check $TOP/mkfs.btrfs -f $IMAGE
+run_check_mount_test_dev
+
+run_check $SUDO_HELPER $TOP/btrfs subvolume create $TEST_MNT/Sub
+run_check $SUDO_HELPER $TOP/btrfs quota enable $TEST_MNT/Sub
+
+for opt in '' '--' '--sync' '--no-sync'; do
+   run_check $SUDO_HELPER $TOP/btrfs qgroup limit 300M $TEST_MNT/Sub
+   run_check $SUDO_HELPER dd if=/dev/zero of=$TEST_MNT/Sub/file bs=1M count=200
+
+   run_check $SUDO_HELPER $TOP/btrfs qgroup show -re $opt $TEST_MNT/Sub
+
+   run_check $SUDO_HELPER $TOP/btrfs qgroup limit none $TEST_MNT/Sub
+   run_check rm -f $TEST_MNT/Sub/file
+   run_check $TOP/btrfs filesystem sync $TEST_MNT/Sub
+done
+
+run_check_umount_test_dev
-- 
2.9.3


Re: [PATCH 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code

2016-12-07 Thread Eryu Guan
On Thu, Dec 08, 2016 at 10:04:56AM +0800, Qu Wenruo wrote:
> Introduce new _require_btrfs_qgroup_report function, which will check
> the accessibility to "btrfs check --qgroup-report", then set a global
> flag to inform _check_scratch_fs() to do extra qgroup check.
> 
> Signed-off-by: Qu Wenruo 
> ---
>  common/rc   | 22 ++

This needs rebase too.

>  tests/btrfs/022 |  5 +
>  tests/btrfs/028 |  5 ++---
>  tests/btrfs/042 |  6 ++
>  tests/btrfs/099 |  1 +
>  tests/btrfs/104 | 20 +---
>  tests/btrfs/122 | 10 +++---
>  tests/btrfs/123 |  5 ++---
>  8 files changed, 42 insertions(+), 32 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index 1703232..bce3a09 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -2624,6 +2624,20 @@ _check_btrfs_filesystem()
>  mountpoint=`_umount_or_remount_ro $device`
>  fi
>  
> +# Check qgroup numbers
> +if [ "$BTRFS_NEED_QGROUP_REPORT" == "yes" ];then

So we can bypass the _require_btrfs_qgroup_report check if we set
BTRFS_NEED_QGROUP_REPORT to "yes" directly, right? How about doing
something like _require_scratch do, e.g. touching some signal file in
$RESULT_DIR and only do qgroup check if that file exists?

> + btrfsck $device --qgroup-report > $tmp.qgroup_report 2>&1

Shouldn't "$BTRFS_UTIL_PROG check $device ..." be used for new code? I
might be wrong on this, I think btrfsck is deprecated.

Thanks,
Eryu
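
The signal-file idea raised in this review could look roughly like the
following. This is a minimal sketch, not fstests code: the helper names
ending in _sketch and the flag file name "require_qgroup_report" are
hypothetical, and RESULT_DIR falls back to a temporary directory only so
that the snippet runs standalone.

```shell
#!/bin/bash
# Sketch only: names ending in _sketch and the flag file name are hypothetical.
RESULT_DIR=${RESULT_DIR:-$(mktemp -d)}
QGROUP_FLAG_FILE="$RESULT_DIR/require_qgroup_report"

# A test calls this to request the extra qgroup check.
_require_btrfs_qgroup_report_sketch()
{
	# A real helper would first verify that "btrfs check --qgroup-report"
	# is supported and _notrun otherwise; that part is omitted here.
	touch "$QGROUP_FLAG_FILE"
}

# The filesystem check path asks this instead of reading an exported variable.
_qgroup_report_requested_sketch()
{
	[ -f "$QGROUP_FLAG_FILE" ]
}

# Clear the request so it does not leak into the next test.
_qgroup_report_clear_sketch()
{
	rm -f "$QGROUP_FLAG_FILE"
}
```

With this shape the request lives in the results directory rather than in
the environment, so a stray BTRFS_NEED_QGROUP_REPORT=yes no longer turns on
the check without the capability test having run.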


Re: [PATCH 1/2] fstests: common: rename and enhance _require_btrfs to _require_btrfs_subcommand

2016-12-07 Thread Eryu Guan
On Thu, Dec 08, 2016 at 10:04:55AM +0800, Qu Wenruo wrote:
> Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
> confusion, as all other _require_btrfs_* have a quite clear suffix, like
> _require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().
> 
> Also enhance _require_btrfs_subcommand() to accept 2nd level commands or
> options.
> Options will be determined by the first "-" char.
> This is quite useful for cases like "btrfs inspect-internal dump-tree"
> and "btrfs check --qgroup-report".
> 
> Signed-off-by: Qu Wenruo 
> ---
>  common/rc   | 29 -

Can you rebase on top of current master please? We've moved
btrfs-specific functions to common/btrfs

>  tests/btrfs/004 |  3 ++-
>  tests/btrfs/048 |  2 +-
>  tests/btrfs/059 |  2 +-
>  tests/btrfs/131 |  2 +-
>  5 files changed, 29 insertions(+), 9 deletions(-)
> 
> diff --git a/common/rc b/common/rc
> index 8c99306..1703232 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3019,15 +3019,34 @@ _require_deletable_scratch_dev_pool()
>  }
>  
>  # We check for btrfs and (optionally) features of the btrfs command
> -_require_btrfs()
> +# _require_btrfs_subcommand  [|]
> +# It can handle both subfunction like "inspect-internal dump-tree"
> +# and options like "check --qgroup-report"
> +_require_btrfs_subcommand()

I'd prefer a name similar to _require_xfs_io_command, e.g.
_require_btrfs_command, "subcommand" seems not necessary to me.

>  {
> - cmd=$1
> - _require_command "$BTRFS_UTIL_PROG" btrfs
>   if [ -z "$1" ]; then
> - return 1;
> + echo "Usage: _require_btrfs_subcommand command [subcommand]" 
> 1>&2
> + exit 1
>   fi
> - $BTRFS_UTIL_PROG $cmd --help >/dev/null 2>&1
> + cmd=$1
> + param=$2
> +
> + _require_command "$BTRFS_UTIL_PROG" btrfs
> + $BTRFS_UTIL_PROG $cmd --help &>/dev/null
>   [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd)"
> +
> + test -z "$param" && return
> +
> + # if $param is an option, replace leading "-"s for grep
> + if [ ${param:0:1} == "-" ]; then
> + param=$(echo $param | sed 's/^-*//')
> + $BTRFS_UTIL_PROG $cmd --help | grep $param > /dev/null || \

Use "grep -w" to be safer? And "-q" instead of "> /dev/null"

> + _not_run "$BTRFS_UTIL_PROG too old (must support $cmd 
> $param)"

$param here is without leading "-", so the _notrun message is kind of
misleading. And _notrun not _not_run :)

Thanks,
Eryu
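
The two grep remarks and the point about the misleading message can be tried
out in isolation. A minimal sketch under stated assumptions:
help_mentions_option is a hypothetical helper, not part of the patch, and the
help text is made-up sample input rather than real "btrfs check --help"
output.

```shell
#!/bin/bash
# Sketch only: help_mentions_option is hypothetical and sample_help is
# made-up text, not real btrfs output.
help_mentions_option()
{
	local help_text=$1
	local option=$2
	local safe_param

	# Strip leading dashes for the grep pattern only; $option stays intact
	# so a later message can still show the option as the user wrote it.
	safe_param=$(printf '%s\n' "$option" | sed 's/^-*//')

	# -w: match whole words only, -q: no output, exit status only
	printf '%s\n' "$help_text" | grep -qw -e "$safe_param"
}

sample_help='usage: tool check [options] <device>
    --qgroup-report    print a report on qgroup consistency'

for opt in --qgroup-report --qgroup-rep; do
	if help_mentions_option "$sample_help" "$opt"; then
		echo "$opt: supported"
	else
		echo "$opt: not found (a _notrun message would name $opt)"
	fi
done
```

Note that "-" is not a word character for grep, so -w rejects a truncated
name such as "qgroup-rep" but would still accept "qgroup" on its own.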


Re: [PATCH 1/6] Btrfs: add mount option for dax

2016-12-07 Thread kbuild test robot
Hi Liu,

[auto build test WARNING on tip/perf/core]
[also build test WARNING on v4.9-rc8 next-20161207]
[cannot apply to btrfs/next]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Liu-Bo/btrfs-dax-IO/20161208-082651
config: x86_64-randconfig-s2-12081004 (attached as .config)
compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   fs/btrfs/super.c: In function 'btrfs_parse_options':
>> fs/btrfs/super.c:414: warning: unused variable 'set_bdev'
   fs/btrfs/super.o: warning: objtool: btrfs_statfs()+0x2af: function has unreachable instruction

vim +/set_bdev +414 fs/btrfs/super.c

   398   * XXX JDM: This needs to be cleaned up for remount.
   399   */
   400  int btrfs_parse_options(struct btrfs_root *root, char *options,
   401                          unsigned long new_flags)
   402  {
   403          struct btrfs_fs_info *info = root->fs_info;
   404          substring_t args[MAX_OPT_ARGS];
   405          char *p, *num, *orig = NULL;
   406          u64 cache_gen;
   407          int intarg;
   408          int ret = 0;
   409          char *compress_type;
   410          bool compress_force = false;
   411          enum btrfs_compression_type saved_compress_type;
   412          bool saved_compress_force;
   413          int no_compress = 0;
 > 414          int set_bdev = 0;
   415
   416          cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
   417          if (btrfs_fs_compat_ro(root->fs_info, FREE_SPACE_TREE))
   418                  btrfs_set_opt(info->mount_opt, FREE_SPACE_TREE);
   419          else if (cache_gen)
   420                  btrfs_set_opt(info->mount_opt, SPACE_CACHE);
   421
   422          /*

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread kbuild test robot
Hi Liu,

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.9-rc8]
[cannot apply to btrfs/next next-20161207]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Liu-Bo/btrfs-dax-IO/20161208-082651
config: tile-tilegx_defconfig (attached as .config)
compiler: tilegx-linux-gcc (GCC) 4.6.2
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=tile 

All errors (new ones prefixed by >>):

>> ERROR: "dax_pfn_mkwrite" [fs/btrfs/btrfs.ko] undefined!



Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread Janos Toth F.
I realize this is related very loosely (if at all) to this topic but
what about these two possible features:
- a mount option, or
- an attribute (which could be set on directories and/or sub-volumes
and applied to any new files created below these)
which effectively forces every read/write operation to behave as if
the file had been explicitly opened with DirectIO by the application (even
if the application has no DirectIO support)?

This could achieve something loosely similar to DAX while keeping more
of the "advanced" Btrfs features (I think only compression is ruled
out by DIO).


[PATCH 1/2] fstests: common: rename and enhance _require_btrfs to _require_btrfs_subcommand

2016-12-07 Thread Qu Wenruo
Rename _require_btrfs() to _require_btrfs_subcommand() to avoid
confusion, as all other _require_btrfs_* have a quite clear suffix, like
_require_btrfs_mkfs_feature() or _require_btrfs_fs_feature().

Also enhance _require_btrfs_subcommand() to accept 2nd level commands or
options.
Options will be determined by the first "-" char.
This is quite useful for cases like "btrfs inspect-internal dump-tree"
and "btrfs check --qgroup-report".

Signed-off-by: Qu Wenruo 
---
 common/rc   | 29 -
 tests/btrfs/004 |  3 ++-
 tests/btrfs/048 |  2 +-
 tests/btrfs/059 |  2 +-
 tests/btrfs/131 |  2 +-
 5 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/common/rc b/common/rc
index 8c99306..1703232 100644
--- a/common/rc
+++ b/common/rc
@@ -3019,15 +3019,34 @@ _require_deletable_scratch_dev_pool()
 }
 
 # We check for btrfs and (optionally) features of the btrfs command
-_require_btrfs()
+# _require_btrfs_subcommand  [|]
+# It can handle both subfunction like "inspect-internal dump-tree"
+# and options like "check --qgroup-report"
+_require_btrfs_subcommand()
 {
-   cmd=$1
-   _require_command "$BTRFS_UTIL_PROG" btrfs
if [ -z "$1" ]; then
-   return 1;
+   echo "Usage: _require_btrfs_subcommand command [subcommand]" 1>&2
+   exit 1
fi
-   $BTRFS_UTIL_PROG $cmd --help >/dev/null 2>&1
+   cmd=$1
+   param=$2
+
+   _require_command "$BTRFS_UTIL_PROG" btrfs
+   $BTRFS_UTIL_PROG $cmd --help &>/dev/null
[ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd)"
+
+   test -z "$param" && return
+
+   # if $param is an option, replace leading "-"s for grep
+   if [ ${param:0:1} == "-" ]; then
+   param=$(echo $param | sed 's/^-*//')
+   $BTRFS_UTIL_PROG $cmd --help | grep $param > /dev/null || \
+   _not_run "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
+   return
+   fi
+
+   $BTRFS_UTIL_PROG $cmd $param --help &>/dev/null
+   [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
 }
 
 # Check that fio is present, and it is able to execute given jobfile
diff --git a/tests/btrfs/004 b/tests/btrfs/004
index 905770a..e60a034 100755
--- a/tests/btrfs/004
+++ b/tests/btrfs/004
@@ -51,7 +51,8 @@ _supported_fs btrfs
 _supported_os Linux
 _require_scratch
 _require_no_large_scratch_dev
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal logical-resolve
+_require_btrfs_subcommand inspect-internal inode-resolve
 _require_command "/usr/sbin/filefrag" filefrag
 
 rm -f $seqres.full
diff --git a/tests/btrfs/048 b/tests/btrfs/048
index 0b907b0..ac731d1 100755
--- a/tests/btrfs/048
+++ b/tests/btrfs/048
@@ -48,7 +48,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 send_files_dir=$TEST_DIR/btrfs-test-$seq
 
diff --git a/tests/btrfs/059 b/tests/btrfs/059
index 8f106d2..fd67ebb 100755
--- a/tests/btrfs/059
+++ b/tests/btrfs/059
@@ -51,7 +51,7 @@ _supported_fs btrfs
 _supported_os Linux
 _require_test
 _require_scratch
-_require_btrfs "property"
+_require_btrfs_subcommand "property"
 
 rm -f $seqres.full
 
diff --git a/tests/btrfs/131 b/tests/btrfs/131
index d1a11d2..d7c7f12 100755
--- a/tests/btrfs/131
+++ b/tests/btrfs/131
@@ -48,7 +48,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
-_require_btrfs inspect-internal
+_require_btrfs_subcommand inspect-internal dump-super
 
 mkfs_v1()
 {
-- 
2.7.4
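
The rule from the commit message, that a second argument is treated as an
option when it starts with "-", can be illustrated on its own. A minimal
sketch, assuming a hypothetical helper classify_param that is not part of the
patch; it uses case instead of the patch's ${param:0:1} test, which gives the
same answer for ordinary command and option names and also copes with an
empty argument.

```shell
#!/bin/bash
# Sketch only: classify_param is a hypothetical helper for illustration.
classify_param()
{
	case "$1" in
	"")	echo "none" ;;		# no second argument given
	-*)	echo "option" ;;	# e.g. --qgroup-report
	*)	echo "subcommand" ;;	# e.g. dump-tree
	esac
}

classify_param "--qgroup-report"
classify_param "dump-tree"
classify_param ""
```

In the patch, the "option" case is checked by grepping the parent command's
help output, while the "subcommand" case is checked by running the
subcommand's own --help.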





[PATCH 2/2] fstests: btrfs: Use _require_btrfs_qgroup_report to replace open code

2016-12-07 Thread Qu Wenruo
Introduce new _require_btrfs_qgroup_report function, which will check
the accessibility to "btrfs check --qgroup-report", then set a global
flag to inform _check_scratch_fs() to do extra qgroup check.

Signed-off-by: Qu Wenruo 
---
 common/rc   | 22 ++
 tests/btrfs/022 |  5 +
 tests/btrfs/028 |  5 ++---
 tests/btrfs/042 |  6 ++
 tests/btrfs/099 |  1 +
 tests/btrfs/104 | 20 +---
 tests/btrfs/122 | 10 +++---
 tests/btrfs/123 |  5 ++---
 8 files changed, 42 insertions(+), 32 deletions(-)

diff --git a/common/rc b/common/rc
index 1703232..bce3a09 100644
--- a/common/rc
+++ b/common/rc
@@ -2624,6 +2624,20 @@ _check_btrfs_filesystem()
 mountpoint=`_umount_or_remount_ro $device`
 fi
 
+# Check qgroup numbers
+if [ "$BTRFS_NEED_QGROUP_REPORT" == "yes" ];then
+   btrfsck $device --qgroup-report > $tmp.qgroup_report 2>&1
+   if grep -qE "Counts for qgroup.*are different" $tmp.qgroup_report ; then
+   echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers (see $seqres.full)"
+   echo "_check_btrfs_filesystem: filesystem on $device has wrong qgroup numbers" \
+   >> $seqres.full
+   echo "*** qgroup_report.$FSTYP output ***"  >>$seqres.full
+   cat $tmp.qgroup_report  >>$seqres.full
+   echo "*** qgroup_report.$FSTYP output ***"  >>$seqres.full
+fi
+fi
+rm -f $tmp.qgroup_report
+
 btrfsck $device >$tmp.fsck 2>&1
 if [ $? -ne 0 ]
 then
@@ -3049,6 +3063,14 @@ _require_btrfs_subcommand()
    [ $? -eq 0 ] || _notrun "$BTRFS_UTIL_PROG too old (must support $cmd $param)"
 }
 
+# Require "btrfs check --qgroup-report" function and will check the qgroup
+# numbers at _check_scratch_fs()
+_require_btrfs_qgroup_report()
+{
+   _require_btrfs_subcommand check --qgroup-report
+   export BTRFS_NEED_QGROUP_REPORT="yes"
+}
+
 # Check that fio is present, and it is able to execute given jobfile
 _require_fio()
 {
diff --git a/tests/btrfs/022 b/tests/btrfs/022
index 56d4f3d..2f21a78 100755
--- a/tests/btrfs/022
+++ b/tests/btrfs/022
@@ -43,6 +43,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 rm -f $seqres.full
 
@@ -125,20 +126,24 @@ _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _basic_test
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _rescan_test
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _limit_test_exceed
 _scratch_unmount
+_check_scratch_fs
 
 _scratch_mkfs > /dev/null 2>&1
 _scratch_mount
 _limit_test_noexceed
+_check_scratch_fs
 
 # success, all done
 echo "Silence is golden"
diff --git a/tests/btrfs/028 b/tests/btrfs/028
index 1425609..a3d9a27 100755
--- a/tests/btrfs/028
+++ b/tests/btrfs/028
@@ -51,6 +51,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 _scratch_mkfs
 _scratch_mount
@@ -86,9 +87,7 @@ _run_btrfs_util_prog filesystem sync $SCRATCH_MNT
 
 _scratch_unmount
 
-# generate a qgroup report and look for inconsistent groups
-$BTRFS_UTIL_PROG check --qgroup-report $SCRATCH_DEV 2>&1 | \
-   grep -E "Counts for qgroup.*are different"
+# qgroup will be checked at _check_scratch_fs() by fstest.
 echo "Silence is golden"
 status=0
 
diff --git a/tests/btrfs/042 b/tests/btrfs/042
index 498ccc9..dc9b762 100755
--- a/tests/btrfs/042
+++ b/tests/btrfs/042
@@ -43,6 +43,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 rm -f $seqres.full
 
@@ -84,10 +85,7 @@ for i in `seq 10 -1 1`; do
total_written=$(($total_written+$filesize))
 done
 
-#check if total written exceeds limit
-if [ $total_written -gt $LIMIT_SIZE ];then
-   _fail "total written should be less than $LIMIT_SIZE"
-fi
+# qgroup will be checked automatically at _check_scratch_fs() by fstest
 
 # success, all done
 echo "Silence is golden"
diff --git a/tests/btrfs/099 b/tests/btrfs/099
index 70f07b5..65ea79b 100755
--- a/tests/btrfs/099
+++ b/tests/btrfs/099
@@ -46,6 +46,7 @@ _cleanup()
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 # Use big blocksize to ensure there is still enough space left for metadata
 # space reserve.
diff --git a/tests/btrfs/104 b/tests/btrfs/104
index 6afaa02..e6a6d3b 100755
--- a/tests/btrfs/104
+++ b/tests/btrfs/104
@@ -58,6 +58,7 @@ rm -f $seqres.full
 _supported_fs btrfs
 _supported_os Linux
 _require_scratch
+_require_btrfs_qgroup_report
 
 rm -f $seqres.full
 
@@ -145,21 +146,10 @@ _scratch_cycle_mount
 # referenced above.
 _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap1
 
-# There is no way from userspace to force btrfs_drop_snapshot to run
-# at a given time (even via mount/unmount). We must wait for it to
-# 

Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread kbuild test robot
Hi Liu,

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.9-rc8]
[cannot apply to btrfs/next next-20161207]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Liu-Bo/btrfs-dax-IO/20161208-082651
config: i386-randconfig-s0-201649 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   fs/built-in.o: In function `btrfs_filemap_pfn_mkwrite':
>> file.c:(.text+0x20188f): undefined reference to `dax_pfn_mkwrite'



Re: [PATCH 3/6] Btrfs: refactor btrfs_file_write_iter

2016-12-07 Thread kbuild test robot
Hi Liu,

[auto build test WARNING on tip/perf/core]
[also build test WARNING on v4.9-rc8 next-20161207]
[cannot apply to btrfs/next]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Liu-Bo/btrfs-dax-IO/20161208-082651
config: x86_64-randconfig-x001-201649 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

Note: it may well be a FALSE warning. FWIW you are at least aware of it now.
http://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

All warnings (new ones prefixed by >>):

   fs/btrfs/file.c: In function 'btrfs_file_write_iter':
>> fs/btrfs/file.c:1823:5: warning: 'oldsize' may be used uninitialized in this function [-Wmaybe-uninitialized]
 if (oldsize < pos)
^
   fs/btrfs/file.c:1809:9: note: 'oldsize' was declared here
 loff_t oldsize;
^~~

vim +/oldsize +1823 fs/btrfs/file.c

  1807          struct inode *inode = file_inode(file);
  1808          loff_t pos;
  1809          loff_t oldsize;
  1810          ssize_t ret;
  1811
  1812          inode_lock(inode);
  1813          ret = btrfs_file_write_check(iocb, from);
  1814          if (ret)
  1815                  goto out;
  1816
  1817          current->backing_dev_info = inode_to_bdi(inode);
  1818
  1819          pos = iocb->ki_pos;
  1820          ret = __btrfs_buffered_write(file, from, pos);
  1821          if (ret > 0)
  1822                  iocb->ki_pos = pos + ret;
> 1823          if (oldsize < pos)
  1824                  pagecache_isize_extended(inode, oldsize,
  1825                                           i_size_read(inode));
  1826
  1827          current->backing_dev_info = NULL;
  1828  out:
  1829          inode_unlock(inode);
  1830          return ret;
  1831  }



Re: [PATCH] btrfs-progs: Fix disable backtrace assert error

2016-12-07 Thread Qu Wenruo



At 12/07/2016 11:06 PM, Goldwyn Rodrigues wrote:
> On 12/06/2016 07:29 PM, Qu Wenruo wrote:
> > Due to commit 00e769d04c2c83029d6c71 (btrfs-progs: Correct value printed
> > by assertions/BUG_ON/WARN_ON), which changed the assert_trace()
> > parameter, the conditions passed to assert/WARN_ON/BUG_ON are logical
> > NOTs of each other in the backtrace enabled and disabled cases.
> >
> > Such behavior makes it easy to pass the wrong value, and in fact it did
> > cause us to pass the wrong condition for ASSERT().
> >
> > Instead of passing different conditions for ASSERT/WARN_ON/BUG_ON()
> > manually, this patch will use BUG_ON() to implement the remaining
> > ASSERT/WARN_ON/BUG(), so we don't need to pass 3 different conditions
> > but only one.
> >
> > And to further inform reviewers that the condition should be
> > different, rename "assert_trace" to "bugon_trace", as unlike assert, we
> > will only trigger the bug when the condition is true.
> >
> > Also, move WARN_ON() out of the ifdef branch, as it's completely the
> > same for both branches.
> >
> > Cc: Goldwyn Rodrigues
> > Signed-off-by: Qu Wenruo
> > ---
> >  kerncompat.h | 19 +++
> >  1 file changed, 11 insertions(+), 8 deletions(-)
> >
> > diff --git a/kerncompat.h b/kerncompat.h
> > index e374614..be77608 100644
> > --- a/kerncompat.h
> > +++ b/kerncompat.h
> > @@ -277,7 +277,7 @@ static inline long IS_ERR(const void *ptr)
> >  #define vfree(x) free(x)
> >
> >  #ifndef BTRFS_DISABLE_BACKTRACE
> > -static inline void assert_trace(const char *assertion, const char *filename,
> > +static inline void bugon_trace(const char *assertion, const char *filename,
> >                                const char *func, unsigned line, long val)
> >  {
> >      if (!val)
>
> To keep confusion to the minimum, you can call this *condition instead
> of *assertion.

Right, I'll update it.

> > @@ -287,17 +287,20 @@ static inline void assert_trace(const char *assertion, const char *filename,
> >      exit(1);
> >  }
> >
> > -#define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> > -#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> > -#define ASSERT(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)!(c))
> > -#define BUG() assert_trace(NULL, __FILE__, __func__, __LINE__, 1)
> > +#define BUG_ON(c) bugon_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> >  #else
> >  #define BUG_ON(c) assert(!(c))
> > -#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> > -#define ASSERT(c) assert(!(c))
> > -#define BUG() assert(0)
> >  #endif
> >
> > +#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> > +/*
> > + * TODO: ASSERT() should be deprecated. In case like ASSERT(ret == 0), it
> > + * won't output any useful value for ret.
> > + * Should be replaced by BUG_ON(ret);
> > + */
> > +#define ASSERT(c) BUG_ON(!(c))
>
> I am not sure of this. As you are stating, this (double negation) will
> kill the value of the condition. Won't it be better to remove all
> ASSERTs first instead of putting this TODO?

IIRC the ASSERT/BUG_ON will be removed step by step.
And we have 60+ ASSERTs in the current code base, which is not an easy
thing to fix soon.

So I prefer to mark ASSERT() deprecated and remove them in later cleanups.

Thanks,
Qu

> > +#define BUG() BUG_ON(1)
> > +
> >  #define container_of(ptr, type, member) ({  \
> >      const typeof( ((type *)0)->member ) *__mptr = (ptr);\
> >      (type *)( (char *)__mptr - offsetof(type,member) );})








Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread Liu Bo
On Wed, Dec 07, 2016 at 05:15:42PM -0500, Chris Mason wrote:
> 
> 
> On 12/07/2016 04:45 PM, Liu Bo wrote:
> > This has implemented DAX support for btrfs with nocow and single-device.
> > 
> > DAX is developed for block devices that are memory-like in order to avoid
> > double buffering in both page cache and the storage, so DAX can perform reads
> > and writes directly to the storage device, and for those who prefer using a
> > filesystem, filesystem dax support can help to map the storage into
> > userspace for file-mapping.
> > 
> > Since I haven't figured out how to map multiple devices to userspace without
> > pagecache, this DAX support is only for single-device, and I don't think
> > DAX(Direct Access) can work with cow, this is limited to nocow case.  I made
> > this by setting nodatacow in dax mount option.
> 
> Interesting, this is a nice small start.  It might make more sense to limit
> snapshots to readonly in DAX mode until we can figure out how to cow
> properly.

Sounds good and easy to do.

>  I think it can be done, I just need to sit down with the dax code
> to do a good review.
> 
> But bigger picture, if we can't cow and we can't crc and we can't
> multi-device, I'd rather let XFS/ext4 sort out the dax space until we pull
> in more of the btrfs features too.

Well, I agree with that. Initially I thought dax didn't fit btrfs's
expectations, as it's mainly used to bypass kernel layers and offer a
bridge between applications and pmem devices. But one benefit I forgot
to mention in the commit log is that btrfs can do DUP metadata, which
is mirroring, so it has a slightly better chance than ext4/xfs of
getting metadata corruption fixed online.

Thanks,

-liubo


Re: crc32c_le performance hit

2016-12-07 Thread Chris Murphy
# taskset -c 0 btrfs send /mnt/first/subvol.ro/ | btrfs receive /mnt/int/

Attaching top and perf top output captured while this send/receive runs.
The use of taskset -c 0 doesn't seem to affect the results.


Chris Murphy
Samples: 51K of event 'cycles:pp', Event count (approx.): 13435020269
Overhead  Shared Object   Symbol
  18.04%  btrfs           [.] __crc32c_le
   8.50%  [kernel]        [k] _aesni_dec4
   6.75%  [kernel]        [k] __radix_tree_lookup
   5.00%  [kernel]        [k] copy_user_enhanced_fast_string
   2.41%  [kernel]        [k] memcpy_erms
   1.89%  [kernel]        [k] send_extent_data
   1.77%  [kernel]        [k] aesni_xts_crypt8
   0.99%  [kernel]        [k] memset_erms
   0.94%  [kernel]        [k] update_blocked_averages
   0.89%  [kernel]        [k] __wake_up_bit
   0.82%  [kernel]        [k] crypt_convert
   0.77%  [kernel]        [k] get_page_from_freelist
   0.70%  [kernel]        [k] __list_del_entry
   0.47%  [kernel]        [k] _aesni_enc1
   0.46%  [kernel]        [k] glue_xts_crypt_128bit
   0.41%  [kernel]        [k] __do_page_cache_readahead
   0.39%  [kernel]        [k] radix_tree_lookup
   0.36%  [kernel]        [k] _raw_spin_lock
   0.36%  [kernel]        [k] cfb_imageblit
   0.35%  perf            [.] dso__find_symbol
   0.35%  [kernel]        [k] crc_96
   0.33%  [kernel]        [k] crc_48
   0.33%  [kernel]        [k] crc_128
   0.31%  [kernel]        [k] crc_42
   0.31%  [kernel]        [k] crc_32
   0.31%  [kernel]        [k] crc_80
   0.31%  [kernel]        [k] __schedule
   0.31%  [kernel]        [k] bad_range
   0.30%  [kernel]        [k] __check_object_size
   0.30%  [kernel]        [k] crc_112
   0.30%  [kernel]        [k] _raw_spin_lock_irqsave
   0.30%  [kernel]        [k] free_hot_cold_page
   0.29%  [kernel]        [k] crc_64
   0.29%  [kernel]        [k] crc_16
   0.28%  [kernel]        [k] module_get_kallsym
   0.28%  [kernel]        [k] generic_bin_search.constprop.37
   0.27%  [kernel]        [k] pipe_write
   0.27%  libc-2.24.so    [.] vfprintf
   0.27%  libc-2.24.so    [.] __strcmp_sse2_unaligned
   0.27%  [kernel]        [k] free_pcppages_bulk
   0.26%  [kernel]        [k] blkcipher_walk_next
   0.26%  [kernel]        [k] __btrfs_map_block
   0.26%  [kernel]        [k] __alloc_pages_nodemask
   0.26%  [kernel]        [k] page_cache_prev_hole
   0.26%  [kernel]        [k] unlock_page
   0.25%  [kernel]        [k] __lookup_extent_mapping
   0.25%  [kernel]        [k] kmem_cache_alloc
no symbols passed the given filter.
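
For readers unfamiliar with the top entry in the profile above, here is a minimal sketch of what a software CRC-32C computes. It is a plain bitwise version using the standard reflected Castagnoli polynomial with the conventional initial value and final inversion. It is not the btrfs-progs implementation of __crc32c_le, and the name crc32c_sw is made up for this illustration. A byte-at-a-time loop like this is far slower than a hardware CRC32 instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Bitwise CRC-32C over buf[0..len). Pass crc = 0 for a fresh checksum.
 * The value is inverted on entry and on exit (conventional init and
 * final XOR), so results can be chained across consecutive buffers.
 */
uint32_t crc32c_sw(uint32_t crc, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(buf != NULL || len == 0);

    crc = ~crc;
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, else zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return ~crc;
}
```

Calling crc32c_sw(0, "123456789", 9) should give the standard CRC-32C check value 0xE3069283.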

top - 15:12:02 up 8 min,  2 users,  load average: 0.51, 0.22, 0.12
Tasks: 208 total,   3 running, 205 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.8 us, 19.9 sy,  0.0 ni, 57.3 id, 14.2 wa,  0.9 hi,  0.9 si,  0.0 st
KiB Mem :  3965204 total,  2756344 free,   166196 used,  1042664 buff/cache
KiB Swap:  8388604 total,  8388604 free,        0 used.  3528528 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 1367 root      20   0   15928   1180   1040 R  38.6  0.0   0:02.51 btrfs
 1366 root      20   0   24124   1160   1016 R  27.1  0.0   0:01.67 btrfs
 1249 root      20   0       0      0      0 S   9.2  0.0   0:00.70 kworker/u8:0
  123 root      20   0       0      0      0 S   7.9  0.0   0:00.81 kworker/u8:3
 1319 root      20   0       0      0      0 S   5.9  0.0   0:00.20 kworker/u8:5
   57 root      20   0       0      0      0 S   5.0  0.0   0:00.85 kworker/u8:1
 1370 root      20   0       0      0      0 S   5.0  0.0   0:00.15 kworker/u8:6
  113 root      20   0       0      0      0 S   4.3  0.0   0:00.43 kworker/u8:2
  153 root      20   0       0      0      0 S   3.3  0.0   0:00.56 kworker/u8:4
  815 root      20   0  139696   2964   2708 S   0.7  0.1   0:03.94 agetty
 1369 chris     20   0  156664   4236   3720 R   0.7  0.1   0:00.07 top

Re: [PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread Chris Mason



On 12/07/2016 04:45 PM, Liu Bo wrote:

This implements DAX support for btrfs with nocow and single-device.

DAX was developed for memory-like block devices in order to avoid double
buffering in both the page cache and the storage, so DAX can perform reads and
writes directly to the storage device. For those who prefer to use a
filesystem, filesystem DAX support can help map the storage into userspace
for file mapping.

Since I haven't figured out how to map multiple devices to userspace without
the page cache, this DAX support is single-device only, and since I don't think
DAX (Direct Access) can work with COW, it is limited to the nocow case.  I did
this by setting nodatacow in the dax mount option.


Interesting, this is a nice small start.  It might make more sense to 
limit snapshots to readonly in DAX mode until we can figure out how to 
cow properly.  I think it can be done, I just need to sit down with the 
dax code to do a good review.


But bigger picture, if we can't cow and we can't crc and we can't 
multi-device, I'd rather let XFS/ext4 sort out the dax space until we 
pull in more of the btrfs features too.


-chris


Re: [bug]: possible recursive locking detected

2016-12-07 Thread Liu Bo
Hi,

On Wed, Dec 07, 2016 at 11:11:41AM -0500, Yclept Nemo wrote:
> kernel version: drm-next x86_64
> abrt (fedora) bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1402453

I believe the patch [1] can address this warning.

[1]: https://patchwork.kernel.org/patch/9457035/

Thanks,

-liubo

> dmesg snippet (the full dmesg is 15Mb):
> 
> kernel: =
> kernel: [ INFO: possible recursive locking detected ]
> kernel: 4.9.0-0.rc8.git0.1.fc25.x86_64 #1 Not tainted
> kernel: -
> kernel: gvfsd-metadata/1397 is trying to acquire lock:
> kernel:  (
> kernel: >log_mutex
> kernel: ){+.+...}
> kernel: , at:
> kernel: [] btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:
> but task is already holding lock:
> kernel:  (
> kernel: >log_mutex
> kernel: ){+.+...}
> kernel: , at:
> kernel: [] btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:
> other info that might help us debug this:
> kernel:  Possible unsafe locking scenario:
> kernel:CPU0
> kernel:
> kernel:   lock(
> kernel: >log_mutex
> kernel: );
> kernel:   lock(
> kernel: >log_mutex
> kernel: );
> kernel:
>  *** DEADLOCK ***
> kernel:  May be due to missing lock nesting notation
> kernel: 3 locks held by gvfsd-metadata/1397:
> kernel:  #0:
> kernel:  (
> kernel: >i_mutex_dir_key
> kernel: #3
> kernel: ){++}
> kernel: , at:
> kernel: [] btrfs_sync_file+0x163/0x4c0 [btrfs]
> kernel:  #1:
> kernel:  (
> kernel: sb_internal
> kernel: ){.+.+.+}
> kernel: , at:
> kernel: [] start_transaction+0x2f6/0x530 [btrfs]
> kernel:  #2:
> kernel:  (
> kernel: >log_mutex
> kernel: ){+.+...}
> kernel: , at:
> kernel: [] btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:
> stack backtrace:
> kernel: CPU: 0 PID: 1397 Comm: gvfsd-metadata Not tainted
> 4.9.0-0.rc8.git0.1.fc25.x86_64 #1
> kernel: Hardware name:/LP NF4 Series, BIOS 6.00 PG 01/25/2005
> kernel:  b5c3c383b760 a64772e3 a7be05e0 96512b328000
> kernel:  b5c3c383b828 a611231e b5c3c383b780 0003
> kernel:  c383b7a8 a74e5600 6629c0631375af20 96512b328ca8
> kernel: Call Trace:
> kernel:  [] dump_stack+0x86/0xc3
> kernel:  [] __lock_acquire+0x78e/0x1290
> kernel:  [] ? sched_clock_cpu+0x90/0xc0
> kernel:  [] ? mutex_unlock+0xe/0x10
> kernel:  [] lock_acquire+0xf6/0x1f0
> kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:  [] mutex_lock_nested+0x86/0x3f0
> kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:  [] ? __btrfs_release_delayed_node+0x75/0x1c0 
> [btrfs]
> kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:  [] ?
> btrfs_commit_inode_delayed_inode+0xe9/0x130 [btrfs]
> kernel:  [] btrfs_log_inode+0x162/0x1190 [btrfs]
> kernel:  [] ? __might_sleep+0x4a/0x80
> kernel:  [] btrfs_log_inode+0xd18/0x1190 [btrfs]
> kernel:  [] ? sched_clock_local+0x17/0x80
> kernel:  [] log_new_dir_dentries+0x1e1/0x4c0 [btrfs]
> kernel:  [] btrfs_log_inode_parent+0x898/0x940 [btrfs]
> kernel:  [] ? dget_parent+0x99/0x2a0
> kernel:  [] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
> kernel:  [] btrfs_sync_file+0x312/0x4c0 [btrfs]
> kernel:  [] vfs_fsync_range+0x4b/0xb0
> kernel:  [] do_fsync+0x3d/0x70
> kernel:  [] SyS_fsync+0x10/0x20
> kernel:  [] entry_SYSCALL_64_fastpath+0x1f/0xc2


[PATCH 6/6] Btrfs: add tracepoint for btrfs_get_blocks_dax_fault

2016-12-07 Thread Liu Bo
These tracepoints (TPs) help us monitor iomap content for dax reads and writes.

Signed-off-by: Liu Bo 
---
 fs/btrfs/inode.c |   9 
 include/trace/events/btrfs.h | 106 +++
 2 files changed, 115 insertions(+)
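
A note on the blockstart field: the event class below records iomap->blkno << 9, which reads as a conversion from 512-byte sectors to a byte offset, with 0 recorded when no iomap is passed. A minimal userspace sketch of that conversion follows. It assumes blkno counts 512-byte sectors; struct iomap_model, MODEL_SECTOR_SHIFT and blockstart_bytes are hypothetical names for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: blkno is expressed in 512-byte sectors. */
#define MODEL_SECTOR_SHIFT 9

/* Hypothetical stand-in for the iomap fields the tracepoint reads. */
struct iomap_model {
    uint64_t blkno;   /* first sector of the mapping */
    uint64_t offset;  /* file offset of the mapping, in bytes */
    uint64_t length;  /* length of the mapping, in bytes */
};

/* Mirrors "iomap ? (iomap->blkno << 9) : 0" from the patch. */
uint64_t blockstart_bytes(const struct iomap_model *iomap)
{
    if (iomap == NULL)
        return 0;
    /* The shift must not push set bits out of the 64-bit value. */
    assert((iomap->blkno >> (64 - MODEL_SECTOR_SHIFT)) == 0);
    return iomap->blkno << MODEL_SECTOR_SHIFT;
}
```

With this model, a mapping that starts at sector 8 reports a blockstart of 4096 bytes.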

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9851422..b5bee38 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8067,6 +8067,8 @@ btrfs_get_blocks_dax_fault(struct inode *inode, u64 start, u64 len,
struct extent_map *em;
int ret = 0;
 
+   trace_btrfs_get_blocks_dax_entry(inode, start, len, create);
+
if (!create && start >= i_size_read(inode)) {
return 0;
}
@@ -8190,12 +8192,19 @@ btrfs_get_blocks_dax_fault(struct inode *inode, u64 start, u64 len,
 
 map_block:
ret = btrfs_em_to_iomap(root->fs_info, start, len, em, create, iomap);
+   if (!ret) {
+   if (create)
+   trace_btrfs_iomap_alloc(inode, start, len, iomap);
+   else
+   trace_btrfs_iomap_found(inode, start, len, iomap);
+   }
 
 out:
free_extent_map(em);
 
ASSERT(lockstart < lockend);
unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
&cached_state, GFP_NOFS);
+   trace_btrfs_get_blocks_dax_exit(inode, start, len, create);
 
return ret;
 }
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 0e04208..f7eb44f 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct btrfs_root;
 struct btrfs_fs_info;
@@ -1471,6 +1472,111 @@ TRACE_EVENT(qgroup_update_counters,
  __entry->cur_new_count)
 );
 
+DECLARE_EVENT_CLASS(btrfs_iomap_class,
+
+   TP_PROTO(struct inode *inode, u64 offset, u64 len, struct iomap *iomap),
+
+   TP_ARGS(inode, offset, len, iomap),
+
+   TP_STRUCT__entry_btrfs(
+   __field(u64,  ino   )
+   __field(u64,  isize )
+   __field(u64,  disk_isize)
+   __field(u64,  root_objectid )
+   __field(u64,  offset)
+   __field(u64,  len   )
+   __field(u64,  startoff  )
+   __field(u64,  blockstart)
+   __field(u64,  blocklen  )
+   __field(int,  type  )
+   ),
+
+   TP_fast_assign_btrfs(btrfs_sb(inode->i_sb),
+   __entry->ino = btrfs_ino(inode);
+   __entry->isize = i_size_read(inode);
+   __entry->disk_isize = BTRFS_I(inode)->disk_i_size;
+   __entry->root_objectid =
+   BTRFS_I(inode)->root->root_key.objectid;
+   __entry->offset = offset;
+   __entry->len = len;
+   __entry->startoff = iomap ? iomap->offset : 0;
+   __entry->blockstart = iomap ? (iomap->blkno << 9) : 0;
+   __entry->blocklen = iomap ? iomap->length : 0;
+   __entry->type = iomap ? iomap->type : 0;
+   ),
+
+   TP_printk_btrfs("root 0x%llx(%s) ino 0x%llx size 0x%llx "
+   "disk_isize 0x%llx offset 0x%llx len 0x%llx "
+   "startoff 0x%llx blockstart %llu blocklen 0x%llx "
+   " type %d",
+ show_root_type(__entry->root_objectid),
+ __entry->ino,
+ __entry->isize,
+ __entry->disk_isize,
+ __entry->offset,
+ __entry->len,
+ __entry->startoff,
+ __entry->blockstart,
+ __entry->blocklen,
+ __entry->type)
+);
+
+#define DEFINE_IOMAP_EVENT(name)   \
+DEFINE_EVENT(btrfs_iomap_class, name,  \
+   TP_PROTO(struct inode *inode, u64 offset, u64 len,  \
+struct iomap *iomap),  \
+   TP_ARGS(inode, offset, len, iomap));
+
+DEFINE_IOMAP_EVENT(btrfs_iomap_alloc)
+DEFINE_IOMAP_EVENT(btrfs_iomap_found)
+
+DECLARE_EVENT_CLASS(btrfs_get_blocks_dax,
+
+   TP_PROTO(struct inode *inode, u64 offset, u64 len, int create),
+
+   TP_ARGS(inode, offset, len, create),
+
+   TP_STRUCT__entry_btrfs(
+   __field(u64,  root_objectid )
+   __field(u64,  ino   )
+   __field(u64,  isize )
+   __field(u64,  disk_isize)
+   __field(u64,  offset)
+   __field(u64,  len   )
+   __field(int,  create)
+   ),
+
+   

[PATCH 2/6] Btrfs: set single device limit for dax usecase

2016-12-07 Thread Liu Bo
DAX on btrfs is not ready for multiple devices.

Signed-off-by: Liu Bo 
---
 fs/btrfs/ioctl.c | 6 ++
 fs/btrfs/super.c | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 7acbd2c..ab30d88 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2663,6 +2663,12 @@ static long btrfs_ioctl_add_dev(struct btrfs_root *root, void __user *arg)
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
 
+   if (btrfs_test_opt(root->fs_info, DAX)) {
+   btrfs_info(root->fs_info,
+  "dax doesn't support multiple devices\n");
+   return -EOPNOTSUPP;
+   }
+
if (atomic_xchg(&root->fs_info->mutually_exclusive_operation_running,
1)) {
return BTRFS_ERROR_DEV_EXCL_RUN_IN_PROGRESS;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 9b18f3d..8cb94ab 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -475,6 +475,13 @@ int btrfs_parse_options(struct btrfs_root *root, char *options,
 #ifdef CONFIG_FS_DAX
case Opt_dax:
btrfs_set_and_info(info, DAX, "setting dax");
+   if (btrfs_super_num_devices(info->super_copy) > 1) {
+   btrfs_info(info,
+ "dax doesn't support multiple devices(%llu)\n",
+  btrfs_super_num_devices(info->super_copy));
+   ret = -EOPNOTSUPP;
+   goto out;
+   }
/*
 * sb->s_blocksize is set to root->sectorsize
 * sb->s_bdev is required, but btrfs doesn't set it
-- 
2.5.5



[PATCH 3/6] Btrfs: refactor btrfs_file_write_iter

2016-12-07 Thread Liu Bo
This adds a helper function, btrfs_file_write_check, for file checks, permission
checks and the necessary time and size extension.

With this, we simplify btrfs_file_write_iter by moving the details into separate
buffered_write and direct_write callbacks.

Signed-off-by: Liu Bo 
---
 fs/btrfs/file.c | 128 ++--
 1 file changed, 78 insertions(+), 50 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 3a14c87..06e55e8 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -44,6 +44,7 @@
 #include "compression.h"
 
 static struct kmem_cache *btrfs_inode_defrag_cachep;
+static ssize_t btrfs_file_write_check(struct kiocb *iocb, struct iov_iter *from);
 /*
  * when auto defrag is enabled we
  * queue up these defrag structs to remember which
@@ -1735,20 +1736,25 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
return num_written ? num_written : ret;
 }
 
-static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
+static ssize_t btrfs_file_direct_write(struct kiocb *iocb,
+  struct iov_iter *from)
 {
struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
loff_t pos = iocb->ki_pos;
-   ssize_t written;
+   ssize_t written = 0;
ssize_t written_buffered;
loff_t endbyte;
int err;
 
-   written = generic_file_direct_write(iocb, from);
+   inode_lock(inode);
+   err = btrfs_file_write_check(iocb, from);
+   if (err)
+   goto out;
 
+   written = generic_file_direct_write(iocb, from);
if (written < 0 || !iov_iter_count(from))
-   return written;
+   goto out;
 
pos += written;
written_buffered = __btrfs_buffered_write(file, from, pos);
@@ -1772,6 +1778,7 @@ static ssize_t __btrfs_direct_write(struct kiocb *iocb, struct iov_iter *from)
invalidate_mapping_pages(file->f_mapping, pos >> PAGE_SHIFT,
 endbyte >> PAGE_SHIFT);
 out:
+   inode_unlock(inode);
return written ? written : err;
 }
 
@@ -1793,47 +1800,56 @@ static void update_time_for_write(struct inode *inode)
inode_inc_iversion(inode);
 }
 
-static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
-   struct iov_iter *from)
+static ssize_t btrfs_file_buffered_write(struct kiocb *iocb,
+struct iov_iter *from)
+{
+   struct file *file = iocb->ki_filp;
+   struct inode *inode = file_inode(file);
+   loff_t pos;
+   loff_t oldsize;
+   ssize_t ret;
+
+   inode_lock(inode);
+   ret = btrfs_file_write_check(iocb, from);
+   if (ret)
+   goto out;
+
+   current->backing_dev_info = inode_to_bdi(inode);
+
+   pos = iocb->ki_pos;
+   ret = __btrfs_buffered_write(file, from, pos);
+   if (ret > 0)
+   iocb->ki_pos = pos + ret;
+   if (oldsize < pos)
+   pagecache_isize_extended(inode, oldsize,
+   i_size_read(inode));
+
+   current->backing_dev_info = NULL;
+out:
+   inode_unlock(inode);
+   return ret;
+}
+
+static ssize_t btrfs_file_write_check(struct kiocb *iocb,
+ struct iov_iter *from)
 {
struct file *file = iocb->ki_filp;
struct inode *inode = file_inode(file);
struct btrfs_root *root = BTRFS_I(inode)->root;
-   u64 start_pos;
-   u64 end_pos;
-   ssize_t num_written = 0;
-   bool sync = (file->f_flags & O_DSYNC) || IS_SYNC(file->f_mapping->host);
ssize_t err;
loff_t pos;
size_t count;
loff_t oldsize;
-   int clean_page = 0;
+   u64 start_pos;
+   u64 end_pos;
 
-   inode_lock(inode);
err = generic_write_checks(iocb, from);
-   if (err <= 0) {
-   inode_unlock(inode);
+   if (err <= 0)
return err;
-   }
 
-   current->backing_dev_info = inode_to_bdi(inode);
err = file_remove_privs(file);
-   if (err) {
-   inode_unlock(inode);
-   goto out;
-   }
-
-   /*
-* If BTRFS flips readonly due to some impossible error
-* (fs_info->fs_state now has BTRFS_SUPER_FLAG_ERROR),
-* although we have opened a file as writable, we have
-* to stop this write operation to ensure FS consistency.
-*/
-   if (test_bit(BTRFS_FS_STATE_ERROR, &root->fs_info->fs_state)) {
-   inode_unlock(inode);
-   err = -EROFS;
-   goto out;
-   }
+   if (err)
+   return err;
 
/*
 * We reserve space for updating the inode when we reserve space for the
@@ -1851,30 +1867,43 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
/* Expand hole size to cover write data, preventing 

[PATCH 1/6] Btrfs: add mount option for dax

2016-12-07 Thread Liu Bo
Signed-off-by: Liu Bo 
---
 fs/btrfs/ctree.h |  1 +
 fs/btrfs/super.c | 40 +++-
 2 files changed, 40 insertions(+), 1 deletion(-)
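
The option handling added below can be read as plain bit manipulation on the mount option word: set DAX, clear the compression bits, and force nodatacow/nodatasum. Here is a minimal userspace sketch of that logic. Only the DAX bit position (1 << 28) is taken from the patch. The other bit positions and the OPT_* names, apply_dax_option and opt_is_set are assumptions made for this illustration and are not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Only the DAX bit position (28) comes from the patch. The other
 * positions are placeholders chosen for this illustration.
 */
#define OPT_NODATASUM      (1u << 0)
#define OPT_NODATACOW      (1u << 1)
#define OPT_COMPRESS       (1u << 5)
#define OPT_FORCE_COMPRESS (1u << 11)
#define OPT_DAX            (1u << 28)

/* Model of what the Opt_dax case does to the mount option word. */
uint32_t apply_dax_option(uint32_t mount_opt)
{
    mount_opt |= OPT_DAX;
    mount_opt &= ~(OPT_COMPRESS | OPT_FORCE_COMPRESS);
    mount_opt |= OPT_NODATACOW | OPT_NODATASUM;

    /* Postcondition: dax never coexists with compression. */
    assert((mount_opt & (OPT_COMPRESS | OPT_FORCE_COMPRESS)) == 0);
    return mount_opt;
}

/* Nonzero when every bit in opt is set in mount_opt. */
int opt_is_set(uint32_t mount_opt, uint32_t opt)
{
    return (mount_opt & opt) == opt;
}
```

Starting from a word that has only the compress bit set, apply_dax_option() returns a word with dax, nodatacow and nodatasum set and compress cleared, which matches the "setting nodatacow, compression disabled" message in the patch.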

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0b8ce2b..e54c6e6 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1317,6 +1317,7 @@ static inline u32 BTRFS_MAX_XATTR_SIZE(const struct btrfs_root *root)
 #define BTRFS_MOUNT_FRAGMENT_METADATA  (1 << 25)
 #define BTRFS_MOUNT_FREE_SPACE_TREE(1 << 26)
 #define BTRFS_MOUNT_NOLOGREPLAY(1 << 27)
+#define BTRFS_MOUNT_DAX(1 << 28)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL  (30)
 #define BTRFS_DEFAULT_MAX_INLINE   (2048)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 74ed5aa..9b18f3d 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -323,7 +323,7 @@ enum {
Opt_commit_interval, Opt_barrier, Opt_nodefrag, Opt_nodiscard,
Opt_noenospc_debug, Opt_noflushoncommit, Opt_acl, Opt_datacow,
Opt_datasum, Opt_treelog, Opt_noinode_cache, Opt_usebackuproot,
-   Opt_nologreplay, Opt_norecovery,
+   Opt_nologreplay, Opt_norecovery, Opt_dax,
 #ifdef CONFIG_BTRFS_DEBUG
Opt_fragment_data, Opt_fragment_metadata, Opt_fragment_all,
 #endif
@@ -383,6 +383,7 @@ static const match_table_t tokens = {
{Opt_rescan_uuid_tree, "rescan_uuid_tree"},
{Opt_fatal_errors, "fatal_errors=%s"},
{Opt_commit_interval, "commit=%d"},
+   {Opt_dax, "dax"},
 #ifdef CONFIG_BTRFS_DEBUG
{Opt_fragment_data, "fragment=data"},
{Opt_fragment_metadata, "fragment=metadata"},
@@ -410,6 +411,7 @@ int btrfs_parse_options(struct btrfs_root *root, char *options,
enum btrfs_compression_type saved_compress_type;
bool saved_compress_force;
int no_compress = 0;
+   int set_bdev = 0;
 
cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
if (btrfs_fs_compat_ro(root->fs_info, FREE_SPACE_TREE))
@@ -470,6 +472,40 @@ int btrfs_parse_options(struct btrfs_root *root, char *options,
btrfs_clear_opt(info->mount_opt, NODATACOW);
btrfs_clear_opt(info->mount_opt, NODATASUM);
break;
+#ifdef CONFIG_FS_DAX
+   case Opt_dax:
+   btrfs_set_and_info(info, DAX, "setting dax");
+   /*
+* sb->s_blocksize is set to root->sectorsize
+* sb->s_bdev is required, but btrfs doesn't set it
+* because of multi-device, so here we set it
+* temporarily.
+* We allow only one device in the dax case.
+*/
+   if (!info->sb->s_bdev) {
+   info->sb->s_bdev = info->fs_devices->latest_bdev;
+   set_bdev = 1;
+   }
+   ret = bdev_dax_supported(info->sb, info->sb->s_blocksize);
+   if (set_bdev)
+   info->sb->s_bdev = NULL;
+   if (ret)
+   goto out;
+
+   /* dax inode doesn't need inline. */
+   info->max_inline = 0;
+   btrfs_info(info, "max_inline at %llu", info->max_inline);
+
+   btrfs_clear_opt(info->mount_opt, COMPRESS);
+   btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
+   btrfs_set_opt(info->mount_opt, NODATACOW);
+   btrfs_set_opt(info->mount_opt, NODATASUM);
+   btrfs_info(info,
+  "setting nodatacow, compression disabled");
+
+   /* dax doesn't expect other fancy options. */
+   goto out;
+#endif
case Opt_nodatacow:
if (!btrfs_test_opt(info, NODATACOW)) {
if (!btrfs_test_opt(info, COMPRESS) ||
@@ -1232,6 +1268,8 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
seq_puts(seq, ",nodatasum");
if (btrfs_test_opt(info, NODATACOW))
seq_puts(seq, ",nodatacow");
+   if (btrfs_test_opt(info, DAX))
+   seq_puts(seq, ",dax");
if (btrfs_test_opt(info, NOBARRIER))
seq_puts(seq, ",nobarrier");
if (info->max_inline != BTRFS_DEFAULT_MAX_INLINE)
-- 
2.5.5



[PATCH 4/6] Btrfs: add DAX support for nocow btrfs

2016-12-07 Thread Liu Bo
This implements DAX support for btrfs with nocow and single-device.

DAX was developed for memory-like block devices in order to avoid double
buffering in both the page cache and the storage, so DAX can perform reads and
writes directly to the storage device. For those who prefer to use a
filesystem, filesystem DAX support can help map the storage into userspace
for file mapping.

Since I haven't figured out how to map multiple devices to userspace without
the page cache, this DAX support is single-device only, and since I don't think
DAX (Direct Access) can work with COW, it is limited to the nocow case.  I did
this by setting nodatacow in the dax mount option.

Signed-off-by: Liu Bo 
---
 fs/btrfs/Kconfig |   1 +
 fs/btrfs/ctree.h |   5 +
 fs/btrfs/file.c  | 214 ++---
 fs/btrfs/inode.c | 576 +--
 fs/btrfs/ioctl.c |  20 +-
 5 files changed, 780 insertions(+), 36 deletions(-)

diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
index 80e9c18..297d7509 100644
--- a/fs/btrfs/Kconfig
+++ b/fs/btrfs/Kconfig
@@ -9,6 +9,7 @@ config BTRFS_FS
select RAID6_PQ
select XOR_BLOCKS
select SRCU
+   select FS_IOMAP
 
help
  Btrfs is a general purpose copy-on-write filesystem with extents,
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index e54c6e6..a80b65d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -38,6 +38,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "extent_io.h"
 #include "extent_map.h"
 #include "async-thread.h"
@@ -3081,6 +3083,8 @@ void btrfs_extent_item_to_extent_map(struct inode *inode,
 struct extent_map *em);
 
 /* inode.c */
+extern struct iomap_ops btrfs_iomap_ops;
+
 struct btrfs_delalloc_work {
struct inode *inode;
int delay_iput;
@@ -3096,6 +3100,7 @@ void btrfs_wait_and_free_delalloc_work(struct btrfs_delalloc_work *work);
 struct extent_map *btrfs_get_extent_fiemap(struct inode *inode, struct page *page,
   size_t pg_offset, u64 start, u64 len,
   int create);
+
 noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len,
  u64 *orig_start, u64 *orig_block_len,
  u64 *ram_bytes);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 06e55e8..2d6ee1e 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1782,22 +1782,54 @@ static ssize_t btrfs_file_direct_write(struct kiocb *iocb,
return written ? written : err;
 }
 
-static void update_time_for_write(struct inode *inode)
+static ssize_t btrfs_file_dax_write(struct kiocb *iocb,
+   struct iov_iter *from)
 {
-   struct timespec now;
+   struct file *file = iocb->ki_filp;
+   struct inode *inode = file_inode(file);
+   struct btrfs_root *root = BTRFS_I(inode)->root;
+   ssize_t ret;
 
-   if (IS_NOCMTIME(inode))
-   return;
+   inode_lock(inode);
+   ret = btrfs_file_write_check(iocb, from);
+   if (ret)
+   goto out;
 
-   now = current_time(inode);
-   if (!timespec_equal(&inode->i_mtime, &now))
-   inode->i_mtime = now;
+   ret = iomap_dax_rw(iocb, from, &btrfs_iomap_ops);
+   if (ret > 0 && iocb->ki_pos > i_size_read(inode)) {
+   struct btrfs_trans_handle *trans = NULL;
+   ssize_t err;
 
-   if (!timespec_equal(&inode->i_ctime, &now))
-   inode->i_ctime = now;
+   trans = btrfs_start_transaction(root, 1);
+   if (IS_ERR(trans)) {
+   /* lets bail out and pretend the write failed */
+   ret = PTR_ERR(trans);
+   goto out;
+   }
 
-   if (IS_I_VERSION(inode))
-   inode_inc_iversion(inode);
+   /* iocb->ki_pos has been updated to new size in iomap_dax_rw. */
+   i_size_write(inode, iocb->ki_pos);
+
+   /* update i_disksize accordingly. */
+   btrfs_ordered_update_i_size(inode, iocb->ki_pos, NULL);
+
+   err = btrfs_update_inode_fallback(trans, root, inode);
+   btrfs_end_transaction(trans, root);
+   if (err) {
+   /* lets bail out and pretend the write failed */
+   ret = err;
+   goto out;
+   }
+
+   /*
+* no pagecache involved, thus no need to call
+* pagecache_isize_extended
+*/
+   }
+
+out:
+   inode_unlock(inode);
+   return ret;
 }
 
 static ssize_t btrfs_file_buffered_write(struct kiocb *iocb,
@@ -1830,6 +1862,24 @@ static ssize_t btrfs_file_buffered_write(struct kiocb *iocb,
return ret;
 }
 
+static void update_time_for_write(struct inode *inode)
+{
+   struct timespec now;
+
+   if 

[PATCH 0/6] btrfs dax IO

2016-12-07 Thread Liu Bo
This is a preliminary patch set to add dax support for btrfs. With
this we can do normal read/write on dax files and can mmap dax files
to userspace, so that applications can access persistent memory
directly.

Please note that currently this is limited to nocow, i.e. all dax
inodes do not have COW behaviour.

COW:              no
multiple device:  no
clone/reflink:    no
snapshot:         no
compression:      no
checksum:         no

Right now snapshot is disabled while mounting with -odax, but a snapshot
can be created without -odax, and writing to a dax file in a snapshot
will get -EIO.

Clone/reflink is handled the same way as snapshot: -EIO will be returned
when writing to shared extents.

This has adopted the latest iomap framework for dax read/write
and dax mmap.

With the kernel command line option "memmap=", I've had the whole patch
set tested with fstests. Except for failures caused by the inability to
create snapshots/reflinks and by the requirement for multiple devices,
fstests said OK.

To test it, simply use the kernel command line option "memmap=", run
mkfs.btrfs and mount -odax, then you're ready to run anything on the
dax version of btrfs.

Liu Bo (6):
  Btrfs: add mount option for dax
  Btrfs: set single device limit for dax usecase
  Btrfs: refactor btrfs_file_write_iter
  Btrfs: add DAX support for nocow btrfs
  Btrfs: add mmap_sem to avoid race between page faults and
truncate/hole_punch
  Btrfs: add tracepoint for btrfs_get_blocks_dax_fault

 fs/btrfs/Kconfig |   1 +
 fs/btrfs/btrfs_inode.h   |   7 +
 fs/btrfs/ctree.h |   6 +
 fs/btrfs/file.c  | 342 +++-
 fs/btrfs/inode.c | 599 +--
 fs/btrfs/ioctl.c |  26 +-
 fs/btrfs/super.c |  47 +++-
 include/trace/events/btrfs.h | 106 
 8 files changed, 1048 insertions(+), 86 deletions(-)

-- 
2.5.5



[PATCH 5/6] Btrfs: add mmap_sem to avoid race between page faults and truncate/hole_punch

2016-12-07 Thread Liu Bo
How to serialise page faults against truncate/hole punch?

For truncate, we first update isize and then truncate the pagecache in
order to avoid a race against page faults.
For punch_hole, we use lock_extent and truncate the pagecache.

Although we have these rules to avoid the race, it's not easy to understand how
they do that.  This adds a new rw_semaphore, mmap_sem, to the inode and grabs it
for writing over truncate and hole punching, and for reading over page faults.

Signed-off-by: Liu Bo 
---
 fs/btrfs/btrfs_inode.h |  7 +++
 fs/btrfs/file.c| 40 +++-
 fs/btrfs/inode.c   | 14 --
 3 files changed, 42 insertions(+), 19 deletions(-)
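
For reference, the size check that btrfs_filemap_pfn_mkwrite performs under mmap_sem (round i_size up to whole pages, then reject a fault whose page index is at or past that count) can be sketched in userspace as below. The point of doing it with the semaphore held for reading appears to be that a concurrent truncate, which holds it for writing, cannot change i_size between the check and the fault. The sketch assumes 4 KiB pages; MODEL_PAGE_SHIFT, pages_covering and fault_beyond_eof are hypothetical names for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1ull << MODEL_PAGE_SHIFT)

/* Number of pages needed to cover i_size bytes, rounding up. */
uint64_t pages_covering(uint64_t i_size)
{
    /* The round-up addition must not wrap around. */
    assert(i_size <= UINT64_MAX - (MODEL_PAGE_SIZE - 1));
    return (i_size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
}

/*
 * Nonzero when a fault at page index pgoff lies beyond EOF, which is
 * the case where the handler returns VM_FAULT_SIGBUS.
 */
int fault_beyond_eof(uint64_t pgoff, uint64_t i_size)
{
    return pgoff >= pages_covering(i_size);
}
```

For a 4097-byte file this gives two pages, so faults at page index 0 or 1 proceed and a fault at index 2 is rejected.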

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index 1a8fa46..f3674fd 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -195,6 +195,13 @@ struct btrfs_inode {
 */
struct rw_semaphore dio_sem;
 
+   /*
+* To serialise page fault with truncate/punch_hole operations.
+* We have to make sure that new page cannot be faulted in a section
+* of the inode that is being punched.
+*/
+   struct rw_semaphore mmap_sem;
+
struct inode vfs_inode;
 };
 
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 2d6ee1e..a5c375a 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2298,11 +2298,12 @@ static int btrfs_filemap_page_mkwrite(struct vm_area_struct *vma,
goto out;
}
 
+   down_read(&BTRFS_I(inode)->mmap_sem);
if (IS_DAX(inode))
ret = iomap_dax_fault(vma, vmf, &btrfs_iomap_ops);
else
ret = btrfs_page_mkwrite(vma, vmf);
-
+   up_read(&BTRFS_I(inode)->mmap_sem);
 out:
sb_end_pagefault(inode->i_sb);
return ret;
@@ -2316,10 +2317,12 @@ static int btrfs_filemap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
if ((vmf->flags & FAULT_FLAG_WRITE) && IS_DAX(inode))
return btrfs_filemap_page_mkwrite(vma, vmf);
 
+   down_read(&BTRFS_I(inode)->mmap_sem);
if (IS_DAX(inode))
ret = iomap_dax_fault(vma, vmf, &btrfs_iomap_ops);
else
ret = filemap_fault(vma, vmf);
+   up_read(&BTRFS_I(inode)->mmap_sem);
 
return ret;
 }
@@ -2335,17 +2338,13 @@ static int btrfs_filemap_pfn_mkwrite(struct vm_area_struct *vma,
sb_start_pagefault(sb);
file_update_time(vma->vm_file);
 
-   /*
-* How to serialise against truncate/hole punch similar to page_mkwrite?
-* For truncate, we firstly update isize and then truncate pagecache in
-* order to avoid race against page fault.
-* For punch_hole, we use lock_extent and truncate pagecache.
-*/
+   down_read(&BTRFS_I(inode)->mmap_sem);
size = (i_size_read(inode) + PAGE_SIZE - 1) >> PAGE_SHIFT;
if (vmf->pgoff >= size)
ret = VM_FAULT_SIGBUS;
else
ret = dax_pfn_mkwrite(vma, vmf);
+   up_read(&BTRFS_I(inode)->mmap_sem);
 
sb_end_pagefault(sb);
return ret;
@@ -2576,6 +2575,13 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 BTRFS_I(inode)->root->sectorsize) - 1;
same_block = (BTRFS_BYTES_TO_BLKS(root->fs_info, offset))
== (BTRFS_BYTES_TO_BLKS(root->fs_info, offset + len - 1));
+
+   /*
+* Prevent page faults from reinstantiating pages we have released
+* from page cache.
+*/
+   down_write(&BTRFS_I(inode)->mmap_sem);
+
/*
 * We needn't truncate any block which is beyond the end of the file
 * because we are sure there is no data there.
@@ -2591,17 +2597,15 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
} else {
ret = 0;
}
-   goto out_only_mutex;
+   goto out_mmap;
}
 
/* zero back part of the first block */
if (offset < ino_size) {
truncated_block = true;
ret = btrfs_truncate_block(inode, offset, 0, 0);
-   if (ret) {
-   inode_unlock(inode);
-   return ret;
-   }
+   if (ret)
+   goto out_mmap;
}
 
/* Check the aligned pages after the first unaligned page,
@@ -2614,10 +2618,10 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
offset = lockstart;
ret = find_first_non_hole(inode, &offset, &len);
if (ret < 0)
-   goto out_only_mutex;
+   goto out_mmap;
if (ret && !len) {
ret = 0;
-   goto out_only_mutex;
+   goto out_mmap;
}
lockstart = offset;
}
@@ -2628,7 +2632,7 @@ static int btrfs_punch_hole(struct inode 


Re: btrfs_destroy_inode warn (outstanding extents)

2016-12-07 Thread Dave Jones
On Sat, Dec 03, 2016 at 11:48:33AM -0500, Dave Jones wrote:

 > The interesting process here seems to be kworker/u8:17, and the trace
 > captures some of what that was doing before that bad page was hit.

I'm travelling next week, so I'm trying to braindump the stuff I've
found so far and summarise so I can pick it back up later if no-one else
figures it out first.

I've hit the bad page map spew with enough regularity that I've now got
a handful of good traces.

http://codemonkey.org.uk/junk/btrfs/bad-page-state1.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state2.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state3.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state4.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state5.txt
http://codemonkey.org.uk/junk/btrfs/bad-page-state6.txt

It smells to me like a race between truncate and the writeback
workqueue. The variety of traces here seems to show both sides
of the race: sometimes it's kworker, sometimes a trinity child process.

bad-page-state3.txt onwards have some bonus trace_printk's from
btrfs_setsize as I was curious what sizes we were passing down to
truncate. The only patterns I see are going from very large to very
small sizes. Perhaps that causes truncate to generate so much
writeback that it makes the race apparent?



Other stuff I keep hitting:

Start transaction spew:
http://codemonkey.org.uk/junk/btrfs/start_transaction.txt
That's the WARN_ON(h->use_count > 2);
I hit this with enough regularity that I had to comment it out.
It's not clear to me whether this is related at all.

Lockdep spew:
http://codemonkey.org.uk/junk/btrfs/register_lock_class1.txt
http://codemonkey.org.uk/junk/btrfs/register_lock_class2.txt
This stuff has been around for a while (4.6ish iirc)

Sometimes the fs got into a screwed up state that needed btrfscking.
http://codemonkey.org.uk/junk/btrfs/replay-log-fail.txt

Dave


[bug]: possible recursive locking detected

2016-12-07 Thread Yclept Nemo
kernel version: drm-next x86_64
abrt (fedora) bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1402453
dmesg snippet (the full dmesg is 15Mb):

kernel: =
kernel: [ INFO: possible recursive locking detected ]
kernel: 4.9.0-0.rc8.git0.1.fc25.x86_64 #1 Not tainted
kernel: -
kernel: gvfsd-metadata/1397 is trying to acquire lock:
kernel:  (&ei->log_mutex){+.+...}, at: [] btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:
but task is already holding lock:
kernel:  (&ei->log_mutex){+.+...}, at: [] btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:
other info that might help us debug this:
kernel:  Possible unsafe locking scenario:
kernel:        CPU0
kernel:
kernel:   lock(&ei->log_mutex);
kernel:   lock(&ei->log_mutex);
kernel:
 *** DEADLOCK ***
kernel:  May be due to missing lock nesting notation
kernel: 3 locks held by gvfsd-metadata/1397:
kernel:  #0:  (&type->i_mutex_dir_key#3){++}, at: [] btrfs_sync_file+0x163/0x4c0 [btrfs]
kernel:  #1:  (sb_internal){.+.+.+}, at: [] start_transaction+0x2f6/0x530 [btrfs]
kernel:  #2:  (&ei->log_mutex){+.+...}, at: [] btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:
stack backtrace:
kernel: CPU: 0 PID: 1397 Comm: gvfsd-metadata Not tainted 4.9.0-0.rc8.git0.1.fc25.x86_64 #1
kernel: Hardware name:/LP NF4 Series, BIOS 6.00 PG 01/25/2005
kernel:  b5c3c383b760 a64772e3 a7be05e0 96512b328000
kernel:  b5c3c383b828 a611231e b5c3c383b780 0003
kernel:  c383b7a8 a74e5600 6629c0631375af20 96512b328ca8
kernel: Call Trace:
kernel:  [] dump_stack+0x86/0xc3
kernel:  [] __lock_acquire+0x78e/0x1290
kernel:  [] ? sched_clock_cpu+0x90/0xc0
kernel:  [] ? mutex_unlock+0xe/0x10
kernel:  [] lock_acquire+0xf6/0x1f0
kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:  [] mutex_lock_nested+0x86/0x3f0
kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:  [] ? __btrfs_release_delayed_node+0x75/0x1c0 [btrfs]
kernel:  [] ? btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:  [] ? btrfs_commit_inode_delayed_inode+0xe9/0x130 [btrfs]
kernel:  [] btrfs_log_inode+0x162/0x1190 [btrfs]
kernel:  [] ? __might_sleep+0x4a/0x80
kernel:  [] btrfs_log_inode+0xd18/0x1190 [btrfs]
kernel:  [] ? sched_clock_local+0x17/0x80
kernel:  [] log_new_dir_dentries+0x1e1/0x4c0 [btrfs]
kernel:  [] btrfs_log_inode_parent+0x898/0x940 [btrfs]
kernel:  [] ? dget_parent+0x99/0x2a0
kernel:  [] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
kernel:  [] btrfs_sync_file+0x312/0x4c0 [btrfs]
kernel:  [] vfs_fsync_range+0x4b/0xb0
kernel:  [] do_fsync+0x3d/0x70
kernel:  [] SyS_fsync+0x10/0x20
kernel:  [] entry_SYSCALL_64_fastpath+0x1f/0xc2
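
For readers unfamiliar with this kind of report: the trace shows btrfs_log_inode() calling itself (via log_new_dir_dentries()) while log_mutex is already held, and the checker compares locks by class rather than by instance, so two different inodes' log_mutex look the same to it. The following is only a userspace toy model of that class-based check, not kernel code; `toy_acquire()`, `toy_release_all()` and the `held` table are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HELD 8

/* One entry per lock currently held by the single simulated task. */
struct held_lock {
	const char *class_name;	/* lock class, e.g. "log_mutex" */
	int subclass;		/* 0 unless the caller annotates nesting */
};

static struct held_lock held[MAX_HELD];
static int nr_held;

/*
 * Toy acquire: returns 1 if a lock with the same class and subclass is
 * already held (the "possible recursive locking" case), 0 otherwise.
 * The lock is recorded as held either way, as long as the table has room.
 */
static int toy_acquire(const char *class_name, int subclass)
{
	int i, recursive = 0;

	for (i = 0; i < nr_held; i++) {
		if (!strcmp(held[i].class_name, class_name) &&
		    held[i].subclass == subclass)
			recursive = 1;
	}
	if (nr_held < MAX_HELD) {
		held[nr_held].class_name = class_name;
		held[nr_held].subclass = subclass;
		nr_held++;
	}
	return recursive;
}

/* Drop every held lock so a new scenario can start from scratch. */
static void toy_release_all(void)
{
	nr_held = 0;
}

#ifdef DEMO_MAIN
int main(void)
{
	/* Parent inode, then child inode, both with subclass 0: flagged. */
	toy_acquire("log_mutex", 0);
	printf("same subclass flagged: %d\n", toy_acquire("log_mutex", 0));

	/* Same sequence, but the inner lock is annotated as nested. */
	toy_release_all();
	toy_acquire("log_mutex", 0);
	printf("nested subclass flagged: %d\n", toy_acquire("log_mutex", 1));
	return 0;
}
#endif
```

In the kernel the second case corresponds to annotating the inner acquisition, for example with mutex_lock_nested() and a non-zero subclass. Whether that is the right fix here, or whether the recursion is a real deadlock risk, is not something this sketch can answer.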


Re: [PATCH] btrfs-progs: Fix disable backtrace assert error

2016-12-07 Thread Goldwyn Rodrigues


On 12/06/2016 07:29 PM, Qu Wenruo wrote:
> Due to commit 00e769d04c2c83029d6c71(btrfs-progs: Correct value printed
> by assertions/BUG_ON/WARN_ON), which changed the assert_trace()
> parameter, the condition passed to assert/WARN_ON/BUG_ON are logical
> notted for backtrace enabled and disabled case.
> 
> Such behavior makes us easier to pass value wrong, and in fact it did
> cause us to pass wrong condition for ASSERT().
> 
> Instead of passing different conditions for ASSERT/WARN_ON/BUG_ON()
> manually, this patch will use BUG_ON() to implement the resting
> ASSERT/WARN_ON/BUG(), so we don't need to pass 3 different conditions
> but only one.
> 
> And to further info the review for the fact that the condition should be
> different, rename "assert_trace" to "bugon_trace", as unlike assert, we
> will only trigger the bug when condition is true.
> 
> Also, move WARN_ON() out of the ifdef branch, as it's completely the
> same for both branches.
> 
> Cc: Goldwyn Rodrigues 
> Signed-off-by: Qu Wenruo 
> ---
>  kerncompat.h | 19 +++
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/kerncompat.h b/kerncompat.h
> index e374614..be77608 100644
> --- a/kerncompat.h
> +++ b/kerncompat.h
> @@ -277,7 +277,7 @@ static inline long IS_ERR(const void *ptr)
>  #define vfree(x) free(x)
>  
>  #ifndef BTRFS_DISABLE_BACKTRACE
> -static inline void assert_trace(const char *assertion, const char *filename,
> +static inline void bugon_trace(const char *assertion, const char *filename,
> const char *func, unsigned line, long val)
>  {
>   if (!val)

To keep confusion to the minimum, you can call this *condition instead
of *assertion.

> @@ -287,17 +287,20 @@ static inline void assert_trace(const char *assertion, const char *filename,
>   exit(1);
>  }
>  
> -#define BUG_ON(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> -#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> -#define  ASSERT(c) assert_trace(#c, __FILE__, __func__, __LINE__, (long)!(c))
> -#define BUG() assert_trace(NULL, __FILE__, __func__, __LINE__, 1)
> +#define BUG_ON(c) bugon_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
>  #else
>  #define BUG_ON(c) assert(!(c))
> -#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> -#define ASSERT(c) assert(!(c))
> -#define BUG() assert(0)
>  #endif
>  
> +#define WARN_ON(c) warning_trace(#c, __FILE__, __func__, __LINE__, (long)(c))
> +/*
> + * TODO: ASSERT() should be depercated. In case like ASSERT(ret == 0), it
> + * won't output any useful value for ret.
> + * Should be replaced by BUG_ON(ret);
> + */
> +#define  ASSERT(c) BUG_ON(!(c))

I am not sure of this. As you are stating, this (double negation) will
kill the value of the condition. Won't it be better to remove all
ASSERTs first instead of putting this TODO?
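
To make the concern concrete, here is a small standalone sketch of how the negation hides the value. It does not use the real kerncompat.h macros: `record_trace()`, `DEMO_BUG_ON()` and `DEMO_ASSERT()` are hypothetical stand-ins that record the value instead of printing a backtrace and exiting.

```c
#include <assert.h>
#include <stdio.h>

/* Last condition string and value that reached the trace helper. */
static const char *last_cond;
static long last_val;

/* Stand-in for bugon_trace(): only "fires" when val is non-zero. */
static void record_trace(const char *cond, long val)
{
	if (!val)
		return;
	last_cond = cond;
	last_val = val;
}

#define DEMO_BUG_ON(c)	record_trace(#c, (long)(c))
#define DEMO_ASSERT(c)	DEMO_BUG_ON(!(c))

/* Value the helper sees for BUG_ON(ret). */
static long value_seen_by_bug_on(int ret)
{
	last_val = 0;
	DEMO_BUG_ON(ret);
	return last_val;
}

/* Value the helper sees for ASSERT(ret == 0). */
static long value_seen_by_assert(int ret)
{
	last_val = 0;
	DEMO_ASSERT(ret == 0);
	return last_val;
}

#ifdef DEMO_MAIN
int main(void)
{
	long v;

	v = value_seen_by_bug_on(-5);
	printf("BUG_ON(ret):      cond \"%s\", value %ld\n", last_cond, v);
	v = value_seen_by_assert(-5);
	printf("ASSERT(ret == 0): cond \"%s\", value %ld\n", last_cond, v);
	assert(value_seen_by_assert(0) == 0);
	return 0;
}
#endif
```

With ret == -5, the BUG_ON form hands -5 to the helper, while the ASSERT form only ever hands it 1, because `!(ret == 0)` collapses to a boolean before the cast to long.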


> +#define BUG() BUG_ON(1)
> +
>  #define container_of(ptr, type, member) ({  \
>  const typeof( ((type *)0)->member ) *__mptr = (ptr);\
>   (type *)( (char *)__mptr - offsetof(type,member) );})
> 

-- 
Goldwyn


Re: LSF/MM 2017: Call for Proposals

2016-12-07 Thread James Bottomley
On Thu, 2016-12-01 at 09:11 -0500, Jeff Layton wrote:
> 1) Proposals for agenda topics should be sent before January 15th,
> 2017 to:
> 
> lsf...@lists.linux-foundation.org
> 
> and cc the Linux list or lists that are relevant for the topic in
> question:
> 
> ATA:   linux-...@vger.kernel.org
> Block: linux-bl...@vger.kernel.org
> FS:linux-fsde...@vger.kernel.org
> MM:linux...@kvack.org
> SCSI:  linux-s...@vger.kernel.org
> NVMe:  linux-n...@lists.infradead.org
> 
> Please tag your proposal with [LSF/MM TOPIC] to make it easier to 
> track.

Just on this point, since there seems to be a lot of confusion: lsf-pc
is the list for contacting the programme committee, so you cannot
subscribe to it.

There is no -discuss equivalent, like kernel summit has, because we
expect you to cc the relevant existing mailing list and have the
discussion there, rather than expecting people to subscribe to a
new list.

James



Re: *** Some devices missing ***

2016-12-07 Thread Duncan
Luescher Claude posted on Tue, 06 Dec 2016 14:00:28 +0100 as excerpted:

> The server was running 3.13 kernel and btrfs progs 4.0  until it
> crashed. After that mounting the file system hang forever so I decided
> to switch to Linux Kernel 4.8.11 and btrfs-progs v4.8.5.
> 
> There are of course no missing devices.  This is what I get in dmesg
> when I try to mount the array with or without the degraded option:

> [363.066379] BTRFS error (device sdf): super_num_devices 10 mismatch
> with num_devices 9 found here

> I only find an old 2014 list entry that it was some bug and fixed so it
> should be long time fixed in 4.8.5. I need to put the device back to
> operation as soon as possible. Anybody have any further ideas what can I
> do?

FWIW there's a very recent (within the last week) thread on this, with a 
patch.  I'm not a dev, just a btrfs user and list regular, so I didn't 
specifically track it, but someone else will likely post a more direct 
link or list the thread name anyway.

IIRC the problem is that under certain circumstances the number of 
devices as given in the superblock can be incorrect, so the patch lets 
btrfs ignore the given number if all chunks (both copies in raid1/10 
mode) appear to be accounted for on already available devices.
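
As a rough illustration of the idea as I described it (and only that: this is not the actual patch, and I haven't read its code), the check could look something like the sketch below. `struct toy_chunk`, `all_chunks_accounted()` and `toy_can_mount()` are made-up names, and real btrfs chunk and device structures are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_STRIPES 4

/* Simplified chunk: just the ids of the devices holding its stripes. */
struct toy_chunk {
	size_t num_stripes;
	int devid[MAX_STRIPES];
};

/* Is devid among the devices found at mount time? */
static bool dev_present(const int *present, size_t nr_present, int devid)
{
	size_t i;

	for (i = 0; i < nr_present; i++)
		if (present[i] == devid)
			return true;
	return false;
}

/* True only if every stripe of every chunk sits on a present device. */
static bool all_chunks_accounted(const struct toy_chunk *chunks,
				 size_t nr_chunks,
				 const int *present, size_t nr_present)
{
	size_t c, s;

	for (c = 0; c < nr_chunks; c++)
		for (s = 0; s < chunks[c].num_stripes; s++)
			if (!dev_present(present, nr_present,
					 chunks[c].devid[s]))
				return false;
	return true;
}

/*
 * Tolerate a superblock device count that disagrees with what was
 * found, but only when no chunk depends on a device that is missing.
 */
static bool toy_can_mount(size_t super_num_devices,
			  const struct toy_chunk *chunks, size_t nr_chunks,
			  const int *present, size_t nr_present)
{
	if (super_num_devices == nr_present)
		return true;
	return all_chunks_accounted(chunks, nr_chunks, present, nr_present);
}

#ifdef DEMO_MAIN
int main(void)
{
	const struct toy_chunk chunks[] = {
		{ 2, { 1, 2 } },	/* raid1-style chunk on devices 1 and 2 */
		{ 2, { 2, 3 } },	/* raid1-style chunk on devices 2 and 3 */
	};
	const int present[] = { 1, 2, 3 };

	/* Superblock claims 4 devices, 3 found, all chunks complete. */
	printf("mountable: %d\n", toy_can_mount(4, chunks, 2, present, 3));
	return 0;
}
#endif
```

The point of the design is that the stored count is treated as a hint, while the chunk-to-device mapping is what decides whether any data would actually be unreachable.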

I believe the patch has been queued, but isn't in a release yet.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
