Re: Building a btrfs filesystem < 70M?

2014-03-11 Thread Saul Wold

On 03/10/2014 10:47 PM, Gui Hecheng wrote:

On Mon, 2014-03-10 at 20:16 -0700, Saul Wold wrote:

On 03/10/2014 07:38 PM, Gui Hecheng wrote:

On Mon, 2014-03-10 at 16:25 -0700, Saul Wold wrote:

Hi There

There seems to be an issue if we try to build a btrfs based FS that is
less than 70M, we get the following assertion failure:

mkfs.btrfs: extent-tree.c:2682: btrfs_reserve_extent: Assertion `!(ret)'
failed.

I tried to do a search on this and did not find anything obvious.

Further, if I do build a 70M image, it will not mount until I
increase the size to about 100M!

# mount -o loop -v rootfs.btrfs mnt
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
  missing codepage or helper program, or other error

  In some cases useful info is found in syslog - try
  dmesg | tail or so.

I can provide a small rootfs (~4M) example if needed

Builds and mounts correctly:
mkfs.btrfs -b 104857600 -r rootfs rootfs.btrfs

Builds, but does not mount:
mkfs.btrfs -b 73400320 -r rootfs rootfs.btrfs

Does not build, gives the above assertion error:
mkfs.btrfs -b 10889216 -r rootfs rootfs.btrfs


Thanks


Hi Saul,
Sorry, I'm not able to reproduce your problem...
Are you running the latest btrfs-progs from david's branch?


Yes, I am building it from git using master I think, git hash:
8cae1840afb3ea44dcc298f32983e577480dfee4

I tried both with and without the -M as cwillu suggested, still no joy.
I can send the rootfs I am using to see if it's something specific.

Here's the full failure:

$ tmp/sysroots/x86_64-linux/usr/bin/mkfs.btrfs -M -b 10889216 -r
tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs rootfs.btrfs
SMALL VOLUME: forcing mixed metadata/data groups
SMALL VOLUME: forcing mixed metadata/data groups

WARNING! - Btrfs v3.12-dirty IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

Turning ON incompat feature 'mixed-bg': mixed data and metadata block groups
Turning ON incompat feature 'extref': increased hardlink limit per file
to 65536
Created a data/metadata chunk of size 8388608
fs created label (null) on rootfs.btrfs
nodesize 4096 leafsize 4096 sectorsize 4096 size 180.00MiB
Btrfs v3.12-dirty
scandir for
tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs failed:
No such file or directory
unable to traverse_directory
Making image is aborted.
mkfs.btrfs: mkfs.c:1592: main: Assertion `!(ret)' failed.
Aborted (core dumped)


Thanks for the help!

Sau!


I think the output really tells us the problem: the mkfs '-r' option
requires a 'directory' as an arg.
But still it should not abort with 'core dumped', I would be glad to
make it more friendly.

Yes, the 
tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs is a 
directory containing a rootfs, we use this with genext2fs with no 
issues.  As I said, I can provide you with a tarball of this directory 
if you wish to try and reproduce this issue.


Sau!


-Gui



Thanks,
Gui
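
For reference, the three -b values in the original report are plain byte counts. A small shell sketch for converting between them and binary units follows; the helper names are made up for illustration and are not part of btrfs-progs.

```bash
# Hypothetical helpers, not part of btrfs-progs: convert between the
# byte counts passed to "mkfs.btrfs -b" and binary size units.
mib_to_bytes() { echo $(( $1 * 1024 * 1024 )); }
bytes_to_kib() { echo $(( $1 / 1024 )); }

mib_to_bytes 100        # the size that builds and mounts
mib_to_bytes 70         # the size that builds but does not mount
bytes_to_kib 10889216   # the size that hits the assertion, in KiB
```

The three calls print 104857600, 73400320 and 10634, so the image that fails to build is a little over 10 MiB.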






--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Building a btrfs filesystem < 70M?

2014-03-11 Thread Gui Hecheng
On Mon, 2014-03-10 at 23:41 -0700, Saul Wold wrote:
 Yes, the 
 tmp/work/qemux86_64-poky-linux/core-image-minimal/1.0-r0/rootfs is a 
 directory containing a rootfs, we use this with genext2fs with no 
 issues.  As I said, I can provide you with a tarball of this directory 
 if you wish to try and reproduce this issue.
 
 Sau!
 
Actually, I noticed that you've presented two different BUG_ON()s:
1. extent-tree.c:2682:btrfs_reserve_extent
2. mkfs.c:1592:main

The 'full failure' you showed is for the 2nd, not the 1st.

o The 1st really is a space-related problem.
o For the 2nd, the 'errno' from scandir() won't lie;
please check whether the arg for '-r' is valid.

As for the ~4M rootfs, I would be glad to take you up on your kind offer.
Please send it to me.

-Gui



[PATCH] btrfs-progs: fix bug on mkfs with relative path specified

2014-03-11 Thread Gui Hecheng
The bug occurs when running:
# mkfs.btrfs -r a relative path device
(note: the path must be valid relative to your `pwd`)
error msg:
$ scandir for a relative path failed: No such file...

Replace strdup() with realpath() to get the correct scan path.

Reported-by: Saul Wold <s...@linux.intel.com>
Signed-off-by: Gui Hecheng <guihc.f...@cn.fujitsu.com>
---
 mkfs.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 2dc90c2..1bd3069 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -756,6 +756,7 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 	ino_t parent_inum, cur_inum;
 	ino_t highest_inum = 0;
 	char *parent_dir_name;
+	char real_path[PATH_MAX];
 	struct btrfs_path path;
 	struct extent_buffer *leaf;
 	struct btrfs_key root_dir_key;
@@ -764,7 +765,7 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 	/* Add list for source directory */
 	dir_entry = malloc(sizeof(struct directory_name_entry));
 	dir_entry->dir_name = dir_name;
-	dir_entry->path = strdup(dir_name);
+	dir_entry->path = realpath(dir_name, real_path);
 
 	parent_inum = highest_inum + BTRFS_FIRST_FREE_OBJECTID;
 	dir_entry->inum = parent_inum;
@@ -876,7 +877,6 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 		}
 
 		free_namelist(files, count);
-		free(parent_dir_entry->path);
 		free(parent_dir_entry);
 
 		index_cnt = 2;
@@ -887,7 +887,6 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 fail:
 	free_namelist(files, count);
 fail_no_files:
-	free(parent_dir_entry->path);
 	free(parent_dir_entry);
 	return -1;
 }
-- 
1.8.1.4
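
On a mkfs.btrfs that lacks a fix like this, one possible workaround is to hand -r an absolute path. The following is a minimal sketch, assuming a POSIX shell; the helper name abs_dir is made up for illustration.

```bash
# Hypothetical helper: print the absolute, symlink-free path of the
# directory "$1". Prints nothing and fails if "$1" is not a directory.
abs_dir() {
    ( cd -- "$1" 2>/dev/null && pwd -P )
}

# Possible use with the command from the report (not run here):
#   mkfs.btrfs -b 104857600 -r "$(abs_dir rootfs)" rootfs.btrfs
```

The subshell keeps the cd from changing the caller's working directory.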



[PATCH 2/2] Btrfs-progs: mkfs: make sure we can deal with hard links with -r option

2014-03-11 Thread Wang Shilong
Steps to reproduce:
 # mkdir -p /tmp/test
 # touch /tmp/test/file
 # ln /tmp/test/file /tmp/test/hardlinks
 # mkfs.btrfs -f /dev/sda13 -r /tmp/test
 # btrfs check /dev/sda13

To handle hard links, we must reuse the inode number of the source file,
so that all links to one inode share it, rather than generating
increasing inode numbers ourselves.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 mkfs.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/mkfs.c b/mkfs.c
index 2f7dfef..b9385bc 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -380,7 +380,10 @@ static int add_directory_items(struct btrfs_trans_handle *trans,
 	ret = btrfs_insert_dir_item(trans, root, name, name_len,
 				    parent_inum, &location,
 				    filetype, index_cnt);
-
+	if (ret)
+		return ret;
+	ret = btrfs_insert_inode_ref(trans, root, name, name_len,
+				     objectid, parent_inum, index_cnt);
 	*dir_index_cnt = index_cnt;
 	index_cnt++;
 
@@ -493,9 +496,7 @@ static int add_inode_items(struct btrfs_trans_handle *trans,
 	struct btrfs_inode_item btrfs_inode;
 	u64 objectid;
 	u64 inode_size = 0;
-	int name_len;
 
-	name_len = strlen(name);
 	fill_inode_item(trans, root, &btrfs_inode, st);
 	objectid = self_objectid;
 
@@ -509,16 +510,8 @@ static int add_inode_items(struct btrfs_trans_handle *trans,
 	btrfs_set_key_type(&inode_key, BTRFS_INODE_ITEM_KEY);
 
 	ret = btrfs_insert_inode(trans, root, objectid, &btrfs_inode);
-	if (ret)
-		goto fail;
-
-	ret = btrfs_insert_inode_ref(trans, root, name, name_len,
-				     objectid, parent_inum, dir_index_cnt);
-	if (ret)
-		goto fail;
 
 	*inode_ret = btrfs_inode;
-fail:
 	return ret;
 }
 
@@ -826,7 +819,7 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 				goto fail;
 			}
 
-			cur_inum = ++highest_inum + BTRFS_FIRST_FREE_OBJECTID;
+			cur_inum = st.st_ino;
 			ret = add_directory_items(trans, root,
 						  cur_inum, parent_inum,
 						  cur_file->d_name,
@@ -840,6 +833,10 @@ static int traverse_directory(struct btrfs_trans_handle *trans,
 						  cur_file->d_name, cur_inum,
 						  parent_inum, dir_index_cnt,
 						  &cur_inode);
+			if (ret == -EEXIST) {
+				BUG_ON(st.st_nlink <= 1);
+				continue;
+			}
 			if (ret) {
 				fprintf(stderr, "add_inode_items failed\n");
 				goto fail;
-- 
1.9.0
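
To see which files in a source directory would exercise this path, a short shell sketch may help. It only inspects the directory and does not touch mkfs.btrfs; the helper name is made up for illustration.

```bash
# Hypothetical helper: list regular files under "$1" that have more than
# one hard link, i.e. the files the hard-link handling has to cope with.
list_hardlinked() {
    find "$1" -type f -links +1
}

# With the reproducer above, "list_hardlinked /tmp/test" would be expected
# to print both /tmp/test/file and /tmp/test/hardlinks.
```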



[PATCH 1/2] Btrfs-progs: mkfs: don't create extent for an empty file

2014-03-11 Thread Wang Shilong
Steps to reproduce:
 # mkdir -p /tmp/test
 # touch /tmp/test/file
 # mkfs.btrfs -f /dev/sda13 -r /tmp/test
 # btrfs check /dev/sda13

For an empty file, don't create extent data for it.

Signed-off-by: Wang Shilong <wangsl.f...@cn.fujitsu.com>
---
 mkfs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mkfs.c b/mkfs.c
index 2dc90c2..2f7dfef 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -619,6 +619,9 @@ static int add_file_items(struct btrfs_trans_handle *trans,
 	struct extent_buffer *eb = NULL;
 	int fd;
 
+	if (st->st_size == 0)
+		return 0;
+
 	fd = open(path_name, O_RDONLY);
 	if (fd == -1) {
 		fprintf(stderr, "%s open failed\n", path_name);
-- 
1.9.0



Re: Testing BTRFS

2014-03-11 Thread Josef Bacik
On 03/10/2014 08:39 PM, Lists wrote:
 I'd like to begin testing BTRFS. We'd probably begin roll out in 6 
 months to a year if testing goes well.
 
 We're currently using CentOS6/64 everywhere, are aware of BTRFS
 being a Technology preview in RHEL 7beta and would like to begin
 testing production-level load testing. We generate about 10 GB of
 distinct data daily that is stored redundantly by default on a
 combination of ZFS and Ext4.
 
 Is there a recommended way to do this? Is it anywhere as easy as 
 ZFSonLinux yum install?
 
 

There is way too much churn for any enterprise distro to be able to
keep up with bugfixes and stuff.  You are best off rolling your own
kernel based on the stable series if you want to think about using
btrfs in production.  Thanks,

Josef



[PATCH v4] xfstests: add test for btrfs-progs restore feature

2014-03-11 Thread Filipe David Borba Manana
This is a regression test to verify that the restore feature of btrfs-progs
is able to correctly recover files that have compressed extents, especially when
the respective file extent items have a non-zero data offset field.

This issue is fixed by the following btrfs-progs patch:

Btrfs-progs: fix restore dealing with compressed extents

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---

V2: Fixed title of btrfs-progs patch in the comment and commit message.
V3: Make use of TEST_DIR instead of /tmp, defined $here=`pwd` and better
comments about the conditions necessary to make the test fail. As
suggested by Dave Chinner.
V4: Use a dedicated directory to place the restore files, as suggested by
Josef.

 tests/btrfs/043     |  113 +++
 tests/btrfs/043.out |   40 ++
 tests/btrfs/group   |    1 +
 3 files changed, 154 insertions(+)
 create mode 100755 tests/btrfs/043
 create mode 100644 tests/btrfs/043.out

diff --git a/tests/btrfs/043 b/tests/btrfs/043
new file mode 100755
index 0000000..d0a7152
--- /dev/null
+++ b/tests/btrfs/043
@@ -0,0 +1,113 @@
+#! /bin/bash
+# FS QA Test No. btrfs/043
+#
+# Test that btrfs-progs' restore command is able to correctly recover files
+# that have compressed extents, specially when the respective file extent
+# items have a non-zero data offset field.
+#
+# This issue is fixed by the following btrfs-progs patch:
+#
+#Btrfs-progs: fix restore of files with compressed extents
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2014 Filipe Manana.  All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+restore_dir=$TEST_DIR/btrfs-test-$seq
+
+_cleanup()
+{
+   rm -fr $tmp
+   rm -fr $restore_dir
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_need_to_be_root
+
+rm -f $seqres.full
+mkdir $restore_dir
+
+test_btrfs_restore()
+{
+   if [ -z $1 ]
+   then
+       OPTIONS=""
+   else
+       OPTIONS="-o compress-force=$1"
+   fi
+   _scratch_mkfs >/dev/null 2>&1
+   _scratch_mount $OPTIONS
+
+   # Create first file extent item, then fsync to make sure the next write
+   # won't end up in the same file extent item, so that we have 2 distinct
+   # file extent items.
+   $XFS_IO_PROG -f -c "pwrite -S 0xff -b 10 0 10" -c "fsync" \
+       $SCRATCH_MNT/foo | _filter_xfs_io
+
+   # This creates a second file extent item.
+   $XFS_IO_PROG -c "pwrite -S 0xaa -b 10 10 10" -c "fsync" \
+       $SCRATCH_MNT/foo | _filter_xfs_io
+
+   # Now do a few writes that will cause the first extent item to be split,
+   # with some of the new smaller file extent items getting a data offset
+   # field different from 0.
+   $XFS_IO_PROG -c "pwrite -S 0x1e -b 2 1 2" $SCRATCH_MNT/foo \
+       | _filter_xfs_io
+   $XFS_IO_PROG -c "pwrite -S 0xd0 -b 11 33000 11" $SCRATCH_MNT/foo \
+       | _filter_xfs_io
+   $XFS_IO_PROG -c "pwrite -S 0xbc -b 100 99000 100" $SCRATCH_MNT/foo \
+       | _filter_xfs_io
+
+   md5sum $SCRATCH_MNT/foo | _filter_scratch
+
+   _scratch_unmount
+
+   rm -f $restore_dir/foo
+   # Now that the fs is unmounted, call btrfs restore to read the file
+   # from disk and save it in the test directory. It used to incorrectly
+   # read compressed file extents that have a non-zero data offset field,
+   # resulting either in decompression failure or reading a wrong section
+   # of the extent.
+   _run_btrfs_util_prog restore $SCRATCH_DEV $restore_dir
+   md5sum $restore_dir/foo | cut -d ' ' -f 1
+}
+
+echo "Testing restore of file compressed with lzo"
+test_btrfs_restore "lzo"
+echo "Testing restore of file compressed with zlib"
+test_btrfs_restore "zlib"
+echo "Testing restore of file without any compression"
+test_btrfs_restore
+
+status=0
+exit
diff --git a/tests/btrfs/043.out 

Re: [PATCH v5 00/18] Replace btrfs_workers with kernel workqueue based btrfs_workqueue

2014-03-11 Thread Filipe David Manana
On Fri, Feb 28, 2014 at 2:46 AM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
 Add a new btrfs_workqueue_struct which use kernel workqueue to implement
 most of the original btrfs_workers, to replace btrfs_workers.

 With this patchset, redundant workqueue codes are replaced with kernel
 workqueue infrastructure, which not only reduces the code size but also the
 effort to maintain it.

 The result(somewhat outdated though) from sysbench shows minor improvement on 
 the following server:
 CPU: two-way Xeon X5660
 RAM: 4G
 HDD: SAS HDD, 150G total, 100G partition for btrfs test

 Test result on default mount option:
 https://docs.google.com/spreadsheet/ccc?key=0AhpkL3ehzX3pdENjajJTWFg5d1BWbExnYWFpMTJxeUEusp=sharing

 Test result on -o compress mount option:
 https://docs.google.com/spreadsheet/ccc?key=0AhpkL3ehzX3pdHdTTEJ6OW96SXJFaDR5enB1SzMzc0Eusp=sharing

 Changelog:
 v1-v2:
   - Fix some workqueue flags.
 v2-v3:
   - Add the thresholding mechanism to simulate the old behavior
   - Convert all the btrfs_workers to btrfs_workrqueue_struct.
   - Fix some potential deadlock when executed in IRQ handler.
 v3-v4:
   - Change the ordered workqueue implement to fix the performance drop in 32K
 multi thread random write.
   - Change the high priority workqueue implement to get an independent high
 workqueue without starving problem.
   - Simplify the btrfs_alloc_workqueue parameters.
   - Coding style cleanup.
   - Remove the redundant _struct suffix.
 v4-v5:
   - Fix a multithread free-and-use bug reported by Josef and David.

 Qu Wenruo (18):
   btrfs: Cleanup the unused struct async_sched.
   btrfs: Added btrfs_workqueue_struct implemented ordered execution
 based on kernel workqueue
   btrfs: Add high priority workqueue support for btrfs_workqueue_struct
   btrfs: Add threshold workqueue based on kernel workqueue
   btrfs: Replace fs_info->workers with btrfs_workqueue.
   btrfs: Replace fs_info->delalloc_workers with btrfs_workqueue
   btrfs: Replace fs_info->submit_workers with btrfs_workqueue.
   btrfs: Replace fs_info->flush_workers with btrfs_workqueue.
   btrfs: Replace fs_info->endio_* workqueue with btrfs_workqueue.
   btrfs: Replace fs_info->rmw_workers workqueue with btrfs_workqueue.
   btrfs: Replace fs_info->cache_workers workqueue with btrfs_workqueue.
   btrfs: Replace fs_info->readahead_workers workqueue with
 btrfs_workqueue.
   btrfs: Replace fs_info->fixup_workers workqueue with btrfs_workqueue.
   btrfs: Replace fs_info->delayed_workers workqueue with
 btrfs_workqueue.
   btrfs: Replace fs_info->qgroup_rescan_worker workqueue with
 btrfs_workqueue.
   btrfs: Replace fs_info->scrub_* workqueue with btrfs_workqueue.
   btrfs: Cleanup the old btrfs_worker.
   btrfs: Cleanup the _struct suffix in btrfs_workequeue

  fs/btrfs/async-thread.c  | 830 ---
  fs/btrfs/async-thread.h  | 119 ++-
  fs/btrfs/ctree.h         |  39 ++-
  fs/btrfs/delayed-inode.c |   6 +-
  fs/btrfs/disk-io.c       | 212 +---
  fs/btrfs/extent-tree.c   |   4 +-
  fs/btrfs/inode.c         |  38 +--
  fs/btrfs/ordered-data.c  |  11 +-
  fs/btrfs/qgroup.c        |  15 +-
  fs/btrfs/raid56.c        |  21 +-
  fs/btrfs/reada.c         |   4 +-
  fs/btrfs/scrub.c         |  70 ++--
  fs/btrfs/super.c         |  36 +-
  fs/btrfs/volumes.c       |  16 +-
  14 files changed, 446 insertions(+), 975 deletions(-)

 --
 1.9.0

Hi Qu,

On latest btrfs-next/master, which includes these patches, kmemleak is
reporting many leaks that seem related to the work queues.
I can reliably reproduce it by running the xfstests.

Dmesg:

[ 1308.359146] kmemleak: 1308 new suspected memory leaks (see
/sys/kernel/debug/kmemleak)

Sample of kmemleak stack traces:

unreferenced object 0xffff8800d3f84408 (size 16):
  comm "mount", pid 4214, jiffies 4294927007 (age 1198.824s)
  hex dump (first 16 bytes):
    30 4c 6b d4 00 88 ff ff 78 5e 6b d4 00 88 ff ff  0Lk.....x^k.....
  backtrace:
    [<ffffffff816e5b46>] kmemleak_alloc+0x26/0x50
    [<ffffffff8118ec1d>] kmem_cache_alloc_trace+0x11d/0x1e0
    [<ffffffffa029ce84>] btrfs_alloc_workqueue+0x44/0x2a0 [btrfs]
    [<ffffffffa026ac15>] open_ctree+0xff5/0x20a0 [btrfs]
    [<ffffffffa0240eac>] btrfs_mount+0x6ec/0x8d0 [btrfs]
    [<ffffffff811a4d53>] mount_fs+0x43/0x1b0
    [<ffffffff811c2403>] vfs_kern_mount+0x73/0x160
    [<ffffffff811c4d49>] do_mount+0x259/0xb70
    [<ffffffff811c594e>] SyS_mount+0x8e/0xe0
    [<ffffffff81703212>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff
unreferenced object 0xffff8800d3f85830 (size 16):
  comm "mount", pid 4214, jiffies 4294927008 (age 1198.820s)
  hex dump (first 16 bytes):
    58 16 0f f5 01 88 ff ff 00 00 00 00 00 00 00 00  X...............
  backtrace:
    [<ffffffff816e5b46>] kmemleak_alloc+0x26/0x50
    [<ffffffff8118ec1d>] kmem_cache_alloc_trace+0x11d/0x1e0
    [<ffffffffa029ce84>] btrfs_alloc_workqueue+0x44/0x2a0 [btrfs]
    [<ffffffffa026ac35>] open_ctree+0x1015/0x20a0 [btrfs]
    [<ffffffffa0240eac>] btrfs_mount+0x6ec/0x8d0 [btrfs]
    [<ffffffff811a4d53>] mount_fs+0x43/0x1b0
[811c2403] 
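
With over a thousand records, it can help to summarise a saved copy of the report rather than read it by eye. The following is a rough shell sketch; the helper names and the file name leaks.txt are made up for illustration.

```bash
# Hypothetical helpers for a kmemleak report saved to a file, e.g. with
#   cat /sys/kernel/debug/kmemleak > leaks.txt   (as root)

count_leaks() {
    # Each record starts with an "unreferenced object" line.
    grep -c '^unreferenced object' "$1"
}

count_leaks_via() {
    # Count backtrace lines in file "$1" that mention symbol "$2",
    # for example btrfs_alloc_workqueue.
    grep -c -F -- "$2" "$1"
}
```

If the two counts are close, most of the reported leaks go through the same allocation site.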

[PATCH 1/2] Btrfs: less fs tree lock contention when using autodefrag

2014-03-11 Thread Filipe David Borba Manana
When finding new extents during an autodefrag, don't do so many fs tree
lookups to find an extent with a size smaller than the target threshold.
Instead, after each fs tree forward search immediately unlock upper
levels and process the entire leaf while holding a read lock on the leaf,
since our leaf processing is very fast.
This reduces lock contention, allowing for higher concurrency when other
tasks want to write/update items related to other inodes in the fs tree,
as we're not holding read locks on upper tree levels while processing the
leaf and we do less tree searches.

Test:

sysbench --test=fileio --file-num=512 --file-total-size=16G \
   --file-test-mode=rndrw --num-threads=32 --file-block-size=32768 \
   --file-rw-ratio=3 --file-io-mode=sync --max-time=1800 \
   --max-requests=100 [prepare|run]

(filesystem mounted with -o autodefrag, averages of 5 runs)

Before this change: 58.852Mb/sec throughput, read 77.589Gb, written 25.863Gb
After this change:  62.683Mb/sec throughput, read 82.111Gb, written 27.37Gb

Test machine: quad core Intel i5-3570K, 32Gb of RAM, SSD.
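
To put the two throughput figures in relative terms, a small shell sketch using awk for the floating-point arithmetic; the helper name is made up for illustration.

```bash
# Hypothetical helper: percentage change from "$1" (before) to "$2" (after),
# printed with one decimal place.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 58.852 62.683   # throughput before and after the change
```

For the throughput numbers above this prints 6.5, i.e. roughly a 6.5% improvement.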

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/ioctl.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e174770..5239470 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -941,6 +941,8 @@ static int find_new_extents(struct btrfs_root *root,
 		ret = btrfs_search_forward(root, &min_key, path, newer_than);
 		if (ret != 0)
 			goto none;
+		btrfs_unlock_up_safe(path, 1);
+process_slot:
 		if (min_key.objectid != ino)
 			goto none;
 		if (min_key.type != BTRFS_EXTENT_DATA_KEY)
@@ -959,6 +961,12 @@ static int find_new_extents(struct btrfs_root *root,
 			return 0;
 		}
 
+		path->slots[0]++;
+		if (path->slots[0] < btrfs_header_nritems(leaf)) {
+			btrfs_item_key_to_cpu(leaf, &min_key, path->slots[0]);
+			goto process_slot;
+		}
+
 		if (min_key.offset == (u64)-1)
 			goto none;
 
-- 
1.7.9.5



[PATCH 2/2] Btrfs: cache extent states in defrag code path

2014-03-11 Thread Filipe David Borba Manana
When locking file ranges in the inode's io_tree, cache the first
extent state that belongs to the target range, so that when unlocking
the range we don't need to search in the io_tree again, reducing CPU
time and therefore holding the io_tree's lock for a shorter period.

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/ioctl.c |   13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 5239470..6de00ad 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -994,10 +994,13 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start)
 	read_unlock(&em_tree->lock);
 
 	if (!em) {
+		struct extent_state *cached = NULL;
+		u64 end = start + len - 1;
+
 		/* get the big lock and read metadata off disk */
-		lock_extent(io_tree, start, start + len - 1);
+		lock_extent_bits(io_tree, start, end, 0, &cached);
 		em = btrfs_get_extent(inode, NULL, 0, start, len, 0);
-		unlock_extent(io_tree, start, start + len - 1);
+		unlock_extent_cached(io_tree, start, end, &cached, GFP_NOFS);
 
 		if (IS_ERR(em))
 			return NULL;
@@ -1136,10 +1139,12 @@ again:
 		page_start = page_offset(page);
 		page_end = page_start + PAGE_CACHE_SIZE - 1;
 		while (1) {
-			lock_extent(tree, page_start, page_end);
+			lock_extent_bits(tree, page_start, page_end,
+					 0, &cached_state);
 			ordered = btrfs_lookup_ordered_extent(inode,
 							      page_start);
-			unlock_extent(tree, page_start, page_end);
+			unlock_extent_cached(tree, page_start, page_end,
+					     &cached_state, GFP_NOFS);
 			if (!ordered)
 				break;
 
-- 
1.7.9.5



Re: [systemd-devel] [HEADS-UP] Discoverable Partitions Spec

2014-03-11 Thread Calvin Walton
On Tue, 2014-03-11 at 00:45 +0100, Lennart Poettering wrote:
 On Mon, 10.03.14 23:39, Goffredo Baroncelli (kreij...@libero.it) wrote:
 
   Well, the name is property of the admin really. There needs to be a way
   how the admin can label his subvolumes, with a potentially localized
   name. This makes it unsuitable for our purpose, we cannot just take
   possession of this and leave the admin with nothing.
  
  Instead of the name we can use the xattr to store these information.
 
 Ah, using xattrs for this is indeed an option. That way we should be able
 attach any kind of information we like to a subvolume.
 
 Hmm, I figure though that there is no way currently to read xattrs off a
 subvolume without first mounting them individually? Having to mount all
 subvolumes before we can make sense of them and mount them to the right
 place certainly sounds less than ideal...

One thing to remember about the way btrfs snapshots/subvolumes work is
that they simply show up as directories within the filesystem tree.
There is no separate namespace for storing subvolumes; they all end up
in a single unified filesystem tree along with regular files and
directories.

As a result, you can simply mount the root of the filesystem, and then
read xattrs off any and all of the subvolumes.

In fact, mounting the subvolume /@home to /home in btrfs is
effectively equivalent to

mount -t btrfs -o subvolid=0 /dev/sda1 /home
mount --bind /home/@home /home

except done as a single step in one mount command.
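
A minimal shell sketch of reading xattrs this way, assuming the getfattr tool from the attr package is installed and that /mnt is a free mount point; the helper name is made up for illustration.

```bash
# Hypothetical helper: dump all extended attributes of each path given.
# Intended for subvolume directories seen through a mount of the
# filesystem root, e.g. after (as root):
#   mount -t btrfs -o subvolid=0 /dev/sda1 /mnt
dump_subvol_xattrs() {
    local path
    for path in "$@"; do
        # -d dumps values; "-m -" matches attributes in every namespace.
        getfattr -d -m - "$path"
    done
}

# Possible use (not run here): dump_subvol_xattrs /mnt/@home
```

Only one mount of the filesystem root is needed, whatever the number of subvolumes.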

-- 
Calvin Walton <calvin.wal...@kepstin.ca>



[PATCH] Btrfs: add missing kfree in btrfs_destroy_workqueue

2014-03-11 Thread Filipe David Borba Manana

Signed-off-by: Filipe David Borba Manana <fdman...@gmail.com>
---
 fs/btrfs/async-thread.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index 00623dd..66532b8 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -315,6 +315,7 @@ void btrfs_destroy_workqueue(struct btrfs_workqueue *wq)
 	if (wq->high)
 		__btrfs_destroy_workqueue(wq->high);
 	__btrfs_destroy_workqueue(wq->normal);
+	kfree(wq);
 }
 
 void btrfs_workqueue_set_max(struct btrfs_workqueue *wq, int max)
-- 
1.7.9.5



Re: Building a btrfs filesystem < 70M?

2014-03-11 Thread Zach Brown
 There seems to be an issue if we try to build a btrfs based FS that
 is less than 70M, we get the following assertion failure:
 
 mkfs.btrfs: extent-tree.c:2682: btrfs_reserve_extent: Assertion
 `!(ret)' failed.

 mkfs.btrfs -b 104857600 -r rootfs rootfs.btrfs

Honestly, the path of least resistance is probably to avoid the -r
option altogether.  As you've found, it's not reliable.

I'd take the time to roll the infrastructure to populate the image by
writing to a mounted image with the kernel code.

- z
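
One way that could look is sketched below. It assumes root privileges for the loop mount, an existing empty mount point, and a bash-like shell; the function names and the DRY_RUN switch are made up for illustration.

```bash
# Hypothetical sketch: populate a btrfs image through the kernel by
# mounting it, instead of using "mkfs.btrfs -r".
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

build_btrfs_image() {
    local src=$1 img=$2 size=$3 mnt=$4
    run truncate -s "$size" "$img" &&
    run mkfs.btrfs "$img" &&
    run mount -o loop "$img" "$mnt" || return 1
    run cp -a "$src/." "$mnt/"      # copy the tree, keeping attributes
    local rc=$?
    run umount "$mnt" || rc=$?      # always unmount, even if the copy failed
    return "$rc"
}

# Possible use (as root, not run here):
#   build_btrfs_image rootfs rootfs.btrfs 100M /mnt/img
```

The 100M size is taken from the report, where it was the smallest size that mounted; smaller images may hit the same kernel-side limits.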


Re: Kernel BUG: btrfs send - Incremental backup

2014-03-11 Thread Josef Bacik
On 03/08/2014 02:35 AM, Swâmi Petaramesh wrote:
 Hi there,
 
 I tried to perform an incremental backup as described in 
 https://urldefense.proofpoint.com/v1/url?u=https://btrfs.wiki.kernel.org/index.php/Incremental_Backupk=ZVNjlDMF0FElm4dQtryO4A%3D%3D%0Ar=cKCbChRKsMpTX8ybrSkonQ%3D%3D%0Am=b7nDm4ij%2BOl6zWwvbxbx%2BbfpaygHhTjGu8dUoBmQj8E%3D%0As=550780a62cfd5275891e088623e3d05c78f828ee008ef9b1cb99957ee3752a5d
 between 2 external USB drives,
 
 The 1st btrfs send foo/snap1 | btrfs receive bar went well,
 although it took 5-6 times the time the same workload takes in
 ZFS.
 
 Then btrfs send -v -p foo/snap1 foo/snap2 | btrfs receive bar
 crashed on me.
 
 After a reboot and retry, I get a KERNEL BUG again :
 
 

Well that's odd, I can't think of how this would happen.  Can you pull
down btrfs-next and run that and verify you still hit the BUG_ON?  If
you do, please apply this patch and re-run and reply with the dmesg.

http://paste.fedoraproject.org/84393/55775013

This will spit out something like this

cur_ino=NUMBER, cmp_key objectid=OTHER NUMBER

then I need you to unmount your fs and run

btrfs-debug-tree /dev/whatever > blah.txt

You have about 1.5 gig of metadata so that is going to take a while.
Once that is done you want to go through blah.txt and look for

item number key (OTHER NUMBER FROM ABOVE EXTENT_DATA number)

You need to find all of these entries and copy them down, an example
output looks like this

item 1 key (286 EXTENT_DATA 577536) itemoff 3889 itemsize 53
extent data disk byte 12763136 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0

That is what I'm looking for, you will have multiple of these.  Thanks,

Josef
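
The search through blah.txt described above could be sketched like this.
It is an illustration only: the function name find_extent_items is
hypothetical, the three trailing lines per item are taken from the
example output in this message, and grep's -A option is a GNU/BSD
extension rather than POSIX.

```shell
# Sketch only: print every EXTENT_DATA item for one objectid from a
# btrfs-debug-tree dump, plus the three detail lines after each item.
# find_extent_items is a hypothetical helper, not part of btrfs-progs.
find_extent_items() {
    objectid=$1   # the "cmp_key objectid" number from dmesg
    dump=$2       # the file written by btrfs-debug-tree
    # -F matches the key prefix literally; the trailing space keeps
    # objectid 286 from also matching 2860 and so on.
    grep -F -A 3 -- "key ($objectid EXTENT_DATA " "$dump"
}

# Example: find_extent_items 286 blah.txt
```

When an objectid has several non-adjacent items, grep separates the
groups with a line containing "--".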



[PATCH v2] Btrfs: return EPERM when deleting a default subvolume

2014-03-11 Thread Guangyu Sun
The error message is confusing:

 # btrfs sub delete /mnt/mysub/
 Delete subvolume '/mnt/mysub'
 ERROR: cannot delete '/mnt/mysub' - Directory not empty

The error message does not make sense to me: we are deleting a subvolume,
not a directory, and it doesn't matter whether the subvolume is empty or not.

Maybe EPERM is more appropriate in this case, combined with an explanatory
kernel log message (e.g. "subvolume with ID 123 cannot be deleted because
it is configured as default subvolume").

Reported-by: Koen De Wit koen.de@oracle.com
Signed-off-by: Guangyu Sun guangyu@oracle.com
---
 fs/btrfs/ioctl.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a6d8efa..0751c0f 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1797,7 +1797,9 @@ static noinline int may_destroy_subvol(struct btrfs_root *root)
 	if (di && !IS_ERR(di)) {
 		btrfs_dir_item_key_to_cpu(path->nodes[0], di, &key);
 		if (key.objectid == root->root_key.objectid) {
-			ret = -ENOTEMPTY;
+			ret = -EPERM;
+			btrfs_err(root->fs_info, "deleting default subvolume "
+				  "%llu is not allowed", key.objectid);
 			goto out;
 		}
 		btrfs_release_path(path);
-- 
1.7.9.5
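
Since the check is about the default subvolume, a caller may want to
look up the default ID before attempting a delete. A minimal sketch,
assuming that `btrfs subvolume get-default` prints a line starting with
"ID <number>"; the helper name default_subvol_id is hypothetical.

```shell
# Sketch only: read `btrfs subvolume get-default` output on stdin and
# print the subvolume ID. Assumes the line starts with "ID <number>".
# default_subvol_id is a hypothetical helper name.
default_subvol_id() {
    awk '$1 == "ID" { print $2; exit }'
}

# Example (needs a mounted btrfs filesystem; /mnt is a placeholder):
#   btrfs subvolume get-default /mnt | default_subvol_id
```

If the printed ID belongs to the subvolume you want to delete, pick a
different default first with `btrfs subvolume set-default`.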



Re: Testing BTRFS

2014-03-11 Thread Eric Sandeen
On 3/10/14, 8:02 PM, Avi Miller wrote:
 
 On 11 Mar 2014, at 11:39 am, Lists li...@benjamindsmith.com wrote:
 
 Is there a recommended way to do this? Is it anywhere as easy as
 ZFSonLinux yum install?
 
 Oracle Linux 6 with the Unbreakable Enterprise Kernel Release 2 or
 Release 3 has production-ready btrfs support. You can even convert
 your existing CentOS6 boxes across to Oracle Linux 6 in-place without
 reinstalling:
 
 http://linux.oracle.com/switch/centos/

If we're plugging distros... I can also tell you that you can install
upcoming RHEL7 on btrfs if you like, and it has a very up-to-date
btrfs codebase.  Of course Fedora and other non-enterprise distros
have btrfs support as well.

But we're keeping it tech preview in RHEL7 for now, because in our
testing, it does not yet reach the level of reliability that we
wish to provide to our customers.

Indeed, testing 3.8.13-26.2.1.el6uek.x86_64 (which is, I believe,
the kernel which Avi referred to) via xfstests, I saw
failures on btrfs/009 and btrfs/022; then the box deadlocked
on btrfs/024.  I rebooted & resumed, then deadlocked on btrfs/030.
Rebooted and resumed again, then panicked on btrfs/035.  At that
point I stopped.

Ben, the best advice I have for you is to test *your* workload
on btrfs with whatever qualification tests you have, and see how
things fare.  If you want to know the current state of btrfs,
test the upstream code as best you can; if you hope to deploy
on a distribution with a longer support window, test on that
distribution.

But I agree with Josef that for now, the fixes and changes are
still flying fast & furious, and except in limited use cases,
btrfs is not yet ready for general commercial deployment.

-Eric


Re: Testing BTRFS

2014-03-11 Thread Avi Miller
Hey,

On 12 Mar 2014, at 6:08 am, Eric Sandeen sand...@redhat.com wrote:

 If we're plugging distros... I can also tell you that you can install
 upcoming RHEL7 on btrfs if you like, and it has a very up-to-date
 btrfs codebase.

Ditto for OL7, for obvious reasons. :)

 Indeed, testing 3.8.13-26.2.1.el6uek.x86_64 (which is, I believe,
 the kernel which Avi referred to) via xfstests, I saw
 failures on btrfs/009 and btrfs/022; then the box deadlocked
 on btrfs/024.  I rebooted & resumed, then deadlocked on btrfs/030.
 Rebooted and resumed again, then panicked on btrfs/035.  At that
 point I stopped.

We have a bunch of btrfs fixes queued for UEK3-QU2 which is in alpha build 
internally at the moment. We do run the full xfstests against our UEK3 releases 
and are working with Liu Bo to backport fixes from mainline which should 
resolve some (hopefully all) of the failing xfstests. It’s also worth ensuring 
that you’re upgrading the userspace btrfs-progs package that ships with the 
updated UEK3 kernels, if applicable.

 Ben, the best advice I have for you is to test *your* workload
 on btrfs with whatever qualification tests you have, and see how
 things fare.  If you want to know the current state of btrfs,
 test the upstream code as best you can; if you hope to deploy
 on a distribution with a longer support window, test on that
 distribution.

Agreed.

 But I agree with Josef that for now, the fixes and changes are
 still flying fast & furious, and except in limited use cases,
 btrfs is not yet ready for general commercial deployment.

Obviously, we disagree (somewhat) here. We’re happy with the status of btrfs 
functionality in UEK3 to provide limited production support, but this is just 
from the Oracle Linux team. The other product teams within Oracle (RDBMS, Java, 
middleware, etc) obviously have to do their own validation and testing and are 
responsible for their own support. As above, I agree with Eric that you should 
test your own workloads and requirements and make your own judgement call.

Cheers,
Avi

--
Oracle http://www.oracle.com
Avi Miller | Product Management Director | +61 (3) 8616 3496
Oracle Linux and Virtualization
417 St Kilda Road, Melbourne, Victoria 3004 Australia



warn at fs/btrfs/extent-tree.c:5748 __btrfs_free_extent+0x9ce/0xa20

2014-03-11 Thread Sage Weil
Hey,

Is this something you guys have seen before?  This is from v3.13-rc2.

kernel: [49432.696440] WARNING: CPU: 3 PID: 26411 at 
/srv/autobuild-ceph/gitbuilder.git/build/fs/btrfs/extent-tree.c:5748 
__btrfs_free_extent+0x9ce/0xa20 [btrfs]()
kernel: [49432.710128] Modules linked in: arc4(F) md4(F) nls_utf8(F) cifs(F) 
ufs(F) qnx4(F) hfsplus(F) hfs(F) minix(F) ntfs(F) msdos(F) jfs(F) xfs(F) 
reiserfs(F) ext2(F) kvm_intel(F) kvm(F) ib_iser(F) rdma_cm(F) ib_cm(F) iw_cm(F) 
ib_sa(F) ib_mad(F) ib_core(F) ib_addr(F) iscsi_tcp(F) libiscsi_tcp(F) 
libiscsi(F) psmouse(F) ipmi_si(F) serio_raw(F) gpio_ich(F) joydev(F) dcdbas(F) 
i7core_edac(F) edac_core(F) ipmi_msghandler(F) mac_hid(F) acpi_power_meter(F) 
lpc_ich(F) tpm_tis(F) nfsd(F) nfs_acl(F) auth_rpcgss(F) scsi_transport_iscsi(F) 
nfs(F) fscache(F) lockd(F) lp(F) sunrpc(F) parport(F) hid_generic(F) usbhid(F) 
hid(F) btrfs(F) raid6_pq(F) mptsas(F) ixgbe(F) mptscsih(F) dca(F) mptbase(F) 
ptp(F) pps_core(F) scsi_transport_sas(F) xor(F) mdio(F) bnx2(F) libcrc32c(F)
kernel: [49432.777445] CPU: 3 PID: 26411 Comm: ceph-osd Tainted: GF I  
3.14.0-rc5-ceph-00016-gf31a96a #1
kernel: [49432.786704] Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 
1.6.3 02/07/2011
kernel: [49432.794223]  1674 8800bf1cbac8 816e4840 
88022726ef90
kernel: [49432.801700]   8800bf1cbb08 810524ac 
a8b07e50
kernel: [49432.809176]  880094e74120  b07c9000 

kernel: [49432.816653] Call Trace:
kernel: [49432.819119]  [816e4840] dump_stack+0x46/0x58
kernel: [49432.825384]  [810524ac] warn_slowpath_common+0x8c/0xc0
kernel: [49432.831413]  [810524fa] warn_slowpath_null+0x1a/0x20
kernel: [49432.837284]  [a010b4be] __btrfs_free_extent+0x9ce/0xa20 
[btrfs]
kernel: [49432.844108]  [a01110b8] 
__btrfs_run_delayed_refs+0x428/0x11e0 [btrfs]
kernel: [49432.851465]  [a0109458] ? 
block_rsv_release_bytes+0x108/0x190 [btrfs]
kernel: [49432.858823]  [a0114066] btrfs_run_delayed_refs+0x76/0x2a0 
[btrfs]
kernel: [49432.865869]  [a01251ff] 
__btrfs_end_transaction+0x26f/0x370 [btrfs]
kernel: [49432.873044]  [a0125330] btrfs_end_transaction+0x10/0x20 
[btrfs]
kernel: [49432.879872]  [a01327de] btrfs_link+0x13e/0x1d0 [btrfs]
kernel: [49432.885903]  [811b7571] vfs_link+0x1b1/0x270
kernel: [49432.891060]  [811b8120] SyS_linkat+0x210/0x2d0
kernel: [49432.896394]  [811b81fe] SyS_link+0x1e/0x20
kernel: [49432.901380]  [816f7cd6] system_call_fastpath+0x1a/0x1f

The full dump is at

http://tracker.ceph.com/issues/7688
http://tracker.ceph.com/attachments/download/1141/kern.log.gz

Thanks-
sage


Re: Building a brtfs filesystem 70M?

2014-03-11 Thread quwen...@cn.fujitsu.com
On Tue, 11 Mar 2014 09:37:00 -0700, Zach Brown wrote:
 There seems to be an issue if we try to build a btrfs based FS that
 is less than 70M, we get the following assertion failure:

 mkfs.btrfs: extent-tree.c:2682: btrfs_reserve_extent: Assertion
 `!(ret)' failed.
 mkfs.btrfs -b 104857600 -r rootfs rootfs.btrfs
 Honestly, the path of least resistance is probably to avoid the -r
 option altogether.  As you've found, it's not reliable.

 I'd take the time to roll the infrastructure to populate the image by
 writing to a mounted image with the kernel code.

 - z

I also agree with the mount + cp (kernel) way to populate the filesystem.
I also think the implementation of -r should work somewhat like mount + cp
rather than the current way, since the userland implementation is
noticeably slower than the kernel way.

Cc: Donggeun Kim
I also wonder why the -r option is needed, since IMO the -r option is
only needed if the filesystem is fully read-only and must be populated on
initialization, like squashfs.
And since btrfs is a filesystem that can be both read and written,
the -r option is not really needed.

So I prefer to remove the -r option.


Thanks
Qu

Re: Building a brtfs filesystem 70M?

2014-03-11 Thread Saul Wold

On 03/11/2014 06:10 PM, quwen...@cn.fujitsu.com wrote:

On Tue, 11 Mar 2014 09:37:00 -0700, Zach Brown wrote:

There seems to be an issue if we try to build a btrfs based FS that
is less than 70M, we get the following assertion failure:

mkfs.btrfs: extent-tree.c:2682: btrfs_reserve_extent: Assertion
`!(ret)' failed.
mkfs.btrfs -b 104857600 -r rootfs rootfs.btrfs

Honestly, the path of least resistance is probably to avoid the -r
option altogether.  As you've found, it's not reliable.

I'd take the time to roll the infrastructure to populate the image by
writing to a mounted image with the kernel code.

- z


I also agree with the mount + cp (kernel) way to populate the filesystem.
I also think the implementation of -r should work somewhat like mount + cp
rather than the current way, since the userland implementation is
noticeably slower than the kernel way.

Cc: Donggeun Kim
I also wonder why the -r option is needed, since IMO the -r option is
only needed if the filesystem is fully read-only and must be populated on
initialization, like squashfs.
And since btrfs is a filesystem that can be both read and written,
the -r option is not really needed.

So I prefer to remove the -r option.

Please don't remove the -r option; as you point out above, it's used from
userland.  The Yocto Project / OE-Core uses this option to put a
rootfs into a filesystem image in userland without requiring root
permissions. We use something called pseudo (it's a smarter version of
fakeroot) to lay down a root filesystem.
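
For reference, the -b values quoted earlier in this thread for the 100M
and 70M images are whole MiB counts expressed in bytes. A small sketch
of that arithmetic, assuming a POSIX shell; the helper name
mib_to_bytes is made up for this example.

```shell
# Sketch only: convert a size in MiB to the byte count mkfs.btrfs -b takes.
# mib_to_bytes is a hypothetical helper name.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# The unprivileged invocation from this thread, with the size spelled in MiB:
#   mkfs.btrfs -b "$(mib_to_bytes 100)" -r rootfs rootfs.btrfs
```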


The patch from Gui worked well for our purposes, we are no longer 
failing to build the filesystem image.


Thanks

Sau!



Thanks
Qu




Re: [PATCH] Btrfs: add missing kfree in btrfs_destroy_workqueue

2014-03-11 Thread quwen...@cn.fujitsu.com
On Tue, 11 Mar 2014 14:31:44 +, Filipe David Borba Manana wrote:
 Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
 ---
  fs/btrfs/async-thread.c |1 +
  1 file changed, 1 insertion(+)

 diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
 index 00623dd..66532b8 100644
 --- a/fs/btrfs/async-thread.c
 +++ b/fs/btrfs/async-thread.c
 @@ -315,6 +315,7 @@ void btrfs_destroy_workqueue(struct btrfs_workqueue *wq)
  	if (wq->high)
  		__btrfs_destroy_workqueue(wq->high);
  	__btrfs_destroy_workqueue(wq->normal);
 +	kfree(wq);
  }
  
  void btrfs_workqueue_set_max(struct btrfs_workqueue *wq, int max)
Thanks for finding out the missing kfree.
That's my fault

Qu.

[PATCH 1/2 v2] Btrfs: less fs tree lock contention when using autodefrag

2014-03-11 Thread Filipe David Borba Manana
When finding new extents during an autodefrag, don't do so many fs tree
lookups to find an extent with a size smaller than the target threshold.
Instead, after each fs tree forward search immediately unlock upper
levels and process the entire leaf while holding a read lock on the leaf,
since our leaf processing is very fast.
This reduces lock contention, allowing for higher concurrency when other
tasks want to write/update items related to other inodes in the fs tree,
as we're not holding read locks on upper tree levels while processing the
leaf and we do fewer tree searches.

Test:

sysbench --test=fileio --file-num=512 --file-total-size=16G \
   --file-test-mode=rndrw --num-threads=32 --file-block-size=32768 \
   --file-rw-ratio=3 --file-io-mode=sync --max-time=1800 \
   --max-requests=100 [prepare|run]

(filesystem mounted with -o autodefrag, averages of 5 runs)

Before this change: 58.852Mb/sec throughput, read 77.589Gb, written 25.863Gb
After this change:  63.034Mb/sec throughput, read 83.102Gb, written 27.701Gb

Test machine: quad core intel i5-3570K, 32Gb of RAM, SSD.
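
The relative gain implied by the figures above can be checked with a
short calculation. This is a sketch assuming a POSIX shell with awk;
the helper name pct_gain is made up for this example.

```shell
# Sketch only: percentage change from a "before" value to an "after" value.
# pct_gain is a hypothetical helper name.
pct_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_gain 58.852 63.034   # throughput, Mb/sec
pct_gain 77.589 83.102   # data read, Gb
pct_gain 25.863 27.701   # data written, Gb
```

All three figures move by roughly the same proportion, about 7%.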

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---

V2: Added the missing path->keep_locks reset, without which the
unlock_up_safe call was a no-op. Also updated the commit message test numbers.

 fs/btrfs/ioctl.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e174770..6ab95cc 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -935,12 +935,14 @@ static int find_new_extents(struct btrfs_root *root,
 	min_key.type = BTRFS_EXTENT_DATA_KEY;
 	min_key.offset = *off;
 
-	path->keep_locks = 1;
-
 	while (1) {
+		path->keep_locks = 1;
 		ret = btrfs_search_forward(root, &min_key, path, newer_than);
 		if (ret != 0)
 			goto none;
+		path->keep_locks = 0;
+		btrfs_unlock_up_safe(path, 1);
+process_slot:
 		if (min_key.objectid != ino)
 			goto none;
 		if (min_key.type != BTRFS_EXTENT_DATA_KEY)
@@ -959,6 +961,12 @@ static int find_new_extents(struct btrfs_root *root,
 			return 0;
 		}
 
+		path->slots[0]++;
+		if (path->slots[0] < btrfs_header_nritems(leaf)) {
+			btrfs_item_key_to_cpu(leaf, &min_key, path->slots[0]);
+			goto process_slot;
+		}
+
 		if (min_key.offset == (u64)-1)
 			goto none;
 
-- 
1.7.9.5
