date:20140917

Re: Problem with a filesystem

2014-09-17 Thread Tovo Rabemanantsoa

On 09/16/2014 11:27 PM, Chris Murphy wrote:
 
 Try to mount normally, then with -o recovery, then with -o ro,recovery. 
 Include dmesg showing any messages that appear for these attempts.
Thank you,
Please find below the messages shown during a normal mount and a mount
with -o ro,recovery.

 Normal Mount 
/var/log/messages
Sep 17 11:00:24 sdeeph1 kernel: [85051.529541] device fsid
9ddd403a-2863-4e71-b28d-d2931a133af3 devid 1 transid 255446 /dev/sda4
Sep 17 11:00:24 sdeeph1 kernel: [85051.529917] btrfs: disk space caching
is enabled
Sep 17 11:01:09 sdeeph1 kernel: [85096.286741] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:01:09 sdeeph1 kernel: [85096.286756] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:01:09 sdeeph1 kernel: [85096.286764] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:01:09 sdeeph1 kernel: [85096.286770] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:01:09 sdeeph1 kernel: [85096.286775] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:01:09 sdeeph1 kernel: [85096.408271] btrfs: open_ctree failed

/var/log/syslog adds this line :
Sep 17 11:01:09 sdeeph1 kernel: [85096.286782] Failed to read block
groups: -5


 mount -o ro,recovery 
/var/log/messages
Sep 17 11:05:10 sdeeph1 kernel: [85337.307267] device fsid
9ddd403a-2863-4e71-b28d-d2931a133af3 devid 1 transid 255446 /dev/sda4
Sep 17 11:05:10 sdeeph1 kernel: [85337.307635] btrfs: enabling auto recovery
Sep 17 11:05:10 sdeeph1 kernel: [85337.307639] btrfs: disk space caching
is enabled
Sep 17 11:05:55 sdeeph1 kernel: [85381.548215] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:05:55 sdeeph1 kernel: [85381.548229] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:05:55 sdeeph1 kernel: [85381.548237] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:05:55 sdeeph1 kernel: [85381.548243] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:05:55 sdeeph1 kernel: [85381.548249] parent transid verify
failed on 24562568384512 wanted 255444 found 255446
Sep 17 11:05:55 sdeeph1 kernel: [85381.670344] btrfs: open_ctree failed

/var/log/syslog adds this line :
Sep 17 11:05:55 sdeeph1 kernel: [85381.548255] Failed to read block
groups: -5

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 01/12] crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code

2014-09-17 Thread Herbert Xu

On Mon, Sep 15, 2014 at 12:30:23AM -0700, beh...@converseincode.com wrote:
 From: Behan Webster beh...@converseincode.com
 
 Add a macro which replaces the use of a Variable Length Array In Struct 
 (VLAIS)
 with a C99 compliant equivalent. This macro instead allocates the appropriate
 amount of memory using a char array.
 
 The new code can be compiled with both gcc and clang.
 
 struct shash_desc contains a flexible array member ctx declared with
 CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
 of the array declared after struct shash_desc with long long.
 
 No trailing padding is required because it is not a struct type that can
 be used in an array.
 
 The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long
 as would be the case for a struct containing a member with
 CRYPTO_MINALIGN_ATTR.
 
 Signed-off-by: Behan Webster beh...@converseincode.com

Acked-by: Herbert Xu herb...@gondor.apana.org.au

Thanks,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH] btrfs: add regression test for remount with thread_pool resized

2014-09-17 Thread gux.fnst

Hi,

Could anyone help to review this patch? Thanks!

Regards,
Xing Gu

On 05/30/2014 04:52 PM, Xing Gu wrote:
 Regression test for resizing 'thread_pool' when remount the fs.
 
 Signed-off-by: Xing Gu gux.f...@cn.fujitsu.com
 ---
   tests/btrfs/055 | 55 
 +
   tests/btrfs/055.out |  1 +
   tests/btrfs/group   |  1 +
   3 files changed, 57 insertions(+)
   create mode 100755 tests/btrfs/055
   create mode 100644 tests/btrfs/055.out
 
 diff --git a/tests/btrfs/055 b/tests/btrfs/055
 new file mode 100755
 index 000..0a0dd34
 --- /dev/null
 +++ b/tests/btrfs/055
 @@ -0,0 +1,55 @@
 +#!/bin/bash
 +# FS QA Test No. btrfs/055
 +#
 +# Regression test for resizing 'thread_pool' when remount the fs.
 +#
 +#---
 +# Copyright (c) 2014 Fujitsu.  All Rights Reserved.
 +#
 +# This program is free software; you can redistribute it and/or
 +# modify it under the terms of the GNU General Public License as
 +# published by the Free Software Foundation.
 +#
 +# This program is distributed in the hope that it would be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write the Free Software Foundation,
 +# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 +#
 +#---
 +#
 +
 +seq=`basename $0`
 +seqres=$RESULT_DIR/$seq
 +echo QA output created by $seq
 +
 +here=`pwd`
 +tmp=/tmp/$$
 +status=1 # failure is the default!
 +
 +_cleanup()
 +{
 +rm -f $tmp.*
 +}
 +
 +trap "_cleanup ; exit \$status" 0 1 2 3 15
 +
 +# get standard environment, filters and checks
 +. ./common/rc
 +. ./common/filter
 +
 +# real QA test starts here
 +_supported_fs btrfs
 +_supported_os Linux
 +_require_scratch
 +
 +_scratch_mkfs > /dev/null 2>&1
 +
 +_scratch_mount -o thread_pool=6
 +
 +_scratch_mount -o remount,thread_pool=10
 +
 +status=0 ; exit
 diff --git a/tests/btrfs/055.out b/tests/btrfs/055.out
 new file mode 100644
 index 000..2fdd8f4
 --- /dev/null
 +++ b/tests/btrfs/055.out
 @@ -0,0 +1 @@
 +QA output created by 055
 diff --git a/tests/btrfs/group b/tests/btrfs/group
 index b668485..2c10c5b 100644
 --- a/tests/btrfs/group
 +++ b/tests/btrfs/group
 @@ -57,3 +57,4 @@
   052 auto quick
   053 auto quick
   054 auto quick
 +055 auto quick
 

Re: [PATCH v3 01/12] crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code

2014-09-17 Thread Dmitry Kasatkin

On 17/09/14 12:22, Herbert Xu wrote:
 On Mon, Sep 15, 2014 at 12:30:23AM -0700, beh...@converseincode.com wrote:
 From: Behan Webster beh...@converseincode.com

 Add a macro which replaces the use of a Variable Length Array In Struct 
 (VLAIS)
 with a C99 compliant equivalent. This macro instead allocates the appropriate
 amount of memory using a char array.

 The new code can be compiled with both gcc and clang.

 struct shash_desc contains a flexible array member ctx declared with
 CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
 of the array declared after struct shash_desc with long long.

 No trailing padding is required because it is not a struct type that can
 be used in an array.

 The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long
 as would be the case for a struct containing a member with
 CRYPTO_MINALIGN_ATTR.

 Signed-off-by: Behan Webster beh...@converseincode.com
 Acked-by: Herbert Xu herb...@gondor.apana.org.au

 Thanks,

Just in case.
I would still follow advice from Michał Mirosław to use shash##__desc[]

- Dmitry



Re: Problem with unmountable filesystem.

2014-09-17 Thread Austin S Hemmelgarn

On 2014-09-16 16:57, Chris Murphy wrote:
 
 On Sep 16, 2014, at 8:40 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote:
 
 Based on the kernel messages, the primary issue is log corruption, and
 in theory btrfs-zero-log should fix it.
 
 Can you provide a complete dmesg somewhere for this initial failure, just for 
 reference? I'm curious what this indication looks like compared to other 
 problems.
 
Okay, I can't really get a 'complete' dmesg, because the system panics 
on the mount failure (the filesystem in question is the system's root 
filesystem), the system has no serial ports, and I didn't think to 
build in support for console on ttyUSB0.  I can however get what the 
recovery environment (locally compiled based on buildroot) shows when I 
try to mount the filesystem:
[   30.871036] BTRFS: device label gentoo devid 1 transid 160615 /dev/sda3
[   30.875225] BTRFS info (device sda3): disk space caching is enabled
[   30.917091] BTRFS: detected SSD devices, enabling SSD mode
[   30.920536] BTRFS: bad tree block start 0 130402254848
[   30.924018] BTRFS: bad tree block start 0 130402254848
[   30.926234] BTRFS: failed to read log tree
[   30.953055] BTRFS: open_ctree failed
  The actual issue however, is
 that the primary superblock appears to be pointing at a corrupted root
 tree, which causes pretty much everything that does anything other than
 just read the sb to fail.  The first backup sb does point to a good
 tree, but only btrfs check and btrfs restore have any option to ignore
 the first sb and use one of the backups instead.
 
 Maybe use wipefs -a on this volume, which removes the magic from only the 
 first superblock by default (you can specify another location). And then try 
 btrfs-show-super -F which dumps supers with bad magic.
 
Thanks for the suggestion, I hadn't thought of that...
 I just tried this:
 # wipefs -a /dev/sdb
 /dev/sdb: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 
 5f 4d
 # btrfs-show-super -F /dev/sdb
 superblock: bytenr=65536, device=/dev/sdb
 -
 csum  0x5c1196d7 [DON'T MATCH]
 bytenr 65536
 flags 0x1
 magic  [DON'T MATCH]
 […]
 # btrfs-show-super -i1 /dev/sdb
 superblock: bytenr=67108864, device=/dev/sdb
 -
 csum  0xfc70be19 [match]
 bytenr 67108864
 flags 0x1
 magic _BHRfS_M [match]
 
 So the mirror is definitely there and valid.
 # btrfs rescue super-recover -yv /dev/sdb
 No valid Btrfs found on /dev/sdb
 Usage or syntax errors
 
 Not expected at all, man page says "Recover bad superblocks from good 
 copies." There's a good copy, it's not being found by btrfs rescue 
 super-recover. Seems like a bug.
 
 
 # btrfs check /dev/sdb
 No valid Btrfs found on /dev/sdb
 Couldn't open file system
 
 # btrfs check -s1 /dev/sdb
 using SB copy 1, bytenr 67108864
 Checking filesystem on /dev/sdb
 UUID: 9acf13de-5b98-4f28-9992-533e4a99d348
 [snip]
 OK it finds it, maybe a --repair will fix the bad first one?
 # btrfs check -s1 --repair /dev/sdb
 using SB copy 1, bytenr 67108864
 enabling repair mode
 Checking filesystem on /dev/sdb
 UUID: 9acf13de-5b98-4f28-9992-533e4a99d348
 [snip]
 No indication of repair
 # btrfs check /dev/sdb
 No valid Btrfs found on /dev/sdb
 Couldn't open file system
 # btrfs check /dev/sdb
 No valid Btrfs found on /dev/sdb
 Couldn't open file system
 [root@f21v ~]# btrfs-show-super -F /dev/sdb
 superblock: bytenr=65536, device=/dev/sdb
 -
 csum  0x5c1196d7 [DON'T MATCH]
 bytenr 65536
 flags 0x1
 magic  [DON'T MATCH]
 
 
 Still not fixed. Maybe I needed to corrupt something else in the superblock 
 other than the magic and this behavior is intentional, otherwise wipefs -a, 
 followed by btrfsck would resurrect an intentionally wiped btrfs fs, 
 potentially wiping out some newer file system in the process.
 
...though maybe it's a good thing I didn't.
 
 
 I'm fine using dd to replace the primary sb with one of the
 backups, but don't know the exact parameters that would be needed.
 
 Here's an idea:
 
 # btrfs-show-super /dev/sdb
 superblock: bytenr=65536, device=/dev/sdb
 -
 csum  0x92aa51ab [match]
 [snip]
 So I know what I'm looking for starts at LBA 65536/512
 
 # dd if=/dev/sdb skip=128 count=4 2>/dev/null | hexdump -C
   92 aa 51 ab 00 00 00 00  00 00 00 00 00 00 00 00  |..Q…..|
 [snip]
 
 And as it turns out the csum is right at the beginning, 4 bytes. So use bs of 
 4 bytes, seek 65536/4, count of 1. This should zero just 4 bytes starting at 
 65536 bytes in.
 
 # dd if=/dev/zero of=/dev/sdb bs=4 seek=16384 count=1
 
 Checked it with the earlier skip=128
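
The offset arithmetic behind those dd invocations can be sketched in C. This is only an illustration: the helper name is made up here, and the only values taken from the thread are the superblock offsets 65536 and 67108864, dd's default 512-byte block size, and the 4-byte csum width used as bs.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets of the first two btrfs superblock copies, as printed by
 * btrfs-show-super earlier in this thread. */
#define SB_PRIMARY_BYTENR 65536ULL     /* 64 KiB */
#define SB_MIRROR1_BYTENR 67108864ULL  /* 64 MiB */

/* Hypothetical helper: turn a byte offset into the block count to pass
 * as dd skip= or seek= for a given bs=. Returns UINT64_MAX when the
 * offset is not a whole number of blocks, since dd addresses whole
 * blocks only. */
uint64_t dd_block_index(uint64_t byte_offset, uint64_t block_size)
{
    assert(block_size != 0);
    if (byte_offset % block_size != 0)
        return UINT64_MAX;
    return byte_offset / block_size;
}
```

With bs=512 the primary superblock is at block 128 (the skip=128 above), and with bs=4 it is at block 16384 (the seek=16384 above).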

Re: [PATCH v3 01/12] crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code

2014-09-17 Thread Herbert Xu

On Wed, Sep 17, 2014 at 02:15:40PM +0300, Dmitry Kasatkin wrote:
 On 17/09/14 12:22, Herbert Xu wrote:
  On Mon, Sep 15, 2014 at 12:30:23AM -0700, beh...@converseincode.com wrote:
  From: Behan Webster beh...@converseincode.com
 
  Add a macro which replaces the use of a Variable Length Array In Struct 
  (VLAIS)
  with a C99 compliant equivalent. This macro instead allocates the 
  appropriate
  amount of memory using a char array.
 
  The new code can be compiled with both gcc and clang.
 
  struct shash_desc contains a flexible array member ctx declared with
  CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
  of the array declared after struct shash_desc with long long.
 
  No trailing padding is required because it is not a struct type that can
  be used in an array.
 
  The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long
  as would be the case for a struct containing a member with
  CRYPTO_MINALIGN_ATTR.
 
  Signed-off-by: Behan Webster beh...@converseincode.com
  Acked-by: Herbert Xu herb...@gondor.apana.org.au
 
  Thanks,
 
 Just in case.
 I would still follow advice from Michał Mirosław to use shash##__desc[]

Oh yes of course.  My ack is more about the approach.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: Unable to mount multiple subvolumes of a single disk

2014-09-17 Thread Chris Mason



On 09/17/2014 05:47 AM, Anand Jain wrote:
 
 Hi Chris,
 
 
 ---
 diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
 index e9676a4..1224b61 100644
 --- a/fs/btrfs/volumes.c
 +++ b/fs/btrfs/volumes.c
 @@ -533,7 +533,7 @@ static noinline int device_list_add(const char
 *path,
   * the btrfs dev scan cli, after FS has been mounted.
   */
  if (fs_devices->opened) {
 -   return -EBUSY;
 +   goto out;
  } else {
  /*
   * That is if the FS is _not_ mounted and if
 you
 @@ -566,6 +566,7 @@ static noinline int device_list_add(const char
 *path,
  if (!fs_devices->opened)
  device->generation = found_transid;

 +out:
  *fs_devices_ret = fs_devices;

  return ret;

 Anand, are you planning on sending a full patch out for this?  One
 concern
 I have is that after the device_list_add call:


  if (!ret && fs_devices_ret)
  (*fs_devices_ret)->total_devices = total_devices;

 We should only be doing this from the newest super, not blindly
 overwriting.
 But that's a merge window fix.  For now I just want to deal with the
 regression,
 and your patch above looks good.

 Thanks for jumping on this one.
 
 
  Sorry for the trouble.
  yes, I will be sending a full patch. I am finding it too difficult
  to revive the function btrfs_scan_one_device(), which is predominantly
  there to handle device scan and list update _before_ any mount. Further
  to the concern which you mention above, the ioctl
  BTRFS_IOC_DEV_READY also uses this function; it absolutely should
  not have any intention to update the device list, but it does,
  theoretically. And I note that this ioctl is used by systemd as well.
  So the fix is getting a bit complicated. I am working on it.

No problem, the original patch looked right to me too.  We're getting
closer to rc6, I think at this point I'll revert the original patch
until the next merge window.  Then we can step back and nail down
exactly what is going on.

-chris


[PATCH] Btrfs: fix wrong parse of extent map's tracepoint

2014-09-17 Thread Liu Bo

The tracepoint of extent map doesn't parse @flag correctly: we set @flag via
set_bit(), so we need to parse it on a per-bit basis.

Also add the missing flag, EXTENT_FLAG_FS_MAPPING.

Signed-off-by: Liu Bo bo.li@oracle.com
---
 include/trace/events/btrfs.h | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 4ee4e30..9c69cf0 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -157,12 +157,13 @@ DEFINE_EVENT(btrfs__inode, btrfs_inode_evict,
 
 #define show_map_flags(flag)						\
 	__print_flags(flag, "|",					\
-		{ EXTENT_FLAG_PINNED,		"PINNED"	},	\
-		{ EXTENT_FLAG_COMPRESSED,	"COMPRESSED"	},	\
-		{ EXTENT_FLAG_VACANCY,		"VACANCY"	},	\
-		{ EXTENT_FLAG_PREALLOC,		"PREALLOC"	},	\
-		{ EXTENT_FLAG_LOGGING,		"LOGGING"	},	\
-		{ EXTENT_FLAG_FILLING,		"FILLING"	})
+		{ (1 << EXTENT_FLAG_PINNED),		"PINNED"	},\
+		{ (1 << EXTENT_FLAG_COMPRESSED),	"COMPRESSED"	},\
+		{ (1 << EXTENT_FLAG_VACANCY),		"VACANCY"	},\
+		{ (1 << EXTENT_FLAG_PREALLOC),		"PREALLOC"	},\
+		{ (1 << EXTENT_FLAG_LOGGING),		"LOGGING"	},\
+		{ (1 << EXTENT_FLAG_FILLING),		"FILLING"	},\
+		{ (1 << EXTENT_FLAG_FS_MAPPING),	"FS_MAPPING"	})
 
 TRACE_EVENT_CONDITION(btrfs_get_extent,
 
-- 
1.8.1.4
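
The difference between a bit number and a bit mask, which is what this patch is about, can be shown in a small user-space sketch. Everything named demo_* is invented for illustration; the bit numbers are example values rather than the kernel's EXTENT_FLAG_* definitions, and demo_print_flags() is only a rough analogue of the tracing helper, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Example bit numbers only; the real EXTENT_FLAG_* values may differ. */
enum { DEMO_FLAG_PINNED = 0, DEMO_FLAG_COMPRESSED = 1, DEMO_FLAG_PREALLOC = 3 };

struct demo_flag_name {
    unsigned long mask;
    const char *name;
};

/* Stand-in for set_bit(nr, &flags): nr is a bit position, not a mask. */
unsigned long demo_set_bit(unsigned int nr, unsigned long flags)
{
    assert(nr < sizeof(unsigned long) * 8);
    return flags | (1UL << nr);
}

/* Append the name of every table entry whose mask is fully contained in
 * flags, separated by '|', clearing matched bits as it goes. */
void demo_print_flags(unsigned long flags, const struct demo_flag_name *tbl,
                      size_t n, char *out, size_t outlen)
{
    assert(outlen > 0);
    out[0] = '\0';
    for (size_t i = 0; i < n && flags; i++) {
        unsigned long mask = tbl[i].mask;

        if ((flags & mask) != mask)
            continue;
        flags &= ~mask;
        if (out[0])
            strncat(out, "|", outlen - strlen(out) - 1);
        strncat(out, tbl[i].name, outlen - strlen(out) - 1);
    }
}

/* Decode flags with either the raw bit numbers (the pre-patch shape) or
 * the shifted masks (the post-patch shape). */
void demo_decode(int use_shifted_masks, unsigned long flags,
                 char *out, size_t outlen)
{
    static const struct demo_flag_name raw[] = {
        { DEMO_FLAG_PINNED,     "PINNED" },
        { DEMO_FLAG_COMPRESSED, "COMPRESSED" },
        { DEMO_FLAG_PREALLOC,   "PREALLOC" },
    };
    static const struct demo_flag_name shifted[] = {
        { 1UL << DEMO_FLAG_PINNED,     "PINNED" },
        { 1UL << DEMO_FLAG_COMPRESSED, "COMPRESSED" },
        { 1UL << DEMO_FLAG_PREALLOC,   "PREALLOC" },
    };

    demo_print_flags(flags, use_shifted_masks ? shifted : raw, 3, out, outlen);
}
```

In this sketch, a value with the COMPRESSED and PREALLOC bits set decodes as "COMPRESSED|PREALLOC" with the shifted table, but as "PINNED" with the raw table, because a mask of 0 matches any non-zero value.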


Re: [PATCH v3 02/12] btrfs: LLVMLinux: Remove VLAIS

2014-09-17 Thread Chris Mason



On 09/15/2014 03:30 AM, beh...@converseincode.com wrote:
 From: Vinícius Tinti viniciusti...@gmail.com
 
 Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99
 compliant equivalent. This is the original VLAIS struct.
 
 struct {
   struct shash_desc shash;
   char ctx[crypto_shash_descsize(tfm)];
 } desc;
 
 This patch instead allocates the appropriate amount of memory using a
 char array using the SHASH_DESC_ON_STACK macro.
 
 The new code can be compiled with both gcc and clang.
 
 Signed-off-by: Vinícius Tinti viniciusti...@gmail.com
 Reviewed-by: Jan-Simon Möller dl...@gmx.de
 Reviewed-by: Mark Charlebois charl...@gmail.com
 Signed-off-by: Behan Webster beh...@converseincode.com
 Cc: David S. Miller da...@davemloft.net
 Cc: Herbert Xu herb...@gondor.apana.org.au

Acked-by: Chris Mason c...@fb.com

On the btrfs bits.  Thanks for the v3.

-chris

[PATCH] Btrfs: don't do async reclaim during log replay

2014-09-17 Thread Josef Bacik

Trying to reproduce a log enospc bug I hit a panic in the async reclaim code
during log replay.  This is because we use fs_info->fs_root as our root for
shrinking and such.  Technically we can use whatever root we want, but let's
just not allow async reclaim while we're doing log replay.  Thanks,

Signed-off-by: Josef Bacik jba...@fb.com
---
 fs/btrfs/extent-tree.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 53714ef..e3fde87 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4513,7 +4513,13 @@ again:
 		space_info->flush = 1;
 	} else if (!ret && space_info->flags & BTRFS_BLOCK_GROUP_METADATA) {
 		used += orig_bytes;
-		if (need_do_async_reclaim(space_info, root->fs_info, used) &&
+		/*
+		 * We will do the space reservation dance during log replay,
+		 * which means we won't have fs_info->fs_root set, so don't do
+		 * the async reclaim as we will panic.
+		 */
+		if (root->fs_info->fs_root &&
+		    need_do_async_reclaim(space_info, root->fs_info, used) &&
 		    !work_busy(&root->fs_info->async_reclaim_work))
 			queue_work(system_unbound_wq,
 				   &root->fs_info->async_reclaim_work);
-- 
1.8.3.1
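
The guard added by this patch relies on the left-to-right, short-circuit evaluation of &&. A minimal user-space sketch of that shape follows. All demo_* names are invented, and the assumption that the reclaim check itself dereferences fs_root is made only for the demo; in the commit message the panic is in the async reclaim code that uses fs_info->fs_root.

```c
#include <assert.h>
#include <stddef.h>

struct demo_root { int id; };

struct demo_fs_info {
    struct demo_root *fs_root;  /* NULL while log replay is running */
    int reclaim_busy;           /* stand-in for work_busy() */
};

/* Stand-in for the reclaim machinery. In this sketch it dereferences
 * fs_root, so it must never run while fs_root is NULL. */
int demo_need_reclaim(const struct demo_fs_info *info, unsigned long used)
{
    assert(info->fs_root != NULL);
    return info->fs_root->id >= 0 && used > 1024;
}

/* Same shape as the patched condition: the NULL test comes first, and
 * && stops at the first false operand, so nothing after it is evaluated
 * during log replay. */
int demo_should_queue_reclaim(const struct demo_fs_info *info,
                              unsigned long used)
{
    return info->fs_root != NULL &&
           demo_need_reclaim(info, used) &&
           !info->reclaim_busy;
}
```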


Re: NOCOW on VM images causes extreme btrfs slowdowns, memory leaks, and deadlocks

2014-09-17 Thread Marc MERLIN

kernel: 3.16.2

On Tue, Sep 16, 2014 at 04:57:42PM -0700, Marc MERLIN wrote:
 I have a filtered log showing any system call that took more than 1 sec,
 that list is small:
 http://marc.merlins.org/tmp/btrfs_receive.log
 
 Most of the time is apparently just death by a thousand cuts of many
 many system calls spent around receiving my virtual images that didn't
 change.
 
 Here's the full strace log if you wish
 http://marc.merlins.org/tmp/btrfs_receive.log.xz

Ok, so while debugging this further, I found out that my VM images were
not NOCOW anymore (they used to be, but this must have been lost during a
restore).

Problems:
filefrag on my vbox file took all of my RAM and swap (32GB) and killed my
machine without being able to finish.

Moving the dir to +C and copying the vbox image from backup (having deleted
the fragmented one) took much longer to start than it should have
(destination had a filesize of 0 for a long time), but finished overnight.

The next morning (now), I see multiple of my CPUs deadlocked and a kworker
at the top of the list:
INFO: task kworker/u16:6:21880 blocked for more than 120 seconds.
  Tainted: G   O  3.16.2-amd64-i915-preempt-20140714 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u16:6   D  0 21880  2 0x0080
Workqueue: writeback bdi_writeback_workfn (flush-btrfs-2)
 88012f87b9d0 0046 88012f87b9a0 88012f87bfd8
 8800139c0490 000140c0 88041d2140c0 8800139c0490
 88012f87ba70 0002 81106441 88012f87b9e0
Call Trace:
 [81106441] ? wait_on_page_read+0x3c/0x3c
 [8163a889] schedule+0x6e/0x70
 [8163aa2b] io_schedule+0x60/0x7a
 [8110644f] sleep_on_page+0xe/0x12
 [8163adbb] __wait_on_bit_lock+0x46/0x8a
 [8110650a] __lock_page+0x69/0x6b
 [81087ba4] ? autoremove_wake_function+0x34/0x34
 [8124aead] lock_page+0x1e/0x21
 [8124ecb9] extent_write_cache_pages.isra.16.constprop.31+0x10e/0x2c3
 [8124f2a2] extent_writepages+0x4b/0x5c
 [81238e7c] ? btrfs_submit_direct+0x3f9/0x3f9
 [81079658] ? preempt_count_add+0x78/0x8d
 [81237568] btrfs_writepages+0x28/0x2a
 [81110efe] do_writepages+0x1e/0x2c
 [811814db] __writeback_single_inode+0x7d/0x238
 [81182213] writeback_sb_inodes+0x1eb/0x339
 [811823d5] __writeback_inodes_wb+0x74/0xb7
 [81182550] wb_writeback+0x138/0x293
 [81182b88] bdi_writeback_workfn+0x19a/0x329
 [81068bf7] process_one_work+0x195/0x2d2
 [81068fd8] worker_thread+0x275/0x352
 [81068d63] ? process_scheduled_works+0x2f/0x2f
 [8106e3a9] kthread+0xae/0xb6
 [8106e2fb] ? __kthread_parkme+0x61/0x61
 [8163d8fc] ret_from_fork+0x7c/0xb0
 [8106e2fb] ? __kthread_parkme+0x61/0x61

Hung tasks (sysrq-w) are here: 
http://marc.merlins.org/tmp/btrfs_hang-3.16.2.txt

I'm going to purge that fragmented vbox image from all my snapshots and reboot,
but clearly there are things that are going wrong.

Filipe, sorry for the initial bad problem report. While I can't exactly see
how it's related, it looks like btrfs receive of heavily fragmented files
can take 12h or more.
It may not be that important to fix compared to the main problems that heavy
fragmentation still causes on btrfs.

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

Re: [PATCH v3 01/12] crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code

2014-09-17 Thread Behan Webster


On 09/17/14 04:30, Herbert Xu wrote:

On Wed, Sep 17, 2014 at 02:15:40PM +0300, Dmitry Kasatkin wrote:

On 17/09/14 12:22, Herbert Xu wrote:

On Mon, Sep 15, 2014 at 12:30:23AM -0700, beh...@converseincode.com wrote:

From: Behan Webster beh...@converseincode.com

Add a macro which replaces the use of a Variable Length Array In Struct (VLAIS)
with a C99 compliant equivalent. This macro instead allocates the appropriate
amount of memory using a char array.

The new code can be compiled with both gcc and clang.

struct shash_desc contains a flexible array member ctx declared with
CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
of the array declared after struct shash_desc with long long.

No trailing padding is required because it is not a struct type that can
be used in an array.

The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long long
as would be the case for a struct containing a member with
CRYPTO_MINALIGN_ATTR.

Signed-off-by: Behan Webster beh...@converseincode.com

Acked-by: Herbert Xu herb...@gondor.apana.org.au

Thanks,

Just in case.
I would still follow advice from Michał Mirosław to use shash##__desc[]
Absolutely. I will be posting a v4 patchset. Just waiting a bit longer 
for more comments on v3.


The macro from v4 will look like this which I believe will satisfy the 
concern and indeed be safer than my previous version.


+#define SHASH_DESC_ON_STACK(shash, tfm)  \
+   char __##shash##_desc[sizeof(struct shash_desc) +\
+   crypto_shash_descsize(tfm)] CRYPTO_MINALIGN_ATTR; \
+   struct shash_desc *shash = (struct shash_desc *)__##shash##_desc

Hmm. Is it worth adding a comment with this macro explaining the reason this 
works? Essentially much of what is in the commit message?
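
As a rough illustration of why the char-array form lines up, here is a user-space sketch of the same pattern. It is not the kernel macro: struct demo_desc, DEMO_MINALIGN and DEMO_DESC_ON_STACK are invented stand-ins, the context size is passed in directly instead of coming from crypto_shash_descsize(tfm), and it assumes GNU C attributes as supported by gcc and clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for CRYPTO_MINALIGN_ATTR: align like long long. */
#define DEMO_MINALIGN __attribute__((aligned(__alignof__(long long))))

/* Stand-in for struct shash_desc: a header followed by a flexible array
 * member that carries the alignment attribute. */
struct demo_desc {
    void *tfm;
    unsigned int flags;
    unsigned char ctx[] DEMO_MINALIGN;
};

/* Same shape as the v4 macro: one aligned char array big enough for the
 * header plus the context, and a typed pointer to its start. */
#define DEMO_DESC_ON_STACK(name, ctxsize)                      \
    char name##_buf[sizeof(struct demo_desc) + (ctxsize)]      \
        DEMO_MINALIGN;                                         \
    struct demo_desc *name = (struct demo_desc *)name##_buf

/* Returns 1 if the descriptor and its ctx area are aligned like
 * long long and ctx starts exactly sizeof(struct demo_desc) bytes into
 * the buffer, which is the layout the commit message argues for. */
int demo_desc_layout_ok(size_t ctxsize)
{
    uintptr_t align = __alignof__(long long);

    assert(ctxsize <= 4096);  /* keep the on-stack array small */
    DEMO_DESC_ON_STACK(desc, ctxsize);
    memset(desc_buf, 0, sizeof(desc_buf));

    return (uintptr_t)desc % align == 0 &&
           (uintptr_t)desc->ctx % align == 0 &&
           (char *)desc->ctx == desc_buf + sizeof(struct demo_desc);
}
```

The sketch only inspects addresses. Kernel code built with -fno-strict-aliasing goes on to use the buffer through the struct pointer, which portable user-space C should be more careful about.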
 


Oh yes of course.  My ack is more about the approach.


Wonderful!

Indeed. I would have asked for you to wait for v4 anyways. :)

Thank you,

Behan

--
Behan Webster
beh...@converseincode.com


Re: Setting FS_USERNS_MOUNT in btrfs_fs_type.fs_flags

2014-09-17 Thread Zach Brown

On Wed, Sep 17, 2014 at 04:54:48AM +0100, Al Viro wrote:
 On Tue, Sep 16, 2014 at 11:05:00PM -0400, Shea Levy wrote:
  Hi all,
  
  What work would be required to mark btrfs_fs_type with FS_USERNS_MOUNT
  so that btrfs images can be mounted by unprivileged users within a user
  namespace (along with something like [1])? I'd like to be able to create
  disk images without having to start a VM (and --rootdir isn't flexible
  enough because I want to make subvolumes).
 
 Er...  Which is to say, you have an audit of btrfs code making sure that
 it can cope with arbitrary image hand-crafted by potential attacker?

It definitely can't cope. The easiest places to find bugs are the
hundreds of BUG_ON() sites, many can be triggered by on-disk structures.
The sheer volume of those makes me trust that you could find much worse
if you did a thorough audit.

- z
(fun related fact: distros automount btrfs images)

btrfs rescue super-recover memory corruption

2014-09-17 Thread Eric Sandeen

This:

# truncate --size=8g
# dd if=/dev/zero of=file conv=notrunc  bs=4 seek=16384 count=1
# valgrind ./btrfs rescue super-recover file -v

yields:

==4604== Memcheck, a memory error detector
==4604== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==4604== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==4604== Command: ./btrfs rescue super-recover file -v
==4604== 
All Devices:
Device: id = 1, name = file

Before Recovering:
[All good supers]:
device name = file
superblock bytenr = 67108864

[All bad supers]:
device name = file
superblock bytenr = 65536


Make sure this is a btrfs disk otherwise the tool will destroy other fs, Are 
you sure? [y/N]: y
Recovered bad superblocks successful
==4604== Invalid read of size 8
==4604==at 0x426B55: btrfs_recover_superblocks (list.h:204)
==4604==by 0x421C79: cmd_super_recover (cmds-rescue.c:148)
==4604==by 0x40420A: handle_command_group (btrfs.c:145)
==4604==by 0x421B54: cmd_rescue (cmds-rescue.c:162)
==4604==by 0x404199: main (btrfs.c:247)
==4604==  Address 0x4c250b0 is 48 bytes inside a block of size 96 free'd
==4604==at 0x4A063F0: free (vg_replace_malloc.c:446)
==4604==by 0x43C77E: btrfs_close_devices (volumes.c:196)
==4604==by 0x42F5D1: close_ctree (disk-io.c:1404)
==4604==by 0x426A85: btrfs_recover_superblocks (super-recover.c:340)
==4604==by 0x421C79: cmd_super_recover (cmds-rescue.c:148)
==4604==by 0x40420A: handle_command_group (btrfs.c:145)
==4604==by 0x421B54: cmd_rescue (cmds-rescue.c:162)
==4604==by 0x404199: main (btrfs.c:247)
==4604== 
==4604== Invalid free() / delete / delete[] / realloc()
==4604==at 0x4A063F0: free (vg_replace_malloc.c:446)
==4604==by 0x426B9E: btrfs_recover_superblocks (super-recover.c:85)
==4604==by 0x421C79: cmd_super_recover (cmds-rescue.c:148)
==4604==by 0x40420A: handle_command_group (btrfs.c:145)
==4604==by 0x421B54: cmd_rescue (cmds-rescue.c:162)
==4604==by 0x404199: main (btrfs.c:247)
==4604==  Address 0x4c25080 is 0 bytes inside a block of size 96 free'd
==4604==at 0x4A063F0: free (vg_replace_malloc.c:446)
==4604==by 0x43C77E: btrfs_close_devices (volumes.c:196)
==4604==by 0x42F5D1: close_ctree (disk-io.c:1404)
==4604==by 0x426A85: btrfs_recover_superblocks (super-recover.c:340)
==4604==by 0x421C79: cmd_super_recover (cmds-rescue.c:148)
==4604==by 0x40420A: handle_command_group (btrfs.c:145)
==4604==by 0x421B54: cmd_rescue (cmds-rescue.c:162)
==4604==by 0x404199: main (btrfs.c:247)
==4604== 
==4604== 
==4604== HEAP SUMMARY:
==4604== in use at exit: 0 bytes in 0 blocks
==4604==   total heap usage: 72 allocs, 73 frees, 140,384 bytes allocated
==4604== 
==4604== All heap blocks were freed -- no leaks are possible
==4604== 
==4604== For counts of detected and suppressed errors, rerun with: -v
==4604== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)

i.e. I think we are double freeing memory:

close_ctree(root); // <-- here
no_recover:
recover_err_str(ret);
free_recover_superblock(recover); // <-- and here

I can't really work out what all of this is doing, but maybe the fix is obvious
to Wang Shilong (who wrote the original code)?
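
The pattern valgrind is reporting can be reproduced in miniature. The sketch below is not btrfs-progs code and is not a proposed fix; every demo_* name is invented. It shows two cleanup paths that both hold the same allocation, and one common way to make the second path harmless: the first path clears the shared pointer after freeing it.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_devices { int opened; };

/* Invented stand-in for the recover context: it keeps a pointer to an
 * object that another layer may also tear down. */
struct demo_recover { struct demo_devices *devs; };

static int demo_free_calls;  /* counts real frees, for the demo only */

/* First cleanup path (compare close_ctree -> btrfs_close_devices in the
 * trace): free the object, then clear the caller's pointer so nobody
 * frees it a second time. */
void demo_close_devices(struct demo_devices **devs)
{
    if (*devs) {
        free(*devs);
        demo_free_calls++;
        *devs = NULL;
    }
}

/* Second cleanup path (compare free_recover_superblock): safe to call
 * whether or not the first path already ran. */
void demo_free_recover(struct demo_recover *r)
{
    demo_close_devices(&r->devs);
}

/* Runs both paths in the order shown above and returns how many times
 * the allocation was actually freed. */
int demo_run_both_paths(void)
{
    struct demo_recover r;

    demo_free_calls = 0;
    r.devs = calloc(1, sizeof(*r.devs));
    assert(r.devs != NULL);

    demo_close_devices(&r.devs);  /* "here" */
    demo_free_recover(&r);        /* "and here" */
    return demo_free_calls;
}
```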

Thanks,
-Eric

Re: btrfs rescue super-recover memory corruption

2014-09-17 Thread Eric Sandeen

On 9/17/14 12:00 PM, Eric Sandeen wrote:
 This:
 
 # truncate --size=8g

oops, s/b:

# truncate --size=8g file

 # dd if=/dev/zero of=file conv=notrunc  bs=4 seek=16384 count=1
 # valgrind ./btrfs rescue super-recover file -v


Re: NOCOW on VM images causes extreme btrfs slowdowns, memory leaks, and deadlocks

2014-09-17 Thread Marc MERLIN

On Wed, Sep 17, 2014 at 08:00:02AM -0700, Marc MERLIN wrote:
 kernel: 3.16.2

Grumble, I messed up the subject line.
I meant that COW on virtual disk images causes the problems described in the
previous email.
Obviously I already knew that COW on disk images isn't ideal, but the
effects in 3.16 are still fairly severe.

Marc
-- 
A mouse is a device used to point at the xterm you want to type in - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

Can btrfs re-sync an out-of-sync RAID1 filesystem?

2014-09-17 Thread Alan Hagge

I know that sounds weird, but here's my scenario:

- Create a RAID1 filesystem (both data and metadata) using 2 same-sized
external USB drives
- Copy data (backup of other filesystem) onto this new filesystem
- Dismount the filesystem
- Split up the drives (keep one at home, move one to offsite backup)

This way if I need to recover a file, I can mount the one drive I have
with -o ro,degraded to recover data.  If there's a read error on the
backup drive during the copy, I can go to the offsite location, bring
back the 2nd drive and mount both and have RAID1 protection.

BUT...if I accidentally (because I forgot to use ro when mounting) or
purposely write data to the single drive in degraded mode,  is it
possible to later mount both drives in RAID1 mode and resync them (as
opposed to having to do a replace operation on the out-of-sync drive,
which would force it to be completely rewritten)?  If so, how would
btrfs know which drive is the master (ie. the updated one)?

Or is it not possible to write to a btrfs volume mounted in degraded mode?

Re: btrfs balance enospc

2014-09-17 Thread Mark Murawski

 Does/should a balance imply removal of missing devices (as long as 
the minimum number of devices are still available)?


That's a really good question.  As a user I would expect it to balance 
over remaining devices assuming you still have a complete picture. 
Doing a device delete missing after a balance should be just some pool 
metadata updates at that point.


Anyway... I solved my problem by moving/deleting files to free up space 
to the point that balance no longer complained about enospc.


I suppose btrfs needs extra working space to do a balance... above and 
beyond the actual size of the existing data/metadata to be moved?  I had 
a total of three devices, with what appeared to be plenty of space on 
the two that were to remain, but balance/remove was still 
complaining about being out of disk space.


It would be a good idea for some metrics to be calculated at the start of 
a removal or balance to tell the user "hey, you need to free up XXX more 
bytes in order for this operation to be successful."



Re: Problem with a filesystem

2014-09-17 Thread Chris Murphy


On Sep 17, 2014, at 3:06 AM, Tovo Rabemanantsoa 
tovo.rabemanant...@bordeaux.inra.fr wrote:
 
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548215] parent transid verify
 failed on 24562568384512 wanted 255444 found 255446
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548229] parent transid verify
 failed on 24562568384512 wanted 255444 found 255446
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548237] parent transid verify
 failed on 24562568384512 wanted 255444 found 255446
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548243] parent transid verify
 failed on 24562568384512 wanted 255444 found 255446
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548249] parent transid verify
 failed on 24562568384512 wanted 255444 found 255446
 Sep 17 11:05:55 sdeeph1 kernel: [85381.670344] btrfs: open_ctree failed
 
 /var/log/syslog adds this line :
 Sep 17 11:05:55 sdeeph1 kernel: [85381.548255] Failed to read block
 groups: -5

This isn't what I'd expect of a candidate for btrfs-zero-log, and I don't know 
what the last message means. You could run 
btrfs check btrfsdev
and then ask about both the "Failed to read block groups: -5" message and the 
btrfs check (without --repair) results.

Or take a leap of faith and zero the log, but take an image of the file system 
first, as zeroing the log might make recovery more difficult. If there are 
things you really need off this file system, you should first look at 
https://btrfs.wiki.kernel.org/index.php/Restore 
before zeroing the log.

btrfs-image -c 9 -t 8 btrfsdev /path/to/file
btrfs-zero-log btrfsdev
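
For readers unfamiliar with the log lines quoted above, here is a minimal
sketch of the kind of check that produces "parent transid verify failed".
It assumes that a parent tree node records the generation (transid) it
expects of a child block, and that the reader compares this with the
generation stored in the child's own header. The struct and function names
are hypothetical and simplified, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified view of a tree block header. */
struct block_header {
	uint64_t bytenr;     /* logical address of the block */
	uint64_t generation; /* transid in which the block was written */
};

/* Returns 0 when the child matches what the parent recorded, -1 otherwise. */
static int verify_parent_transid(const struct block_header *child,
				 uint64_t wanted)
{
	if (child->generation == wanted)
		return 0;
	fprintf(stderr,
		"parent transid verify failed on %llu wanted %llu found %llu\n",
		(unsigned long long)child->bytenr,
		(unsigned long long)wanted,
		(unsigned long long)child->generation);
	return -1;
}
```

With the values from the log (block 24562568384512, wanted 255444, found
255446) the check fails: under this model the block on disk was written in a
later transaction than the one its parent pointer refers to.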


Chris Murphy

Re: Problem with unmountable filesystem.

2014-09-17 Thread Chris Murphy


On Sep 17, 2014, at 5:23 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote:
 
 Thanks for all the help.

Well, it's not much help. It seems possible to corrupt a primary superblock 
that points to a corrupt tree root, and use btrfs rescue super-recover to 
replace it, and then mount should work. One thing I didn't try was corrupting 
the primary superblock and just mounting normally or with recovery, to see if 
it'll automatically ignore the primary superblock and use the backup.

But I think you're onto something, that a good superblock can point to a 
corrupt tree root, and then not have a straightforward way to mount the good 
tree root. If I understand this correctly.
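
As background for the primary/backup discussion, the following sketch
computes where the superblock copies live. It relies on the commonly
documented btrfs layout (primary at 64KiB, mirrors at 64MiB and 256GiB); the
macro and function names are simplified, hypothetical versions rather than
the btrfs-progs identifiers, and the "copy must fit on the device" rule is
an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SUPER_INFO_OFFSET  (64ULL * 1024) /* primary copy at 64KiB */
#define SUPER_INFO_SIZE    4096ULL
#define SUPER_MIRROR_MAX   3
#define SUPER_MIRROR_SHIFT 12

/* Byte offset of superblock copy `mirror` (0 is the primary). */
static uint64_t sb_offset(int mirror)
{
	assert(mirror >= 0 && mirror < SUPER_MIRROR_MAX);
	if (mirror == 0)
		return SUPER_INFO_OFFSET;
	return (16ULL * 1024) << (SUPER_MIRROR_SHIFT * mirror);
}

/* Assumption: a copy is only present if it fits entirely on the device. */
static int sb_copy_exists(int mirror, uint64_t device_size)
{
	return sb_offset(mirror) + SUPER_INFO_SIZE <= device_size;
}
```

On an 8GiB device only the 64KiB and 64MiB copies fit, so a recovery tool
has a single backup to fall back on there; the 256GiB copy exists only on
larger devices.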


Chris Murphy


Re: Can btrfs re-sync an out-of-sync RAID1 filesystem?

2014-09-17 Thread Piotr Szymaniak

On Wed, Sep 17, 2014 at 10:13:01AM -0700, Alan Hagge wrote:
 I know that sounds weird, but here's my scenario:

There was a similar thread [1] a few days ago; you should take a look at it.

[1] https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg37144.html


Piotr Szymaniak.
-- 
And that stench. The devil knows what had been rotting in that meat, but
it seemed to Red that a hundred thousand smashed, stinking eggs poured over
a hundred thousand stinking fish heads and dead cats could not stink
the way that slime stank.
 -- Arkady and Boris Strugatsky, "Roadside Picnic"



Re: btrfs balance enospc

2014-09-17 Thread Chris Murphy


On Sep 17, 2014, at 11:51 AM, Mark Murawski markm-li...@intellasoft.net wrote:

  Does/should a balance imply removal of missing devices (as long as the 
  minimum number of devices are still available)?
 
 That's a really good question.  As a user I would expect it to balance over 
 remaining devices assuming you still have a complete picture. Doing a device 
 delete missing after a balance should be just some pool metadata updates at 
 that point.
 
 Anyway... I solved my problem by moving/deleting files to free up space to 
 the point that balance no longer complained about enospc.

Another option in such a case is to add a new device. It can be small; even a 
2GB loop device or USB stick would do in a pinch. Then delete the device when 
you're done.


Chris Murphy


Re: Can btrfs re-sync an out-of-sync RAID1 filesystem?

2014-09-17 Thread Duncan

Alan Hagge posted on Wed, 17 Sep 2014 10:13:01 -0700 as excerpted:

 I know that sounds weird, but here's my scenario:
 
 - Create a RAID1 filesystem (both data and metadata) using 2 same-sized
 external USB drives - Copy data (backup of other filesystem) onto this
 new filesystem - Dismount the filesystem - Split up the drives (keep one
 at home, move one to offsite backup)
 
 This way if I need to recover a file, I can mount the one drive I have
 with -o ro,degraded to recover data.  If there's a read error on the
 backup drive during the copy, I can go to the offsite location, bring
 back the 2nd drive and mount both and have RAID1 protection.
 
 BUT...if I accidentally (because I forgot to use ro when mounting) or
 purposely write data to the single drive in degraded mode,  is it
 possible to later mount both drives in RAID1 mode and resync them (as
 opposed to having to do a replace operation on the out-of-sync drive,
 which would force it to be completely rewritten)?  If so, how would
 btrfs know which drive is the master (ie. the updated one)?
 
 Or is it not possible to write to a btrfs volume mounted in degraded
 mode?

In general this works.  However, as the thread linked in Piotr's reply 
mentions, don't expect it to work for NOCOW files.  (But files are only NOCOW 
if you explicitly make them so; if you haven't, that shouldn't be an issue.)

Additionally, the newest version detection is based on file generation, 
so you want to be SURE you don't separately mount first one device 
writable and then the other, so they diverge not only from each other but 
from the point of separation.  If you split them, make SURE only the one 
is mounted writable, so it'll be very clear which one has the updated 
content.  If you DO accidentally separately mount both of them writable, 
I'd suggest using wipefs (or dd) to kill the btrfs magic on one of them 
so there's no possibility of btrfs getting mixed up which is newer, and 
then doing a btrfs replace to replace it with a new, current copy.

Meanwhile, again as noted in Piotr's linked thread, the detection and 
update isn't automatic.  Once split with one mounted writable, when 
rejoined you'll want to do a btrfs scrub to bring the other one current.
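
To make the generation-based detection concrete, here is a small sketch. It
assumes each member device carries a generation counter that increases with
every committed transaction; the struct, the function names and the device
paths in the usage note are hypothetical, and this is not how btrfs
implements the comparison internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one member device's superblock. */
struct dev_super {
	const char *path;
	uint64_t generation; /* transid of the last committed transaction */
};

/* Returns the index of the device with the highest generation, or -1
 * when the array is empty. Ties resolve to the first such device. */
static int newest_device(const struct dev_super *devs, size_t n)
{
	size_t i;
	int best = -1;

	for (i = 0; i < n; i++) {
		if (best < 0 || devs[i].generation > devs[best].generation)
			best = (int)i;
	}
	return best;
}

/* Returns 1 when at least one device lags behind the newest one. */
static int needs_resync(const struct dev_super *devs, size_t n)
{
	int best = newest_device(devs, n);
	size_t i;

	if (best < 0)
		return 0;
	for (i = 0; i < n; i++) {
		if (devs[i].generation < devs[best].generation)
			return 1;
	}
	return 0;
}
```

For example, with /dev/sdb at generation 100 and /dev/sdc at 107, the second
device is the newest and a resync is needed. Note the limitation this makes
visible: if both halves were mounted writable separately, each has a
generation higher than at the split, and comparing the two numbers says
nothing about which content is right. That is the reason for the warning
above about never mounting both halves writable.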

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: Problem with unmountable filesystem.

2014-09-17 Thread Duncan

Chris Murphy posted on Wed, 17 Sep 2014 12:57:59 -0600 as excerpted:

 But I think you're onto something, that a good superblock can point to a
 corrupt tree root, and then not have a straight forward way to mount the
 good tree root. If I understand this correctly.

This is what I ran into myself a couple months ago.

I ended up doing a restore as the simplest recovery I could do, then 
recreated the affected filesystems.  It happened on two filesystems 
(separate, I don't trust subvolumes precisely because it's the same 
filesystem underneath and that's too many eggs in one basket for my 
comfort), /var/log and /home.  I was able to restore nearly all of /home 
(enough that I wasn't aware of anything missing except symlinks that 
restore doesn't work with), but lost about half of /var/log, which I did 
first and made a couple mistakes on that I didn't repeat on /home.

I figured there was a way to fix the filesystems as they were, but these 
threads about deliberately corrupting the first superblock in order to 
use the others hadn't appeared yet, and between restore and backups I got 
enough back not to be worried about the rest, so restore was good 
enough.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: Problem with unmountable filesystem.

2014-09-17 Thread Duncan

Austin S Hemmelgarn posted on Wed, 17 Sep 2014 07:23:46 -0400 as
excerpted:

 I've also discovered, when trying to use btrfs restore to copy out the
 data to a different system, that 3.14.1 restore apparently chokes on
 filesystem that have lzo compression turned on.  It's reporting errors
 trying to inflate compressed files, and I know for a fact that none of
 those files were even open, let alone being written to, when the system
 crashed.  I don't know if this is a known bug or even if it is still the
 case with btrfs-progs 3.16, but I figured I'd comment about it because I
 haven't seen anything about it anywhere.

FWIW that's a known and recently patched issue.  If you're still seeing 
issues with it with btrfs-progs 3.16, report it, but 3.14.1 almost 
certainly wouldn't have had the fix.  (This is one related patch turned 
up by a quick search; there may be others.)

* commit 93ebec96f2ae1d3276ebe89e2d6188f9b46692fb
| Author: Vincent Stehlé vincent.ste...@laposte.net
| Date:   Wed Jun 18 18:51:19 2014 +0200
|
| btrfs-progs: restore: check lzo compress length
|
| When things go wrong for lzo-compressed btrfs, feeding
| lzo1x_decompress_safe() with corrupt data during restore
| can lead to crashes. Reduce the risk by adding
| a check on the input length.
|
| Signed-off-by: Vincent Stehlé vincent.ste...@laposte.net
| Signed-off-by: David Sterba dste...@suse.cz
|
|  cmds-restore.c | 6 ++
|  1 file changed, 6 insertions(+)

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Re: Setting FS_USERNS_MOUNT in btrfs_fs_type.fs_flags

2014-09-17 Thread Shea Levy

On Wed, Sep 17, 2014 at 09:12:14AM -0700, Zach Brown wrote:
 On Wed, Sep 17, 2014 at 04:54:48AM +0100, Al Viro wrote:
  On Tue, Sep 16, 2014 at 11:05:00PM -0400, Shea Levy wrote:
   Hi all,
   
   What work would be required to mark btrfs_fs_type with FS_USERNS_MOUNT
   so that btrfs images can be mounted by unprivileged users within a user
   namespace (along with something like [1])? I'd like to be able to create
   disk images without having to start a VM (and --rootdir isn't flexible
   enough because I want to make subvolumes).
  
  Er...  Which is to say, you have an audit of btrfs code making sure that
  it can cope with arbitrary image hand-crafted by potential attacker?
 
 It definitely can't cope. The easiest places to find bugs are the
 hundreds of BUG_ON() sites, many can be triggered by on-disk structures.
 The sheer volume of those makes me trust that you could find much worse
 if you did a thorough audit.
 
 - z
 (fun related fact: distros automount btrfs images)

OK, so it seems like the answer to my question is "a helluva lot". Guess
I won't count on seeing it any time soon :)
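
To illustrate the difference between asserting on on-disk data and
validating it, here is a small sketch of a bounds check that returns an
error instead of crashing. The struct and function are hypothetical and do
not reflect the real btrfs item layout; the point is only that a value read
from an untrusted image has to be checked, with overflow in mind, before it
is used.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical on-disk item descriptor: where the item's data lives
 * inside a leaf of `leaf_size` bytes. Not the real btrfs layout. */
struct disk_item {
	uint32_t offset;
	uint32_t size;
};

/* Returns 0 if the item lies fully inside the leaf, -EIO otherwise.
 * The sum is done in 64 bits so a crafted offset/size cannot wrap. */
static int check_item(const struct disk_item *item, uint32_t leaf_size)
{
	uint64_t end = (uint64_t)item->offset + item->size;

	if (end > leaf_size)
		return -EIO;
	return 0;
}
```

A crafted item such as offset 0xFFFFFFF0 with size 0x20 would wrap to a
small value in 32-bit arithmetic and pass a naive check; here it is rejected
with -EIO. Doing this for every field at every site is the audit work being
discussed.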

Thanks,
Shea

Re: btrfs balance enospc

2014-09-17 Thread Duncan

Mark Murawski posted on Wed, 17 Sep 2014 13:51:51 -0400 as excerpted:

 Does/should a balance imply removal of missing devices (as long as
 the minimum number of devices are still available)?
 
 That's a really good question.  As a user I would expect it to balance
 over remaining devices assuming you still have a complete picture. Doing
 a device delete missing after a balance should be just some pool
 metadata updates at that point.

A balance does not imply removal of missing devices.  And at this point 
I'd say it shouldn't, tho perhaps some day after the code is somewhat 
more stable it could.

In fact, until recently kernelspace btrfs (which does all the work in a 
balance, userspace is simply the way you tell it what to do) didn't even 
properly detect dynamically added/removed devices, resulting in 
definitely unintuitive behavior where the balance would still queue up 
chunks to be rewritten to the missing device, that would obviously never 
be written because the device was missing and wasn't coming back! (!!!)

AFAIK (I'm a sysadmin and list regular, not a developer) that arguably 
pathological behavior has been fixed now, at least in theory, and the 
kernel should properly detect missing devices and should no longer try to 
write to them when doing a balance, so now, at least in theory and 
assuming good copies of all data and metadata on the remaining device 
from the original pair, a balance to it and a just added device in raid1 
mode should leave only the device metadata for btrfs device delete 
missing to fix up afterward.

However, as of now, there are still at least two bug reports being traced 
down in the dynamic device detection code (see the current thread where 
btrfs fi show on a two-device filesystem is pointing to the wrong place 
for one of the devices, and another where show says a device is missing, 
that isn't), and possibly others yet to be found, so it's not yet a good 
idea to have btrfs doing automatic device delete missing on balance.  
After the bug fixes are in and the code churn in that area calmed down 
for a couple kernel cycles, perhaps then we can debate whether a balance 
should automatically delete missing devices when appropriate, or not, but 
certainly now isn't the time.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman


Rockstor 3.0 now available

2014-09-17 Thread Suman C

Hello everyone,

Some of you know about Rockstor already from an email I sent several
weeks ago. For others, it's the BTRFS-powered NAS solution we've
developed, and we have just made the 3.0 release available.

Besides software updates, we've also improved the distribution
infrastructure and the installer, so installation is now fast and easy
rather than painfully slow, as some users complained about with the
earlier release.

We have a lot of work ahead and I'd greatly appreciate any comments
you may have. Please feel free to contact me.

Here's the direct download link:
https://sourceforge.net/projects/rockstor/files/latest/download

Here's our website: http://rockstor.com/

Thanks

[PATCH 2/3] btrfs-progs: remove scan_for_btrfs()

2014-09-17 Thread Gui Hecheng

From: Anand Jain anand.j...@oracle.com

With the changes as in the previous patch, now scan_for_btrfs()
is an unused function. So delete it.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 utils.c | 15 ---
 utils.h |  1 -
 2 files changed, 16 deletions(-)

diff --git a/utils.c b/utils.c
index 0ae0475..0cd97c7 100644
--- a/utils.c
+++ b/utils.c
@@ -2173,21 +2173,6 @@ int btrfs_scan_lblkid(int update_kernel)
return 0;
 }
 
-/*
- * scans devs for the btrfs
-*/
-int scan_for_btrfs(int where, int update_kernel)
-{
-   int ret = 0;
-
-   switch (where) {
-   case BTRFS_SCAN_LBLKID:
-   ret = btrfs_scan_lblkid(update_kernel);
-   break;
-   }
-   return ret;
-}
-
 int is_vol_small(char *file)
 {
int fd = -1;
diff --git a/utils.h b/utils.h
index 13f2e60..12466a6 100644
--- a/utils.h
+++ b/utils.h
@@ -105,7 +105,6 @@ u64 btrfs_device_size(int fd, struct stat *st);
 /* Helper to always get proper size of the destination string */
 #define strncpy_null(dest, src) __strncpy__null(dest, src, sizeof(dest))
 int test_dev_for_mkfs(char *file, int force_overwrite, char *estr);
-int scan_for_btrfs(int where, int update_kernel);
 int get_label_mounted(const char *mount_path, char *labelp);
 int test_num_disk_vs_raid(u64 metadata_profile, u64 data_profile,
u64 dev_cnt, int mixed, char *estr);
-- 
1.8.1.4


[PATCH 1/3] btrfs-progs: remove BTRFS_SCAN_PROC scan method

2014-09-17 Thread Gui Hecheng

From: Anand Jain anand.j...@oracle.com

The libblkid scan method, which was introduced later, also
scans the devices listed in /proc/partitions, so we don't have to do
an explicit scan of the same devices.

Remove the scan method BTRFS_SCAN_PROC.

Signed-off-by: Anand Jain anand.j...@oracle.com
---
 cmds-device.c |  5 ++---
 cmds-filesystem.c | 10 +-
 disk-io.c |  2 +-
 utils.c   |  5 +
 utils.h   |  5 ++---
 5 files changed, 11 insertions(+), 16 deletions(-)

diff --git a/cmds-device.c b/cmds-device.c
index a7183e3..a728f21 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -200,13 +200,13 @@ static int cmd_rm_dev(int argc, char **argv)
 static const char * const cmd_scan_dev_usage[] = {
"btrfs device scan [(-d|--all-devices)|<device> [<device>...]]",
"Scan devices for a btrfs filesystem",
+"-d|--all-devices (deprecated)",
NULL
 };
 
 static int cmd_scan_dev(int argc, char **argv)
 {
int i, fd, e;
-   int where = BTRFS_SCAN_LBLKID;
int devstart = 1;
int all = 0;
int ret = 0;
@@ -224,7 +224,6 @@ static int cmd_scan_dev(int argc, char **argv)
break;
switch (c) {
case 'd':
-   where = BTRFS_SCAN_PROC;
all = 1;
break;
default:
@@ -237,7 +236,7 @@ static int cmd_scan_dev(int argc, char **argv)
 
if (all || argc == 1) {
printf("Scanning for Btrfs filesystems\n");
-   ret = scan_for_btrfs(where, BTRFS_UPDATE_KERNEL);
+   ret = btrfs_scan_lblkid(BTRFS_UPDATE_KERNEL);
if (ret)
fprintf(stderr, "ERROR: error %d while scanning\n", ret);
goto out;
diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 69c1ca5..dc5185e 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -505,7 +505,7 @@ static int cmd_show(int argc, char **argv)
struct list_head *cur_uuid;
char *search = NULL;
int ret;
-   int where = BTRFS_SCAN_LBLKID;
+   int where = -1; // default, search both kernel and udev
int type = 0;
char mp[BTRFS_PATH_NAME_MAX + 1];
char path[PATH_MAX];
@@ -526,7 +526,7 @@ static int cmd_show(int argc, char **argv)
break;
switch (c) {
case 'd':
-   where = BTRFS_SCAN_PROC;
+   where = BTRFS_SCAN_LBLKID;
break;
case 'm':
where = BTRFS_SCAN_MOUNTED;
@@ -550,7 +550,7 @@ static int cmd_show(int argc, char **argv)
 * right away
 */
if (type == BTRFS_ARG_BLKDEV) {
-   if (where == BTRFS_SCAN_PROC) {
+   if (where == BTRFS_SCAN_LBLKID) {
/* we need to do this because
 * legacy BTRFS_SCAN_DEV
 * provides /dev/dm-x paths
@@ -586,7 +586,7 @@ static int cmd_show(int argc, char **argv)
}
}
 
-   if (where == BTRFS_SCAN_PROC)
+   if (where == BTRFS_SCAN_LBLKID)
goto devs_only;
 
/* show mounted btrfs */
@@ -601,7 +601,7 @@ static int cmd_show(int argc, char **argv)
goto out;
 
 devs_only:
-   ret = scan_for_btrfs(where, !BTRFS_UPDATE_KERNEL);
+   ret = btrfs_scan_lblkid(!BTRFS_UPDATE_KERNEL);
 
if (ret) {
fprintf(stderr, "ERROR: %d while scanning\n", ret);
diff --git a/disk-io.c b/disk-io.c
index c7901f4..9fe8769 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -1005,7 +1005,7 @@ int btrfs_scan_fs_devices(int fd, const char *path,
}
 
if (total_devs != 1) {
-   ret = scan_for_btrfs(BTRFS_SCAN_PROC, run_ioctl);
+   ret = btrfs_scan_lblkid(run_ioctl);
if (ret)
return ret;
}
diff --git a/utils.c b/utils.c
index fb78dd6..0ae0475 100644
--- a/utils.c
+++ b/utils.c
@@ -1150,7 +1150,7 @@ int check_mounted_where(int fd, const char *file, char 
*where, int size,
 
/* scan other devices */
if (is_btrfs && total_devs > 1) {
-   if ((ret = scan_for_btrfs(BTRFS_SCAN_PROC, 
!BTRFS_UPDATE_KERNEL)))
+   if ((ret = btrfs_scan_lblkid(!BTRFS_UPDATE_KERNEL)))
return ret;
}
 
@@ -2181,9 +2181,6 @@ int scan_for_btrfs(int where, int update_kernel)
int ret = 0;
 
switch (where) {
-   case BTRFS_SCAN_PROC:
-   ret = btrfs_scan_block_devices(update_kernel);
-   break;
case BTRFS_SCAN_LBLKID:
ret = btrfs_scan_lblkid(update_kernel);
break;
diff --git a/utils.h b/utils.h
index 01b3259..13f2e60 100644
--- a/utils.h
+++ b/utils.h
@@ -26,9 +26,8 @@
 #define BTRFS_MKFS_SYSTEM_GROUP_SIZE (4 * 1024 * 1024)
 #define

[PATCH 3/3] btrfs-progs: fix device missing of btrfs fi show with seeding devices

2014-09-17 Thread Gui Hecheng

*Note*: this handles the problem in the unmounted state;
the problem in the mounted state has already been fixed by Anand.

Steps to reproduce:
# mkfs.btrfs -f /dev/sda1
# btrfstune -S 1 /dev/sda1
# mount /dev/sda1 /mnt
# btrfs dev add /dev/sda2 /mnt
# umount /mnt   == (umounted)
# btrfs fi show /dev/sda2
result:
Label: none  uuid: XX
Total devices 2 FS bytes used 368.00KiB
devid    2 size 9.31GiB used 1.25GiB path /dev/sda2
*** Some devices missing
Btrfs v3.16-67-g69f54ea-dirty

This is because the @btrfs_scan_lblkid procedure is not capable of detecting
seeding devices, since the seeding devices have different FSIDs from the
derived devices. So when it tries to show all devices under the derived
fs, only the derived devices are shown.
However, @open_ctree deals with the seeding devices properly, so
we can make use of it to find seeding devices.
We call @open_ctree on every block device with a btrfs on it,
and all devices under the opened filesystem, including the seed devices,
will be ready to be shown.

Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
 cmds-filesystem.c | 104 --
 1 file changed, 69 insertions(+), 35 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index dc5185e..f978175 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -28,6 +28,7 @@
 #include <mntent.h>
 #include <linux/limits.h>
 #include <getopt.h>
+#include <blkid/blkid.h>
 
 #include "kerncompat.h"
 #include "ctree.h"
@@ -268,10 +269,26 @@ static int cmp_device_id(void *priv, struct list_head *a,
da->devid > db->devid ? 1 : 0;
 }
 
+static void print_devices(struct btrfs_fs_devices *fs_devices, u64 *devs_found)
+{
+   struct btrfs_device *device;
+   struct list_head *cur;
+
+   list_sort(NULL, &fs_devices->devices, cmp_device_id);
+   list_for_each(cur, &fs_devices->devices) {
+   device = list_entry(cur, struct btrfs_device, dev_list);
+
+   printf("\tdevid %4llu size %s used %s path %s\n",
+   (unsigned long long)device->devid,
+   pretty_size(device->total_bytes),
+   pretty_size(device->bytes_used), device->name);
+   (*devs_found)++;
+   }
+}
+
 static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
 {
char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
-   struct list_head *cur;
struct btrfs_device *device;
u64 devs_found = 0;
u64 total;
@@ -293,17 +310,10 @@ static void print_one_uuid(struct btrfs_fs_devices 
*fs_devices)
   (unsigned long long)total,
   pretty_size(device->super_bytes_used));
 
-   list_sort(NULL, &fs_devices->devices, cmp_device_id);
-   list_for_each(cur, &fs_devices->devices) {
-   device = list_entry(cur, struct btrfs_device, dev_list);
-
-   printf("\tdevid %4llu size %s used %s path %s\n",
-  (unsigned long long)device->devid,
-  pretty_size(device->total_bytes),
-  pretty_size(device->bytes_used), device->name);
+   if (fs_devices->seed)
+   print_devices(fs_devices->seed, &devs_found);
+   print_devices(fs_devices, &devs_found);
 
-   devs_found++;
-   }
if (devs_found < total) {
printf("\t*** Some devices missing\n");
}
@@ -489,6 +499,53 @@ out:
return ret;
 }
 
+static int scan_all_fs_lblkid(char *search_target)
+{
+   blkid_dev_iterate iter = NULL;
+   blkid_dev dev = NULL;
+   blkid_cache cache = NULL;
+   char path[PATH_MAX];
+   struct btrfs_fs_info *fs_info;
+   int found = 0;
+
+   if (blkid_get_cache(&cache, 0) < 0) {
+   printf("ERROR: lblkid cache get failed\n");
+   return -1;
+   }
+   blkid_probe_all(cache);
+   iter = blkid_dev_iterate_begin(cache);
+   blkid_dev_set_search(iter, "TYPE", "btrfs");
+   while (blkid_dev_next(iter, &dev) == 0) {
+   dev = blkid_verify(cache, dev);
+   if (!dev)
+   continue;
+   strncpy(path, blkid_dev_devname(dev), PATH_MAX);
+   fs_info = open_ctree_fs_info(path, 0, 0, OPEN_CTREE_PARTIAL);
+   if (!fs_info)
+   continue;
+
+   if (search_target
+&& !uuid_search(fs_info->fs_devices, search_target)) {
+   close_ctree(fs_info->fs_root);
+   continue;
+   }
+
+   if (search_target)
+   found = 1;
+   print_one_uuid(fs_info->fs_devices);
+
+   close_ctree(fs_info->fs_root);
+   }
+   blkid_dev_iterate_end(iter);
+   blkid_put_cache(cache);
+
+   if (search_target && !found)
+   return 1;
+
+   return 0;
+}
+
+

[PATCH] btrfs-progs: fix page align issue for lzo compress in restore

2014-09-17 Thread Gui Hecheng

When running restore on lzo-compressed data, bad compress length
problems are encountered.
This is because there is a page alignment problem in @decompress_lzo,
as follows:
|--| ||-| |--|...|--|
  page ^page   page
   |
  3 bytes left

When lzo compresses pages in RAM, it ensures that
the 4-byte len is stored in one page as a whole.
So when 3 (or fewer) bytes are left
at the end of a page, the 4-byte len is
stored at the start of the next page.
But @decompress_lzo doesn't go to the start of
the next page; it continues to read the next 4 bytes,
which straddle two pages, so a random value is fetched
as a bad compress length.

So we just switch to the page-aligned start position to read
the len of the next piece of data when a bad compress length is encountered.
If we still get a bad compress length in this case, then it is a
real bad compress length, and we report an error.

Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
 cmds-restore.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/cmds-restore.c b/cmds-restore.c
index 38a131e..8b230ab 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -57,6 +57,9 @@ static int dry_run = 0;
 
 #define LZO_LEN 4
 #define PAGE_CACHE_SIZE 4096
+#define PAGE_CACHE_MASK (~(PAGE_CACHE_SIZE - 1))
+#define PAGE_CACHE_ALIGN(addr) (((addr) + PAGE_CACHE_SIZE - 1) \
+& PAGE_CACHE_MASK)
 #define lzo1x_worst_compress(x) ((x) + ((x) / 16) + 64 + 3)
 
 static int decompress_zlib(char *inbuf, char *outbuf, u64 compress_len,
@@ -101,6 +104,8 @@ static int decompress_lzo(unsigned char *inbuf, char 
*outbuf, u64 compress_len,
size_t out_len = 0;
size_t tot_len;
size_t tot_in;
+   size_t tot_in_aligned;
+   int aligned = 0;
int ret;
 
ret = lzo_init();
@@ -117,6 +122,20 @@ static int decompress_lzo(unsigned char *inbuf, char 
*outbuf, u64 compress_len,
in_len = read_compress_length(inbuf);
 
if ((tot_in + LZO_LEN + in_len) > tot_len) {
+   /*
+* The LZO_LEN bytes is guaranteed to be
+* in one page as a whole, so if a page
+* has fewer than LZO_LEN bytes left,
+* the LZO_LEN bytes should be fetched
+* at the start of the next page
+*/
+   if (!aligned) {
+   tot_in_aligned = PAGE_CACHE_ALIGN(tot_in);
+   inbuf += (tot_in_aligned - tot_in);
+   tot_in = tot_in_aligned;
+   aligned = 1;
+   continue;
+   }
fprintf(stderr, "bad compress length %lu\n",
(unsigned long)in_len);
return -1;
@@ -137,6 +156,7 @@ static int decompress_lzo(unsigned char *inbuf, char 
*outbuf, u64 compress_len,
outbuf += new_len;
inbuf += in_len;
tot_in += in_len;
+   aligned = 0;
}
 
*decompress_len = out_len;
-- 
1.8.1.4
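
The alignment rule described in the commit message above can be sketched as
a standalone helper. Note that this is a different formulation from the
patch: the patch retries at the page-aligned position only after a bad
length is read, whereas this sketch checks up front how many bytes are left
in the page. The helper name next_len_offset is hypothetical; the macros
mirror the ones the patch adds.

```c
#include <assert.h>
#include <stddef.h>

#define LZO_LEN 4
#define PAGE_CACHE_SIZE 4096
#define PAGE_CACHE_MASK (~((size_t)PAGE_CACHE_SIZE - 1))
#define PAGE_CACHE_ALIGN(off) (((off) + PAGE_CACHE_SIZE - 1) & PAGE_CACHE_MASK)

/* Returns the input offset at which the next 4-byte segment length is
 * stored, given that tot_in bytes have been consumed so far. If fewer
 * than LZO_LEN bytes remain in the current page, the length was written
 * at the start of the next page instead. */
static size_t next_len_offset(size_t tot_in)
{
	size_t left_in_page = PAGE_CACHE_SIZE - (tot_in % PAGE_CACHE_SIZE);

	if (left_in_page < LZO_LEN)
		return PAGE_CACHE_ALIGN(tot_in);
	return tot_in;
}
```

With 3 bytes left in the page (tot_in = 4093) the length is read from offset
4096; with exactly 4 bytes left (tot_in = 4092) it still fits and is read in
place.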


Re: fs corruption report

2014-09-17 Thread Gui Hecheng

On Mon, 2014-09-01 at 15:25 +, Zooko Wilcox-OHearn wrote:
 I'm more than happy to try out patches and even focus my own brain on
 diagnosing it, if I can. I'm hoping to regain access to some of my
 files on my btrfs partition, and also I would enjoy helping get this
 improved. :-)
 
 So if you want me to try an experiment, just email me. Unfortunately I
 can't just give you a copy of the partition, since it has confidential
 information on it.
 
 Regards,
 
 Zooko

Hi, Zooko, Marc,
I'm glad that I can give some feedback.
The following patch deals with the lzo decompress problem in restore:

https://patchwork.kernel.org/patch/4928831/

Hope that it helps.

-Gui


[PATCH 2/2] btrfs: Add support for nocow write into prealloc space with compression

2014-09-17 Thread Qu Wenruo

********************************************************
* WARNING: an on-disk data format change is introduced *
********************************************************

Before this patch, preallocated space is never used when compression is
enabled: any compressed write into it is COWed.

If we simply split the prealloc space and did a normal compressed write
into the preallocated space,
a new backref would be created for each compressed write, which may
lead to a performance regression.

So, to keep the behavior close to that of an uncompressed prealloc write,
which adds no new backref for the write and only increments
the original backref, we must keep disk_bytenr/disk_num_bytes
the same as the prealloc range.
Because of this restriction, we need to introduce two new members in
btrfs_file_extent_item to record where the real compressed data lies:
1. data_offset
The offset from the start of the prealloc range to the on-disk data.

2. data_len
The length of the on-disk compressed data.

The other members keep the behavior of an uncompressed nocow write into
a prealloc range.

After a compressed write into a prealloc range, the new
btrfs_file_extent_item looks like the following:

0       4K      8K      12K     16K             32K   <- file offset
        |<------compressed----->|
        |disk_bytenr: A         \
        |disk_num_bytes: 32K     > Same behavior as uncompressed write
        |offset: 4K             /
        |data_offset: 4K
        |data_len: 4K
        |ram: 12K
        |<----->|                                     <- On disk data
|<--------------------------------------------->|
A       +4K     +8K     +12K    +16K            +32K  <- disk bytenr
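
As a rough illustration of how data_offset and data_len locate the
compressed bytes inside the preallocated range, here is a minimal C sketch.
The struct and function names are hypothetical and are not part of the
patch; only the four field names come from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory mirror of the fields discussed above. */
struct prealloc_extent {
	uint64_t disk_bytenr;    /* start of the preallocated range on disk */
	uint64_t disk_num_bytes; /* length of the preallocated range */
	uint64_t data_offset;    /* compressed data start, relative to disk_bytenr */
	uint64_t data_len;       /* length of the compressed data on disk */
};

/* First on-disk byte that holds compressed data. */
static uint64_t compressed_start(const struct prealloc_extent *e)
{
	/* The compressed data has to lie inside the preallocated range. */
	assert(e->data_offset + e->data_len <= e->disk_num_bytes);
	return e->disk_bytenr + e->data_offset;
}

/* One past the last on-disk byte that holds compressed data. */
static uint64_t compressed_end(const struct prealloc_extent *e)
{
	return compressed_start(e) + e->data_len;
}
```

With the values from the diagram (disk_num_bytes 32K, data_offset 4K,
data_len 4K), the compressed data would occupy [A + 4K, A + 8K) while
disk_bytenr and disk_num_bytes still describe the whole prealloc range.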

For backward compatibility, the current implementation uses the
following methods:
1) COMPPREALLOC incompatible flag.
Add a new COMPPREALLOC incompatible flag, which is set at mkfs
time.
A nocow compressed write happens only when the COMPPREALLOC flag is set.
Seamless conversion will be added later, for example a
'convert=compressed-prealloc' mount option that converts an old fs
to use the new file extent format.

2) Only append the new members on nocow write, and provide fallback
set/get functions.
A new macro, BTRFS_SETGET_APPEND_FUNCS, is introduced to provide set/get
support for the new members.
Set/get on the new members works only when the given incompatible flag
is set *AND* the item size is larger than the original item size.
Otherwise, the get function calls a fallback function to obtain the
corresponding fallback value, and the set function is simply ignored.

So without the COMPPREALLOC flag the old file extent format is unchanged
and remains compatible with old kernels.
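
The rule in 2) can be sketched in plain C as follows. This is only an
illustration under stated assumptions: the struct, the function names and
the constant BASE_ITEM_SIZE are hypothetical, and using disk_num_bytes as
the fallback value for data_len is an assumption borrowed from the
companion btrfs-progs patch, not something this message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the original (non-extended) file extent item. */
#define BASE_ITEM_SIZE 53u

/* Hypothetical flattened view of one file extent item. */
struct extent_item_view {
	size_t   item_size;      /* size of the item as stored in the leaf */
	uint64_t disk_num_bytes; /* original member, always present */
	uint64_t data_len;       /* appended member, valid only in large items */
};

/* Appended members are usable only if the flag is set AND the item is larger. */
static int has_appended_members(const struct extent_item_view *v, int flag_set)
{
	assert(v != NULL);
	return flag_set && v->item_size > BASE_ITEM_SIZE;
}

/* Getter: read the appended member, or return the fallback value. */
static uint64_t get_data_len(const struct extent_item_view *v, int flag_set)
{
	if (has_appended_members(v, flag_set))
		return v->data_len;
	return v->disk_num_bytes; /* fallback (assumption, see lead-in) */
}

/* Setter: silently ignored on old-format items. Returns 1 if stored. */
static int set_data_len(struct extent_item_view *v, int flag_set, uint64_t val)
{
	if (!has_appended_members(v, flag_set))
		return 0;
	v->data_len = val;
	return 1;
}
```

Keeping the check in one helper means the getter and the setter cannot
disagree about whether an item carries the appended members.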

Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/ctree.c|  33 ++
 fs/btrfs/ctree.h| 112 +-
 fs/btrfs/extent_map.c   |   1 +
 fs/btrfs/extent_map.h   |   1 +
 fs/btrfs/file-item.c|  14 ++-
 fs/btrfs/file.c |  39 +-
 fs/btrfs/inode.c| 309 +++-
 fs/btrfs/ordered-data.h |   2 +
 fs/btrfs/tree-log.c |  99 +---
 9 files changed, 556 insertions(+), 54 deletions(-)

diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 44ee5d2..d24a448 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -4721,6 +4721,39 @@ void btrfs_extend_item(struct btrfs_root *root, struct 
btrfs_path *path,
 }
 
 /*
+ * resize the item pointed to by the path, may split leaf if free space
+ * is not enough, so it may return -EAGAIN.
+ *
+ * ins_len is the size to add. It can be negative, in which case the
+ * item is just truncated.
+ */
+int btrfs_resize_item(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct btrfs_path *path,
+ int ins_len)
+{
+   struct extent_buffer *leaf;
+   int slot;
+   int ret = 0;
+   int item_size;
+
+   if (!ins_len)
+   goto out;
+   leaf = path->nodes[0];
+   slot = path->slots[0];
+   item_size = btrfs_item_size_nr(leaf, slot);
+   if (ins_len > 0) {
+   ret = setup_leaf_for_split(trans, root, path, ins_len);
+   if (ret)
+   goto out;
+   btrfs_extend_item(root, path, ins_len);
+   } else
+   btrfs_truncate_item(root, path, item_size + ins_len, 1);
+out:
+   return ret;
+}
+
+/*
  * this is a helper for btrfs_insert_empty_items, the main goal here is
  * to save stack depth by doing the bulk of the work in a function
  * that doesn't call btrfs_search_slot
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2fc7908..bfd3fbd 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -524,6 +524,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_RAID56  (1ULL << 7)
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES(1ULL << 9)
+#define

[PATCH 1/2] btrfs: Add more check before read_extent_buffer() to avoid read overflow.

2014-09-17 Thread Qu Wenruo

Before this patch, when replay_one_extent() finds an existing file
extent item, btrfs calls read_extent_buffer() to read out the file
extent.
However, it lacks sufficient checks, and may read out an inline file
extent using the wrong size (currently it always uses
sizeof(struct btrfs_file_extent_item)).

If an inline file extent is smaller than the normal file extent
size (53 bytes) and the inline file extent unfortunately lies at the end
of a full leaf, the WARN_ON in read_extent_buffer() will be triggered.

This patch checks the file extent type before calling
read_extent_buffer(): if the logged one and the existing one
are both preallocated/regular file extent items, their size must be
sizeof(struct btrfs_file_extent_item), which avoids the read overflow.
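
To illustrate why the type check matters, here is a small stand-alone C
sketch, not kernel code. The helper names, the item_ref struct and the flat
byte-array "leaf" are hypothetical; only the 53-byte size and the idea of
comparing types before doing a fixed-size read come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum extent_type { EXTENT_INLINE = 0, EXTENT_REG = 1, EXTENT_PREALLOC = 2 };

#define FULL_ITEM_SIZE 53 /* size of a regular/prealloc item, per the text */

/* Hypothetical reference to one item inside a leaf-like byte buffer. */
struct item_ref {
	const unsigned char *leaf; /* leaf buffer */
	size_t leaf_size;          /* total bytes in the leaf */
	size_t offset;             /* where the item starts */
	enum extent_type type;     /* extent type stored in the item */
};

/* Bounds-checked copy; returns 0 on success, -1 if it would overflow. */
static int leaf_read(const unsigned char *leaf, size_t leaf_size,
		     size_t offset, void *dst, size_t len)
{
	if (offset > leaf_size || len > leaf_size - offset)
		return -1;
	memcpy(dst, leaf + offset, len);
	return 0;
}

/*
 * Returns 1 if both items are byte-identical, 0 if they differ or are not
 * comparable (different types), and -1 if a read would overflow.
 */
static int items_identical(const struct item_ref *logged,
			   const struct item_ref *existing)
{
	unsigned char cmp1[FULL_ITEM_SIZE], cmp2[FULL_ITEM_SIZE];

	assert(logged->type == EXTENT_REG || logged->type == EXTENT_PREALLOC);
	/* Different type: the existing item may be shorter, skip the compare. */
	if (existing->type != logged->type)
		return 0;
	if (leaf_read(logged->leaf, logged->leaf_size, logged->offset,
		      cmp1, sizeof(cmp1)) ||
	    leaf_read(existing->leaf, existing->leaf_size, existing->offset,
		      cmp2, sizeof(cmp2)))
		return -1;
	return memcmp(cmp1, cmp2, sizeof(cmp1)) == 0;
}
```

In this sketch an inline item that starts 30 bytes before the end of the
leaf is never read with the 53-byte size, because the type mismatch is
detected first.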

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 fs/btrfs/tree-log.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 7e0e6e3..1ea2b10 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -620,6 +620,8 @@ static noinline int replay_one_extent(struct 
btrfs_trans_handle *trans,
existing = btrfs_item_ptr(leaf, path->slots[0],
  struct btrfs_file_extent_item);
 
+   if (btrfs_file_extent_type(leaf, existing) != found_type)
+   goto no_compare;
read_extent_buffer(eb, cmp1, (unsigned long)item,
   sizeof(cmp1));
read_extent_buffer(leaf, cmp2, (unsigned long)existing,
@@ -634,6 +636,7 @@ static noinline int replay_one_extent(struct 
btrfs_trans_handle *trans,
goto out;
}
}
+no_compare:
btrfs_release_path(path);
 
/* drop any overlapping extents */
-- 
2.1.0


Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

2014-09-17 Thread Liu Bo

On Wed, Sep 17, 2014 at 11:53:35AM +0800, Qu Wenruo wrote:
 The following commit enhanced merge_extent_mapping() to reduce
 fragmentation in the extent map tree, but it can't handle the case where
 existing lies before map_start:
 51f39 btrfs: Use right extent length when inserting overlap extent map.
 
 [BUG]
 When the existing extent map's start is before map_start,
 em->len will be negative, which corrupts the extent map and makes the
 insertion of the new extent map fail.
 This happens when someone gets a large extent map, but by the time it is
 inserted into the extent map tree, someone else has already committed
 some writes and split the huge extent into small parts.
 
 [REPRODUCER]
 It is very easy to trigger using filebench with the randomrw personality.
 It reproduces nearly 100% of the time with an 8G preallocated file in a
 60s randomrw test.
 
 [FIX]
 This patch can now handle any existing extent position.
 Since it does not directly use existing->start, it now finds the
 previous and next extents around map_start.
 So the old existing->start < map_start bug will never happen again.
 
 [ENHANCE]
 This patch inserts the best fitted extent map into the extent map tree,
 rather than the oldest [map_start, map_start + sectorsize) or the
 relatively newer but not perfect [map_start, existing->start).
 
 The patch first searches for an existing extent that does not intersect
 the desired map range [map_start, map_start + len).
 The existing extent will be either before or behind map_start, and based
 on the existing extent, we can find the previous and next extents
 around map_start.
 
 So the best fitted extent would be [prev->end, next->start).
 If prev or next is not found, em->start is used in place of prev->end
 and em->end in place of next->start.
 
 With this patch, fragmentation in the extent map tree should be reduced
 much more than with the 51f39 commit, and an unneeded extent map tree
 search is avoided.
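
The best-fit computation described in [ENHANCE] can be tried out in user
space with a sketch like the one below. It is a simplification under stated
assumptions: extents are modelled as plain half-open [start, end) ranges,
and struct range and best_fit() are hypothetical names, not the kernel's
extent_map API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Clamp the new extent map `em` to the hole between its neighbours.
 * `prev` and `next` may be NULL when no such neighbour exists; in that
 * case the corresponding bound of `em` itself is used.
 */
static struct range best_fit(struct range em, const struct range *prev,
			     const struct range *next)
{
	uint64_t start = prev ? prev->end : em.start;
	uint64_t end = next ? next->start : em.end;
	struct range r;

	r.start = start > em.start ? start : em.start; /* max(start, em.start) */
	r.end = end < em.end ? end : em.end;           /* min(end, em.end) */
	/* Neighbours must not overlap each other around the hole. */
	assert(r.start <= r.end);
	return r;
}
```

For example, with em = [0, 100), prev = [10, 30) and next = [60, 80), the
result is [30, 60), which is exactly the gap between the two neighbours.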
 
 Reported-by: Tsutomu Itoh t-i...@jp.fujitsu.com
 Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
 ---
  fs/btrfs/inode.c | 79 
 
  1 file changed, 57 insertions(+), 22 deletions(-)
 
 diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
 index 016c403..8039021 100644
 --- a/fs/btrfs/inode.c
 +++ b/fs/btrfs/inode.c
 @@ -6191,21 +6191,60 @@ out_fail_inode:
   goto out_fail;
  }
  
 +/* Find next extent map of a given extent map, caller needs to ensure locks */
 +static struct extent_map *next_extent_map(struct extent_map *em)
 +{
 + struct rb_node *next;
 +
 + next = rb_next(&em->rb_node);
 + if (!next)
 + return NULL;
 + return container_of(next, struct extent_map, rb_node);
 +}
 +
 +static struct extent_map *prev_extent_map(struct extent_map *em)
 +{
 + struct rb_node *prev;
 +
 + prev = rb_prev(&em->rb_node);
 + if (!prev)
 + return NULL;
 + return container_of(prev, struct extent_map, rb_node);
 +}
 +
  /* helper for btfs_get_extent.  Given an existing extent in the tree,
 + * the existing extent is the nearest extent to map_start,
   * and an extent that you want to insert, deal with overlap and insert
 - * the new extent into the tree.
 + * the best fitted new extent into the tree.
   */
  static int merge_extent_mapping(struct extent_map_tree *em_tree,
   struct extent_map *existing,
   struct extent_map *em,
   u64 map_start)
  {
 + struct extent_map *prev;
 + struct extent_map *next;
 + u64 start;
 + u64 end;
   u64 start_diff;
  
   BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
 - start_diff = map_start - em->start;
 - em->start = map_start;
 - em->len = existing->start - em->start;
 +
 + if (existing->start > map_start) {
 + next = existing;
 + prev = prev_extent_map(next);
 + } else {
 + prev = existing;
 + next = next_extent_map(prev);
 + }
 +
 + start = prev ? extent_map_end(prev) : em->start;
 + start = max_t(u64, start, em->start);
 + end = next ? next->start : extent_map_end(em);
 + end = min_t(u64, end, extent_map_end(em));
 + start_diff = start - em->start;
 + em->start = start;
 + em->len = end - start;
   if (em->block_start < EXTENT_MAP_LAST_BYTE &&
   !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
   em->block_start += start_diff;
 @@ -6482,25 +6521,21 @@ insert:
  
   ret = 0;
  
 - existing = lookup_extent_mapping(em_tree, start, len);
 - if (existing && (existing->start > start ||
 - existing->start + existing->len <= start)) {
 + existing = search_extent_mapping(em_tree, start, len);
 + /*
 +  * existing will always be non-NULL, since there must be
 +  * extent causing the -EEXIST.
 +  */
 + if (start >= extent_map_end(existing) ||
 + start + len <= existing->start) {

[PATCH 3/3] btrfs: Add btrfsck support for nocow compressed prealloc write.

2014-09-17 Thread Qu Wenruo

Add support for nocow compressed write into a preallocated range.

The main changes are the following:
1) use disk_bytenr + data_offset to search for the csum
2) use offset + num_bytes - data_offset to judge whether this is a valid
file extent.
3) add a file extent item size check,
since a regular file extent now has 2 possible sizes (one with the
appended members and the old one).
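
Checks 2) and 3) can be sketched as a single predicate. This is an
illustration only: the function name and the two size constants are
hypothetical (53 bytes for the old item and 16 more bytes for two appended
64-bit members are assumptions), and the underflow guard is an extra
precaution that the patch itself does not contain.

```c
#include <assert.h>
#include <stdint.h>

#define EXTENT_SIZE_NORMAL   53u         /* assumed size of the old item */
#define EXTENT_SIZE_EXTENDED (53u + 16u) /* old item + data_offset + data_len */

/* Returns 1 if the file extent item should be flagged as bad, else 0. */
static int file_extent_is_bad(uint64_t num_bytes, uint64_t offset,
			      uint64_t data_offset, uint64_t ram_bytes,
			      uint32_t item_size, int compprealloc_flag)
{
	/* Extra guard (not in the patch): avoid unsigned underflow below. */
	if (data_offset > num_bytes + offset)
		return 1;
	/* 2) the referenced range must fit into the uncompressed size */
	if (num_bytes + offset - data_offset > ram_bytes)
		return 1;
	/* 3) only the two known item sizes are valid ... */
	if (item_size != EXTENT_SIZE_NORMAL &&
	    item_size != EXTENT_SIZE_EXTENDED)
		return 1;
	/* ... and the extended size requires the COMPPREALLOC flag */
	if (!compprealloc_flag && item_size == EXTENT_SIZE_EXTENDED)
		return 1;
	return 0;
}
```

For an old-format item data_offset falls back to 0, so check 2) reduces to
the original num_bytes + offset > ram_bytes test.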

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 cmds-check.c | 19 ---
 ctree.h  |  1 -
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/cmds-check.c b/cmds-check.c
index 268e588..7a18ad4 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -1175,12 +1175,17 @@ static int process_file_extent(struct btrfs_root *root,
num_bytes = (num_bytes + mask) & ~mask;
} else if (extent_type == BTRFS_FILE_EXTENT_REG ||
   extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
+   u32 item_size = btrfs_item_size_nr(eb, slot);
num_bytes = btrfs_file_extent_num_bytes(eb, fi);
-   disk_bytenr = btrfs_file_extent_disk_bytenr(eb, fi);
+   /* data_offset will fallback to 0 on normal extent */
+   disk_bytenr = btrfs_file_extent_disk_bytenr(eb, fi) +
+ btrfs_ondemand_file_extent_data_offset(eb,
+   slot, fi);
extent_offset = btrfs_file_extent_offset(eb, fi);
if (num_bytes == 0 || (num_bytes & mask))
rec->errors |= I_ERR_BAD_FILE_EXTENT;
-   if (num_bytes + extent_offset >
+   if (num_bytes + extent_offset -
+   btrfs_ondemand_file_extent_data_offset(eb, slot, fi) >
btrfs_file_extent_ram_bytes(eb, fi))
rec->errors |= I_ERR_BAD_FILE_EXTENT;
if (extent_type == BTRFS_FILE_EXTENT_PREALLOC &&
@@ -1190,6 +1195,13 @@ static int process_file_extent(struct btrfs_root *root,
rec->errors |= I_ERR_BAD_FILE_EXTENT;
if (disk_bytenr > 0)
rec->found_size += num_bytes;
+   if (item_size != BTRFS_FILE_EXTENT_SIZE_NORMAL &&
+   item_size != BTRFS_FILE_EXTENT_SIZE_MAX)
+   rec->errors |= I_ERR_BAD_FILE_EXTENT;
+   if (!btrfs_fs_incompat(root->fs_info,
+  BTRFS_FEATURE_INCOMPAT_COMPPREALLOC) &&
+   (item_size == BTRFS_FILE_EXTENT_SIZE_MAX))
+   rec->errors |= I_ERR_BAD_FILE_EXTENT;
} else {
rec->errors |= I_ERR_BAD_FILE_EXTENT;
}
@@ -1198,7 +1210,8 @@ static int process_file_extent(struct btrfs_root *root,
if (disk_bytenr > 0) {
u64 found;
if (btrfs_file_extent_compression(eb, fi))
-   num_bytes = btrfs_file_extent_disk_num_bytes(eb, fi);
+   num_bytes = btrfs_ondemand_file_extent_data_len(eb,
+   slot, fi);
else
disk_bytenr += extent_offset;
 
diff --git a/ctree.h b/ctree.h
index ab87133..e11e4d8 100644
--- a/ctree.h
+++ b/ctree.h
@@ -834,7 +834,6 @@ struct __compprealloc_data {
__le64 data_len;
 
 } __attribute__ ((__packed__));
-
 #define BTRFS_FILE_EXTENT_SIZE_NORMAL (sizeof(struct btrfs_file_extent_item))
 #define BTRFS_FILE_EXTENT_SIZE_MAX (sizeof(struct btrfs_file_extent_item) + \
sizeof(struct __compprealloc_data))
-- 
2.1.0


[PATCH 2/3] btrfs-progs: Add data_offset and data_len output for print_file_extent_item

2014-09-17 Thread Qu Wenruo

Add data_offset and data_len output for print_file_extent_item in
print-tree.c

WARNING: because the original output has the word 'data' before 'offset',
to avoid confusion, remove the word 'data' before 'offset' and 'disk
bytenr'.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 print-tree.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/print-tree.c b/print-tree.c
index 7df5798..9b416bd 100644
--- a/print-tree.c
+++ b/print-tree.c
@@ -277,23 +277,29 @@ static void print_file_extent_item(struct extent_buffer 
*eb,
return;
}
if (extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
-   printf("\t\tprealloc data disk byte %llu nr %llu\n",
+   printf("\t\tprealloc disk byte %llu nr %llu\n",
  (unsigned long long)btrfs_file_extent_disk_bytenr(eb, fi),
  (unsigned long long)btrfs_file_extent_disk_num_bytes(eb, fi));
-   printf("\t\tprealloc data offset %llu nr %llu\n",
+   printf("\t\tprealloc offset %llu nr %llu\n",
  (unsigned long long)btrfs_file_extent_offset(eb, fi),
  (unsigned long long)btrfs_file_extent_num_bytes(eb, fi));
return;
}
-   printf("\t\textent data disk byte %llu nr %llu\n",
+   printf("\t\textent disk byte %llu nr %llu\n",
(unsigned long long)btrfs_file_extent_disk_bytenr(eb, fi),
(unsigned long long)btrfs_file_extent_disk_num_bytes(eb, fi));
-   printf("\t\textent data offset %llu nr %llu ram %llu\n",
+   printf("\t\textent offset %llu nr %llu ram %llu\n",
(unsigned long long)btrfs_file_extent_offset(eb, fi),
(unsigned long long)btrfs_file_extent_num_bytes(eb, fi),
(unsigned long long)btrfs_file_extent_ram_bytes(eb, fi));
printf("\t\textent compression %d\n",
   btrfs_file_extent_compression(eb, fi));
+   if (btrfs_item_size_nr(eb, slot) > sizeof(*fi))
+   printf("\t\tdata offset %llu nr %llu\n",
+  (unsigned long long)
+  btrfs_ondemand_file_extent_data_offset(eb, slot, fi),
+  (unsigned long long)
+  btrfs_ondemand_file_extent_data_len(eb, slot, fi));
 }
 
 /* Caller should ensure sizeof(*ret) >= 16("DATA|TREE_BLOCK") */
-- 
2.1.0


[PATCH 1/3] btrfs-progs: Add nocow compressed write into prealloc support

2014-09-17 Thread Qu Wenruo

Add basic support for nocow compressed write into a preallocated range
to mkfs and the headers.

Now we can use mkfs to create a btrfs filesystem which supports nocow
compressed prealloc write.
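
As a rough sketch of what an incompatible feature bit means for tools, the
C fragment below tests a flag and computes which bits a given tool does not
understand. The helper names are hypothetical; the bit positions mirror the
defines touched by this patch, and the idea that a tool rejects a
filesystem with unknown incompat bits is stated here as an assumption about
how such flags are normally used.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in the ctree.h hunk of this patch. */
#define INCOMPAT_RAID56          (1ULL << 7)
#define INCOMPAT_SKINNY_METADATA (1ULL << 8)
#define INCOMPAT_NO_HOLES        (1ULL << 9)
#define INCOMPAT_COMPPREALLOC    (1ULL << 10)

/* Returns 1 if `flag` is set in the superblock's incompat flags. */
static int has_incompat(uint64_t super_flags, uint64_t flag)
{
	return !!(super_flags & flag);
}

/* Bits set on the filesystem that the tool's supported mask lacks. */
static uint64_t unsupported_bits(uint64_t super_flags, uint64_t supported)
{
	return super_flags & ~supported;
}
```

A tool built before this patch would see bit 10 as unsupported, while one
built with the extended supported mask would see no unsupported bits.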

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 ctree.h | 69 -
 mkfs.c  |  2 ++
 2 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/ctree.h b/ctree.h
index fa73c4a..ab87133 100644
--- a/ctree.h
+++ b/ctree.h
@@ -473,6 +473,7 @@ struct btrfs_super_block {
 #define BTRFS_FEATURE_INCOMPAT_RAID56  (1ULL << 7)
 #define BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA (1ULL << 8)
 #define BTRFS_FEATURE_INCOMPAT_NO_HOLES(1ULL << 9)
+#define BTRFS_FEATURE_INCOMPAT_COMPPREALLOC(1ULL << 10)
 
 
 #define BTRFS_FEATURE_COMPAT_SUPP  0ULL
@@ -486,7 +487,8 @@ struct btrfs_super_block {
 BTRFS_FEATURE_INCOMPAT_RAID56 |\
 BTRFS_FEATURE_INCOMPAT_MIXED_GROUPS |  \
 BTRFS_FEATURE_INCOMPAT_SKINNY_METADATA |   \
-BTRFS_FEATURE_INCOMPAT_NO_HOLES)
+BTRFS_FEATURE_INCOMPAT_NO_HOLES |  \
+BTRFS_FEATURE_INCOMPAT_COMPPREALLOC)
 
 /*
  * A leaf is full of items. offset and size tell us where to find
@@ -825,6 +827,18 @@ struct btrfs_file_extent_item {
 
 } __attribute__ ((__packed__));
 
+struct __compprealloc_data {
+   /* extent offset where we read data from disk. */
+   __le64 data_offset;
+   /* extent len that we need read into memory. */
+   __le64 data_len;
+
+} __attribute__ ((__packed__));
+
+#define BTRFS_FILE_EXTENT_SIZE_NORMAL (sizeof(struct btrfs_file_extent_item))
+#define BTRFS_FILE_EXTENT_SIZE_MAX (sizeof(struct btrfs_file_extent_item) + \
+   sizeof(struct __compprealloc_data))
+
 struct btrfs_csum_item {
u8 csum;
 } __attribute__ ((__packed__));
@@ -2146,6 +2160,59 @@ static inline int btrfs_fs_incompat(struct btrfs_fs_info 
*fs_info, u64 flag)
return !!(btrfs_super_incompat_flags(disk_super) & flag);
 }
 
+/* struct __compprealloc_data, don't use them directly!!! */
+BTRFS_SETGET_FUNCS(__compprealloc_data_offset, struct __compprealloc_data,
+  data_offset, 64);
+BTRFS_SETGET_FUNCS(__compprealloc_data_len, struct __compprealloc_data,
+  data_len, 64);
+
+/*
+ * functions for appended data after a original structure
+ * unlike kernel part, eb has no member fs_info, so caller should check
+ * the incompatible flag manually
+ */
+#define BTRFS_SETGET_APPEND_FUNCS(name, type, new_type, func, bits,\
+ flag, fallback)   \
+static inline u##bits btrfs_##name(struct extent_buffer *eb,   \
+  int slot, type *s)   \
+{  \
+   if (btrfs_item_size_nr(eb, slot) > sizeof(*s))  \
+   return btrfs_##func(eb, (new_type *)(s + 1));   \
+   return fallback(eb, s); \
+}  \
+static inline void btrfs_set_##name(struct extent_buffer *eb,  \
+   int slot, type *s, u##bits val) \
+{  \
+   if (btrfs_item_size_nr(eb, slot) > sizeof(*s))  \
+   btrfs_set_##func(eb, (new_type *)(s + 1), val); \
+}  \
+
+/* appended data for btrfs_file_extent_item */
+static inline u64
+data_offset_fallback(struct extent_buffer *eb,
+struct btrfs_file_extent_item *fi)
+{
+   return 0;
+}
+BTRFS_SETGET_APPEND_FUNCS(ondemand_file_extent_data_offset,
+ struct btrfs_file_extent_item,
+ struct __compprealloc_data,
+ __compprealloc_data_offset,
+ 64, COMPPREALLOC, data_offset_fallback);
+
+static inline u64
+data_len_fallback(struct extent_buffer *eb,
+ struct btrfs_file_extent_item *fi)
+{
+   return btrfs_file_extent_disk_num_bytes(eb, fi);
+}
+BTRFS_SETGET_APPEND_FUNCS(ondemand_file_extent_data_len,
+ struct btrfs_file_extent_item,
+ struct __compprealloc_data,
+ __compprealloc_data_len,
+ 64, COMPPREALLOC, data_len_fallback);
+
+
 /* helper function to cast into the data area of the leaf. */
 #define btrfs_item_ptr(leaf, slot, type) \
((type *)(btrfs_leaf_data(leaf) + \
diff --git a/mkfs.c b/mkfs.c
index 9de61e1..da1e586 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -1163,6 +1163,8 @@ static const struct btrfs_fs_feature {
"reduced-size metadata extent refs" },
{ "no-holes", BTRFS_FEATURE_INCOMPAT_NO_HOLES,

Re: [PATCH] btrfs-progs: fix find_mount_root() to handle duplicated mount point correctly

2014-09-17 Thread Omar Sandoval

What's the status on this patch? There have been at least a couple of bug
reports that this fixes, including
https://bugzilla.kernel.org/show_bug.cgi?id=83741.
-- 
Omar

Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

2014-09-17 Thread Qu Wenruo

-------- Original Message --------
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Liu Bo bo.li@oracle.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-09-18 12:21

On Wed, Sep 17, 2014 at 11:53:35AM +0800, Qu Wenruo wrote:

The following commit enhanced merge_extent_mapping() to reduce
fragmentation in the extent map tree, but it can't handle the case where
existing lies before map_start:
51f39 btrfs: Use right extent length when inserting overlap extent map.

[BUG]
When the existing extent map's start is before map_start,
em->len will be negative, which corrupts the extent map and makes the
insertion of the new extent map fail.
This happens when someone gets a large extent map, but by the time it is
inserted into the extent map tree, someone else has already committed
some writes and split the huge extent into small parts.

[REPRODUCER]
It is very easy to trigger using filebench with the randomrw personality.
It reproduces nearly 100% of the time with an 8G preallocated file in a
60s randomrw test.

[FIX]
This patch can now handle any existing extent position.
Since it does not directly use existing->start, it now finds the
previous and next extents around map_start.
So the old existing->start < map_start bug will never happen again.

[ENHANCE]
This patch inserts the best fitted extent map into the extent map tree,
rather than the oldest [map_start, map_start + sectorsize) or the
relatively newer but not perfect [map_start, existing->start).

The patch first searches for an existing extent that does not intersect
the desired map range [map_start, map_start + len).
The existing extent will be either before or behind map_start, and based
on the existing extent, we can find the previous and next extents
around map_start.

So the best fitted extent would be [prev->end, next->start).
If prev or next is not found, em->start is used in place of prev->end
and em->end in place of next->start.

With this patch, fragmentation in the extent map tree should be reduced
much more than with the 51f39 commit, and an unneeded extent map tree
search is avoided.

Reported-by: Tsutomu Itoh t-i...@jp.fujitsu.com
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
  fs/btrfs/inode.c | 79 
  1 file changed, 57 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 016c403..8039021 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6191,21 +6191,60 @@ out_fail_inode:
goto out_fail;
  }

+/* Find next extent map of a given extent map, caller needs to ensure locks */
+static struct extent_map *next_extent_map(struct extent_map *em)
+{
+   struct rb_node *next;
+
+   next = rb_next(&em->rb_node);
+   if (!next)
+   return NULL;
+   return container_of(next, struct extent_map, rb_node);
+}
+
+static struct extent_map *prev_extent_map(struct extent_map *em)
+{
+   struct rb_node *prev;
+
+   prev = rb_prev(&em->rb_node);
+   if (!prev)
+   return NULL;
+   return container_of(prev, struct extent_map, rb_node);
+}
+
  /* helper for btfs_get_extent.  Given an existing extent in the tree,
+ * the existing extent is the nearest extent to map_start,
   * and an extent that you want to insert, deal with overlap and insert
- * the new extent into the tree.
+ * the best fitted new extent into the tree.
   */
  static int merge_extent_mapping(struct extent_map_tree *em_tree,
struct extent_map *existing,
struct extent_map *em,
u64 map_start)
  {
+   struct extent_map *prev;
+   struct extent_map *next;
+   u64 start;
+   u64 end;
u64 start_diff;

  BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
-   start_diff = map_start - em->start;
-   em->start = map_start;
-   em->len = existing->start - em->start;
+
+   if (existing->start > map_start) {
+   next = existing;
+   prev = prev_extent_map(next);
+   } else {
+   prev = existing;
+   next = next_extent_map(prev);
+   }
+
+   start = prev ? extent_map_end(prev) : em->start;
+   start = max_t(u64, start, em->start);
+   end = next ? next->start : extent_map_end(em);
+   end = min_t(u64, end, extent_map_end(em));
+   start_diff = start - em->start;
+   em->start = start;
+   em->len = end - start;
if (em->block_start < EXTENT_MAP_LAST_BYTE &&
!test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
em->block_start += start_diff;
@@ -6482,25 +6521,21 @@ insert:

  		ret = 0;

-   existing = lookup_extent_mapping(em_tree, start, len);
-   if (existing && (existing->start > start ||
-   existing->start + existing->len <= start)) {
+   existing = search_extent_mapping(em_tree, start, len);
+   /*
+* existing will always be non-NULL, since

Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map

2014-09-17 Thread Qu Wenruo

-------- Original Message --------
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Qu Wenruo quwen...@cn.fujitsu.com
To: bo.li@oracle.com
Date: 2014-09-18 13:36

-------- Original Message --------
Subject: Re: [PATCH] btrfs: Fix and enhance merge_extent_mapping() to insert best fitted extent map
From: Liu Bo bo.li@oracle.com
To: Qu Wenruo quwen...@cn.fujitsu.com
Date: 2014-09-18 12:21

On Wed, Sep 17, 2014 at 11:53:35AM +0800, Qu Wenruo wrote:

The following commit enhanced merge_extent_mapping() to reduce
fragmentation in the extent map tree, but it can't handle the case where
existing lies before map_start:
51f39 btrfs: Use right extent length when inserting overlap extent map.

[BUG]
When the existing extent map's start is before map_start,
em->len will be negative, which corrupts the extent map and makes the
insertion of the new extent map fail.
This happens when someone gets a large extent map, but by the time it is
inserted into the extent map tree, someone else has already committed
some writes and split the huge extent into small parts.

[REPRODUCER]
It is very easy to trigger using filebench with the randomrw personality.
It reproduces nearly 100% of the time with an 8G preallocated file in a
60s randomrw test.

[FIX]
This patch can now handle any existing extent position.
Since it does not directly use existing->start, it now finds the
previous and next extents around map_start.
So the old existing->start < map_start bug will never happen again.

[ENHANCE]
This patch inserts the best fitted extent map into the extent map tree,
rather than the oldest [map_start, map_start + sectorsize) or the
relatively newer but not perfect [map_start, existing->start).

The patch first searches for an existing extent that does not intersect
the desired map range [map_start, map_start + len).
The existing extent will be either before or behind map_start, and based
on the existing extent, we can find the previous and next extents
around map_start.

So the best fitted extent would be [prev->end, next->start).
If prev or next is not found, em->start is used in place of prev->end
and em->end in place of next->start.

With this patch, fragmentation in the extent map tree should be reduced
much more than with the 51f39 commit, and an unneeded extent map tree
search is avoided.

Reported-by: Tsutomu Itoh t-i...@jp.fujitsu.com
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
  fs/btrfs/inode.c | 79 

  1 file changed, 57 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 016c403..8039021 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6191,21 +6191,60 @@ out_fail_inode:
  goto out_fail;
  }
+/* Find next extent map of a given extent map, caller needs to ensure locks */
+static struct extent_map *next_extent_map(struct extent_map *em)
+{
+struct rb_node *next;
+
+next = rb_next(&em->rb_node);
+if (!next)
+return NULL;
+return container_of(next, struct extent_map, rb_node);
+}
+
+static struct extent_map *prev_extent_map(struct extent_map *em)
+{
+struct rb_node *prev;
+
+prev = rb_prev(&em->rb_node);
+if (!prev)
+return NULL;
+return container_of(prev, struct extent_map, rb_node);
+}
+
  /* helper for btfs_get_extent.  Given an existing extent in the tree,
+ * the existing extent is the nearest extent to map_start,
   * and an extent that you want to insert, deal with overlap and insert
- * the new extent into the tree.
+ * the best fitted new extent into the tree.
   */
  static int merge_extent_mapping(struct extent_map_tree *em_tree,
  struct extent_map *existing,
  struct extent_map *em,
  u64 map_start)
  {
+struct extent_map *prev;
+struct extent_map *next;
+u64 start;
+u64 end;
  u64 start_diff;
  BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
-start_diff = map_start - em->start;
-em->start = map_start;
-em->len = existing->start - em->start;
+
+if (existing->start > map_start) {
+next = existing;
+prev = prev_extent_map(next);
+} else {
+prev = existing;
+next = next_extent_map(prev);
+}
+
+start = prev ? extent_map_end(prev) : em->start;
+start = max_t(u64, start, em->start);
+end = next ? next->start : extent_map_end(em);
+end = min_t(u64, end, extent_map_end(em));
+start_diff = start - em->start;
+em->start = start;
+em->len = end - start;
  if (em->block_start < EXTENT_MAP_LAST_BYTE &&
  !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
  em->block_start += start_diff;
@@ -6482,25 +6521,21 @@ insert:
ret = 0;
  -existing = lookup_extent_mapping(em_tree, start, len);
-if (existing && (existing->start > start ||
-existing->start + existing->len <= start)) {
+existing = search_extent_mapping(em_tree, start, len);
+/*
+

[PATCH v3 01/15] btrfs: new test to run btrfs balance and subvolume test simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs balance and subvolume create/mount/umount/delete simultaneously,
with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 common/rc   | 110 +++--
 tests/btrfs/059 | 115 
 tests/btrfs/059.out |   2 +
 tests/btrfs/group   |   1 +
 4 files changed, 224 insertions(+), 4 deletions(-)
 create mode 100755 tests/btrfs/059
 create mode 100644 tests/btrfs/059.out

diff --git a/common/rc b/common/rc
index b8f711a..a0677da 100644
--- a/common/rc
+++ b/common/rc
@@ -582,11 +582,17 @@ _scratch_pool_mkfs()
 {
 case $FSTYP in
 btrfs)
-   $MKFS_BTRFS_PROG $MKFS_OPTIONS $* $SCRATCH_DEV_POOL > /dev/null
-   ;;
+# if dup profile is in mkfs options call _scratch_mkfs instead
+# because dup profile only works with single device
+if [[ "$*" =~ dup ]]; then
+_scratch_mkfs $*
+else
+$MKFS_BTRFS_PROG $MKFS_OPTIONS $* $SCRATCH_DEV_POOL > /dev/null
+fi
+;;
 *)
-   echo "_scratch_pool_mkfs is not implemented for $FSTYP" 1>&2
-   ;;
+echo "_scratch_pool_mkfs is not implemented for $FSTYP" 1>&2
+;;
 esac
 }
 
@@ -2483,6 +2489,102 @@ _get_free_inode()
echo $nr_inode
 }
 
+# get btrfs profile configs being tested
+#
+# A set of pre-set profile configs is exported via the _btrfs_profile_configs
+# array. Default configs can be overridden by setting the BTRFS_PROFILE_CONFIGS
+# var in the format "metadata_profile:data_profile"; multiple configs can be
+# separated by spaces, e.g.
+# export BTRFS_PROFILE_CONFIGS="raid0:raid0 raid1:raid1 dup:single"
+_btrfs_get_profile_configs()
+{
+   if [ $FSTYP != btrfs ]; then
+   return
+   fi
+
+   # no user specified btrfs profile configs, export the default configs
+   if [ -z "$BTRFS_PROFILE_CONFIGS" ]; then
+   # default configs
+   _btrfs_profile_configs=(
+   "-m single -d single"
+   "-m dup -d single"
+   "-m raid0 -d raid0"
+   "-m raid1 -d raid0"
+   "-m raid1 -d raid1"
+   "-m raid10 -d raid10"
+   "-m raid5 -d raid5"
+   "-m raid6 -d raid6"
+   )
+
+   # remove dup/raid5/raid6 profiles if we're doing device replace
+   # dup profile indicates only one device being used (SCRATCH_DEV),
+   # but we don't want to replace SCRATCH_DEV, which will be used in
+   # _scratch_mount/_check_scratch_fs etc.
+   # and raid5/raid6 doesn't support replace yet
+   if [ "$1" == "replace" ]; then
+   _btrfs_profile_configs=(
+   "-m single -d single"
+   "-m raid0 -d raid0"
+   "-m raid1 -d raid0"
+   "-m raid1 -d raid1"
+   "-m raid10 -d raid10"
+   # add these back when raid5/6 is working with replace
+   #"-m raid5 -d raid5"
+   #"-m raid6 -d raid6"
+   )
+   fi
+   export _btrfs_profile_configs
+   return
+   fi
+
+   # parse user specified btrfs profile configs
+   local i=0
+   local cfg=
+   for cfg in $BTRFS_PROFILE_CONFIGS; do
+   # turn metadata:data format to -m metadata -d data
+   # and assign it to _btrfs_profile_configs array
+   cfg=`echo $cfg | sed -e 's/^/-m /' -e 's/:/ -d /'`
+   _btrfs_profile_configs[$i]=$cfg
+   let i=i+1
+   done
+
+   if [ "$1" == "replace" ]; then
+   if echo ${_btrfs_profile_configs[*]} | grep -q raid[56]; then
+   _notrun "RAID5/6 doesn't support btrfs device replace yet"
+   fi
+   if echo ${_btrfs_profile_configs[*]} | grep -q dup; then
+   _notrun "Do not set dup profile in btrfs device replace test"
+   fi
+   fi
+   export _btrfs_profile_configs
+}
+
+# stress btrfs by running balance operation in a loop
+_btrfs_stress_balance()
+{
+   local btrfs_mnt=$1
+   while true; do
+   $BTRFS_UTIL_PROG balance start $btrfs_mnt
+   done
+}
+
+# stress btrfs by creating/mounting/umounting/deleting subvolume in a loop
+_btrfs_stress_subvolume()
+{
+   local btrfs_dev=$1
+   local btrfs_mnt=$2
+   local subvol_name=$3
+   local subvol_mnt=$4
+
+   mkdir -p $subvol_mnt
+   while true; do
+   $BTRFS_UTIL_PROG subvolume create $btrfs_mnt/$subvol_name
+   $MOUNT_PROG -o subvol=$subvol_name $btrfs_dev $subvol_mnt
+   $UMOUNT_PROG $subvol_mnt
+   $BTRFS_UTIL_PROG subvolume delete $btrfs_mnt/$subvol_name
+

[PATCH v3 00/15] xfstests: new btrfs stress test cases

2014-09-17 Thread Eryu Guan

This patchset adds new stress test cases for btrfs by running two
different btrfs operations simultaneously under fsstress, to ensure
btrfs doesn't hang or oops in such situations. btrfs scrub and
btrfs check are run after each test.
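
As a rough illustration only (this is not code from the patchset), the
structure shared by all of these tests can be sketched as below. A finite
foreground load runs alongside two endless worker loops, and the workers
are killed once the load exits. stress_worker and foreground_load are
hypothetical stand-ins for the _btrfs_stress_* helpers and fsstress; they
only sleep, so the sketch runs without a btrfs filesystem.

```shell
#!/bin/bash
# Hypothetical stand-in for a _btrfs_stress_* helper: loops forever
# until the caller kills it.
stress_worker() {
    while true; do
        sleep 0.1
    done
}

# Hypothetical stand-in for fsstress: does a finite amount of work.
foreground_load() {
    sleep 0.5
}

foreground_load &
load_pid=$!

stress_worker &
worker_a=$!
stress_worker &
worker_b=$!

# Wait for the finite load first, then stop the endless workers.
wait $load_pid
load_status=$?
kill $worker_a $worker_b
wait 2>/dev/null

# List any background jobs that are still running.
running=$(jobs -rp)
echo "load exit status: $load_status, running jobs left: '${running}'"
```

The real tests additionally poll ps after the kill, because killing the
shell loop does not stop a btrfs command that the loop has already started.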

The test matrix covers every pairwise combination of 6 btrfs operations:

balance
create/mount/umount/delete subvolume
replace device
scrub
defrag
remount with different compress algorithms

Short descriptions:

059: balance-subvolume
060: balance-scrub
061: balance-defrag
062: balance-remount
063: balance-replace
064: subvolume-replace
065: subvolume-scrub
066: subvolume-defrag
067: subvolume-remount
068: replace-scrub
069: replace-defrag
070: replace-remount
071: scrub-defrag
072: scrub-remount
073: defrag-remount
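
For reference, a small sketch of where the count of 15 comes from: every
unordered pair out of 6 operations gives 6*5/2 = 15 cases. The short
operation names are taken from the list above; the ops and pairs array
names are only illustrative, and the enumeration order here does not
exactly follow the test numbering (balance-replace is 063, not 060).

```shell
#!/bin/bash
# Short names of the six operations, as used in the descriptions above.
ops=(balance subvolume replace scrub defrag remount)

# Collect every unordered pair (i < j) of operations.
pairs=()
for ((i = 0; i < ${#ops[@]}; i++)); do
    for ((j = i + 1; j < ${#ops[@]}; j++)); do
        pairs+=("${ops[i]}-${ops[j]}")
    done
done

printf '%s\n' "${pairs[@]}"
echo "total: ${#pairs[@]}"
```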

Some issues I've seen:

1. A subvolume cannot be mounted with an selinux context, so you may see
   logs like this in dmesg

   SELinux: mount invalid.  Same superblock, different security settings for (dev dm-8, type btrfs)

   I've reported the bug to the btrfs list, see
   [BUG] cannot mount subvolume with selinux context
   http://www.spinics.net/lists/linux-btrfs/msg36779.html

2. The btrfs replace operation always returns ENOENT if balance is running,
   so in 063.full you'll see

   ERROR: ioctl(DEV_REPLACE_START) failed on /mnt/testarea/scratch: No such file or directory, no error

   I'm not sure if it's a btrfs bug, but at least I think the error code is
   misleading

3. The replace operation hangs the kernel (3.16-rc4+ and 3.17-rc2+) with
   fsstress running, so cases 064/068/069/070 will hang


Changes since v2:
- mount subvolume at $TEST_DIR/$seq.mnt not $tmp.mnt
- don't rm -rf $tmp.* which is dangerous
- remove unnecessary btrfs filesystem sync operation
- update _scratch_pool_mkfs to deal with dup profile
- add more comments for each stress operation
- rename _btrfs_stress_remount to _btrfs_stress_remount_compress (hope
  it's better...)
- add _btrfs_get_profile_configs to remove duplicated test case array in
  each case
- use _require_scratch_nocheck and _check_scratch_fs after each loop
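
To illustrate the user override handled by _btrfs_get_profile_configs, here
is a standalone sketch of the conversion from the metadata:data format to
mkfs options. It uses the same sed expressions as the helper in patch 01,
but the configs array name is only for this example and none of the
xfstests environment is needed.

```shell
#!/bin/bash
# Example value in the "metadata_profile:data_profile" format.
BTRFS_PROFILE_CONFIGS="raid0:raid0 raid1:raid1 dup:single"

# Turn each "metadata:data" entry into "-m metadata -d data".
configs=()
for cfg in $BTRFS_PROFILE_CONFIGS; do
    configs+=("$(echo "$cfg" | sed -e 's/^/-m /' -e 's/:/ -d /')")
done

# Each array element is one complete set of mkfs options.
printf '%s\n' "${configs[@]}"
```

Each element has to stay a single array entry, which is why the tests loop
over the quoted array and pass the element to run_test as one argument.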

Changes since v1:
- put common operations in common/rc as functions and share them across
  these tests
- append mkfs options to _scratch_mkfs and _scratch_pool_mkfs instead of
  updating $MKFS_OPTIONS
- rebase on top of master and re-number starting from btrfs/059

Thanks,
Eryu Guan

Eryu Guan (15):
  btrfs: new test to run btrfs balance and subvolume test simultaneously
  btrfs: new test to run btrfs balance and scrub simultaneously
  btrfs: new test to run btrfs balance and defrag operations simultaneously
  btrfs: new case to run btrfs balance and remount with different compress algorithms
  btrfs: new case to run btrfs balance and device replace operations simultaneously
  btrfs: new case to run btrfs subvolume create/delete operations and device replace simultaneously
  btrfs: new case to run btrfs subvolume create/delete operations and scrub simultaneously
  btrfs: new case to run btrfs subvolume create/delete and defrag operations simultaneously
  btrfs: new case to run subvolume create/delete and remount with different compress algorithms
  btrfs: new case to run device replace and scrub operations simultaneously
  btrfs: new case to run device replace and defrag operations simultaneously
  btrfs: new case to run device replace and remount with different compress algorithms simultaneously
  btrfs: new case to run btrfs scrub and defrag operations simultaneously
  btrfs: new case to run btrfs scrub and remount with different compress algorithms simultaneously
  btrfs: new case to run defrag and remount with different compress algorithms simultaneously

 common/rc   | 225 +++-
 tests/btrfs/059 | 115 +++
 tests/btrfs/059.out |   2 +
 tests/btrfs/060 | 114 ++
 tests/btrfs/060.out |   2 +
 tests/btrfs/061 | 116 +++
 tests/btrfs/061.out |   2 +
 tests/btrfs/062 | 114 ++
 tests/btrfs/062.out |   2 +
 tests/btrfs/063 | 122 
 tests/btrfs/063.out |   2 +
 tests/btrfs/064 | 123 
 tests/btrfs/064.out |   2 +
 tests/btrfs/065 | 115 +++
 tests/btrfs/065.out |   2 +
 tests/btrfs/066 | 117 +++
 tests/btrfs/066.out |   2 +
 tests/btrfs/067 | 116 +++
 tests/btrfs/067.out |   2 +
 tests/btrfs/068 | 123 
 tests/btrfs/068.out |   2 +
 tests/btrfs/069 | 125 +
 tests/btrfs/069.out |   2 +
 tests/btrfs/070 | 123 
 tests/btrfs/070.out |   2 +
 tests/btrfs/071 | 116 +++
 tests/btrfs/071.out |   2 +
 tests/btrfs/072

[PATCH v3 02/15] btrfs: new test to run btrfs balance and scrub simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs balance and scrub operations simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 common/rc   |   9 +
 tests/btrfs/060 | 114 
 tests/btrfs/060.out |   2 +
 tests/btrfs/group   |   1 +
 4 files changed, 126 insertions(+)
 create mode 100755 tests/btrfs/060
 create mode 100644 tests/btrfs/060.out

diff --git a/common/rc b/common/rc
index a0677da..cc5901f 100644
--- a/common/rc
+++ b/common/rc
@@ -2585,6 +2585,15 @@ _btrfs_stress_subvolume()
done
 }
 
+# stress btrfs by running scrub in a loop
+_btrfs_stress_scrub()
+{
+   local btrfs_mnt=$1
+   while true; do
+   $BTRFS_UTIL_PROG scrub start -B $btrfs_mnt
+   done
+}
+
 init_rc()
 {
if [ $iam == new ]
diff --git a/tests/btrfs/060 b/tests/btrfs/060
new file mode 100755
index 000..b270c7b
--- /dev/null
+++ b/tests/btrfs/060
@@ -0,0 +1,114 @@
+#! /bin/bash
+# FSQA Test No. btrfs/060
+#
+# Run btrfs balance and scrub operations simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+   echo "mkfs $mkfs_opts failed"
+   return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start balance worker: " >>$seqres.full
+   _btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
+   balance_pid=$!
+   echo "$balance_pid" >>$seqres.full
+
+   echo -n "Start scrub worker: " >>$seqres.full
+   _btrfs_stress_scrub $SCRATCH_MNT >/dev/null 2>&1 &
+   scrub_pid=$!
+   echo "$scrub_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $balance_pid $scrub_pid
+   wait
+   # wait for the balance and scrub operations to finish
+   while ps aux | grep "balance start" | grep -qv grep; do
+   sleep 1
+   done
+   while ps aux | grep "scrub start" | grep -qv grep; do
+   sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+   echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/060.out b/tests/btrfs/060.out
new file mode 100644
index 000..8ffce4d
--- /dev/null
+++ b/tests/btrfs/060.out
@@ -0,0 +1,2 @@
+QA output created by 060
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index c66c42c..1c60c8f 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -62,3 +62,4 @@
 057 auto quick
 058 auto quick
 059 auto balance subvol
+060 auto balance scrub
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH v3 07/15] btrfs: new case to run btrfs subvolume create/delete operations and scrub simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs subvolume create/mount/umount/delete and btrfs scrub
operation simultaneously, with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/065 | 115 
 tests/btrfs/065.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 118 insertions(+)
 create mode 100755 tests/btrfs/065
 create mode 100644 tests/btrfs/065.out

diff --git a/tests/btrfs/065 b/tests/btrfs/065
new file mode 100755
index 000..14fee63
--- /dev/null
+++ b/tests/btrfs/065
@@ -0,0 +1,115 @@
+#! /bin/bash
+# FSQA Test No. btrfs/065
+#
+# Run btrfs subvolume create/mount/umount/delete and btrfs scrub
+# operation simultaneously, with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local subvol_mnt=$TEST_DIR/$seq.mnt
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+   echo "mkfs $mkfs_opts failed"
+   return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start subvolume worker: " >>$seqres.full
+   _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt >/dev/null 2>&1 &
+   subvol_pid=$!
+   echo "$subvol_pid" >>$seqres.full
+
+   echo -n "Start scrub worker: " >>$seqres.full
+   _btrfs_stress_scrub $SCRATCH_MNT >/dev/null 2>&1 &
+   scrub_pid=$!
+   echo "$scrub_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+
+   kill $subvol_pid $scrub_pid
+   wait
+   # wait for the scrub operation to finish
+   while ps aux | grep "scrub start" | grep -qv grep; do
+   sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+   echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   # in case the subvolume is still mounted
+   $UMOUNT_PROG $subvol_mnt >/dev/null 2>&1
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/065.out b/tests/btrfs/065.out
new file mode 100644
index 000..94476cd
--- /dev/null
+++ b/tests/btrfs/065.out
@@ -0,0 +1,2 @@
+QA output created by 065
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index a214920..4685970 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -67,3 +67,4 @@
 062 auto balance remount compress
 063 auto balance replace
 064 auto subvol replace
+065 auto subvol scrub
-- 
1.8.3.1


[PATCH v3 06/15] btrfs: new case to run btrfs subvolume create/delete operations and device replace simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs subvolume create/mount/umount/delete and device replace
operation simultaneously, with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/064 | 123 
 tests/btrfs/064.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 126 insertions(+)
 create mode 100755 tests/btrfs/064
 create mode 100644 tests/btrfs/064.out

diff --git a/tests/btrfs/064 b/tests/btrfs/064
new file mode 100755
index 000..319e480
--- /dev/null
+++ b/tests/btrfs/064
@@ -0,0 +1,123 @@
+#! /bin/bash
+# FSQA Test No. btrfs/064
+#
+# Run btrfs subvolume create/mount/umount/delete and device replace
+# operation simultaneously, with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 5
+_require_scratch_dev_pool_equal_size
+_btrfs_get_profile_configs replace
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
+   local subvol_mnt=$TEST_DIR/$seq.mnt
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   # remove the last device from the SCRATCH_DEV_POOL list so
+   # _scratch_pool_mkfs won't use all devices in pool
+   local last_dev=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`
+   SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+   echo "mkfs $mkfs_opts failed"
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+   return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start subvolume worker: " >>$seqres.full
+   _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt >/dev/null 2>&1 &
+   subvol_pid=$!
+   echo "$subvol_pid" >>$seqres.full
+
+   echo -n "Start replace worker: " >>$seqres.full
+   _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
+   replace_pid=$!
+   echo "$replace_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+
+   kill $subvol_pid $replace_pid
+   wait
+   # wait for the replace operation to finish
+   while ps aux | grep "replace start" | grep -qv grep; do
+   sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+   echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   # in case the subvolume is still mounted
+   $UMOUNT_PROG $subvol_mnt >/dev/null 2>&1
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/064.out b/tests/btrfs/064.out
new file mode 100644
index 000..d907654
--- /dev/null
+++ b/tests/btrfs/064.out
@@ -0,0 +1,2 @@
+QA output created by 064
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index e234bc2..a214920 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -66,3 +66,4 @@
 061 auto balance defrag compress
 062 auto balance remount compress
 063 auto balance replace
+064 auto subvol

[PATCH v3 10/15] btrfs: new case to run device replace and scrub operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs replace operations and scrub simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/068 | 123 
 tests/btrfs/068.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 126 insertions(+)
 create mode 100755 tests/btrfs/068
 create mode 100644 tests/btrfs/068.out

diff --git a/tests/btrfs/068 b/tests/btrfs/068
new file mode 100755
index 000..220e608
--- /dev/null
+++ b/tests/btrfs/068
@@ -0,0 +1,123 @@
+#! /bin/bash
+# FSQA Test No. btrfs/068
+#
+# Run btrfs replace operations and scrub simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 5
+_require_scratch_dev_pool_equal_size
+_btrfs_get_profile_configs replace
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   # remove the last device from the SCRATCH_DEV_POOL list so
+   # _scratch_pool_mkfs won't use all devices in pool
+   local last_dev=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`
+   SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+   echo "mkfs $mkfs_opts failed"
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+   return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start replace worker: " >>$seqres.full
+   _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
+   replace_pid=$!
+   echo "$replace_pid" >>$seqres.full
+
+   echo -n "Start scrub worker: " >>$seqres.full
+   _btrfs_stress_scrub $SCRATCH_MNT >/dev/null 2>&1 &
+   scrub_pid=$!
+   echo "$scrub_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $replace_pid $scrub_pid
+   wait
+
+   # wait for the scrub and replace operations to finish
+   while ps aux | grep "scrub start" | grep -qv grep; do
+   sleep 1
+   done
+   while ps aux | grep "replace start" | grep -qv grep; do
+   sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+   echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/068.out b/tests/btrfs/068.out
new file mode 100644
index 000..d10c9bd
--- /dev/null
+++ b/tests/btrfs/068.out
@@ -0,0 +1,2 @@
+QA output created by 068
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index d515ff5..1e83505 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -70,3 +70,4 @@
 065 auto subvol scrub
 066 auto subvol defrag compress
 067 auto subvol remount compress
+068 auto replace scrub
-- 
1.8.3.1


[PATCH v3 03/15] btrfs: new test to run btrfs balance and defrag operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs balance and defrag operations simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 common/rc   |  20 +
 tests/btrfs/061 | 116 
 tests/btrfs/061.out |   2 +
 tests/btrfs/group   |   1 +
 4 files changed, 139 insertions(+)
 create mode 100755 tests/btrfs/061
 create mode 100644 tests/btrfs/061.out

diff --git a/common/rc b/common/rc
index cc5901f..a333af8 100644
--- a/common/rc
+++ b/common/rc
@@ -2594,6 +2594,26 @@ _btrfs_stress_scrub()
done
 }
 
+# stress btrfs by defragmenting every file/dir in a loop and compress file
+# contents while defragmenting if second argument is not nocompress
+_btrfs_stress_defrag()
+{
+   local btrfs_mnt=$1
+   local compress=$2
+
+   while true; do
+   if [ "$compress" == "nocompress" ]; then
+   find $btrfs_mnt \( -type f -o -type d \) -exec \
+   $BTRFS_UTIL_PROG filesystem defrag {} \;
+   else
+   find $btrfs_mnt \( -type f -o -type d \) -exec \
+   $BTRFS_UTIL_PROG filesystem defrag -clzo {} \;
+   find $btrfs_mnt \( -type f -o -type d \) -exec \
+   $BTRFS_UTIL_PROG filesystem defrag -czlib {} \;
+   fi
+   done
+}
+
 init_rc()
 {
if [ $iam == new ]
diff --git a/tests/btrfs/061 b/tests/btrfs/061
new file mode 100755
index 000..4aeb366
--- /dev/null
+++ b/tests/btrfs/061
@@ -0,0 +1,116 @@
+#! /bin/bash
+# FSQA Test No. btrfs/061
+#
+# Run btrfs balance and defrag operations simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local with_compress=$2
+
+   echo "Test $mkfs_opts with $with_compress" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+   echo "mkfs $mkfs_opts failed"
+   return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start balance worker: " >>$seqres.full
+   _btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
+   balance_pid=$!
+   echo "$balance_pid" >>$seqres.full
+
+   echo -n "Start defrag worker: " >>$seqres.full
+   _btrfs_stress_defrag $SCRATCH_MNT $with_compress >/dev/null 2>&1 &
+   defrag_pid=$!
+   echo "$defrag_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $balance_pid $defrag_pid
+   wait
+   # wait for the balance and defrag operations to finish
+   while ps aux | grep "balance start" | grep -qv grep; do
+   sleep 1
+   done
+   while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
+   sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+   echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t" "nocompress"
+

[PATCH v3 08/15] btrfs: new case to run btrfs subvolume create/delete and defrag operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs subvolume create/mount/umount/delete and btrfs defrag
operations simultaneously, with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/066 | 117 
 tests/btrfs/066.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 120 insertions(+)
 create mode 100755 tests/btrfs/066
 create mode 100644 tests/btrfs/066.out

diff --git a/tests/btrfs/066 b/tests/btrfs/066
new file mode 100755
index 000..d8e165d
--- /dev/null
+++ b/tests/btrfs/066
@@ -0,0 +1,117 @@
+#! /bin/bash
+# FSQA Test No. btrfs/066
+#
+# Run btrfs subvolume create/mount/umount/delete and btrfs defrag
+# operation simultaneously, with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local with_compress=$2
+   local subvol_mnt=$TEST_DIR/$seq.mnt
+
+   echo "Test $mkfs_opts with $with_compress" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start subvolume worker: " >>$seqres.full
+   _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt >/dev/null 2>&1 &
+   subvol_pid=$!
+   echo "$subvol_pid" >>$seqres.full
+
+   echo -n "Start defrag worker: " >>$seqres.full
+   _btrfs_stress_defrag $SCRATCH_MNT $with_compress >/dev/null 2>&1 &
+   defrag_pid=$!
+   echo "$defrag_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+
+   kill $subvol_pid $defrag_pid
+   wait
+   # wait for btrfs defrag process to exit, otherwise it will block umount
+   while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   # in case the subvolume is still mounted
+   $UMOUNT_PROG $subvol_mnt >/dev/null 2>&1
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t" "nocompress"
+   run_test "$t" "compress"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/066.out b/tests/btrfs/066.out
new file mode 100644
index 000..b60cc24
--- /dev/null
+++ b/tests/btrfs/066.out
@@ -0,0 +1,2 @@
+QA output created by 066
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 4685970..bbfbb6a 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -68,3 +68,4 @@
 063 auto balance replace
 064 auto subvol replace
 065 auto subvol scrub
+066 auto subvol defrag compress
-- 
1.8.3.1

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
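
The run_test() functions in this series share one lifecycle: a finite fsstress load runs in the background, endless workers run beside it, and once fsstress exits the workers are killed and reaped. Below is a minimal sketch of that lifecycle only. The functions dummy_load and dummy_worker are hypothetical stand-ins for fsstress and the _btrfs_stress_* helpers, and no filesystem is touched.

```bash
# Hypothetical stand-in for fsstress: a finite background load.
dummy_load()
{
	sleep 0.3
}

# Hypothetical stand-in for a _btrfs_stress_* helper: loops until killed.
dummy_worker()
{
	while true; do
		sleep 0.1
	done
}

dummy_load &
load_pid=$!

dummy_worker >/dev/null 2>&1 &
worker_pid=$!

# Block until the finite load is done, then stop the endless worker.
wait $load_pid
kill $worker_pid

# Reap the worker; the status is above 128 because a signal ended it.
wait $worker_pid 2>/dev/null
worker_status=$?
echo "worker exit status: $worker_status"
```

Note that kill only signals the worker loop itself. A command the loop started may still be running afterwards, which is presumably why the tests also poll ps before unmounting.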

[PATCH v3 09/15] btrfs: new case to run subvolume create/delete and remount with different compress algorithms

2014-09-17 Thread Eryu Guan

Run btrfs subvolume create/mount/umount/delete and remount with
different compress algorithms simultaneously, with fsstress running in
background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/067 | 116 
 tests/btrfs/067.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 119 insertions(+)
 create mode 100755 tests/btrfs/067
 create mode 100644 tests/btrfs/067.out

diff --git a/tests/btrfs/067 b/tests/btrfs/067
new file mode 100755
index 000..033f3a5
--- /dev/null
+++ b/tests/btrfs/067
@@ -0,0 +1,116 @@
+#! /bin/bash
+# FSQA Test No. btrfs/067
+#
+# Run btrfs subvolume create/mount/umount/delete and remount with
+# different compress algorithms simultaneously, with fsstress running
+# in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local subvol_mnt=$TEST_DIR/$seq.mnt
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start subvolume worker: " >>$seqres.full
+   _btrfs_stress_subvolume $SCRATCH_DEV $SCRATCH_MNT subvol_$$ $subvol_mnt >/dev/null 2>&1 &
+   subvol_pid=$!
+   echo "$subvol_pid" >>$seqres.full
+
+   echo -n "Start remount worker: " >>$seqres.full
+   _btrfs_stress_remount_compress $SCRATCH_MNT >/dev/null 2>&1 &
+   remount_pid=$!
+   echo "$remount_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+
+   kill $subvol_pid $remount_pid
+   wait
+   # wait for the remount loop process to finish
+   while ps aux | grep "mount.*$SCRATCH_MNT" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   # in case the subvolume is still mounted
+   $UMOUNT_PROG $subvol_mnt >/dev/null 2>&1
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/067.out b/tests/btrfs/067.out
new file mode 100644
index 000..daa1545
--- /dev/null
+++ b/tests/btrfs/067.out
@@ -0,0 +1,2 @@
+QA output created by 067
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index bbfbb6a..d515ff5 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -69,3 +69,4 @@
 064 auto subvol replace
 065 auto subvol scrub
 066 auto subvol defrag compress
+067 auto subvol remount compress
-- 
1.8.3.1


[PATCH v3 11/15] btrfs: new case to run device replace and defrag operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs replace operations and defrag simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/069 | 125 
 tests/btrfs/069.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 128 insertions(+)
 create mode 100755 tests/btrfs/069
 create mode 100644 tests/btrfs/069.out

diff --git a/tests/btrfs/069 b/tests/btrfs/069
new file mode 100755
index 000..882b34b
--- /dev/null
+++ b/tests/btrfs/069
@@ -0,0 +1,125 @@
+#! /bin/bash
+# FSQA Test No. btrfs/069
+#
+# Run btrfs replace operations and defrag simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 5
+_require_scratch_dev_pool_equal_size
+_btrfs_get_profile_configs replace
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local with_compress=$2
+   local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
+
+   echo "Test $mkfs_opts with $with_compress" >>$seqres.full
+
+   # remove the last device from the SCRATCH_DEV_POOL list so
+   # _scratch_pool_mkfs won't use all devices in pool
+   local last_dev=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`
+   SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start replace worker: " >>$seqres.full
+   _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
+   replace_pid=$!
+   echo "$replace_pid" >>$seqres.full
+
+   echo -n "Start defrag worker: " >>$seqres.full
+   _btrfs_stress_defrag $SCRATCH_MNT $with_compress >/dev/null 2>&1 &
+   defrag_pid=$!
+   echo "$defrag_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $replace_pid $defrag_pid
+   wait
+
+   # wait for the defrag and replace operations to finish
+   while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
+       sleep 1
+   done
+   while ps aux | grep "replace start" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t" "nocompress"
+   run_test "$t" "compress"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/069.out b/tests/btrfs/069.out
new file mode 100644
index 000..532a929
--- /dev/null
+++ b/tests/btrfs/069.out
@@ -0,0 +1,2 @@
+QA output created by 069
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 1e83505..53f5f4b 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -71,3 +71,4 @@
 066 auto subvol defrag compress
 067 auto subvol remount compress
 068 auto replace scrub
+069 auto replace defrag compress
-- 
1.8.3.1
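
The replace tests keep their device lists as space-separated strings and edit them with sed and awk: the last device is held back from mkfs as a spare, and the replace worker then rotates source and spare devices. The sketch below simulates only that bookkeeping. The device names are hypothetical and no btrfs command is run; a real round would call "btrfs replace start" where the rotation string is extended.

```bash
# Hypothetical device names, for illustration only.
pool="/dev/sdb /dev/sdc /dev/sdd /dev/sde"
scratch_dev="/dev/sdb"

# Hold back the last device so mkfs would not consume the whole pool.
last_dev=$(echo $pool | awk '{print $NF}')
mkfs_pool=$(echo $pool | sed -e "s# *$last_dev *##")

# Devices eligible for replacement: all but the first one and the spare.
dev_pool=$(echo $pool | sed -e "s# *$scratch_dev *##" -e "s# *$last_dev *##")
free_dev=$last_dev
src_dev=$(echo $dev_pool | awk '{print $1}')

# Three simulated replace rounds.
rotation=""
for round in 1 2 3; do
	rotation="$rotation$src_dev->$free_dev;"
	# The new device joins the pool, the replaced one leaves it.
	dev_pool="$dev_pool $free_dev"
	dev_pool=$(echo $dev_pool | sed -e "s# *$src_dev *##")
	# The replaced device becomes the next spare.
	free_dev=$src_dev
	src_dev=$(echo $dev_pool | awk '{print $1}')
done

echo "mkfs_pool=$mkfs_pool"
echo "rotation=$rotation"
echo "dev_pool=$dev_pool free_dev=$free_dev"
```

Because the sed pattern also consumes the spaces on both sides of the name, it is only safe for the first or last word of a list; removing a middle entry would glue its neighbours together. The rotation above only ever removes the first entry.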


[PATCH v3 04/15] btrfs: new case to run btrfs balance and remount with different compress algorithms

2014-09-17 Thread Eryu Guan

Run btrfs balance and remount with different compress algorithms
simultaneously, with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 common/rc   |  14 +++
 tests/btrfs/062 | 114 
 tests/btrfs/062.out |   2 +
 tests/btrfs/group   |   1 +
 4 files changed, 131 insertions(+)
 create mode 100755 tests/btrfs/062
 create mode 100644 tests/btrfs/062.out

diff --git a/common/rc b/common/rc
index a333af8..571dfad 100644
--- a/common/rc
+++ b/common/rc
@@ -2614,6 +2614,20 @@ _btrfs_stress_defrag()
done
 }
 
+# stress btrfs by remounting it with different compression algorithms in a loop.
+# Running this with fsstress in the background exercises the compression code
+# path and checks for races when switching compression algorithms under
+# constant I/O activity.
+_btrfs_stress_remount_compress()
+{
+   local btrfs_mnt=$1
+   while true; do
+       for algo in no zlib lzo; do
+           $MOUNT_PROG -o remount,compress=$algo $btrfs_mnt
+       done
+   done
+}
+
 init_rc()
 {
if [ $iam == new ]
diff --git a/tests/btrfs/062 b/tests/btrfs/062
new file mode 100755
index 000..274da5c
--- /dev/null
+++ b/tests/btrfs/062
@@ -0,0 +1,114 @@
+#! /bin/bash
+# FSQA Test No. btrfs/062
+#
+# Run btrfs balance and remount with different compress algorithms
+# simultaneously, with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start balance worker: " >>$seqres.full
+   _btrfs_stress_balance $SCRATCH_MNT >/dev/null 2>&1 &
+   balance_pid=$!
+   echo "$balance_pid" >>$seqres.full
+
+   echo -n "Start remount worker: " >>$seqres.full
+   _btrfs_stress_remount_compress $SCRATCH_MNT >/dev/null 2>&1 &
+   remount_pid=$!
+   echo "$remount_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $balance_pid $remount_pid
+   wait
+   # wait for the balance and remount loop to finish
+   while ps aux | grep "balance start" | grep -qv grep; do
+       sleep 1
+   done
+   while ps aux | grep "mount.*$SCRATCH_MNT" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/062.out b/tests/btrfs/062.out
new file mode 100644
index 000..a1578f4
--- /dev/null
+++ b/tests/btrfs/062.out
@@ -0,0 +1,2 @@
+QA output created by 062
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 85d30b3..7aa9bf3 100644
---

[PATCH v3 05/15] btrfs: new case to run btrfs balance and device replace operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs balance and replace operations simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 common/rc   |  73 +++
 tests/btrfs/063 | 122 
 tests/btrfs/063.out |   2 +
 tests/btrfs/group   |   1 +
 4 files changed, 198 insertions(+)
 create mode 100755 tests/btrfs/063
 create mode 100644 tests/btrfs/063.out

diff --git a/common/rc b/common/rc
index 571dfad..3291823 100644
--- a/common/rc
+++ b/common/rc
@@ -2157,6 +2157,24 @@ _require_scratch_dev_pool()
done
 }
 
+# ensure devices in SCRATCH_DEV_POOL are of the same size
+# must be called after _require_scratch_dev_pool
+_require_scratch_dev_pool_equal_size()
+{
+   local _size
+   local _newsize
+   local _dev
+
+   # SCRATCH_DEV has been set to the first device in SCRATCH_DEV_POOL
+   _size=`_get_device_size $SCRATCH_DEV`
+   for _dev in $SCRATCH_DEV_POOL; do
+       _newsize=`_get_device_size $_dev`
+       if [ $_size -ne $_newsize ]; then
+           _notrun "This test requires devices in SCRATCH_DEV_POOL have the same size"
+       fi
+   done
+}
+
 # We will check if the device is deletable
 _require_deletable_scratch_dev_pool()
 {
@@ -2628,6 +2646,61 @@ _btrfs_stress_remount_compress()
done
 }
 
+# stress btrfs by replacing devices in a loop
+# Note that at least 3 devices are needed in SCRATCH_DEV_POOL and the last
+# device should be free (not used by btrfs)
+_btrfs_stress_replace()
+{
+   local btrfs_mnt=$1
+
+   # The device number in SCRATCH_DEV_POOL should be at least 3,
+   # one is SCRATCH_DEV, one is to be replaced, one is free device
+   # we won't replace SCRATCH_DEV, see below for reason
+   if [ `echo $SCRATCH_DEV_POOL | wc -w` -lt 3 ]; then
+       echo "_btrfs_stress_replace requires at least 3 devices in SCRATCH_DEV_POOL"
+       return
+   fi
+
+   # take the last device as the first free_dev
+   local free_dev=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`
+
+   # free_dev should be really free
+   if $BTRFS_UTIL_PROG filesystem show | grep -q $free_dev; then
+   echo _btrfs_stress_replace: $free_dev is used by btrfs
+   return
+   fi
+
+   # dev_pool is device list being currently used by btrfs (excluding SCRATCH_DEV)
+   # and can be replaced. We don't replace SCRATCH_DEV because it will be used in
+   # _scratch_mount and _check_scratch_fs etc.
+   local dev_pool=`echo $SCRATCH_DEV_POOL | sed -e "s# *$SCRATCH_DEV *##" \
+       -e "s# *$free_dev *##"`
+
+   # set the first device in dev_pool as the first src_dev to be replaced
+   local src_dev=`echo $dev_pool | $AWK_PROG '{print $1}'`
+
+   echo "dev_pool=$dev_pool"
+   echo "free_dev=$free_dev, src_dev=$src_dev"
+   while true; do
+       echo "Replacing $src_dev with $free_dev"
+       $BTRFS_UTIL_PROG replace start -fB $src_dev $free_dev $btrfs_mnt
+       if [ $? -ne 0 ]; then
+           # don't update src_dev and free_dev if replace failed
+           continue
+       fi
+       dev_pool="$dev_pool $free_dev"
+       dev_pool=`echo $dev_pool | sed -e "s# *$src_dev *##"`
+       free_dev=$src_dev
+       src_dev=`echo $dev_pool | $AWK_PROG '{print $1}'`
+   done
+}
+
+# return device size in kb
+_get_device_size()
+{
+   grep `_short_dev $1` /proc/partitions | awk '{print $3}'
+}
+
 init_rc()
 {
if [ $iam == new ]
diff --git a/tests/btrfs/063 b/tests/btrfs/063
new file mode 100755
index 000..94adc21
--- /dev/null
+++ b/tests/btrfs/063
@@ -0,0 +1,122 @@
+#! /bin/bash
+# FSQA Test No. btrfs/063
+#
+# Run btrfs balance and replace operations simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f

[PATCH v3 14/15] btrfs: new case to run btrfs scrub and remount with different compress algorithms simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs scrub and remount with different compress algorithms
simultaneously with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/072 | 114 
 tests/btrfs/072.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 117 insertions(+)
 create mode 100755 tests/btrfs/072
 create mode 100644 tests/btrfs/072.out

diff --git a/tests/btrfs/072 b/tests/btrfs/072
new file mode 100755
index 000..b185f85
--- /dev/null
+++ b/tests/btrfs/072
@@ -0,0 +1,114 @@
+#! /bin/bash
+# FSQA Test No. btrfs/072
+#
+# Run btrfs scrub and remount with different compress algorithms
+# simultaneously with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+
+   echo "Test $mkfs_opts" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start scrub worker: " >>$seqres.full
+   _btrfs_stress_scrub $SCRATCH_MNT >/dev/null 2>&1 &
+   scrub_pid=$!
+   echo "$scrub_pid" >>$seqres.full
+
+   echo -n "Start remount worker: " >>$seqres.full
+   _btrfs_stress_remount_compress $SCRATCH_MNT >/dev/null 2>&1 &
+   remount_pid=$!
+   echo "$remount_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $scrub_pid $remount_pid
+   wait
+   # wait for the scrub and remount operations to finish
+   while ps aux | grep "scrub start" | grep -qv grep; do
+       sleep 1
+   done
+   while ps aux | grep "mount.*$SCRATCH_MNT" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/072.out b/tests/btrfs/072.out
new file mode 100644
index 000..590bbc6
--- /dev/null
+++ b/tests/btrfs/072.out
@@ -0,0 +1,2 @@
+QA output created by 072
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 5377acf..e631d5b 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -74,3 +74,4 @@
 069 auto replace defrag compress
 070 auto replace remount compress
 071 auto scrub defrag compress
+072 auto scrub remount compress
-- 
1.8.3.1
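
Every test in this series finishes with a foreground scrub whose output goes to the full log, while a failure notice is sent through tee so that it reaches both stdout and the log. The following is a minimal sketch of that reporting pattern. The function fake_scrub and the temporary log file are hypothetical stand-ins for "btrfs scrub start -B" and $seqres.full.

```bash
# Hypothetical log file standing in for $seqres.full.
log=$(mktemp)

# Hypothetical stand-in for a foreground scrub; it always fails here.
fake_scrub()
{
	echo "scrub output"
	return 1
}

# Command output goes only to the log. The failure notice goes to stdout
# as well, where it would break the golden-output comparison.
notice=""
if ! fake_scrub >>"$log" 2>&1; then
	notice=$(echo "scrub reported errors" | tee -a "$log")
fi

echo "$notice"
log_lines=$(wc -l <"$log")
rm -f "$log"
```

Testing the command directly with "if !" is equivalent to checking $? on the next line, and it avoids the risk of another command slipping in between and overwriting the status.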


[PATCH v3 13/15] btrfs: new case to run btrfs scrub and defrag operations simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs scrub and defrag operations simultaneously with fsstress
running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/071 | 116 
 tests/btrfs/071.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 119 insertions(+)
 create mode 100755 tests/btrfs/071
 create mode 100644 tests/btrfs/071.out

diff --git a/tests/btrfs/071 b/tests/btrfs/071
new file mode 100755
index 000..643b489
--- /dev/null
+++ b/tests/btrfs/071
@@ -0,0 +1,116 @@
+#! /bin/bash
+# FSQA Test No. btrfs/071
+#
+# Run btrfs scrub and defrag operations simultaneously with fsstress
+# running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 4
+_btrfs_get_profile_configs
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local with_compress=$2
+
+   echo "Test $mkfs_opts with $with_compress" >>$seqres.full
+
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+       echo "mkfs $mkfs_opts failed"
+       return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start scrub worker: " >>$seqres.full
+   _btrfs_stress_scrub $SCRATCH_MNT >/dev/null 2>&1 &
+   scrub_pid=$!
+   echo "$scrub_pid" >>$seqres.full
+
+   echo -n "Start defrag worker: " >>$seqres.full
+   _btrfs_stress_defrag $SCRATCH_MNT $with_compress >/dev/null 2>&1 &
+   defrag_pid=$!
+   echo "$defrag_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $scrub_pid $defrag_pid
+   wait
+   # wait for the scrub and defrag operations to finish
+   while ps aux | grep "scrub start" | grep -qv grep; do
+       sleep 1
+   done
+   while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
+       sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+       echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t" "nocompress"
+   run_test "$t" "compress"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/071.out b/tests/btrfs/071.out
new file mode 100644
index 000..9a9ef40
--- /dev/null
+++ b/tests/btrfs/071.out
@@ -0,0 +1,2 @@
+QA output created by 071
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index f68370d..5377acf 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -73,3 +73,4 @@
 068 auto replace scrub
 069 auto replace defrag compress
 070 auto replace remount compress
+071 auto scrub defrag compress
-- 
1.8.3.1
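
run_test() receives a profile config as its first argument and later expands it as mkfs options. Assuming a config is one string holding several options (an assumption, since _btrfs_get_profile_configs is not part of this message), the sketch below shows how shell word splitting treats such a string. The helper count_args and the sample config are hypothetical.

```bash
# Print how many arguments were received.
count_args()
{
	echo $#
}

# Hypothetical profile config; the real list comes from
# _btrfs_get_profile_configs, which is not shown here.
cfg="-m single -d single"

# Quoted: the whole config travels as one argument.
as_one=$(count_args "$cfg")
# Unquoted: the shell splits it into separate options.
as_many=$(count_args $cfg)

echo "quoted=$as_one unquoted=$as_many"
```

Under that assumption, the caller has to quote the config so it arrives intact in $1, while the mkfs call has to leave it unquoted so each option becomes its own argument.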


[PATCH v3 12/15] btrfs: new case to run device replace and remount with different compress algorithms simultaneously

2014-09-17 Thread Eryu Guan

Run btrfs replace operations and remount with different compress
algorithms simultaneously with fsstress running in background.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/070 | 123 
 tests/btrfs/070.out |   2 +
 tests/btrfs/group   |   1 +
 3 files changed, 126 insertions(+)
 create mode 100755 tests/btrfs/070
 create mode 100644 tests/btrfs/070.out

diff --git a/tests/btrfs/070 b/tests/btrfs/070
new file mode 100755
index 000..4f76455
--- /dev/null
+++ b/tests/btrfs/070
@@ -0,0 +1,123 @@
+#! /bin/bash
+# FSQA Test No. btrfs/070
+#
+# Run btrfs replace operations and remount with different compress
+# algorithms simultaneously with fsstress running in background.
+#
+#---
+# Copyright (C) 2014 Red Hat Inc. All rights reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo QA output created by $seq
+
+here=`pwd`
+tmp=/tmp/$$
+status=1
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+   cd /
+   rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+# we check scratch dev after each loop
+_require_scratch_nocheck
+_require_scratch_dev_pool 5
+_require_scratch_dev_pool_equal_size
+_btrfs_get_profile_configs replace
+
+rm -f $seqres.full
+
+run_test()
+{
+   local mkfs_opts=$1
+   local saved_scratch_dev_pool=$SCRATCH_DEV_POOL
+
+   echo Test $mkfs_opts $seqres.full
+
+   # remove the last device from the SCRATCH_DEV_POOL list so
+   # _scratch_pool_mkfs won't use all devices in pool
+   local last_dev=`echo $SCRATCH_DEV_POOL | $AWK_PROG '{print $NF}'`
+   SCRATCH_DEV_POOL=`echo $SCRATCH_DEV_POOL | sed -e "s# *$last_dev *##"`
+   _scratch_pool_mkfs $mkfs_opts >>$seqres.full 2>&1
+   # make sure we created btrfs with desired options
+   if [ $? -ne 0 ]; then
+      echo "mkfs $mkfs_opts failed"
+      SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+      return
+   fi
+   _scratch_mount >>$seqres.full 2>&1
+   SCRATCH_DEV_POOL=$saved_scratch_dev_pool
+
+   args=`_scale_fsstress_args -p 20 -n 100 $FSSTRESS_AVOID -d $SCRATCH_MNT/stressdir`
+   echo "Run fsstress $args" >>$seqres.full
+   $FSSTRESS_PROG $args >/dev/null 2>&1 &
+   fsstress_pid=$!
+
+   echo -n "Start replace worker: " >>$seqres.full
+   _btrfs_stress_replace $SCRATCH_MNT >>$seqres.full 2>&1 &
+   replace_pid=$!
+   echo "$replace_pid" >>$seqres.full
+
+   echo -n "Start remount worker: " >>$seqres.full
+   _btrfs_stress_remount_compress $SCRATCH_MNT >/dev/null 2>&1 &
+   remount_pid=$!
+   echo "$remount_pid" >>$seqres.full
+
+   echo "Wait for fsstress to exit and kill all background workers" >>$seqres.full
+   wait $fsstress_pid
+   kill $replace_pid $remount_pid
+   wait
+
+   # wait for the remount and replace operations to finish
+   while ps aux | grep "btrfs filesystem defrag" | grep -qv grep; do
+      sleep 1
+   done
+   while ps aux | grep "mount.*$SCRATCH_MNT" | grep -qv grep; do
+      sleep 1
+   done
+
+   echo "Scrub the filesystem" >>$seqres.full
+   $BTRFS_UTIL_PROG scrub start -B $SCRATCH_MNT >>$seqres.full 2>&1
+   if [ $? -ne 0 ]; then
+      echo "Scrub find errors in \"$mkfs_opts\" test" | tee -a $seqres.full
+   fi
+
+   _scratch_unmount
+   # we called _require_scratch_nocheck instead of _require_scratch
+   # do check after test for each profile config
+   _check_scratch_fs
+}
+
+echo "Silence is golden"
+for t in "${_btrfs_profile_configs[@]}"; do
+   run_test "$t"
+done
+
+status=0
+exit
diff --git a/tests/btrfs/070.out b/tests/btrfs/070.out
new file mode 100644
index 000..8940c5d
--- /dev/null
+++ b/tests/btrfs/070.out
@@ -0,0 +1,2 @@
+QA output created by 070
+Silence is golden
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 53f5f4b..f68370d 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -72,3 +72,4 @@
 067 auto subvol remount compress
 068 auto replace scrub
 069 auto replace defrag compress
+070 auto replace remount compress
-- 
1.8.3.1


Re: [PATCH 3/3] btrfs-progs: fix device missing of btrfs fi show with seeding devices

2014-09-17 Thread Anand Jain




 Hi Gui,

 Thanks for the attempt to fix this. More below.

On 09/18/2014 11:31 AM, Gui Hecheng wrote:

*Note*: this handles the problem in the unmounted state;
the problem in the mounted state has already been fixed by Anand.

Steps to reproduce:
# mkfs.btrfs -f /dev/sda1
# btrfstune -S 1 /dev/sda1
# mount /dev/sda1 /mnt
# btrfs dev add /dev/sda2 /mnt
# umount /mnt   <== (umounted)
# btrfs fi show /dev/sda2
result:
Label: none  uuid: XX
        Total devices 2 FS bytes used 368.00KiB
        devid    2 size 9.31GiB used 1.25GiB path /dev/sda2
        *** Some devices missing
Btrfs v3.16-67-g69f54ea-dirty

It is because the @btrfs_scan_lblkid procedure is not capable of detecting
seeding devices since the seeding devices have different FSIDs from
derived devices. So when it tries to show all devices under the derived
fs, only the derived devices are shown.


 Hmm, that's not true. btrfs_scan_lblkid() finds all btrfs devices,
 including the seed/sprout devices. However, btrfs_scan_lblkid() won't
 establish the mapping between the seed and sprout devices.
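
 To illustrate the point with the reproducer above: a scan can see both
 devices and still report one as missing if devices are grouped purely
 by fsid. The sketch below is only an illustration, not btrfs-progs
 code; struct scanned_dev, count_by_fsid() and devices_missing() are
 hypothetical names, and the fsid strings are placeholders.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record of one scanned device. */
struct scanned_dev {
    const char *path;
    const char *fsid;        /* fsid read from the device's superblock */
    unsigned num_devices;    /* device count recorded in that superblock */
};

/* Count the scanned devices whose fsid matches the given one. */
static unsigned count_by_fsid(const struct scanned_dev *devs, size_t n,
                              const char *fsid)
{
    unsigned found = 0;

    for (size_t i = 0; i < n; i++)
        if (strcmp(devs[i].fsid, fsid) == 0)
            found++;
    return found;
}

/* Return 1 if fewer devices share fs's fsid than its superblock expects. */
static int devices_missing(const struct scanned_dev *devs, size_t n,
                           const struct scanned_dev *fs)
{
    return count_by_fsid(devs, n, fs->fsid) < fs->num_devices;
}
```

 With /dev/sda1 as the seed (its own fsid) and /dev/sda2 as the sprout
 (a different fsid, two devices expected), the sprout's group holds only
 one device, which matches the "Some devices missing" output quoted
 above.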


Actually, @open_ctree deals with the seeding devices properly, so
we can make use of it to find seeding devices.
We call @open_ctree on every block device with a btrfs on it,
and all devices under the opening filesystem including the seed devices
will be ready to be shown.


 Looking at the code below, I doubt this will work with
 nested seed-sprout relations. What did I miss?

 It's better to keep the seed-sprout mapping part separate from the
 device scan using lblkid.
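
 The nested case can be sketched as follows. This is a minimal
 illustration under assumptions, not the actual fix: struct fs_devs is
 a hypothetical, pared-down stand-in for struct btrfs_fs_devices that
 keeps only a device count and the seed pointer, and both helper names
 are invented here.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct btrfs_fs_devices. */
struct fs_devs {
    unsigned num_devices;   /* devices that carry this fsid */
    struct fs_devs *seed;   /* next older seed filesystem, or NULL */
};

/* Count the devices of a filesystem plus every seed behind it. */
static unsigned count_with_seeds(const struct fs_devs *fs)
{
    unsigned total = 0;

    for (; fs != NULL; fs = fs->seed)
        total += fs->num_devices;
    return total;
}

/* Same count, but following the seed pointer only once, which is
 * what a single "if (fs_devices->seed)" check amounts to. */
static unsigned count_one_level(const struct fs_devs *fs)
{
    unsigned total = fs->num_devices;

    if (fs->seed)
        total += fs->seed->num_devices;
    return total;
}
```

 For a chain of seed A, then seed B sprouted from A, then a sprout on
 top of B, the full walk counts three devices while the one-level
 version counts two.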


Thanks, Anand



Signed-off-by: Gui Hecheng guihc.f...@cn.fujitsu.com
---
  cmds-filesystem.c | 104 --
  1 file changed, 69 insertions(+), 35 deletions(-)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index dc5185e..f978175 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -28,6 +28,7 @@
  #include <mntent.h>
  #include <linux/limits.h>
  #include <getopt.h>
+#include <blkid/blkid.h>

  #include "kerncompat.h"
  #include "ctree.h"
@@ -268,10 +269,26 @@ static int cmp_device_id(void *priv, struct list_head *a,
          da->devid > db->devid ? 1 : 0;
  }

+static void print_devices(struct btrfs_fs_devices *fs_devices, u64 *devs_found)
+{
+   struct btrfs_device *device;
+   struct list_head *cur;
+
+   list_sort(NULL, &fs_devices->devices, cmp_device_id);
+   list_for_each(cur, &fs_devices->devices) {
+      device = list_entry(cur, struct btrfs_device, dev_list);
+
+      printf("\tdevid %4llu size %s used %s path %s\n",
+         (unsigned long long)device->devid,
+         pretty_size(device->total_bytes),
+         pretty_size(device->bytes_used), device->name);
+      (*devs_found)++;
+   }
+}
+
  static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
  {
char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
-   struct list_head *cur;
struct btrfs_device *device;
u64 devs_found = 0;
u64 total;
@@ -293,17 +310,10 @@ static void print_one_uuid(struct btrfs_fs_devices *fs_devices)
          (unsigned long long)total,
          pretty_size(device->super_bytes_used));

-   list_sort(NULL, &fs_devices->devices, cmp_device_id);
-   list_for_each(cur, &fs_devices->devices) {
-      device = list_entry(cur, struct btrfs_device, dev_list);
-
-      printf("\tdevid %4llu size %s used %s path %s\n",
-         (unsigned long long)device->devid,
-         pretty_size(device->total_bytes),
-         pretty_size(device->bytes_used), device->name);
+   if (fs_devices->seed)
+      print_devices(fs_devices->seed, &devs_found);
+   print_devices(fs_devices, &devs_found);

-      devs_found++;
-   }
    if (devs_found < total) {
       printf("\t*** Some devices missing\n");
    }
@@ -489,6 +499,53 @@ out:
return ret;
  }

+static int scan_all_fs_lblkid(char *search_target)
+{
+   blkid_dev_iterate iter = NULL;
+   blkid_dev dev = NULL;
+   blkid_cache cache = NULL;
+   char path[PATH_MAX];
+   struct btrfs_fs_info *fs_info;
+   int found = 0;
+
+   if (blkid_get_cache(&cache, 0) < 0) {
+      printf("ERROR: lblkid cache get failed\n");
+      return -1;
+   }
+   blkid_probe_all(cache);
+   iter = blkid_dev_iterate_begin(cache);
+   blkid_dev_set_search(iter, "TYPE", "btrfs");
+   while (blkid_dev_next(iter, &dev) == 0) {
+      dev = blkid_verify(cache, dev);
+      if (!dev)
+         continue;
+      strncpy(path, blkid_dev_devname(dev), PATH_MAX);
+      fs_info = open_ctree_fs_info(path, 0, 0, OPEN_CTREE_PARTIAL);
+      if (!fs_info)
+         continue;
+
+   if
