Re: subvol copying

2013-05-15 Thread Gabriel de Perthuis
A user of a workstation has a home directory /home/john as a subvolume. I wrote a cron job to make read-only snapshots of it under /home/john/backup which was fortunate as they just ran a script that did something like rm -rf ~. Apart from copying dozens of gigs of data back, is there a

[PATCH 01/17] Btrfs: fix accessing a freed tree root

2013-05-15 Thread Miao Xie
inode_tree_del() will move the tree root into the dead root list, and then the tree will be destroyed by the cleaner. So if we remove the delayed node which is cached in the inode after inode_tree_del(), we may access a freed tree root. Fix it. Signed-off-by: Miao Xie mi...@cn.fujitsu.com ---

[PATCH 02/17] Btrfs: fix unprotected root node of the subvolume's inode rb-tree

2013-05-15 Thread Miao Xie
The root node of the rb-tree may be changed, so we should get it under the lock. Fix it. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/inode.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 7f6e78a..bf5c399 100644

[PATCH 07/17] Btrfs: don't invoke btrfs_invalidate_inodes() in the spin lock context

2013-05-15 Thread Miao Xie
btrfs_invalidate_inodes() may sleep, so we should not invoke it in the spin lock context. Fix it. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/disk-io.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 642c861..724a0da 100644

[PATCH 08/17] Btrfs: introduce per-subvolume delalloc inode list

2013-05-15 Thread Miao Xie
When we create a snapshot, we currently flush all delalloc inodes in the fs, but flushing only the inodes in the source tree is enough. So we introduce a per-subvolume delalloc inode list. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/ctree.h | 22 --- fs/btrfs/dev-replace.c | 2 +-

[PATCH 05/17] Btrfs: cleanup the similar code of the fs root read

2013-05-15 Thread Miao Xie
There are several functions whose code is similar, such as btrfs_find_last_root() and btrfs_read_fs_root_no_radix(). Besides that, some functions are invoked twice unnecessarily; for example, we are sure that all roots which are found in btrfs_find_orphan_roots() have their orphan items, so

[PATCH 14/17] Btrfs: don't flush the delalloc inodes in the while loop if flushoncommit is set

2013-05-15 Thread Miao Xie
It is unnecessary to flush the delalloc inodes again and again because we don't care about the dirty pages which are introduced after the flush, and they will be flushed in the transaction commit. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/transaction.c | 26 ++ 1

[PATCH 16/17] Btrfs: remove the time check in btrfs_commit_transaction()

2013-05-15 Thread Miao Xie
We checked the commit time to avoid committing the transaction frequently, but it is unnecessary because: - It made the transaction commit spend more time, and delayed the operation of the external writers(TRANS_START/TRANS_USERSPACE). - Except the space that we have to commit transaction, such

[PATCH 13/17] Btrfs: don't wait for all the writers circularly during the transaction commit

2013-05-15 Thread Miao Xie
btrfs_commit_transaction has the following loop before we commit the transaction. do { // attempt to do some useful stuff and/or sleep } while (atomic_read(&cur_trans->num_writers) > 1 || (should_grow && cur_trans->num_joined != joined)); This is used to prevent from the TRANS_START to
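The exit condition of that loop can be looked at in isolation, reading it as "more than one writer is attached, or (should_grow and num_joined changed)". What follows is a minimal userspace sketch, not the kernel code: the struct trans_sketch and the helper commit_must_keep_waiting() are hypothetical names, and C11 atomics stand in for the kernel's atomic_read().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two transaction fields the loop reads. */
struct trans_sketch {
    atomic_int num_writers;    /* handles currently attached to the transaction */
    unsigned long num_joined;  /* running count of handles that have joined */
};

/*
 * Returns true while the committer would go around the loop again:
 * either more than one writer is still attached, or (when should_grow
 * is set) somebody joined since the committer last sampled num_joined.
 */
static bool commit_must_keep_waiting(struct trans_sketch *t, bool should_grow,
                                     unsigned long joined)
{
    assert(t != NULL);
    return atomic_load(&t->num_writers) > 1 ||
           (should_grow && t->num_joined != joined);
}
```

The second half of the condition is what makes the wait "circular": every new joiner changes num_joined and sends the committer around the loop once more.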

[PATCH 17/17] Btrfs: make the state of the transaction more readable

2013-05-15 Thread Miao Xie
We used 3 variables to track the state of the transaction, which was complex and wasted memory. Besides that, it was hard to understand which types of transaction handles should be blocked in each transaction state, so developers often made mistakes. This patch improved the

[PATCH 12/17] Btrfs: remove the code for the impossible case in cleanup_transaction()

2013-05-15 Thread Miao Xie
If the transaction is removed from the transaction list, it means the transaction has been committed successfully. So it is impossible to call cleanup_transaction(), otherwise there is something wrong with the code logic. Thus, we use BUG_ON() instead of the original handle. Signed-off-by: Miao

[PATCH 03/17] Btrfs: pause the space balance when remounting to R/O

2013-05-15 Thread Miao Xie
Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/super.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index a4807ce..f0857e0 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1263,6 +1263,7 @@ static int btrfs_remount(struct super_block

[PATCH 10/17] Btrfs: just flush the delalloc inodes in the source tree before snapshot creation

2013-05-15 Thread Miao Xie
Before applying this patch, we need to flush all the delalloc inodes in the fs when we want to create a snapshot. This wastes time and blocks the transaction commit for a long time. It means other user operations would also be blocked for a long time. This patch improves this problem, we
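The idea of flushing only the source tree can be illustrated with a toy data structure. This is a minimal sketch under stated assumptions, not the btrfs implementation: subvol_sketch, dinode_sketch, subvol_mark_delalloc() and subvol_flush_delalloc() are hypothetical names, and "flushing" is reduced to clearing a dirty flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inode with pending delayed-allocation data. */
struct dinode_sketch {
    int dirty;                   /* 1 while delalloc data awaits writeback */
    struct dinode_sketch *next;  /* next inode on the same subvolume's list */
};

/* Hypothetical subvolume root that owns its own delalloc list. */
struct subvol_sketch {
    struct dinode_sketch *delalloc;
};

/* Queue an inode on the list of the subvolume it belongs to. */
static void subvol_mark_delalloc(struct subvol_sketch *root,
                                 struct dinode_sketch *inode)
{
    assert(inode->next == NULL);  /* an inode sits on at most one list */
    inode->dirty = 1;
    inode->next = root->delalloc;
    root->delalloc = inode;
}

/* Flush only this subvolume's inodes; returns how many were flushed. */
static size_t subvol_flush_delalloc(struct subvol_sketch *root)
{
    size_t flushed = 0;

    while (root->delalloc != NULL) {
        struct dinode_sketch *inode = root->delalloc;

        root->delalloc = inode->next;
        inode->next = NULL;
        inode->dirty = 0;  /* stands in for the real writeback */
        flushed++;
    }
    return flushed;
}
```

With one list per subvolume, the cost of the flush before a snapshot depends on the dirty inodes of the source only; inodes queued on other subvolumes are never visited.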

[PATCH 09/17] Btrfs: introduce per-subvolume ordered extent list

2013-05-15 Thread Miao Xie
The reason we introduce per-subvolume ordered extent list is the same as the per-subvolume delalloc inode list. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/ctree.h| 25 --- fs/btrfs/dev-replace.c | 4 +- fs/btrfs/disk-io.c | 45 +++-

[PATCH 11/17] Btrfs: cleanup unnecessary assignment when cleaning up all the residual transaction

2013-05-15 Thread Miao Xie
When we umount a fs with serious errors, we will invoke btrfs_cleanup_transactions() to clean up the residual transaction. At this time, it is impossible to start a new transaction, so we needn't assign trans_no_join to 1, and also needn't clear the running transaction every time we destroy a

[PATCH 04/17] Btrfs: remove BUG_ON() in btrfs_read_fs_tree_no_radix()

2013-05-15 Thread Miao Xie
We have checked if ->node is NULL or not, so it is unnecessary to use BUG_ON() to check again. Remove it. Signed-off-by: Miao Xie mi...@cn.fujitsu.com --- fs/btrfs/disk-io.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 2a9ae38..8c1e4fb 100644

[PATCH 15/17] Btrfs: remove unnecessary varient ->num_joined in btrfs_transaction structure

2013-05-15 Thread Miao Xie
We used ->num_joined to track whether some writers joined the current transaction while the committer was sleeping. If some writers joined the current transaction, we had to continue the while loop to do some necessary stuff, such as flushing the ordered operations. But it is unnecessary because

[PATCH 00/17] improve the block time during the transaction commit

2013-05-15 Thread Miao Xie
This patchset addresses the problem that the transaction may be blocked for a long time when it is being committed if there is heavy I/O. In this patchset, - 0001-0005, 0007, 0011-0012 are miscellaneous fixes or code cleanup patches. - 0006, 0008-0010 introduce per-subvolume delalloc inode list and ordered

[PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree

2013-05-15 Thread Miao Xie
The grab/put functions will be used in the next patch, which needs to grab the root object and ensure it is not freed. We use a reference counter instead of the srcu lock to avoid blocking the memory reclaim task, which invokes synchronize_srcu(). Signed-off-by: Miao Xie mi...@cn.fujitsu.com ---
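A grab/put pair built on a reference counter can be sketched as follows. This is a minimal userspace sketch, assuming a "take a reference only if the count is still non-zero" rule; root_sketch, root_grab() and root_put() are hypothetical names, C11 atomics replace the kernel primitives, and a flag replaces the real teardown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a root object protected by a reference count. */
struct root_sketch {
    atomic_int refs;  /* 0 means the object is being (or has been) torn down */
    bool freed;       /* set instead of calling free(), to keep the sketch testable */
};

/* Take a reference only if the object is still live (refs > 0). */
static bool root_grab(struct root_sketch *r)
{
    int old = atomic_load(&r->refs);

    while (old > 0) {
        /* On failure, old is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; whoever drops the last one performs the teardown. */
static void root_put(struct root_sketch *r)
{
    int old = atomic_fetch_sub(&r->refs, 1);

    assert(old > 0);
    if (old == 1)
        r->freed = true;
}
```

Neither function waits for other users of the object, which is the property the patch description is after: a holder only pins the object and never makes the reclaim path block.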

Re: [PATCH v2 0/8] Btrfs: introduce a tree for UUID to subvol ID mapping

2013-05-15 Thread Stefan Behrens
On Tue, 14 May 2013 19:11:54 +0200, Stefan Behrens wrote: On Tue, 14 May 2013 18:55:23 +0800, Liu Bo wrote: On Tue, May 14, 2013 at 11:36:52AM +0200, Stefan Behrens wrote: Mapping UUIDs to subvolume IDs is an operation with a high effort today. Today, the algorithm even has quadratic effort

Re: [PATCH] Btrfs: increase the max global reserve size to 1gig

2013-05-15 Thread David Sterba
On Fri, May 03, 2013 at 11:28:35PM +0800, Liu Bo wrote: On Fri, May 03, 2013 at 08:56:54AM -0400, Josef Bacik wrote: Apparently 512mb was too small, with a fs_mark command we could get so much delayed work built up that we'd never trip the lets commit the transaction logic until we'd gotten

Re: [PATCH] xfstests btrfs/284: shorten duration, fix output

2013-05-15 Thread Rich Johnston
On 04/26/2013 01:45 PM, Eric Sandeen wrote: test 284 had... some issues. diff --git a/tests/btrfs/284 b/tests/btrfs/284 old mode 100644 new mode 100755 index d952977..67161a3 --- a/tests/btrfs/284 +++ b/tests/btrfs/284 This patch has been committed: commit

Re: [PATCH v2 0/8] Btrfs: introduce a tree for UUID to subvol ID mapping

2013-05-15 Thread David Sterba
On Tue, May 14, 2013 at 07:11:54PM +0200, Stefan Behrens wrote:
# of subvols   | without    | with
in filesystem  | UUID tree  | UUID tree
---------------+------------+-----------
             2 | 0m00.004s  | 0m00.003s
          1000 | 0m07.010s  | 0m00.004s
          2000 | 0m28.210s  | 0m00.004s
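The two larger rows of these timings give a quick scaling check: doubling the number of subvolumes from 1000 to 2000 takes the time without the UUID tree from 7.010s to 28.210s, roughly a factor of four, which is what quadratic effort predicts, while the time with the UUID tree stays at 0.004s. A small sketch of that arithmetic, with hypothetical helper names:

```c
#include <assert.h>

/* Convert a measurement of the form "XmYY.ZZZs" to seconds. */
static double to_seconds(int minutes, double seconds)
{
    return minutes * 60.0 + seconds;
}

/* Ratio of run times when the subvolume count grows from small to large. */
static double time_ratio(double small_seconds, double large_seconds)
{
    assert(small_seconds > 0.0);
    return large_seconds / small_seconds;
}
```

Calling time_ratio(to_seconds(0, 7.010), to_seconds(0, 28.210)) yields a value close to 4 for a doubling of the input, where linear scaling would give a value close to 2.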

Re: I/O errors block the entire filesystem

2013-05-15 Thread Josef Bacik
On Sat, May 11, 2013 at 01:16:38AM -0600, Alexandre Oliva wrote: On Apr 4, 2013, Alexandre Oliva ol...@gnu.org wrote: I've been trying to figure out the btrfs I/O stack to try to understand why, sometimes (but not always), after a failure to read a (data non-replicated) block from the

[PATCH] Btrfs-progs: detect when scrub is started twice

2013-05-15 Thread Stefan Behrens
Check whether any involved device is already busy running a scrub. This would cause damaged status messages and the state aborted without the explanation that a scrub was already running. Therefore check it first, prevent it and give some feedback to the user if scrub is already running. Note that

Re: [PATCH] Btrfs: increase the max global reserve size to 1gig

2013-05-15 Thread David Sterba
On Fri, May 03, 2013 at 08:56:54AM -0400, Josef Bacik wrote: Apparently 512mb was too small, with a fs_mark command we could get so much delayed work built up that we'd never trip the lets commit the transaction logic until we'd gotten too much delayed refs built up. Increasing this to 1 gig

kernel BUG at fs/btrfs/free-space-cache.c:1567!, Kernel 3.9.1

2013-05-15 Thread Philipp Dreimann
Hello, my btrfs filesystem was not mountable anymore after a loss of power: kernel BUG at fs/btrfs/free-space-cache.c:1567! invalid opcode: [#1] SMP Modules linked in: btrfs libcrc32c xor zlib_deflate raid6_pq i915(+) i2c_algo_bit drm_kms_helper drm i2c_core video uinput CPU 3 Pid: 147,

Re: [PATCH] Btrfs-progs: fix missing recow roots when making btrfs filesystem

2013-05-15 Thread David Sterba
On Tue, May 14, 2013 at 07:50:28PM +0800, Wang Shilong wrote: When making btrfs filesystem. we firstly write root leaf to specified filed, and then we recow the root. If we don't recow, some trees are not in the correct block group. Steps to reproduce: dd if=/dev/zero of=test.img

Re: kernel BUG at fs/btrfs/free-space-cache.c:1567!, Kernel 3.9.1

2013-05-15 Thread Harald Glatt
On Wed, May 15, 2013 at 4:19 PM, Philipp Dreimann phil...@dreimann.net wrote: Hello, my btrfs filesystem was not mountable anymore after a loss of power: kernel BUG at fs/btrfs/free-space-cache.c:1567! invalid opcode: [#1] SMP Modules linked in: btrfs libcrc32c xor zlib_deflate

Re: kernel BUG at fs/btrfs/free-space-cache.c:1567!, Kernel 3.9.1

2013-05-15 Thread Philipp Dreimann
On 15 May 2013 17:00, Harald Glatt m...@hachre.de wrote: On Wed, May 15, 2013 at 4:19 PM, Philipp Dreimann phil...@dreimann.net wrote: Hello, my btrfs filesystem was not mountable anymore after a loss of power: kernel BUG at fs/btrfs/free-space-cache.c:1567! invalid opcode: [#1] SMP

Re: [PATCH v2 4/8] Btrfs: maintain subvolume items in the UUID tree

2013-05-15 Thread Stefan Behrens
On Tue, 14 May 2013 18:44:11 +0800, Liu Bo wrote: On Tue, May 14, 2013 at 11:36:56AM +0200, Stefan Behrens wrote: @@ -396,7 +403,7 @@ static noinline int create_subvol(struct inode *dir, * of create_snapshot(). */ ret = btrfs_subvolume_reserve_metadata(root, block_rsv, -

Re: subvol copying

2013-05-15 Thread Chris Murphy
On May 15, 2013, at 1:40 AM, Gabriel de Perthuis g2p.c...@gmail.com wrote: You can move subvolumes at any time, as if they were regular directories. In the example case, the subvolumes are read-only. So is it possible to make a read-only subvolume (snapshot) read-writable? And is it

Re: subvol copying

2013-05-15 Thread Harald Glatt
On Wed, May 15, 2013 at 6:43 PM, Chris Murphy li...@colorremedies.com wrote: On May 15, 2013, at 1:40 AM, Gabriel de Perthuis g2p.c...@gmail.com wrote: You can move subvolumes at any time, as if they were regular directories. In the example case, the subvolumes are read-only. So is it

Re: subvol copying

2013-05-15 Thread Chris Murphy
On May 15, 2013, at 10:44 AM, Harald Glatt m...@hachre.de wrote: You make a ro snapshot rw by creating a snapshot of it that is rw. So yes to both questions, by doing the same thing in both cases. In other words, a normal snapshot (without -r) of a read-only snapshot will create a rw

Re: subvol copying

2013-05-15 Thread Harald Glatt
On Wed, May 15, 2013 at 7:28 PM, Chris Murphy li...@colorremedies.com wrote: On May 15, 2013, at 10:44 AM, Harald Glatt m...@hachre.de wrote: You make a ro snapshot rw by creating a snapshot of it that is rw. So yes to both questions, by doing the same thing in both cases. In other words,

[PATCH] Btrfs-progs: add a newline to a free space cache message

2013-05-15 Thread Josef Bacik
Left out a newline in the generation check printf. Signed-off-by: Josef Bacik jba...@fusionio.com --- free-space-cache.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/free-space-cache.c b/free-space-cache.c index 5fb8ece..a30438b 100644 --- a/free-space-cache.c +++

Re: I/O errors block the entire filesystem

2013-05-15 Thread Alexandre Oliva
On May 14, 2013, Liu Bo bo.li@oracle.com wrote: In one of the failures that caused machine load spikes, I tried to collect info on active processes with perf top and SysRq-T, but nothing there seemed to explain the spike. Thoughts on how to figure out what's causing this? Although I've

Re: I/O errors block the entire filesystem

2013-05-15 Thread Alexandre Oliva
On May 15, 2013, Josef Bacik jba...@fusionio.com wrote: So this should only happen in the case that you are on a dm device it looks like, is that how you are running? That was my first thought, but no, I'm using partitions out of the SATA disks directly. I even checked for stray dm out of

Re: [RFC v0 1/4] vfs: add copy_range syscall and vfs entry point

2013-05-15 Thread Eric Wong
Zach Brown z...@redhat.com wrote: This adds a syscall and vfs entry point for clone_range which offloads data copying between existing files. The syscall is a thin wrapper around the vfs entry point. Its arguments are inspired by sys_splice(). Why introduce a new syscall instead of

Re: [RFC v0 1/4] vfs: add copy_range syscall and vfs entry point

2013-05-15 Thread Zach Brown
On Wed, May 15, 2013 at 07:44:05PM +, Eric Wong wrote: Why introduce a new syscall instead of extending sys_splice? Personally, I think it's ugly to have different operations use the same syscall just because their arguments match. But that preference aside, sure, if the consensus is that

Re: [RFC v0 4/4] nfs, nfsd: rough sys_copy_range and COPY support

2013-05-15 Thread J. Bruce Fields
On Tue, May 14, 2013 at 02:15:26PM -0700, Zach Brown wrote: This crude patch illustrates the simplest plumbing involved in supporting sys_copy_range with the NFS COPY operation that's pending in the 4.2 draft spec. The patch is based on a previous prototype that used the COPY op to

Re: [RFC v0 4/4] nfs, nfsd: rough sys_copy_range and COPY support

2013-05-15 Thread Myklebust, Trond
On Wed, 2013-05-15 at 16:19 -0400, J. Bruce Fields wrote: On Tue, May 14, 2013 at 02:15:26PM -0700, Zach Brown wrote: This crude patch illustrates the simplest plumbing involved in supporting sys_copy_range with the NFS COPY operation that's pending in the 4.2 draft spec. The patch is

Re: [RFC v0 4/4] nfs, nfsd: rough sys_copy_range and COPY support

2013-05-15 Thread J. Bruce Fields
On Wed, May 15, 2013 at 08:21:54PM +, Myklebust, Trond wrote: On Wed, 2013-05-15 at 16:19 -0400, J. Bruce Fields wrote: On Tue, May 14, 2013 at 02:15:26PM -0700, Zach Brown wrote: This crude patch illustrates the simplest plumbing involved in supporting sys_copy_range with the NFS

[PATCH] btrfs-progs: restore: use long option for the path regex

2013-05-15 Thread David Sterba
Current way of specifying the path to match is not very comfortable, but the feature itself is very useful. Let's save the short option -m for a more user friendly syntax and keep a long option --path-regex with the current syntax. CC: Peter Stuge pe...@stuge.se Signed-off-by: David Sterba

Re: [PATCH v2 4/8] Btrfs: maintain subvolume items in the UUID tree

2013-05-15 Thread Liu Bo
On Wed, May 15, 2013 at 05:39:35PM +0200, Stefan Behrens wrote: On Tue, 14 May 2013 18:44:11 +0800, Liu Bo wrote: On Tue, May 14, 2013 at 11:36:56AM +0200, Stefan Behrens wrote: @@ -396,7 +403,7 @@ static noinline int create_subvol(struct inode *dir, * of create_snapshot(). */

Re: [PATCH 10/17] Btrfs: just flush the delalloc inodes in the source tree before snapshot creation

2013-05-15 Thread Liu Bo
On Wed, May 15, 2013 at 03:48:24PM +0800, Miao Xie wrote: Before applying this patch, we need to flush all the delalloc inodes in the fs when we want to create a snapshot. This wastes time and blocks the transaction commit for a long time. It means some other user operation would also be

Re: [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree

2013-05-15 Thread Liu Bo
On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote: The grab/put functions will be used in the next patch, which needs to grab the root object and ensure it is not freed. We use a reference counter instead of the srcu lock to avoid blocking the memory reclaim task, which invokes

Re: [PATCH 10/17] Btrfs: just flush the delalloc inodes in the source tree before snapshot creation

2013-05-15 Thread Miao Xie
On Thu, 16 May 2013 11:20:39 +0800, Liu Bo wrote: On Wed, May 15, 2013 at 03:48:24PM +0800, Miao Xie wrote: Before applying this patch, we need to flush all the delalloc inodes in the fs when we want to create a snapshot. This wastes time and blocks the transaction commit for a long time.

Re: [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree

2013-05-15 Thread Miao Xie
On Thu, 16 May 2013 11:36:46 +0800, Liu Bo wrote: On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote: The grab/put functions will be used in the next patch, which needs to grab the root object and ensure it is not freed. We use a reference counter instead of the srcu lock to avoid blocking

Re: [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree

2013-05-15 Thread Liu Bo
On Thu, May 16, 2013 at 12:31:11PM +0800, Miao Xie wrote: On Thu, 16 May 2013 11:36:46 +0800, Liu Bo wrote: On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote: The grab/put functions will be used in the next patch, which needs to grab the root object and ensure it is not freed. We use

Re: [PATCH 06/17] Btrfs: introduce grab/put functions for the root of the fs/file tree

2013-05-15 Thread Miao Xie
On Thu, 16 May 2013 13:15:57 +0800, Liu Bo wrote: On Thu, May 16, 2013 at 12:31:11PM +0800, Miao Xie wrote: On Thu, 16 May 2013 11:36:46 +0800, Liu Bo wrote: On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote: The grab/put functions will be used in the next patch, which needs to grab the