On Fri, Nov 01, 2013 at 10:16:37AM +0100, Jan Schmidt wrote:
> (cc Arne)
>
> On Thu, October 24, 2013 at 16:49 (+0200), Wang Shilong wrote:
> > Hello Jan,
> >
> >> btrfs_dec_ref() queued a delayed ref for owner of a tree block. The qgroup
> >> tracking is based on delayed refs. The owner of a tree block is set when a
> >> tree block is allocated, it is never updated.
> >>
> >> When you allocate a tree block and then remove the subvolume that did the
> >> allocation, the qgroup accounting for that removal is correct. However, the
> >> removal was accounted again for each subvolume deletion that also referenced
> >> the tree block, because accounting was erroneously based on the owner.
> >>
> >> Instead of queueing delayed refs for the non-existent owner, we now
> >> queue delayed refs for the root being removed. This fixes the qgroup
> >> accounting.
> >
> > Thanks for tracking this, i apply your patch, and using the flowing patch,
> > found the problem still exist, the test script like the following:
> >
> > #!/bin/sh
> >
> > for i in $(seq 1000)
> > do
> > dd if=/dev/zero of=<mnt>/$i""aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa bs=10K count=1
> > done
> >
> > btrfs sub snapshot <mnt> <mnt>/1
> > for i in $(seq 100)
> > do
> > btrfs sub snapshot <mnt>/$i <mnt>/$(($i+1))
> > done
> >
> > for i in $(seq 101)
> > do
> > btrfs sub delete <mnt>/$i
> > done
>
> I've understood the problem this reproducer creates. In fact, you can shorten it
> dramatically. The story of qgroups is going to turn awkward at this point.
>
> mkfs and enable quota, put some data in (needs a level 2 tree)
> -> this accounts rfer and excl for qgroup 5
>
> take a snapshot
> -> this creates qgroup 257, which gets rfer(257) = rfer(5) and excl(257) = 0,
> excl(5) = 0.
>
> now make sure you don't cow anything (which we always did in our extensive
> tests), just drop the newly created snapshot.
> -> excl(5) ought to become what it was before the snapshot, and there's no code
> for this. This is because there is no code that brings rfer(257) to zero, the
> data extents are not touched because the tree blocks of 5 and 257 are shared.
>
> Drop tree does not go down the whole tree, when it finds a tree block with
> refcnt > 1 it just decrements it and is done. This is very efficient but is bad
> for the qgroup numbers.
>
> We have got three possible solutions in mind:
>
> A: Always walk down the whole tree for quota-enabled fs tree drops. Can be done
> with the read-ahead code, but is potentially a whole lot of work for large file
> systems.
>
No.

Josef
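
For readers following the thread, the stale-counter scenario from the quoted mail can be illustrated with a toy model. This is only a sketch in Python under stated assumptions: the class `ToyFs`, its methods, and the per-extent bookkeeping are hypothetical and are not btrfs kernel code. The snapshot step simply copies the numbers given in the mail (rfer(257) = rfer(5), excl(257) = 0, excl(5) = 0), and the `walk_whole_tree` flag stands in for the difference between the current drop behaviour and option A.

```python
class ToyFs:
    """Toy qgroup model (hypothetical, not kernel code)."""

    def __init__(self):
        self.extents = {}  # extent id -> (size, set of root ids referencing it)
        self.rfer = {}     # tracked "referenced" bytes per root
        self.excl = {}     # tracked "exclusive" bytes per root

    def write(self, root, ext_id, size):
        # A freshly written extent is referenced by, and exclusive to, one root.
        self.extents[ext_id] = (size, {root})
        self.rfer[root] = self.rfer.get(root, 0) + size
        self.excl[root] = self.excl.get(root, 0) + size

    def snapshot(self, src, dst):
        # Everything src references becomes shared with dst.
        for _size, roots in self.extents.values():
            if src in roots:
                roots.add(dst)
        # Counter values as described in the mail.
        self.rfer[dst] = self.rfer[src]
        self.excl[dst] = 0
        self.excl[src] = 0

    def drop(self, root, walk_whole_tree):
        for size, roots in self.extents.values():
            if root not in roots:
                continue
            was_exclusive = len(roots) == 1
            roots.discard(root)
            if not walk_whole_tree:
                # Shared blocks only lose a reference; counters are untouched.
                continue
            # Option A: visit every extent and fix up the counters.
            self.rfer[root] -= size
            if was_exclusive:
                self.excl[root] -= size
            elif len(roots) == 1:
                (survivor,) = roots
                self.excl[survivor] += size


def scenario(walk_whole_tree):
    # Write into root 5, snapshot to 257 without any COW, drop the snapshot.
    fs = ToyFs()
    for ext_id in range(3):
        fs.write(5, ext_id, 10)
    fs.snapshot(5, 257)
    fs.drop(257, walk_whole_tree)
    return fs.rfer, fs.excl
```

In this model, `scenario(False)` leaves excl(5) at zero and rfer(257) at its snapshot value, which mirrors the problem described above, while `scenario(True)` restores excl(5) at the cost of touching every extent the dropped root referenced.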