On Wed, Nov 04, 2015 at 09:01:36AM +0800, Qu Wenruo wrote:
> 
> 
> Mark Fasheh wrote on 2015/11/03 11:26 -0800:
> >On Mon, Nov 02, 2015 at 09:34:24AM +0800, Qu Wenruo wrote:
> >>
> >>
> >>Stefan Priebe wrote on 2015/11/01 21:49 +0100:
> >>>Hi,
> >>>
> >>>this one: http://www.spinics.net/lists/linux-btrfs/msg47377.html
> >>>
> >>>adds a regression to my test systems with very large disks (30tb and 50tb).
> >>>
> >>>btrfs balance is super slow afterwards while heavily making use of cp
> >>>--reflink=always on big files (200gb - 500gb).
> >>>
> >>>Sorry didn't know how to correctly reply to that "old" message.
> >>>
> >>>Greets,
> >>>Stefan
> >>>--
> >>>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>the body of a message to majord...@vger.kernel.org
> >>>More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >>Thanks for the testing.
> >>
> >>Are you using qgroup or just doing normal balance with qgroup disabled?
> >>
> >>For the latter case, that's should be optimized to skip the dirty
> >>extent insert in qgroup disabled case.
> >>
> >>For qgroup enabled case, I'm afraid that's the design.
> >>As relocation will drop a subtree to relocate, and to ensure qgroup
> >>consistent, we must walk down all the tree blocks and mark them
> >>dirty for later qgroup accounting.
> >
> >Qu, we're always going to have to walk the tree when deleting it, this is
> >part of removing a subvolume. We've walked shared subtrees in this code for
> >numerous kernel releases without incident before it was removed in 4.2.
> >
> >Do you have any actual evidence that this is a major performance regression?
> > From our previous conversations you seemed convinced of this, without even
> >having a working subtree walk to test. I remember the hand wringing
> >about an individual commit being too heavy with the qgroup code (even though
> >I pointed out that tree walk is a restartable transaction).
> >
> >It seems that you are confused still about how we handle removing a volume
> >wrt qgroups.
> >
> >If you have questions or concerns I would be happy to explain them but
> >IMHO your statements there are opinion and not based in fact.
> 
> Yes, I don't deny it.
> But it's quite hard to prove it, as we need such a huge storage like Stefan.
> What I have is only several hundred GB test storage.
> Even accounting all my home NAS, I only have 2T, far from the
> storage Stefan has.
> 
> And what Stefan report should already give some hint about the
> performance issue.
> 
> In your word "it won't be doing anything (ok some kmalloc/free of a
> very tiny object)", it's already slowing down balance, since balance
> also use btrfs_drop_subtree().

When I wrote that I was under the impression that the qgroup code was doing
it's own sanity checking (it used to) and since Stephan had them disabled
they couldn't be causing the problem. I read your e-mail explaining that the
qgroup api was now intertwined with delayed ref locking after this one.

The same exact code ran in either case before and after your patches, so my
guess is that the issue is actually inside the qgroup code that shouldn't
have been run. I wonder if we even just filled up his memory but never
cleaned the objects. The only other thing I can think of is if
account_leaf_items() got run in a really tight loop for some reason.

Kmalloc in the way we are using it is not usually a performance issue,
especially if we've been reading off disk in the same process. Ask yourself
this - your own patch series does the same kmalloc for every qgroup
operation. Did you notice a complete and massive performance slowdown like
the one Stefan reported?

I will say that we never had this problem reported before, and
account_leaf_items() is always run in all kernels, even without qgroups
enabled. That will change with my new patch though.

What we can say for sure is that drop_snapshot in the qgroup case will read
more disk and obviously that will have a negative impact depending on what
the tree looks like. So IMHO we ought to be focusing on reducing the amount
of I/O involved.


> But we can't just ignore such "possible" performance issue just
> because old code did the same thing.(Although not the same now,
> we're marking all subtree blocks dirty other than shared one).

Well, I can't disagree with that - the only reason we are talking right now
is because you intentionally ignored the qgroup code in drop_snapshot(). So
let's start with this - no more 'fixing' code by tearing it out and replacing
it with /* TODO: somebody else re-implement this */   ;)
        --Mark

--
Mark Fasheh
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to