On Friday, 28 September 2018 14:59:48 BST Bob Peterson wrote:
> ----- Original Message -----
> 
> > I think what's happening for us is that the work that needs to be done to
> > release an rgrp lock is happening pretty fast and is about the same in all
> > cases, so the stats are not providing a meaningful distinction. We see the
> > same lock (or small number of locks) bouncing back and forth between nodes
> > with neither node seeming to consider them congested enough to avoid, even
> > though the FS is <50% full and there must be plenty of other non-full
> > rgrps.
> > 
> > --
> > Tim Smith <[email protected]>
> 
> Hi Tim,
> 
> Interesting.
> I've done experiments in the past where I allowed resource group glocks
> to take advantage of the "minimum hold time" which is today only used for
> inode glocks. In my experiments it's made no appreciable difference that I
> can recall, but it might be an interesting experiment for you to try.
Our second patch does that, which should in theory give the stats calculation
more to go on, but was mostly to allow a bit more work on a resource group
when we do get it. It helps a bit, but doesn't really seem to keep us away
from contended locks very well, though we do get to hold on to them longer.
I speculate that it will improve things like delete operations, but we
haven't measured that specifically.

We also add a timestamp when we are asked to demote a lock, and then pay
attention to it only for rgrp locks in inplace_reserve, trying to stay away
from rgrps we've been asked to demote recently unless we're desperate. That
helps a *lot*; we see two nodes fight a bit, learn to stay clear of each
other, and not fight again until the FS is ~80% full.

All our testing is done with multiple fio jobs per node, usually filling the
FS from empty, but we occasionally run one with randwrite on the files we
just laid out, just to make sure we didn't break the steady-state case.

I like the idea of your intra-node patches more than my coin-tossing
approach, so it'll be interesting to see what results we get when Mark
runs them.

> Steve's right that we need to be careful not to improve one aspect of
> performance while causing another aspect's downfall, like improving
> intra-node congestion problems at the expense of inter-node congestion.

We're also rather keen on keeping multi-node performance high. Our initial
problem was that a single node was going so slowly even without competition
that we couldn't reason about multiple nodes.

--
Tim Smith <[email protected]>
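[Editor's note: a minimal, self-contained user-space sketch of the heuristic
described above, i.e. recording a timestamp when a demote request arrives and
skipping recently demoted rgrps during allocation unless no other candidate
remains. This is not the actual patch; all identifiers (rgrp_stub,
demote_time, RECENT_DEMOTE_WINDOW, pick_rgrp) are hypothetical and are not
GFS2 names.]

/*
 * Hypothetical illustration of "avoid rgrps we were recently asked to
 * demote". None of these names come from the GFS2 source tree.
 */
#include <stdio.h>
#include <stdbool.h>
#include <time.h>

#define RECENT_DEMOTE_WINDOW 2  /* seconds: demotes newer than this mark an rgrp as "hot" */

struct rgrp_stub {
	int id;
	time_t demote_time;     /* 0 = never asked to demote */
	bool full;
};

/* Called when another node asks us to drop this rgrp's lock. */
static void note_demote_request(struct rgrp_stub *rg)
{
	rg->demote_time = time(NULL);
}

/* Pick an rgrp for allocation, preferring ones nobody has fought over lately. */
static struct rgrp_stub *pick_rgrp(struct rgrp_stub *rgs, int n, bool desperate)
{
	time_t now = time(NULL);
	struct rgrp_stub *fallback = NULL;

	for (int i = 0; i < n; i++) {
		struct rgrp_stub *rg = &rgs[i];

		if (rg->full)
			continue;
		/* Recently demoted: another node probably wants it, so skip
		 * it unless we have run out of other options. */
		if (rg->demote_time && now - rg->demote_time < RECENT_DEMOTE_WINDOW) {
			if (!fallback)
				fallback = rg;
			continue;
		}
		return rg;
	}
	return desperate ? fallback : NULL;
}

int main(void)
{
	struct rgrp_stub rgs[3] = { {1, 0, false}, {2, 0, false}, {3, 0, false} };

	note_demote_request(&rgs[0]);                 /* peer node just asked for rgrp 1 */
	struct rgrp_stub *rg = pick_rgrp(rgs, 3, false);
	printf("chose rgrp %d\n", rg ? rg->id : -1);  /* expect rgrp 2 */
	return 0;
}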
