On Friday, 28 September 2018 14:18:59 BST Steven Whitehouse wrote:
> Hi,
>
> On 28/09/18 13:50, Mark Syms wrote:
> > Hi Bon,
> >
> > The patches look quite good and would seem to help in the intra-node
> > congestion case, which our first patch was trying to do. We haven't tried
> > them yet but I'll pull a build together and try to run it over the
> > weekend.
> >
> > We don't however, see that they would help in the situation we saw for the
> > second patch where rgrp glocks would get bounced around between hosts at
> > high speed and cause lots of state flushing to occur in the process as
> > the stats don't take any account of anything other than network latency
> > whereas there is more involved with a rgrp glock when state needs to be
> > flushed.
> >
> > Any thoughts on this?
> >
> > Thanks,
> >
> > Mark.
>
> There are a few points here... the stats measure the latency of the DLM
> requests. Since in order to release a lock, some work has to be done,
> and the lock is not released until that work is complete, the stats do
> include that in their timings.

I think what's happening for us is that the work that needs to be done to
release an rgrp lock is happening pretty fast and is about the same in all
cases, so the stats are not providing a meaningful distinction. We see the
same lock (or small number of locks) bouncing back and forth between nodes
with neither node seeming to consider them congested enough to avoid, even
though the FS is <50% full and there must be plenty of other non-full rgrps.

-- 
Tim Smith <[email protected]>
