Re: [Cluster-devel] [GFS2 PATCH 2/2] GFS2: Split gfs2_rgrp_congested into inter-node and intra-node cases

Steven Whitehouse Thu, 25 Jan 2018 03:50:07 -0800

Hi,

Some further thoughts...

Whenever we find a problem related to a lock, it is a good plan to understand where the problem actually lies. In other words, whether the locking itself is slow, or whether it is some action that is being performed under the lock that is the issue. We have the ability to easily create histograms of DLM lock times, and almost as easily create histograms of the glock times (gfs2_glock_queue -> gfs2_promote). We can easily filter on glock type (rgrp) and the lock transitions that we care about (any -> EX) too. So it would be interesting to look at this in order to get more of an insight into what is really going on.
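
As a rough illustration of the glock timing described above, the sketch below pairs each gfs2_glock_queue event with the next gfs2_promote event on the same glock and puts the latencies into power-of-two buckets. It is only a sketch under assumptions: the tuple layout (timestamp in microseconds, event name, glock type, glock id, target state) is a hypothetical pre-parsed form and not the tracepoints' actual output format, the sample events are invented, and first-in-first-out pairing per glock is a simplification of how holders are really matched.

```python
from collections import Counter

def glock_latency_histogram(events, glock_type="rgrp", target="EX"):
    """Build a log2 histogram of queue -> promote latencies.

    events: time-ordered tuples of
        (timestamp_us, event_name, glock_type, glock_id, target_state)
    This layout is hypothetical; real trace output must be parsed into it.
    Returns a Counter mapping bucket b to the number of latencies in
    [2**b, 2**(b+1)) microseconds.
    """
    pending = {}      # glock_id -> queue timestamps not yet promoted
    hist = Counter()
    for ts, name, gtype, gid, state in events:
        # Keep only the glock type and target state of interest.
        if gtype != glock_type or state != target:
            continue
        if name == "gfs2_glock_queue":
            pending.setdefault(gid, []).append(ts)
        elif name == "gfs2_promote" and pending.get(gid):
            # Simplifying assumption: oldest queued request is promoted first.
            latency = ts - pending[gid].pop(0)
            bucket = max(latency, 1).bit_length() - 1  # floor(log2(latency))
            hist[bucket] += 1
    return hist

# Invented sample events, for illustration only.
sample_events = [
    (0, "gfs2_glock_queue", "rgrp", 1, "EX"),
    (3, "gfs2_promote", "rgrp", 1, "EX"),
    (10, "gfs2_glock_queue", "rgrp", 2, "EX"),
    (12, "gfs2_glock_queue", "inode", 7, "EX"),
    (1010, "gfs2_promote", "rgrp", 2, "EX"),
    (1020, "gfs2_glock_queue", "rgrp", 1, "SH"),
]
glock_hist = glock_latency_histogram(sample_events)
```

The same function could be run with a different glock_type or target to compare other lock transitions.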

Taking the raw histogram and multiplying the count by the centre of each bucket gives us the total time taken for each different lock latency. Then it is easy to see which latencies are the ones causing the most delay.
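
The weighting described above can be sketched in a few lines of Python. The bucket bounds and counts below are made up purely for illustration (they are not measurements), and the function name is hypothetical.

```python
def time_per_bucket(histogram):
    """Weight each bucket's count by the bucket centre.

    histogram: list of (low, high, count) tuples, bounds in microseconds.
    Returns (low, high, count, total_time) rows, largest total time first.
    """
    rows = []
    for low, high, count in histogram:
        centre = (low + high) / 2
        rows.append((low, high, count, count * centre))
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Invented example data: many fast locks, a handful of very slow ones.
example = [
    (0, 10, 10000),
    (10, 100, 500),
    (100, 1000, 40),
    (1000, 10000, 12),
]
ranked = time_per_bucket(example)
for low, high, count, total in ranked:
    print(f"{low:>6}-{high:<6} us  count={count:<6} total={total:.0f} us")
```

With these invented numbers the 12 slowest requests account for more total time than the 10000 fastest ones, which is the kind of effect the weighted view is meant to expose.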

It would also be interesting to know how long it takes to allocate and deallocate a block. What are the operations that take the most time? Unfortunately our block allocation tracepoint doesn't give us that info, but it is probably not that tricky to alter it, so that it does.


That would give us a much more detailed picture of what is going on I think,

Steve.
