Hi,

I understand that GFS2 by default limits the rate of POSIX lock operations to 
100 per second.
This limit can be removed with the following entry in /etc/cluster/cluster.conf:

<gfs_controld plock_rate_limit="0" plock_ownership="1"/>


Question 1:
------------
How can I monitor the rate of POSIX lock operations?

The reason I ask is that I am trying to maximise application throughput in a 
cluster. This is a database-type application running on one node, with the 
other node being an idle standby. Under a large generated workload the system 
reaches a state in which it still has spare CPU, memory, I/O (both operation 
rate and bandwidth) and network bandwidth, but will not go any faster. I 
suspect that GFS2 POSIX lock processing is the bottleneck, but at the moment I 
have no data to prove it, and no information on how to tune it to remove this 
bottleneck.
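
Lacking a counter to read, one fallback is to measure the lock rate a single 
process can actually achieve. Below is a minimal sketch, assuming Python 3 is 
available on the node. The function name measure_plock_rate and the iteration 
count are illustrative, not part of any GFS2 tooling. It times fcntl-style 
POSIX lock/unlock cycles on one file; the demo at the bottom uses a temporary 
file, which would need to be replaced with a path on the GFS2 mount.

```python
import fcntl
import os
import tempfile
import time


def measure_plock_rate(path, iterations=1000):
    """Return POSIX lock/unlock cycles per second achieved on `path`.

    Hypothetical helper for illustration: each cycle takes and releases an
    exclusive whole-file lock, i.e. two plock operations per cycle.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            fcntl.lockf(fd, fcntl.LOCK_EX)  # blocking exclusive POSIX lock
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release it again
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    # Guard against a zero reading from a very coarse clock.
    return iterations / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Assumption: swap the temporary file for a file on the GFS2 filesystem.
    with tempfile.NamedTemporaryFile() as tmp:
        rate = measure_plock_rate(tmp.name)
        print(f"{rate:.0f} lock/unlock cycles per second")
```

Running it once with the default rate limit and once with plock_rate_limit="0" 
should show whether the limit is what caps a single process. It measures only 
this probe's own locks, not the application's lock traffic.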


Question 2:
-----------
What may be limiting the throughput of GFS2 with plock_rate_limit set to 0 and 
in the absence of any overall shortage of physical resources?  Could this be 
the gfs_controld process saturating one CPU core?  I do see gfs_lockd using 
90%+ of one CPU core in top(1).


Question 3:
-----------
What else can I tune to get higher maximum throughput from GFS2 used in this 
asymmetrical configuration?  Potentially, I need much more throughput, as my 
real production cluster is to support 1,000+ transactional users.


Question 4:
-----------
Is there a leaner, more efficient way of using GFS2 for such asymmetrical 
operation, where all accesses come from only one node and the other acts as a 
hot standby, with no fsck needed when the service fails over to the other 
node?  I am in a position to move the mounting of the GFS2 filesystem into the 
service script if that would help.

Your comments and ideas will be much appreciated.

Regards,

Chris

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
