Hi,

That's related to something I've been fighting for as well.

An option to skip X lost monitoring attempts is planned, but not implemented yet, as far as I know.
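
In the meantime, the only real workaround I know of is to give the monitor
operation more headroom. A rough sketch with pcs, assuming the PAF resource is
named pgsqld as in the PAF docs (adjust the name, role and interval to match
what is already in your CIB; PAF normally defines separate monitor operations
for the Master and Slave roles):

    # allow the monitor more time before Pacemaker declares it failed
    pcs resource update pgsqld op monitor interval=15s timeout=200s role=Master

That only buys extra time under load, of course; it does not skip lost
monitoring attempts.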

Regards,

Klecho


On 30/05/18 06:08, 范国腾 wrote:
Hi,

The cluster uses PAF to manage the PostgreSQL database and GFS2 to manage the
shared storage. The configuration is attached.

During performance testing the CPU load is very high. We set the monitor
operation timeout to 100 seconds. PAF calls pg_isready to monitor the database,
and as the load grows the pg_isready response time increases. When there is no
response within 100 seconds, Pacemaker restarts the PAF resource. Many kernel
messages are then logged, and the PAF resource fails to start.
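
(For reference, the check is roughly equivalent to running pg_isready by hand
against the local instance, e.g. "pg_isready -h /var/run/postgresql -p 5432";
the socket directory and port here are just placeholders for our setup. Under
heavy CPU load the same call takes much longer than usual to answer.)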

So my questions are:
1. When the monitor operation times out, many kernel messages are printed in
/var/log/messages. Could you please help check whether this log indicates
anything wrong with the cluster? It looks as if a shared-storage error is
preventing the database from starting.
2. When the cluster runs in production, we cannot avoid the load becoming high
for some time, so the monitor will time out and the PAF resource will be
restarted. Is there any way to keep the resource from restarting when the
system is busy?

Thanks
Steven





--
Klecho

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
