Hi,
But are you telling me that in a 3-node cluster,
quorum is lost when one of the node IPs is down?

Yes. It is a limitation with Pacemaker/Corosync: if the nodes participating in the cluster cannot communicate with a majority of the members (i.e. quorum is lost), the cluster is shut down.
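To make the majority arithmetic concrete, here is a minimal sketch, assuming Corosync's default votequorum behaviour (one vote per node, quorum = floor(total votes / 2) + 1, no qdevice or special vote tuning):

    # Hedged sketch: majority-quorum arithmetic as used by Corosync
    # votequorum by default (one vote per node).

    def quorum_votes(total_nodes: int) -> int:
        """Minimum number of nodes that must stay in contact to keep quorum."""
        return total_nodes // 2 + 1

    def keeps_quorum(total_nodes: int, reachable_nodes: int) -> bool:
        """True if a partition of `reachable_nodes` nodes retains quorum."""
        return reachable_nodes >= quorum_votes(total_nodes)

    if __name__ == "__main__":
        for total in (3, 4):
            print(f"{total}-node cluster: {quorum_votes(total)} nodes must remain in contact")
            for down in range(1, total):
                up = total - down
                state = "quorate" if keeps_quorum(total, up) else "quorum lost, cluster stops"
                print(f"  {down} node(s) unreachable -> {state}")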


However, I am setting up an additional node to test a 4-node setup, but
even then, if I take down one node and nfs-grace_start
(/usr/lib/ocf/resource.d/heartbeat/ganesha_grace) does not run properly
on the other nodes, could it be that the whole cluster goes down because
quorum is lost again?

That's strange. We have tested such configurations quite a few times but haven't hit this issue. (CCing Saurabh, who has been testing many such configurations.)

Recently we have observed the resource agents (nfs-grace_*) timing out sometimes, especially when a node is taken down, but that should not cause the entire cluster to shut down. Could you check the logs (/var/log/messages, /var/log/pacemaker.log) for any errors/warnings reported when one node is taken down in the 4-node setup?
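If it helps with that check, here is a rough sketch for pulling the relevant lines out of those logs; the search patterns (error/warning plus the nfs-grace/ganesha_grace names mentioned in this thread) are only an assumption, adjust them for your setup:

    # Hedged sketch: scan the logs mentioned above for errors/warnings
    # around the nfs-grace resource agent. Paths and patterns are
    # assumptions based on this thread.
    import re
    from pathlib import Path

    LOGS = ["/var/log/messages", "/var/log/pacemaker.log"]
    PATTERN = re.compile(r"(error|warning|nfs-grace|ganesha_grace)", re.IGNORECASE)

    def scan(path: str) -> None:
        log = Path(path)
        if not log.exists():
            print(f"-- {path}: not found, skipping")
            return
        print(f"-- {path}")
        with log.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if PATTERN.search(line):
                    print(f"{lineno}: {line.rstrip()}")

    if __name__ == "__main__":
        for log_file in LOGS:
            scan(log_file)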

Thanks,
Soumya
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
