On 21. 04. 19 at 15:46, Andrei Borzenkov wrote:
On 21.04.2019 at 16:32, Lentes, Bernd wrote:
----- Am 21. Apr 2019 um 6:51 schrieb Andrei Borzenkov [email protected]:

On 20.04.2019 at 22:29, Lentes, Bernd wrote:


----- Am 18. Apr 2019 um 16:21 schrieb kgaillot [email protected]:


Simply stopping pacemaker and corosync by whatever mechanism your
distribution uses (e.g. systemctl) should be sufficient.

That works. But what is strange is that after a reboot both nodes are
shown as UNCLEAN. Does the cluster not remember that it has been shut down
cleanly?

No. Pacemaker does not care what state the cluster was in during the last
shutdown. What matters is what state the cluster is in now.

Aah.
The problem is that after starting pacemaker and corosync on one node, the
other is fenced because of that. (pacemaker and corosync aren't started
automatically by systemd.)


That is correct and expected behavior. If a node still has not appeared
after the timeout, pacemaker assumes the node is faulted and attempts to
proceed with the remaining nodes (after all, it is about _availability_,
and waiting indefinitely means resources won't be available). For this it
needs to ascertain the state of the missing node, so pacemaker attempts to
stonith it. Otherwise each node could attempt to start resources, resulting
in split brain and data corruption.

Either start pacemaker on all nodes at the same time (with reasonable
fuzz; doing "systemctl start pacemaker" in several terminal windows
sequentially should be enough) or set the wait_for_all option in the
corosync configuration. Note that if you have a two-node cluster, the
two_node corosync option also implies wait_for_all.
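For reference, a minimal sketch of the relevant quorum section in
/etc/corosync/corosync.conf (surrounding sections and exact values depend
on your distribution and setup):

```
quorum {
    provider: corosync_votequorum
    # two_node: 1 implicitly enables wait_for_all
    two_node: 1
    # wait_for_all can also be set explicitly; 0 disables the
    # "wait for all nodes at first startup" behavior
    wait_for_all: 1
}
```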


Hi,

but what if one node has e.g. a hardware failure and I have to wait for the
spare part?
With wait_for_all it can't start the resources.

wait_for_all is only considered during initial startup. Once the cluster is
up, a node can fail and pacemaker will fail over resources as appropriate.
When the node comes back it will join the cluster.

If your question is "how do I start an incomplete cluster" - well, you can
temporarily unset wait_for_all, or you can remove the node from the cluster
and add it back when it becomes available.

Or you can do a simple "pcs quorum unblock", or "pcs cluster quorum unblock" in old pcs versions.
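A rough sketch of those options with pcs, assuming a pcs-managed cluster
(exact subcommand syntax varies between pcs versions, and the node name is
a placeholder):

```shell
# Option 1: stop waiting for the absent node and let quorum proceed
pcs quorum unblock            # recent pcs
# pcs cluster quorum unblock  # older pcs versions

# Option 2: temporarily disable wait_for_all in the quorum configuration
pcs quorum update wait_for_all=0

# Option 3: remove the failed node, re-add it once the hardware is back
pcs cluster node remove <failed-node>
# ...later, when the spare part has arrived:
pcs cluster node add <failed-node>
```

These commands only make sense against a live cluster, so run them on a
surviving node.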


Or you can make sure you start pacemaker on all nodes simultaneously.
You do it manually anyway, so what prevents you from starting pacemaker
on all nodes close together in time? If you are using pcs, "pcs cluster
start --all" should do it for you.

Or you can live with extra stonith.

In the end it is up to you to decide which action plan is most
appropriate. What you cannot have is a computer reading your mind and
knowing when it is safe to ignore a missing node.
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

