Hi,
So you are saying I should not use the notion of "primary". OK.

When I have 3 nodes, won't 1 node have the VIP? How is this node defined in 
Pacemaker's terminology if "primary" is inappropriate?

Best Regards



________________________________
 From: Digimer <[email protected]>
To: Hermes Flying <[email protected]>; General Linux-HA mailing list 
<[email protected]> 
Sent: Sunday, December 2, 2012 8:22 PM
Subject: Re: [Linux-HA] Corosync on cluster with 3+ nodes
 
On 12/02/2012 02:56 AM, Hermes Flying wrote:
> Hi,
> For a cluster with 2 nodes, it was explained to me what would happen. The 
> other node will take over using fencing.

It will take over *after* fencing. Two separate concepts.

Fencing ensures that a lost node is truly gone and not just partitioned.
Once fencing succeeds and the lost node is known to be down, _then_
recovery of service(s) that had been running on the victim will begin.

> But in clusters with 3+ nodes what happens when corosync fails? I assume that 
> if the communication fails with the primary, all other nodes consider 
> themselves eligible to become primaries. Is this the case?

Corosync failing will be treated as a failure of the node, and the node
will be removed and fenced. Any services that had been running on it may
or may not be recovered, depending on the rules defined for that given
service. If a service is recovered, where it is restarted depends on
how that service was configured.

> 1) If a node has a problem communicating with the primary AND has a network 
> problem with the rest of the network (clients), does it still try to become 
> the primary (try to kill other nodes)?

Please drop the idea of pacemaker being "primary"; that's the wrong way
to look at it.

If pacemaker (via corosync) loses contact with its peer(s), then it
checks the quorum policy. If quorum is enabled, it checks to see if it
has quorum. If it does, it will try to fence its peer. If it doesn't,
it will shut down any services it might have been running. Likely in
this case, one of the nodes with quorum will fence it shortly.
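
As a rough illustration of the decision described above, here is a minimal
Python sketch. It is not pacemaker code. The names (Action, has_quorum,
on_peer_lost) are made up for the example, and it assumes one vote per node
with quorum meaning a strict majority of votes.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical labels for what a node does after losing its peer(s)."""
    FENCE_PEER = "fence the lost peer"
    STOP_SERVICES = "stop local services"


def has_quorum(votes_seen: int, total_votes: int) -> bool:
    # Assumption: one vote per node, quorum is a strict majority.
    return votes_seen > total_votes // 2


def on_peer_lost(votes_seen: int, total_votes: int,
                 quorum_enabled: bool = True) -> Action:
    """Decide what a node does once it loses contact with its peer(s).

    votes_seen counts the node itself plus every peer it can still reach.
    """
    if not quorum_enabled:
        # With quorum disabled, the node goes straight to fencing.
        return Action.FENCE_PEER
    if has_quorum(votes_seen, total_votes):
        return Action.FENCE_PEER
    # No quorum: shut down whatever services were running locally.
    return Action.STOP_SERVICES


if __name__ == "__main__":
    # 3-node cluster: the two nodes that still see each other keep quorum,
    # the isolated node does not.
    print(on_peer_lost(votes_seen=2, total_votes=3))
    print(on_peer_lost(votes_seen=1, total_votes=3))
```

In a 3-node cluster this means the pair that can still talk to each other
fences the isolated node, while the isolated node stops its own services.
Recovery of services only starts after the fence succeeds.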

> 2) In practice, if corosync fails but the primary is still up and running 
> and serving requests, do the other nodes attempt to "kill" the primary? 
> Or do you use some other way to figure out that this is a network failure 
> and the primary has not crashed?

Again, drop the notion of "primary". Whether a node tries to fence its
peer is a question of whether it has quorum (or whether quorum is
disabled). Failing corosync is the same as failing the whole node.
Pacemaker will fail if corosync dies.

> 3) Finally, on corosync failure I assume the primary does nothing, as it 
> does not care about the backups. Is this correct?

This question doesn't make sense.

> Thank you!

np

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
