On 05/03/2017 05:29 PM, Adam Carheden wrote:
On 05/03/2017 01:46 AM, Alexandre DERUMIER wrote:
Maybe this is because a node only reboots if an HA VM is present on it?


If you had HA VMs on all 4 nodes, I think all nodes would have been rebooted by the
watchdog (as you lost quorum on all 4 nodes).
That must be it. I have one HA VM and a few non-HA VMs, all just
testing. I think the takeaway is not to run an even number of HA nodes
in production (or to use a corosync QDevice as Thomas suggests).

Yes, as my answer stated. ;-)


Am I correct that all PVE nodes contribute to the quorum voting even if
they're not part of an HA group?
Yes, because quorum is not just used for HA; it is there for all cluster activities, as they all need to be consistent and reliably synchronized.
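
To make the arithmetic concrete, here is a minimal Python sketch of a strict-majority quorum check. It assumes one vote per node and the "more than half of the expected votes" rule, which matches the "Expected votes: 4 / Quorum: 3" figures in the pvecm status output quoted later in this thread. The function names are hypothetical and not part of any PVE or corosync API.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for a strict majority of the expected votes (assumed rule)."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if the votes that can still see each other reach the threshold."""
    return votes_present >= quorum_threshold(expected_votes)


# 4-node test cluster, one vote per node, split 2/2 between two rooms.
expected = 4
print(quorum_threshold(expected))  # 3
print(is_quorate(2, expected))     # False: neither half of a 2/2 split is quorate
```

With an even node count a clean split leaves both halves below the threshold, which is why every node counts here, whether or not it is in an HA group.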


My production cluster will have 6 nodes (more redundancy, same
datacenter, less network risk). To prevent cluster shutdown in
production when I'll have lots more HA VMs, I can just add an old
cheap-o box as 7th node for quorum and not put it in any HA groups?

Yes, that would be a good option. But you certainly need to put some thought into where you place this machine. If it is, for example, in room A and room B catches fire (just for the example's sake :) ), then the nodes in room A are still quorate, but not vice versa: if room A gets cut off, room B will not have quorum. An option would be to place it in a third room so that it is an independent arbitrator.
But that is naturally not an option for everyone.
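
The placement trade-off can be checked with a small Python sketch. It assumes one vote per node, a strict-majority rule, and that all rooms other than the lost one can still reach each other. The room names and the helper are hypothetical, for illustration only.

```python
def survivors_quorate(votes_by_room: dict, lost_room: str) -> bool:
    """Check whether the remaining rooms together still hold a strict majority."""
    total = sum(votes_by_room.values())
    surviving = total - votes_by_room[lost_room]
    return surviving >= total // 2 + 1


# 7th node placed in room A: 3 + 1 votes in A, 3 votes in B.
same_room = {"A": 4, "B": 3}
print(survivors_quorate(same_room, "B"))   # True: room A alone keeps quorum
print(survivors_quorate(same_room, "A"))   # False: room B alone has only 3 of 7

# 7th node placed in a third room C as an independent arbitrator.
third_room = {"A": 3, "B": 3, "C": 1}
print(survivors_quorate(third_room, "A"))  # True: B plus C hold 4 of 7
print(survivors_quorate(third_room, "B"))  # True: A plus C hold 4 of 7
```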


Alternatively, is there a way to exclude one of my 6 nodes from the HA
quorum voting? In a CEPH cluster, quorum is determined by nodes running
the monitor service, and not all nodes have to run the monitor service.
Is there an equivalent "no monitor" configuration in PVE?

As Dietmar hinted: you can configure how many votes a node provides (must be >= 1). This can be configured either on node addition or by editing the corosync configuration file in:
/etc/pve/corosync.conf

So you could just give one node in the 'more reliable' room two votes and you achieve the same
as with an additional machine in the same room.
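
As a rough illustration of the weighted-vote idea, the following Python sketch counts votes per node instead of counting nodes. It assumes a strict-majority rule over the sum of all configured votes. The node names and the helper are made up for the example; in corosync.conf the per-node weight is, to my knowledge, the quorum_votes setting of a nodelist entry (see man corosync.conf to confirm).

```python
def partition_is_quorate(node_votes: dict, partition: list) -> bool:
    """True if the nodes in `partition` hold a strict majority of all votes."""
    total = sum(node_votes.values())
    present = sum(node_votes[name] for name in partition)
    return present >= total // 2 + 1


# Hypothetical 6-node cluster: a1 in the 'more reliable' room A gets 2 votes.
weighted = {"a1": 2, "a2": 1, "a3": 1, "b1": 1, "b2": 1, "b3": 1}
room_a = ["a1", "a2", "a3"]
room_b = ["b1", "b2", "b3"]

print(partition_is_quorate(weighted, room_a))  # True: 4 of 7 votes
print(partition_is_quorate(weighted, room_b))  # False: 3 of 7 votes
```

The totals match the 7-machine layout with the extra box in room A, so the same caveat applies: losing room A takes the majority with it.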

See:
http://pve.proxmox.com/pve-docs/chapter-pvecm.html#edit-corosync-conf
# man corosync.conf

cheers,
Thomas

Thanks

----- Original message -----
From: "Adam Carheden" <carhe...@ucar.edu>
To: "proxmoxve" <pve-user@pve.proxmox.com>
Sent: Tuesday, 2 May 2017 17:40:37
Subject: [PVE-User] Expected fencing behavior on a bifurcated 4-node HA cluster

What's supposed to happen if two nodes in a 4-node HA cluster go offline?


I have a 4-node test cluster, two nodes are in one server room and the
other two in another server room. I had HA inadvertently tested for me
this morning due to an unexpected network issue and watchdog rebooted
two of the nodes.

I think this is the expected behavior, and certainly seems like what I
want to happen. However, quorum is 3, not 2, so why didn't all 4 nodes
reboot?

# pvecm status
Quorum information
------------------
Date: Tue May 2 09:35:23 2017
Quorum provider: corosync_votequorum
Nodes: 4
Node ID: 0x00000001
Ring ID: 4/524
Quorate: Yes

Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate

Membership information
----------------------
Nodeid Votes Name
0x00000004 1 192.168.0.11
0x00000003 1 192.168.0.203
0x00000001 1 192.168.0.204 (local)
0x00000002 1 192.168.0.206

# ha-manager status
quorum OK
master node3 (active, Tue May 2 09:35:24 2017)
lrm node1 (idle, Tue May 2 09:35:27 2017)
lrm node2 (active, Tue May 2 09:35:26 2017)
lrm node3 (idle, Tue May 2 09:35:23 2017)
lrm node3 (idle, Tue May 2 09:35:23 2017)

Somehow proxmox was smart enough to keep two of the nodes online, but
with a quorum of 3 neither group should have had quorum. How does it
decide which group to keep online?

Thanks

_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user



