Hi, check your switch configuration and make sure IGMP snooping is configured correctly. I had this problem a few months ago.
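In case it helps with narrowing this down, here is a rough sketch of a multicast probe in Python that could be run between two nodes. It is only an illustration: the group address 239.192.0.1, the port 5406 and all function names are my own assumptions, not what corosync uses on your cluster, and a dedicated tool such as omping is the more usual choice. Snooping problems typically only show up after a few minutes, so the receiver should run for a good while.

```python
import socket
import struct
import time

GROUP = "239.192.0.1"  # assumption: arbitrary test group, not the cluster's real one
PORT = 5406            # assumption: arbitrary free UDP port, not corosync's port


def membership_request(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Build the 8-byte ip_mreq structure passed to IP_ADD_MEMBERSHIP."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def send_probes(count: int, interval: float = 1.0) -> None:
    """Send `count` numbered datagrams to the test group, one per interval."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # TTL 1 keeps the probes on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        for seq in range(count):
            sock.sendto(struct.pack("!I", seq), (GROUP, PORT))
            time.sleep(interval)


def receive_probes(duration: float, local_ip: str = "0.0.0.0") -> list:
    """Join the group and return the sequence numbers seen within `duration` seconds."""
    seen = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(GROUP, local_ip))
        deadline = time.monotonic() + duration
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, _ = sock.recvfrom(64)
            except socket.timeout:
                break
            if len(data) == 4:
                seen.append(struct.unpack("!I", data)[0])
    return seen


def missing(seen: list, sent: int) -> list:
    """Sequence numbers in range(sent) that never arrived."""
    got = set(seen)
    return [seq for seq in range(sent) if seq not in got]
```

Start `receive_probes(600)` on one node first, then `send_probes(600)` on another; `missing(seen, 600)` on the receiver's result then lists the probes that were lost. If probes arrive at first and stop after a few minutes, that would point at the switch rather than at Proxmox.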
Best regards
Kevin Meyer
Projects

Treml & Sturm Datentechnik GmbH
Mühlheimer Straße 209
D-63075 Offenbach am Main
Germany

Phone: +49 (0) 69 - 8990820
Fax: +49 (0) 69 - 89908233
E-Mail: [email protected]
Internet: http://www.treml-sturm.de

Managing partners: Johannes Treml and Roland Sturm
Registered office: Offenbach am Main
Court of registration: Amtsgericht Offenbach am Main
Registration number: 5 HRB 10140
VAT ID: DE 182038999

From: pve-user [mailto:[email protected]] On behalf of Guy Plunkett
Sent: Wednesday, 17 February 2016 12:47
To: PVE User List <[email protected]>
Subject: Re: [PVE-User] Proxmox 4.1 cluster issue

I've just rebuilt all my Proxmox heads and created a new cluster. No HA. This was working just fine before upgrading to Proxmox 4.1.

Within 5 minutes of adding all 4 systems to the cluster, proxmox03 and proxmox01 dropped from the cluster group.

I'm seeing the following filling up the logs:

Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e
Feb 17 11:42:23 proxmox04 corosync[34115]: [TOTEM ] Retransmit List: 4d7 4d8 4d9 4da 4db 4dc 4dd 4de 4df 4e0 4e1 4e2 4e

Feb 17 11:44:48 proxmox01 corosync[3195]: [MAIN ] Completed service synchronization, ready to provide servi
Feb 17 11:44:54 proxmox01 corosync[3195]: [TOTEM ] A new membership (10.240.0.100:220) was formed. Members l
Feb 17 11:44:54 proxmox01 corosync[3195]: [TOTEM ] Failed to receive the leave message. failed: 3 1
Feb 17 11:44:54 proxmox01 pmxcfs[3172]: [dcdb] notice: members: 4/3172
Feb 17 11:44:54 proxmox01 pmxcfs[3172]: [status] notice: members: 4/3172
Feb 17 11:44:54 proxmox01 pmxcfs[3172]: [status] notice: node lost quorum
Feb 17 11:44:54 proxmox01 corosync[3195]: [QUORUM] This node is within the non-primary component and will NO
Feb 17 11:44:54 proxmox01 corosync[3195]: [QUORUM] Members[1]: 4
Feb 17 11:44:54 proxmox01 corosync[3195]: [MAIN ] Completed service synchronization, ready to provide servi
Feb 17 11:44:54 proxmox01 pmxcfs[3172]: [dcdb] crit: received write while not quorate - trigger resync
Feb 17 11:44:54 proxmox01 pmxcfs[3172]: [dcdb] crit: leaving CPG group
Feb 17 11:44:55 proxmox01 pmxcfs[3172]: [dcdb] notice: start cluster connection
Feb 17 11:44:55 proxmox01 pmxcfs[3172]: [dcdb] notice: members: 4/3172
Feb 17 11:44:55 proxmox01 pmxcfs[3172]: [dcdb] notice: all data is up to date

----
Guy

On 17 Feb 2016, at 07:23, Thomas Lamprecht <[email protected]> wrote:

Note that /etc/cluster/cluster.conf isn't needed anymore; everything cluster-relevant will be read from /etc/pve/corosync.conf (which looks good as far as I can see).
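As an aside for reading the logs: corosync prints node IDs rather than host names, so a small sketch like the one below can map one to the other from the nodelist in the corosync.conf posted further down. The helper names `parse_conf` and `node_names_by_id` are made up for illustration and are not part of any Proxmox or corosync tooling, and the parsing assumes the plain one-directive-per-line layout, so treat it as a rough aid rather than a real parser.

```python
# Nodelist copied from the corosync.conf posted in this thread.
SAMPLE = """\
nodelist {
  node {
    name: proxmox04
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox04
  }
  node {
    name: proxmox03
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox03
  }
  node {
    name: proxmox02
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox02
  }
  node {
    name: proxmox01
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox01
  }
}
"""


def parse_conf(text):
    """Parse 'name {', 'key: value' and '}' lines into nested dicts.

    Each section name maps to a list of dicts, because sections
    such as 'node' can occur more than once.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("{"):
            child = {}
            stack[-1].setdefault(line[:-1].strip(), []).append(child)
            stack.append(child)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing brace")
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip()
    return root


def node_names_by_id(conf):
    """Return {nodeid: name} for every node in every nodelist section."""
    return {
        int(node["nodeid"]): node["name"]
        for nodelist in conf.get("nodelist", [])
        for node in nodelist.get("node", [])
    }


names = node_names_by_id(parse_conf(SAMPLE))
print([names[i] for i in (4, 3, 2)])  # ['proxmox01', 'proxmox02', 'proxmox03']
```

Read this way, a line such as "Members[3]: 4 3 2" would list proxmox01, proxmox02 and proxmox03, leaving node ID 1 (proxmox04) as the absent member.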
You said you upgraded; are you really _really_ sure you did not miss a step (no offense)? I assume you rebuilt the cluster cleanly with pvecm addnode <...>?

Can you also post your /etc/hostname and /etc/network/interfaces? The nodes seem to be able to connect initially, though, so those should be fine...

proxmox04 seems to be the problem, as the others can connect just fine. Can you post what's happening there with:

$ journalctl -u corosync.service -u pve-cluster.service -b

That way we filter out (possibly) irrelevant other logging.

cheers,
Thomas

On 02/16/2016 07:46 PM, Guy Plunkett wrote:

Hello,

I've upgraded my Dell M1000 blade centre to Proxmox 4.1. The upgrade seemed to go fine; however, I can't seem to have all 4 nodes connected at once. It works for a short time, then one node disappears. I can SSH to it just fine, and after I restart corosync and pve-cluster it joins again, but shortly afterwards another node disappears. Finally a node crashes and restarts. There is nothing in the syslogs as to why this node crashed.

I've spent 2 days fighting with this to try and resolve it. This was working just fine on 3.x.

Please can someone help here? I'm pulling my hair out trying to get this working, and I don't have much left!

Cheers,
-Guy

Feb 16 16:32:50 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35536) was formed. Members
Feb 16 16:32:50 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:32:50 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:32:53 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35540) was formed. Members
Feb 16 16:32:53 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:32:53 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:32:56 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35544) was formed. Members
Feb 16 16:32:56 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:32:56 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:32:59 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35548) was formed. Members
Feb 16 16:32:59 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:32:59 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:33:02 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35552) was formed. Members
Feb 16 16:33:02 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:33:02 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:33:05 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35556) was formed. Members
Feb 16 16:33:05 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:33:05 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:33:08 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35560) was formed. Members
Feb 16 16:33:08 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:33:08 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:33:11 proxmox01 corosync[5747]: [TOTEM ] A new membership (10.240.0.100:35564) was formed. Members
Feb 16 16:33:11 proxmox01 corosync[5747]: [QUORUM] Members[3]: 4 3 2
Feb 16 16:33:11 proxmox01 corosync[5747]: [MAIN ] Completed service synchronization, ready to provide service.
Feb 16 16:36:45 proxmox01 rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="2723" x-info="http://www.rsyslog.com"] start
Feb 16 16:36:45 proxmox01 systemd-modules-load[999]: Module 'fuse' is builtin
Feb 16 16:36:45 proxmox01 systemd-modules-load[999]: Inserted module 'vhost_net'
Feb 16 16:36:45 proxmox01 hdparm[1031]: Setting parameters of disc: (none).
Feb 16 16:36:45 proxmox01 lvm[1280]: 3 logical volume(s) in volume group "pve" now active

# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster name="Cork-Training" config_version="6">
  <cman keyfile="/var/lib/pve-cluster/corosync.authkey">
  </cman>
  <clusternodes>
    <clusternode name="proxmox01" votes="1" nodeid="1"/>
    <clusternode name="proxmox02" votes="1" nodeid="2"/>
    <clusternode name="proxmox03" votes="1" nodeid="3"/>
    <clusternode name="proxmox04" votes="1" nodeid="4"/>
  </clusternodes>
</cluster>

# cat /etc/pve/corosync.conf
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: proxmox04
    nodeid: 1
    quorum_votes: 1
    ring0_addr: proxmox04
  }
  node {
    name: proxmox03
    nodeid: 2
    quorum_votes: 1
    ring0_addr: proxmox03
  }
  node {
    name: proxmox02
    nodeid: 3
    quorum_votes: 1
    ring0_addr: proxmox02
  }
  node {
    name: proxmox01
    nodeid: 4
    quorum_votes: 1
    ring0_addr: proxmox01
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: Cork-Training
  config_version: 6
  ip_version: ipv4
  secauth: on
  version: 2
  interface {
    bindnetaddr: 10.240.0.100
    ringnumber: 0
  }
}

----
Guy

_______________________________________________
pve-user mailing list
[email protected]
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
