Re: [PVE-User] Whole cluster brokes

2017-03-10 Thread Daniel
Hi there, I got the same error today again after adding VLANs on the switch and removing them again:
Mar 10 15:51:59 host01 systemd[1]: Stopping The Proxmox VE cluster filesystem...
Mar 10 15:52:00 host01 corosync[14350]: [TOTEM ] A new membership (10.0.2.110:127136) was formed. Members
Mar
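
When membership churns like this right after switch-side VLAN changes, the usual first step is to check whether corosync still sees all nodes and whether pmxcfs is mounted. A minimal sketch, assuming the standard Proxmox VE service and tool names:

    pvecm status                           # quorum and membership as seen locally
    corosync-cfgtool -s                    # ring/interface status of corosync
    systemctl status corosync pve-cluster  # both services should be active
    mount | grep /etc/pve                  # pmxcfs should be mounted here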

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Thomas Lamprecht
Hi, On 03/08/2017 01:12 PM, Daniel wrote: Hi, I was able to resolve this by myself. After I restarted the network interface (bonding) it was working again. So maybe the problem was the bonding in that case. Ok, glad to hear! Cheers, Thomas

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi, I was able to resolve this by myself. After I restarted the network interface (bonding) it was working again. So maybe the problem was the bonding in that case. -- Regards, Daniel On 08.03.17, 12:51, "pve-user on behalf of Daniel" wrote
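
For reference, on a stock Proxmox VE / Debian ifupdown setup, bouncing a bonded interface looks roughly like the sketch below; bond0 and vmbr0 are assumed names, so adjust them to the actual /etc/network/interfaces layout:

    # take the bridge and the bond below it down, then bring them up again
    ifdown vmbr0 bond0 && ifup bond0 vmbr0
    # or, more disruptively, restart networking as a whole
    systemctl restart networking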

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi, there are absolutely no network changes at all. I got some strange errors:
omping: Can't get addr info for omping: Name or service not known
On host01 it is working with: omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
On host2 I get an error with: omping -c 10 -i 1 -q 10.0.2.111 10.0.2.110
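
omping resolves every address given on the command line, so "Name or service not known" usually means one of the arguments does not resolve on that node. If hostnames are used instead of IPs, they have to resolve on every node, for example via /etc/hosts; a minimal sketch with hostnames assumed from this thread:

    # /etc/hosts entries on both nodes (names assumed, adjust to the real ones)
    10.0.2.110 host01
    10.0.2.111 host02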

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Thomas Lamprecht
Hi, On 03/08/2017 11:38 AM, Daniel wrote: Hi, when I try the command with two nodes I get the following error. So it really seems to be a multicast problem.
root@host01:~# omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
10.0.2.111 : waiting for response msg
10.0.2.111 : waiting for response msg
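
"waiting for response msg" by itself does not yet prove a multicast problem: omping only answers while it is running on the peer as well, so the test has to be started on all listed nodes at roughly the same time. A sketch with the addresses from this thread (host02 is an assumed name):

    # start these within a short window of each other
    root@host01:~# omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
    root@host02:~# omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
    # once both are running, the waiting lines stop and omping prints
    # unicast and multicast loss statistics at the end of the run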

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi, ok, it seems that multicast is not working anymore. But how can this happen? It was working before without any trouble. -- Regards, Daniel On 08.03.17, 11:15, "pve-user on behalf of Thomas Lamprecht" wrote:
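
One common way this happens without any change on the Proxmox nodes themselves: the switch does IGMP snooping, the removed node happened to be the IGMP querier, and once the snooped group membership times out the switch stops forwarding the corosync multicast traffic. Whether snooping or a querier is in play on the Linux bridge side can be checked via sysfs; vmbr0 is an assumed bridge name:

    cat /sys/class/net/vmbr0/bridge/multicast_snooping   # 1 = snooping enabled
    cat /sys/class/net/vmbr0/bridge/multicast_querier    # 1 = bridge acts as querier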

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
And I got a new error. When I run the omping command I got this:
omping -c 10 -i 1 -q 10.0.2.111
omping: Can't find local address in arguments
Maybe this is correct? -- Regards, Daniel On 08.03.17, 11:15, "pve-user on behalf of Thomas Lamprecht" wrote
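
That error is expected for this invocation: omping wants the addresses of all participating nodes on the command line, and one of them must be a local address of the host it runs on, so listing only the remote node fails. A minimal correct call with the addresses from the thread:

    # fails: only the remote address is given, no local one
    omping -c 10 -i 1 -q 10.0.2.111
    # works: list all nodes, including this host's own address
    omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111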

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi, when I try the command with two nodes I get the following error. So it really seems to be a multicast problem.
root@host01:~# omping -c 10 -i 1 -q 10.0.2.110 10.0.2.111
10.0.2.111 : waiting for response msg
10.0.2.111 : waiting for response msg
I can't restart pve-cluster; I get errors.
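
If pve-cluster refuses to restart, the reason is usually visible in the journal, and restarting corosync together with it often helps once the network is sane again. A sketch, assuming the standard unit names:

    journalctl -u pve-cluster -u corosync --since "15 min ago"  # why the restart fails
    systemctl restart corosync
    systemctl restart pve-cluster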

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Thomas Lamprecht
Hi, On 03/08/2017 11:02 AM, Daniel wrote: Hi, the cluster was working all the time pretty well. Yes, but if this particular node acted as a querier, the cluster would have worked fine while it was present; removing it results in no more querier and thus problems. It's at least worth a try to look this up,
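
If the switch's IGMP snooping cannot be changed, a common workaround is to make one of the remaining Proxmox nodes act as IGMP querier on its bridge. A sketch, assuming vmbr0 is the cluster-facing bridge:

    # non-persistent: enable the querier on this bridge right now
    echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier
    # to persist it, a post-up line can go into the vmbr0 stanza
    # of /etc/network/interfaces:
    #   post-up echo 1 > /sys/class/net/vmbr0/bridge/multicast_querier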

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi, the cluster was working all the time pretty well. So actually I found out that the PVE cluster filesystem is not mounted. And here you can also see some of the logs you asked for ;)
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/lib/systemd/system/corosync.service; enabled)
   Active: active
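
corosync being active while /etc/pve is missing points at pmxcfs, i.e. the pve-cluster service, rather than corosync itself. A quick way to confirm and recover, assuming standard unit names:

    mount | grep /etc/pve           # empty output means pmxcfs is not mounted
    systemctl status pve-cluster    # pmxcfs runs inside this unit
    systemctl start pve-cluster     # starting it mounts /etc/pve again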

Re: [PVE-User] Whole cluster brokes

2017-03-08 Thread Thomas Lamprecht
On 03/08/2017 10:40 AM, Daniel wrote: Hi there, a colleague removed one server from the datacenter and after that the whole cluster is broken: Did this server act as a multicast querier? That could explain the behavior. Check if your switch has IGMP snooping set up; if yes, you could disable it
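
With IGMP snooping and no querier left, the failure is typically delayed: corosync keeps working for a few minutes until the snooped group membership times out. A longer omping run makes that visible; the roughly ten-minute duration is an assumption, chosen to outlive common IGMP membership timeouts:

    # run on both nodes in parallel for about ten minutes
    omping -c 600 -i 1 -q 10.0.2.110 10.0.2.111
    # multicast loss that starts only after a few minutes points at
    # IGMP snooping dropping the group for lack of a querier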

[PVE-User] Whole cluster brokes

2017-03-08 Thread Daniel
Hi there, a colleague removed one server from the datacenter and after that the whole cluster is broken:
Mar 8 10:35:00 host01 pvestatd[2090]: ipcc_send_rec failed: Connection refused
Mar 8 10:35:00 host01 pvestatd[2090]: ipcc_send_rec failed: Connection refused
Mar 8 10:35:00 host01
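
"ipcc_send_rec failed: Connection refused" means pvestatd cannot reach pmxcfs over its local IPC socket, i.e. the pve-cluster service is not running on that node. First things to check, assuming the standard Proxmox VE units:

    systemctl status pve-cluster corosync   # is pmxcfs / corosync running at all?
    journalctl -u pve-cluster -b            # why pmxcfs stopped or failed to start
    pvecm status                            # quorum and membership as seen locally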