Or, rather than reinstalling, make sure that multicast is enabled and that omping works between the nodes, as mentioned in the docs.
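For example (a rough sketch; node1, node2 and node3 stand in for your actual cluster hostnames, and omping must be installed on every node):

```shell
# Run the same command on all nodes at roughly the same time;
# node1..node3 are placeholders for your actual corosync hostnames.

# Short burst to verify basic multicast connectivity:
omping -c 10000 -i 0.001 -F -q node1 node2 node3

# Longer run (~10 minutes) to catch IGMP snooping querier problems:
omping -c 600 -i 1 -q node1 node2 node3
```

If the reported multicast loss is anything other than 0%, fix the network (e.g. IGMP snooping settings on the switch) before touching the cluster config.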
> On Oct 30, 2018, at 08:37, Gilberto Nunes <[email protected]> wrote:
>
> Consider reinstalling Proxmox.
> ---
> Gilberto Nunes Ferreira
>
> (47) 3025-5907
> (47) 99676-7530 - Whatsapp / Telegram
>
> Skype: gilberto.nunes36
>
> On Tue, Oct 30, 2018 at 13:28, Adam Weremczuk <[email protected]> wrote:
>
>> It doesn't appear to be related to /etc/hosts.
>> I've reverted them to defaults on all systems, commented out the IPv6
>> sections and restarted all nodes.
>> The problem on node1 (lion) persists:
>>
>> systemctl status pve-cluster.service
>> ● pve-cluster.service - The Proxmox VE cluster filesystem
>>    Loaded: loaded (/lib/systemd/system/pve-cluster.service; enabled; vendor preset: enabled)
>>    Active: active (running) since Tue 2018-10-30 16:18:10 GMT; 3min 7s ago
>>   Process: 1864 ExecStartPost=/usr/bin/pvecm updatecerts --silent (code=exited, status=0/SUCCESS)
>>   Process: 1819 ExecStart=/usr/bin/pmxcfs (code=exited, status=0/SUCCESS)
>>  Main PID: 1853 (pmxcfs)
>>     Tasks: 6 (limit: 4915)
>>    Memory: 46.4M
>>       CPU: 699ms
>>    CGroup: /system.slice/pve-cluster.service
>>            └─1853 /usr/bin/pmxcfs
>>
>> Oct 30 16:18:08 lion pmxcfs[1853]: [dcdb] crit: can't initialize service
>> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: cpg_initialize failed: 2
>> Oct 30 16:18:08 lion pmxcfs[1853]: [status] crit: can't initialize service
>> Oct 30 16:18:10 lion systemd[1]: Started The Proxmox VE cluster filesystem.
>> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: update cluster info (cluster name MS-HA-Cluster, version = 1)
>> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: node has quorum
>> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: members: 1/1853
>> Oct 30 16:18:14 lion pmxcfs[1853]: [dcdb] notice: all data is up to date
>> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: members: 1/1853
>> Oct 30 16:18:14 lion pmxcfs[1853]: [status] notice: all data is up to date
>>
>>> On 30/10/18 15:06, Adam Weremczuk wrote:
>>> I have indeed modified /etc/hosts on all nodes.
>>> That's because DNS will be served from one of the containers on the cluster.
>>> I don't want the cluster nodes to rely on DNS when communicating with
>>> each other.
>>> Maybe I'm trying to duplicate what Proxmox already does under the hood?
>>>
>>> Anyway, my hosts files look like below:
>>>
>>> node1
>>> 192.168.8.101 node1.example.com node1 pvelocalhost
>>> 192.168.8.102 node2.example.com node2
>>> 192.168.8.103 node3.example.com node3
>>>
>>> node2
>>> 192.168.8.101 node1.example.com node1
>>> 192.168.8.102 node2.example.com node2 pvelocalhost
>>> 192.168.8.103 node3.example.com node3
>>>
>>> node3
>>> 192.168.8.101 node1.example.com node1
>>> 192.168.8.102 node2.example.com node2
>>> 192.168.8.103 node3.example.com node3 pvelocalhost
>>>
>>> + an IPv6 section (identical on all), which I should probably comment out:
>>>
>>> ::1 ip6-localhost ip6-loopback
>>> fe00::0 ip6-localnet
>>> ff00::0 ip6-mcastprefix
>>> ff02::1 ip6-allnodes
>>> ff02::2 ip6-allrouters
>>> ff02::3 ip6-allhosts
>>>
>>>> On 30/10/18 14:54, Gilberto Nunes wrote:
>>>> How about the /etc/hosts file?
>>>> Remember that Proxmox needs to know its IP and hostname correctly
>>>> in order to start the CRM accordingly.
>>>>
>>>> On Tue, Oct 30, 2018 at 11:47, Adam Weremczuk <[email protected]> wrote:
>>>>
>>>> Yes, I have 3 nodes (2 x Lenovo servers + a VM), all on the same LAN
>>>> with static IPv4 addresses.
>>>> They can happily ping each other and the Proxmox web GUI looks OK on all 3.
>>>> No IPv6 in use.
>>>>
>>>> "systemctl status pve-cluster.service" looks clean on the other nodes
>>>> but on this troublesome one returns:
>>>>
>>>> Active: active (running)
>>>> (...)
>>>> Oct 30 14:17:10 lion pmxcfs[18003]: [dcdb] crit: can't initialize service
>>>> Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: cpg_initialize failed: 2
>>>> Oct 30 14:17:10 lion pmxcfs[18003]: [status] crit: can't initialize service
>>>>
>>>>> On 30/10/18 14:38, Gilberto Nunes wrote:
>>>>> Hi
>>>>>
>>>>> It seems to be a problem with the network connection between the servers.
>>>>> Can they ping each other?
>>>>> Is this a separate network, isolated from your LAN?
>>>>>
>>>>> On Tue, Oct 30, 2018 at 11:36, Adam Weremczuk <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> My errors:
>>>>>>
>>>>>> Connection error 500: RPCEnvironment init request failed: Unable to
>>>>>> load access control list: Connection refused
>>>>>>
>>>>>> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pveproxy[14464]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pvesr[17960]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:06 lion pvesr[17960]: Unable to load access control list: Connection refused
>>>>>> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Main process exited, code=exited, status=111/n/a
>>>>>> Oct 30 14:17:06 lion systemd[1]: Failed to start Proxmox VE replication runner.
>>>>>> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Unit entered failed state.
>>>>>> Oct 30 14:17:06 lion systemd[1]: pvesr.service: Failed with result 'exit-code'.
>>>>>> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:07 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:07 lion ntpd[1700]: Soliciting pool server 2001:4860:4806:8::
>>>>>> Oct 30 14:17:07 lion pve-ha-lrm[1980]: updating service status from manager failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pvestatd[1879]: ipcc_send_rec[4] failed: Connection refused
>>>>>> Oct 30 14:17:08 lion pvestatd[1879]: status update error: Connection refused
>>>>>> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:09 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[1] failed: Connection refused
>>>>>> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[2] failed: Connection refused
>>>>>> Oct 30 14:17:10 lion pveproxy[17194]: ipcc_send_rec[3] failed: Connection refused
>>>>>> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: State 'stop-sigterm' timed out. Killing.
>>>>>> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Killing process 1813 (pmxcfs) with signal SIGKILL.
>>>>>> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Main process exited, code=killed, status=9/KILL
>>>>>> Oct 30 14:17:10 lion systemd[1]: Stopped The Proxmox VE cluster filesystem.
>>>>>> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Unit entered failed state.
>>>>>> Oct 30 14:17:10 lion systemd[1]: pve-cluster.service: Failed with result 'timeout'.
>>>>>>
>>>>>> System info:
>>>>>>
>>>>>> pveversion -v
>>>>>> proxmox-ve: 5.2-2 (running kernel: 4.15.17-1-pve)
>>>>>> pve-manager: 5.2-10 (running version: 5.2-10/6f892b40)
>>>>>> pve-kernel-4.15: 5.2-1
>>>>>> pve-kernel-4.15.17-1-pve: 4.15.17-9
>>>>>> corosync: 2.4.2-pve5
>>>>>> criu: 2.11.1-1~bpo90
>>>>>> glusterfs-client: 3.8.8-1
>>>>>> ksm-control-daemon: 1.2-2
>>>>>> libjs-extjs: 6.0.1-2
>>>>>> libpve-access-control: 5.0-8
>>>>>> libpve-apiclient-perl: 2.0-5
>>>>>> libpve-common-perl: 5.0-40
>>>>>> libpve-guest-common-perl: 2.0-18
>>>>>> libpve-http-server-perl: 2.0-11
>>>>>> libpve-storage-perl: 5.0-23
>>>>>> libqb0: 1.0.1-1
>>>>>> lvm2: 2.02.168-pve6
>>>>>> lxc-pve: 3.0.2+pve1-3
>>>>>> lxcfs: 3.0.2-2
>>>>>> novnc-pve: 1.0.0-2
>>>>>> proxmox-widget-toolkit: 1.0-20
>>>>>> pve-cluster: 5.0-30
>>>>>> pve-container: 2.0-23
>>>>>> pve-docs: 5.2-8
>>>>>> pve-firewall: 3.0-14
>>>>>> pve-firmware: 2.0-5
>>>>>> pve-ha-manager: 2.0-5
>>>>>> pve-i18n: 1.0-6
>>>>>> pve-libspice-server1: 0.12.8-3
>>>>>> pve-qemu-kvm: 2.11.1-5
>>>>>> pve-xtermjs: 1.0-5
>>>>>> qemu-server: 5.0-38
>>>>>> smartmontools: 6.5+svn4324-1
>>>>>> spiceterm: 3.0-5
>>>>>> vncterm: 1.5-3
>>>>>> zfsutils-linux: 0.7.11-pve1~bpo1
>>>>>>
>>>>>> Any idea what's wrong with my (fresh and default) installation?
>>>>>>
>>>>>> Thanks,
>>>>>> Adam

_______________________________________________
pve-user mailing list
[email protected]
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
