Why did you decide not to use multicast?

> On Aug 22, 2018, at 22:58, Ml Ml <[email protected]> wrote:
>
> Hello,
>
> I could use some hints/help, since one cluster has been letting me down
> since 29.07.2018.
> That's when one of my three nodes started to freeze and stop.
>
> In syslog the last entries are:
>
> Aug 21 02:33:00 node10 systemd[1]: Starting Proxmox VE replication runner...
> Aug 21 02:33:01 node10 systemd[1]: Started Proxmox VE replication runner.
> Aug 21 02:33:01 node10 CRON[1870491]: (root) CMD (/usr/bin/puppet agent -vt --color false --logdest /var/log/puppet/agent.log 1>/dev/null)
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^
>
> or:
>
> Aug 22 16:11:12 node08 pmxcfs[5227]: [dcdb] notice: cpg_send_message retried 1 times
> Aug 22 16:11:12 node08 pmxcfs[5227]: [status] notice: members: 1/5227, 2/5058
> Aug 22 16:11:12 node08 pmxcfs[5227]: [status] notice: starting data syncronisation
> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
>
> I already posted it here:
>
> https://forum.proxmox.com/threads/periodic-node-crash-freeze.46407/
>
> It happened at:
> 29.07.2018 node09 / pve 4.4
> 07.08.2018 node08 / pve 4.4 (then I decided to upgrade)
> 21.08.2018 node10 / pve 5.2
> 22.08.2018 node08 / pve 5.2
>
> ...and I am getting nervous now since there are 60 important VMs on it.
> As you can see, it happened across multiple nodes with different PVE versions.
>
> Memtest is okay.
>
> As far as I googled, the "^@^@^@^@^@^" appear in syslog because the file
> could not be fully written to disk?
>
> Maybe something triggers some totem/watchdog stuff which then ends in
> a disaster?
>
> My ideas from here:
> - disable corosync/totem and see if the problems stop
>
> Do you have any ideas which could narrow my problem down?
>
> My setup is a 3-node cluster (node08, node09, node10) with ceph.
>
> I have 4 other 3-node clusters running just fine.
>
> Thanks a lot.
>
> Mario
> _______________________________________________
> pve-user mailing list
> [email protected]
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
