Thank you very much Alexey, I will certainly try that and update you on the result.
Best regards!

On Mon, 12 May 2025 at 22:36, <ale...@pavlyuts.ru> wrote:

> Hi,
>
> Incidentally, I use Pacemaker as the base layer of a custom clustering
> solution, and I have a script to rebuild the second node from the first
> one. I can't share the script itself, as it has a lot of
> solution-dependent references, but I can share the sequence to rebuild
> the failed node:
>
> 1. Set up the new node with the same IP and hostname.
> 2. (Optional) Set up passwordless mutual key-based SSH access. It is
>    not necessary, but it makes a lot of things easier.
> 3. Copy these files from the surviving host to the new one:
>    1. /etc/corosync/authkey
>    2. /etc/corosync/corosync.conf
>    3. /etc/drbd.d/*.res
>    4. /etc/pacemaker/authkey
> 4. Set the hacluster user password to the same as it was on the
>    surviving node.
> 5. Re-authenticate the pcs nodes with:
>    pcs host auth <host1_name> <host2_name> -u hacluster -p <ha_cluster_pass>
> 6. Reboot the restored server.
> 7. PROFIT!!!
>
> If you use no arbiter (corosync-qnetd), this should be enough to get
> your new cluster node up and running. If you use corosync-qnetd, you
> also need to restore the corosync-qdevice nssdb keys so the second host
> can connect to the arbiter node:
>
> 1. On the surviving host, extract your arbiter certificate from the nssdb:
>    certutil -L -d /etc/corosync/qdevice/net/nssdb -n 'QNet CA' -r > /root/qnetd-cert.crt
> 2. Copy the certificate to the new host; assume the path on the new
>    host is the same.
> 3. On the new host, initialize a new nssdb with the certificate:
>    corosync-qdevice-net-certutil -i -c /root/qnetd-cert.crt
> 4. Copy the certificate and key at
>    /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12 from the old
>    node to the new one.
> 5. On the new node, import the certificate and key:
>    corosync-qdevice-net-certutil -m -c /etc/corosync/qdevice/net/nssdb/qdevice-net-node.p12
> 6. Enable or restart corosync-qdevice:
>    systemctl enable --now corosync-qdevice.service
>    or
>    systemctl restart corosync-qdevice.service
> 7. Enjoy!
>
> That's what works for me in practice, and it is included in the service
> scripts of our product, which is based on Pacemaker.
>
> Hope this could help!
>
> Sincerely,
>
> Alex
>
>
> From: Users <users-boun...@clusterlabs.org> On Behalf Of Fabrizio Ermini
> Sent: Friday, May 9, 2025 5:26 PM
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Rebuild of failed node
>
> Hi all! Freshman here, just joined.
>
> I currently need to rebuild a failed node on a Pacemaker 2.1 /
> Corosync 3.1 two-node cluster with DRBD storage.
>
> I've searched the Pacemaker docs and the list archives, but I haven't
> found a clear guide on how to proceed with this task. So far, I've
> reinstalled a new server, configured the same IP and hostname as the
> failed one, and installed all the software. I've also fixed the DRBD
> layer and started the resync of the volumes. But it's not clear to me
> how to proceed; I've found some hints online pointing to the need to
> manually copy the corosync config, but they were quite old and probably
> obsolete. I'm using pcs as a shell, and I haven't found a command
> designed to replace a node, only commands to add or remove one.
>
> It seems really strange to me that there isn't a guide, since this
> should be a very basic operation and it's quite important to know how
> to do it; HW breaks, as a matter of fact :D
>
> So I'll be very grateful if anyone can point me in the right direction.
>
> Thanks in advance, and best regards
>
> Fabrizio
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/
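Alexey's sequence can be sketched as a single script run from the surviving node. This is an illustration of the steps above, not his actual script: the node names, the hacluster password, and the use of root SSH are placeholder assumptions, and by default it only prints each command (dry run) rather than executing it.

```shell
#!/bin/sh
# Dry-run sketch of the rebuild sequence above, run on the SURVIVING node.
# NEW_NODE, HOST1/HOST2 and HA_PASS are placeholders -- adjust for your cluster.
set -eu

RUN=echo                  # "echo" = dry run (print commands); set RUN= to execute
NEW_NODE=node2            # freshly reinstalled node (same IP/hostname as before)
HOST1=node1
HOST2=node2
HA_PASS='secret'          # must match the hacluster password on the surviving node

# Step 3: copy keys and configs from the surviving host to the new one
$RUN scp /etc/corosync/authkey /etc/corosync/corosync.conf "root@$NEW_NODE:/etc/corosync/"
$RUN scp /etc/drbd.d/*.res "root@$NEW_NODE:/etc/drbd.d/"
$RUN scp /etc/pacemaker/authkey "root@$NEW_NODE:/etc/pacemaker/"

# Step 4: set the hacluster password on the new node
$RUN ssh "root@$NEW_NODE" "echo hacluster:$HA_PASS | chpasswd"

# Step 5: re-authenticate the pcs nodes
$RUN pcs host auth "$HOST1" "$HOST2" -u hacluster -p "$HA_PASS"

# Step 6: reboot the restored server
$RUN ssh "root@$NEW_NODE" reboot

# --- Only if using a corosync-qnetd arbiter: restore the qdevice nssdb keys ---
NSSDB=/etc/corosync/qdevice/net/nssdb
# Extract the arbiter CA certificate (sh -c keeps the redirect out of the dry run)
$RUN sh -c "certutil -L -d $NSSDB -n 'QNet CA' -r > /root/qnetd-cert.crt"
$RUN scp /root/qnetd-cert.crt "root@$NEW_NODE:/root/"
$RUN ssh "root@$NEW_NODE" corosync-qdevice-net-certutil -i -c /root/qnetd-cert.crt
$RUN scp "$NSSDB/qdevice-net-node.p12" "root@$NEW_NODE:$NSSDB/"
$RUN ssh "root@$NEW_NODE" corosync-qdevice-net-certutil -m -c "$NSSDB/qdevice-net-node.p12"
$RUN ssh "root@$NEW_NODE" systemctl enable --now corosync-qdevice.service
```

With RUN=echo, the script just prints the command sequence so it can be reviewed before running for real; clear RUN only after checking the printed commands against your own paths and hostnames.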
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/