Hello,

On 25.07.2017 at 16:19, J. Smith wrote:
> Does anyone has any suggestions in setting up high availability and
> automatic failover between two servers that run a Controller daemon,
> Database daemon and Mysql Database (i.e replication vs galera cluster)?
>
> Any input would be appreciated.

We use Ganeti instances for most services, in our case with KVM (configurable on a per-cluster basis) and DRBD for instance storage. On Debian they are rock solid.

While HA is experimentally possible, the default intentionally goes without automatic fail-over:
http://docs.ganeti.org/ganeti/2.15/html/design-linuxha.html#risks

From my point of view, a failing Slurm controller is such a rare event that I prefer to have a look first and only then do a manually triggered fast fail-over.

On the other hand, the (unwritten) expected SLA for most services here is 90% per week and per month, and 95% per year. That is certainly relaxed; not knowing your needs, it might just be an HPC kindergarten from your perspective.

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
☎ +49 3641 9 44323
