Hello everyone,

The subject says it all. Below I describe how far
I have got and which piece of information I am
still missing.

###################
# hostnames & IPs #
###################

Pretty standard cluster:

+------+                     +-------+
| cyan +------+          +---+ node1 |
+------+      |          |   +-------+
              |          |
          +---+----+     |   +-------+
          | switch +-----+---+ node2 |
          +---+----+     |   +-------+
              |          ~
+------+      |          |   +-------+
| blue +------+          +---+ nodeN |
+------+                     +-------+

hostname server A:        cyan
IP server A:              192.168.0.1

hostname server B:        blue
IP server B:              192.168.0.2

hostname floating server: pink
floating IP:              192.168.0.3

#########
# Mysql #
#########

I've replicated '/etc/mysql/debian.cnf' on both cyan and
blue:

### debian.cnf ###
[client]
host     = localhost
user     = debian-sys-maint
password = SysMaintPass
socket   = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host     = localhost
user     = debian-sys-maint
password = SysMaintPass
socket   = /var/run/mysqld/mysqld.sock
basedir  = /usr
##################

so that the debian-sys-maint user has the same
password (SysMaintPass) on both servers.

Moreover I've issued the following mysql command:

grant all on slurm_acct_db.* TO 'slurm'@'localhost' \
   identified by 'SlurmDBDPass' with grant option;

########
# DRBD #
########

drbd (active/passive) manages this folder (NFS):

/var/lib/mysql

(slurm database is in /var/lib/mysql/slurm_acct_db)

#############
# pacemaker #
#############

pacemaker keeps all the following services/servers
always running on the "active" server only:

floating IP
mysql file system
mysql server
slurmdbd
slurmctld

#################
# slurmdbd.conf #
#################

I've replicated '/etc/slurm/slurmdbd.conf' on both
cyan and blue servers:

### slurmdbd.conf ###
AuthType=auth/munge
DbdHost=localhost    <<<<<<--------- is this correct?
SlurmUser=slurm
StorageHost=localhost   <<<<<<------ is this correct?
StoragePass=SlurmDBDPass
StorageType=accounting_storage/mysql
StorageUser=slurm
StorageLoc=slurm_acct_db
...
#####################

I DID NOT define DbdBackupHost and StorageBackupHost
intentionally.

I believe that using 'localhost' for both DbdHost and
StorageHost is correct because both slurmdbd and the
mysql servers will always be running side by side either
on cyan or blue.
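
To make that assumption checkable, here is a minimal Python
sketch that parses Key=Value lines and confirms DbdHost and
StorageHost name the same host. It only tests that the two
values agree, not whether 'localhost' is the right value.
The function names are hypothetical, and the sample text is
the config above without the password line and the "..." part.

```python
def parse_conf(text):
    """Parse Key=Value lines into a dict, skipping blanks and '#' comments.

    Simplification: keys are treated as case-sensitive and a '#' anywhere
    starts a comment, so values that contain '#' are not handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def same_host(settings, first="DbdHost", second="StorageHost"):
    """Return True when both keys are present and carry the same value."""
    a, b = settings.get(first), settings.get(second)
    return a is not None and a == b


SLURMDBD_CONF = """\
AuthType=auth/munge
DbdHost=localhost
SlurmUser=slurm
StorageHost=localhost
StorageType=accounting_storage/mysql
StorageUser=slurm
StorageLoc=slurm_acct_db
"""

settings = parse_conf(SLURMDBD_CONF)
print(same_host(settings))
```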

What I am still missing is how to configure slurm.conf
properly (see the QUESTIONS section below).

##############
# slurm.conf #
##############

I've replicated '/etc/slurm/slurm.conf' on cyan and blue
servers + ALL NODES.

### slurm.conf ###
ControlMachine=cyan
ControlAddr=cyan
AccountingStorageHost=cyan
AccountingStorageType=accounting_storage/slurmdbd
AuthType=auth/munge
CryptoType=crypto/munge
SlurmUser=slurm
SlurmdUser=root
StateSaveLocation=/var/run/slurm/slurmctld
...
##################
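
To check that the replicated copies really are identical, a
checksum comparison along these lines could be used. This is
only a Python sketch: the function names are hypothetical, the
file contents are short placeholders, and collecting the files
from each host is left out.

```python
import hashlib


def digest(conf_text):
    """Return the SHA-256 hex digest of one copy of the config file."""
    return hashlib.sha256(conf_text.encode("utf-8")).hexdigest()


def hosts_out_of_sync(copies, reference_host):
    """Return the sorted hosts whose copy differs from the reference copy."""
    expected = digest(copies[reference_host])
    return sorted(
        host for host, text in copies.items() if digest(text) != expected
    )


# Placeholder contents keyed by hostname; in practice each value would be
# the text of /etc/slurm/slurm.conf read from that host.
copies = {
    "cyan": "ControlMachine=cyan\nControlAddr=cyan\n",
    "blue": "ControlMachine=cyan\nControlAddr=cyan\n",
    "node1": "ControlMachine=cyan\nControlAddr=cyan\n",
    "node2": "ControlMachine=cyan\n",  # deliberately truncated copy
}

print(hosts_out_of_sync(copies, "cyan"))
```

With the placeholder data above, only node2 is reported.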

#############
# QUESTIONS #
#############

1)

   What we want is an identical (replicated)
   slurm.conf file on all hosts, right?

   Or should I have a different slurm.conf file on
   the nodes?

2)

   I believe I do not have to define the following
   keywords in slurm.conf:

   BackupController=
   BackupAddr=
   AccountingStorageBackupHost=

   because the HA I want is obtained by 'moving' all
   the services I need to the failover server (that
   is, without using the HA built into slurm).

   Is that correct?

3)

   Assuming point (2) is correct, should I apply the
   following changes to slurm.conf?

   ControlMachine=pink
   ControlAddr=192.168.0.3
   AccountingStorageHost=pink

   If not, how should I set them?

4)

   Shall I also add to my DRBD/heartbeat/pacemaker
   configuration the management of the following
   folder:

   /var/run/slurm

   which is specified for 'StateSaveLocation'?
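
For question 3, this is roughly the change I have in mind,
written as a small Python sketch so the effect on the file is
explicit. Whether these are the right values is exactly what I
am asking, so treat the override table as an assumption. The
function name is hypothetical and the sample text is only the
first lines of the slurm.conf shown earlier.

```python
def apply_overrides(conf_text, overrides):
    """Return conf_text with the given Key=Value settings replaced.

    Keys that do not appear in the text are appended at the end.
    Commented-out lines are left untouched.
    """
    remaining = dict(overrides)
    out = []
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if "=" in line and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


SLURM_CONF = (
    "ControlMachine=cyan\n"
    "ControlAddr=cyan\n"
    "AccountingStorageHost=cyan\n"
    "AccountingStorageType=accounting_storage/slurmdbd\n"
)

# Assumed values: point everything at the floating hostname/IP.
FLOATING = {
    "ControlMachine": "pink",
    "ControlAddr": "192.168.0.3",
    "AccountingStorageHost": "pink",
}

print(apply_overrides(SLURM_CONF, FLOATING), end="")
```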


Thanks for your input.

--matt
