Adam,

That error looks like you already have a slurmctld running on this host (or possibly some other program that is listening on the same TCP port).
By default slurmctld binds to TCP/6817, and I don’t see a different port specified in your config file. That is probably fine; don’t change it if you don’t need to.

Try running netstat to see what is currently listening on that port:

# netstat -ltpn|grep 6817
tcp        0      0 0.0.0.0:6817            0.0.0.0:*               LISTEN      11143/slurmctld

It is likely a stale slurmctld process. If so, just kill it and try to start again.

Mike

> On Jun 15, 2015, at 9:02 AM, Cooper, Adam <[email protected]> wrote:
>
> Hi,
> I am new to SLURM and I have been tasked to install it on a cluster of 15
> servers. Right now, I have just installed SLURM on the master, and hope to
> get the daemons running and scheduling jobs there before I try to get it
> working for the whole cluster. All of the machines are running Ubuntu 12.04.
> I have worked through some errors already; however, currently when I run:
>
> sudo slurmctld -Dv
>
> I get this output:
> slurmctld: pidfile not locked, assuming no running daemon
> slurmctld: slurmctld version 14.11.7 started on cluster cluster
> slurmctld: OpenSSL cryptographic signature plugin loaded
> slurmctld: preempt/none loaded
> slurmctld: ExtSensors NONE plugin loaded
> slurmctld: Accounting storage NOT INVOKED plugin loaded
> slurmctld: layouts: no layout to initialize
> slurmctld: topology NONE plugin loaded
> slurmctld: sched: Backfill scheduler plugin loaded
> slurmctld: route default plugin loaded
> slurmctld: layouts: loading entities/relations information
> slurmctld: Recovered state of 1 nodes
> slurmctld: Recovered information about 0 jobs
> slurmctld: Recovered state of 0 reservations
> slurmctld: State of 0 triggers recovered
> slurmctld: read_slurm_conf: backup_controller not specified.
> slurmctld: Running as primary controller
> slurmctld: error: Error binding slurm stream socket: Address already in use
> slurmctld: fatal: slurm_init_msg_engine_addrname_port error Address already in use
>
> By the way, I am running the daemon as root because my boss does not want
> me to create a separate 'slurm' user. Any idea what might cause this fatal
> error? I've attached an rtf of the current slurm configuration file (I've
> REDACTED some things to keep private), which I made using the online
> configuration tool.
>
> Please let me know any more relevant information that you need. Thank you in
> advance, and sorry for my lack of knowledge; this is very new work for me.
>
> Adam Cooper
> Brown University Computer Engineering '16
>
> <slurm_conf_current.rtf>
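
For reference, the port check suggested in the reply above could be wrapped in a small shell helper. This is only a sketch: the function name listener_on_port is made up for illustration, and it assumes the usual `netstat -ltpn` column layout (local address in column 4, state in column 6, PID/program in column 7), as in the sample line quoted in the reply.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm or net-tools): read
# `netstat -ltpn`-style lines on stdin and print the PID/program
# column for every TCP listener bound to the given port.
listener_on_port() {
    port="$1"
    awk -v port="$port" '
        # Only consider TCP (tcp/tcp6) sockets in the LISTEN state.
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # The local address looks like 0.0.0.0:6817 or :::6817;
            # the port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) print $7
        }
    '
}
```

Run as root so that netstat fills in the PID/program column, for example `netstat -ltpn | listener_on_port 6817`. If the output names a slurmctld process, that is most likely the stale daemon mentioned above; empty output means nothing is listening on that port.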
