Felix,

Do the IP addresses associated with the NodeName's return proper matches when you run lookups?
What happens if you don't use IP addresses and only host names within your Slurm configuration?

John DeSantis

2015-03-17 11:30 GMT-04:00 John Desantis <[email protected]>:
> Felix,
>
> My fault, I suggested something that you already checked!
>
> John DeSantis
>
> 2015-03-17 11:28 GMT-04:00 John Desantis <[email protected]>:
>
>> Felix,
>>
>> Can you ping the nodes from the controller and vice versa?
>>
>> The snippet below looks like a potential firewall issue:
>>
>> [2015-03-16T15:40:02.845] debug2: Error connecting slurm stream socket at
>> ***.***.***.***52:6818: Connection timed out
>>
>> Try telnet'ing from the controller to each node on port 6818 and then
>> telnet'ing from each node to the controller on port 6817.
>>
>> John DeSantis
>>
>>
>> 2015-03-17 11:23 GMT-04:00 Yann Sagon <[email protected]>:
>>
>>>
>>> 2015-03-17 13:31 GMT+01:00 Felix Willenborg <
>>> [email protected]>:
>>>
>>>>
>>>> Hi there,
>>>>
>>>> first of all, i'm kinda new to slurm, so hopefully i may have missed
>>>> something very basic here.
>>>>
>>>>
>>>> slurmctld.log
>>>> ------------------------------------------------------------
>>>> ------------------------------------------------------------
>>>> ------------------------------------
>>>> [2015-03-16T15:39:54.813] debug: sched: slurmctld starting
>>>> [2015-03-16T15:39:54.817] error: Configured MailProg is invalid
>>>> [2015-03-16T15:39:54.817] debug3: Trying to load plugin
>>>> /usr/lib/slurm/accounting_storage_filetxt.so
>>>> [2015-03-16T15:39:54.817] debug2: slurmdb_init() called
>>>> [2015-03-16T15:39:54.817] Accounting storage FileTxt plugin loaded
>>>> [2015-03-16T15:39:54.818] debug3: Success.
>>>> [2015-03-16T15:39:54.818] debug3: not enforcing associations and no
>>>> list was given so we are giving a blank list
>>>> [2015-03-16T15:39:54.818] debug3: Version in assoc_mgr_state header is 1
>>>> [2015-03-16T15:39:54.818] slurmctld version 2.6.5 started on cluster
>>>> cluster
>>>>
>>>
>>> As you are new to slurm, I would as first step suggest to try with
>>> latest slurm version. (14.11.4)
>>>
>>>
>>>
>>
>>
>
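
If telnet isn't installed on the nodes, the checks suggested in this thread (the name lookups, and the port tests on 6818 toward each node and 6817 toward the controller) could be scripted roughly as below. This is only a sketch using Python's standard socket module, not anything Slurm ships. The node names are hypothetical placeholders and the helper names port_open and lookup_roundtrip are made up for illustration.

```python
import socket

# Ports as mentioned in the thread: 6818 on each node, 6817 on the controller.
SLURMD_PORT = 6818
SLURMCTLD_PORT = 6817


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def lookup_roundtrip(name):
    """Resolve name to an IPv4 address, then resolve that address back.

    Returns (address, reverse_name); an element is None if its lookup fails.
    """
    try:
        address = socket.gethostbyname(name)
    except OSError:
        return (None, None)
    try:
        reverse_name = socket.gethostbyaddr(address)[0]
    except OSError:
        reverse_name = None
    return (address, reverse_name)


if __name__ == "__main__":
    # Hypothetical node names; replace with the NodeName entries from slurm.conf.
    nodes = ["node01", "node02"]
    for node in nodes:
        print(node, lookup_roundtrip(node), port_open(node, SLURMD_PORT))
```

Run it on the controller against the node names, then run the same port_open check on each node against the controller with SLURMCTLD_PORT. A False on a port while ping works would point toward a firewall, as suggested above.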
