I'm trying to set up slurmdbd in slurm-2.2.4 in order to play around with multi-cluster support, and I'm having some problems. My cluster name is 'test', and slurmdbd keeps complaining:
slurmdbd: error: It looks like the storage has gone away trying to reconnect
slurmdbd: error: We should have gotten a new id: Table 'slurm_acct_db.test1_job_table' doesn't exist

slurmctld shows:

error: slurmdbd: DBD_ID_RC is -1

sacct and sreport both seem to connect just fine, but there are no job records being saved in the database.

Relevant configs:

# slurmdbd.conf
ArchiveEvents=yes
ArchiveJobs=yes
ArchiveSteps=no
ArchiveSuspend=no
#ArchiveScript=/usr/sbin/slurm.dbd.archive
AuthInfo=/var/run/munge/munge.socket.2
AuthType=auth/none
DbdHost=ufm1
DebugLevel=4
PurgeEventAfter=1month
PurgeJobAfter=12month
PurgeStepAfter=1month
PurgeSuspendAfter=1month
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm
StorageHost=ufm1
StoragePass=slurm
StorageType=accounting_storage/mysql
StorageUser=slurm

# scontrol show config
Configuration data as of 2011-04-25T13:34:15
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = none
AccountingStorageHost = ufm1
AccountingStorageLoc = N/A
AccountingStoragePort = 6819
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AuthType = auth/none
BackupAddr = (null)
BackupController = (null)
BatchStartTimeout = 120 sec
BOOT_TIME = 2011-04-25T13:29:09
CacheGroups = 0
CheckpointType = checkpoint/none
ClusterName = test1
CompleteWait = 0 sec
ControlAddr = 10.24.250.13
ControlMachine = ufm1
CryptoType = crypto/openssl
DebugFlags = (null)
DefMemPerCPU = UNLIMITED
DisableRootJobs = NO
EnforcePartLimits = NO
Epilog = /usr/local/slurm/prolog
EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
FastSchedule = 1
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = (null)
GroupUpdateForce = 0
GroupUpdateTime = 600 sec
HashVal = Match
HealthCheckInterval = 300 sec
HealthCheckProgram = /usr/local/slurm/node-health
InactiveLimit = 0 sec
JobAcctGatherFrequency = 30 sec
JobAcctGatherType = jobacct_gather/linux
JobCheckpointDir = /var/slurm/checkpoint
JobCompHost = localhost
JobCompLoc = /var/log/slurm_jobcomp.log
JobCompPort = 0
JobCompType = jobcomp/none
JobCompUser = root
JobCredentialPrivateKey = /usr/local/slurm/etc/key/slurm.key
JobCredentialPublicCertificate = /usr/local/slurm/etc/key/slurm.cert
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = (null)
KillOnBadExit = 0
KillWait = 30 sec
Licenses = (null)
MailProg = /bin/mail
MaxJobCount = 65533
MaxMemPerCPU = UNLIMITED
MaxTasksPerNode = 128
MessageTimeout = 100 sec
MinJobAge = 30 sec
MpiDefault = none
MpiParams = (null)
MULTIPLE_SLURMD = 1
NEXT_JOB_ID = 177090
OverTimeLimit = 0 min
PluginDir = /usr/local/slurm-2.2.4/lib/slurm
PlugStackConfig = /usr/local/slurm-2.2.4/etc/plugstack.conf
PreemptMode = OFF
PreemptType = preempt/none
PriorityType = priority/basic
PrivateData = none
ProctrackType = proctrack/linuxproc
Prolog = /usr/local/slurm/epilog
PrologSlurmctld = (null)
PropagatePrioProcess = 0
PropagateResourceLimits = ALL
PropagateResourceLimitsExcept = (null)
ResumeProgram = (null)
ResumeRate = 300 nodes/min
ResumeTimeout = 60 sec
ResvOverRun = 0 min
ReturnToService = 1
SallocDefaultCommand = (null)
SchedulerParameters = HostFormat=0;JobAggregationTime=10
SchedulerPort = 7321
SchedulerRootFilter = 1
SchedulerTimeSlice = 30 sec
SchedulerType = sched/wiki
SelectType = select/cons_res
SelectTypeParameters = CR_CORE
SlurmUser = slurm(251)
SlurmctldDebug = 3
SlurmctldLogFile = (null)
SlurmSchedLogFile = (null)
SlurmctldPort = 6817
SlurmctldTimeout = 300 sec
SlurmdDebug = 3
SlurmdLogFile = (null)
SlurmdPidFile = /var/run/slurmd.pid
SlurmdSpoolDir = /tmp/slurmd
SlurmdTimeout = 600 sec
SlurmdUser = root(0)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/run/slurmctld.pid
SLURM_CONF = /usr/local/slurm-2.2.4/etc/slurm.conf
SLURM_VERSION = 2.2.4
SrunEpilog = /usr/local/slurm/srunepilog
SrunProlog = /usr/local/slurm/srunprolog
StateSaveLocation = /var/lib/slurm/state
SuspendExcNodes = (null)
SuspendExcParts = (null)
SuspendProgram = (null)
SuspendRate = 60 nodes/min
SuspendTime = NONE
SuspendTimeout = 30 sec
SwitchType = switch/none
TaskEpilog = (null)
TaskPlugin = task/affinity
TaskPluginParam = (null type)
TaskProlog = (null)
TmpFS = /tmp
TopologyPlugin = topology/none
TrackWCKey = 0
TreeWidth = 50
UsePam = 0
UnkillableStepProgram = (null)
UnkillableStepTimeout = 60 sec
VSizeFactor = 0 percent
WaitTime = 0 sec

-JE

On Thu, 2011-04-14 at 13:47 -0700, Auble, Danny wrote:
> Thanks for the info. All the multi-cluster stuff is client based for the
> moment, so it shouldn't matter what sched or priority plugin you use. Let us
> know how it goes, and if you are able to come back to SLURM's native
> scheduling. There are very sophisticated preemption models inherent in
> SLURM, and with the use of QOS and association limits you might be able to
> satisfy your requirements without much effort.
>
> Danny
>
> > -----Original Message-----
> > From: [email protected] [mailto:owner-slurm-
> > [email protected]] On Behalf Of Josh England
> > Sent: Thursday, April 14, 2011 1:39 PM
> > To: [email protected]
> > Subject: Re: [slurm-dev] two clusters / one scheduler
> >
> > We have sched/wiki tied in to a custom allocator that does all the
> > actual scheduling. At the time it was designed, the driving factors
> > were priority-based preemption, user-specified priorities, and job
> > affinity. We use select/cons_res and typically run 1 job per core.
> > There is a preference for certain jobs to be located on the same node as
> > other jobs but still have their own allocation. I'd be interested in
> > exploring alternatives that bring the scheduling back into slurm's
> > arena, but for now we are using sched/wiki.
> >
> > -JE
> >
> > On Thu, 2011-04-14 at 11:01 -0700, Danny Auble wrote:
> > > I don't know what you are using the sched/wiki for, but the perlapi
> > > should work just fine. You might consider using the
> > > priority/multifactor plugin for your priority calculation along with
> > > the sched/backfill if you are using something else today.
> > >
> > > In any case the multi cluster stuff should work fine for most cases. I
> > > am interested what you are using sched/wiki for though.
> > >
> > > Danny
> > >
> > > > I hadn't read into the multi-cluster functionality yet. That might be
> > > > just the way to go, but we're making heavy use of the sched/wiki
> > > > interface and perlapi bindings. Is the multi-cluster functionality
> > > > exposed to those layers?
> > > >
> > > > -JE
> > > >
> > > > On Thu, 2011-04-14 at 09:56 -0700, Auble, Danny wrote:
> > > > > I am guessing you have each one of these clusters in a separate
> > > > > partition.
> > > > >
> > > > > How big are these clusters? You can turn off the communication by
> > > > > just setting the TreeWidth to the number of nodes in your system.
> > > > >
> > > > > Is there any reason you don't want to/can't use the multi cluster
> > > > > functionality, and operate in traditional SLURM fashion with 1
> > > > > slurmctld per cluster?
> > > > >
> > > > > Danny
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: [email protected] [mailto:owner-slurm-
> > > > > > [email protected]] On Behalf Of Josh England
> > > > > > Sent: Thursday, April 14, 2011 9:50 AM
> > > > > > To: [email protected]
> > > > > > Subject: [slurm-dev] two clusters / one scheduler
> > > > > >
> > > > > > I'd like to have a single slurm instance schedule jobs onto two
> > > > > > physically disjoint clusters. The compute nodes of one cluster cannot
> > > > > > reach the compute nodes of the other cluster, but they can all see the
> > > > > > scheduler nodes. With slurm's hierarchical communication, when some
> > > > > > nodes can't reach others, slurm thinks the nodes are not responding and
> > > > > > would eventually mark them offline. Is there any way to logically group
> > > > > > nodes into separate communication groups to avoid this problem?
> > > > > >
> > > > > > -JE
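On Danny's TreeWidth suggestion in the quoted thread: the scontrol output above shows TreeWidth = 50, so on anything larger than 50 nodes slurmd-to-slurmd message forwarding kicks in, which is what breaks when the two clusters' compute nodes can't reach each other. Setting TreeWidth to at least the total node count makes slurmctld contact every slurmd directly. A sketch, assuming a hypothetical 200 nodes across both clusters:

```
# slurm.conf fragment (hypothetical 200-node total)
# With TreeWidth >= number of nodes, slurmctld fans out directly to
# every slurmd instead of relaying through other compute nodes.
TreeWidth=200
```

The trade-off is more open connections from the controller at once, but it sidesteps the hierarchical-communication problem described in the original question.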
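A note on the "test1_job_table doesn't exist" error at the top of the thread: in the 2.2-era accounting schema, slurmdbd only creates the per-cluster tables (<cluster>_job_table and friends) once the cluster has been registered in the accounting database with sacctmgr. A hedged sketch of how one might check and fix this, reusing the database name (slurm_acct_db), storage user (slurm), and cluster name (test1) from the configs above:

```shell
# 1. See whether slurmdbd has ever created the per-cluster tables:
mysql -u slurm -p slurm_acct_db -e "SHOW TABLES LIKE 'test1%'"

# 2. If test1_job_table is missing, register the cluster so slurmdbd
#    creates its tables:
sacctmgr add cluster test1

# 3. Verify the cluster now appears, then restart slurmctld so it
#    re-registers with slurmdbd:
sacctmgr list cluster
```

If ClusterName was recently changed (e.g. from 'test' to 'test1'), the old cluster's tables would exist but the new name's would not, which would produce exactly this symptom.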
