Dear all


We are running SLURM version 2.3 and OpenMPI 1.6.0.

So that OpenMPI jobs inherit the correct task affinity from SLURM, we launch 
them with 'srun --resv-ports ./the_job' under sbatch (or salloc).



Pure MPI jobs with --cpus-per-task=1 run fine.

The problem arises with hybrid MPI/OpenMP jobs: with --cpus-per-task > 1, the 
job fails when launched with 'srun --resv-ports'.

Many error messages are printed, along the lines of:

' ORTE_ERROR_LOG: Not found in file ess_slurmd_module.c at line 504'
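
For reference, the failing case has roughly the shape sketched below. The task and thread counts are illustrative placeholders, not taken from a real job, and the srun guard is only there so the script can be dry-run on a machine without SLURM.

```shell
#!/bin/bash
#SBATCH --ntasks=4           # placeholder: number of MPI ranks
#SBATCH --cpus-per-task=4    # placeholder: any value > 1 shows the problem

# One OpenMP thread per allocated CPU. SLURM_CPUS_PER_TASK is only set when
# --cpus-per-task was requested, so fall back to 1 otherwise.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Launch only where srun is available (i.e. on the cluster).
if command -v srun >/dev/null 2>&1; then
    srun --resv-ports ./the_job
fi
```

With --cpus-per-task=1 in the header, the same script corresponds to the pure MPI case that works.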



I am not the administrator of the cluster, only a user, but I was hoping we 
might be able to point the administrators in a useful direction to solve the 
issue.

Is this a known issue?  For example, is it due to an incompatibility between 
this SLURM version and the OpenMPI we have installed?  Would updating SLURM 
and/or OpenMPI solve it?  Or could it be a configuration issue that is easily 
fixed?  (See the config file below.)



As a side issue, possibly related, we find that:

- We can run multiple threads per task if we launch with mpirun (e.g. mpirun 
-bind-to-socket -bysocket), but mpirun knows nothing about which cores SLURM 
has allocated to the job, so this only works when nodes are allocated 
exclusively.  On shared nodes it will often crash.

- We don't use mpirun for pure MPI jobs because the tasks do not get the 
correct task affinity/binding (in this case they get no binding at all).  
Since our nodes are shared, we use 'srun' instead.

- With srun we must use '--resv-ports'.  Without it, we get the error 
message:

  orte_grpcomm_modex failed
  --> Returned "A message is attempting to be sent to a process whose contact 
information is unknown" (-117) instead of "Success" (0)
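
The binding behaviour described in the points above can be checked with a small script along these lines. This is only a sketch: the file name show_affinity.sh and the function name are hypothetical, and it assumes Linux compute nodes where /proc/self/status is available.

```shell
#!/bin/bash
# show_affinity.sh (hypothetical name): report the CPUs this process may use.

show_affinity() {
    # Cpus_allowed_list is the kernel's view of the affinity mask (Linux only).
    # awk is a child process and inherits the mask of the calling task.
    local cpus
    cpus=$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status 2>/dev/null)
    # SLURM_PROCID is set by srun; OMPI_COMM_WORLD_RANK is set by mpirun.
    local rank="${SLURM_PROCID:-${OMPI_COMM_WORLD_RANK:-unknown}}"
    echo "host=$(uname -n) rank=${rank} cpus=${cpus}"
}

show_affinity
```

It would be launched in place of the real executable, e.g. 'srun --resv-ports ./show_affinity.sh' or 'mpirun -bind-to-socket -bysocket ./show_affinity.sh', and the cpus= field compared between the two launchers.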



Hopefully someone can advise how we can make this work for multi-threaded 
jobs.  Thanks in advance.

Andy


Andrew Turner
Culham Centre for Fusion Energy
Culham Science Centre
Abingdon
Oxfordshire
OX14 3DB

http://www.ccfe.ac.uk/

Our slurm.conf file

ClusterName=erik
ControlMachine=erik000
BackupController=erik001
SlurmUser=slurm
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
StateSaveLocation=/home/sysadmin/SlurmState
SlurmdSpoolDir=/tmp/slurmd
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
Proctracktype=proctrack/linuxproc
CacheGroups=0
ReturnToService=1
TaskPlugin=task/affinity
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/wiki
SchedulerPort=7321
SelectType=select/cons_res
FastSchedule=1
# LOGGING
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurmd.log
JobCompType=jobcomp/filetxt
JobCompLoc=/var/slurm/accounting
#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
#
AccountingStorageType=accounting_storage/filetxt
#
# MPI
MpiParams=ports=12000-12999
#
# COMPUTE NODES
NodeName=erik000 Procs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 State=UNKNOWN
NodeName=DEFAULT Procs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=129009 State=UNKNOWN
NodeName=erik[001-044]
PartitionName=erik Nodes=erik[001-044] Default=YES MaxTime=INFINITE State=UP
