Hmmm… I haven’t seen anyone using OMPI 1.6.0 in a very long time. Please note that the latest OMPI release is now 1.10.0, so your installation is rather far behind.
At the very least, I would start by updating OMPI to the 1.8.8 or 1.10.0 level. You will then find that the SLURM integration has improved quite a bit, and you no longer need to use the --resv-ports option. OMPI will run with the standard PMI library. You will also find that mpirun will respect the SLURM-assigned task affinity.

You may also want to update SLURM, but I leave that to others to advise - the OMPI change by itself should resolve the problem.

> On Aug 27, 2015, at 2:26 AM, Turner, Andrew <[email protected]> wrote:
>
> Dear all
>
> We are running SLURM version 2.3 and openmpi 1.6.0.
> In order to run openmpi jobs and inherit the correct task affinity from
> SLURM, jobs are executed with 'srun --resv-ports ./the_job' under sbatch (or
> salloc).
>
> Pure mpi tasks with --cpus-per-task=1 run fine.
> The issue is when attempting a hybrid mpi-omp task, with --cpus-per-task > 1,
> the job fails when using 'srun --resv-ports'.
> Many error messages are printed, along the lines of
> ' ORTE_ERROR_LOG: Not found in file ess_slurmd_module.c at line 504'
>
> I am not the administrator of the cluster, only a user, but I was hoping we
> might be able to point the administrators in a useful direction to solve the
> issue.
> Is this a known issue? E.g. due to some incompatibility between this SLURM
> version and the OpenMPI we have installed? Would updating SLURM and/or
> OpenMPI solve this issue? Or could it be a configuration issue that is
> easily fixed? (see config file below)
>
> As a side issue, maybe related, we find that
> - We can run multiple threads per task if we execute using mpirun (e.g.
> mpirun -bind-to-socket -bysocket), but mpirun does not know anything about
> what cores it has been allocated, so it only works with exclusive node
> option. On shared nodes it will often crash.
> - We don’t use mpirun for pure MPI jobs since we find tasks do not have the
> correct task affinity/binding (in this case, no binding). Hence we use
> ‘srun’ since nodes are shared.
> - With srun we must use ‘--resv-ports’. Without resv-ports results in the
> error message:
> orte_grpcomm_modex failed
> --> Returned "A message is attempting to be sent to a process whose contact
> information is unknown" (-117) instead of "Success" (0)
>
> Hopefully someone can advise how we can make it work for multiple threaded
> jobs? Thanks in advance.
>
> Andy
>
>
> Andrew Turner
> Culham Centre for Fusion Energy
> Culham Science Centre
> Abingdon
> Oxfordshire
> OX14 3DB
>
> www.ccfe.ac.uk <http://www.ccfe.ac.uk/>
>
> Our slurm.conf file
>
> ClusterName=erik
> ControlMachine=erik000
> BackupController=erik001
> SlurmUser=slurm
> SlurmctldPort=6817
> SlurmdPort=6818
> AuthType=auth/munge
> StateSaveLocation=/home/sysadmin/SlurmState
> SlurmdSpoolDir=/tmp/slurmd
> SwitchType=switch/none
> MpiDefault=none
> SlurmctldPidFile=/var/run/slurmctld.pid
> SlurmdPidFile=/var/run/slurmd.pid
> Proctracktype=proctrack/linuxproc
> CacheGroups=0
> ReturnToService=1
> TaskPlugin=task/affinity
> # TIMERS
> SlurmctldTimeout=300
> SlurmdTimeout=300
> InactiveLimit=0
> MinJobAge=300
> KillWait=30
> Waittime=0
> #
> # SCHEDULING
> SchedulerType=sched/wiki
> SchedulerPort=7321
> SelectType=select/cons_res
> FastSchedule=1
> # LOGGING
> SlurmctldDebug=3
> SlurmctldLogFile=/var/log/slurmctld.log
> SlurmdDebug=3
> SlurmdLogFile=/var/log/slurmd.log
> JobCompType=jobcomp/filetxt
> JobCompLoc=/var/slurm/accounting
> #
> # ACCOUNTING
> JobAcctGatherType=jobacct_gather/linux
> JobAcctGatherFrequency=30
> #
> AccountingStorageType=accounting_storage/filetxt
> #
> # MPI
> MpiParams=ports=12000-12999
> #
> # COMPUTE NODES
> NodeName=erik000 Procs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
> State=UNKNOWN
> NodeName=DEFAULT Procs=16 Sockets=2 CoresPerSocket=8 ThreadsPerCore=1
> RealMemory=129009 State=UNKNOWN
> NodeName=erik[001-044]
> PartitionName=erik Nodes=erik[001-044] Default=YES MaxTime=INFINITE State=UP
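
For the hybrid MPI+OpenMP case, here is a rough sketch of what the batch script might look like once OMPI is at 1.8.x or later. It assumes OMPI was built against SLURM's PMI library, so srun can launch the ranks directly without --resv-ports. The task and thread counts are illustrative only, the srun guard is there just so the script can be dry-run outside the cluster, and ./the_job is the executable name from the original message. I have not tried this on your system, so treat it as a starting point.

```shell
#!/bin/bash
#SBATCH --ntasks=4           # number of MPI ranks (illustrative value)
#SBATCH --cpus-per-task=4    # CPUs, i.e. OpenMP threads, per rank (illustrative value)

# Run one OpenMP thread per CPU that SLURM assigned to each task.
# Falls back to 1 if the variable is unset (e.g. outside a SLURM job).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# Direct launch through SLURM; note there is no --resv-ports here.
if command -v srun >/dev/null 2>&1; then
    srun ./the_job
else
    echo "srun not found; inside a SLURM job this would run: srun ./the_job"
fi
```

Submit it with sbatch as usual. If srun still reports modex errors after the upgrade, the first thing to check would be whether the new OMPI build actually picked up PMI support.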
