On 02/28/2014 09:13 AM, L. Shawn Matott wrote:

Danny,

That's good to know. Which of the steps causes the loss of functionality (rankfile, ssh as plm, or mpirun instead of srun)?
Yes :).  Primarily mpirun instead of srun though.

--- Shawn

-----Original Message----- From: Danny Auble
Sent: Friday, February 28, 2014 12:09 PM
To: slurm-dev
Subject: [slurm-dev] Re: openmpi misbehaves when started under slurm


Just a notice to those attempting to run this way: Slurm will not be
able to monitor the step, keep accounting, or enforce memory limits
for jobs launched like this.

On 02/28/2014 09:01 AM, L. Shawn Matott wrote:

On our cluster we use SLURM v2.6.3 with cpusets enabled. We sometimes see problems with openmpi and incorrect cpu pinning. As a workaround, we use the following bit of bash code to manually assemble an openmpi rankfile, switch from slurm to ssh as the process launch module (plm), and finally launch using mpirun instead of srun. Hope this is helpful to someone.

----
L. Shawn Matott, PhD
Computational Scientist
University at Buffalo,
Center for Computational Research
701 Ellicott Street, Buffalo, New York 14203

# ================================================================================================
# create rank file to explicitly bind cores
echo "creating hostfile and rankfile"
uid=`id -u`
jid=$SLURM_JOB_ID
nodes=`nodeset -e $SLURM_NODELIST`

# trigger creation of cpuset information and save to working dir
srun bash -c "cat /cgroup/cpuset/slurm/uid_${uid}/job_${jid}/cpuset.cpus > cpus.\`hostname\`.$SLURM_JOB_ID"

RANKFILE=rankfile.$$
NODEFILE=nodefile.$$

rm -f $RANKFILE
rm -f $NODEFILE
rank=0
for i in ${nodes}; do
 # read the cpus assigned on this node (e.g. "0-3,8") and expand the
 # ranges into a space-separated list
 cpus=$(cat "cpus.${i}.${SLURM_JOB_ID}")
 cpus=$(nodeset -Re "$cpus")
 # add one rankfile entry and one nodefile entry per assigned cpu
 for j in ${cpus}; do
   echo "rank ${rank}=$i slot=$j" >> $RANKFILE
   echo "$i" >> $NODEFILE
   rank=$((rank + 1))
   # stop as soon as every requested rank has a slot
   if [ "$rank" = "$SLURM_NPROCS" ]; then
     break 2
   fi
 done
done

# use ssh instead of slurm as the launcher
# the rankfile that was just created will ensure cpusets are still honored.
export OMPI_MCA_plm=rsh

# launch application using mpirun
# (OMPI, EXE and ARGS are assumed to be set earlier in the job script)
echo "Launching application using mpirun"
mpirun \
 --hostfile $NODEFILE \
 --rankfile $RANKFILE  \
 --prefix $OMPI \
 --n $SLURM_NPROCS \
 --display-map  \
 --verbose $EXE $ARGS
# ================================================================================================
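The script above relies on ClusterShell's nodeset -Re to expand a cpuset list such as 0-3,8 into individual cpu ids. Where nodeset is not installed, a pure-bash helper along the following lines might serve as a stand-in. This is only a minimal sketch and not part of the original script: the function name expand_cpus is hypothetical, and it assumes the input holds nothing but comma-separated ids and lo-hi ranges, which is the format of cpuset.cpus.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): expand a cpuset list
# such as "0-3,8,10-11" into a space-separated list of cpu ids.
expand_cpus() {
  local list=$1 part lo hi c
  local out=()
  local IFS=','            # split the list on commas
  for part in $list; do
    if [[ $part == *-* ]]; then
      lo=${part%-*}        # text before the dash
      hi=${part#*-}        # text after the dash
      for ((c = lo; c <= hi; c++)); do
        out+=("$c")
      done
    else
      out+=("$part")       # a single cpu id
    fi
  done
  IFS=' '                  # join the result with spaces
  echo "${out[*]}"
}

expand_cpus "0-3,8,10-11"   # prints: 0 1 2 3 8 10 11
```

In the loop above, the line that calls nodeset -Re could then read cpus=$(expand_cpus "$cpus"). The node list itself can likewise be expanded without ClusterShell by using scontrol show hostnames $SLURM_NODELIST.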
