SLURM's PMI library gets the task rank from the environment variable
SLURM_PROCID. I believe that you are not using that library. You can at
least confirm that the environment variable is set correctly by running
something like this:

$ srun -n4 -l printenv SLURM_PROCID | sort
0: 0
1: 1
2: 2
3: 3

If you get output like the above, then the problem is definitely that
you are not using SLURM's PMI library.

Quoting Sarah Mulholland <[email protected]>:

> I should add that I have MpiDefault=none in my slurm configuration
> file as suggested by the slurm configuration tool.
>
> From: Sarah Mulholland
> Sent: Thursday, June 28, 2012 3:59 PM
> To: '[email protected]'
> Subject: slurm and mpich2
>
> I have installed slurm-2.3.5 and mpich2-1.4.1p1. We are using the
> hydra process manager for mpich. As suggested on the ANL web site,
> I installed configured mpich2 with
> -with-hydra-bss=ssh,rsh,fork,slurm. Yet when I launch a process
> with srun all tasks are rank 0.
>
> I tried building mpich2 with slurm's native PMI library by
> configuring --with-pmi=slurm -with-pm=no
> -with-slurm=[/our/path/here], but autoconf didn't find slurm in the
> given location.
>
> Has anybody else experienced this? Any suggestions?
>
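
If you would rather not compare the srun output by eye (for example with many tasks), the SLURM_PROCID check suggested earlier can be scripted. The following is only a rough sketch: check_ranks is a made-up helper name, not part of SLURM, and it assumes the "label: value" line format that srun -l printed in the example.

```shell
# check_ranks (hypothetical helper, not a SLURM command):
# reads "label: value" lines, as printed by
#   srun -n4 -l printenv SLURM_PROCID
# and checks that each task label equals the SLURM_PROCID it reported.
check_ranks() {
    awk -F': *' '
        # Compare numerically so padded labels such as " 0" or "00" still match.
        ($1 + 0) != ($2 + 0) {
            bad++
            printf "task %s reports SLURM_PROCID=%s\n", $1, $2
        }
        END {
            if (NR == 0) { print "no input"; exit 2 }
            if (bad) exit 1
            print "all " NR " tasks report matching ranks"
        }
    '
}
```

It would be used as: srun -n4 -l printenv SLURM_PROCID | check_ranks. A zero exit status only tells you that the environment variable is set correctly for every task; it says nothing about which PMI library your MPI program was actually built against.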
