Lothar, it seems you did not configure Open MPI with --with-pmi=<path to SLURM's PMI>.
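[Editor's sketch: the build- and run-time steps discussed in this thread, collected as commands. The paths are placeholders, not actual locations on any particular system, and the `...` stands for whatever other options your build needs.]

```
# Build Open MPI against SLURM's PMI library (placeholder path):
./configure --with-pmi=/usr/lib/slurm/pmi ...

# Or, if SLURM was built against an external PMIx, build Open MPI
# against that same PMIx installation (again a placeholder path):
./configure --with-pmix=/opt/pmix ...

# At run time, list the MPI launch plugins SLURM offers, then pick one:
srun --mpi=list
srun --mpi=pmix_v2 -n 2 ./MPI-hellow
```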
If SLURM was built with PMIx support, then another option is to use that. First, "srun --mpi=list" will show you the list of available MPI modules, and then you could

    srun --mpi=pmix_v2 ... MPI-hellow

If you believe that should be the default, you should contact your sysadmin, who can make that change for you. If you want to use PMIx, then I recommend you configure Open MPI with the same external PMIx that was used to build SLURM (e.g. configure --with-pmix=<path to PMIx>). Although PMIx has cross-version support, using the same PMIx avoids running incompatible PMIx versions.

Cheers,

Gilles

On Fri, Nov 23, 2018 at 5:20 PM Lothar Brendel <lothar.bren...@uni-due.de> wrote:
>
> Hi guys,
>
> I've always been somewhat at a loss regarding slurm's idea about tasks
> vs. jobs. That didn't cause any problems, though, until passing to
> Open MPI 2 (2.0.2 that is, with slurm 16.05.9).
>
> Running http://mpitutorial.com/tutorials/mpi-hello-world as an example
> with just
>
>     srun -n 2 MPI-hellow
>
> yields
>
>     Hello world from processor node31, rank 0 out of 1 processors
>     Hello world from processor node31, rank 0 out of 1 processors
>
> i.e. the two tasks don't see each other MPI-wise. Well, srun doesn't
> include an mpirun.
>
> But running
>
>     srun -n 2 mpirun MPI-hellow
>
> produces
>
>     Hello world from processor node31, rank 1 out of 2 processors
>     Hello world from processor node31, rank 0 out of 2 processors
>     Hello world from processor node31, rank 1 out of 2 processors
>     Hello world from processor node31, rank 0 out of 2 processors
>
> i.e. I get *two* independent MPI runs with 2 processes each. (The same
> applies if I state "mpirun -np 2" explicitly.)
> I never could make sense of this squaring; instead I used to run my
> jobs like
>
>     srun -c 2 mpirun -np 2 MPI-hellow
>
> which provided the desired job with *one* task using 2 processors.
> Passing from Open MPI 1.6.5 to 2.0.2 (Debian Jessie -> Stretch),
> though, I'm now getting the error
>
>     There are not enough slots available in the system to satisfy the
>     2 slots that were requested by the application:
>     MPI-hellow
>
> The environment on the node contains
>
>     SLURM_CPUS_ON_NODE=2
>     SLURM_CPUS_PER_TASK=2
>     SLURM_JOB_CPUS_PER_NODE=2
>     SLURM_NTASKS=1
>     SLURM_TASKS_PER_NODE=1
>
> which looks fine to me, but mpirun infers slots=1 from it (confirmed
> with ras_base_verbose 5). Indeed, looking into
> orte/mca/ras/slurm/ras_slurm_module.c, I find that while
> orte_ras_slurm_allocate() reads the value of SLURM_CPUS_PER_TASK into
> its local variable cpus_per_task, it doesn't use it anywhere. Rather,
> the number of slots is determined from SLURM_TASKS_PER_NODE.
>
> Is this intended behaviour?
>
> What's wrong here? I know that I can use --oversubscribe, but that
> seems rather a workaround.
>
> Thanks in advance,
> Lothar
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
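[Editor's sketch: the behaviour Lothar reports can be illustrated with a small stand-alone model. This is not Open MPI source code; it only mimics the logic he describes in ras_slurm_module.c, where the slot count is derived entirely from SLURM_TASKS_PER_NODE and SLURM_CPUS_PER_TASK never enters the computation. SLURM writes SLURM_TASKS_PER_NODE in a compressed format such as "2(x3),1", meaning 2 tasks on each of 3 nodes plus 1 task on a fourth node.]

```shell
#!/bin/sh
# Model of the slot derivation Lothar describes: total slots are the
# sum of the task counts in SLURM_TASKS_PER_NODE, expanding the
# "N(xR)" repetition syntax. SLURM_CPUS_PER_TASK is ignored, just as
# he observed in orte_ras_slurm_allocate().
slots_from_tasks_per_node() {
    total=0
    for entry in $(printf '%s' "$1" | tr ',' ' '); do
        case "$entry" in
            *\(x*\))
                # "N(xR)": N tasks on each of R nodes
                tasks=${entry%%\(*}
                reps=${entry##*\(x}
                reps=${reps%\)}
                ;;
            *)
                # plain "N": N tasks on one node
                tasks=$entry
                reps=1
                ;;
        esac
        total=$((total + tasks * reps))
    done
    echo "$total"
}

# With Lothar's environment (SLURM_TASKS_PER_NODE=1) mpirun sees a
# single slot, so "mpirun -np 2" fails with "not enough slots" even
# though SLURM_CPUS_PER_TASK=2.
slots_from_tasks_per_node "1"        # -> 1
slots_from_tasks_per_node "2(x3),1"  # -> 7
```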