Lothar,

it seems you did not configure Open MPI with --with-pmi=<path to SLURM's PMI>

If SLURM was built with PMIx support, then another option is to use that.
First, srun --mpi=list will show you the list of available MPI
modules, and then you can run
srun --mpi=pmix_v2 ... MPI-hellow
If you believe that should be the default, contact your sysadmin,
who can make that change for you.
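For reference, the workflow could look like the sketch below. The plugin name (pmix_v2) and the binary name are taken from this thread; check the actual output of --mpi=list on your own system before copying anything:

```shell
# Ask SLURM which MPI/PMI launch plugins it was built with
srun --mpi=list

# If a pmix entry shows up, launch the job with it explicitly
# (MPI-hellow is the example binary from the original post)
srun --mpi=pmix_v2 -n 2 ./MPI-hellow
```

With a matching plugin, each of the two tasks should report "rank X out of 2 processors" instead of two singletons.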

If you want to use PMIx, then I recommend you configure Open MPI with
the same external PMIx that was used to
build SLURM (e.g. configure --with-pmix=<path to PMIx>). Though PMIx
has cross-version support, using the same PMIx avoids running
incompatible PMIx versions.
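A minimal sketch of that build step follows; the install paths are placeholders, not actual locations on your system, and the flags shown are the standard Open MPI configure options mentioned above:

```shell
# Point Open MPI's configure at the same PMIx installation
# that SLURM was built against (path is a placeholder here)
./configure --with-pmix=/opt/pmix --with-slurm --prefix=/opt/openmpi
make -j4
make install
```

If instead you want srun to use SLURM's own PMI library, the analogous option is --with-pmi=<path to SLURM's PMI>, as noted at the top of this mail.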


Cheers,

Gilles
On Fri, Nov 23, 2018 at 5:20 PM Lothar Brendel
<lothar.bren...@uni-due.de> wrote:
>
> Hi guys,
>
> I've always been somewhat at a loss regarding slurm's idea about tasks vs. 
> jobs. That didn't cause any problems, though, until moving to Open MPI 2 
> (2.0.2 that is, with slurm 16.05.9).
>
> Running http://mpitutorial.com/tutorials/mpi-hello-world as an example with 
> just
>
>         srun -n 2 MPI-hellow
>
> yields
>
> Hello world from processor node31, rank 0 out of 1 processors
> Hello world from processor node31, rank 0 out of 1 processors
>
> i.e. the two tasks don't see each other MPI-wise. Well, srun doesn't include 
> an mpirun.
>
> But running
>
>         srun -n 2 mpirun MPI-hellow
>
> produces
>
> Hello world from processor node31, rank 1 out of 2 processors
> Hello world from processor node31, rank 0 out of 2 processors
> Hello world from processor node31, rank 1 out of 2 processors
> Hello world from processor node31, rank 0 out of 2 processors
>
> i.e. I get *two* independent MPI jobs with 2 processors each. (The same 
> applies if I state "mpirun -np 2" explicitly.)
> I never could make sense of this squaring, so I used to run my jobs like
>
>         srun -c 2 mpirun -np 2 MPI-hellow
>
> which provided the desired job with *one* task using 2 processors. Moving 
> from Open MPI 1.6.5 to 2.0.2 (Debian Jessie -> Stretch), though, I now get 
> the error
> "There are not enough slots available in the system to satisfy the 2 slots
> that were requested by the application:
>   MPI-hellow" now.
>
> The environment on the node contains
>
> SLURM_CPUS_ON_NODE=2
> SLURM_CPUS_PER_TASK=2
> SLURM_JOB_CPUS_PER_NODE=2
> SLURM_NTASKS=1
> SLURM_TASKS_PER_NODE=1
>
> which looks fine to me, but mpirun infers slots=1 from that (confirmed by 
> ras_base_verbose 5). Indeed, looking into 
> orte/mca/ras/slurm/ras_slurm_module.c, I find that while 
> orte_ras_slurm_allocate() reads the value of SLURM_CPUS_PER_TASK into its 
> local variable cpus_per_task, it doesn't use it anywhere. Rather, the number 
> of slots is determined from SLURM_TASKS_PER_NODE.
>
> Is this intended behaviour?
>
> What's wrong here? I know that I can use --oversubscribe, but that seems 
> rather a workaround.
>
> Thanks in advance,
>         Lothar
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users