You can try to disable Open MPI's SLURM support:

mpirun --mca ras ^slurm --mca plm ^slurm --mca ess ^slurm,slurmd ...

That requires that you are able to SSH between the compute nodes.
Keep in mind this is far from ideal, since it might leave stray MPI
processes on the nodes if you cancel a job, and mess up SLURM accounting too.
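
If you do not want to type those options on every mpirun command line,
the same settings can also be exported through the environment (Open MPI
picks up MCA parameters from OMPI_MCA_<name> variables), for example:

export OMPI_MCA_ras=^slurm
export OMPI_MCA_plm=^slurm
export OMPI_MCA_ess=^slurm,slurmd
mpirun ...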


Cheers,

Gilles

On Wed, May 16, 2018 at 3:50 PM, Nicolas Deladerriere
<nicolas.deladerri...@gmail.com> wrote:
> Hi all,
>
>
>
> I am trying to run an MPI application through the SLURM job scheduler. Here is
> my running sequence
>
>
> sbatch --> my_env_script.sh --> my_run_script.sh --> mpirun
>
>
> In order to minimize modification of my production environment, I had to set
> up the following hostlist management in the different scripts:
>
>
> my_env_script.sh
>
>
> Build the host list from the SLURM resource manager information.
>
> Example: node01 nslots=2 ; node02 nslots=2 ; node03 nslots=2
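>
> For illustration only, here is a simplified sketch of that step (variable names
> are illustrative, not my actual production script; it assumes a homogeneous
> allocation):
>
> # build "node01 nslots=2 ; node02 nslots=2 ; ..." from the SLURM allocation
> # SLURM_CPUS_ON_NODE is the slot count of the node running this script; with a
> # homogeneous allocation it matches every allocated node
> HOSTLIST=""
> for node in $(scontrol show hostnames "$SLURM_JOB_NODELIST"); do
>     HOSTLIST="${HOSTLIST:+$HOSTLIST ; }$node nslots=$SLURM_CPUS_ON_NODE"
> done
> export HOSTLIST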
>
>
> my_run_script.sh
>
>
> Build the host list according to the required job (the process mapping depends
> on the job requirements).
>
> Nodes are always fully dedicated to my job, but I have to manage different
> master-slave situations with the corresponding mpirun commands (a simplified
> sketch follows the two examples below):
>
> as many processes as the number of slots:
>
> mpirun -H node01 -np 1 process_master.x : -H node02,node02,node03,node03 -np
> 4 process_slave.x
>
> only one process per node (slots are usually used through OpenMP threading):
>
> mpirun -H node01 -np 1 other_process_master.x : -H node02,node03 -np 2
> other_process_slave.x
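>
> For illustration, a simplified sketch of how the second command is assembled
> (NODES is an illustrative variable holding the allocated nodes, master first;
> not my actual script):
>
> # NODES="node01 node02 node03"
> MASTER=$(echo $NODES | cut -d' ' -f1)                  # first node is the master
> SLAVES=$(echo $NODES | cut -d' ' -f2- | tr ' ' ',')    # remaining nodes, comma separated
> NSLAVES=$(echo $SLAVES | tr ',' '\n' | wc -l)          # one slave process per node
> mpirun -H $MASTER -np 1 other_process_master.x : \
>        -H $SLAVES -np $NSLAVES other_process_slave.x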
>
>
>
> However, I realized that whatever I specify through my mpirun command, the
> process mapping is overridden at run time by SLURM according to the SLURM
> settings (either the default settings or the sbatch command line). For example,
> if I run with:
>
>
> sbatch -N 3 --exclusive my_env_script.sh myjob
>
>
> where the final mpirun command (depending on myjob) is:
>
>
> mpirun -H node01 -np 1 other_process_master.x : -H node02,node03 -np 2
> other_process_slave.x
>
>
> It will be run with a process mapping corresponding to:
>
>
> mpirun -H node01 -np 1 other_process_master.x : -H node02,node02 -np 2
> other_process_slave.x
>
>
> So far I have not found a way to force mpirun to use the host mapping from the
> command line instead of the SLURM one. Is there a way to do it (either by using
> MCA parameters, the SLURM configuration, or …)?
>
>
> Open MPI version: 1.6.5
>
> SLURM version: 17.11.2
>
>
>
> Regards,
>
> Nicolas
>
>
> Note 1: I know it would be better to let SLURM manage my process mapping by
> only using SLURM parameters and not specifying the host mapping in my mpirun
> command, but in order to minimize modification of my production environment I
> had to use this solution.
>
> Note 2: I know I am using an old Open MPI version!
>
>
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
