When I launch it with mpirun (mpirun -n 2 --host compute-0-0,compute-0-1 myApplication) it works fine:

 

[root@cluster bin]# mpirun -n 3 --host compute-0-0,compute-0-1,compute-0-2 mpi
Hello World from MPI Process 0 on machine compute-0-0.local
Hello World from MPI Process 2 on machine compute-0-2.local
Hello World from MPI Process 1 on machine compute-0-1.local

but when I run the same application with Slurm, I get the same error:

[root@cluster bin]# srun scriptmpi
/usr/bin/mpi: error while loading shared libraries: libltdl.so.7: cannot
open shared object file: No such file or directory
srun: error: compute-0-0: task 0: Exited with exit code 127
srun: error: compute-0-0: task 0: Exited with exit code 127

 

It looks like the problem is with Slurm. Does Slurm allow launching MPI jobs by default, or do I have to configure something?
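One thing I could try, I suppose, is to check whether libltdl.so.7 is actually visible from a compute node when the task is launched through Slurm. Something like this (just a sketch; the /usr/bin/mpi path is the one from the error above, and picking compute-0-0 with --nodelist is only an example):

# list the shared-library dependencies of the binary on compute-0-0, launched through Slurm
srun --nodelist=compute-0-0 ldd /usr/bin/mpi

# check whether libltdl.so.7 is known to the dynamic loader on that node
srun --nodelist=compute-0-0 /sbin/ldconfig -p | grep libltdl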

From: Manuel Rodríguez Pascual [mailto:[email protected]]
Sent: Friday, 18 September 2015 5:23
To: slurm-dev
Subject: [slurm-dev] Re: Now I have this error with libltdl.so.7

 

Hi Fany,

 

It looks like a problem with your MPI configuration, not Slurm. As a first
test, are you able to run the application without Slurm? 

 

You can try

 

./yourApplication   

 

to run it with a single MPI task,  and

 

mpiexec -n 2 ./yourApplication   (or something similar; this is what I do
with MPICH)

 

to run it with two tasks. Do it on your master node and on the compute
nodes, so you can rule out configuration problems external to Slurm.
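For example, something along these lines, assuming password-less ssh to the nodes and that the binary is installed in the same path everywhere (the path below is just a placeholder):

# on the master node
./yourApplication
mpiexec -n 2 ./yourApplication

# directly on a compute node, bypassing Slurm entirely
ssh compute-0-0 /path/to/yourApplication
ssh compute-0-0 'ldd /path/to/yourApplication'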

2015-09-17 20:18 GMT+02:00 Fany Pagés Díaz <[email protected]>:

 

Hello,

 

I'm trying to run the job and I get this error.

 

[root@cluster bin]# srun scriptmpi
/usr/bin/mpi: error while loading shared libraries: libltdl.so.7: cannot
open shared object file: No such file or directory
srun: error: compute-0-0: task 0: Exited with exit code 127
srun: error: compute-0-0: task 0: Exited with exit code 127

And this is my script

 

#!/bin/bash
#SBATCH --job-name="mpi"
#SBATCH --partition="cluster"
#SBATCH --nodelist=compute-0-0,compute-0-1,compute-0-2
#SBATCH -n 3
#SBATCH --output=test-srun.out
#SBATCH --error=test-srun.err

source /etc/profile
module load openmpi-x86_64

srun mpi
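Since mpirun works for me outside of Slurm, I could also try launching with mpirun inside the batch script instead of srun. A rough variant (only a sketch; it assumes this Open MPI was built with Slurm support so mpirun picks up the allocation on its own, keeps the executable name "mpi" from above, and the output file names are placeholders):

#!/bin/bash
#SBATCH --job-name="mpi"
#SBATCH --partition="cluster"
#SBATCH -n 3
#SBATCH --output=test-mpirun.out
#SBATCH --error=test-mpirun.err

source /etc/profile
module load openmpi-x86_64

# mpirun reads the node list and task count from the Slurm allocation
mpirun mpi

This would be submitted with sbatch rather than srun, since the #SBATCH directives are only interpreted by sbatch.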

 

Thanks, 

Ing. Fany Pagés Díaz

-- 

Dr. Manuel Rodríguez-Pascual
skype: manuel.rodriguez.pascual
phone: (+34) 913466173 // (+34) 679925108
 
CIEMAT-Moncloa
Edificio 22, desp. 1.25
Avenida Complutense, 40 
28040- MADRID
SPAIN


