Hello!
I am using Open MPI v1.9a1r32142 with Slurm 2.5.6.

I cannot use mpirun after salloc:

$salloc -N2 --exclusive -p test -J ompi
$LD_PRELOAD=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/lib/libmxm.so mpirun -np 1 hello_c
-----------------------------------------------------------------------------------------------------
An ORTE daemon has unexpectedly failed after launch and before
communicating back to mpirun. This could be caused by a number
of factors, including an inability to create a connection back
to mpirun due to a lack of common network interfaces and/or no
route found between them. Please check network connectivity
(including firewalls and network routing requirements).
------------------------------------------------------------------------------------------------------
But if I use mpirun in an sbatch script, it works correctly:
$cat ompi_mxm3.0
#!/bin/sh
LD_PRELOAD=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/lib/libmxm.so  mpirun  -x LD_PRELOAD -x MXM_SHM_KCOPY_MODE=off --map-by slot:pe=8 "$@"

$sbatch -N2  --exclusive -p test -J ompi  ompi_mxm3.0 ./hello_c
Submitted batch job 645039
$cat slurm-645039.out 
[warn] Epoll ADD(1) on fd 0 failed.  Old events were 0; read change was 1 (add); write change was 0 (none): Operation not permitted
[warn] Epoll ADD(4) on fd 1 failed.  Old events were 0; read change was 0 (none); write change was 1 (add): Operation not permitted
Hello, world, I am 0 of 2, (Open MPI v1.9a1, package: Open MPI semenov@compiler-2 Distribution, ident: 1.9a1r32142, repo rev: r32142, Jul 04, 2014 (nightly snapshot tarball), 146)
Hello, world, I am 1 of 2, (Open MPI v1.9a1, package: Open MPI semenov@compiler-2 Distribution, ident: 1.9a1r32142, repo rev: r32142, Jul 04, 2014 (nightly snapshot tarball), 146)
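
In case it helps to compare the two environments, here is a minimal sketch that could be run both inside the salloc shell and from the sbatch script. It only prints a few variables; it does not fix anything. The helper name show_var and the NOT_SET placeholder are my own, not part of Slurm or Open MPI, and I am assuming the usual Slurm variables (SLURM_JOB_ID, SLURM_JOB_NODELIST, SLURM_NNODES) are the relevant ones.

```shell
#!/bin/sh
# show_var NAME: print "NAME=value", or "NAME=NOT_SET" if NAME is unset.
# (show_var is a hypothetical helper used only for this comparison.)
show_var() {
    # Indirect lookup of the variable whose name is in $1.
    eval "val=\${$1-NOT_SET}"
    printf '%s=%s\n' "$1" "$val"
}

# Variables assumed to matter here: the Slurm allocation and the preload.
for name in SLURM_JOB_ID SLURM_JOB_NODELIST SLURM_NNODES LD_PRELOAD; do
    show_var "$name"
done
```

If the output differs between the two cases (for example, LD_PRELOAD set in one and not the other), that would at least narrow down where to look.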

Regards,
Timur
