I have not tested this type of setup, so a disclaimer is in order: these are
not exactly the same release. The version numbers are close, but the two
builds could contain code differences that make them incompatible.
One other idea that comes to mind: are the two nodes on the same subnet?
If they are not, I believe there is a bug in which the TCP BTL will recuse
itself from communications between the two nodes.
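If that turns out to be the case, one thing you could try (untested, and the
interface names below are only guesses for your machines) is to pin both
sides to a known-good interface and turn up the BTL verbosity so you can see
which addresses each process is trying to use:

mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0 \
       --mca btl_base_verbose 30 --hostfile mpshosts -np 2 /mpi/sample

The Solaris side will likely name its interface differently (bge0, nge0,
etc.), so btl_tcp_if_include may need to be set per machine, e.g. in each
host's $prefix/etc/openmpi-mca-params.conf, rather than on the mpirun
command line.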
--td
Date: Mon, 28 Jul 2008 16:58:57 -0400
From: "Alexander Shabarshin" <ashabars...@developonbox.com>
Subject: [OMPI users] Communication between OpenMPI and ClusterTools
To: <us...@open-mpi.org>
Hello
I am trying to launch the same MPI sample code on Linux PC (Intel) servers
with Open MPI 1.2.5 and on SunFire X2100 (AMD Opteron) servers with
Solaris 10 and ClusterTools 7.1 (which looks like Open MPI 1.2.5), using TCP
over Ethernet. Linux PC to Linux PC works fine. SunFire to SunFire works
fine. But when I launch the same task across a Linux machine AND a SunFire,
I get this error message:
--------------------------------------------------------------------------
Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Process 0.1.0 is unable to reach 0.1.1 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 1 with PID 25782 on node 10.0.0.2 exited on
signal 15 (Terminated).
It was launched by this command:
mpirun --mca btl tcp,self --hostfile mpshosts -np 2 /mpi/sample
/mpi/sample exists on both platforms and is compiled separately for each
platform.
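(For reference, /mpi/sample is nothing special; a minimal two-rank test
along these lines is the kind of program being run. Sketch only, not the
exact source:)

/* minimal two-rank point-to-point test (sketch, not the exact /mpi/sample source) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, value = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);               /* the error above already occurs in MPI_Init */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0 && size > 1) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* goes over the TCP BTL */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank %d received %d from rank 0\n", rank, value);
    }

    MPI_Finalize();
    return 0;
}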
The Linux machines have a replicated path so that orted can be launched at
the Sun-style location:
/opt/SUNWhpc/HPC7.1/bin/orted
The servers can ping each other, and SSH works fine in both directions.
But Open MPI doesn't work across these servers... How can I make them
understand each other? Thank you!
Alexander Shabarshin
P.S. This is output of ompi_info diagnostic for ClusterTools 7.1:
Open MPI: 1.2.5r16572-ct7.1b003r3852
Open MPI SVN revision: 0
Open RTE: 1.2.5r16572-ct7.1b003r3852
Open RTE SVN revision: 0
OPAL: 1.2.5r16572-ct7.1b003r3852
OPAL SVN revision: 0
Prefix: /opt/SUNWhpc/HPC7.1
Configured architecture: i386-pc-solaris2.10
Configured by: root
Configured on: Tue Oct 30 17:37:07 EDT 2007
Configure host: burpen-csx10-0
Built by:
Built on: Tue Oct 30 17:52:10 EDT 2007
Built host: burpen-csx10-0
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: small
C compiler: cc
C compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/cc
C++ compiler: CC
C++ compiler absolute: /ws/ompi-tools/SUNWspro/SOS11/bin/CC
Fortran77 compiler: f77
Fortran77 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f77
Fortran90 compiler: f95
Fortran90 compiler abs: /ws/ompi-tools/SUNWspro/SOS11/bin/f95
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: yes
Thread support: no
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: yes
MCA backtrace: printstack (MCA v1.0, API v1.0, Component v1.2.5)
MCA paffinity: solaris (MCA v1.0, API v1.0, Component v1.2.5)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
MCA timer: solaris (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA btl: udapl (MCA v1.0, API v1.0, Component v1.2.5)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.5)
MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5)
MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: tm (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5)
MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: tm (MCA v1.0, API v1.3, Component v1.2.5)
MCA sds: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.5)
and output of ompi_info diagnostic for OpenMPI 1.2.5 compiled on Linux:
Open MPI: 1.2.5
Open MPI SVN revision: r16989
Open RTE: 1.2.5
Open RTE SVN revision: r16989
OPAL: 1.2.5
OPAL SVN revision: r16989
Prefix: /usr/local
Configured architecture: i686-pc-linux-gnu
Configured by: shaos
Configured on: Thu Jul 24 12:07:38 EDT 2008
Configure host: remote-linux
Built by: shaos
Built on: Thu Jul 24 12:23:40 EDT 2008
Built host: remote-linux
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: no
Fortran90 bindings size: na
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: g77
Fortran77 compiler abs: /usr/bin/g77
Fortran90 compiler: none
Fortran90 compiler abs: none
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: no
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
Heterogeneous support: yes
mpirun default --prefix: no
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.5)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.5)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.5)
MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.5)
MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.5)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.5)
MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA rds: resfile (MCA v1.0, API v1.3, Component v1.2.5)
MCA rmaps: round_robin (MCA v1.0, API v1.3, Component v1.2.5)
MCA rmgr: proxy (MCA v1.0, API v2.0, Component v1.2.5)
MCA rmgr: urm (MCA v1.0, API v2.0, Component v1.2.5)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.2.5)
MCA pls: gridengine (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: rsh (MCA v1.0, API v1.3, Component v1.2.5)
MCA pls: slurm (MCA v1.0, API v1.3, Component v1.2.5)
MCA sds: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: singleton (MCA v1.0, API v1.0, Component v1.2.5)
MCA sds: slurm (MCA v1.0, API v1.0, Component v1.2.5)