Re: [OMPI users] mpirun unsuccessful when run across multiple nodes

2011-04-18 Thread Reuti
On 18.04.2011 at 15:40, chenjie gu wrote:

> I am new to Open MPI. I have the following Open MPI setup, but it has a 
> problem when running across multiple nodes.
> I am trying to build a Beowulf cluster from 6 nodes of our server (HP 
> Proliant G460 G7). I have installed Open MPI on one node (say, under 
> /mirror):
> ./configure --prefix=/mirror/openmpi CC=icc CXX=icpc F77=ifort FC=ifort
> make all install 
> 
> Using NFS, the /mirror directory was successfully exported to the other 
> 5 nodes. When I test Open MPI, it runs very well on a single node, but 
> it hangs across multiple nodes. 
> 
> One possible reason I know of is that Open MPI uses TCP to exchange data 
> between nodes, so I am worried that there may be firewalls between the 
> nodes, possibly factory-integrated somewhere (switch/NIC). Could anyone 
> give me some information on this point?

It's not only about MPI communication. Before that, you need some means to 
start the local orte daemons on each machine: passphraseless ssh keys or, 
better, hostbased authentication 
http://arc.liv.ac.uk/SGE/howto/hostbased-ssh.html , or enable `rsh` on the 
machines and tell Open MPI to use it. Does:

mpiexec hostname

give you a list of the involved machines?
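
A minimal sketch of how the ssh prerequisite could be checked before trying mpiexec. The node names node01 and node02 and the function names check_node and check_all are placeholders, not taken from this thread; BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Hypothetical node names; replace with the hosts in your cluster.
NODES="${NODES:-node01 node02}"

# check_node HOST: succeed only if ssh can log in without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase.
check_node() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# check_all: report every node and return non-zero if any node failed.
check_all() {
    failed=0
    for host in $NODES; do
        if check_node "$host"; then
            echo "ok:   $host"
        else
            echo "FAIL: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Once check_all reports "ok" for every node, try the actual launch, e.g.:
#   mpiexec --host node01,node02 hostname
```

To use it, source the file on the node you launch from and call check_all. A node that prints FAIL would also keep the orte daemon from starting there, whatever the firewall situation is.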

-- Reuti


> Thanks a lot,
> Regards,
> ArchyGU
> Nanyang Technological University
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] mpirun unsuccessful when run across multiple nodes

2011-04-18 Thread chenjie gu
Dear all,
I am new to Open MPI. I have the following Open MPI setup, but it has a
problem when running across multiple nodes.
I am trying to build a Beowulf cluster from 6 nodes of our server (HP
Proliant G460 G7). I have installed Open MPI on one node (say, under
/mirror):
./configure --prefix=/mirror/openmpi CC=icc CXX=icpc F77=ifort FC=ifort
make all install

Using NFS, the /mirror directory was successfully exported to the other
5 nodes. When I test Open MPI, it runs very well on a single node, but
it hangs across multiple nodes.

One possible reason I know of is that Open MPI uses TCP to exchange data
between nodes, so I am worried that there may be firewalls between the
nodes, possibly factory-integrated somewhere (switch/NIC). Could anyone
give me some information on this point?

Thanks a lot,
Regards,
ArchyGU
Nanyang Technological University


Re: [OMPI users] Try to submit OMPI job to SGE gives ERRORS (orte_plm_base_select failed & orte_ess_set_name failed) (Reuti)

2011-04-18 Thread Reuti
On 17.04.2011 at 01:21, Derrick LIN wrote:

> 
> > Well, does `mpiexec` point to the correct one?
> 
> I don't really get this. I installed one and only one Open MPI on the 
> node. There shouldn't be another 'mpiexec' on the system.

It could be one from any other MPI implementation by accident.
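
One way to check for a stray mpiexec is to list every executable of that name on PATH, in lookup order. This is only a sketch; list_mpiexec is a made-up helper name.

```shell
#!/bin/sh
# list_mpiexec: print every executable file named mpiexec found on PATH,
# in lookup order. The first line printed is the one the shell would run.
list_mpiexec() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        candidate="$dir/mpiexec"
        [ -f "$candidate" ] && [ -x "$candidate" ] && echo "$candidate"
    done
    IFS=$old_ifs
}
```

If list_mpiexec prints more than one line, or the first line is not inside the Open MPI installation prefix, the job may be picking up a launcher from a different MPI.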


> It's worth mentioning that every node is deployed from a master image, so 
> everything is exactly the same except the IP address and DNS name.
> 
> > I thought you compiled it on your own with --with-sge. What about: 
> 
> pwbcad@sgeqexec01:~$ ompi_info | grep grid
>  MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.4.1)

Fine.


> Is there anywhere I can find a more meaningful Open MPI log?

Can you run a simple `mpiexec hostname` in the script?
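
A sketch of what such a test job script might look like. The parallel environment name orte, the slot count 4, the job name, and the helper describe_job are assumptions for illustration; use the PE that is configured on your cluster. NSLOTS, PE_HOSTFILE and JOB_ID are environment variables that SGE sets for a running job.

```shell
#!/bin/sh
#$ -N mpi_hostname_test
#$ -cwd
#$ -j y
#$ -pe orte 4

# describe_job: print what the job actually sees, so the output file shows
# whether SGE granted slots and which mpiexec is first on PATH.
describe_job() {
    echo "NSLOTS=${NSLOTS:-unset}"
    echo "PE_HOSTFILE=${PE_HOSTFILE:-unset}"
    echo "mpiexec=$(command -v mpiexec || echo 'not found')"
}

# Assumption of convenience: only launch when running under SGE (JOB_ID set),
# so that running this file by hand outside the queue does nothing.
if [ -n "${JOB_ID:-}" ]; then
    describe_job
    # With SGE support compiled in, Open MPI takes the host list from the
    # granted allocation, so no hostfile is passed here.
    mpiexec hostname
fi
```

Submit it with qsub and look at the job's output file: one hostname line per granted slot would indicate that the daemons start on the remote nodes.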


> I will try to install openmpi 1.4.3 and see if that works.
> 
> I want to confirm one more thing: does SGE's master host need to have OpenMPI 
> installed? Is it relevant?

In principle: no. But often it's installed too, as you will compile on either 
the master machine or a dedicated login server.

-- Reuti


> Many thanks Reuti
> 
> Derrick




Re: [OMPI users] Ofed v1.5.3?

2011-04-18 Thread Jeff Squyres
Yes.

On Apr 16, 2011, at 1:34 PM, Michael Di Domenico wrote:

> Does OpenMPI v1.5.3 support Ofed v.1.5.3.1 ?


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/