Re: [OMPI users] Multiple mpiexec's within a job (schedule within a scheduled machinefile/job allocation)

2009-07-29 Thread Ralph Castain
Oh my - that does take me back a long way! :-) Do you need these processes to be mapped by slot (i.e., do you care if the process ranks are sharing nodes)? If not, why not add "-bynode" to your cmd line? Alternatively, given the mapping you want, just do mpirun -npernode 1 application.exe
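
A hedged sketch of the difference between those mapping options (process counts and the application name are placeholders; the flags are the 1.3-era mpirun options):

  # Default mapping fills each node's slots in order before moving on (byslot)
  mpirun -np 4 -byslot ./application.exe

  # Round-robin the ranks across nodes instead
  mpirun -np 4 -bynode ./application.exe

  # Exactly one process per allocated node
  mpirun -npernode 1 ./application.exe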

Re: [OMPI users] Multiple mpiexec's within a job (schedule within a scheduled machinefile/job allocation)

2009-07-29 Thread Adams, Brian M
Hi Ralph (all), I'm resurrecting this 2006 thread for a status check. The new 1.3.x machinefile behavior is great (thanks!) -- I can use machinefiles to manage multiple simultaneous mpiruns within a single torque allocation (where the hosts are a subset of $PBS_NODEFILE). However, this
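
A minimal sketch of that pattern, assuming a Torque job whose $PBS_NODEFILE covers four nodes (the split sizes and application names are placeholders, not from the thread):

  # Inside the Torque job script: carve $PBS_NODEFILE into two disjoint machinefiles
  sort -u "$PBS_NODEFILE" > allnodes
  head -n 2 allnodes  > machinefile.a
  tail -n +3 allnodes > machinefile.b

  # Two independent mpiruns sharing the same allocation
  mpirun -machinefile machinefile.a -np 2 ./app_a &
  mpirun -machinefile machinefile.b -np 2 ./app_b &
  wait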

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread Ralph Castain
Ah, so there is a firewall involved? That is always a problem. I gather that node 126 has clear access to all other nodes, but nodes 122, 123, and 125 do not all have access to each other? See if your admin is willing to open at least one port on each node that can reach all other nodes.
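
If the admin will only open a fixed range, Open MPI can be constrained to it with MCA parameters; a sketch, assuming the btl_tcp/oob_tcp port-range parameters of the 1.3 series (verify the exact names on your install with "ompi_info --param btl tcp" and "ompi_info --param oob tcp"):

  # Pin Open MPI's TCP traffic to an agreed-upon port range
  mpirun -np 4 -H 10.1.2.126,10.1.2.122,10.1.2.123,10.1.2.125 \
         -mca btl_tcp_port_min_v4 46000 -mca btl_tcp_port_range_v4 100 \
         -mca oob_tcp_port_min_v4 46100 -mca oob_tcp_port_range_v4 100 \
         hello-mpi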

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread David Doria
On Wed, Jul 29, 2009 at 4:15 PM, Ralph Castain wrote: > Using direct can cause scaling issues as every process will open a socket to every other process in the job. You would at least have to ensure you have enough file descriptors available on every node. The most

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread Nifty Tom Mitchell
On Wed, Jul 29, 2009 at 01:42:39PM -0600, Ralph Castain wrote: > It sounds like perhaps IOF messages aren't getting relayed along the daemons. Note that the daemon on each node does have to be able to send TCP messages to all other nodes, not just mpirun. Couple of things you can do

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread Ralph Castain
Using direct can cause scaling issues as every process will open a socket to every other process in the job. You would at least have to ensure you have enough file descriptors available on every node. The most likely cause is either (a) a different OMPI version getting picked up on one of
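
Two quick checks along those lines (a sketch; the node addresses are taken from this thread):

  # (a) confirm every node picks up the same Open MPI installation
  for h in 10.1.2.122 10.1.2.123 10.1.2.125 10.1.2.126; do
      ssh $h 'which mpirun; ompi_info | grep "Open MPI:"'
  done

  # (b) confirm the per-process file descriptor limit on each node
  for h in 10.1.2.122 10.1.2.123 10.1.2.125 10.1.2.126; do
      ssh $h 'ulimit -n'
  done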

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread David Doria
On Wed, Jul 29, 2009 at 3:42 PM, Ralph Castain wrote: > It sounds like perhaps IOF messages aren't getting relayed along the daemons. Note that the daemon on each node does have to be able to send TCP messages to all other nodes, not just mpirun. Couple of things you

Re: [OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread Ralph Castain
It sounds like perhaps IOF messages aren't getting relayed along the daemons. Note that the daemon on each node does have to be able to send TCP messages to all other nodes, not just mpirun. Couple of things you can do to check: 1. -mca routed direct - this will send all messages direct
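
The first check would look roughly like this (a sketch using the hosts mentioned elsewhere in this thread):

  # Bypass the daemon routing tree so every process talks directly to mpirun;
  # if this run succeeds where the default fails, daemon-to-daemon TCP is likely blocked
  mpirun -mca routed direct -H 10.1.2.126,10.1.2.122,10.1.2.123,10.1.2.125 hello-mpi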

[OMPI users] Test works with 3 computers, but not 4?

2009-07-29 Thread David Doria
I wrote a simple program to display "hello world" from each process. When I run this (126 - my machine, 122, and 123), everything works fine: [doriad@daviddoria MPITest]$ mpirun -H 10.1.2.126,10.1.2.122,10.1.2.123 hello-mpi From process 1 out of 3, Hello World! From process 2 out of 3, Hello

Re: [OMPI users] strange IMB runs

2009-07-29 Thread Dorian Krause
Hi, --mca mpi_leave_pinned 1 might help. Take a look at the FAQ for various tuning parameters. Michael Di Domenico wrote: I'm not sure I understand what's actually happened here. I'm running IMB on an HP superdome, just comparing the PingPong benchmark HP-MPI v2.3 Max ~ 700-800MB/sec
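
On the command line the suggestion amounts to roughly this (a sketch, reusing the shared-memory run from the quoted message; IMB-MPI1 is the standard Intel MPI Benchmarks binary name):

  # Re-run the shared-memory case with the leave-pinned optimization enabled
  mpirun -np 2 -mca btl self,sm -mca mpi_leave_pinned 1 ./IMB-MPI1 PingPong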

[OMPI users] strange IMB runs

2009-07-29 Thread Michael Di Domenico
I'm not sure I understand what's actually happened here. I'm running IMB on an HP Superdome, just comparing the PingPong benchmark: HP-MPI v2.3 max ~700-800 MB/sec; OpenMPI v1.3 with -mca btl self,sm max ~125-150 MB/sec, with -mca btl self,tcp max ~500-550 MB/sec. Is this behavior expected? Are there
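
For reference, the two Open MPI configurations being compared would be launched roughly like this (a sketch; IMB-MPI1 is the standard Intel MPI Benchmarks binary name):

  # Shared-memory BTL only
  mpirun -np 2 -mca btl self,sm  ./IMB-MPI1 PingPong

  # TCP BTL on the same host
  mpirun -np 2 -mca btl self,tcp ./IMB-MPI1 PingPong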

Re: [OMPI users] users Digest, Vol 1302, Issue 1

2009-07-29 Thread Ricardo Fonseca
Yes, I am using the right one. I've installed the freshly compiled openmpi into /opt/openmpi/1.3.3-g95-32. If I edit the mpif.h file by hand and put "error!" in the first line I get: zamblap:sandbox zamb$ edit /opt/openmpi/1.3.3-g95-32/include/mpif.h zamblap:sandbox zamb$ mpif77

[OMPI users] Jeffrey M Ceason is out of the office.

2009-07-29 Thread Jeffrey M Ceason
I will be out of the office starting 07/28/2009 and will not return until 08/03/2009. I will respond to your message when I return.

Re: [OMPI users] MPI_IN_PLACE in Fortran with MPI_REDUCE / MPI_ALLREDUCE

2009-07-29 Thread Jeff Squyres
Can you confirm that you're using the right mpif.h? Keep in mind that each MPI implementation's mpif.h is different -- it's a common mistake to assume that the mpif.h from one MPI implementation should work with another implementation (e.g., someone copied mpif.h from one MPI to your
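
One way to confirm which mpif.h the wrapper actually picks up is the Open MPI wrapper-compiler --showme option (a sketch; the :incdirs variant may not exist in older releases):

  # Show the full compile line, including the -I directory whose mpif.h will be used
  mpif77 --showme:compile

  # Or list just the include directories
  mpif77 --showme:incdirs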

Re: [OMPI users] users Digest, Vol 1296, Issue 6

2009-07-29 Thread Josh Hursey
This mailing list supports the Open MPI implementation of the MPI standard. If you have concerns about Intel MPI you should contact their support group. The ompi_checkpoint/ompi_restart routines are designed to work with Open MPI, and will certainly fail when used with other MPI
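
For reference, the usual Open MPI sequence looks roughly like this (a sketch based on the Open MPI checkpoint/restart documentation; note the hyphenated command names, and the snapshot handle shown is illustrative):

  # Launch with the checkpoint/restart framework enabled
  mpirun -np 4 -am ft-enable-cr ./my_app &
  MPIRUN_PID=$!

  # Checkpoint by handing ompi-checkpoint the PID of that mpirun
  ompi-checkpoint $MPIRUN_PID

  # Restart later from the global snapshot reference it printed
  ompi-restart ompi_global_snapshot_12345.ckpt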

Re: [OMPI users] users Digest, Vol 1296, Issue 6

2009-07-29 Thread Mallikarjuna Shastry
Dear Sir/Madam, kindly tell me the commands for checkpointing and restarting MPI programs using Intel MPI. I tried the following commands and they did not work: ompi_checkpoint, ompi_restart <file name of global snapshot>. With regards, Mallikarjuna Shastry