The slaves send specific requests to the master and then wait for a
reply to each request. For instance, a slave might send a request to
read a variable from the file; the master reads the variable and sends
it back in a response with the same tag. Thus there is never more than
one outstanding response to a given slave. We do not use any broadcast
functions in the code.
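In rough outline, the slave side of that exchange looks something like
the sketch below (made-up names, not the actual WINDUS code), using the
mpif.h interface we call from Fortran:

! Sketch only (hypothetical names): a slave requests one variable from
! the master and blocks on the reply that carries the same tag, so at
! most one response to this slave is ever in flight.
subroutine request_variable(master, tag, value, comm, ierr)
  implicit none
  include 'mpif.h'
  integer, intent(in)           :: master, tag, comm
  double precision, intent(out) :: value
  integer, intent(out)          :: ierr
  integer :: status(MPI_STATUS_SIZE)

  ! Send the request; here the tag doubles as the request identifier.
  call MPI_SEND(tag, 1, MPI_INTEGER, master, tag, comm, ierr)
  ! Wait for the master's reply with the same, fully specified tag.
  call MPI_RECV(value, 1, MPI_DOUBLE_PRECISION, master, tag, &
                comm, status, ierr)
end subroutine request_variable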
The fact that it runs fine on one host but not on more than one host
seems to indicate that something else is the problem. The code has been
used in parallel for 13 years and runs with PVM and other MPI
distributions without any problems. The communication patterns are very
simple and only require that message order be preserved.
-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Tuesday, January 30, 2007 8:44 AM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.
On Jan 30, 2007, at 9:35 AM, Fisher, Mark S wrote:
The master process uses both MPI_ANY_SOURCE and MPI_ANY_TAG while
waiting for requests from slave processes. The slaves sometimes use
MPI_ANY_TAG but the source is always specified.
I think you said that you only had corruption issues on the slave,
right? If so, the ANY_SOURCE/ANY_TAG on the master probably aren't
the
issue.
But if you're doing ANY_TAG on the slaves, you might want to double
check that that code is doing exactly what you think it's doing. Are
there any race conditions such that a message could be received on
that
ANY_TAG that you did not intend to receive there? Look especially
hard
at non-blocking receives with ANY_TAG.
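For example, a check along these lines (a sketch with made-up names,
not your actual code) will catch a wildcard receive matching a message
it was not written for:

! Sketch only (hypothetical names): receive with MPI_ANY_TAG, then
! check which tag actually arrived so an unintended match is caught
! immediately instead of silently scrambling later communication.
subroutine checked_wildcard_recv(master, expected_tag, buf, n, comm, ierr)
  implicit none
  include 'mpif.h'
  integer, intent(in)           :: master, expected_tag, n, comm
  double precision, intent(out) :: buf(n)
  integer, intent(out)          :: ierr
  integer :: status(MPI_STATUS_SIZE)

  call MPI_RECV(buf, n, MPI_DOUBLE_PRECISION, master, MPI_ANY_TAG, &
                comm, status, ierr)
  if (status(MPI_TAG) /= expected_tag) then
    write (*,*) 'unexpected tag ', status(MPI_TAG), &
                ', expected ', expected_tag
    call MPI_ABORT(comm, 1, ierr)
  end if
end subroutine checked_wildcard_recv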
We have run the code through valgrind for a number of cases, including
the one being used here.
Excellent.
The code is Fortran 90 and we are using the FORTRAN 77 interface so I
do not believe this is a problem.
Agreed; should not be an issue.
We are using Gigabit Ethernet.
Ok, good.
I could look at LAM again to see if it would work. The code needs to
be in a specific working directory, and we need some environment
variables set. This was not supported well in pre-MPI-2 versions of
MPI. For MPICH1 I actually launch a script for the slaves so that we
have the proper setup before running the executable. Note that I had
tried that with Open MPI and it had an internal error in orterun. This
is not a problem since mpirun can set up everything we need. If you
think it is worthwhile I will download it and try it.
Really? OMPI's mpirun does not depend on the executable being an MPI
application -- indeed, you can "mpirun -np 2 uptime" with no problem.
What problem did you run into here?
From what you describe, it sounds like order of messaging may be the
issue, not necessarily MPI handle types. So let's hold off on that one
for the moment (although LAM should be pretty straightforward to try --
you should be able to mpirun scripts with no problems; perhaps you can
try it as a background effort when you have spare cycles / etc.), and
look at your slave code for receiving.
-----Original Message-----
From: Jeff Squyres [mailto:jsquy...@cisco.com]
Sent: Monday, January 29, 2007 7:54 PM
To: Open MPI Users
Subject: Re: [OMPI users] Scrambled communications using ssh starter on multiple nodes.
Without analyzing your source, it's hard to say. I will say that
OMPI
may send fragments out of order, but we do, of course, provide the
same message ordering guarantees that MPI mandates. So let me ask a
few leading questions:
- Are you using any wildcards in your receives, such as
MPI_ANY_SOURCE
or MPI_ANY_TAG?
- Have you run your code through a memory-checking debugger such as
valgrind?
- I don't know what Scali MPI uses, but MPICH and Intel MPI use
integers for MPI handles. Have you tried LAM/MPI as well? It, like
Open MPI, uses pointers for MPI handles. I mention this because apps
that unintentionally have code that takes advantage of integer
handles
can sometimes behave unpredictably when switching to a pointer-based
MPI implementation.
- What network interconnect are you using between the two hosts?
On Jan 25, 2007, at 4:22 PM, Fisher, Mark S wrote:
Recently I wanted to try Open MPI for use with our CFD flow solver
WINDUS. The code uses a master/slave methodology where the master
handles I/O and issues tasks for the slaves to perform. The original
parallel implementation was done in 1993 using PVM, and in 1999 we
added support for MPI.
When testing the code with Open MPI 1.1.2, it ran fine on a single
machine. As soon as I ran on more than one machine I started getting
random errors right away (the attached tar ball has a good and a bad
output). It looked like either the messages were out of order or were
for the other slave process. In the run mode used there is no
slave-to-slave communication. In the file, the code died near the
beginning of the communication between master and slave. Sometimes it
will run further before it fails.
I have included a tar file with the build and configuration info. The
two nodes are identical Xeon 2.8 GHz machines running SLED 10. I am
running real-time (no queue) with the ssh starter using the following
appfile:
-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host skipper2 -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./__bcfdbeta.exe
-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host copland -wdir /tmp/mpi.m209290 -np 2 ./__bcfdbeta.exe
The above file fails but the following works:
-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host skipper2 -wdir /opt/scratch/m209290/ol.scr.16348 -np 1 ./__bcfdbeta.exe
-x PVMTASK -x BCFD_PS_MODE --mca pls_rsh_agent /usr/bin/ssh --host skipper2 -wdir /tmp/mpi.m209290 -np 2 ./__bcfdbeta.exe
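(The exact launch command is not shown here; each appfile is handed to
Open MPI's mpirun via its --app option, e.g. "mpirun --app appfile".)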
The first process is the master and the second two are the slaves. I am
not sure what is going wrong; the code runs fine with many other MPI
distributions (MPICH1/2, Intel, Scali...). I assume that either I
built it wrong or am not running it properly, but I cannot see what I
am doing wrong. Any help would be appreciated!
<<mpipb.tgz>>
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems