Thank you Jeff! That solves the problem. :-) You are a lifesaver!
So does that mean I always need to copy my application to all the nodes? Or should I give the pathname of my executable in a different way to avoid this? Do I need a network file system for that?


Jeff Squyres wrote:
The short version of the answer is to check to see that the executable is in the same location on both nodes (apparently: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out). Open MPI is complaining that it can't find that specific executable on the .194 node.

See below for more detail.


On Nov 5, 2009, at 3:19 PM, qing pang wrote:

1) I'm trying to run Open MPI with the following setup:

1 PC (as master node) and 1 notebook (as client node) connected to an
Ethernet router by Ethernet cable. Both are running Ubuntu 8.10.
There are no other connections. - Is this setup OK for running Open MPI?


Yes.

2) Prerequisites

SSH has been set up so that the master node can access the client node
through passwordless ssh. I do notice that it takes 10~15 seconds
between entering the 'ssh <slave ip address>' command and getting onto
the client node.
--- Could this be too slow for Open MPI to run properly?


Nope -- should be ok.

I do not have things like a network file system, network time protocol,
resource manager, scheduler, etc. installed.
--- Does Open MPI need any prerequisites other than passwordless ssh?


Not in this case, no.

3) Open MPI is installed on both nodes - downloaded from open-mpi.org
and built with configure / make all using the default settings.

4) PATH and LD_LIBRARY_PATH
On both nodes,
PATH is
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games,
which is the default setting in Ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the
file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'
So when I echo them on both nodes, I get:
 >echo $PATH
>/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 >echo $LD_LIBRARY_PATH
 >usr/local/lib:usr/lib

But, if I do
 >ssh <client_ip> 'echo $LD_LIBRARY_PATH'
nothing comes back.

while
 >ssh <client_ip> 'echo $PATH'
comes back with the right path.

Is that a problem?


No.
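
A side note on the LD_LIBRARY_PATH value quoted above: the entries usr/local/lib and usr/lib have no leading slash, so they are relative paths and would be resolved against the current working directory rather than the filesystem root. A minimal sketch for spotting such entries follows; the function name relative_entries is made up for illustration and is not part of Open MPI or Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: print every entry of a colon-separated path list
# that does not start with "/" (i.e. every relative entry).
relative_entries() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -v '^/'
}

# Example call (commented out so the file can be sourced safely):
#   relative_entries "$LD_LIBRARY_PATH"
```

With the value from the post, this prints usr/local/lib and usr/lib; with /usr/local/lib:/usr/lib it prints nothing.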

5) Problem:
I compiled the example hello_c using
 >mpicc hello_c.c -o hello_c.out
and ran it locally on each node; everything works fine.

But when I tried to run it on 2 nodes (-np 2)
 >mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
I got the following error:

----------------------------------------------------------------------------
gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun --machinefile machine.linux -np 2 $(pwd)/hello_c.out
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: 192.168.0.194


You are giving an absolute pathname in the mpirun command line:

mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out


Hence, it's looking for exactly /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out on both nodes. If the executable is in a different directory on the other node (or is missing there), that's probably where you're running into the problem.
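
One way to check this before calling mpirun is to test for the executable at the same absolute path on every node. The following is only a sketch under a few assumptions: the function names list_hosts and check_exe and the REMOTE variable are hypothetical, the machinefile is assumed to have the host name or address as the first field of each line, and passwordless ssh is assumed to work as described earlier in the thread.

```shell
#!/bin/sh
# Sketch only: list_hosts, check_exe and REMOTE are hypothetical names,
# not part of Open MPI.

# Print the first field (host name or address) of every machinefile line
# that is neither blank nor a "#" comment.
list_hosts() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# For each host in the machinefile, report whether an executable file
# exists at the given absolute path. REMOTE defaults to ssh; any command
# that accepts "<host> <command...>" can be substituted.
check_exe() {
    machinefile=$1
    exe=$2
    for host in $(list_hosts "$machinefile"); do
        if ${REMOTE:-ssh} "$host" test -x "$exe"; then
            echo "$host: ok"
        else
            echo "$host: missing $exe"
        fi
    done
}

# Example call (commented out so the file can be sourced safely):
#   check_exe machine.linux "$(pwd)/hello_c.out"
```

Any host reported as missing needs the executable at that exact path, for example by copying it there with scp, before the mpirun command above can launch it.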

