Thank you Jeff! That solves the problem. :-) You are a lifesaver!
So does that mean I always need to copy my application to all the
nodes? Or should I give the pathname of my executable in a different
way to avoid this? Do I need a network file system for that?
Jeff Squyres wrote:
The short version of the answer is to check to see that the executable
is in the same location on both nodes (apparently:
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out). Open MPI is
complaining that it can't find that specific executable on the .194 node.
See below for more detail.
On Nov 5, 2009, at 3:19 PM, qing pang wrote:
1) I'm trying to run Open MPI with the following setup:
1 PC (as master node) and 1 notebook (as client node) connected to an
Ethernet router through Ethernet cables. Both are running Ubuntu 8.10.
There are no other connections. - Is this setup OK to run Open MPI?
Yes.
2) Prerequisites
SSH has been set up so that the master node can access the client node
through passwordless ssh. I do notice that it takes 10~15 seconds
between entering the '>ssh <slave ip address>' command and getting onto
the client node.
--- Could this be too slow for Open MPI to run properly?
Nope -- should be ok.
I do not have things like a network file system, network time protocol,
resource manager, scheduler, etc. installed.
--- Does Open MPI need any prerequisites other than passwordless ssh?
Not in this case, no.
3) Open MPI is installed on both nodes: downloaded from open-mpi.org
and built with configure/make all using the default settings.
4) PATH and LD_LIBRARY_PATH
On both nodes,
PATH is
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games,
which is the default setting in ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the
file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'
So when I echo them on both nodes, I get:
>echo $PATH
>/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>echo $LD_LIBRARY_PATH
>usr/local/lib:usr/lib
But, if I do
>ssh <client_ip> 'echo $LD_LIBRARY_PATH'
nothing comes back.
while
>ssh <client_ip> 'echo $PATH'
comes back with the right path.
Is that a problem?
No.
5) Problem:
I compiled the example hello_c.c using
>mpicc hello_c.c -o hello_c.out
and ran it on each node locally; everything works fine.
But when I tried to run it on 2 nodes (-np 2)
>mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
I got the following error:
----------------------------------------------------------------------------
gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun --machinefile machine.linux -np 2 $(pwd)/hello_c.out
--------------------------------------------------------------------------
mpirun was unable to launch the specified application as it could not access
or execute an executable:
Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: 192.168.0.194
You are giving an absolute pathname in the mpirun command line:
mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
Hence, it's looking for exactly
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out on both
nodes. If the executable is in a different directory on the other
node, that's where you're probably running into the problem.
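One way to confirm this before launching is to ask each node in the
machinefile whether the executable exists at that exact path. The
following is only a rough sketch: the helper names list_hosts and
check_exe are made up for illustration, and it assumes the machinefile
has one host per line (optionally followed by something like slots=N,
with # starting a comment), that passwordless ssh works to every host,
and that the path contains no spaces.

```shell
#!/bin/sh
# list_hosts FILE (hypothetical helper): print the first field of every
# non-blank, non-comment line of a machinefile, i.e. the host name or IP.
list_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# check_exe EXE FILE (hypothetical helper): for each host in the
# machinefile, report whether EXE exists and is executable on that host.
# Assumes passwordless ssh to every host listed.
check_exe() {
    exe=$1
    for host in $(list_hosts "$2"); do
        if ssh -n "$host" test -x "$exe"; then
            echo "$host: ok"
        else
            echo "$host: cannot find or execute $exe"
        fi
    done
}
```

With the setup from this thread it would be called as
check_exe "$(pwd)/hello_c.out" machine.linux from the examples
directory on the master node. If a node reports the file as missing,
copying hello_c.out into the same directory on that node (with scp,
for example, assuming that directory exists there) should let mpirun
find it.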