jody wrote:
Hi Tim
(I accidentally sent the previous message before it was ready - here's
the complete one)
Thank You for your reply.
Unfortunately my workstation, on which i could successfully run openmpi
applications, has died. But on my replacement machine (which
i assume i have set up in an equivalent way) i now get errors even when i try
to run an openmpi application in a simple way:

jody@aim-nano_02 /home/aim-cari/jody $  mpirun -np 2 --hostfile hostfile ./a.out
bash: orted: command not found
[aim-nano_02:22145] ERROR: A daemon on node 130.60.49.129 failed to
start as expected.
[aim-nano_02:22145] ERROR: There may be more information available from
[aim-nano_02:22145] ERROR: the remote shell (see above).
[aim-nano_02:22145] ERROR: The daemon exited unexpectedly with status 127.
[aim-nano_02:22145] ERROR: A daemon on node 130.60.49.128 failed to
start as expected.
[aim-nano_02:22145] ERROR: There may be more information available from
[aim-nano_02:22145] ERROR: the remote shell (see above).
[aim-nano_02:22145] ERROR: The daemon exited unexpectedly with status 127.

However, i set PATH and LD_LIBRARY_PATH to the correct paths both in
.bashrc AND .bash_profile.
I assume you are using bash. You might try changing your .profile as well.


For example:
jody@aim-nano_02 /home/aim-cari/jody $ ssh 130.60.49.128 echo $PATH
/opt/openmpi/bin:/usr/local/bin:/usr/bin:/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/4.1.2:/opt/sun-jdk-1.4.2.10/bin:/opt/sun-jdk-1.4.2.10/jre/bin:/opt/sun-jdk-1.4.2.10/jre/javaws:/usr/qt/3/bin

When you do this, $PATH gets interpreted on the local host, not the remote host. Try instead:

ssh 130.60.49.128 printenv |grep PATH
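
A minimal sketch of the quoting difference, runnable without a second machine. Assumptions: fake_remote and DEMO_VAR are hypothetical names, and a child shell with its own value of the variable stands in for the remote host. With real ssh, the single-quoted form would be: ssh 130.60.49.128 'echo $PATH'

```shell
#!/bin/sh
# Hypothetical stand-in for "ssh host <command>": run the command string in a
# child shell whose DEMO_VAR differs from the caller's.
fake_remote() {
    DEMO_VAR=remote-value sh -c "$*"
}

DEMO_VAR=local-value

# Unquoted: the calling shell expands $DEMO_VAR before fake_remote runs,
# so the "remote" side only ever sees the local value.
unquoted=$(fake_remote echo $DEMO_VAR)

# Single-quoted: the literal text $DEMO_VAR reaches the child shell,
# which expands it using its own environment.
quoted=$(fake_remote 'echo $DEMO_VAR')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted call reports the caller's value, which is why the echo $PATH test above looked fine even though the remote PATH was not.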


But:
jody@aim-nano_02 /home/aim-cari/jody $ ssh 130.60.49.128 orted
bash: orted: command not found

You could also do:
ssh 130.60.49.128 which orted

This will show you the paths it searched for orted.
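
The status 127 in the error output is the shell's conventional "command not found" exit code, which matches the orted message. A small local sketch of that behaviour follows; no_such_command_for_demo is a made-up name chosen so that the lookup fails.

```shell
#!/bin/sh
# Run a command that does not exist and capture the shell's exit status.
status=0
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?

# POSIX shells report 127 when a command cannot be found on PATH.
echo "exit status: $status"

# command -v is a portable way to test whether a program is on PATH.
if command -v no_such_command_for_demo >/dev/null 2>&1; then
    lookup=found
else
    lookup=missing
fi
echo "lookup: $lookup"
```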

Do You have any suggestions?
To avoid dealing with paths (assuming everything is installed in the same directory on all nodes), you can also try the suggestion here, although I think that once it is set up, modifying the PATHs is the easier way to go (less typing :):
http://www.open-mpi.org/faq/?category=running#mpirun-prefix
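
As a sketch of that FAQ suggestion: the --prefix option tells mpirun where Open MPI is installed on the remote nodes, so orted can be found without touching PATH. The prefix /opt/openmpi is an assumption taken from the PATH output above; the script only prints the command so it can be checked before it is run.

```shell
#!/bin/sh
# Assumption: Open MPI is installed under /opt/openmpi on every node
# (guessed from /opt/openmpi/bin in the PATH shown earlier in this thread).
ompi_prefix=/opt/openmpi

# Same run as before, with --prefix added so the remote daemons are looked up
# under $ompi_prefix/bin and their libraries under $ompi_prefix/lib.
cmd="mpirun --prefix $ompi_prefix -np 2 --hostfile hostfile ./a.out"

# Dry run: print the command for inspection instead of executing it.
echo "$cmd"
```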


Hope this helps,

Tim

Thank You
 Jody

On 7/9/07, Tim Prins <tpr...@open-mpi.org> wrote:
Hi Jody,

Sorry for the super long delay. I don't know how this one got lost...

I run like this all the time. Unfortunately, it is not as simple as I
would like. Here is what I do:

1. Log into the machine using ssh -X
2. Run mpirun with the following parameters:
        -mca pls rsh  (This makes sure that Open MPI uses the rsh/ssh launcher.
It may not be necessary depending on your setup)
        -mca pls_rsh_agent "ssh -X" (To make sure X information is forwarded.
This might not be necessary if you have ssh set up to always forward X
information)
        --debug-daemons (This ensures that the ssh connections to the backend
nodes are kept open. Otherwise, they are closed and X information cannot
be forwarded. Unfortunately, this will also cause some debugging output
to be printed, but right now there is no other way :( )

So, the complete command is:
mpirun -np 4 -mca pls rsh -mca pls_rsh_agent "ssh -X" --debug-daemons xterm -e gdb my_prog

I hope this helps. Let me know if you are still experiencing problems.

Tim


jody wrote:
Hi
For debugging i usually run each process in a separate X-window.
This works well if i set the DISPLAY variable to the computer
from which i am starting my OpenMPI application.

This method fails, however, if i log in (via ssh) to my workstation
from a third computer and then start my OpenMPI application:
only the processes running on the workstation i logged into can
open their windows on the third computer. The processes on
the other computers can't open their windows.

This is how i start the processes

mpirun -np 4 -x DISPLAY run_gdb.sh ./TestApp

where run_gdb.sh looks like this
-------------------------
#!/bin/csh -f

echo "Running GDB on node `hostname`"
xterm -e gdb $*
exit 0
-------------------------
The output from the processes on the other computer:
    xterm Xt error: Can't open display: localhost:12.0

Is there a way to tell OpenMPI to forward the X windows
over yet another ssh connection?

Thanks
  Jody
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users