Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-09-24 Thread Thomas Ropars

It works fine with konsole.
Thank you for the advice.

Thomas.

Samuel Sarholz wrote:

Hi,

I think the problem is that xterm (probably) has the setuid or setgid bit 
set, and thus LD_LIBRARY_PATH is removed from its environment.

Try setting the path again before you start gdb, e.g:
mpirun -n 2 -x DISPLAY=:0.0 xterm -e LD_LIBRARY_PATH=

or use -Wl,-rpath= to compile the search path 
into the executable.
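
One possible way to re-set the path inside the xterm is to put env(1) between xterm and gdb, so the variable is added back after xterm has started. The command above is cut off in the archive, so this is only a sketch of the idea and not necessarily what was meant. build_debug_cmd is a hypothetical helper name, and the sketch assumes the LD_LIBRARY_PATH value of the launching shell is valid on every node.

```shell
#!/bin/sh
# Sketch only: build_debug_cmd is a hypothetical helper, not part of Open MPI.
# It prints an xterm command line that uses env(1) to put LD_LIBRARY_PATH
# back into the environment after xterm has started, right before gdb runs.
build_debug_cmd() {
    prog="$1"
    # Assumption: the value in the calling shell is also valid on every node.
    printf 'xterm -e env LD_LIBRARY_PATH=%s gdb %s\n' "$LD_LIBRARY_PATH" "$prog"
}
```

It could then be used as: mpirun -n 2 -x DISPLAY=:0.0 $(build_debug_cmd ./ring.out). The unquoted substitution splits on whitespace, so this only works for paths without spaces.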


best regards,
Samuel

P.S.: This xterm behavior causes us a lot of problems as well. Other 
terminals like konsole don't have that problem.
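
To check whether a given terminal binary is affected, one can test for the setuid and setgid bits, which are what make the dynamic linker ignore LD_LIBRARY_PATH. A minimal sketch follows; has_setid_bit is a hypothetical helper name.

```shell
#!/bin/sh
# has_setid_bit (hypothetical helper): succeed if FILE has the setuid
# or the setgid bit set, fail otherwise.
has_setid_bit() {
    [ -u "$1" ] || [ -g "$1" ]
}
```

For example: has_setid_bit "$(command -v xterm)" && echo "xterm will lose LD_LIBRARY_PATH". The result depends on how the distribution installs xterm.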


Thomas Ropars wrote:

Hi,

I'm trying to use gdb and xterm with open mpi on my computer (Ubuntu 
8.04).
When I run an application without gdb on my computer it works fine 
but if I try to use gdb in xterm I get the following error:


mpirun -n 2 -x DISPLAY=:0.0 xterm -e gdb ./ring.out

(gdb) run
Starting program: /media/sda5/tempo/openmpi/tests/ring.out
/media/sda5/tempo/openmpi/tests/ring.out: error while loading shared 
libraries: libmpi.so.0: cannot open shared object file: No such file 
or directory


Program exited with code 0177.

When I try to use a shell script to launch gdb as mentioned below, I 
get the same error.


Thomas

Jeff Squyres wrote:

On Feb 7, 2008, at 10:07 AM, jody wrote:

 
I wrote a little command called envliblist which consists of this  
line:

printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":"
'{ print $1 }' | xargs ls -al

When i do
mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist
all  xterms (local & remote) display the contents of the 
openmpi/lib  directory.



Ok, good.

 

Another strange result:
I have a shell script for launching the debugger in an xterm:
[jody]:/mnt/data1/neander:$cat run_gdb.sh
#!/bin/sh
#
# save the program name
export PROG="$1"
# shift away program name (leaves program params)
shift
# create a command file for gdb, to start it automatically
echo run $*  > gdb.cmd
# do the term
xterm -e gdb -x gdb.cmd $PROG

exit 0

When i run
 mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest
it works!

Just to compare
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest
does not work.



It seems that if you launch shell scripts, things work.  But if you  
run xterm without a shell script, it does not work.  I do not think 
it  is a difference of -hold vs. no -hold.  Indeed, I can run both 
of  these commands just fine on my system:


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -hold -e gdb ~/mpi/hello


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm  
-e  gdb ~/mpi/hello


Note that my setup is a little different than yours; I'm using a 
Mac  laptop and ssh'ing to a server where I'm invoking mpirun.  The  
hostfile "h" contains a 2nd server where xterm/gdb/hello are running.



 

I notice the only difference between the two above commands is that
in the run_gdb script xterm has no "-hold" parameter!
Indeed,
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest
does work. To actually see that it works (MPITest is simple Hello MPI
app) i had to do
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e
"./MPITest >> output.txt"
and check output.txt.

Does anybody have an explanation for this weird happening?

Jody
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




  






Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-09-24 Thread Thomas Ropars

Hi,

I'm trying to use gdb and xterm with open mpi on my computer (Ubuntu 8.04).
When I run an application without gdb on my computer it works fine but 
if I try to use gdb in xterm I get the following error:


mpirun -n 2 -x DISPLAY=:0.0 xterm -e gdb ./ring.out

(gdb) run
Starting program: /media/sda5/tempo/openmpi/tests/ring.out
/media/sda5/tempo/openmpi/tests/ring.out: error while loading shared 
libraries: libmpi.so.0: cannot open shared object file: No such file or 
directory


Program exited with code 0177.

When I try to use a shell script to launch gdb as mentioned below, I 
get the same error.


Thomas

Jeff Squyres wrote:

On Feb 7, 2008, at 10:07 AM, jody wrote:

  
I wrote a little command called envliblist which consists of this  
line:

printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":"
'{ print $1 }' | xargs ls -al

When i do
mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist
all  xterms (local & remote) display the contents of the openmpi/lib  
directory.



Ok, good.

  

Another strange result:
I have a shell script for launching the debugger in an xterm:
[jody]:/mnt/data1/neander:$cat run_gdb.sh
#!/bin/sh
#
# save the program name
export PROG="$1"
# shift away program name (leaves program params)
shift
# create a command file for gdb, to start it automatically
echo run $*  > gdb.cmd
# do the term
xterm -e gdb -x gdb.cmd $PROG

exit 0

When i run
 mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest
it works!

Just to compare
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest
does not work.



It seems that if you launch shell scripts, things work.  But if you  
run xterm without a shell script, it does not work.  I do not think it  
is a difference of -hold vs. no -hold.  Indeed, I can run both of  
these commands just fine on my system:


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -hold -e gdb ~/mpi/hello


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm  -e  
gdb ~/mpi/hello


Note that my setup is a little different than yours; I'm using a Mac  
laptop and ssh'ing to a server where I'm invoking mpirun.  The  
hostfile "h" contains a 2nd server where xterm/gdb/hello are running.



  

I notice the only difference between the two above commands is that
in the run_gdb script xterm has no "-hold" parameter!
Indeed,
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest
does work. To actually see that it works (MPITest is simple Hello MPI
app) i had to do
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e
"./MPITest >> output.txt"
and check output.txt.

Does anybody have an explanation for this weird happening?

Jody




  




Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-02-07 Thread Jeff Squyres

On Feb 7, 2008, at 10:07 AM, jody wrote:

I wrote a little command called envliblist which consists of this  
line:

printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":"
'{ print $1 }' | xargs ls -al

When i do
mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist
all  xterms (local & remote) display the contents of the openmpi/lib  
directory.


Ok, good.


Another strange result:
I have a shell script for launching the debugger in an xterm:
[jody]:/mnt/data1/neander:$cat run_gdb.sh
#!/bin/sh
#
# save the program name
export PROG="$1"
# shift away program name (leaves program params)
shift
# create a command file for gdb, to start it automatically
echo run $*  > gdb.cmd
# do the term
xterm -e gdb -x gdb.cmd $PROG

exit 0

When i run
 mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest
it works!

Just to compare
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest
does not work.


It seems that if you launch shell scripts, things work.  But if you  
run xterm without a shell script, it does not work.  I do not think it  
is a difference of -hold vs. no -hold.  Indeed, I can run both of  
these commands just fine on my system:


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -hold -e gdb ~/mpi/hello


% mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm  -e  
gdb ~/mpi/hello


Note that my setup is a little different than yours; I'm using a Mac  
laptop and ssh'ing to a server where I'm invoking mpirun.  The  
hostfile "h" contains a 2nd server where xterm/gdb/hello are running.





I notice the only difference between the two above commands is that
in the run_gdb script xterm has no "-hold" parameter!
Indeed,
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest
does work. To actually see that it works (MPITest is simple Hello MPI
app) i had to do
mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e
"./MPITest >> output.txt"
and check output.txt.

Does anybody have an explanation for this weird happening?

Jody



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-02-07 Thread jody
Hi Jeff

> The results of these two commands do seem to contradict each other;
> hmm.  Just to be absolutely sure, did you cut-n-paste the
> LD_LIBRARY_PATH directory output from printenv and try to "ls" it to
> ensure that it's completely spelled right, etc.?  I suspect that it's
> right since your other commands work, but at this point, it's worth
> checking the "obvious" things as well...

I wrote a little command called envliblist which consists of this line:
printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":"
'{ print $1 }' | xargs ls -al

When i do
mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist
all  xterms (local & remote) display the contents of the openmpi/lib directory.
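
The part of envliblist that picks the first LD_LIBRARY_PATH entry could also be sketched with plain shell parameter expansion, without gawk. This is an illustrative alternative, not the original command; first_lib_dir is a hypothetical name.

```shell
#!/bin/sh
# first_lib_dir (hypothetical helper): print the first entry of
# LD_LIBRARY_PATH, or an empty line if the variable is unset.
first_lib_dir() {
    dirs=${LD_LIBRARY_PATH-}
    # ${dirs%%:*} removes the longest suffix starting at the first colon.
    printf '%s\n' "${dirs%%:*}"
}
```

Listing that directory would then be: ls -al "$(first_lib_dir)".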

Another strange result:
I have a shell script for launching the debugger in an xterm:
[jody]:/mnt/data1/neander:$cat run_gdb.sh
#!/bin/sh
#
# save the program name
export PROG="$1"
# shift away program name (leaves program params)
shift
# create a command file for gdb, to start it automatically
echo run $*  > gdb.cmd
# do the term
xterm -e gdb -x gdb.cmd $PROG

exit 0

When i run
  mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest
it works!
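
One thing to keep in mind with run_gdb.sh: every rank writes the same gdb.cmd in the current directory. That is harmless while all ranks get identical arguments, but a private file per rank avoids any overlap. A minimal sketch, assuming mktemp is available; write_gdb_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# write_gdb_cmd (hypothetical helper): write "run <args>" into a private
# temporary file and print the file's path on stdout.
write_gdb_cmd() {
    cmdfile=$(mktemp "${TMPDIR:-/tmp}/gdbcmd.XXXXXX") || return 1
    # "$*" joins all program arguments with single spaces.
    printf 'run %s\n' "$*" > "$cmdfile"
    printf '%s\n' "$cmdfile"
}
```

Inside the script, after the shift, this would be used roughly as: cmdfile=$(write_gdb_cmd "$@") followed by xterm -e gdb -x "$cmdfile" "$PROG".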

Just to compare
 mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest
does not work.

I notice the only difference between the two above commands is that
in the run_gdb script xterm has no "-hold" parameter!
Indeed,
 mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest
does work. To actually see that it works (MPITest is simple Hello MPI
app) i had to do
 mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e
"./MPITest >> output.txt"
and check output.txt.

Does anybody have an explanation for this weird happening?

Jody


Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-02-07 Thread Jeff Squyres
The whole question of how to invoke xterms for gdb via mpirun keeps  
coming up, so when this thread is done, I'll add a pile of this  
information to the FAQ.


More below.

On Feb 6, 2008, at 10:52 AM, jody wrote:


I now solved the "ssh" part of my Problem
The XServer is being started with the nolisten option (thanks Allen).
In Fedora (Gnome) this can easily be changed by choosing
the "Login Screen" tool from the System|Administration Menu.
There, under the tab "Security", remove the checkmark from
"Deny TCP connections from xserver"
Of course, this needs root access - fortunately,
i am the boss of my computer ;)
Additionally, at least the port 6000 should be open.

This leaves me with my second problem

$mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm -hold
-e ./MPITest
Opens 2 xterms from nano (remote) and 3 xterms from plankton(local).
The local screens display the message:
./MPITest: error while loading shared libraries: libmpi_cxx.so.0:
cannot open shared object file: No such file or directory

Which is unbelievably strange, since for all xterms (local & remote)
the output of
  $mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm
-hold -e printenv
contains the PATH variable containing the path to openmpi/bin and the
LD_LIBRARY_PATH
containing the path to openmpi/lib


The results of these two commands do seem to contradict each other;  
hmm.  Just to be absolutely sure, did you cut-n-paste the  
LD_LIBRARY_PATH directory output from printenv and try to "ls" it to  
ensure that it's completely spelled right, etc.?  I suspect that it's  
right since your other commands work, but at this point, it's worth  
checking the "obvious" things as well...


What shell are you using?  You might want to add some echo statements  
to your shell startup scripts to ensure that all the right parts are  
being run in each of the cases -- perhaps, for some weird reason, they  
aren't in the problematic cases...?  [shrug]




Doing
  $mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm
-hold -e locate libmpi_cxx
returns on all xterms (local & remote)
/opt/openmpi/lib/libmpi_cxx.la
/opt/openmpi/lib/libmpi_cxx.so
/opt/openmpi/lib/libmpi_cxx.so.0
/opt/openmpi/lib/libmpi_cxx.so.0.0.0

On the other hand, the application has no problem when being called
without xterms:
$mpirun -np 5 -hostfile testhosts ./MPITest

Does anybody have an idea why that should happen?


Thanks
  Jody



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] mpirun, paths and xterm again

2008-02-06 Thread Tim Prins

Jody,

If you want to forward X connections through ssh, you should NOT set the 
DISPLAY variable. ssh will set the proper one for you.


Tim

jody wrote:

Tim

Thank you for your explanation on how OpenMPI uses ssh.



There is a way to force the ssh sessions to stay open. However doing so
will result in a bunch of excess debug output. If you add
"--debug-daemons" to the mpirun command line, the ssh connections should
stay open.


Unfortunately this didn't work either:

[jody]:/mnt/data1/neander:$mpirun -np 4 --debug-daemons --hostfile
testhosts -x DISPLAY=plankton:0.0 xterm -hold -e ../MPITest
Daemon [0,0,1] checking in as pid 19473 on host plankton.unizh.ch
Daemon [0,0,2] checking in as pid 26531 on host nano_00
[plankton.unizh.ch:19473] [0,0,1] orted: received launch callback
[nano_00:26531] [0,0,2] orted: received launch callback
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
[plankton.unizh.ch:19473] [0,0,1] orted_recv_pls: received message from [0,0,0]
[plankton.unizh.ch:19473] [0,0,1] orted_recv_pls: received exit
[nano_00:26531] [0,0,2] orted_recv_pls: received message from [0,0,0]
[nano_00:26531] [0,0,2] orted_recv_pls: received exit

If i use ":0.0" instead of "plankton:0.0", at least the local
processes open their X-terms.



Jody




Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)

2008-02-06 Thread jody
I now solved the "ssh" part of my Problem
The XServer is being started with the nolisten option (thanks Allen).
In Fedora (Gnome) this can easily be changed by choosing
the "Login Screen" tool from the System|Administration Menu.
There, under the tab "Security", remove the checkmark from
"Deny TCP connections from xserver"
Of course, this needs root access - fortunately,
i am the boss of my computer ;)
Additionally, at least the port 6000 should be open.
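
For reference, a display name with a host part (plankton:0.0) is reached over TCP, while one without (:0.0) uses a local connection; the TCP port is by convention 6000 plus the display number. A small sketch of that mapping follows. The helper names are hypothetical, and it assumes simple host:display.screen names.

```shell
#!/bin/sh
# display_host (hypothetical helper): print the host part of an X display
# name; the result is empty for a local display such as ":0.0".
display_host() {
    printf '%s\n' "${1%%:*}"
}

# x11_tcp_port (hypothetical helper): print the TCP port an X server
# conventionally listens on for the given display, i.e. 6000 + display number.
x11_tcp_port() {
    num=${1##*:}      # "plankton:0.0" -> "0.0"
    num=${num%%.*}    # "0.0" -> "0"
    echo $((6000 + num))
}
```

With DISPLAY=plankton:0.0 this gives host plankton and port 6000, which is the port that has to be reachable from the remote node.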

This leaves me with my second problem

$mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm -hold
-e ./MPITest
Opens 2 xterms from nano (remote) and 3 xterms from plankton(local).
The local screens display the message:
./MPITest: error while loading shared libraries: libmpi_cxx.so.0:
cannot open shared object file: No such file or directory

Which is unbelievably strange, since for all xterms (local & remote)
the output of
   $mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm
-hold -e printenv
contains the PATH variable containing the path to openmpi/bin and the
LD_LIBRARY_PATH
containing the path to openmpi/lib

Doing
   $mpirun -np 5 -hostfile testhosts -x DISPLAY=plankton:0.0 xterm
-hold -e locate libmpi_cxx
returns on all xterms (local & remote)
/opt/openmpi/lib/libmpi_cxx.la
/opt/openmpi/lib/libmpi_cxx.so
/opt/openmpi/lib/libmpi_cxx.so.0
/opt/openmpi/lib/libmpi_cxx.so.0.0.0

On the other hand, the application has no problem when being called
without xterms:
$mpirun -np 5 -hostfile testhosts ./MPITest

Does anybody have an idea why that should happen?


Thanks
   Jody


Re: [OMPI users] mpirun, paths and xterm again

2008-02-06 Thread jody
Tim

Thank you for your explanation on how OpenMPI uses ssh.


> There is a way to force the ssh sessions to stay open. However doing so
> will result in a bunch of excess debug output. If you add
> "--debug-daemons" to the mpirun command line, the ssh connections should
> stay open.

Unfortunately this didn't work either:

[jody]:/mnt/data1/neander:$mpirun -np 4 --debug-daemons --hostfile
testhosts -x DISPLAY=plankton:0.0 xterm -hold -e ../MPITest
Daemon [0,0,1] checking in as pid 19473 on host plankton.unizh.ch
Daemon [0,0,2] checking in as pid 26531 on host nano_00
[plankton.unizh.ch:19473] [0,0,1] orted: received launch callback
[nano_00:26531] [0,0,2] orted: received launch callback
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
xterm Xt error: Can't open display: plankton:0.0
[plankton.unizh.ch:19473] [0,0,1] orted_recv_pls: received message from [0,0,0]
[plankton.unizh.ch:19473] [0,0,1] orted_recv_pls: received exit
[nano_00:26531] [0,0,2] orted_recv_pls: received message from [0,0,0]
[nano_00:26531] [0,0,2] orted_recv_pls: received exit

If i use ":0.0" instead of "plankton:0.0", at least the local
processes open their X-terms.



Jody


Re: [OMPI users] mpirun, paths and xterm again

2008-02-05 Thread Tim Prins

Jody,

jody wrote:

Hi Tim


Your desktop is plankton, and you want
to run a job on both plankton and nano, and have xterms show up on nano.


Not on nano, but on plankton, but I think this was just a typo :)

Correct.


It looks like you are already doing this, but to make sure, the way I
would use xhost is:
plankton$ xhost +nano_00
plankton$ mpirun -np 4 --hostfile testhosts -x DISPLAY=plankton:0.0
xterm -hold -e ../MPITest

This gives me 2 lines of
  xterm Xt error: Can't open display: plankton:0.0


Can you try running:
plankton$ mpirun -np 1 -host nano_00 -x DISPLAY=plankton:0.0 printenv

This yields
DISPLAY=plankton:0.0





just to make sure the environment variable is being properly set.

You might also try:
in terminal 1:
plankton$ xhost +nano_00

in terminal 2:
plankton$ ssh -x nano_00
nano_00$ export DISPLAY="plankton:0.0"
nano_00$ xterm


This experiment also gives
xterm Xt error: Can't open display: plankton:0.0


This will ssh into nano, disabling ssh X forwarding, and try to launch
an xterm. If this does not work, then something is wrong with your x
setup. If it does work, it should work with Open MPI as well.


So i guess something is wrong with my X setup.
I wonder what it could be ...


So this is an X issue, not an Open MPI issue then. I do not know enough 
about X setup to help here...




Doing the same with X11 forwarding works perfectly.
But why is X11 forwarding bad?  Or differently asked,
does Open MPI make the ssh connection in such a way
that X11 forwarding is  disabled?


What Open MPI does is it uses ssh to launch a daemon on a remote node, 
then it disconnects the ssh session. This is done to prevent running out 
of resources at scale. We then send a message to the daemon to launch 
the client application. So we are not doing anything to prevent ssh X11 
forwarding, it is just that by the time the application is launched, the ssh 
sessions are no longer around.


There is a way to force the ssh sessions to stay open. However doing so 
will result in a bunch of excess debug output. If you add 
"--debug-daemons" to the mpirun command line, the ssh connections should 
stay open.


Hope this helps,

Tim


Re: [OMPI users] mpirun, paths and xterm again

2008-02-05 Thread jody
Hi Tim

> Your desktop is plankton, and you want
> to run a job on both plankton and nano, and have xterms show up on nano.

Not on nano, but on plankton, but I think this was just a typo :)

> It looks like you are already doing this, but to make sure, the way I
> would use xhost is:
> plankton$ xhost +nano_00
> plankton$ mpirun -np 4 --hostfile testhosts -x DISPLAY=plankton:0.0
> xterm -hold -e ../MPITest
This gives me 2 lines of
  xterm Xt error: Can't open display: plankton:0.0

>
> Can you try running:
> plankton$ mpirun -np 1 -host nano_00 -x DISPLAY=plankton:0.0 printenv
This yields
DISPLAY=plankton:0.0
OMPI_MCA_orte_precondition_transports=4a0f9ccb4c13cd0e-6255330fbb0289f9
OMPI_MCA_rds=proxy
OMPI_MCA_ras=proxy
OMPI_MCA_rmaps=proxy
OMPI_MCA_pls=proxy
OMPI_MCA_rmgr=proxy
SHELL=/bin/bash
SSH_CLIENT=130.60.49.141 59524 22
USER=jody
LD_LIBRARY_PATH=/opt/openmpi/lib
SSH_AUTH_SOCK=/tmp/ssh-enOzt24653/agent.24653
MAIL=/var/mail/jody
PATH=/opt/openmpi/bin:/usr/local/bin:/bin:/usr/bin
PWD=/home/jody
SHLVL=1
HOME=/home/jody
LOGNAME=jody
SSH_CONNECTION=130.60.49.141 59524 130.60.49.128 22
_=/opt/openmpi/bin/orted
OMPI_MCA_mpi_yield_when_idle=0
OMPI_MCA_mpi_paffinity_processor=0
OMPI_MCA_universe=j...@aim-plankton.unizh.ch:default-universe-10265
OMPI_MCA_ns_replica_uri=0.0.0;tcp://130.60.49.141:50310
OMPI_MCA_gpr_replica_uri=0.0.0;tcp://130.60.49.141:50310
OMPI_MCA_orte_app_num=0
OMPI_MCA_orte_base_nodename=nano_00
OMPI_MCA_ns_nds=env
OMPI_MCA_ns_nds_cellid=0
OMPI_MCA_ns_nds_jobid=1
OMPI_MCA_ns_nds_vpid=0
OMPI_MCA_ns_nds_vpid_start=0
OMPI_MCA_ns_nds_num_procs=1


>
> just to make sure the environment variable is being properly set.
>
> You might also try:
> in terminal 1:
> plankton$ xhost +nano_00
>
> in terminal 2:
> plankton$ ssh -x nano_00
> nano_00$ export DISPLAY="plankton:0.0"
> nano_00$ xterm
>
This experiment also gives
xterm Xt error: Can't open display: plankton:0.0

> This will ssh into nano, disabling ssh X forwarding, and try to launch
> an xterm. If this does not work, then something is wrong with your x
> setup. If it does work, it should work with Open MPI as well.
>
So i guess something is wrong with my X setup.
I wonder what it could be ...
Doing the same with X11 forwarding works perfectly.
But why is X11 forwarding bad?  Or differently asked,
does Open MPI make the ssh connection in such a way
that X11 forwarding is  disabled?

Thank you
  Jody


Re: [OMPI users] mpirun, paths and xterm again

2008-02-05 Thread Tim Prins

Hi Jody,

Just to make sure I understand. Your desktop is plankton, and you want 
to run a job on both plankton and nano, and have xterms show up on nano.


It looks like you are already doing this, but to make sure, the way I 
would use xhost is:

plankton$ xhost +nano_00
plankton$ mpirun -np 4 --hostfile testhosts -x DISPLAY=plankton:0.0 
xterm -hold -e ../MPITest


Can you try running:
plankton$ mpirun -np 1 -host nano_00 -x DISPLAY=plankton:0.0 printenv

just to make sure the environment variable is being properly set.

You might also try:
in terminal 1:
plankton$ xhost +nano_00

in terminal 2:
plankton$ ssh -x nano_00
nano_00$ export DISPLAY="plankton:0.0"
nano_00$ xterm

This will ssh into nano, disabling ssh X forwarding, and try to launch 
an xterm. If this does not work, then something is wrong with your x 
setup. If it does work, it should work with Open MPI as well.


For your second question: I'm not sure why there would be a difference 
in finding the shared libraries in gdb vs. with the xterm.


Tim

jody wrote:

Hi
Sorry to bring this subject up again -
but i have a problem getting xterms
running for all of my processes (for debugging purposes).
There are actually two problems involved:
display, and paths.


my ssh is set up so that X forwarding is allowed,
and, indeed,
  ssh nano_00 xterm
opens an xterm from the remote machine nano_00.

When i run my program normally, it works ok:
 [jody]:/mnt/data1/neander:$mpirun -np 4 --hostfile testhosts ./MPITest
[aim-plankton.unizh.ch]I am #0/4 global
[aim-plankton.unizh.ch]I am #1/4 global
[aim-nano_00]I am #2/4 global
[aim-nano_00]I am #3/4 global

But when i try to see it in xterms
[jody]:/mnt/data1/neander:$mpirun -np 4 --hostfile testhosts -x
DISPLAY xterm -hold -e  ./MPITest
xterm Xt error: Can't open display: :0.0
xterm Xt error: Can't open display: :0.0

(same happens, if i set DISPLAY=plankton:0.0, or if i use plankton's
ip address;
and xhost is enabled for nano_00)

the other two (the "local") xterms open, but they display the message:
 ./MPITest: error while loading shared libraries: libmpi_cxx.so.0:
cannot open shared object file: No such file or directory
(This also happens if i only have local processes)

So my first question is: what do i do to enable nano_00 to display an xterm
on plankton? Using normal ssh there seems to be no problem.

Second question: why does the use of xterm "hide" the open-mpi libs?
Interestingly: if i use xterm with gdb to start my application, it works.

Any ideas?

Thank you
  Jody