Re: [OMPI users] Connection to lifeline lost

2014-01-24 Thread etcamargo
You are right. The problem was solved by giving the full path to the mpirun of one MPI version: /home/myuser/openmpi-x/bin/mpirun -hostfile machines -np 2 ./hello Thanks, Edson On 24-01-2014 16:00, Ralph Castain wrote: Looks to me like you are picking up a different OMPI installation on the remote node -

Re: [OMPI users] Connection to lifeline lost

2014-01-24 Thread Ralph Castain
Looks to me like you are picking up a different OMPI installation on the remote node - check that your PATH and LD_LIBRARY_PATH on the remote host are being set correctly On Jan 24, 2014, at 9:41 AM, etcamargo wrote: > Hi, All! > > Please, I have a problem running a simple "hello world" program
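
The check suggested above can be sketched in shell. This is only a minimal sketch under stated assumptions: `resolve_in_path` is a hypothetical helper written for this illustration (it is not part of Open MPI), and `node2` is a placeholder host name. The helper reports which executable a given PATH string would select first, so the local result can be compared with what a non-interactive ssh shell sees on the remote node.

```shell
#!/bin/sh
# resolve_in_path CMD PATHSTRING
# Print the first executable file named CMD found in the colon-separated
# PATHSTRING; return non-zero if none is found.
# Hypothetical helper for illustration only, not part of Open MPI.
resolve_in_path() {
    cmd=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Local check: which mpirun does this shell's PATH select?
#   resolve_in_path mpirun "$PATH"
#
# Remote check ("node2" is a placeholder): what does a non-interactive
# ssh shell resolve, and what are its search paths?
#   ssh node2 'command -v mpirun; echo "$PATH"; echo "$LD_LIBRARY_PATH"'
```

If the local and remote results point at different installation directories, that would be consistent with the diagnosis above; invoking mpirun by its full path, as in the follow-up message, is one way to pin a single installation.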

[OMPI users] Connection to lifeline lost

2014-01-24 Thread etcamargo
Hi, All! Please, I have a problem running a simple "hello world" program on different hosts. The hosts are virtual machines on the same network. The program works fine on only one host; ssh works between the machines, and NFS is working, sharing the executable files between them.

Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-21 Thread Ralph Castain
Have you looked through the code in orte/mca/plm/rsh/plm_rsh_module.c? It is executing a tree-like spawn pattern by default, but there isn't anything magic about what ssh is doing. However, there are things done to prep the remote shell (setting paths etc.), and the tree spawn passes some additiona

Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-21 Thread Yann RADENAC
On 20/08/2012 15:56, Ralph Castain wrote: > You might try adding "-mca plm_base_verbose 5 --debug-daemons" to watch the debug output from the daemons as they are launched. There seems to be an interference here: my problem is "solved" by enabling option --debug-daemons with a verbose level >

Re: [OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-20 Thread Ralph Castain
Just to be clear: what you are launching is an orted daemon, not your application process. Once the daemons are running, then we use them to launch the actual application process. So the issue here is with starting the daemons themselves. You might try adding "-mca plm_base_verbose 5 --debug-dae

[OMPI users] "Connection to lifeline lost" when developing a new rsh agent

2012-08-20 Thread Yann RADENAC
Hi, I'm developing MPI support for XtreemOS (www.xtreemos.eu) so that an MPI program is managed as a single XtreemOS job. To manage all processes as a single XtreemOS job, I've developed the program xos-createProcess that plays the role of the rsh agent (replacing ssh/rsh) to start a process