Re: [OMPI devel] NetPIPE performance curves

2017-05-03 Thread George Bosilca
ules. >> > >> > Which module(s) are then responsible to send command to orted to start >> mpi application? >> > Which event names should I search for? >> > >> > Thank you, >> > Justin >> > >> > - Orig

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
Everything operates via the state machine - events trigger moving the job from one state to the next, with each state being tied to a callback function that implements that state. If you set state_base_verbose=5, you’ll see when and where each state gets executed. By default, the launch_app sta
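As a rough illustration of the pattern described above (events move a job from one state to the next, and each state is tied to a callback), here is a minimal Python sketch. It is not ORTE code: the class `JobStateMachine`, the state names, and the `verbose` threshold are all hypothetical stand-ins chosen for the example.

```python
from collections import deque

class JobStateMachine:
    """Toy event-driven state machine; names are illustrative, not ORTE's."""

    def __init__(self, verbose=0):
        self.callbacks = {}    # state name -> callback implementing that state
        self.events = deque()  # pending (job, state) activations
        self.verbose = verbose
        self.trace = []        # order in which states were executed

    def register(self, state, callback):
        self.callbacks[state] = callback

    def activate(self, job, state):
        # Posting an event is what moves the job to its next state.
        self.events.append((job, state))

    def run(self):
        while self.events:
            job, state = self.events.popleft()
            if self.verbose >= 5:
                print(f"job {job}: executing state {state}")
            self.trace.append(state)
            self.callbacks[state](self, job)

def demo(verbose=0):
    # Hypothetical state sequence: each callback activates the next state.
    order = ["allocate", "map", "launch", "running"]
    sm = JobStateMachine(verbose=verbose)
    for i, state in enumerate(order):
        nxt = order[i + 1] if i + 1 < len(order) else None
        def callback(machine, job, nxt=nxt):
            if nxt is not None:
                machine.activate(job, nxt)
        sm.register(state, callback)
    sm.activate("job-1", order[0])
    sm.run()
    return sm.trace
```

Calling `demo(verbose=5)` prints one line per executed state, which is loosely analogous to what a verbose setting on the state framework would show.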

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread Justin Cinkelj
So "remote spawn" and children refer to orted daemons only, and I was looking into the wrong modules. Which module(s) are then responsible for sending the command to orted to start the MPI application? Which event names should I search for? Thank you, Justin - Original Message - > From: r...@open-mpi.

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
I should have looked more closely as you already have the routed verbose output there. Everything in fact looks correct. The node with mpirun has 1 child, which is the daemon on the other node. The vpid=1 daemon on node 250 doesn’t have any children as there aren’t any more daemons in the system
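To make the parent/child relationship between daemons concrete, here is a small Python sketch that computes a daemon's children in a routing tree. It assumes a plain k-ary tree numbered by vpid, which is a simplification for illustration and not necessarily how the ORTE routed components number their trees; the function `children` and its parameters are hypothetical.

```python
def children(vpid, num_daemons, radix=2):
    """Return the child vpids of `vpid` in a simple k-ary tree.

    Assumption: daemons are numbered 0..num_daemons-1, with vpid 0
    (the mpirun node) at the root and the children of v being
    v*radix+1 .. v*radix+radix, limited to vpids that exist.
    """
    first = vpid * radix + 1
    return [c for c in range(first, first + radix) if c < num_daemons]

# Two daemons in total: mpirun (vpid 0) and one remote orted (vpid 1).
print(children(0, 2))  # [1]
print(children(1, 2))  # []
```

With only two daemons, the root has a single child and that child has none, which matches the situation described in the message: an empty child list on the remote daemon is expected, not an error.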

Re: [OMPI devel] remote spawn - have no children

2017-05-03 Thread r...@open-mpi.org
The orte routed framework does that for you - there is an API for that purpose. > On May 3, 2017, at 12:17 AM, Justin Cinkelj wrote: > > Important detail first: I get this message from significantly modified Open > MPI code, so problem exists solely due to my mistake. > > Orterun on 192.168.1

[OMPI devel] remote spawn - have no children

2017-05-03 Thread Justin Cinkelj
Important detail first: I get this message from significantly modified Open MPI code, so the problem exists solely due to my mistake. Orterun on 192.168.122.90 starts orted on remote node 192.168.122.91, then orted figures out it has nothing to do. If I request to start workers on the same 192.168.