In every 'modex' block I set coll->id = orte_process_info.peer_modex; and in every 'barrier' block I set coll->id = orte_process_info.peer_init_barrier;.
P.S. In general (as I wrote in my first letter), I use the term 'modex' for the following code:

    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = orte_process_info.peer_modex;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
        error = "orte_grpcomm_modex failed";
        goto error;
    }
    /* wait for modex to complete - this may be moved anywhere in mpi_init
     * so long as it occurs prior to calling a function that needs
     * the modex info!
     */
    while (coll->active) {
        opal_progress();  /* block in progress pending events */
    }
    OBJ_RELEASE(coll);

and 'barrier' for this:

    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = orte_process_info.peer_init_barrier;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
        error = "orte_grpcomm_barrier failed";
        goto error;
    }
    /* wait for barrier to complete */
    while (coll->active) {
        opal_progress();  /* block in progress pending events */
    }
    OBJ_RELEASE(coll);

On Thu, Dec 20, 2012 at 8:57 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Dec 20, 2012, at 8:29 AM, Victor Kocheganov <victor.kochega...@itseez.com> wrote:
>
> Thanks for the fast answer, Ralph.
>
> In my example I use different collective objects. I mean, in every mentioned block I call coll = OBJ_NEW(orte_grpcomm_collective_t); and OBJ_RELEASE(coll);, so all the grpcomm operations use unique collective objects.
>
> How are the procs getting the collective id for those new calls? They all have to match.
>
> On Thu, Dec 20, 2012 at 7:48 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Absolutely it will hang, as the collective object passed into any grpcomm operation (modex or barrier) is only allowed to be used once - any attempt to reuse it will fail.
>>
>> On Dec 20, 2012, at 6:57 AM, Victor Kocheganov <victor.kochega...@itseez.com> wrote:
>>
>> Hi.
>>
>> I have an issue with understanding the ompi_mpi_init() logic. Could you please tell me if you have any guesses about the following behavior.
>>
>> I wonder, if I understand right, there is a block in the ompi_mpi_init() function for exchanging proc information between processes (denote this block 'modex'):
>>
>>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>>     coll->id = orte_process_info.peer_modex;
>>     if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
>>         error = "orte_grpcomm_modex failed";
>>         goto error;
>>     }
>>     /* wait for modex to complete - this may be moved anywhere in mpi_init
>>      * so long as it occurs prior to calling a function that needs
>>      * the modex info!
>>      */
>>     while (coll->active) {
>>         opal_progress();  /* block in progress pending events */
>>     }
>>     OBJ_RELEASE(coll);
>>
>> and several instructions after this there is a block for process synchronization (denote this block 'barrier'):
>>
>>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>>     coll->id = orte_process_info.peer_init_barrier;
>>     if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
>>         error = "orte_grpcomm_barrier failed";
>>         goto error;
>>     }
>>     /* wait for barrier to complete */
>>     while (coll->active) {
>>         opal_progress();  /* block in progress pending events */
>>     }
>>     OBJ_RELEASE(coll);
>>
>> So, initially ompi_mpi_init() has the following structure:
>>
>>     ...
>>     'modex' block;
>>     ...
>>     'barrier' block;
>>     ...
>>
>> I made several experiments with this code, and the following one is of interest: if I add a sequence of two additional blocks, 'barrier' and 'modex', right after the 'modex' block, then ompi_mpi_init() hangs in opal_progress() of the last 'modex' block.
>>
>>     ...
>>     'modex' block;
>>     'barrier' block;
>>     'modex' block; <- hangs
>>     ...
>>     'barrier' block;
>>     ...
>>
>> Thanks,
>> Victor Kocheganov.
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel