Yeah, that won't work. The ids cannot be reused, so you'd have to assign a 
different one in each case.
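
To make that concrete, here is a minimal sketch (in the style of the code quoted 
below) of what assigning a different id in each case would look like. The 
EXTRA_COLL_ID value is hypothetical, not an existing orte_process_info field; as 
noted further down in the thread, every participating proc would have to assign 
the exact same value for it:

    /* sketch only: assumes the same ORTE calls shown in the quoted code below */
    orte_grpcomm_collective_t *coll;

    /* standard modex, using the pre-agreed peer_modex id */
    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = orte_process_info.peer_modex;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
        error = "orte_grpcomm_modex failed";
        goto error;
    }
    while (coll->active) {
        opal_progress();   /* block in progress pending events */
    }
    OBJ_RELEASE(coll);

    /* an ADDITIONAL barrier: needs a fresh object AND a new, globally agreed id;
     * EXTRA_COLL_ID is hypothetical and must be identical on every proc */
    coll = OBJ_NEW(orte_grpcomm_collective_t);
    coll->id = EXTRA_COLL_ID;
    if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
        error = "orte_grpcomm_barrier failed";
        goto error;
    }
    while (coll->active) {
        opal_progress();   /* block in progress pending events */
    }
    OBJ_RELEASE(coll);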

On Dec 20, 2012, at 9:12 AM, Victor Kocheganov <victor.kochega...@itseez.com> 
wrote:

> In every 'modex' block I use the  coll->id = orte_process_info.peer_modex;  id, 
> and in every 'barrier' block I use the  coll->id = 
> orte_process_info.peer_init_barrier;  id. 
> 
> P.S. In general (as I wrote in my first letter), I use the 'modex' term for the 
> following code:
>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>     coll->id = orte_process_info.peer_modex;
>     if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
>         error = "orte_grpcomm_modex failed";
>         goto error;
>     }
>     /* wait for modex to complete - this may be moved anywhere in mpi_init
>      * so long as it occurs prior to calling a function that needs
>      * the modex info!
>      */
>     while (coll->active) {
>         opal_progress();  /* block in progress pending events */
>     }
>     OBJ_RELEASE(coll);
> 
> and 'barrier' for this:
> 
>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>     coll->id = orte_process_info.peer_init_barrier;
>     if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
>         error = "orte_grpcomm_barrier failed";
>         goto error;
>     }
>     /* wait for barrier to complete */
>     while (coll->active) {
>         opal_progress();  /* block in progress pending events */
>     }
>     OBJ_RELEASE(coll);
> 
> On Thu, Dec 20, 2012 at 8:57 PM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> On Dec 20, 2012, at 8:29 AM, Victor Kocheganov <victor.kochega...@itseez.com> 
> wrote:
> 
>> Thanks for the fast answer, Ralph.
>> 
>> In my example I use different collective objects. I mean that every mentioned 
>> block calls  coll = OBJ_NEW(orte_grpcomm_collective_t);  
>> and  OBJ_RELEASE(coll); , so all the grpcomm operations use a unique collective 
>> object. 
> 
> How are the procs getting the collective id for those new calls? They all 
> have to match.
> 
>> 
>> 
>> On Thu, Dec 20, 2012 at 7:48 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Absolutely it will hang, as the collective object passed into any grpcomm 
>> operation (modex or barrier) is only allowed to be used once - any attempt 
>> to reuse it will fail.
>> 
>> 
>> On Dec 20, 2012, at 6:57 AM, Victor Kocheganov 
>> <victor.kochega...@itseez.com> wrote:
>> 
>>> Hi.
>>> 
>>> I have an issue with understanding the ompi_mpi_init() logic. Could you please 
>>> tell me if you have any guesses about the following behavior.
>>> 
>>> If I understand right, there is a block in the ompi_mpi_init() 
>>> function for exchanging proc information between processes (denote this 
>>> block 'modex'):
>>>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>>>     coll->id = orte_process_info.peer_modex;
>>>     if (ORTE_SUCCESS != (ret = orte_grpcomm.modex(coll))) {
>>>         error = "orte_grpcomm_modex failed";
>>>         goto error;
>>>     }
>>>     /* wait for modex to complete - this may be moved anywhere in mpi_init
>>>      * so long as it occurs prior to calling a function that needs
>>>      * the modex info!
>>>      */
>>>     while (coll->active) {
>>>         opal_progress();  /* block in progress pending events */
>>>     }
>>>     OBJ_RELEASE(coll);
>>> and several instructions later there is a block for process 
>>> synchronization (denote this block 'barrier'):
>>>     coll = OBJ_NEW(orte_grpcomm_collective_t);
>>>     coll->id = orte_process_info.peer_init_barrier;
>>>     if (ORTE_SUCCESS != (ret = orte_grpcomm.barrier(coll))) {
>>>         error = "orte_grpcomm_barrier failed";
>>>         goto error;
>>>     }
>>>     /* wait for barrier to complete */
>>>     while (coll->active) {
>>>         opal_progress();  /* block in progress pending events */
>>>     }
>>>     OBJ_RELEASE(coll);
>>> So, initially ompi_mpi_init() has the following structure:
>>> ...
>>> 'modex' block;
>>> ...
>>> 'barrier' block;
>>> ...
>>> I made several experiments with this code, and the following one is of 
>>> interest: if I add a sequence of two additional blocks, 'barrier' and 
>>> 'modex', right after the 'modex' block, then ompi_mpi_init() hangs in 
>>> opal_progress() of the last 'modex' block.
>>> ...
>>> 'modex' block;
>>> 'barrier' block;
>>> 'modex' block; <- hangs
>>> ...
>>> 'barrier' block;
>>> ...
>>> Thanks,
>>> Victor Kocheganov.