Re: [OMPI devel] Hanging inside ompi_win_create_dynamic

2020-05-22 Thread Gilles Gouaillardet via devel
Luis,

MPI window creation is a collective operation. It involves creating a
communicator, which in turn involves (several) non-blocking
allgathers.

I recommend you first make sure that creating a window does not
start a recursion.
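
One way to check for that is a small re-entrancy guard around the code
path that creates the window. The following is only a minimal sketch,
assuming a single-threaded code path; win_create_enter,
win_create_leave and create_window_guarded are hypothetical names and
not part of Open MPI. The nested call stands in for whatever in your
collective might end up creating a window again.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical guard, not part of Open MPI: counts how many times we are
 * currently inside the window-creation path. Plain static, so this sketch
 * assumes a single thread. */
static int win_create_depth = 0;

/* Call on entry. Returns 1 if we are already inside window creation. */
static int win_create_enter(const char *where)
{
    if (++win_create_depth > 1) {
        fprintf(stderr, "recursive window creation at %s (depth %d)\n",
                where, win_create_depth);
        return 1;
    }
    return 0;
}

/* Call on every exit path that follows win_create_enter(). */
static void win_create_leave(void)
{
    assert(win_create_depth > 0);
    --win_create_depth;
}

/* Stand-in for the code that creates the window. If 'reenter' is nonzero,
 * it simulates a collective that itself tries to create a window. */
static int create_window_guarded(int reenter)
{
    int nested = win_create_enter("create_window_guarded");

    if (!nested && reenter) {
        /* Simulated recursion: the inner call reports the nesting. */
        nested = create_window_guarded(0);
    }

    win_create_leave();
    return nested;
}
```

In this sketch create_window_guarded(0) returns 0 and
create_window_guarded(1) returns 1 after printing a diagnostic. In a
real build the enter/leave calls would go around your own call that
ends up in ompi_win_create_dynamic.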


Cheers,

Gilles

On Fri, May 22, 2020 at 7:53 PM Luis Cebamanos via devel
 wrote:
>
> Hello OpenMPI devs,
>
> I know this may be a somewhat vague question, but I am looking for some
> hints that could help me debug the problem I am facing here.
>
> I am trying to create a dynamic window inside a collective operation.
> Inside the operation, the call to ompi_win_create_dynamic hangs with
> processes waiting in ompi_request_wait_completion:
>
>     rc = ompi_comm_nextcid_nb (newcomm, comm, bridgecomm, arg0, arg1,
>                                send_first, mode, &req);
>     if (OMPI_SUCCESS != rc) {
>         return rc;
>     }
>
> ompi_request_wait_completion (req);
>
> There is nothing obvious there but I can see from my debugging that
> processes end up in (opal_progress.c):
>
> #if OPAL_HAVE_SCHED_YIELD
>     if (opal_progress_yield_when_idle && events <= 0) {
>         /* If there is nothing to do - yield the processor - otherwise
>          * we could consume the processor for the entire time slice. If
>          * the processor is oversubscribed - this will result in a best-case
>          * latency equivalent to the time-slice.
>          */
>         sched_yield();
>     }
> #endif  /* defined(HAVE_SCHED_YIELD) */
>
> I have been debugging it for a while now and I am starting to feel like
> I am going in circles. Is there anything I should look at in this
> particular situation? What could possibly be the cause?
>
> Regards,
> Luis
>
>
> The University of Edinburgh is a charitable body, registered in Scotland, 
> with registration number SC005336.


[OMPI devel] Hanging inside ompi_win_create_dynamic

2020-05-22 Thread Luis Cebamanos via devel
Hello OpenMPI devs,

I know this may be a somewhat vague question, but I am looking for some
hints that could help me debug the problem I am facing here.

I am trying to create a dynamic window inside a collective operation.
Inside the operation, the call to ompi_win_create_dynamic hangs with
processes waiting in ompi_request_wait_completion:

    rc = ompi_comm_nextcid_nb (newcomm, comm, bridgecomm, arg0, arg1,
                               send_first, mode, &req);
    if (OMPI_SUCCESS != rc) {
        return rc;
    }

ompi_request_wait_completion (req);

There is nothing obvious there but I can see from my debugging that
processes end up in (opal_progress.c):

#if OPAL_HAVE_SCHED_YIELD
    if (opal_progress_yield_when_idle && events <= 0) {
        /* If there is nothing to do - yield the processor - otherwise
         * we could consume the processor for the entire time slice. If
         * the processor is oversubscribed - this will result in a best-case
         * latency equivalent to the time-slice.
         */
        sched_yield();
    }
#endif  /* defined(HAVE_SCHED_YIELD) */

I have been debugging it for a while now and I am starting to feel like
I am going in circles. Is there anything I should look at in this
particular situation? What could possibly be the cause?

Regards,
Luis



