I am running into a situation that I don’t understand, so I thought I would toss it out and see if someone can give me a hint on how to deal with what I am seeing.

I am making a call to MPI_Wait(), which ends up with the following call sequence:
 - ompi_request_default_wait()
 - ompi_request_wait_completion()

which goes to

    while(false == req->req_complete) {
        opal_condition_wait(&ompi_request_cond, &ompi_request_lock);
    }

The value of ompi_request_cond->c_signaled is 0, so when opal_condition_wait() is called the code goes to

    while (c->c_signaled == 0) {
        opal_progress();
        OPAL_CR_TEST_CHECKPOINT_READY_STALL();
    }

which spins forever, since c->c_signaled remains 0 (even though the condition for which the wait is testing has long since been satisfied).

It looks like the value of c_signaled is changed in opal_condition_signal(), opal_condition_broadcast(), opal_condition_timedwait(), or later on in opal_condition_wait(), but not in the loop the code is stuck in. Does anyone on the list know how this code is supposed to work, and if so, are there any hints?

Looking a bit more, it seems like ompi_request_complete() needs to be called. Can someone explain the assumptions this routine uses?

    if( NULL != request->req_complete_cb ) {
        request->req_complete_cb( request );
    }
    ompi_request_completed++;
    request->req_complete = true;
    if(with_signal && ompi_request_waiting) {
        /* Broadcast the condition, otherwise if there is already a thread
         * waiting on another request it can use all signals.
         */
        opal_condition_broadcast(&ompi_request_cond);
    }
    return OMPI_SUCCESS;

What is the significance of ompi_request_completed? Is this counter used to manage something?
What is ompi_request_cond used for?
What is ompi_request_waiting used for?

Thanks,
Rich