Thanks Bill and Pavan.
I was having trouble seeing how (especially with a nonblocking ready send) the matching receive could be guaranteed to be posted first, but since I only saw the problem on Titan I wondered if I had missed something. Apparently I was just really lucky on the other platforms (or unlucky on Titan). Thanks!

________________________________
From: mpi-forum <mpi-forum-boun...@lists.mpi-forum.org> on behalf of William Gropp via mpi-forum <mpi-forum@lists.mpi-forum.org>
Sent: Wednesday, November 07, 2018 8:06 PM
To: Main MPI Forum mailing list
Cc: William Gropp
Subject: Re: [Mpi-forum] Persistent Readysend Semantics Question

Pavan is correct; the program is buggy. Here's an example:

    process 1      process 2
    start(recv)    /* something causes a delay at process 2 */
    start(rsend)
    wait(all)
                   start(recv)
    ....

In this case, the rsend on process 1 occurs before the recv is started on process 2, and the MPI program is incorrect. Without some synchronization, either explicit or implicit (e.g., an allreduce for time-step control), the use of Rsend in any form is unlikely to be correct.

Bill

William Gropp
Director and Chief Scientist, NCSA
Thomas M. Siebel Chair in Computer Science
University of Illinois Urbana-Champaign
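[Editor's note: to make the interleaving above concrete, here is a minimal sketch of the kind of explicit synchronization Bill mentions, assuming two ranks, a single int buffer, and a zero-byte "ready" notification; the ranks, tags, and handshake are illustrative, not taken from the thread. The receiver posts its receive first and only then tells the sender it is safe to issue the ready send.]

    /* Sketch: explicit synchronization so a ready send always finds a
     * posted receive.  Run with at least 2 ranks. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 0;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 1) {
            /* Post the receive first ... */
            MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
            /* ... then tell rank 0 it is safe to issue the ready send. */
            MPI_Send(NULL, 0, MPI_BYTE, 0, 1, MPI_COMM_WORLD);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", buf);
        } else if (rank == 0) {
            buf = 42;
            /* Block until rank 1 confirms its receive is posted. */
            MPI_Recv(NULL, 0, MPI_BYTE, 1, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* Now the ready send is guaranteed to find a matching receive. */
            MPI_Rsend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

Any operation that orders "receive posted" before "rsend started" works equally well; that is the implicit synchronization (e.g., an allreduce used for time-step control) Bill refers to.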
On Nov 7, 2018, at 12:14 PM, Balaji, Pavan via mpi-forum <mpi-forum@lists.mpi-forum.org> wrote:

Brian,

Assuming all processes are running the same code as below, I think the user program is incorrect and you were just getting lucky with the other implementations. Specifically, there's nothing stopping the rsend from one process from reaching the other process before that process has posted the corresponding recv. For example, it might still be in the second waitall from the previous iteration.

— Pavan

Sent from my iPhone

On Nov 7, 2018, at 12:09 PM, Smith, Brian E. via mpi-forum <mpi-forum@lists.mpi-forum.org> wrote:

Hi all,

(Trying again; I thought this address was subscribed to the list, but maybe not. Sorry if this is a duplicate.)

I have a user-provided code that uses persistent ready sends. (Don't ask. I don't have an answer to "why?". Maybe it actually helped on some machine sometime in the past?)

Anyway, the app fails on Titan fairly consistently (95+% failure) but works on most other platforms (BGQ, Summit, a generic OMPI cluster, a generic Intel MPI cluster). Note: I haven't tried as many times on the other platforms as on Titan, so it might fail on one of them occasionally, but I saw zero failures in my testing.

The code is basically this:

    MPI_Recv_init()
    MPI_Rsend_init()
    while (condition) {
        MPI_Start(recv_request)
        MPI_Start(rsend_request)
        MPI_Waitall(both requests)
        twiddle_sendbuf_slightly()
    }
    MPI_Request_free(recv_request)
    MPI_Request_free(rsend_request)

    MPI_Cart_shift(rotate source/dest in the other direction now)

    MPI_Recv_init()   /* sending the other direction now, basically */
    MPI_Rsend_init()
    while (condition) {
        MPI_Start(recv_request)
        MPI_Start(rsend_request)
        MPI_Waitall(both requests)
        twiddle_sendbuf_slightly()
    }
    MPI_Request_free(recv_request)
    MPI_Request_free(rsend_request)

Is this considered a "correct program"? There are only a couple of paragraphs on persistent sends in 800+ pages of standard, and not much more on nonblocking ready sends (which is essentially what this becomes), so it's pretty vague territory.

I tried splitting the Waitall() into two Wait()s, explicitly waiting on the Recv request first, then the Rsend request. However, this still fails, and the error suggests the requests are not happening in order:

    Rank 2 [Wed Nov 7 08:26:12 2018] [c5-0c0s3n1] Fatal error in PMPI_Wait: Other MPI error, error stack:
    PMPI_Wait(207).....................: MPI_Wait(request=0x7fffffff5698, status=0x7fffffff5630) failed
    MPIR_Wait_impl(100)................:
    MPIDI_CH3_PktHandler_ReadySend(829): Ready send from source 1 and with tag 1 had no matching receive

It strongly looks like the receive is not always posted before the send arrives, or at least that the Waitall sometimes completes the send before the recv, and I suspect that means an implementation bug. Cray might actually be doing something to optimize either persistent communication or ready sends (or both) that we never did on BGQ (so it's not necessarily an MPICH vs. OMPI difference, at least).

Thoughts? I'll open a bug with them at some point, but I wanted to verify the semantics first.

Thanks,

Brian Smith
Oak Ridge Leadership Computing Facility
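[Editor's note: for completeness, a hedged sketch of one way the loop above could be made legal with persistent requests; the ring neighbors, buffer sizes, tags, and the zero-byte acknowledgment are assumptions, not from the original code. Each iteration posts the receive, exchanges an acknowledgment with both neighbors, and only then starts the persistent ready send.]

    /* Sketch: persistent ready send with a per-iteration "receive is
     * posted" handshake.  Run with at least 2 ranks. */
    #include <mpi.h>
    #include <string.h>

    #define N     1024
    #define ITERS 100

    int main(int argc, char **argv)
    {
        int rank, size, left, right;
        double sendbuf[N], recvbuf[N];
        MPI_Request reqs[2];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        right = (rank + 1) % size;          /* we send to the right ...      */
        left  = (rank + size - 1) % size;   /* ... and receive from the left */
        memset(sendbuf, 0, sizeof sendbuf);

        MPI_Recv_init(recvbuf, N, MPI_DOUBLE, left, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Rsend_init(sendbuf, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

        for (int it = 0; it < ITERS; it++) {
            MPI_Start(&reqs[0]);            /* post the receive */
            /* Tell our sender (left) that the receive is posted, and wait
             * until our target (right) says the same about its receive. */
            MPI_Sendrecv(NULL, 0, MPI_BYTE, left, 1,
                         NULL, 0, MPI_BYTE, right, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Start(&reqs[1]);            /* ready send is now safe */
            MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
            /* twiddle sendbuf here, as in the original loop */
        }

        MPI_Request_free(&reqs[0]);
        MPI_Request_free(&reqs[1]);
        MPI_Finalize();
        return 0;
    }

A plain MPI_Barrier between the two MPI_Start calls would give the same guarantee, at the cost of synchronizing every rank on every iteration.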