For SHMEM: in an environment where the order in which data is placed in memory 
cannot be guaranteed, the SHMEM method of polling on a memory location seems 
like a distinctly bad idea, even though we've gotten away with it for years, 
largely thanks to rational memory controller designers and the restrictions of 
the PCIe bus.

For MPI, I can see how that would work, but it amounts to doing an RMA 
operation followed by a SEND.  The mechanism I am proposing would eliminate the 
need for the SEND operation.

Essentially, the requirement is as follows: for one-sided operations, ensure 
that the data is actually visible at the responder by the time the 
responder-side consumer goes to look for it.

Accomplishing this requires two things:
1. a signaling mechanism from the requester to the responder, and
2. a mechanism at the responder side to synchronize the ordering of the signal 
w.r.t. data visibility.

We already have REMOTE_CQ_DATA to use as the signal; defining 
FI_DELIVERY_COMPLETE as suggested could provide the needed synchronization 
mechanism.
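
To make the two requirements concrete, here is a minimal C11 sketch of the 
ordering being asked for. It is not libfabric code: struct target_region, 
requester_put_with_signal() and responder_poll() are hypothetical names made 
up for illustration, and the release/acquire pair only models, within one 
coherent memory system, the guarantee a provider would have to enforce 
between data placement and completion delivery.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 64

/* Hypothetical stand-in for the responder's target buffer plus a
 * one-entry "completion" slot that carries the immediate data. */
struct target_region {
    char buf[BUF_LEN];            /* payload placed by the requester */
    atomic_uint_fast32_t signal;  /* 0 = no completion yet, else immediate data */
};

static void target_region_init(struct target_region *t)
{
    memset(t->buf, 0, sizeof(t->buf));
    atomic_init(&t->signal, 0);
}

/* Requirement 1: the requester signals the responder.
 * The payload is written first; the release store then publishes the
 * signal, so the payload writes cannot be reordered after it. */
static int requester_put_with_signal(struct target_region *t,
                                     const void *src, size_t len,
                                     uint32_t imm_data)
{
    assert(imm_data != 0);        /* 0 is reserved to mean "no signal" */
    if (len > BUF_LEN)
        return -1;
    memcpy(t->buf, src, len);
    atomic_store_explicit(&t->signal, (uint_fast32_t)imm_data,
                          memory_order_release);
    return 0;
}

/* Requirement 2: the responder orders the signal w.r.t. data visibility.
 * The acquire load pairs with the release store above, so once the
 * signal is observed the payload is guaranteed to be visible too. */
static bool responder_poll(struct target_region *t, uint32_t *imm_data)
{
    uint_fast32_t s = atomic_load_explicit(&t->signal, memory_order_acquire);
    if (s == 0)
        return false;
    *imm_data = (uint32_t)s;
    return true;
}
```

In real use the two functions would run on different agents (here, different 
threads sharing the region). The point is that the responder never has to 
poll on the payload itself, only on the signal, and a SEND is not needed.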

I do think that such a feature could serve the put-with-notify call being 
discussed in the MPI Forum, assuming that discussion ever comes to anything.

-Paul

-----Original Message-----
From: Sur, Sayantan [mailto:[email protected]] 
Sent: Wednesday, October 21, 2015 10:05 AM
To: Paul Grun; [email protected]
Subject: Re: [ofiwg] A question on FI_DELIVERY_COMPLETE

SHMEM: there is a wait_until call that the responder calls that basically waits 
until the value in the memory location changes. Then there is 
shmem_barrier_all, in which the responder could also have participated.

MPI: In passive mode operation - the requestor (origin) needs to unlock a 
window (or flush), and send a message to the responder (target) that allows the 
target to inspect the data.

There is some talk in MPI Forum to introduce a call that does MPI_Put with 
notify, but it doesn’t exist currently. If it was introduced, maybe it could 
use the feature you’re suggesting?

Thanks,
Sayantan.



On 10/21/15, 9:51 AM, "Paul Grun" <[email protected]> wrote:

>What are those mechanisms that MPI and SHMEM use?
>
>Wouldn't it be useful if the requester could simply use REMOTE_CQ_DATA and be 
>assured that the responder wouldn't get the completion until the data had been 
>placed into cache?
>-Paul
>
>-----Original Message-----
>From: Sur, Sayantan [mailto:[email protected]] 
>Sent: Wednesday, October 21, 2015 9:50 AM
>To: Paul Grun; [email protected]
>Subject: Re: [ofiwg] A question on FI_DELIVERY_COMPLETE
>
>Having the notification at the requester is useful for MPI RMA or SHMEM use 
>cases. This allows MPI/SHMEM to wait for a local event that indicates remote 
>completion. The responder side is passive in these use cases.
>
>Both MPI and SHMEM have different mechanisms to let the responder know when it 
>is able to look at the data.
>
>Thanks,
>Sayantan.
>
>From: [email protected] on behalf of Paul Grun <[email protected]>
>Date: Wednesday, October 21, 2015 at 9:33 AM
>To: [email protected]
>Subject: [ofiwg] A question on FI_DELIVERY_COMPLETE
>
>Here’s my understanding of how FI_DELIVERY_COMPLETE works on the *responder* 
>end:  If you are doing an RMA operation, and the requester uses REMOTE_CQ_DATA 
>to signal the end of the transfer to the responder, and the responder has 
>FI_DELIVERY_COMPLETE set, then the responder won’t get a completion event 
>until the data is actually visible to the responder.
>
>I ask because the man pages imply that FI_DELIVERY_COMPLETE, which is an 
>operation flag, applies only to the requester side.  But it is much less 
>important to notify the requester that data is visible to the responder, than 
>it is to notify the responder itself.
>
>Comments?
>-Paul
>
>
>Cray Inc.
>Office:    (503) 620-8757
>Mobile:  (503) 703-5382
>
_______________________________________________
ofiwg mailing list
[email protected]
http://lists.openfabrics.org/mailman/listinfo/ofiwg
