Yossi,

I think you raised an interesting corner case, and a possible bug in the MTL 
implementation. As the request is marked as complete by the CM/PML, the cancel 
should never succeed. As the CM/PML is forcing the completion on all bsend 
requests, it should also enforce that completed requests cannot be 
cancelled (instead of leaving this task to the MTL).

I think the cleanest approach will be to allow the MTL itself to handle the 
complete case, by moving the code you pinpointed 
(MCA_PML_CM_HVY_SEND_REQUEST_START) from the CM/PML down into each MTL send 
case (they can check for buffered send requests). This approach will possibly 
allow an MTL to implement send cancellation.
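
To make the idea concrete, here is a rough sketch in plain C. It does not use
the real Open MPI types or macros: every name in it (mock_send_req_t,
mock_mtl_t, mock_pml_start, mock_mtl_send, mock_mtl_progress, mock_mtl_cancel)
is hypothetical and only models the control flow. The assumption is that the
PML no longer completes buffered sends itself, and that each MTL decides in its
send path whether to complete a buffered request immediately or to defer
completion so that a later cancel can still succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real request and MTL structures. */
typedef enum { SEND_STANDARD, SEND_BUFFERED } send_mode_t;

typedef struct {
    send_mode_t mode;
    bool complete;   /* what Wait()/Test() would observe */
    bool cancelled;  /* what Test_cancelled() would report */
} mock_send_req_t;

typedef struct {
    bool can_cancel_sends;  /* assumed per-MTL capability flag */
} mock_mtl_t;

/* PML side: start the request, but do NOT force completion here. */
static void mock_pml_start(mock_send_req_t *req, send_mode_t mode)
{
    assert(req != NULL);
    req->mode = mode;
    req->complete = false;
    req->cancelled = false;
}

/* MTL side: the buffered-send check now lives in the MTL send path. */
static void mock_mtl_send(const mock_mtl_t *mtl, mock_send_req_t *req)
{
    if (req->mode == SEND_BUFFERED && !mtl->can_cancel_sends) {
        /* Same behaviour as today: complete the buffered send at once. */
        req->complete = true;
    }
    /* Otherwise completion is deferred until the transport reports it. */
}

/* MTL side: the transport reports that the send has finished. */
static void mock_mtl_progress(mock_send_req_t *req)
{
    req->complete = true;
}

/* A completed request can never be cancelled; returns 0 on success. */
static int mock_mtl_cancel(mock_send_req_t *req)
{
    if (req->complete) {
        return -1;
    }
    req->cancelled = true;
    req->complete = true;  /* a cancelled request still completes */
    return 0;
}
```

With can_cancel_sends set to false the behaviour matches the current one (the
request is complete right after the send and the cancel fails), while an MTL
that sets it to true keeps the request pending until it knows the outcome.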

  George.

On Aug 4, 2014, at 09:49 , Yossi Etigin <yos...@mellanox.com> wrote:

> Hi,
>  
> Seems like it’s impossible to cancel buffered sends with pml/cm.
>  
> On one hand, pml/cm completes the buffered send immediately 
> (MCA_PML_CM_HVY_SEND_REQUEST_START):
>         if(OMPI_SUCCESS == ret &&                                            \
>            sendreq->req_send.req_send_mode == MCA_PML_BASE_SEND_BUFFERED) {  \
>             sendreq->req_send.req_base.req_ompi.req_status.MPI_ERROR = 0;    \
>             ompi_request_complete(&(sendreq)->req_send.req_base.req_ompi,    \
>                                   true);                                     \
>         }
>  
> So, if the user is doing Bsend()/Cancel()/Wait()/Test_canceled(), the Wait() 
> would be a no-op.
> Therefore, when mtl_cancel() is called, it has to either cancel or guarantee 
> completion *immediately*; otherwise the return from Test_canceled would be 
> undefined.
> However, it’s not always possible to cancel immediately, because we need to 
> make sure the peer has not matched it yet (for example, with mtl mxm).
>  
> IMHO it’s wrong for pml_cm to complete a buffered send immediately.
> What do you think?
>  
> --Yossi
