From the MPI standard's perspective, MPI_Cancel does not have to succeed; it
can also fail gracefully. However, the PSM MTL diverges from the MPI
standard: if a request cannot be canceled, it returns an error. Here is
a patch to fix this issue.

diff --git a/ompi/mca/mtl/psm/mtl_psm_cancel.c b/ompi/mca/mtl/psm/mtl_psm_cancel
index 6da3386..277c761 100644
--- a/ompi/mca/mtl/psm/mtl_psm_cancel.c
+++ b/ompi/mca/mtl/psm/mtl_psm_cancel.c
@@ -37,10 +37,8 @@ int ompi_mtl_psm_cancel(struct mca_mtl_base_module_t* mtl,
     if(PSM_OK == err) {
       mtl_request->ompi_req->req_status._cancelled = true;
       mtl_psm_request->super.completion_callback(&mtl_psm_request->super);
-      return OMPI_SUCCESS;
-    } else {
-      return OMPI_ERROR;
     }
+    return OMPI_SUCCESS;
   } else if(PSM_MQ_INCOMPLETE == err) {
     return OMPI_SUCCESS;
   }
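
To make the effect of the hunk easier to see, here is a minimal standalone
sketch of just the changed branch. It is only a model: the enum values, the
struct, and the function names are hypothetical stand-ins, not the real PSM
or Open MPI definitions, and the completion callback is left out. It assumes
nothing about the code outside the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; NOT the real PSM / Open MPI values. */
enum model_psm_err { MODEL_PSM_OK, MODEL_PSM_MQ_INCOMPLETE, MODEL_PSM_OTHER };
enum model_ompi_rc { MODEL_OMPI_SUCCESS = 0, MODEL_OMPI_ERROR = -1 };

/* Hypothetical request: only tracks whether it was marked as cancelled. */
struct model_request { bool cancelled; };

/* Changed branch before the patch: anything other than OK is an error. */
int inner_branch_before(enum model_psm_err err, struct model_request *req)
{
    if (MODEL_PSM_OK == err) {
        req->cancelled = true;
        return MODEL_OMPI_SUCCESS;
    } else {
        return MODEL_OMPI_ERROR;
    }
}

/* Changed branch after the patch: a request that could not be cancelled
 * is simply left unmarked, and the call itself still reports success. */
int inner_branch_after(enum model_psm_err err, struct model_request *req)
{
    if (MODEL_PSM_OK == err) {
        req->cancelled = true;
    }
    return MODEL_OMPI_SUCCESS;
}

/* Small self-check comparing the two variants on the same inputs. */
void model_self_check(void)
{
    struct model_request a = { false }, b = { false };

    /* Cancel took effect: both variants agree. */
    assert(inner_branch_before(MODEL_PSM_OK, &a) == MODEL_OMPI_SUCCESS);
    assert(inner_branch_after(MODEL_PSM_OK, &b) == MODEL_OMPI_SUCCESS);
    assert(a.cancelled && b.cancelled);

    /* Cancel did not take effect: only the old variant reports an error. */
    a.cancelled = b.cancelled = false;
    assert(inner_branch_before(MODEL_PSM_OTHER, &a) == MODEL_OMPI_ERROR);
    assert(inner_branch_after(MODEL_PSM_OTHER, &b) == MODEL_OMPI_SUCCESS);
    assert(!a.cancelled && !b.cancelled);
}
```

Calling model_self_check() from a small main() exercises both variants. The
point of the change is that an unsuccessful cancel is visible only through
the cancelled flag on the request, not through the return code.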

  George.


On Thu, Jan 15, 2015 at 1:30 PM, Adrian Reber <adr...@lisas.de> wrote:

> Doing
>
> MPI_Isend()
>
> followed by a
>
> MPI_Cancel()
>
> fails on my PSM-based system with Open MPI 1.8.4 like this:
>
> n040108:0.1.Cannot cancel send requests (req=0x2b6279787f80)
> n040108:0.0.Cannot cancel send requests (req=0x2b3a3dc92f80)
> -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpirun detected that one or more processes exited with non-zero status,
> thus causing
> the job to be terminated. The first process to do so was:
>
>   Process name: [[58364,1],1]
>   Exit code:    255
> --------------------------------------------------------------------------
>
> Is this something PSM actually cannot do or an Open MPI error?
>
>                 Adrian
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/01/16783.php
>
