Today's trunk compiled with icc fails to complete the check on 2 tests:
opal_lifo and opal_tree.
For opal_tree the output is:
OPAL dss:unpack: got type 9 when expecting type 3
Failure : failed tree deserialization size compare
SUPPORT: OMPI Test failed: opal_tree_t (1 of 12 failed)
and opal_lif
Skimming through the PSM code shows that the return values of the PSM
functions are handled in most cases. Thus, removing the default error
handler might not be such a bad idea.
Did you experience any trouble running with the version without the default
error handler registered?
George.
On Th
It even says so in the code:
ompi/mca/mtl/psm/mtl_psm.c:
/* Default error handling is enabled, errors will not be returned to
* user. PSM prints the error and the offending endpoint's hostname
* and exits with -1 */
Disabling the default PSM error handler makes MPI_Canc
As PSM on master is still broken, I applied it to 1.8.4. Unfortunately, it
does not work. The error is the same as before.
Looking at your patch I would also expect that this is the correct fix
and I even tried to change ompi_mtl_psm_cancel() to always return
OMPI_SUCCESS. MPI_Cancel() still fails.
thanks George!
2015-01-15 11:43 GMT-07:00 George Bosilca :
> From the MPI standard perspective MPI_Cancel doesn't have to succeed, it
> can also gracefully fail. However, the PSM MTL diverges from the MPI
> standard and if a request cannot be canceled an error is returned. Here is
> a patch to f
From the MPI standard perspective MPI_Cancel doesn't have to succeed, it
can also gracefully fail. However, the PSM MTL diverges from the MPI
standard and if a request cannot be canceled an error is returned. Here is
a patch to fix this issue.
diff --git a/ompi/mca/mtl/psm/mtl_psm_cancel.c
b/ompi
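To make the distinction in the message above concrete, here is a toy model in C of the two behaviors: a cancel call that reports an error when the request cannot be canceled, and one that returns success and leaves it to the caller to check whether the cancel took effect. This is only a sketch and not the patch itself. Every name in it (toy_request_t, toy_cancel_strict, toy_cancel_graceful, TOY_SUCCESS, TOY_ERROR) is hypothetical and is not part of Open MPI or PSM.

```c
#include <stdbool.h>

/* Hypothetical status codes for this toy model (not the OMPI_* constants). */
enum { TOY_SUCCESS = 0, TOY_ERROR = -1 };

/* Hypothetical request: in_flight means it is too late to cancel it. */
typedef struct {
    bool in_flight;
    bool cancelled;
} toy_request_t;

/* Strict behavior: a request that cannot be canceled is reported as an error. */
static int toy_cancel_strict(toy_request_t *req)
{
    if (req->in_flight) {
        return TOY_ERROR;
    }
    req->cancelled = true;
    return TOY_SUCCESS;
}

/* Graceful behavior: the call itself always succeeds; only the cancelled
 * flag tells the caller whether the request was actually canceled. */
static int toy_cancel_graceful(toy_request_t *req)
{
    if (!req->in_flight) {
        req->cancelled = true;
    }
    return TOY_SUCCESS;
}
```

With a request that is already in flight, toy_cancel_strict() returns TOY_ERROR, while toy_cancel_graceful() returns TOY_SUCCESS and leaves cancelled set to false. In real MPI code the equivalent check is made by completing the request (for example with MPI_Wait) and then calling MPI_Test_cancelled on the resulting status.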
Doing
MPI_Isend()
followed by an
MPI_Cancel()
fails on my PSM based system with 1.8.4 like this:
n040108:0.1.Cannot cancel send requests (req=0x2b6279787f80)
n040108:0.0.Cannot cancel send requests (req=0x2b3a3dc92f80)
---
Primary job termin