Thanks. So, you'll put back the change?
George Bosilca wrote:
The progress functions do not need to return an error code. If there
is an error they should propagate it back through the descriptors.
The only meaning of the return code of the progress functions is to
know if any event had happened during this round of progress. The
opal_progress uses it to trigger the yield (if one is necessary).
Anyway, this is a good catch. We're doing a big mixup there. Attached
you will find a patch that cleans up this problem. As expected, there is
no performance impact ...
Here is the patch:
Index: btl_sm_component.c
===================================================================
--- btl_sm_component.c (revision 22176)
+++ btl_sm_component.c (working copy)
@@ -361,7 +361,7 @@
sm_fifo_t *fifo = NULL;
mca_btl_sm_hdr_t *hdr;
int my_smp_rank = mca_btl_sm_component.my_smp_rank;
- int peer_smp_rank, j, rc = 0;
+ int peer_smp_rank, j, rc = 0, events = 0;
/* first, deal with any pending sends */
/* This check should be fast since we only need to check one
variable. */
@@ -399,7 +399,7 @@
continue;
}
- rc++;
+ events++;
/* dispatch fragment by type */
switch(((uintptr_t)hdr) & MCA_BTL_SM_FRAG_TYPE_MASK) {
case MCA_BTL_SM_FRAG_SEND:
@@ -480,5 +480,5 @@
break;
}
}
- return rc;
+ return events;
}
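
For anyone following along, here is a minimal sketch of the convention
described above: the progress function returns only a count of events, and
errors are reported through the descriptor. This is not the sm BTL code.
Every name here (sketch_desc_t, sketch_progress, sketch_poll_once, the
"failed" flag) is hypothetical, and the use of sched_yield() for the yield
step is an assumption made for illustration.

```c
#include <assert.h>
#include <sched.h>   /* sched_yield (POSIX) */
#include <stddef.h>

/* Hypothetical descriptor: carries the per-fragment status and callback. */
typedef struct sketch_desc {
    int status;                             /* 0 on success, negative on error */
    void (*cbfunc)(struct sketch_desc *d);  /* completion callback, may be NULL */
    int failed;                             /* demo only: nonzero marks a bad fragment */
} sketch_desc_t;

/* Hypothetical progress function. The return value is the number of
 * events handled in this round, never an error code. Errors travel
 * back to the upper layer through the descriptor. */
static int sketch_progress(sketch_desc_t *pending, size_t n)
{
    int events = 0;

    assert(pending != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        sketch_desc_t *d = &pending[i];

        events++;                        /* something happened this round */
        d->status = d->failed ? -1 : 0;  /* error goes into the descriptor */
        if (d->cbfunc != NULL) {
            d->cbfunc(d);                /* owner inspects d->status here */
        }
    }
    return events;
}

/* Hypothetical caller: yields only when the round produced no events.
 * Returns 1 if it yielded, 0 otherwise. */
static int sketch_poll_once(sketch_desc_t *pending, size_t n)
{
    int events = sketch_progress(pending, n);

    if (events <= 0) {
        sched_yield();
        return 1;
    }
    return 0;
}
```

With this split, a failed fragment still counts as an event, so the caller
does not yield on it, and the failure stays visible in the descriptor's
status field.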