The progress functions do not need to return an error code. If there
is an error, they should propagate it back through the descriptors. The
only purpose of the return value of the progress functions is to tell
the caller whether any event happened during this round of progress.
opal_progress uses it to decide whether to yield (if a yield is necessary).
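
To make the convention above concrete, here is a minimal sketch of a progress driver that sums the event counts returned by its callbacks and yields only when nothing happened. This is not the actual Open MPI code: progress_fn_t, run_progress_round, idle_progress and busy_progress are hypothetical names made up for this illustration, and sched_yield() is used only as a stand-in for whatever the real driver does when idle.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical callback type: a progress function returns the number of
 * events it handled in this round, never an error code. */
typedef int (*progress_fn_t)(void);

/* Hypothetical sample callbacks, used only to exercise the driver. */
int idle_progress(void) { return 0; }  /* nothing happened */
int busy_progress(void) { return 2; }  /* pretend two fragments were handled */

/* Hypothetical driver: call every registered progress function once,
 * add up the reported events, and give up the CPU if the round was idle. */
int run_progress_round(progress_fn_t *callbacks, size_t count)
{
    int events = 0;

    for (size_t i = 0; i < count; i++) {
        events += callbacks[i]();
    }

    if (events <= 0) {
        /* No event in this round, so let another process run. */
        sched_yield();
    }

    return events;
}
```

With this convention, a progress function that returned a negative status code instead of a count could hide real events from the driver and trigger a yield that was not needed.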
Anyway, this is a good catch. We're mixing up two things there: rc is
used both as the event counter and as the return code of the FIFO
write. Attached you will find a patch that cleans this up. As expected,
there is no performance impact ...
george.
Here is the patch:
Index: btl_sm_component.c
===================================================================
--- btl_sm_component.c (revision 22176)
+++ btl_sm_component.c (working copy)
@@ -361,7 +361,7 @@
sm_fifo_t *fifo = NULL;
mca_btl_sm_hdr_t *hdr;
int my_smp_rank = mca_btl_sm_component.my_smp_rank;
- int peer_smp_rank, j, rc = 0;
+ int peer_smp_rank, j, rc = 0, events = 0;
/* first, deal with any pending sends */
/* This check should be fast since we only need to check one
variable. */
@@ -399,7 +399,7 @@
continue;
}
- rc++;
+ events++;
/* dispatch fragment by type */
switch(((uintptr_t)hdr) & MCA_BTL_SM_FRAG_TYPE_MASK) {
case MCA_BTL_SM_FRAG_SEND:
@@ -480,5 +480,5 @@
break;
}
}
- return rc;
+ return events;
}
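
As a rough illustration of why the patch above keeps the two variables apart, here is a small self-contained sketch. It is a toy model, not the sm BTL code: toy_fifo_t, toy_fifo_write, toy_fifo_read and toy_progress are hypothetical, and the last_status out-parameter is only a stand-in for reporting an error through a descriptor.

```c
#include <stddef.h>

#define TOY_FIFO_SIZE 4

/* Hypothetical fixed-size ring buffer standing in for a shared-memory FIFO. */
typedef struct {
    int    slots[TOY_FIFO_SIZE];
    size_t head;   /* index of the oldest element */
    size_t count;  /* number of stored elements */
} toy_fifo_t;

/* Returns 0 on success, -1 if the FIFO is full (a status, not a count). */
int toy_fifo_write(toy_fifo_t *fifo, int value)
{
    if (fifo->count == TOY_FIFO_SIZE) {
        return -1;
    }
    fifo->slots[(fifo->head + fifo->count) % TOY_FIFO_SIZE] = value;
    fifo->count++;
    return 0;
}

/* Returns 0 on success, -1 if the FIFO is empty. */
int toy_fifo_read(toy_fifo_t *fifo, int *value)
{
    if (fifo->count == 0) {
        return -1;
    }
    *value = fifo->slots[fifo->head];
    fifo->head = (fifo->head + 1) % TOY_FIFO_SIZE;
    fifo->count--;
    return 0;
}

/* Drains 'incoming' and forwards each item to 'acks'.
 * 'events' counts what was read; 'rc' only holds the write status.
 * A failed write is reported through last_status, never through the
 * return value, so the caller always gets a clean event count. */
int toy_progress(toy_fifo_t *incoming, toy_fifo_t *acks, int *last_status)
{
    int value, rc = 0, events = 0;

    while (toy_fifo_read(incoming, &value) == 0) {
        events++;
        rc = toy_fifo_write(acks, value);
        if (rc != 0) {
            *last_status = rc;
        }
    }
    return events;
}
```

If a single variable played both roles, a failed write would overwrite the count with -1 and the caller would see a busy round as an idle one.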
On Oct 30, 2009, at 15:22, Eugene Loh wrote:
> What is the significance of the btl_sm_component_progress() return
> code rc? It appears to be incremented each time something is read
> off the FIFO, but also it's the return code from writing to a FIFO.
> This seems kind of dual purpose.