Thanks.  So, you'll put back the change?

George Bosilca wrote:

The progress functions do not need to return an error code. If there is an error, they should propagate it back through the descriptors. The only meaning of the return code of the progress functions is to report whether any event happened during this round of progress. The opal_output uses it to trigger the yield (if one is necessary).

Anyway, this is a good catch. We're doing a big mixup there. Attached you will find a patch that cleans up this problem. As expected, there is no performance impact ...

Here is the patch:

Index: btl_sm_component.c
===================================================================
--- btl_sm_component.c    (revision 22176)
+++ btl_sm_component.c    (working copy)
@@ -361,7 +361,7 @@
      sm_fifo_t *fifo = NULL;
      mca_btl_sm_hdr_t *hdr;
      int my_smp_rank = mca_btl_sm_component.my_smp_rank;
-    int peer_smp_rank, j, rc = 0;
+    int peer_smp_rank, j, rc = 0, events = 0;

      /* first, deal with any pending sends */
      /* This check should be fast since we only need to check one variable. */
@@ -399,7 +399,7 @@
              continue;
          }

-        rc++;
+        events++;
          /* dispatch fragment by type */
          switch(((uintptr_t)hdr) & MCA_BTL_SM_FRAG_TYPE_MASK) {
              case MCA_BTL_SM_FRAG_SEND:
@@ -480,5 +480,5 @@
                  break;
          }
      }
-    return rc;
+    return events;
  }
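
To illustrate the convention the patch restores, here is a minimal sketch, not Open MPI code. All names (sketch_desc_t, sketch_progress, sketch_progress_round) are hypothetical, and the "odd-indexed descriptors fail" rule is only an assumption used to produce an error. The progress function returns nothing but an event count, errors are recorded in the descriptors, and the caller yields when a round produced no events.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical descriptor: errors are reported here, never through the
 * return value of the progress function. */
typedef struct {
    int status;     /* 0 on success, negative on error (assumed convention) */
    int completed;  /* 1 once the fragment has been handled */
} sketch_desc_t;

/* Hypothetical progress function: handles every pending descriptor and
 * returns only the number of events handled in this round. */
int sketch_progress(sketch_desc_t *pending, size_t n)
{
    int events = 0;

    assert(pending != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pending[i].completed) {
            continue;  /* already handled, so not a new event */
        }
        /* Assumption for illustration: odd-indexed descriptors fail.
         * The failure is stored in the descriptor, not returned. */
        pending[i].status = (i % 2 == 1) ? -1 : 0;
        pending[i].completed = 1;
        events++;
    }
    return events;
}

/* Hypothetical caller: uses the event count only to decide whether to
 * give up the CPU after an idle round. */
int sketch_progress_round(sketch_desc_t *pending, size_t n)
{
    int events = sketch_progress(pending, n);

    if (events <= 0) {
        sched_yield();  /* POSIX call; nothing happened this round */
    }
    return events;
}
```

Calling sketch_progress_round twice on the same array of fresh descriptors handles all of them in the first round and reports zero events in the second, which is the case where the caller yields. Keeping the error status out of the return value is what lets the count stay unambiguous.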
