On 7/16/2012 1:45 PM, Barrett, Brian W wrote:
It's unlikely that I will have time to fix this in the short term.  The
scheduling code is fairly localized in nbc.c if Oracle has some time to
spend looking at these issues.  If not, it might be best to remove the
libnbc code from 1.7, as it's unfortunately clear that it's not as ready
for integration as we believed and I don't have time to fix the code base.

Brian

Or both!  That is, I agree the code looks manageable and I'm inclined to
take a whack at it.  Nevertheless, the NBC stuff in v1.7 is in a painful
state.  Without CMRs, it does perhaps more harm than good.

On 7/16/12 2:50 PM, "Eugene Loh"<eugene....@oracle.com>  wrote:

The NBC functionality doesn't fare very well on SPARC.  One of the
problems is with data alignment.  An NBC schedule is a number of
variously sized fields laid out contiguously in linear memory (e.g.,
see nbc_internal.h or nbc.c), so words within it have no guaranteed
natural alignment.  On SPARC, the "default" (for some definition of that
word) is to raise SIGBUS when a word is accessed at an address that is
not properly aligned.  In any case (even non-SPARC), one might argue
that misaligned accesses and the subsequent exception handling are best
avoided.
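
For illustration only, here is one common way to read and write a
pointer-sized field at an arbitrary byte offset without issuing a
misaligned load or store: copy the bytes with memcpy.  This is a minimal
sketch, not what libnbc does today.  The names put_ptr and get_ptr are
hypothetical and do not exist in nbc.c or nbc_internal.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of libnbc): store a pointer value at
 * byte offset `off` inside a schedule buffer.  memcpy copies bytes, so
 * the destination does not need pointer alignment. */
static void put_ptr(char *sched, size_t off, void *p)
{
    assert(sched != NULL);
    memcpy(sched + off, &p, sizeof p);
}

/* Hypothetical helper (not part of libnbc): load a pointer value from
 * byte offset `off`.  The bytes are copied into a properly aligned
 * local variable before they are used as a pointer. */
static void *get_ptr(const char *sched, size_t off)
{
    void *p;
    assert(sched != NULL);
    memcpy(&p, sched + off, sizeof p);
    return p;
}
```

A caller would use put_ptr(buf, off, ptr) when building a schedule and
get_ptr(buf, off) when walking it, instead of casting buf + off to
void ** and dereferencing it.  The alternative is to pad the layout so
that every field is naturally aligned.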

Here are two specific issues.

*)  Schedule layout uses single-char delimiters between "round
schedules".  So, even if the first "round schedule" has nice alignment,
the second will have single-byte offsets for its components.

*)  8-byte pointers can fall on 4-byte boundaries.  E.g., say a schedule
starts on some "nice" alignment.  The first words of the schedule will be:

     int            total size of the schedule
     int            number of elements in the first round schedule
     enum           type of function
     void *         pointer to some buffer

So, with -m64 (and 4-byte int and enum), that 8-byte pointer sits at
byte offset 12, which is 4-byte aligned but not 8-byte aligned.
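
To make the arithmetic in the two issues above concrete, here is a small
sketch that computes where the pointer lands when the fields are packed
back to back, and where it would land if the offset were rounded up to
the pointer's alignment.  The function names are hypothetical, and the
field sizes are passed in as parameters rather than taken from any
libnbc header.

```c
#include <assert.h>
#include <stddef.h>

/* Round `off` up to the next multiple of `align`.
 * `align` must be a nonzero power of two. */
static size_t align_up(size_t off, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (off + align - 1) & ~(align - 1);
}

/* Offset of the pointer when the header fields
 *   int (total size), int (element count), enum (function type)
 * are laid out contiguously with no padding. */
static size_t packed_ptr_offset(size_t int_size, size_t enum_size)
{
    return int_size + int_size + enum_size;
}

/* Offset of the same pointer if padding is inserted so that it
 * starts on a multiple of `ptr_align`. */
static size_t padded_ptr_offset(size_t int_size, size_t enum_size,
                                size_t ptr_align)
{
    return align_up(packed_ptr_offset(int_size, enum_size), ptr_align);
}
```

With 4-byte int and enum, packed_ptr_offset(4, 4) is 12 and
padded_ptr_offset(4, 4, 8) is 16.  The same rounding applies after a
single-char delimiter: a round schedule that would otherwise start at
offset 17 could be moved to align_up(17, 8), which is 24.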

Any input/comments on how to proceed?