Re: [OMPI devel] Remote key sizes

Nathan T. Hjelm Tue, 8 Nov 2011 10:36:24 -0500


On Tue, 8 Nov 2011 06:36:03 -0800, Rolf vandeVaart <rvandeva...@nvidia.com>
wrote:
>>  george.
>>
>>PS: Regarding the hand-copy instead of the memcpy, we tried to avoid using
>>memcpy in performance critical codes, especially when we know the size of
>>the data and the alignment. This relieves the compiler of adding ugly
>>intrinsics, allowing it to nicely pipeline to load/stores. Anyway, with
>>both approaches you will copy more data than needed for all BTLs except
>>uGNI.
> 
> I was looking at a case in a BTL I was working on where I actually need
> 64 bytes (yes, bytes) as the remote key size as opposed to the current 16
> bytes (128 bits).
> Not sure how I can handle that yet.  (I assume configure is my friend, but
> even in that case, all headers will need to carry around the extra data.)
>


I have been thinking about this a little bit. What I think should be done
(and I am sure George will disagree) is to allow BTLs to define how long a
segment is. The PML would then just memcpy the segments into the send
buffer (instead of copying each member).

For example mca_btl_base_segment_t would become:

struct mca_btl_base_segment_t {
    size_t seg_len;
};

because the PML needs only the segment length and nothing else.

Each BTL would then define its own segment, for example:
struct mca_btl_ugni_segment_t {
    struct mca_btl_base_segment_t base;
    gni_mem_handle_t seg_key;
};

and we would add:
size_t btl_segment_len;

to mca_btl_base_module_t or to the base frag, so the PML knows how many
bytes it needs to copy.
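
To make the copy step concrete, here is a rough sketch of how the PML side
could look under this proposal. It is only an illustration: the
gni_mem_handle_t below is a stand-in so the snippet compiles without the
uGNI headers, the module struct is trimmed to the one new field, and
pml_pack_segments is a hypothetical helper name, not existing OMPI code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the uGNI memory handle (assumption); the real type comes
 * from the uGNI headers. */
typedef struct { uint64_t qword1; uint64_t qword2; } gni_mem_handle_t;

/* Base segment as proposed: the PML only looks at seg_len. */
struct mca_btl_base_segment_t {
    size_t seg_len;
};

/* BTL-specific segment: base first, then whatever the BTL needs. */
struct mca_btl_ugni_segment_t {
    struct mca_btl_base_segment_t base;
    gni_mem_handle_t seg_key;
};

/* Trimmed-down module (assumption) carrying only the new field. */
struct mca_btl_base_module_t {
    size_t btl_segment_len;   /* sizeof the BTL's own segment type */
};

/* Hypothetical PML helper: copy seg_count opaque segments into buf.
 * Returns the number of bytes written, or 0 if buf is too small. */
static size_t pml_pack_segments(const struct mca_btl_base_module_t *btl,
                                const void *segments, size_t seg_count,
                                void *buf, size_t buf_len)
{
    size_t total = btl->btl_segment_len * seg_count;

    if (total > buf_len) {
        return 0;
    }
    memcpy(buf, segments, total);
    return total;
}
```

The point of the sketch is that the PML never touches seg_key: it treats
each segment as btl_segment_len opaque bytes, so a BTL that needs a 64 byte
key only changes its own struct and the value it reports.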

This design would address George's criticism of the length of the seg_key
and also allow BTLs to do what they need to. It would require a memcpy, but
I disagree that this would slow the critical path. Even if it does, the
cost would be relatively minor (I think), and the flexibility is worth more
in the long run.

-Nathan
