Hi all,

I'm working on extending the OFI MTL to support FI_REMOTE_CQ_DATA (1) in order
to raise the number of ranks the MTL can support. The source rank is currently
limited to the 16 bits reserved for it in the OFI tag (2). Once the feature is
implemented, there will be no such limitation for providers that support both
FI_REMOTE_CQ_DATA and FI_DIRECTED_RECEIVE (3). However, there will be a
fallback mode for providers that do not support these features, and I would
like to get consensus on the default tag layout for that mode. This is my
proposal:

* 01234567 01 | 234567 01234567 0123 | 4567 | 01234567 01234567 01234567 01234567
* context_id  |     source rank      |proto |            message tag

#define MTL_OFI_CONTEXT_MASK            (0xFFC0000000000000ULL)
#define MTL_OFI_SOURCE_MASK             (0x003FFFF000000000ULL)
#define MTL_OFI_SOURCE_BITS_COUNT       (18) /* ranks 0..262,143 */
#define MTL_OFI_CONTEXT_BITS_COUNT      (10) /* context ids 0..1,023 */
#define MTL_OFI_TAG_BITS_COUNT          (32) /* no restrictions */
#define MTL_OFI_PROTO_BITS_COUNT        (4)
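
To make the layout concrete, here is a minimal sketch of how a 64-bit tag
could be packed and unpacked under this proposal. The SKETCH_* shifts and
masks and the sketch_* helpers are hypothetical names used only for
illustration, not existing MTL code. The source mask is written with 16 hex
digits so that it covers exactly bits 36..53.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths and masks taken from the proposal. */
#define MTL_OFI_CONTEXT_BITS_COUNT  (10)
#define MTL_OFI_SOURCE_BITS_COUNT   (18)
#define MTL_OFI_PROTO_BITS_COUNT    (4)
#define MTL_OFI_TAG_BITS_COUNT      (32)
#define MTL_OFI_CONTEXT_MASK        (0xFFC0000000000000ULL)
#define MTL_OFI_SOURCE_MASK         (0x003FFFF000000000ULL)

/* Hypothetical helpers: masks and shifts derived from the widths above. */
#define SKETCH_PROTO_MASK     (0x0000000F00000000ULL)
#define SKETCH_TAG_MASK       (0x00000000FFFFFFFFULL)
#define SKETCH_PROTO_SHIFT    (MTL_OFI_TAG_BITS_COUNT)                          /* 32 */
#define SKETCH_SOURCE_SHIFT   (SKETCH_PROTO_SHIFT + MTL_OFI_PROTO_BITS_COUNT)   /* 36 */
#define SKETCH_CONTEXT_SHIFT  (SKETCH_SOURCE_SHIFT + MTL_OFI_SOURCE_BITS_COUNT) /* 54 */

/* Pack the four fields into one 64-bit tag; asserts reject out-of-range values. */
static inline uint64_t sketch_pack_tag(uint32_t context_id, uint32_t source,
                                       uint32_t proto, uint32_t tag)
{
    assert(context_id < (1u << MTL_OFI_CONTEXT_BITS_COUNT));
    assert(source < (1u << MTL_OFI_SOURCE_BITS_COUNT));
    assert(proto < (1u << MTL_OFI_PROTO_BITS_COUNT));
    return ((uint64_t)context_id << SKETCH_CONTEXT_SHIFT) |
           ((uint64_t)source << SKETCH_SOURCE_SHIFT) |
           ((uint64_t)proto << SKETCH_PROTO_SHIFT) |
           (uint64_t)tag;
}

/* Extract each field again by masking and shifting. */
static inline uint32_t sketch_get_context(uint64_t t)
{
    return (uint32_t)((t & MTL_OFI_CONTEXT_MASK) >> SKETCH_CONTEXT_SHIFT);
}

static inline uint32_t sketch_get_source(uint64_t t)
{
    return (uint32_t)((t & MTL_OFI_SOURCE_MASK) >> SKETCH_SOURCE_SHIFT);
}

static inline uint32_t sketch_get_proto(uint64_t t)
{
    return (uint32_t)((t & SKETCH_PROTO_MASK) >> SKETCH_PROTO_SHIFT);
}

static inline uint32_t sketch_get_tag(uint64_t t)
{
    return (uint32_t)(t & SKETCH_TAG_MASK);
}
```

The four masks do not overlap and together cover all 64 bits, so packing the
maximum value of every field yields a tag with every bit set.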


- More ranks and fewer context ids than the current implementation.

- Moved the protocol bits away from the most significant bits, because some
  providers may reserve tag bits starting from the top (see mem_tag_format
  (4)), in which case sync send would not work.
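
The reasoning behind the second point can be sketched roughly as follows,
assuming that leading zero bits in a provider's mem_tag_format mark tag bits
the provider ignores or reserves. The helper names and the example format
value are hypothetical; a real check would read mem_tag_format from the
provider's endpoint attributes.

```c
#include <assert.h>
#include <stdint.h>

/* Usable low-order tag bits: 64 minus the leading zero bits of
 * mem_tag_format. Assumption: leading zeros mark bits the provider
 * reserves, so they cannot carry MTL data. */
static int sketch_usable_tag_bits(uint64_t mem_tag_format)
{
    int bits = 0;
    while (mem_tag_format != 0) {
        bits++;
        mem_tag_format >>= 1;
    }
    return bits;
}

/* Return 1 if a field whose highest bit index is top_bit (0 = least
 * significant) lies entirely below the reserved bits, 0 otherwise. */
static int sketch_field_usable(uint64_t mem_tag_format, int top_bit)
{
    assert(top_bit >= 0 && top_bit < 64);
    return sketch_usable_tag_bits(mem_tag_format) > top_bit;
}
```

For a hypothetical provider reporting 0x00FFFFFFFFFFFFFF, protocol bits
placed at 60..63 would fall in the reserved range
(sketch_field_usable(fmt, 63) is 0), while protocol bits at 32..35 would
remain usable (sketch_field_usable(fmt, 35) is 1).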


Today we had a call with Howard (LANL), John and Hamuri (HPE) and briefly
discussed this. We also thought it would be useful to send this email to the
list, to reach other developers who are keeping an eye on OFI support in OMPI.


(1)    https://ofiwg.github.io/libfabric/master/man/fi_cq.3.html


(3)    https://ofiwg.github.io/libfabric/master/man/fi_getinfo.3.html

(4)    https://ofiwg.github.io/libfabric/master/man/fi_endpoint.3.html
