> It is a predefined attribute and should be automatically set by the MPI layer
> using the pml_max_tag field of the selected PML.
In the MTLs this is set at registration time in
struct mca_mtl_base_module_t {
…
int mtl_max_tag; /**< maximum tag value. note that negative tags
must be allowed */
However, I found something that seems to be wrong, or I’m missing something.
The current implementation supports 32 bits but is setting (1UL << 30).
Shouldn’t this actually be (1 <<31) -1 ?
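To make the two candidate values concrete, here is a minimal sketch comparing them. The function names are hypothetical and are not part of Open MPI. It assumes a 32-bit signed tag, so the largest non-negative value is 2^31 - 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest non-negative value that fits in `bits`
 * value bits (valid for 1..63). */
static inline uint64_t max_value_for_bits(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (UINT64_C(1) << bits) - 1;
}

/* Value currently set, as quoted above: 1UL << 30, i.e. 2^30. */
static inline uint64_t current_max_tag(void)
{
    return UINT64_C(1) << 30;
}

/* Value asked about: 2^31 - 1, the largest non-negative 32-bit signed
 * integer. Writing (1 << 31) with a plain int constant would overflow a
 * 32-bit signed int, so a 64-bit unsigned constant is used instead. */
static inline uint64_t proposed_max_tag(void)
{
    return max_value_for_bits(31);
}
```

With these definitions current_max_tag() is 1073741824 and proposed_max_tag() is 2147483647, so the current setting leaves roughly half of the non-negative 32-bit range unused.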
>As I mentioned, the PML (OB1) only supports 16-bit tags (in fact 15, because
>negative tags are reserved for OMPI internal usage). I do not recall any
>complaints about this limit. Targeting consistency across PMLs provides
>user-friendliness, thus a default of 16 bits for the tag and then everything
>else for the cid might be a sensible choice
Ok, 26 bits cid | 18 bits source rank | 4 bits proto | 16 bits tag for the
default, plus a build-time option to use the one I proposed in my first email.
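As a rough illustration of the 26/18/4/16 split, the sketch below derives shifts and masks from the field widths and packs the four fields into one 64-bit value. All ALT_* and alt_* names are hypothetical and do not come from the OFI MTL sources. The field order (cid in the most significant bits, tag in the least significant) is assumed from the order in which the fields are listed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, listed from most to least significant. */
enum {
    ALT_CID_BITS    = 26,
    ALT_SOURCE_BITS = 18,
    ALT_PROTO_BITS  = 4,
    ALT_TAG_BITS    = 16
};

/* Shifts follow from the widths; the tag sits in the lowest bits. */
enum {
    ALT_TAG_SHIFT    = 0,
    ALT_PROTO_SHIFT  = ALT_TAG_BITS,
    ALT_SOURCE_SHIFT = ALT_PROTO_SHIFT + ALT_PROTO_BITS,
    ALT_CID_SHIFT    = ALT_SOURCE_SHIFT + ALT_SOURCE_BITS
};

/* Mask covering `bits` bits starting at bit position `shift`. */
static inline uint64_t alt_field_mask(unsigned bits, unsigned shift)
{
    return ((UINT64_C(1) << bits) - 1) << shift;
}

/* Pack the four fields into a single 64-bit value. Each argument must
 * fit in its field width. */
static inline uint64_t alt_pack(uint32_t cid, uint32_t source,
                                uint32_t proto, uint32_t tag)
{
    assert(cid    < (UINT32_C(1) << ALT_CID_BITS));
    assert(source < (UINT32_C(1) << ALT_SOURCE_BITS));
    assert(proto  < (UINT32_C(1) << ALT_PROTO_BITS));
    assert(tag    < (UINT32_C(1) << ALT_TAG_BITS));
    return ((uint64_t)cid    << ALT_CID_SHIFT)
         | ((uint64_t)source << ALT_SOURCE_SHIFT)
         | ((uint64_t)proto  << ALT_PROTO_SHIFT)
         | ((uint64_t)tag    << ALT_TAG_SHIFT);
}
```

The widths add up to 64 bits, so the four masks cover the whole value without overlap, and the cid mask comes out as 0xFFFFFFC000000000.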
Thanks,
_MAC
From: devel [mailto:[email protected]] On Behalf Of George
Bosilca
Sent: Sunday, March 04, 2018 9:08 AM
To: Open MPI Developers <[email protected]>
Subject: Re: [OMPI devel] Default tag for OFI MTL
On Sat, Mar 3, 2018 at 6:35 PM, Cabral, Matias A
<[email protected]<mailto:[email protected]>> wrote:
Hi George,
Thanks for the feedback, appreciated. Few questions/comments:
> Regarding the tag with your proposal the OFI MTL will support a wider range
> of tags than the OB1 PML, where we are limited to 16 bits. Just make sure you
> correctly expose your tag limit via the MPI_TAG_UB.
I will take a look at MPI_TAG_UB.
It is a predefined attribute and should be automatically set by the MPI layer
using the pml_max_tag field of the selected PML.
> I personally would prefer a solution where we can alter the distribution of
> bits between bits in the cid and tag at compile time.
Sure, I can do this. What would you suggest for plan B? Fewer tag bits and more
cid ones? Numbers?
As I mentioned, the PML (OB1) only supports 16-bit tags (in fact 15, because
negative tags are reserved for OMPI internal usage). I do not recall any
complaints about this limit. Targeting consistency across PMLs provides
user-friendliness, thus a default of 16 bits for the tag and then everything
else for the cid might be a sensible choice.
George.
> We can also envision this selection to be driven by an MCA parameter, but
> this might be too costly
I did think about it. However, as you say, I’m not yet convinced it is worth it:
a) I will soon be reviewing the synchronous send protocol. I have not reviewed
it thoroughly yet, but I’m quite sure I can reduce it to use 2 bits (maybe just
1), freeing 2 (or 3) more bits for cids or ranks.
b) Most of the providers TODAY effectively support FI_REMOTE_CQ_DATA and
FI_DIRECTED_RECV (psm2, gni, verbs;ofi_rxm, sockets). This is just a fallback
for potential new ones. FI_DIRECTED_RECV is necessary to discriminate the
source at RX time when the source is not in the tag.
c) I will include build_time_plan_B you just suggested ;)
Thanks, again.
_MAC
From: devel
[mailto:[email protected]<mailto:[email protected]>]
On Behalf Of George Bosilca
Sent: Saturday, March 03, 2018 6:29 AM
To: Open MPI Developers
<[email protected]<mailto:[email protected]>>
Subject: Re: [OMPI devel] Default tag for OFI MTL
Hi Matias,
Relaxing the restriction on the number of ranks is definitely a good thing.
The cost will be reflected on the number of communicators and tags, and we must
be careful how we balance this.
Assuming context_id is the communicator cid, with 10 bits you can only support
1024. A little low, even lower than MVAPICH. The way we allocate cids is very
sparse, and with a limited number of possible cids, we might run into trouble
very quickly for the few applications that use a large number of
communicators, and for the resilience support. Yet another reason to revisit
the cid allocation in the short term.
Regarding the tag with your proposal the OFI MTL will support a wider range of
tags than the OB1 PML, where we are limited to 16 bits. Just make sure you
correctly expose your tag limit via the MPI_TAG_UB.
I personally would prefer a solution where we can alter the distribution of
bits between bits in the cid and tag at compile time. We can also envision this
selection to be driven by an MCA parameter, but this might be too costly.
George.
On Sat, Mar 3, 2018 at 2:56 AM, Cabral, Matias A
<[email protected]<mailto:[email protected]>> wrote:
Hi all,
I’m working on extending the OFI MTL to support FI_REMOTE_CQ_DATA (1) to extend
the number of ranks currently supported by the MTL, which is currently limited
to the 16 bits included in the OFI tag (2). After the feature is implemented
there will be no limitation for providers that support FI_REMOTE_CQ_DATA and
FI_DIRECTED_RECV (3). However, there will be a fallback mode for providers
that do not support these features and I would like to get consensus on the
default tag distribution. This is my proposal:
* Default: No FI_REMOTE_CQ_DATA
* 01234567 01| 234567 01234567 0123| 4567 |01234567 01234567 01234567 01234567
* context_id | source rank |proto| message tag
#define MTL_OFI_CONTEXT_MASK (0xFFC0000000000000ULL)
#define MTL_OFI_SOURCE_MASK (0x003FFFF000000000ULL)
#define MTL_OFI_SOURCE_BITS_COUNT (18) /* 262,143 ranks */
#define MTL_OFI_CONTEXT_BITS_COUNT (10) /* 1,023 communicators */
#define MTL_OFI_TAG_BITS_COUNT (32) /* no restrictions */
#define MTL_OFI_PROTO_BITS_COUNT (4)
Notes:
- More ranks and fewer context ids than the current implementation.
- Moved the protocol bits from the most significant bits because some
providers may reserve starting from there (see mem_tag_format (4)) and sync
send will not work.
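For reference, the 10/18/4/32 fallback layout above can be sketched as an encode/decode pair. The FB_* and fb_* names are hypothetical and do not come from mtl_ofi_types.h. Shifts and masks are derived here from the bit counts, assuming the fields are laid out from most to least significant as in the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths from the proposal above (hypothetical names). */
#define FB_CONTEXT_BITS 10
#define FB_SOURCE_BITS  18
#define FB_PROTO_BITS    4
#define FB_TAG_BITS     32

/* Shifts derived from the widths; the message tag is in the low 32 bits. */
#define FB_PROTO_SHIFT   FB_TAG_BITS
#define FB_SOURCE_SHIFT  (FB_PROTO_SHIFT + FB_PROTO_BITS)
#define FB_CONTEXT_SHIFT (FB_SOURCE_SHIFT + FB_SOURCE_BITS)

/* Mask covering `bits` bits starting at bit position `shift`. */
#define FB_FIELD_MASK(bits, shift) (((UINT64_C(1) << (bits)) - 1) << (shift))

struct fb_fields {
    uint32_t context_id;
    uint32_t source;
    uint32_t proto;
    uint32_t tag;
};

/* Combine the four fields into one 64-bit value. */
static inline uint64_t fb_encode(struct fb_fields f)
{
    assert(f.context_id < (UINT32_C(1) << FB_CONTEXT_BITS));
    assert(f.source     < (UINT32_C(1) << FB_SOURCE_BITS));
    assert(f.proto      < (UINT32_C(1) << FB_PROTO_BITS));
    return ((uint64_t)f.context_id << FB_CONTEXT_SHIFT)
         | ((uint64_t)f.source     << FB_SOURCE_SHIFT)
         | ((uint64_t)f.proto      << FB_PROTO_SHIFT)
         |  (uint64_t)f.tag;
}

/* Recover the four fields from a 64-bit value. */
static inline struct fb_fields fb_decode(uint64_t bits)
{
    struct fb_fields f;
    f.context_id = (uint32_t)((bits & FB_FIELD_MASK(FB_CONTEXT_BITS, FB_CONTEXT_SHIFT)) >> FB_CONTEXT_SHIFT);
    f.source     = (uint32_t)((bits & FB_FIELD_MASK(FB_SOURCE_BITS, FB_SOURCE_SHIFT)) >> FB_SOURCE_SHIFT);
    f.proto      = (uint32_t)((bits & FB_FIELD_MASK(FB_PROTO_BITS, FB_PROTO_SHIFT)) >> FB_PROTO_SHIFT);
    f.tag        = (uint32_t)(bits & FB_FIELD_MASK(FB_TAG_BITS, 0));
    return f;
}
```

Deriving the masks from the widths keeps them consistent with each other: the context mask comes out as 0xFFC0000000000000 and the source mask as 0x003FFFF000000000, i.e. 18 bits starting at bit 36.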
Thoughts?
Today we had a call with Howard (LANL), John and Hamuri (HPE) and briefly
talked about this, and also thought about sending this email as a query to find
other developers keeping an eye on OFI support in OMPI.
Thanks,
_MAC
(1) https://ofiwg.github.io/libfabric/master/man/fi_cq.3.html
(2)
https://github.com/open-mpi/ompi/blob/master/ompi/mca/mtl/ofi/mtl_ofi_types.h#L70
(3) https://ofiwg.github.io/libfabric/master/man/fi_getinfo.3.html
(4) https://ofiwg.github.io/libfabric/master/man/fi_endpoint.3.html
_______________________________________________
devel mailing list
[email protected]<mailto:[email protected]>
https://lists.open-mpi.org/mailman/listinfo/devel