On Sat, Mar 3, 2018 at 6:35 PM, Cabral, Matias A <matias.a.cab...@intel.com>
wrote:

> Hi George,
>
>
>
> Thanks for the feedback, appreciated.  Few questions/comments:
>
>
>
> > Regarding the tag, with your proposal the OFI MTL will support a wider
> range of tags than the OB1 PML, where we are limited to 16 bits. Just make
> sure you correctly expose your tag limit via MPI_TAG_UB.
>
>
>
> I will take a look at MPI_TAG_UB.
>

It is a predefined attribute and should be automatically set by the MPI
layer using the pml_max_tag field of the selected PML.


> > I personally would prefer a solution where we can alter the
> distribution of bits between the cid and the tag at compile time.
>
>
>
> Sure, I can do this. What would you suggest for plan B? Fewer tag bits and
> more cid ones? Numbers?
>

As I mentioned, the OB1 PML only supports 16-bit tags (in fact 15, because
negative tags are reserved for OMPI internal usage). I do not recall any
complaints about this limit. Consistency across PMLs makes things
friendlier for users, so a default of 16 bits for the tag and everything
else for the cid might be a sensible choice.
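
A minimal sketch of what such a compile-time split could look like, for
illustration only. All SKETCH_* names are hypothetical and are not taken
from the OMPI sources. It assumes the source-rank and protocol widths stay
at the 18 and 4 bits of the fallback proposal, and that one tag bit is set
aside for negative (internal) tags, as with OB1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical build-time knob: override with -DSKETCH_TAG_BITS=N. */
#ifndef SKETCH_TAG_BITS
#define SKETCH_TAG_BITS 16
#endif

/* Assumed fixed widths, copied from the fallback proposal in this thread. */
#define SKETCH_PROTO_BITS  4
#define SKETCH_SOURCE_BITS 18

/* Everything not used by tag, protocol and source rank goes to the cid. */
#define SKETCH_CID_BITS \
    (64 - SKETCH_TAG_BITS - SKETCH_PROTO_BITS - SKETCH_SOURCE_BITS)

static_assert(SKETCH_TAG_BITS >= 2, "need at least one usable tag bit");
static_assert(SKETCH_CID_BITS >= 1, "no bits left for the cid");

/* Field positions, least significant field first. */
#define SKETCH_TAG_SHIFT    0
#define SKETCH_PROTO_SHIFT  (SKETCH_TAG_SHIFT + SKETCH_TAG_BITS)
#define SKETCH_SOURCE_SHIFT (SKETCH_PROTO_SHIFT + SKETCH_PROTO_BITS)
#define SKETCH_CID_SHIFT    (SKETCH_SOURCE_SHIFT + SKETCH_SOURCE_BITS)

/* Build a mask of `bits` ones starting at bit `shift` (bits < 64). */
#define SKETCH_MASK(bits, shift) \
    ((((uint64_t)1 << (bits)) - 1) << (shift))

#define SKETCH_TAG_MASK    SKETCH_MASK(SKETCH_TAG_BITS, SKETCH_TAG_SHIFT)
#define SKETCH_PROTO_MASK  SKETCH_MASK(SKETCH_PROTO_BITS, SKETCH_PROTO_SHIFT)
#define SKETCH_SOURCE_MASK SKETCH_MASK(SKETCH_SOURCE_BITS, SKETCH_SOURCE_SHIFT)
#define SKETCH_CID_MASK    SKETCH_MASK(SKETCH_CID_BITS, SKETCH_CID_SHIFT)

/* Largest user-visible tag, assuming one bit is reserved for negative tags. */
static inline uint64_t sketch_max_user_tag(void)
{
    return ((uint64_t)1 << (SKETCH_TAG_BITS - 1)) - 1;
}
```

With the default of 16 tag bits this leaves 26 bits for the cid and a
largest user tag of 32767, which is the kind of value one would expect to
see reported through MPI_TAG_UB.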

George.


> > We can also envision this selection to be driven by an MCA parameter,
> but this might be too costly
>
>
>
> I did think about it. However, as you say, I’m not yet convinced it is
> worth it:
>
> a)      I will soon be reviewing the synchronous send protocol. I have not
> reviewed it thoroughly yet, but I’m quite sure I can reduce it to 2 bits
> (maybe just 1), freeing 2 (or 3) more bits for cids or ranks.
>
> b)      Most of the providers TODAY effectively support FI_REMOTE_CQ_DATA
> and FI_DIRECTED_RECV (psm2, gni, verbs;ofi_rxm, sockets). This is just a
> fallback for potential new ones.  FI_DIRECTED_RECV is necessary to
> discriminate the source at RX time when the source is not in the tag.
>
> c)       I will include build_time_plan_B you just suggested ;)
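
To illustrate point (b): in the fallback mode the source rank lives in the
tag itself, so a receive picks out its sender through the tag bits, and a
wildcard source is expressed by ignoring those bits. The sketch below
models the tag/ignore matching rule used by tagged receives. It is a
stand-alone model, not libfabric or OMPI code; the sketch_* names are
hypothetical and the mask is the source field of the fallback layout
proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Source-rank field of the proposed fallback layout (18 bits at bit 36). */
#define SKETCH_SOURCE_MASK  0x003FFFF000000000ULL
#define SKETCH_SOURCE_SHIFT 36

/* Place a rank into the source field; a larger rank would spill over. */
static inline uint64_t sketch_source_field(uint32_t rank)
{
    assert(rank < (1u << 18));
    return (uint64_t)rank << SKETCH_SOURCE_SHIFT;
}

/* A message matches a posted receive when every bit outside the ignore
 * mask is identical in both tags. */
static inline bool sketch_tag_matches(uint64_t posted_tag, uint64_t ignore,
                                      uint64_t msg_tag)
{
    return ((posted_tag ^ msg_tag) & ~ignore) == 0;
}

/* Ignore mask for a receive: all-zero for a specific source, the whole
 * source field when any source is acceptable. */
static inline uint64_t sketch_ignore_for(bool any_source)
{
    return any_source ? SKETCH_SOURCE_MASK : 0;
}
```

When the source is not in the tag, this kind of filtering is no longer
possible on the tag alone, which is why the provider has to match on the
sender address instead.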
>
>
>
> Thanks, again.
>
>
>
> _MAC
>
>
>
> *From:* devel [mailto:devel-boun...@lists.open-mpi.org] *On Behalf Of *George
> Bosilca
> *Sent:* Saturday, March 03, 2018 6:29 AM
> *To:* Open MPI Developers <devel@lists.open-mpi.org>
> *Subject:* Re: [OMPI devel] Default tag for OFI MTL
>
>
>
> Hi Matias,
>
>
>
> Relaxing the restriction on the number of ranks is definitely a good
> thing. The cost will be reflected in the number of communicators and tags,
> and we must be careful how we balance this.
>
>
>
> Assuming context_id is the communicator cid, with 10 bits you can only
> support 1024 communicators. That is a little low, even lower than MVAPICH.
> The way we allocate cids is very sparse, and with a limited number of
> possible cids we might run into trouble very quickly for the few
> applications that use a large number of communicators, and for the
> resilience support. Yet another reason to revisit the cid allocation in
> the short term.
>
>
>
> Regarding the tag, with your proposal the OFI MTL will support a wider
> range of tags than the OB1 PML, where we are limited to 16 bits. Just make
> sure you correctly expose your tag limit via MPI_TAG_UB.
>
>
>
> I personally would prefer a solution where we can alter the distribution
> of bits between the cid and the tag at compile time. We can also
> envision this selection to be driven by an MCA parameter, but this might be
> too costly.
>
>   George.
>
>
>
>
>
>
>
>
>
> On Sat, Mar 3, 2018 at 2:56 AM, Cabral, Matias A <
> matias.a.cab...@intel.com> wrote:
>
> Hi all,
>
>
>
> I’m working on extending the OFI MTL to support FI_REMOTE_CQ_DATA (1) to
> extend the number of ranks supported by the MTL, which is currently
> limited to the 16 bits included in the OFI tag (2). After the feature is
> implemented there will be no limitation for providers that support
> FI_REMOTE_CQ_DATA and FI_DIRECTED_RECV (3). However, there will be a
> fallback mode for providers that do not support these features and I would
> like to get consensus on the default tag distribution. This is my proposal:
>
>
>
> * Default: No FI_REMOTE_CQ_DATA
>
> * 01234567 01| 234567 01234567 0123| 4567 |01234567 01234567 01234567 01234567
>
> * context_id  |     source rank     |proto|           message tag
>
>
>
> #define MTL_OFI_CONTEXT_MASK            (0xFFC0000000000000ULL)
>
> #define MTL_OFI_SOURCE_MASK             (0x003FFFF000000000ULL)
>
> #define MTL_OFI_SOURCE_BITS_COUNT       (18) /* 262,143 ranks */
>
> #define MTL_OFI_CONTEXT_BITS_COUNT      (10) /* 1,023 communicators */
>
> #define MTL_OFI_TAG_BITS_COUNT          (32) /* no restrictions */
>
> #define MTL_OFI_PROTO_BITS_COUNT        (4)
>
>
>
> Notes:
>
> -          More ranks and fewer context ids than the current
> implementation.
>
> -          Moved the protocol bits from the most significant bits because
> some providers may reserve starting from there (see mem_tag_format (4)) and
> sync send will not work.
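
For readers following along, the proposed layout can be exercised with a
small pack/unpack sketch. The field widths and the context and source
masks follow the proposal above (with the source mask written as 16 hex
digits); the protocol and tag masks, the shift values, and all sketch_*
names are assumptions derived from the bit diagram, not code from the MTL.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions derived from the diagram: 10 | 18 | 4 | 32 bits. */
#define SKETCH_TAG_SHIFT      0
#define SKETCH_PROTO_SHIFT    32
#define SKETCH_SOURCE_SHIFT   36
#define SKETCH_CONTEXT_SHIFT  54

#define SKETCH_CONTEXT_MASK 0xFFC0000000000000ULL
#define SKETCH_SOURCE_MASK  0x003FFFF000000000ULL
#define SKETCH_PROTO_MASK   0x0000000F00000000ULL /* assumed from diagram */
#define SKETCH_TAG_MASK     0x00000000FFFFFFFFULL /* assumed from diagram */

/* Combine the four fields into one 64-bit match tag. */
static inline uint64_t sketch_pack(uint32_t cid, uint32_t source,
                                   uint32_t proto, uint32_t tag)
{
    /* Out-of-range values would silently corrupt a neighbouring field. */
    assert(cid < (1u << 10));
    assert(source < (1u << 18));
    assert(proto < (1u << 4));
    return ((uint64_t)cid << SKETCH_CONTEXT_SHIFT)
         | ((uint64_t)source << SKETCH_SOURCE_SHIFT)
         | ((uint64_t)proto << SKETCH_PROTO_SHIFT)
         | (uint64_t)tag;
}

/* Extract the individual fields again. */
static inline uint32_t sketch_cid(uint64_t t)
{
    return (uint32_t)((t & SKETCH_CONTEXT_MASK) >> SKETCH_CONTEXT_SHIFT);
}

static inline uint32_t sketch_source(uint64_t t)
{
    return (uint32_t)((t & SKETCH_SOURCE_MASK) >> SKETCH_SOURCE_SHIFT);
}

static inline uint32_t sketch_proto(uint64_t t)
{
    return (uint32_t)((t & SKETCH_PROTO_MASK) >> SKETCH_PROTO_SHIFT);
}

static inline uint32_t sketch_tag(uint64_t t)
{
    return (uint32_t)(t & SKETCH_TAG_MASK);
}
```

Because the protocol bits sit below the source rank rather than at the top
of the word, a provider that reserves high-order tag bits would eat into
the context id first and leave the protocol bits intact.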
>
>
>
> Thoughts?
>
>
>
> Today we had a call with Howard (LANL), John and Hamuri (HPE) and briefly
> talked about this, and also thought about sending this email as a query to
> find other developers keeping an eye on OFI support in OMPI.
>
>
>
> Thanks,
>
> _MAC
>
>
>
>
>
> (1)    https://ofiwg.github.io/libfabric/master/man/fi_cq.3.html
>
> (2)    https://github.com/open-mpi/ompi/blob/master/ompi/mca/mtl/ofi/mtl_ofi_types.h#L70
>
> (3)    https://ofiwg.github.io/libfabric/master/man/fi_getinfo.3.html
>
> (4)    https://ofiwg.github.io/libfabric/master/man/fi_endpoint.3.html
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> devel mailing list
> devel@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/devel
>
>
>
