I haven't looked at that yet. Would be great to get the new osc component
working over both btls and mtls. I know portals supports atomics but I
don't know whether psm does.

-Nathan

On Thu, Nov 06, 2014 at 08:45:15PM +0200, Mike Dubman wrote:
>    btw, do you plan to add atomics API to MTL layer as well?
>    On Thu, Nov 6, 2014 at 5:23 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
> 
>      At the moment I select the lowest latency BTL that can reach all of the
>      ranks in the communicator used to create the window. I can add code to
>      round-robin windows over the available BTLs on multi-rail systems.
> 
>      -Nathan
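
The selection rule described above (lowest-latency BTL that reaches every rank in the communicator) could look roughly like the sketch below. It is an illustration only: `sketch_btl_module`, its fields, and `sketch_select_btl` are hypothetical stand-ins, not the Open MPI structures or the code in the osc component.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a BTL module (not the Open MPI struct). */
struct sketch_btl_module {
    const char *name;
    uint32_t latency;      /* lower is better */
    const bool *reaches;   /* reaches[i] is true if peer i is reachable */
};

/* Return the lowest-latency module that can reach every peer, or NULL if
 * no single module reaches all of them. */
static const struct sketch_btl_module *
sketch_select_btl(const struct sketch_btl_module *modules, size_t nmodules,
                  size_t npeers)
{
    const struct sketch_btl_module *best = NULL;

    assert(NULL != modules || 0 == nmodules);

    for (size_t m = 0; m < nmodules; ++m) {
        bool reaches_all = true;

        for (size_t p = 0; p < npeers; ++p) {
            if (!modules[m].reaches[p]) {
                reaches_all = false;
                break;
            }
        }

        /* Keep the candidate only if it beats the current best latency. */
        if (reaches_all && (NULL == best || modules[m].latency < best->latency)) {
            best = &modules[m];
        }
    }

    return best;
}
```

Because a single module is chosen per window, all atomics on that window go through the same module, which is the constraint raised in the quoted question below.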
>      On Wed, Nov 05, 2014 at 06:38:25PM -0800, Paul Hargrove wrote:
>      >    All atomics must be done through not just "the same btl" but the
>      >    same btl MODULE, since atomics from two IB HCAs, for instance, are
>      >    not necessarily coherent. So, how is the "best" one to be selected?
>      >
>      >    -Paul [Sent from my phone]
>      >
>      >    On Nov 5, 2014 7:15 AM, "Nathan Hjelm" <hje...@lanl.gov> wrote:
>      >
>      >      In the new osc component I don't try to handle that case. All
>      >      atomics have to be done through the same btl (including atomics on
>      >      self). I did this because with the default setup of Gemini they
>      >      cannot be mixed. If it is possible to mix them with other networks
>      >      I would be happy to add an atomic flag for that.
>      >
>      >      -Nathan
>      >
>      >      On Wed, Nov 05, 2014 at 03:41:58AM -0500, Joshua Ladd wrote:
>      >      >    Quick question. Out of curiosity, how do you handle the (common)
>      >      >    case of mixing network atomics with CPU atomics? Say for a single
>      >      >    target with two initiators, one initiator is on host with the
>      >      >    target, so goes through the SM BTL, and the other initiator is
>      >      >    off host, so goes through the network BTL.
>      >      >
>      >      >    Josh
>      >      >    On Tue, Nov 4, 2014 at 6:46 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
>      >      >
>      >      >      What: Completely revamp the BTL RDMA interface (btl_put, btl_get)
>      >      >      to better match what is needed for MPI one-sided.
>      >      >
>      >      >      Why: I am preparing to push an enhanced MPI-3 one-sided component
>      >      >      that makes use of network rdma and atomic operations to provide a
>      >      >      fast, truly one-sided implementation. Before I can push this
>      >      >      component I want to change the btl interface to:
>      >      >
>      >      >       - Provide access to network atomic operations. I only need add
>      >      >         and cswap but the interface can be extended to any number of
>      >      >         operations.
>      >      >
>      >      >         The new interface provides three new functions: btl_atomic_op,
>      >      >         btl_atomic_fop, and btl_atomic_cswap. Additionally there are
>      >      >         two new btl_flags to indicate available atomic support:
>      >      >         MCA_BTL_FLAGS_ATOMIC_OPS, and MCA_BTL_FLAGS_ATOMIC_FOPS. The
>      >      >         btl_atomics_flags field has been added to indicate which
>      >      >         atomic operations are supported (see mca_btl_base_atomic_op_t).
>      >      >         At this time I only added support for 64-bit integer atomics
>      >      >         but I am open to adding support for 32-bit as well.
>      >      >
>      >      >       - Provide an interface that will allow simultaneous put/get
>      >      >         operations without extra calls into the btl. The current
>      >      >         interface requires the btl user to call prepare_src/prepare_dst
>      >      >         before every rdma operation. In some cases this is a complete
>      >      >         waste (vader, sm with CMA, knem, or xpmem).
>      >      >
>      >      >         I separated the registration of memory from the segment info.
>      >      >         More information is provided below. The new put/get functions
>      >      >         have the following signatures:
>      >      >
>      >      >      typedef int (*mca_btl_base_module_put_fn_t) (
>      >      >          struct mca_btl_base_module_t *btl,
>      >      >          struct mca_btl_base_endpoint_t *endpoint,
>      >      >          void *local_address, uint64_t remote_address,
>      >      >          struct mca_btl_base_registration_handle_t *local_handle,
>      >      >          struct mca_btl_base_registration_handle_t *remote_handle,
>      >      >          size_t size, int flags, int order,
>      >      >          mca_btl_base_rdma_completion_fn_t cbfunc,
>      >      >          void *cbcontext, void *cbdata);
>      >      >
>      >      >      typedef int (*mca_btl_base_module_get_fn_t) (
>      >      >          struct mca_btl_base_module_t *btl,
>      >      >          struct mca_btl_base_endpoint_t *endpoint,
>      >      >          void *local_address, uint64_t remote_address,
>      >      >          struct mca_btl_base_registration_handle_t *local_handle,
>      >      >          struct mca_btl_base_registration_handle_t *remote_handle,
>      >      >          size_t size, int flags, int order,
>      >      >          mca_btl_base_rdma_completion_fn_t cbfunc,
>      >      >          void *cbcontext, void *cbdata);
>      >      >
>      >      >      typedef void (*mca_btl_base_rdma_completion_fn_t)(
>      >      >          struct mca_btl_base_module_t* module,
>      >      >          struct mca_btl_base_endpoint_t* endpoint,
>      >      >          void *local_address,
>      >      >          struct mca_btl_base_registration_handle_t *local_handle,
>      >      >          void *context,
>      >      >          void *cbdata,
>      >      >          int status);
>      >      >
>      >      >         I may modify the completion function to provide more
>      >      >         information on the completed operation (size).
>      >      >
>      >      >       - Allow the registration of an entire region even if the region
>      >      >         cannot be modified with a single rdma operation. At this time
>      >      >         prepare_src and prepare_dst may modify the size and register a
>      >      >         smaller region. This will not work.
>      >      >
>      >      >         This is done in the new interface through the new
>      >      >         btl_register_mem and btl_deregister_mem interfaces. The
>      >      >         btl_register_mem interface returns a registration handle of
>      >      >         size btl_registration_handle_size that can be used as either
>      >      >         the local_handle or remote_handle to any rdma/atomic function.
>      >      >         BTLs that do not provide these functions do not require
>      >      >         registration for rdma/atomic operations.
>      >      >
>      >      >      typedef struct mca_btl_base_registration_handle_t *
>      >      >      (*mca_btl_base_module_register_mem_fn_t)(
>      >      >          struct mca_btl_base_module_t* btl,
>      >      >          struct mca_btl_base_endpoint_t *endpoint,
>      >      >          void *base, size_t size, uint32_t flags);
>      >      >
>      >      >       - Expose the limitations of the put and get operations so the
>      >      >         caller can make decisions before trying a get or put
>      >      >         operation. Two examples: the Gemini interconnect has an
>      >      >         alignment restriction on get, openib devices may have a limit
>      >      >         on how large a single get/put operation can be. The current
>      >      >         interface sort of gives the put limit but it is tied to the
>      >      >         rdma pipeline protocol.
>      >      >
>      >      >         This is done in the new interface by providing btl_get_limit,
>      >      >         btl_get_alignment, btl_put_limit, and btl_put_alignment.
>      >      >         Operations that violate these restrictions should return
>      >      >         OPAL_ERR_BAD_PARAM (operation over limit) or
>      >      >         OPAL_ERR_NOT_SUPPORTED (operation not supported due to
>      >      >         alignment restrictions with either the source or destination
>      >      >         buffer).
>      >      >
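
As a rough illustration of how a caller might use the advertised put limit and alignment before issuing operations, here is a minimal sketch. Everything in it is hypothetical: the `sketch_` types, the error values, and the callback signature are simplified stand-ins rather than the proposed btl interface, and it assumes the alignment restriction applies to the local and remote addresses only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for OPAL return codes (the real values differ). */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_BAD_PARAM = -1, SKETCH_ERR_NOT_SUPPORTED = -2 };

/* Hypothetical mirror of the btl_put_limit / btl_put_alignment fields. */
struct sketch_btl_limits {
    size_t put_limit;      /* largest single put, in bytes */
    size_t put_alignment;  /* required address alignment; 0 or 1 means none */
};

/* Simplified put callback; the proposed interface takes many more arguments. */
typedef int (*sketch_put_fn_t)(void *local, uint64_t remote, size_t size, void *ctx);

/* Check one operation against the limits, as the callee is expected to. */
static int sketch_check_put(const struct sketch_btl_limits *lim, const void *local,
                            uint64_t remote, size_t size)
{
    if (size > lim->put_limit) {
        return SKETCH_ERR_BAD_PARAM;       /* operation over limit */
    }
    if (lim->put_alignment > 1 &&
        (0 != (uintptr_t) local % lim->put_alignment ||
         0 != remote % lim->put_alignment)) {
        return SKETCH_ERR_NOT_SUPPORTED;   /* alignment restriction violated */
    }
    return SKETCH_SUCCESS;
}

/* Split a large transfer into pieces that each satisfy the limits. */
static int sketch_put_chunked(const struct sketch_btl_limits *lim, sketch_put_fn_t put,
                              void *ctx, char *local, uint64_t remote, size_t size)
{
    size_t chunk = lim->put_limit;

    /* Keep the chunk a multiple of the alignment so later pieces stay aligned. */
    if (lim->put_alignment > 1) {
        chunk -= chunk % lim->put_alignment;
    }
    if (0 == chunk) {
        return SKETCH_ERR_NOT_SUPPORTED;
    }

    while (size > 0) {
        size_t len = size < chunk ? size : chunk;
        int rc = sketch_check_put(lim, local, remote, len);
        if (SKETCH_SUCCESS == rc) {
            rc = put(local, remote, len, ctx);
        }
        if (SKETCH_SUCCESS != rc) {
            return rc;
        }
        local += len;
        remote += len;
        size -= len;
    }
    return SKETCH_SUCCESS;
}

/* Example callback: counts calls and bytes instead of touching a network. */
struct sketch_counter { size_t calls; size_t bytes; };

static int sketch_count_put(void *local, uint64_t remote, size_t size, void *ctx)
{
    struct sketch_counter *counter = (struct sketch_counter *) ctx;
    (void) local;
    (void) remote;
    counter->calls++;
    counter->bytes += size;
    return SKETCH_SUCCESS;
}
```

Checking on the caller side avoids a round trip through the btl just to learn that an operation was too large or misaligned.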
>      >      >      This is a big change and I do not expect everyone to like 100% of
>      >      >      these changes. I welcome any feedback people have.
>      >      >
>      >      >      When: Tuesday, Nov 17, 2015. This is during SC so there will be
>      >      >      time for face-to-face discussion if anyone has any concerns or
>      >      >      would like to see something changed.
>      >      >
>      >      >      The proposed new btl interface as well as updated versions of
>      >      >      pml/ob1, btl/openib, btl/self, btl/scif, btl/sm, btl/tcp,
>      >      >      btl/ugni, and btl/vader can be found in my btlmod branch at:
>      >      >
>      >      >      https://github.com/hjelmn/ompi/tree/btlmod
>      >      >
>      >      >      Other btls (smcuda and usnic) still need to be updated to provide
>      >      >      the new interface. Unmodified btls will not build.
>      >      >
>      >      >      If there are no objections I will push the btl modifications into
>      >      >      the master two weeks from today (Nov 17). Please take a look and
>      >      >      let me know what you think.
>      >      >
>      >      >      _______________________________________________
>      >      >      devel mailing list
>      >      >      de...@open-mpi.org
>      >      >      Subscription:
>      http://www.open-mpi.org/mailman/listinfo.cgi/devel
>      >      >      Link to this post:
>      >      >     
>      http://www.open-mpi.org/community/lists/devel/2014/11/16193.php
> 
>    --
>    Kind Regards,
>    M.
