In that case we should find a way to eliminate this behavior. I will
take a look later this week and see if there is a workable solution.

-Nathan

On Wed, Mar 11, 2015 at 11:41:00AM -0600, Howard Pritchard wrote:
>    My experience with DMA engines located on the other side of a PCI-e 16x
>    gen3 bus from the cpus is that for a couple of ranks doing large
>    transfers between each other on a node, using the DMA engine looks good. 
>    But once there are multiple ranks exchanging data (like up to 32 ranks on
>    a dual socket haswell node, not using HT),  using the DMA engine of the
>    NIC is not such a good idea.
>    Howard
>    2015-03-11 10:57 GMT-06:00 Nathan Hjelm <[email protected]>:
> 
>      Definitely a side-effect though it could be beneficial in some cases as
>      the RDMA engine in the HCA may be faster than using memcpy (larger than
>      a certain size). I don't know how to best fix this as I need all RDMA
>      capable BTLs to be listed for RMA. I thought about adding another list to
>      track BTLs that have both RMA and atomics but that would increase the
>      memory footprint of Open MPI by a factor of nranks.
> 
>      -Nathan
> 
>      On Thu, Feb 26, 2015 at 11:59:41PM +0000, Rolf vandeVaart wrote:
>      >    This message is mostly for Nathan, but figured I would go with the
>      >    wider distribution. I have noticed some different behaviour that I
>      >    assume started with this change.
>      >
>      >    https://github.com/open-mpi/ompi/commit/4bf7a207e90997e75ba1c60d9d191d9d96402d04
>      >
>      >    I am noticing that the openib BTL will also be used for on-node
>      >    communication even though the sm (or smcuda) BTL is also available. I
>      >    think with the aforementioned change that the openib BTL is listed as
>      >    an available BTL that supports RDMA. While looking through the
>      >    debugger and looking at the bml_endpoint, it appears that the sm BTL
>      >    is listed as the eager and send BTL, but the openib is listed as the
>      >    RDMA btl. Looking at the logic in pml_ob1_sendreq.h, it looks like we
>      >    can end up selecting the openib btl for some of the communication. I
>      >    ran with some various verbosity and saw that this was happening. With
>      >    v1.8, we only appear to use the sm (or smcuda) btl.
>      >
>      >    I am wondering if this was intentional with this change or maybe a
>      >    side effect.
>      >
>      >    Rolf
>      >
> 
>      > _______________________________________________
>      > devel mailing list
>      > [email protected]
>      > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>      > Link to this post:
>      > http://www.open-mpi.org/community/lists/devel/2015/02/17065.php
> 

