In that case we should find a way to eliminate this behavior. I will take a look later this week and see if there is a workable solution.

-Nathan

On Wed, Mar 11, 2015 at 11:41:00AM -0600, Howard Pritchard wrote:
> My experience with DMA engines located on the other side of a PCI-e 16x
> gen3 bus from the CPUs is that for a couple of ranks doing large
> transfers between each other on a node, using the DMA engine looks good.
> But once there are multiple ranks exchanging data (like up to 32 ranks on
> a dual socket Haswell node, not using HT), using the DMA engine of the
> NIC is not such a good idea.
>
> Howard
>
> 2015-03-11 10:57 GMT-06:00 Nathan Hjelm <[email protected]>:
>
> > Definitely a side effect, though it could be beneficial in some cases, as
> > the RDMA engine in the HCA may be faster than using memcpy (for messages
> > larger than a certain size). I don't know how best to fix this, as I need
> > all RDMA capable BTLs to be listed for RMA. I thought about adding another
> > list to track BTLs that have both RMA and atomics, but that would increase
> > the memory footprint of Open MPI by a factor of nranks.
> >
> > -Nathan
> >
> > On Thu, Feb 26, 2015 at 11:59:41PM +0000, Rolf vandeVaart wrote:
> > > This message is mostly for Nathan, but figured I would go with the
> > > wider distribution. I have noticed some different behaviour that I
> > > assume started with this change.
> > >
> > > https://github.com/open-mpi/ompi/commit/4bf7a207e90997e75ba1c60d9d191d9d96402d04
> > >
> > > I am noticing that the openib BTL will also be used for on-node
> > > communication even though the sm (or smcuda) BTL is also available. I
> > > think with the aforementioned change that the openib BTL is listed as
> > > an available BTL that supports RDMA. Looking at the bml_endpoint in
> > > the debugger, it appears that the sm BTL is listed as the eager and
> > > send BTL, but the openib BTL is listed as the RDMA BTL. Looking at
> > > the logic in pml_ob1_sendreq.h, it looks like we can end up selecting
> > > the openib BTL for some of the communication. I ran with various
> > > verbosity settings and saw that this was happening. With v1.8, we
> > > only appear to use the sm (or smcuda) BTL.
> > >
> > > I am wondering if this was intentional with this change or maybe a
> > > side effect.
> > >
> > > Rolf
> > >
> > > _______________________________________________
> > > devel mailing list
> > > [email protected]
> > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > > Link to this post:
> > > http://www.open-mpi.org/community/lists/devel/2015/02/17065.php
> >
> > Link to this post:
> > http://www.open-mpi.org/community/lists/devel/2015/03/17127.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/03/17128.php
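
The selection behavior Rolf describes in the quoted thread can be illustrated with a small model. This is only a sketch under stated assumptions, not the actual Open MPI code: the `Endpoint` class, the `pick_btl` function, and the 4096-byte eager limit are all hypothetical, and the v1.8-like case is modeled simply as an endpoint whose RDMA list is empty. It shows how listing openib as an RDMA-capable BTL for an on-node peer can pull large messages away from sm.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """Hypothetical stand-in for a per-peer endpoint with three BTL lists."""
    eager: list
    send: list
    rdma: list


def pick_btl(ep, nbytes, eager_limit=4096):
    """Pick a BTL name for a message of nbytes (illustrative policy only)."""
    # Small messages always use the first eager BTL.
    if nbytes <= eager_limit:
        return ep.eager[0]
    # Larger messages prefer an RDMA-capable BTL when one is listed.
    if ep.rdma:
        return ep.rdma[0]
    # Otherwise fall back to the regular send BTL.
    return ep.send[0]


# Assumption: on-node peer as it appears to behave with v1.8 (no RDMA BTL listed).
on_node_v18 = Endpoint(eager=["sm"], send=["sm"], rdma=[])
# Assumption: on-node peer after the change (openib listed as the RDMA BTL).
on_node_new = Endpoint(eager=["sm"], send=["sm"], rdma=["openib"])

for size in (1024, 1 << 20):
    print(size, pick_btl(on_node_v18, size), pick_btl(on_node_new, size))
```

In this model the small message stays on sm in both cases, while the large message moves to openib only when openib appears in the RDMA list, which matches the symptom reported above.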
