Hello George,

Are you referring to the OMPI_MCA_btl_tcp_links parameter?

If so, the graph matching will have no issue matching the increased links. It 
will just increase the time it takes for each process to do so; refer to the 
graph in the PR (Total Time Spent in add_procs per Process). Theoretically, 
increasing the number of links is no different from having no additional 
links, since it only increases the scale of the bipartite flow problem. If the 
matching logic is sound, this should still select the best interfaces. The 
tests I performed to gather the data for the first graph were simulated using 
the num links variable.
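
To illustrate the scaling argument, here is a minimal sketch in Python. It is 
not the code from the PR: it uses plain unweighted bipartite matching rather 
than the weighted flow problem, and it assumes that extra links can be modeled 
by replicating every interface num_links times. The names max_matching and 
expand_links are hypothetical.

```python
def max_matching(n_left, n_right, edges):
    """Augmenting-path (Kuhn) maximum bipartite matching.

    edges is a set of (local_if, remote_if) pairs that can reach each other.
    Returns (matching size, match_right), where match_right[r] is the local
    vertex matched to remote vertex r, or -1 if r is unmatched.
    """
    adj = [[] for _ in range(n_left)]
    for l, r in sorted(edges):
        adj[l].append(r)
    match_right = [-1] * n_right

    def try_augment(l, seen):
        # Try to match l, re-routing earlier matches when needed.
        for r in adj[l]:
            if r in seen:
                continue
            seen.add(r)
            if match_right[r] == -1 or try_augment(match_right[r], seen):
                match_right[r] = l
                return True
        return False

    size = sum(try_augment(l, set()) for l in range(n_left))
    return size, match_right


def expand_links(n_left, n_right, edges, num_links):
    """Assumed model of extra links: copy k of interface i is vertex
    i * num_links + k, and copies inherit the reachability of the original."""
    expanded = {
        (l * num_links + a, r * num_links + b)
        for l, r in edges
        for a in range(num_links)
        for b in range(num_links)
    }
    return n_left * num_links, n_right * num_links, expanded


# Two local and two remote interfaces; local 1 cannot reach remote 0.
reachable = {(0, 0), (0, 1), (1, 1)}
base_size, _ = max_matching(2, 2, reachable)
n_l, n_r, bigger = expand_links(2, 2, reachable, num_links=2)
big_size, big_match = max_matching(n_l, n_r, bigger)
```

In this model the problem has the same structure and only more vertices and 
edges, so the matching still pairs only interfaces that can reach each other, 
at the cost of more work per process.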

Please let me know if I misunderstood something.

Thanks,
William Zhang

From: George Bosilca <bosi...@icl.utk.edu>
Date: Thursday, January 9, 2020 at 12:02 PM
To: Open MPI Developers <devel@lists.open-mpi.org>
Cc: "Zhang, William" <wilzh...@amazon.com>
Subject: Re: [OMPI devel] Open MPI BTL TCP interface mapping

Will,

The 7134 issue is complex in its interactions with the rest of the TCP BTL, and 
I could not find the time to look at it carefully enough (or test it on AWS). 
But maybe you can address my main concern here. The interface selection in 
#7134 will have an impact on the traffic distribution among the different 
sockets by altering the interface selection on the links we have in the TCP 
BTL (which allow us to increase the bandwidth by multiplexing the streams 
between peers). 
I have the feeling they are not nicely collaborating to increase the total 
bandwidth, but that instead they will prevent each other from functioning 
efficiently.

  George.


On Thu, Jan 9, 2020 at 2:36 PM Zhang, William via devel 
<devel@lists.open-mpi.org> wrote:
Hello devel,

Thanks George for reviewing: https://github.com/open-mpi/ompi/pull/7167

Can I get a review (not from Brian) for this patch as well: 
https://github.com/open-mpi/ompi/pull/7134

These PRs fix common matching bugs that users utilizing the tcp btl encounter. 
It has been proven to fix issue https://github.com/open-mpi/ompi/issues/7115 – 
it’s also the first utilization of the Reachability framework, which can 
provide valuable reference material.

Thanks,
William Zhang

P.S.
I will start increasing the frequency of these reminders, since these PRs are 
2+ months old.
