OK, that makes total sense. I'm leaning towards fixing this in the OFI MTL 
rather than making everyone load the topology. I agree with you that it 
probably doesn't matter, but let's not create a corner case. I'm also going to 
follow up with the dev who wrote this code, but my guess is that we should add 
a note in the header docs somewhere. We'll take that action item.

Brian

From: Ralph Castain <r...@open-mpi.org>
Date: Friday, March 20, 2020 at 9:46 AM
To: "Barrett, Brian" <bbarr...@amazon.com>
Cc: OpenMPI Devel <devel@lists.open-mpi.org>
Subject: RE: [EXTERNAL] [OMPI devel] Add multi nic support for ofi MTL using 
hwloc


If you call "hwloc_topology_load", then hwloc merrily does its discovery and 
slams many-core systems. If you call "opal_hwloc_get_topology", then that is 
fine - it checks if we already have it, tries to get it from PMIx (using shared 
mem for hwloc 2.x), and only does the discovery if no other method is available.
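
The lookup order described above (already cached, then obtained from the 
runtime, then local discovery as the last resort) can be sketched roughly as 
follows. This is a self-contained illustration, not Open MPI code: every 
demo_* name is hypothetical, the topology is a toy struct rather than a real 
hwloc object, and the "runtime" is a flag standing in for PMIx.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a topology object (assumption: not the real hwloc type). */
typedef struct { int ncores; } demo_topology_t;

demo_topology_t demo_storage;
demo_topology_t *demo_topology = NULL;   /* process-wide cache */
int demo_discovery_count = 0;            /* how often the costly path ran */
bool demo_runtime_has_topology = false;  /* stands in for "runtime can supply it" */

/* Hypothetical cheap path: copy the topology handed over by the runtime. */
bool demo_fetch_from_runtime(demo_topology_t *out)
{
    assert(out != NULL);
    if (!demo_runtime_has_topology) {
        return false;
    }
    out->ncores = 64;
    return true;
}

/* Hypothetical costly path: full local discovery. */
void demo_discover(demo_topology_t *out)
{
    assert(out != NULL);
    demo_discovery_count++;
    out->ncores = 64;
}

/* Lazy getter: cache first, runtime second, discovery only as a fallback. */
int demo_get_topology(void)
{
    if (demo_topology != NULL) {
        return 0;                        /* already have it */
    }
    if (!demo_fetch_from_runtime(&demo_storage)) {
        demo_discover(&demo_storage);    /* no other method available */
    }
    demo_topology = &demo_storage;
    return 0;
}

/* Helper for experiments: forget the cached topology and the counter. */
void demo_reset(void)
{
    demo_topology = NULL;
    demo_discovery_count = 0;
}
```

In this sketch, repeated calls to demo_get_topology() never run discovery more 
than once, and never run it at all when the runtime can supply the topology.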

IIRC, we might have decided to let those who needed the topology call 
"opal_hwloc_get_topology" to ensure the topo was available so that we don't 
load it unless someone actually needs it. However, I get the sense we wound up 
always needing the topology, so it was kind of a moot point.


Given that all we do is set up a shmem link (since hwloc 2 is now widely 
available), it shouldn't matter. However, if you want to stick with the "only 
get it if needed" approach, then just add a call to "opal_hwloc_get_topology" 
prior to using the topology and close that PR as "unneeded".
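
On the caller side, the "only get it if needed" approach amounts to asking for 
the topology immediately before the first use instead of assuming an earlier 
init step filled it in. A minimal sketch, assuming hypothetical sketch_* names 
and a toy topology struct (none of this is the actual OFI MTL or OPAL code):

```c
#include <assert.h>
#include <stddef.h>

/* Toy topology (assumption: stands in for the real hwloc object). */
typedef struct { int nnics; } sketch_topology_t;

sketch_topology_t *sketch_topology = NULL;  /* shared pointer, may still be unset */
int sketch_get_calls = 0;                   /* counts getter invocations */

/* Hypothetical getter: fills in the shared pointer on first use. */
int sketch_get_topology(void)
{
    static sketch_topology_t storage = { 2 };
    sketch_get_calls++;
    if (sketch_topology == NULL) {
        sketch_topology = &storage;
    }
    assert(sketch_topology != NULL);        /* postcondition of the getter */
    return 0;
}

/* Hypothetical component init: make sure the topology exists, then use it.
 * Returns the number of NICs seen, or -1 if no topology is available. */
int sketch_component_init(void)
{
    if (sketch_get_topology() != 0 || sketch_topology == NULL) {
        return -1;
    }
    return sketch_topology->nnics;
}
```

The extra getter call is cheap when the topology is already present, so the 
consumer no longer depends on the order in which earlier init steps ran.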




On Mar 20, 2020, at 9:35 AM, Barrett, Brian <bbarr...@amazon.com> wrote:

But that does raise the question: should we call get_topology() for belt and 
suspenders in OFI?  Or will that trigger the concerns you raised at the start 
of this thread?

Brian

From: Ralph Castain <r...@open-mpi.org>
Date: Friday, March 20, 2020 at 9:31 AM
To: OpenMPI Devel <devel@lists.open-mpi.org>
Cc: "Barrett, Brian" <bbarr...@amazon.com>
Subject: RE: [EXTERNAL] [OMPI devel] Add multi nic support for ofi MTL using 
hwloc


https://github.com/open-mpi/ompi/pull/7547 fixes it and has an explanation as 
to why it wasn't catching us elsewhere in the MPI code.




On Mar 20, 2020, at 9:22 AM, Ralph Castain via devel 
<devel@lists.open-mpi.org> wrote:

Odd - the topology object gets filled in during init, well before the fence (as 
it doesn't need the fence, being a purely local op). Let me take a look.




On Mar 20, 2020, at 9:15 AM, Barrett, Brian <bbarr...@amazon.com> wrote:

PMIx folks -

When using mpirun for launching, it looks like opal_hwloc_topology isn't filled 
in at the point where we need the information (mtl_ofi_component_init()).  This 
would end up being before the modex fence, since the goal is to figure out 
which address the process should publish.  I'm not sure that makes a difference 
here, but wanted to figure out if this was expected and, if so, if we had 
options for getting the right data from PMIx early enough in the process.  
Sorry, this is part of the runtime changes I haven't been following closely 
enough.

Brian

-----Original Message-----
From: devel <devel-boun...@lists.open-mpi.org> on behalf of Ralph Castain 
via devel <devel@lists.open-mpi.org>
Reply-To: Open MPI Developers <devel@lists.open-mpi.org>
Date: Wednesday, March 18, 2020 at 2:08 PM
To: "Zhang, William" <wilzh...@amazon.com>
Cc: Ralph Castain <r...@open-mpi.org>, OpenMPI Devel 
<devel@lists.open-mpi.org>
Subject: RE: [EXTERNAL] [OMPI devel] Add multi nic support for ofi MTL using 
hwloc




  Excellent - thanks! Now if only the OpenMP people would be so 
reasonable...sigh.




On Mar 18, 2020, at 10:26 AM, Zhang, William <wilzh...@amazon.com> wrote:

Hello,

We're getting the topology info using the opal_hwloc_topology object; we won't 
be doing our own discovery.

William

On 3/17/20, 11:54 PM, "devel on behalf of Ralph Castain via devel" 
<devel-boun...@lists.open-mpi.org on behalf of devel@lists.open-mpi.org> wrote:




 Hey folks

 I saw the referenced "new feature" on the v5 feature spreadsheet and wanted to 
ask a quick question. Is the OFI MTL going to be doing its own hwloc topology 
discovery for this feature? Or is it going to access the topology info via PMIx 
and the OPAL hwloc abstraction?

 I ask because we know that having every proc do its own topology discovery is 
a major problem on large-core systems (e.g., KNL or Power9). If OFI is going to 
do an hwloc discovery operation, then we need to ensure this doesn't happen 
unless specifically requested by a user willing to pay that price (and it was 
significant).

 Can someone from Amazon (as the item is assigned to them) please clarify?
 Ralph
