Odd - the topology object gets filled in during init, well before the fence (it doesn't need the fence, being a purely local op). Let me take a look.

> On Mar 20, 2020, at 9:15 AM, Barrett, Brian <bbarr...@amazon.com> wrote:
>
> PMIx folks -
>
> When using mpirun for launching, it looks like opal_hwloc_topology isn't
> filled in at the point where we need the information
> (mtl_ofi_component_init()). This would end up being before the modex fence,
> since the goal is to figure out which address the process should publish.
> I'm not sure that makes a difference here, but wanted to figure out if this
> was expected and, if so, if we had options for getting the right data from
> PMIx early enough in the process. Sorry, this is part of the runtime changes
> I haven't been following closely enough.
>
> Brian
>
> -----Original Message-----
> From: devel <devel-boun...@lists.open-mpi.org> on behalf of Ralph Castain via
> devel <devel@lists.open-mpi.org>
> Reply-To: Open MPI Developers <devel@lists.open-mpi.org>
> Date: Wednesday, March 18, 2020 at 2:08 PM
> To: "Zhang, William" <wilzh...@amazon.com>
> Cc: Ralph Castain <r...@open-mpi.org>, OpenMPI Devel
> <devel@lists.open-mpi.org>
> Subject: RE: [EXTERNAL] [OMPI devel] Add multi nic support for ofi MTL using
> hwloc
>
> Excellent - thanks! Now if only the OpenMP people would be so
> reasonable...sigh.
>
>> On Mar 18, 2020, at 10:26 AM, Zhang, William <wilzh...@amazon.com> wrote:
>>
>> Hello,
>>
>> We're getting the topology info using the opal_hwloc_topology object, we
>> won't be doing our own discovery.
>>
>> William
>>
>> On 3/17/20, 11:54 PM, "devel on behalf of Ralph Castain via devel"
>> <devel-boun...@lists.open-mpi.org on behalf of devel@lists.open-mpi.org>
>> wrote:
>>
>> Hey folks
>>
>> I saw the referenced "new feature" on the v5 feature spreadsheet and
>> wanted to ask a quick question. Is the OFI MTL going to be doing its own
>> hwloc topology discovery for this feature? Or is it going to access the
>> topology info via PMIx and the OPAL hwloc abstraction?
>>
>> I ask because we know that having every proc do its own topology discovery
>> is a major problem on large-core systems (e.g., KNL or Power9). If OFI is
>> going to do an hwloc discovery operation, then we need to ensure this
>> doesn't happen unless specifically requested by a user willing to pay that
>> price (and it was significant).
>>
>> Can someone from Amazon (as the item is assigned to them) please clarify?
>> Ralph