I see that MPI_Get_hw_resource_info() was introduced in MPI-4.1 p445:13.

I'm a little confused by the description of this routine.  p445:30-32 says (the 
PDF won't copy-n-paste this section for some reason, so I'm copy-n-pasting from 
the corresponding LaTeX source):

This information is stored as (\mpiarg{key},\mpiarg{value}) pairs where each 
key is the name of a hardware resource type and its value is set to 
\infoval{true} if the calling \MPI/ process is restricted to a single instance 
of a hardware resource of that type and \infoval{false} otherwise.  The order 
in which the keys are stored in \mpiarg{hw\_info} is unspecified.  This 
procedure will return different information for \MPI/ processes that are 
restricted to different hardware resources. Otherwise, info objects with 
identical (\mpiarg{key}, \mpiarg{value}) pairs are returned.

  1.  I'm not quite sure what the "true" and "false" values mean.
     *   E.g., what -- precisely -- does "a single instance of a hardware 
resource of that type" mean?
     *   For example, my company makes a piece of hardware that can have 
thousands of virtual NICs on it, and those virtual NICs might even migrate 
around to different pieces of hardware (e.g., they can migrate between 
different fiber optic outputs on the same NIC).  MPI processes are assigned to 
a virtual NIC, not a hardware NIC.  Am I allowed to include a reference to 
these virtual NICs in the keys/values that are returned (since the Linux device 
name refers to a virtual entity, not necessarily a specific set of hardware)?  
If so, how do I determine the true/false value to assign?
     *   The text states that the info keys/values are specific to the point of 
time when the call is made.  p446:11-12 even explicitly states that the process 
and/or its hardware restrictions may change over time.  So even if I grokked 
what "restricted to a single instance of a hardware resource of that type" is 
intended to mean, if things can change -- and they can -- what is the point of 
giving a true or false value to the user?
     *   Is the intent that keys will include a specific, unique reference to an 
instance of "hardware" (e.g., a PCI address)?  If so, then the value of "true" 
and "false" becomes even more nebulous (or meaningless).  E.g., if I list a key 
containing "cisco-nic-12bc83fde9" to indicate a specific NIC, what is the exact 
"hardware resource of that type", and/or how would an application know that 
"cisco-nic-12bc83fde9" and "cisco-nic-bbbbbbbbb" are of the same "hardware 
resource type"?
     *   I can imagine that there could be many different scenarios here; can 
someone provide some guidance on what exactly an implementation is supposed to 
do here?  This text seems to be... ambiguous.
  2.  The AtoI in p445:42-46 says that we should use URIs with a type of 
"openmpi://" or "hwloc://" or "pmix://" or "slurm://" or ...
     *   All of these are software models (although hwloc's data refers to 
either hardware or to software devices that correspond to some form of hardware 
-- although that's not always clear, either).
     *   The use of software models in the text is confusing, because the 
routine has "hw" in its name, strongly implying that there's supposed to be a 
direct tie-in to hardware.
     *   What is the intent here?
  3.  I'm not quite sure what the limitation of "This procedure will return 
different information for MPI processes that are restricted to different 
hardware resources" means.
     *   What if a) an MPI implementation returns an Info with a single key 
denoting the NIC, and b) the NIC is a generic Ethernet NIC (there's only one 
NIC in the node)?
     *   On that NIC, from a fine-grained perspective, the MPI processes use 
different hardware resources, but from a coarse-grained perspective of the 
identification of "NIC", multiple MPI processes use the "same" NIC.
     *   Per the text, is an MPI implementation prohibited from returning the 
same value "blah://the_nic" in multiple MPI processes?  Or is an implementation 
required to return the same value "blah://the_nic" in all MPI processes on 
that node?  I really can't tell which way it's supposed to go.

In short, I find the text description of this function to be sufficiently 
ambiguous that I could put anything I want in the info keys and corresponding 
values, and be able to justify it with one of a bunch of different 
interpretations of the text on pages 445-446.

--
Jeff Squyres
_______________________________________________
mpi-forum mailing list
mpi-forum@lists.mpi-forum.org
https://lists.mpi-forum.org/mailman/listinfo/mpi-forum
