With all due respect, I think this still dodges the key question. Are we now saying that every user will be *required* to provide this info? If not, then what is the default?
Let’s face it: the default is what 90+% of the world is going to use. This all seems rather complex to expect the average user to figure out.

> On Oct 21, 2015, at 8:09 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
> REVISION 2 (based on feedback in last 24 hours).
>
> Changes:
>
> - NETWORK instead of NETWORK_TYPE
> - Shared memory and process loopback are not affected by this CLI
> - Change the OPAL API usage.
>
> I actually like points 1-8 below quite a bit. If implemented in ALL
> BTLs/MTLs/etc., it can solve the "how do I disable XYZ across all of Open
> MPI?" problem nicely.
>
> Point 9 -- what does QUALIFIER mean/how is it used? -- still needs work (no
> real updates since rev 1 of this proposal). I am thinking that QUALIFIER
> (somehow) can be used to figure out which OMPI code path to use for a given
> network (e.g., BTL vs. MTL, etc.).
>
> -----
>
> mpirun --[enable|disable] NETWORK[:QUALIFIER][,NETWORK[:QUALIFIER]]*
> # Or "--[net|nonet]", or some other name if "enable|disable" is too general.
> # Suggestions welcome.
>
> 1. The intent of these CLI options is to easily enable/disable specific
> network types and/or specific interfaces.
>
> 2. The use of shared memory and process loopback is assumed (and is not
> affected by these CLI options -- the "expert" level must be used if specific
> control over shared memory / loopback is desired).
>
> 3. Both forms take a comma-delimited list of 1 or more items.
>
> 4. --enable would work similarly to our "include" MCA params: OMPI will *only*
> use the network type(s) listed (but will still use shared memory and process
> loopback).
>
> 5. --disable would work similarly to our "exclude" MCA params: OMPI will use
> all network types *except* those listed (but will still use shared memory
> and process loopback).
>
> 6. NETWORK values can generally be one of three things:
>
> - a human-recognizable name (e.g., "ib", "ethernet", etc.)
> - a Linux interface device name (e.g., "eth0", "usnic_0", "mlx4_0",
>   optionally specifying a specific port if desired and relevant, such as
>   "mlx4_0:1")
> - a network address (e.g., "10.20.0.0/16", which specifies a specific
>   network interface+port)
>
> 7. NETWORK and QUALIFIER values are parsed (by orterun/etc.) and distributed
> to MPI processes.
>
> 8. MPI processes can query the NETWORK values during BTL/MTL/etc.
> initialization and selection.
>
> It may be sufficient to have a simple "did the user specify this NETWORK
> value?" (case insensitive) query function that just returns a boolean.
>
> For example, the TCP BTL could look like this (only showing "enable" logic
> for simplicity -- adding "disable" logic is an exercise left for the reader):
>
> -----
> if (opal_network_value("eth") || opal_network_value("ethernet")) {
>     want_all_ip_interfaces = true;
> } else {
>     foreach IP_interface {
>         // Search for strings like "eth0" or "10.10.0.0/16"
>         if (opal_network_value(ip_interface_name) ||
>             opal_network_value(CIDR of ip_interface_name)) {
>             push(@desired_interfaces, ip_interface_name);
>         }
>     }
> }
>
> foreach IP_interface {
>     if (want_all_ip_interfaces || @desired_interfaces contains ip_interface) {
>         make a module for that IP interface
>     }
> }
> -----
>
> The usnic BTL would likely be quite similar to the TCP BTL, but also look for
> strings like "usnic_0".
>
> The openib BTL could look like this:
>
> -----
> if (opal_network_value("ib") || opal_network_value("infiniband")) {
>     want_all_ib_interfaces = true;
> } else if (opal_network_value("roce")) {
>     want_all_roce_interfaces = true;
> } else if (opal_network_value("iwarp")) {
>     want_all_iwarp_interfaces = true;
> } else if (opal_network_value("eth") || opal_network_value("ethernet")) {
>     want_all_roce_interfaces = true;
>     want_all_iwarp_interfaces = true;
> } else {
>     foreach verbs_interface {
>         // Search for strings like "mlx4_0" or "10.50.0.0/16" for
>         // RoCE/iWARP/IB with IPoIB enabled.
>         // Could also search for IB subnet IDs, if desired...?
>         if (opal_network_value(verbs_interface_name) ||
>             opal_network_value(subnet ID of verbs_interface_name) ||
>             opal_network_value(IP CIDR of verbs_interface_name)) {
>             push(@desired_interfaces, verbs_interface_name);
>         }
>     }
> }
>
> foreach verbs_interface {
>     make_module = false;
>     if (@desired_interfaces contains verbs_interface) {
>         make_module = true;
>     } else if (verbs_interface is IB && want_all_ib_interfaces) {
>         make_module = true;
>     } else if (verbs_interface is RoCE && want_all_roce_interfaces) {
>         make_module = true;
>     } else if (verbs_interface is iWARP && want_all_iwarp_interfaces) {
>         make_module = true;
>     }
>     if (make_module) {
>         make a module for that verbs interface
>     }
> }
> -----
>
> I imagine that the MXM MTL, Yalla PML, and hcoll and FCA colls could be
> similar, but slightly simpler since they (presumably) don't care about iWARP
> interfaces.
>
> PSM / PSM2 / uGNI / Portals / etc. can all do similar things.
>
> The key here is that ALL BTLs, MTLs, OSC, and COLL modules -- anything that
> talks directly to the network -- will need to use this opal_network_value()
> API.
>
> 9. The ":QUALIFIER" value is optional for each NETWORK specified, and
> can be used to disambiguate when a given network type can be reached multiple
> ways in OMPI. E.g., it can help choose between the openib BTL, the MXM MTL,
> and the Yalla PML. For example:
>
>     mpirun --enable ib:btl
>     mpirun --enable ib:mtl
>     mpirun --enable ib:yalla
>
> That being said, I don't like these names (btl, mtl, yalla) because they mean
> nothing to non-OMPI experts. But I like the concept that a QUALIFIER can
> (somehow) help choose between the different OMPI code paths.
>
> Here's another example:
>
>     mpirun --enable eth:tcp
>     mpirun --enable eth:usnic
>
> These QUALIFIER values are a *little* better, but not much -- the user still
> has to know that they exist to know to choose one of them ("tcp" and
> "usnic").
> But note that usNIC will someday have tag matching support, so it
> will be able to be used through the OFI MTL, too. Hence, "eth:usnic" won't
> be unique...
>
> ...thoughts?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/10/18232.php
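
For reference, the case-insensitive boolean query described in point 8 of the quoted proposal could be sketched in C roughly as follows. This is only an illustration under stated assumptions, not existing Open MPI code: opal_network_value() is the name used in the proposal, sketch_set_network_list() is a hypothetical stand-in for whatever orterun would do to distribute the list, and letting a match stop at any ':' boundary is my own guess at how a port suffix ("mlx4_0:1") and a QUALIFIER ("ib:btl") might coexist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h> /* strncasecmp() (POSIX) */

/* Hypothetical storage for the raw --enable argument, e.g. "ib:btl,eth0". */
static const char *network_list = "";

/* Hypothetical setter; in the proposal, orterun would distribute this string. */
void sketch_set_network_list(const char *list)
{
    assert(list != NULL);
    network_list = list;
}

/*
 * Return true if `name` matches an item in the comma-delimited list,
 * ignoring case.
 * Assumption: a match may stop at any ':' boundary, so "ib" matches
 * "ib:btl", and both "mlx4_0" and "mlx4_0:1" match "mlx4_0:1".
 */
bool opal_network_value(const char *name)
{
    size_t name_len = strlen(name);
    const char *item = network_list;

    if (name_len == 0) {
        return false;
    }
    while (*item != '\0') {
        size_t item_len = strcspn(item, ","); /* length up to the next comma */
        if (name_len <= item_len &&
            strncasecmp(item, name, name_len) == 0 &&
            (name_len == item_len || item[name_len] == ':')) {
            return true;
        }
        item += item_len;
        if (*item == ',') {
            item++; /* skip the delimiter */
        }
    }
    return false;
}
```

With the list set to "IB:btl,eth0", a component would see opal_network_value("ib") and opal_network_value("ETH0") return true, while opal_network_value("eth") returns false because "eth" is only a partial match for "eth0". Note that this sketch says nothing about what happens when the list is empty, which is exactly the default-behavior question raised above.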