On 2019-11-15 18:06, Alastair McKinstry wrote:
> Hi all,
> Do you think it is worth setting :
> export OMPI_MCA_btl_base_warn_component_unused=0
> in the defaults for OpenMPI ?
Hi Alastair, I think it would be reasonable to do that. Debian is "the
universal operating system", so our standard build is intended for a
wide variety of systems. Many of those systems will not have
OpenFabrics/InfiniBand transport, and we would not expect them to,
especially computational workstations, which are not HPC systems but for
which MPI is still very useful. So the warning is not particularly
helpful in the general case. On the specific systems where the
hardware does support OpenFabrics, the warning won't appear anyway.
The only case where the warning is useful is as a diagnostic when
OpenFabrics transport is available but cannot be used by OpenMPI for
some reason (e.g. if the ib_core kernel module is not available). But the
default Debian Linux image does provide ib_core (it is built with
CONFIG_MLX5_INFINIBAND=m), so we expect it to work. If an HPC
centre supporting OpenFabrics hardware is building its own kernels, then
we can expect them to know what they're doing.
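For anyone who wants to confirm the ib_core point on their own machine, a rough check could look like the sketch below. It assumes a Linux system with modinfo from kmod (possibly only on root's PATH, in /sbin) and a readable /proc/modules; the variable name ib_core_status is only for illustration.

```shell
# Ask kmod whether an ib_core module exists for the running kernel.
# If modinfo itself cannot be found, report "unknown" rather than guessing.
if ! command -v modinfo >/dev/null 2>&1; then
    ib_core_status="unknown"
elif modinfo ib_core >/dev/null 2>&1; then
    ib_core_status="available"
else
    ib_core_status="missing"
fi
echo "ib_core module: ${ib_core_status}"

# Separately, report whether the module is loaded right now.
if grep -q '^ib_core ' /proc/modules 2>/dev/null; then
    echo "ib_core is currently loaded"
else
    echo "ib_core is not currently loaded"
fi
```

A result of "available" without the module being loaded is what I would expect on a workstation with no InfiniBand hardware.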
Perhaps if you set OMPI_MCA_btl_base_warn_component_unused=0 by default,
it would be worth adding a note to README.Debian suggesting that a user or
administrator set OMPI_MCA_btl_base_warn_component_unused=1 to
verify that no components such as OpenFabrics have gone unused.
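As a sketch of what such a README.Debian note might show, the warning can be switched back on for a single session through the environment, or for a single run with mpirun's --mca option. The program name ./my_mpi_program and the process count are placeholders, not anything from the package.

```shell
# Re-enable the unused-component warning for this shell session only,
# overriding a packaged default of 0.
export OMPI_MCA_btl_base_warn_component_unused=1

# Equivalent one-off form on the command line
# (./my_mpi_program is a placeholder for your own binary):
#   mpirun --mca btl_base_warn_component_unused 1 -n 4 ./my_mpi_program

# Show the value that Open MPI will pick up from the environment.
echo "btl_base_warn_component_unused=${OMPI_MCA_btl_base_warn_component_unused}"
```

Setting it per session like this leaves the system-wide default untouched, which seems the safer choice for a one-time diagnostic.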
Are there any components other than OpenFabrics that
OMPI_MCA_btl_base_warn_component_unused=0 might hide?
Another question: on common cloud computing platforms (e.g. Amazon) or full
HPC systems, how likely is it that the installation will have OpenFabrics
hardware anyway? How often would these systems not have it?
Drew