On 02/01/2017 12:23 PM, Alvarez, Damian wrote:
I think that is a fair assumption to make. IMO, we shouldn’t expect EB, or any
other tool for that matter, to solve all of our problems and just work
optimally out of the box in every single case, without operator intervention.
The EB toolchains are tested in a limited number of environments. If your
environment differs from the tested ones, expect things to eventually go awry.
Besides that, I think this is more the responsibility of the OpenMPI guys (or
those of any other MPI runtime) to make their build system smart enough to print
a big fat warning (or fail) if something weird happens. For example, if there
are OmniPath HCAs on the system but no PSM2 libraries, then instead of silently
falling back to TCP it should notify the user and refuse to continue unless
forced. But I don’t think that will happen anytime soon. So, for the time being,
we should try to be careful.
Damian
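
A minimal sketch of the kind of "refuse unless forced" check described above,
written in Python. It is not part of EasyBuild or OpenMPI. The names
check_fabric, has_omnipath_hardware, has_psm2_library and FabricMismatchError
are hypothetical, and the detection heuristics (hfi1_* entries under
/sys/class/infiniband, a shared library findable as "psm2") are assumptions
that may not hold on every system.

```python
import ctypes.util
import os
import warnings


class FabricMismatchError(RuntimeError):
    """Hypothetical error: fabric hardware present, user-space library missing."""


def has_omnipath_hardware(sysfs_dir="/sys/class/infiniband"):
    # Assumption: Omni-Path adapters driven by hfi1 show up as hfi1_* entries.
    try:
        return any(name.startswith("hfi1") for name in os.listdir(sysfs_dir))
    except OSError:
        # Directory missing or unreadable: treat as "no such hardware".
        return False


def has_psm2_library():
    # find_library returns None when no matching shared library is found.
    return ctypes.util.find_library("psm2") is not None


def check_fabric(hw_present, lib_present, force=False):
    """Return True if the build may proceed; raise on a mismatch unless forced."""
    if hw_present and not lib_present:
        msg = ("Omni-Path hardware found but no PSM2 library; "
               "the MPI runtime would likely fall back to TCP")
        if not force:
            raise FabricMismatchError(msg)
        # Forced: continue, but make the problem visible.
        warnings.warn(msg)
    return True
```

One might call check_fabric(has_omnipath_hardware(), has_psm2_library())
before starting an MPI build, and pass force=True only when the TCP fallback
is actually intended.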
On 01/02/17 11:44, "[email protected] on behalf of Jure Pečar"
<[email protected] on behalf of [email protected]> wrote:
this state assumes you have some knowledge about the hardware you're
building on
I fully agree with Damian. Regarding OpenMPI, I'm certainly confused
about all the different components that may or may not be built in, what
you might need them for, and which subset is needed for a particular kind
of hardware. The FAQ is slightly helpful:
https://www.open-mpi.org/faq/?category=building
/Ole
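
One way to see which components ended up in a given OpenMPI installation is to
inspect the output of ompi_info. The Python sketch below groups component
names by framework. The function names are hypothetical, and the assumed line
shape (mca:<framework>:<component>:version:...) reflects what
"ompi_info --parsable" commonly prints, so check it against your own
installation before relying on it.

```python
import subprocess
from collections import defaultdict


def parse_components(parsable_text):
    """Group MCA component names by framework from parsable ompi_info text."""
    components = defaultdict(set)
    for line in parsable_text.splitlines():
        fields = line.split(":")
        # Assumed line shape: mca:<framework>:<component>:version:...
        if len(fields) >= 4 and fields[0] == "mca" and fields[3] == "version":
            components[fields[1]].add(fields[2])
    return {fw: sorted(names) for fw, names in components.items()}


def installed_components():
    # Requires ompi_info on PATH; raises if the command is missing or fails.
    result = subprocess.run(
        ["ompi_info", "--parsable"],
        capture_output=True, text=True, check=True,
    )
    return parse_components(result.stdout)
```

For example, "psm2" in installed_components().get("mtl", []) would indicate
whether a PSM2 MTL component was built, assuming that component name applies
to your OpenMPI version.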