On 01/02/2017 12:38, Ole Holm Nielsen wrote:
On 02/01/2017 12:23 PM, Alvarez, Damian wrote:
I think that is a fair assumption to make. IMO, we shouldn’t expect
EB, or any other tool for that matter, to solve all of our problems
and just work optimally out of the box in every single case, without
operator intervention. The EB toolchains are tested in a limited
number of environments. If your environment differs from the tested
ones, expect things to eventually go awry.
Besides that, I think this is more the responsibility of the OpenMPI
guys (or any other MPI runtime), to make their build system smart
enough to print a big fat warning (or fail) if something weird
happens. For example, there are OmniPath HCAs on the system, but no
PSM2 libraries. So instead of falling back to TCP, notify the user
and refuse to continue unless forced. But I don’t think that will
happen anytime soon. So, for the time being, we should try to be
careful.
Damian
On 01/02/17 11:44, Jure Pečar wrote:
this state assumes you have some knowledge about the hardware
you're building on
I fully agree with Damian. Regarding OpenMPI I'm certainly confused
about all the different components that may or may not be built in,
what you might need them for, and what's the subset needed for a
particular hardware. The FAQ is slightly helpful:
https://www.open-mpi.org/faq/?category=building
We briefly discussed this during the EasyBuild conf call this afternoon,
see also the notes at
https://github.com/hpcugent/easybuild/wiki/Conference-call-notes-20170201#performance-issues-with-out-of-the-box-mpi-installations
.
The consensus seemed to be that:
* automated site/system-specific configuring and performance tuning of
installations performed with EasyBuild, in particular MPI libraries, is
beyond the scope of EasyBuild;
* it is very unlikely fully automated tuning would provide performance
results close to those of manual tuning/configuring, and, as Damian
already mentioned, this is really the job of the MPI libraries
themselves, not EasyBuild (we can't solve all the world's problems);
* we could/should make it more clear that certain installations may
require more attention w.r.t. custom tuning/configuration, in particular
MPI installations;
this could be done by expecting that some custom tweaking is
provided, and by emitting a big fat warning if this is missing entirely
(incl. pointers to documentation on how to configure/tune things properly)
* the latter implies that we need significantly better support for
site/system-specific customizations to easyconfigs that are included in
EasyBuild, i.e. supporting a way to provide *only* customizations (as
opposed to maintaining tweaked easyconfigs in a Git repository) for
particular software packages, which brings back the long-standing idea
of .ebp (easyconfig patch files), cf.
https://github.com/hpcugent/easybuild-framework/issues/544
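
To make the last two points a bit more concrete, here is a rough sketch of
what "only customizations, plus a big fat warning when they are missing"
could look like. This is not existing EasyBuild code and not a proposed
format for .ebp files: the function name apply_customizations, the
NEEDS_SITE_TUNING set and the leading-"+" convention for appending to a
value are all made up for illustration. Only configopts (an easyconfig
parameter) and --with-psm2 (an Open MPI configure option) are real names.

```python
import warnings

# Hypothetical: software for which site-specific tuning is expected.
NEEDS_SITE_TUNING = {"OpenMPI", "MVAPICH2", "impi"}


def apply_customizations(easyconfig, customizations):
    """Return a copy of the easyconfig parameters with site overrides applied.

    `customizations` maps a software name to a dict of parameter overrides.
    Hypothetical convention: a string value starting with '+' is appended to
    the stock value; any other value replaces it.
    """
    name = easyconfig["name"]
    site = customizations.get(name, {})

    # Warn loudly when a package that usually needs tuning has none.
    if not site and name in NEEDS_SITE_TUNING:
        warnings.warn(
            "%s: no site-specific customizations found; the default build "
            "may perform poorly on this system" % name
        )

    result = dict(easyconfig)  # leave the stock easyconfig untouched
    for key, value in site.items():
        if isinstance(value, str) and value.startswith("+"):
            result[key] = (result.get(key, "") + " " + value[1:]).strip()
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    stock = {"name": "OpenMPI", "version": "2.0.1",
             "configopts": "--enable-shared"}
    site = {"OpenMPI": {"configopts": "+--with-psm2"}}
    print(apply_customizations(stock, site)["configopts"])
```

The point of keeping the overrides separate is that the stock easyconfig
stays exactly as shipped, so a site only has to maintain the handful of
parameters it actually changes.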
I hope we can discuss this further during the EasyBuild User Meeting
next week, and come up with a good design for dealing with this.
regards,
Kenneth