On 01/02/2017 12:38, Ole Holm Nielsen wrote:
On 02/01/2017 12:23 PM, Alvarez, Damian wrote:
I think that is a fair assumption to make. IMO, we shouldn’t expect EB, or any other tool for that matter, to solve all of our problems and just work optimally out of the box in every single case, without operator intervention. The EB toolchains are tested in a limited number of environments. If your environment differs from the tested ones, expect things to eventually go awry.

Besides that, I think this is more the responsibility of the OpenMPI guys (or any other MPI runtime), to make their build system smart enough to print a big fat warning (or fail) if something weird happens. For example, there are OmniPath HCAs on the system, but no PSM2 libraries. So instead of falling back to TCP, notify the user and refuse to continue unless forced. But I don’t think that will happen anytime soon. So, for the time being, we should try to be careful.
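
To make the idea concrete, here is a minimal sketch in Python of the kind of check meant above. It is not Open MPI's actual build logic: the `REQUIRED_LIBS` table, the function name and the `force` flag are all hypothetical, and the caller is assumed to have already detected the hardware and the installed libraries.

```python
# Hypothetical table: interconnect name -> library the MPI build would need.
# Only the OmniPath/PSM2 pair comes from the discussion; the names are illustrative.
REQUIRED_LIBS = {"omnipath": "psm2"}


class FabricMismatchError(RuntimeError):
    """Raised when hardware is present but its support library is missing."""


def check_fabric_support(detected_fabrics, available_libs, force=False):
    """Return warnings for fabrics lacking their library; raise unless forced."""
    problems = []
    for fabric in sorted(detected_fabrics):
        lib = REQUIRED_LIBS.get(fabric)
        if lib is not None and lib not in available_libs:
            problems.append(
                f"WARNING: {fabric} hardware detected but {lib} is not "
                "available; MPI would fall back to TCP"
            )
    # Refuse to continue unless the operator explicitly forces it.
    if problems and not force:
        raise FabricMismatchError("; ".join(problems))
    return problems
```

With `force=True` the function only returns the warnings, so the caller can print them and carry on; without it, a mismatch stops the build.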

Damian

On 01/02/17 11:44, "[email protected] on behalf of Jure Pečar" <[email protected] on behalf of [email protected]> wrote:

this state assumes you have some knowledge about the hardware you're building on

I fully agree with Damian. Regarding OpenMPI I'm certainly confused about all the different components that may or may not be built in, what you might need them for, and which subset is needed for particular hardware. The FAQ is slightly helpful: https://www.open-mpi.org/faq/?category=building

We briefly discussed this during the EasyBuild conf call this afternoon, see also the notes at https://github.com/hpcugent/easybuild/wiki/Conference-call-notes-20170201#performance-issues-with-out-of-the-box-mpi-installations .

The consensus seemed to be that:

* automated site/system-specific configuring and performance tuning of installations performed with EasyBuild, in particular MPI libraries, is beyond the scope of EasyBuild;

* it is very unlikely fully automated tuning would provide performance results close to that of manual tuning/configuring, and like Damian already mentioned, this is really the job of the MPI libraries themselves, not EasyBuild (we can't solve all the world's problems);

* we could/should make it clearer that certain installations may require more attention w.r.t. custom tuning/configuration, in particular MPI installations; this could be done by expecting that some custom tweaking is provided, and by emitting a big fat warning if this is missing entirely (incl. pointers to documentation on how to configure/tune things properly);

* the latter implies that we need significantly better support for site/system-specific customizations to easyconfigs that are included in EasyBuild, i.e. supporting a way to provide *only* customizations (as opposed to maintaining tweaked easyconfigs in a Git repository) for particular software packages, which brings back the long-standing idea of .ebp (easyconfig patch files), cfr. https://github.com/hpcugent/easybuild-framework/issues/544
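
As a rough illustration of the "big fat warning" idea in the third point, here is a minimal Python sketch. It is not EasyBuild code: the `NEEDS_SITE_TUNING` set, the function name and the shape of `site_customizations` (a dict from software name to a list of site-provided tweaks) are all assumptions made for the example.

```python
import warnings

# Hypothetical set of software names flagged as needing site-specific tuning.
NEEDS_SITE_TUNING = {"OpenMPI"}


def warn_if_untuned(software_name, site_customizations, docs_url=None):
    """Warn (and return the message) if a flagged package has no site tweaks."""
    if software_name not in NEEDS_SITE_TUNING:
        return None
    # An absent or empty entry counts as "no customization provided".
    if site_customizations.get(software_name):
        return None
    msg = (
        f"{software_name} is being installed without any site-specific "
        "customization; performance may be poor on this system."
    )
    if docs_url:
        msg += f" See {docs_url} for configuration hints."
    warnings.warn(msg, UserWarning)
    return msg
```

For example, `warn_if_untuned("OpenMPI", {}, docs_url="https://www.open-mpi.org/faq/?category=building")` would emit the warning with a pointer to the FAQ mentioned earlier in the thread, while a package outside the flagged set passes silently.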

I hope we can discuss this further during the EasyBuild User Meeting next week, and come up with a good design for dealing with this.


regards,

Kenneth
