Hi Trey,
On 23/08/14 05:14, Trey Dockendorf wrote:
I'm having some issues building BLACS, which is in one of the toolchains I wish
to use.
I created a gmvapich2-1.8.2 toolchain (GCC-4.8.3 and MVAPICH2-2.0). I built MVAPICH2
using the options for SLURM and using "srun" so mpirun was NOT built:
MVAPICH2-2.0-GCC-4.8.3.eb options:
withchkpt = True
withhwloc = True
rdma_type = 'gen2'
blcr_path = '/usr'
blcr_inc_path = '/usr/include'
blcr_lib_path = '/usr/lib64'
configopts = "--with-pm=no --with-slurm=/usr"
BLACS-1.1-gmvapich2-1.8.2.eb is same as BLACS-1.1-gmvapich2-1.7.9a2.eb [1]
except I have the following toolchain line
toolchain = {'name': 'gmvapich2', 'version': '1.8.2'}
When I run eb I get the following:
== 2014-08-22 21:55:02,128 main.run ERROR EasyBuild crashed with an error (at
easybuild/tools/run.py:382 in parse_cmd_output): cmd "mpirun -n 2
./EXE/xtc_CsameF77" exited with exitcode 127 and output:
/bin/bash: mpirun: command not found
I found where mpirun is selected as the command to use for MVAPICH2, in
easybuild/tools/toolchain/mpi.py [2]. However, the configopts I have above for
MVAPICH2 require the use of srun. Is there any way to override what is being
set?
If I comment out the "configopts" line in my MVAPICH2 eb file then BLACS builds
just fine, but I'd much prefer to have srun used instead, as that simplifies our user
documentation and our slurm.conf mpi settings.
You're basically running into an assumption made in the framework (at
[2]) that is apparently faulty, i.e. that mpirun is always available.
We should probably enhance the mpi_cmd_for function such that it checks
whether mpirun is available, and falls back to different commands if not,
including srun.
It's not terribly difficult to do this I think, if you're familiar with
Python at least: rather than hardcoding "mpirun" when defining the
"mpi_cmds" dictionary, it should insert whichever command is found to be
available first (I think?).
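
A rough sketch of that idea, for illustration only. This is not the actual
mpi_cmd_for code: the pick_launcher and simple_mpi_cmd names and the list of
candidate commands are made up here, and the lookup uses Python's
shutil.which rather than anything from the framework.

```python
import shutil

# Candidate launchers, in order of preference (an assumed list, not taken
# from the framework).
CANDIDATE_LAUNCHERS = ["mpirun", "mpiexec", "srun"]


def pick_launcher(candidates=CANDIDATE_LAUNCHERS, which=shutil.which):
    """Return the first candidate command found in $PATH, or None."""
    for cmd in candidates:
        if which(cmd) is not None:
            return cmd
    return None


def simple_mpi_cmd(nr_ranks, cmd, launcher=None):
    """Build a simple '<launcher> -n <ranks> <cmd>' command line."""
    launcher = launcher or pick_launcher()
    if launcher is None:
        raise RuntimeError("no MPI launcher found among: %s"
                           % ", ".join(CANDIDATE_LAUNCHERS))
    # Only '-n' is passed, which mpirun, mpiexec and srun all accept.
    return "%s -n %d %s" % (launcher, nr_ranks, cmd)
```

On a system where only srun is installed, simple_mpi_cmd(2, "./EXE/xtc_CsameF77")
would produce an srun command line instead of the failing mpirun one.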
The mpi_cmd_for function only uses very simple mpirun commands (e.g. it
only passes "-n" in the case of MVAPICH2), so maybe you can get away
with defining mpirun as an alias for srun when running eb, as a workaround?
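
Note that a shell alias is not inherited by child processes and is not
expanded by non-interactive bash shells, so it may not be picked up by the
shell that eb spawns. One alternative is a tiny wrapper script named mpirun
that sits first in $PATH. A minimal sketch that generates such a wrapper is
below; the make_mpirun_wrapper helper is hypothetical, and it assumes that
forwarding the arguments unchanged to srun is good enough for the simple
"-n" usage described above.

```python
import os
import stat
import tempfile


def make_mpirun_wrapper(target="srun", directory=None):
    """Create an executable 'mpirun' script that forwards its arguments
    to `target`, and return the directory that contains it."""
    directory = directory or tempfile.mkdtemp(prefix="mpirun-wrapper-")
    path = os.path.join(directory, "mpirun")
    with open(path, "w") as handle:
        # exec replaces the wrapper shell with the target command
        handle.write('#!/bin/sh\nexec %s "$@"\n' % target)
    # Add execute permission for user, group and others.
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return directory


if __name__ == "__main__":
    wrapper_dir = make_mpirun_wrapper()
    # Prepend this directory to PATH in the shell that runs eb.
    print("export PATH=%s:$PATH" % wrapper_dir)
```

With that directory at the front of $PATH, "mpirun -n 2 ./EXE/xtc_CsameF77"
would end up running "srun -n 2 ./EXE/xtc_CsameF77".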
mpi_cmd_for is actually a poor man's solution; we should be relying on
our own mympirun "wrapper script" (see
https://github.com/hpcugent/vsc-mympirun).
But that wouldn't have helped in this case, since it doesn't know about
srun either (yet).
regards,
Kenneth
Thanks,
- Trey
[1]
https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/b/BLACS/BLACS-1.1-gmvapich2-1.7.9a2.eb
[2]
https://github.com/hpcugent/easybuild-framework/blob/78690b0771ca971326fd81c20f1b25ed18d801a9/easybuild/tools/toolchain/mpi.py#L177
=============================
Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396