On Wed, 4 Oct 2017, Mike Cammilleri wrote:

Hi Everyone,

I'm in search of a best practice for setting up Environment Modules for our 
Slurm 16.05.6 installation (we have not had the time to upgrade to 17.02 yet). 
We're a small group and had no explicit need for this in the beginning, but as 
we are growing larger with more users we clearly need something like this.

I see there are a couple ways to implement Environment Modules and I'm 
wondering which would be the cleanest, most sensible way. I'll list my ideas 
below:

1. Install the Environment Modules package and relevant modulefiles on the slurm 
head/submit/login node, perhaps in the default /usr/local/ location. The 
modulefiles would define paths to various software packages that exist in a 
location visible/readable to the compute nodes (NFS or similar). The user then 
loads the modules manually at the command line on the submit/login node rather 
than in the slurm submit script, and specifies #SBATCH --export=ALL so the 
environment is imported when the sbatch job is submitted.
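As a sketch of that workflow (module and job script names here are hypothetical, just for illustration):

```shell
# Option 1: load modules interactively on the login node, then export
# the resulting environment into the batch job.
module load gcc           # hypothetical module name
module load openmpi       # hypothetical module name
sbatch --export=ALL job.sh    # or put "#SBATCH --export=ALL" inside job.sh
```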

2. Install the Environment Modules package in a location visible to the entire 
cluster (NFS or similar), including the compute nodes. Users then include their 
'module load' commands in their actual slurm submit scripts, since the command 
would be available on the compute nodes, loading software (either local or from 
network locations, depending on what they're loading) visible to the nodes.

3. Another variation would be to use a configuration manager like bcfg2 to make 
sure Environment Modules, the necessary modulefiles, and all configurations are 
present on all compute/submit nodes. That seems like it has the potential to 
become a mess, though.

Is there a preferred approach? I see in the archives that some folks have seen 
strange behavior when a user uses --export=ALL, so it would seem to me that the 
cleaner approach is to have the 'module load' command available on all compute 
nodes and have users do this in their submit scripts. If this is the case, I'll 
need to configure Environment Modules and the relevant modulefiles to live in 
special places when I build Environment Modules (./configure --prefix=/mounted-fs 
--modulefilesdir=/mounted-fs, etc.).

We've been testing with modules-tcl-1.923.

Thanks for any advice,
mike


For ease of use of end users, I would recommend either 2 or 3. In addition to 
making things more consistent between interactive and batch use, it also makes 
it more manageable if particular jobs require specific versions. E.g., if I have 
a job that requires version X of the "foo" library and version Y of the "bar" 
library, that can be specified in the job script.
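A submit script along those lines might look like the following sketch ("foo" and "bar" are the placeholder library names from the example above, with X and Y standing in for the pinned versions):

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Pin the exact versions this job was built/tested against,
# so the job is reproducible regardless of the current defaults.
module load foo/X
module load bar/Y

srun ./my_program
```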

Between 2 and 3, I would generally recommend 2. Some of that depends on how you 
handle software installs. If most of the software is in a shared filesystem, it 
makes sense to keep module files there as well; if, on the other hand, software 
is installed locally using a configuration management system, then it might make 
sense to do that for modules as well. Basically, if one adds a new software 
package, do you want to go through the overhead of the config manager to push 
out the module files?

Note, however, that the available software on the compute and login nodes might 
be different. E.g., you might have a PNG image viewer available as a module on 
the login nodes that would not be useful on the compute nodes. If desired, you 
could do something like have two module directories: one shared between compute 
and login nodes, and one for login nodes only.
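One way to sketch that split is via MODULEPATH in the shell startup files (the paths below are hypothetical; adjust to wherever the shared filesystem is mounted):

```shell
# On all nodes (e.g. in /etc/profile.d/modules-site.sh):
# modulefiles on the shared filesystem, visible cluster-wide.
export MODULEPATH=/mounted-fs/modulefiles/shared

# Additionally, on login nodes only:
# a local tree for login-only tools (viewers, editors, etc.).
export MODULEPATH=$MODULEPATH:/usr/local/modulefiles/login-only
```

Equivalently, "module use /path/to/dir" can append a directory to MODULEPATH at runtime.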

Tom Payerle
DIT-ACIGS/Mid-Atlantic Crossroads       paye...@umd.edu
5825 University Research Court          (301) 405-6135
University of Maryland
College Park, MD 20740-3831
