Dear Stefan,

"Dr. Stefan Harfst" <[email protected]> writes:

> Dear Loris,
>
> will you not overwrite the existing module if you use --force?
> I.e. the module will always point to the software directory of the
> architecture you have built last.

I am mounting the directory with the architecture-specific software on
each node under

  /sw/sc/easybuild

with each node mounting the correct branch of the NFS directory for its
CPU.  But as you say, --force will overwrite modules, so I need to be
able to load EasyBuild separately from the other software.
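
To make the mount approach a bit more concrete, here is a minimal sketch
of how a node could pick its branch.  The helper names, the NFS server
name 'nfsserver' and the export path are only illustrative assumptions;
the branch names follow the directory tree from my original message, and
I assume the CPU name has already been determined by some other means.

```shell
# Map a CPU microarchitecture name to the matching branch of the
# architecture tree; anything unknown falls back to 'generic'.
# (arch_branch is a hypothetical helper, not part of EasyBuild.)
arch_branch() {
    case "$1" in
        zen3)            echo "x86_64/amd/zen3" ;;
        cascadelake)     echo "x86_64/intel/cascadelake" ;;
        skylake_avx512)  echo "x86_64/intel/skylake_avx512" ;;
        *)               echo "generic" ;;
    esac
}

# Print (rather than run) the mount command for a node of the given type.
# 'nfsserver' and the export path are assumptions for illustration.
mount_cmd() {
    echo "mount -t nfs nfsserver:/nfs/easybuild/arch/$(arch_branch "$1") /sw/sc/easybuild"
}

mount_cmd zen3
```

Every node then sees the same path, /sw/sc/easybuild, regardless of
which branch is actually mounted there.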

> We have done something similar and yes, we treat EasyBuild itself 
> differently. I will try to outline our setup:
> * we have a basepath /cm/shared/uniol
> * in the basepath we have modules and sw
> * initially, our module path consists only of /cm/shared/uniol/modules/core;
> in this path we put the EasyBuild modules and modules we call hpc-env
> * if you load hpc-env/<ver> the module path is extended with paths like 
> /cm/shared/uniol/<arch>/<ver>/<cat>, where <arch> can be zen3, zen4, ice, sky 
> or so. <ver> is actually referring to the GCCcore version but this is not so 
> relevant. <cat> are the module classes like bio, chem and so on
> * the hpc-env module uses an environment variable to know which is the 
> architecture of the current node, so it shows only that part of the module 
> tree
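
Just to check that I have understood, I imagine the effect of loading
hpc-env is roughly the following.  The variable name HPC_ARCH, the
function name and the version number are my guesses; only the path
pattern and the two example module classes come from your description.

```shell
# Hypothetical helper: print the module paths that loading hpc-env/<ver>
# would add for the architecture of the current node.
hpc_env_paths() {
    ver="$1"
    # HPC_ARCH is an assumed name for the per-node environment variable.
    for class in bio chem; do     # module classes; only two shown here
        echo "/cm/shared/uniol/${HPC_ARCH}/${ver}/${class}"
    done
}

# A real module would call 'module use' on each path; here they are
# only printed.
HPC_ARCH=zen3
hpc_env_paths 12.2.0
```

A node with a different value of the variable would see a different
part of the module tree, without the user having to know which.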

This is an approach that I have seen used elsewhere.  My thinking in
using the mount approach instead was that the users don't have to load
an extra module and that the modules don't have some slightly arbitrary
path, such as '/cm/shared/' (or in our case it is actually
'/trinity/shared/') baked into the 'root' variable.

> * the sw-directory has the same structure, so .../sw/<arch>/<ver> and we use 
> --subdir-modules and --subdir-software to install the software and module in 
> the right paths (we have setup an alias that also identifies <arch> from the 
> env variable)
> * we also have a SYSTEM arch for everything built with toolchain = SYSTEM
> (../sw/SYSTEM and correspondingly ../modules/SYSTEM), we link the
> SYSTEM-modules into all <arch>/<ver> (hidden, if there is a non-SYSTEM version)
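
If I read the part about the alias correctly, it might assemble the
options along these lines.  This is only a sketch: the function name,
the use of --installpath with your basepath, the version number and the
easyconfig name 'example.eb' are my assumptions, not your actual setup.

```shell
# Hypothetical helper: build the eb options for a given <arch> and <ver>,
# so that software and modules end up in matching subdirectories.
eb_arch_opts() {
    arch="$1"
    ver="$2"
    echo "--installpath=/cm/shared/uniol --subdir-software=sw/${arch}/${ver} --subdir-modules=modules/${arch}/${ver}"
}

# Print the resulting command line instead of running it.
echo "eb $(eb_arch_opts zen3 12.2.0) --robot example.eb"
```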

I hadn't thought about dealing with SYSTEM-modules separately.  How does
this work when you build an architecture-dependent piece of software
with a specific toolchain and '--robot', but which has a dependency
which uses the SYSTEM-toolchain?  Everything will be installed in the
same architecture-dependent subdirectory, won't it?

> * Easybuild itself is installed directly in sw but it could also go into 
> SYSTEM (or your generic).

OK, so treating EasyBuild differently from the bulk of the software does
indeed seem to be the way to go.

> It means that you need a lot of space and every module is installed
> several times, but you are only reusing a working Easyconfig and it
> can be somewhat automated.

We are already resigned to the fact that a lot of space will be needed,
but it looks as if this sort of approach is the only way to ensure that
each node will see software that will definitely run on it.

>
> Hope this helps.

Thanks for the detailed explanation of your approach.

Cheers,

Loris

PS: See you in Oldenburg at the AK Supercomputing, maybe?

> Best wishes
> Stefan
>
> --
> Scientific Computing
>
> Carl von Ossietzky University Oldenburg
> School of Mathematics and Natural Sciences
> 26111 Oldenburg, Germany
>
> Office: W03 1-139
> Phone: +49-441-798 3147
> E-Mail: [email protected]
> www: http://www.uni-oldenburg.de/fk5/wr
>
> -----Original Message-----
> From: [email protected] <[email protected]> On
> behalf of Loris Bennett
> Sent: Thursday, 21 September 2023 17:12
> To: easybuild <[email protected]>
> Subject: [easybuild] Installing EasyBuild as 'generic' architecture?
>
> Hi,
>
> EasyBuild plus all our software built with EasyBuild is currently installed 
> under
>
>   /nfs/easybuild/software/EasyBuild
>
> However, I am reinstalling software compiled for different microarchitectures 
> in a directory structure which looks like the following:
>
>   /nfs/easybuild/arch
>   ├── generic
>   └── x86_64
>       ├── amd
>       │   └── zen3
>       └── intel
>           ├── cascadelake
>           └── skylake_avx512
>
> At the moment I am essentially using something like
>
>   eb --prefix=/nfs/easybuild/arch/x86_64/intel/cascadelake --force ...
>
> to rebuild modules.  However, at some point I will want to avoid using 
> '--force', but IIUC that will mean that I have to 'unuse' the original module 
> path
>
>   /nfs/easybuild/software/EasyBuild/modules
>
> in order to build modules which already exist in the 
> non-architecture-specific path.  This in turn, since the EasyBuild module
> itself is in that path, would deactivate the EasyBuild module.
>
> Does that mean I have to reinstall EasyBuild under
>
>   /nfs/easybuild/arch/generic
>
> which I guess would be the most consistent solution, or is there an 
> alternative?
>
> Cheers,
>
> Loris
>
> --
> Dr. Loris Bennett (Herr/Mr)
> ZEDAT, Freie Universität Berlin
>
-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin
