On our cluster, the nodes mount the /home/modules folder from different 
file systems depending on their architecture, so each node sees the modules 
built for its own architecture.  The advantage is that if a script or 
something else (like a Python venv) contains a path to a file in a module, 
that path will always point to an executable built for the right architecture.

If the folder names differ between architectures, then I do not think that 
you can build a venv on one architecture and use it on another.  That is 
often very important for our production runs, where different jobs in the 
same project are submitted to different types of nodes.
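
To illustrate why the path matters: a venv records the directory of the 
interpreter it was created from in the "home" entry of its pyvenv.cfg file.  
The following is only a minimal sketch, not something taken from our setup; 
the function names are made up for this example.  It reads that entry and 
checks whether the directory exists on the node where it runs.

```python
from pathlib import Path


def venv_base_python_dir(venv_dir):
    """Return the 'home' directory recorded in pyvenv.cfg, or None.

    'home' is the directory of the interpreter the venv was created from.
    """
    cfg = Path(venv_dir) / "pyvenv.cfg"
    if not cfg.is_file():
        return None
    for line in cfg.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "home":
            return Path(value.strip())
    return None


def venv_is_usable_here(venv_dir):
    """True if the interpreter directory recorded by the venv exists here.

    Hypothetical helper: it only checks that the path exists, not that the
    binaries found there were built for this node's architecture.
    """
    home = venv_base_python_dir(venv_dir)
    return home is not None and home.is_dir()
```

If the module tree has the same path on every node, the check succeeds 
everywhere and the node's own build is picked up.  If the path contains an 
architecture-specific folder name, the check can fail on any node that does 
not have that folder.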

https://wiki.fysik.dtu.dk/Niflheim_system/EasyBuild_modules/#setting-the-cpu-hardware-architecture

Best regards

Jakob



> On 3 Nov 2023, at 11.08, Loris Bennett <loris.benn...@fu-berlin.de> wrote:
> 
> Hi,
> 
> We need to manage a heterogeneous cluster and I am looking at how to
> organise building the software in this context.  My current idea is the
> following:
> 
>  1. Software is created within the following directory tree
> 
>  /nfs/easybuild/arch/x86_64/amd
>                         .../amd/zen3
>  /nfs/easybuild/arch/x86_64/amd
>                         .../intel
>                         .../intel/cascadelake
>                         .../intel/skylake_avx512
>                         .../generic 
> 
>  The paths below 'arch' correspond to those produced by
> 
>  https://github.com/EESSI/software-layer/blob/2023.06/eessi_software_subdir.py
> 
>  2. When each node is booted, a systemd service creates the following
>  directory
> 
>  /sw/sc/easybuild
> 
>  and in that the following links 
> 
>  generic -> /nfs/easybuild/arch/x86_64/generic
>  optimized -> /nfs/easybuild/arch/x86_64/intel/skylake_avx512
> 
>  3. Binary-only software is installed via an administration node by
>  running EasyBuild with
> 
>  --prefix=/sw/sc/generic
> 
>  Software optimized for a specific architecture is built by sending a
>  job via Slurm to a node with the architecture needed and using
> 
>  --prefix=/sw/sc/optimized
> 
> Does this sound plausible?  Have I overlooked anything?
> 
> One thing I am not quite clear on is the following:
> 
> What would be the best way to determine whether an EC specifies a binary
> EasyBlock or whether the toolchain is 'SYSTEM' and thus the software
> should be built in 'generic'?  Or would it be better to say disk space
> is cheap and just install binary and 'SYSTEM' packages for each
> architecture in order to simplify things?
> 
> Any help/comments much appreciated.
> 
> Cheers,
> 
> Loris
> 
> -- 
> Dr. Loris Bennett (Herr/Mr)
> ZEDAT, Freie Universität Berlin
