Dear Alan,

Regarding the deduplication, I assume you are talking about CVMFS.
Currently our software is in an XFS file system, where it seems
deduplication would have to be done in some sort of offline manner.
However, maybe I shouldn't worry about that too much as disk space isn't
much of a problem these days.

Regarding the EESSI bot, compared to my naïve link setting, that looks
like a whole world of complexity that I would have to get my head round.
Is there a killer argument why I might want to do that, rather than
being lazy and sticking with my links?

Cheers,

Loris

Alan O'Cais writes:

> Dear Loris,
>
> This is indeed to a large extent the same approach as EESSI. Our
> filesystem layer has de-duplication so there is no need for us to make
> the distinction in 3 and the resulting discussion (there's no cost to
> multiple installations of the same thing). You should take a look at
> the EESSI bot as it can handle the workflow you describe.
>
> Alan 
>
> On 03-Nov-23 11:08 AM, Loris Bennett wrote:
>
>  Hi,
>
> We need to manage a heterogeneous cluster and I am looking at how to
> organise building the software in this context.  My current idea is the
> following:
>
>   1. Software is created within the following directory tree
>
>   /nfs/easybuild/arch/x86_64/amd
>                          .../amd/zen3
>                          .../intel
>                          .../intel/cascadelake
>                          .../intel/skylake_avx512
>                          .../generic
>
>   The paths below 'arch' correspond to those produced by
>
>   https://github.com/EESSI/software-layer/blob/2023.06/eessi_software_subdir.py
>
>   2. When each node is booted, a systemd service creates the following
>   directory
>
>   /sw/sc/easybuild
>
>   and in that the following links 
>
>   generic -> /nfs/easybuild/arch/x86_64/generic
>   optimized -> /nfs/easybuild/arch/x86_64/intel/skylake_avx512
>
>   3. Binary-only software is installed via an administration node by
>   running EasyBuild with
>
>   --prefix=/sw/sc/generic
>
>   Software optimized for a specific architecture is built by sending a
>   job via Slurm to a node with the architecture needed and using
>
>   --prefix=/sw/sc/optimized
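
A minimal sketch of the link creation in step 2, in case it helps the
discussion. It assumes the architecture subdirectory (for example
'x86_64/intel/skylake_avx512') has already been determined elsewhere and
is simply passed in. The function name 'make_links' and its parameters
are hypothetical and not part of EasyBuild or EESSI.

```python
import os
from pathlib import Path


def make_links(link_dir: Path, build_root: Path, optimized_subdir: str) -> None:
    """Create 'generic' and 'optimized' symlinks inside link_dir.

    link_dir         -- e.g. /sw/sc/easybuild (created if missing)
    build_root       -- e.g. /nfs/easybuild/arch
    optimized_subdir -- e.g. 'x86_64/intel/skylake_avx512' (assumed to be
                        supplied by whatever detects the node's CPU)
    """
    link_dir.mkdir(parents=True, exist_ok=True)
    targets = {
        "generic": build_root / "x86_64" / "generic",
        "optimized": build_root / optimized_subdir,
    }
    for name, target in targets.items():
        link = link_dir / name
        tmp = link_dir / f".{name}.tmp"
        # Remove a stale temporary link left over from an interrupted run.
        if tmp.is_symlink() or tmp.exists():
            tmp.unlink()
        tmp.symlink_to(target)
        # Rename over any existing link so the name never disappears.
        os.replace(tmp, link)
```

Running it again with a different subdirectory simply re-points
'optimized', so the same unit could be used unchanged on every node type.
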
>
> Does this sound plausible?  Have I overlooked anything?
>
> One thing I am not quite clear on is the following:
>
> What would be the best way to determine whether an EC specifies a binary
> EasyBlock or whether the toolchain is 'SYSTEM' and thus the software
> should be built in 'generic'?  Or would it be better to say disk space
> is cheap and just install binary and 'SYSTEM' packages for each
> architecture in order to simplify things?
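
One possible heuristic, sketched here without using the EasyBuild
framework at all: since an EC file is Python syntax, its top-level
'easyblock' and 'toolchain' assignments can be read with the standard
'ast' module. The function name 'classify_easyconfig' and the set of
easyblock names treated as binary are assumptions for illustration. ECs
that rely on a software-specific easyblock have no 'easyblock' line and
would fall through to 'optimized', and a 'SYSTEM' toolchain EC may still
compile code with the system compiler, so this is only a rough first
pass.

```python
import ast

# Assumption: these generic easyblocks only unpack or copy binaries.
BINARY_EASYBLOCKS = {"Binary", "PackedBinary"}


def classify_easyconfig(ec_text: str) -> str:
    """Return 'generic' or 'optimized' for the text of an easyconfig."""
    values = {}
    for node in ast.parse(ec_text).body:
        # Only look at simple top-level assignments such as `easyblock = ...`.
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and node.targets[0].id in ("easyblock", "toolchain")
        ):
            values[node.targets[0].id] = node.value

    easyblock = values.get("easyblock")
    if isinstance(easyblock, ast.Constant) and easyblock.value in BINARY_EASYBLOCKS:
        return "generic"

    toolchain = values.get("toolchain")
    # `toolchain = SYSTEM` uses the constant provided to easyconfigs.
    if isinstance(toolchain, ast.Name) and toolchain.id == "SYSTEM":
        return "generic"
    # `toolchain = {'name': 'system', 'version': 'system'}` is the dict form.
    if isinstance(toolchain, ast.Dict):
        try:
            tc = ast.literal_eval(toolchain)
        except ValueError:
            tc = {}
        if tc.get("name") == "system":
            return "generic"

    return "optimized"
```

The result could then select which of the two prefixes from step 3 to
pass to EasyBuild.
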
>
> Any help/comments much appreciated.
>
> Cheers,
>
> Loris
>
-- 
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin
