Hi Alan,
Alan O'Cais <[email protected]> writes:
> Just to mention, an approach like Stefan's does not necessarily mean
> users have to load another module to get access to software. For
> users, you only care about where the software for the architecture
> lives. In the default profile you can define this with, e.g.,:
>
> export DEFAULT_ARCHITECTURE_MODULEPATH=${SITE_EB_PREFIX}/${ARCH_SUBDIR}/modules/core
> module use ${DEFAULT_ARCHITECTURE_MODULEPATH}
>
> where ${SITE_EB_PREFIX} is common to all archs and ${ARCH_SUBDIR}
> needs to be defined/derived for each architecture.
>
> Then, developers could still use `hpc-env/<ver>` modules if those
> include the Lmod/Tmod equivalent of `module unuse
> ${DEFAULT_ARCHITECTURE_MODULEPATH}` at the beginning.

Thanks for pointing that out.  I thought I was going to have to
reinstall EasyBuild itself in the 'generic' path, but since this is a
subdirectory of ${SITE_EB_PREFIX} it seems easier to do something like

  export EB_MODULEPATH=${SITE_EB_PREFIX}/modules
  module use ${EB_MODULEPATH}

where

  ${SITE_EB_PREFIX}/modules

essentially just contains EasyBuild and then have

  export DEFAULT_ARCHITECTURE_MODULEPATH=${SITE_EB_PREFIX}/${ARCH_SUBDIR}/modules/core
  module use ${DEFAULT_ARCHITECTURE_MODULEPATH}

for the architecture-specific software.
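
As a rough sketch of how the two module paths could be set up in a
profile script: the function arch_subdir_for, the variable
HOST_MICROARCH and the default value of SITE_EB_PREFIX are all
hypothetical, and the use of 'archspec cpu' to detect the
microarchitecture is an assumption (any site-specific mechanism would
do just as well).

```shell
# Hypothetical profile snippet; every path and mapping below is a placeholder.

# Map a microarchitecture name to a subdirectory of the installation tree.
# The names follow the directory layout from the original post; anything
# unrecognised falls back to 'generic'.
arch_subdir_for() {
    case "$1" in
        zen3)                        echo "x86_64/amd/zen3" ;;
        cascadelake|skylake_avx512)  echo "x86_64/intel/$1" ;;
        *)                           echo "generic" ;;
    esac
}

# Assumption: archspec is installed; 'archspec cpu' prints the name of the
# host microarchitecture.  Without it we fall back to 'generic'.
if command -v archspec >/dev/null 2>&1; then
    HOST_MICROARCH=$(archspec cpu)
else
    HOST_MICROARCH=generic
fi

SITE_EB_PREFIX=${SITE_EB_PREFIX:-/nfs/easybuild/arch}   # placeholder default
ARCH_SUBDIR=$(arch_subdir_for "$HOST_MICROARCH")

export EB_MODULEPATH="${SITE_EB_PREFIX}/modules"
export DEFAULT_ARCHITECTURE_MODULEPATH="${SITE_EB_PREFIX}/${ARCH_SUBDIR}/modules/core"

# Only call 'module' where Lmod/Tmod has actually been initialised.
if command -v module >/dev/null 2>&1; then
    module use "${EB_MODULEPATH}"
    module use "${DEFAULT_ARCHITECTURE_MODULEPATH}"
fi
```

Keeping the mapping in a small function means that a new
microarchitecture only needs one more case branch.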

The drawback, however, is that I will still have to use '--force' and
then remove the non-architecture-specific packages once the
corresponding architecture-specific packages have been built.  Maybe I
can get around that with a bit of creative soft linking.
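
One possible form the soft linking could take, borrowing Stefan's
trick of linking the SYSTEM modules into each architecture tree, is to
link the EasyBuild module directory into an architecture-specific
module tree, so that the generic module path can be 'unused' without
losing the EasyBuild module.  This is only a sketch: the function name
link_easybuild_module is hypothetical, it assumes the EasyBuild
modules live in a directory called 'EasyBuild' directly under the
generic module tree, and it is untested against a real installation.

```shell
# Hypothetical helper: link the EasyBuild module directory from the generic
# module tree into an architecture-specific one.
#   $1: generic module tree, expected to contain an 'EasyBuild' directory
#   $2: architecture-specific module tree
link_easybuild_module() {
    src="$1/EasyBuild"
    dest_dir="$2"
    if [ ! -d "$src" ]; then
        echo "no EasyBuild module directory under $1" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    # -s: symbolic link, -f: replace an existing link,
    # -n: do not descend into an existing link to a directory
    ln -sfn "$src" "$dest_dir/EasyBuild"
}
```

It would be called once per architecture tree, e.g.

  link_easybuild_module "${EB_MODULEPATH}" "${DEFAULT_ARCHITECTURE_MODULEPATH}"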

Cheers,

Loris

> Alan
>
> On 22-Sep-23 9:15 AM, Loris Bennett wrote:
>> Dear Stefan,
>>
>> "Dr. Stefan Harfst" <[email protected]> writes:
>>
>>> Dear Loris,
>>>
>>> will you not overwrite the existing module if you use --force?
>>> I.e. the module will always point to the software directory of the
>>> architecture you have built last.
>> I am mounting the directory on each node with the architecture-specific
>> software under
>>
>> /sw/sc/easybuild
>>
>> with each node mounting the correct branch of the NFS directory for its
>> CPU. But as you say, --force will overwrite modules, so I need to be
>> able to load EasyBuild separately from the other software.
>>
>>> We have done something similar and yes, we treat EasyBuild itself
>>> differently. I will try to outline our setup:
>>> * we have a basepath /cm/shared/uniol
>>> * in the basepath we have modules and sw
>>> * initially, our module path only consists of
>>> /cm/shared/uniol/modules/core; in this path we put the EasyBuild
>>> modules and modules we call hpc-env
>>> * if you load hpc-env/<ver> the module path is extended with paths
>>> like /cm/shared/uniol/<arch>/<ver>/<cat>, where <arch> can be zen3,
>>> zen4, ice, sky or so. <ver> is actually referring to the GCCcore
>>> version but this is not so relevant. <cat> are the module classes
>>> like bio, chem and so on
>>> * the hpc-env module uses an environment variable to know which is
>>> the architecture of the current node, so it shows only that part of
>>> the module tree
>> This is an approach that I have seen used elsewhere. My thinking in
>> using the mount approach instead was that the users don't have to load
>> an extra module and that the modules don't have some slightly arbitrary
>> path, such as '/cm/shared/' (or in our case it is actually
>> '/trinity/shared/') baked into the 'root' variable.
>>
>>> * the sw-directory has the same structure, so .../sw/<arch>/<ver>
>>> and we use --subdir-modules and --subdir-software to install the
>>> software and module in the right paths (we have setup an alias that
>>> also identifies <arch> from the env variable)
>>> * we also have a SYSTEM arch for everything built with toolchain =
>>> SYSTEM (../sw/SYSTEM and correspondingly ../modules/SYSTEM), we
>>> link the SYSTEM-modules into all <arch>/<ver> (hidden, if there is
>>> a non-SYSTEM version)
>> I hadn't thought about dealing with SYSTEM-modules separately. How does
>> this work when you build an architecture-dependent piece of software
>> with a specific toolchain and '--robot', but which has a dependency
>> which uses the SYSTEM-toolchain? Everything will be installed in the
>> same architecture-dependent subdirectory, won't it?
>>
>>> * EasyBuild itself is installed directly in sw but it could also go into
>>> SYSTEM (or your generic).
>> OK, so treating EasyBuild differently from the bulk of the software
>> does indeed seem to be the way to go.
>>
>>> It means that you need a lot of space and every module is installed
>>> several times, but you are only reusing a working Easyconfig and it
>>> can be somewhat automated.
>> We are already resigned to the fact that a lot of space will be needed,
>> but it looks like this sort of approach is the only way to ensure that each
>> node will see software that will definitely run on it.
>>
>>> Hope this helps.
>> Thanks for the detailed explanation of your approach.
>>
>> Cheers,
>>
>> Loris
>>
>> PS: See you in Oldenburg at the AK Supercomputing, maybe?
>>
>>> Best wishes
>>> Stefan
>>>
>>> --
>>> Scientific Computing
>>>
>>> Carl von Ossietzky University Oldenburg
>>> School of Mathematics and Natural Sciences
>>> 26111 Oldenburg, Germany
>>>
>>> Office: W03 1-139
>>> Phone: +49-441-798 3147
>>> E-Mail: [email protected]
>>> www: http://www.uni-oldenburg.de/fk5/wr
>>>
>>> -----Original Message-----
>>> From: [email protected] <[email protected]> On
>>> Behalf Of Loris Bennett
>>> Sent: Thursday, 21 September 2023 17:12
>>> To: easybuild <[email protected]>
>>> Subject: [easybuild] Installing EasyBuild as 'generic' architecture?
>>>
>>> WARNING! This email originated off-campus.
>>>
>>> Hi,
>>>
>>> EasyBuild plus all our software built with EasyBuild is currently installed
>>> under
>>>
>>> /nfs/easybuild/software/EasyBuild
>>>
>>> However, I am reinstalling software compiled for different
>>> microarchitectures in a directory structure which looks like the following:
>>>
>>> /nfs/easybuild/arch
>>> ├── generic
>>> └── x86_64
>>>     ├── amd
>>>     │   └── zen3
>>>     └── intel
>>>         ├── cascadelake
>>>         └── skylake_avx512
>>>
>>> At the moment I am essentially using something like
>>>
>>> eb --prefix=/nfs/easybuild/arch/x86_64/intel/cascadelake --force ...
>>>
>>> to rebuild modules. However, at some point I will want to avoid using
>>> '--force', but IIUC that will mean that I have to 'unuse' the original
>>> module path
>>>
>>> /nfs/easybuild/software/EasyBuild/modules
>>>
>>> in order to build modules which already exist in the
>>> non-architecture-specific path. This in turn, since the EasyBuild module
>>> itself is in that path, would deactivate the EasyBuild module.
>>>
>>> Does that mean I have to reinstall EasyBuild under
>>>
>>> /nfs/easybuild/arch/generic
>>>
>>> which I guess would be the most consistent solution, or is there an
>>> alternative?
>>>
>>> Cheers,
>>>
>>> Loris
>>>
>>> --
>>> Dr. Loris Bennett (Herr/Mr)
>>> ZEDAT, Freie Universität Berlin
>>>
>
--
Dr. Loris Bennett (Herr/Mr)
ZEDAT, Freie Universität Berlin